This package provides load support for Parquet files under the FileIO.jl package.
Installation
In the Julia REPL, press ] to enter Pkg mode, then run add ParquetFiles to install ParquetFiles and its dependencies.
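For scripts or other non-interactive setups, the same installation can be sketched with the Pkg standard library. This is a minimal equivalent of the REPL command, not a separate installation method:

```julia
# Non-interactive equivalent of `] add ParquetFiles`
using Pkg
Pkg.add("ParquetFiles")
```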
Usage
Load a Parquet file
To read a Parquet file into a DataFrame, use the following Julia code:
using ParquetFiles, DataFrames
df = DataFrame(load("data.parquet"))
The call to load returns a struct that is an IterableTables.jl iterable table, so it can be passed to any function that accepts iterable tables, i.e. all the sinks in IterableTables.jl. Here are some examples of materializing a Parquet file into data structures other than a DataFrame:
using ParquetFiles, IndexedTables, TimeSeries, Temporal, VegaLite
# Load into an IndexedTable
it = IndexedTable(load("data.parquet"))
# Load into a TimeArray
ta = TimeArray(load("data.parquet"))
# Load into a TS
ts = TS(load("data.parquet"))
# Plot directly with VegaLite
@vlplot(:point, data=load("data.parquet"), x=:a, y=:b)
Using the pipe syntax
load also supports the pipe syntax. For example, to load a Parquet file into a DataFrame, one can use the following code:
using ParquetFiles, DataFrames
df = load("data.parquet") |> DataFrame
The pipe syntax is especially useful in combination with Query.jl queries. For example, one can load a Parquet file, pipe it into a query, and then pipe the result to the save function to store it in a new file.
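A minimal sketch of such a pipeline follows. It rests on several assumptions: the file has a numeric column named a (borrowed from the plotting example), the output name filtered.csv is hypothetical, and CSVFiles.jl is used for the save step because this package provides load support only.

```julia
using ParquetFiles, Query, CSVFiles

# Load the Parquet file, keep only the rows where column `a` is positive
# (the column name and the filter condition are illustrative assumptions),
# and write the result to a CSV file through CSVFiles.jl.
load("data.parquet") |>
    @filter(_.a > 0) |>
    save("filtered.csv")
```

Because every stage consumes and produces an iterable table, the filter can be replaced by other Query.jl operators, and the final save can be replaced by a sink such as DataFrame.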