The package currently provides working implementations for in-memory data sources, but will eventually be able to translate queries into e.g. SQL. There is a prototype implementation of such a "query provider" for SQLite in the package, but it is experimental at this point and only works for a very small subset of queries.
Query is heavily inspired by LINQ. In fact, the package is currently largely an implementation of the LINQ part of the C# specification. Future versions of Query will most likely add features that are not found in the original LINQ design.
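To give a feel for the LINQ-style syntax, here is a minimal sketch of a query over a plain in-memory array. The variable names are made up for illustration, and the example assumes a Julia 1.x installation with Query.jl available; `@collect` without an argument is used here to materialize the result as an array.

```julia
using Query

# Illustrative in-memory source; any iterable should work here.
numbers = [1, 2, 3, 4, 5, 6]

# Keep the even numbers, square them, and collect the result into an array.
squares_of_evens = @from n in numbers begin
    @where n % 2 == 0
    @select n^2
    @collect
end
```

The `@from ... @where ... @select` structure mirrors the `from ... where ... select` clauses of a C# query expression.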
Alternatives
Query.jl is not the only Julia initiative for querying data; many other packages have similar goals. Take a look at DataFramesMeta.jl and SplitApplyCombine.jl. If I missed other initiatives, please let me know and I'll add them to this list!
Please ask any usage questions in the Data Domain on the Julia Discourse forum. If you find a bug or have a suggestion for improving this package, please open an issue in this GitHub repository.
Highlights
Query is an almost complete implementation of the query expression section of the C# specification, with some additional Julia-specific features added in.
The package supports a large number of data sources: DataFrames, DataStreams (including CSV, Feather, SQLite, ODBC), DataTables, IndexedTables, TimeSeries, Temporal, TypedTables, DifferentialEquations (any DESolution), arrays, and any type that can be iterated.
The results of a query can be materialized into a range of different data structures: iterators, DataFrames, DataTables, IndexedTables, TimeSeries, Temporal, TypedTables, arrays, dictionaries or any DataStream sink (this includes CSV and Feather files).
One can mix and match almost all sources and sinks within one query. For example, one can easily perform a join of a DataFrame with a CSV file and write the results into a Feather file, all within one query.
The type instability problems that one can run into with DataFrames do not affect Query, i.e. queries against DataFrames are completely type stable.
There are three different APIs that package authors can use to make their data sources queryable with this package. The simplest API only requires a data source to provide an iterator. A second API hands the data source a complete graph representation of the query, which the data source can then, for example, rewrite as a SQL statement in order to execute the query. The third API allows a data source to provide its own data structures for representing a query graph.
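As a rough illustration of the simplest of these APIs, the sketch below defines a small custom type that only implements Julia's standard iteration interface and then queries it. The `Countdown` type is hypothetical and exists only for this example; the sketch assumes Julia 1.x (where iteration is defined through `Base.iterate`) and that Query picks up any type implementing that interface, as the highlight about iterable sources suggests.

```julia
using Query

# Hypothetical data source: yields start, start - 1, ..., 1.
struct Countdown
    start::Int
end

# Standard Julia 1.x iteration interface; nothing here is specific to Query.
Base.iterate(c::Countdown, state=c.start) = state < 1 ? nothing : (state, state - 1)
Base.length(c::Countdown) = max(c.start, 0)
Base.eltype(::Type{Countdown}) = Int

source = Countdown(5)

# Query the custom source like any other iterable and collect into an array.
odd_values = @from i in source begin
    @where isodd(i)
    @select i
    @collect
end
```

The other two APIs, which expose the query graph to the data source, are more involved and are not shown here.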