DataFusion is an extensible query engine written in Rust that
uses Apache Arrow as its in-memory format.
This crate provides libraries and binaries for developers building fast,
feature-rich database and analytic systems customized to particular workloads.
See use cases for examples. The following related subprojects target end users:
DataFusion Python offers a Python interface for SQL and DataFrame
queries.
DataFusion Ray provides a distributed version of DataFusion that scales
out on Ray clusters.
DataFusion Comet is an accelerator for Apache Spark based on
DataFusion.
"Out of the box,"
DataFusion offers SQL and DataFrame APIs, excellent performance,
built-in support for CSV, Parquet, JSON, and Avro, extensive customization, and
a great community.
DataFusion features a full query planner, a columnar, streaming, multi-threaded,
vectorized execution engine, and partitioned data sources. You can
customize DataFusion at almost every point, for example by adding data sources,
query languages, functions, custom operators, and more.
See the Architecture section for more details.
DataFusion is great for building projects such as domain-specific query engines, new database platforms, data pipelines, and query languages.
It lets you start quickly from a fully working engine and then customize the features specific to your use case. See the list of known users.
backtrace: include backtrace information in error messages
pyarrow: conversions between PyArrow and DataFusion types
serde: enable arrow-schema's serde feature
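As a rough sketch of how the optional features above might be enabled, a `Cargo.toml` dependency entry could look like the following. The version number is a placeholder assumption, not a recommendation; substitute the release you actually target.

```toml
[dependencies]
# The version below is a placeholder (assumption); pin the release you target.
# "backtrace" and "serde" are two of the optional features described above.
datafusion = { version = "46", features = ["backtrace", "serde"] }
```

The `pyarrow` feature can be added to the same `features` list when conversions between PyArrow and DataFusion types are needed.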
DataFusion API Evolution and Deprecation Guidelines
Public methods in Apache DataFusion evolve over time: while we try to maintain a
stable API, we also continue to improve it. As a result, we typically
deprecate methods before removing them, according to the deprecation guidelines.
Dependencies and Cargo.lock
Following the guidance on committing Cargo.lock files, this project commits
its Cargo.lock file.
CI uses the committed Cargo.lock file, and dependencies are updated regularly
using Dependabot PRs.