Releases: xtdb/xtdb
v2.1.0
Such a lot to tell you about since our 2.0 release!
As always, see the milestone for the full list of issues closed and PRs merged. Thank you to everyone who's been involved in this release, whether that be by raising issues, helping us repro, helping us benchmark or contributing code - it's massively appreciated 🙏 Particular thanks go out to our clients and Design Partners, who once again have been heavily involved in the direction of XTDB, as well as providing invaluable real-world testing and feedback.
I'd also like to give a special mention to Jacob O'Bryant, who has very kindly contributed an OLTP benchmark based on his Yakread dataset and workload. This has been hugely helpful in guiding our performance work for 2.1 and beyond - working with him, we've already been able to land significant OLTP gains here as a result. Thank you Jacob!
"Multi-DB"
2.1 brings a significant (but still largely backwards-compatible) change to the architecture of XTDB - the introduction of secondary databases!
The database in XT has always been the combination of two core, shared components: a transaction log, and an object-store. This change allows one XTDB node to index and reference multiple tx-logs and object-stores.
Specifically, this decoupling of databases (storage) and clusters (compute) enables a data-mesh architecture - organise your databases around business domains (orders, customers, products), while each application team runs their own XTDB compute cluster. Teams can attach secondary databases to access shared domain data, aligning your data model with your organization's structure while keeping compute independent.
Queries can span multiple databases, enabling powerful cross-domain analytics and insights:
-- attach the secondary databases
ATTACH DATABASE user_preferences WITH $$
log: !Kafka
cluster: 'my-kafka'
topic: 'xtdb.user-preferences'
storage: !S3
bucket: 'my-bucket'
path: 'user-preferences'
$$
-- query across it in a single query - what notifications to send?
FROM orders o
JOIN user_preferences.prefs up -- use db_name.table_name
ON o.user_id = up._id
WHERE o.created_at > CURRENT_DATE - INTERVAL 'P1D'
SELECT o._id, up.notification_settings
We've even made a small scale-factor TPC-H data-set available for you to play with using our 'Play' UI.
For more information, and how to get started attaching secondary databases, see 'Databases in XTDB'.
We're really keen to see what you build with this - we think it's a really powerful way to decouple your data and applications.
This has meant a couple of minor breaking configuration changes - see:
- Log configuration changelog
- Kafka configuration changelog
- Storage configuration changelog
- EDN configuration changelog
Additionally, we've made some changes to repeatable queries to support multiple databases:
- WATERMARK has been renamed to AWAIT_TOKEN - see 'Transaction consistency'.
- We've added SNAPSHOT_TOKEN in addition to SNAPSHOT_TIME - we'd recommend using the former for repeatable queries where possible.
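As a sketch of how this might fit together, a repeatable read could pin a follow-up query to the same snapshot - assuming SNAPSHOT_TOKEN slots into the existing `BEGIN ... WITH (...)` options alongside SNAPSHOT_TIME (check 'Transaction consistency' for the exact syntax; the token value here is illustrative):

```sql
-- run the original query and record its snapshot token, then later:
BEGIN READ ONLY WITH (snapshot_token = 'token-from-the-original-query');
SELECT o._id, o.created_at FROM orders o;
COMMIT;
```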
Our current roadmap for this feature is as follows (usual 'subject to change' caveat):
- Multi-partition tx-logs for secondary databases - horizontal write scaling.
- Read-only secondaries - just listening in.
- Removing the requirement to have XT-specific transaction logs - bring your own topics.
Client driver support
We've been hard at work improving the support for XTDB through language-native PostgreSQL drivers - we now support eleven languages: C/C++, C#, Clojure, Elixir, Go, Java, Kotlin, Node.js, PHP, Python and Ruby.
See 'Language Drivers' for the up-to-date list, and also our 'driver-examples' repository.
OpenID Connect (OIDC) authentication
2.1 adds support for OpenID Connect (OIDC) authentication to XTDB's built-in authentication system - you can now configure XTDB to authenticate users via an OIDC provider, such as Keycloak, Auth0, or Okta. This is likely to become the primary authentication method in XTDB going forward, so that users can leverage existing identity infrastructure rather than attempting to mirror and maintain roles in XTDB.
We'll be adding support for more OIDC authentication methods, as well as OIDC-based authorization/role-mapping in future releases.
See OIDC for more information and how to get started.
Observability
We've heard your feedback regarding observability in XTDB loud and clear, and so 2.1 brings a number of improvements here.
XTDB now supports OpenTelemetry-backed tracing for query introspection and performance analysis. Traces are sent via the OTLP (OpenTelemetry Protocol) HTTP endpoint to your tracing backend (e.g., Grafana Tempo, Jaeger, etc).
tracer:
# -- required
# Enable OpenTelemetry tracing.
enabled: true
# OTLP HTTP endpoint for sending traces.
# (Can be set as an !Env value)
endpoint: "https://localhost:4318/v1/traces"
# -- optional
# Service name identifier for traces.
# (Can be set as an !Env value)
# serviceName: "xtdb"
Tracing provides detailed introspection into query execution, including:
- Per-query execution times for performance analysis.
- Information on which queries were executed, available through the xtdb.query span attributes.
- Lower-level operation timings, revealing how time is distributed across individual query operations.
See the tracing guide for details on how to get started.
Within the database itself, we've made the EXPLAIN plans much prettier, and added EXPLAIN ANALYZE, which provides detailed timing information for each step of the query plan:
EXPLAIN ANALYZE
SELECT o.o_orderpriority, COUNT(*) AS order_count
FROM orders AS o
WHERE o.o_orderdate >= DATE '1993-07-01'
AND o.o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
AND EXISTS (
FROM lineitem AS l
WHERE l.l_orderkey = o.o_orderkey
AND l.l_commitdate < l.l_receiptdate
)
ORDER BY o.o_orderpriority;
-- depth | op | total_time | time_to_first_block | block_count | row_count
-- ----------------------+-----------+---------------+---------------------+-------------+-----------
-- -> | project | "PT0.757754S" | "PT0.757717S" | 1 | 5
-- -> | order-by | "PT0.757752S" | "PT0.757713S" | 1 | 5
-- -> | project | "PT0.757475S" | "PT0.757423S" | 1 | 5
-- -> | group-by | "PT0.757474S" | "PT0.75742S" | 1 | 5
-- -> | project | "PT0.757327S" | "PT0.670638S" | 256 | 2539
-- -> | semi-join | "PT0.757259S" | "PT0.670606S" | 256 | 2539
-- -> | project | "PT0.656491S" | "PT0.000828S" | 1024 | 189646
-- -> | rename | "PT0.656324S" | "PT0.000816S" | 1024 | 189646
-- -> | select | "PT0.655886S" | "PT0.000792S" | 1024 | 189646
-- -> | scan | "PT0.65443S" | "PT0.00054S" | 1024 | 299814
-- -> | rename | "PT0.087767S" | "PT0.000705S" | 256 | 2765
-- ->                    | scan      | "PT0.087687S" | "PT0.000694S"       | 256         | 2765
Transaction Metadata
You can now annotate transactions with metadata for stronger auditability, for example, to track which user last modified a record ([Play](https://play.xtdb.com/?version=2.x-SNAPSHOT&type=sql-v2&enc=2&txs=NobwRAzgnhAuCmBbAtLAlo%2BYBcA7ArgDaEA0YsAHhDmAEICiA4gJIByABAEr0CCAIuwDqnZgBV6QsQAl2ACgCy9Ufx7L2AXnYh8EeACds7AOQAjAPYmjAXwCUAbgA6uNgGV6nUezaiA8uwAmZgDGEFz0AMI%2BnHwuWgD6aP6GAIwk7EFmhMmGpgCGetaOuJHy8mJ2YFYk4ACO%2BPpQNLB69WBklNTYYG4AMhGiTuwBwRAAdAn%2B7LmhgUGJJIPsFLCjHeOJU6GU84sAVOz0ABrhPQCqfBITTgBinD7ywyGLAFI%2BbEsrHew%2BHLKzY3FoHAkHEAGZ6MyI...
v2.1.0-rc0
While I have your attention, I'd also like to shamelessly plug Jeremy's and my upcoming talk as part of Carnegie Mellon University Database Group's "Future Data Systems" seminar series - "Reconstructing History in XTDB".
Monday November 24th, 16:30 Eastern, 21:30Z on Zoom - we'd love to see you there!
Right, on with the release notes:
"Multi-DB"
2.1 brings a significant (but still largely backwards-compatible) change to the architecture of XTDB - the introduction of secondary databases!
The database in XT has always been the combination of two core, shared components: a transaction log, and an object-store. This change allows one XTDB node to index and reference multiple tx-logs and object-stores.
Specifically, this decoupling of databases (storage) and clusters (compute) enables a data-mesh architecture - organise your databases around business domains (orders, customers, products), while each application team runs their own XTDB compute cluster. Teams can attach secondary databases to access shared domain data, aligning your data model with your organization's structure while keeping compute independent.
Queries can span multiple databases, enabling powerful cross-domain analytics and insights:
-- attach the secondary databases
ATTACH DATABASE user_preferences WITH $$
log: !Kafka
cluster: 'my-kafka'
topic: 'xtdb.user-preferences'
storage: !S3
bucket: 'my-bucket'
path: 'user-preferences'
$$
-- query across it in a single query - what notifications to send?
FROM orders o
JOIN users_preferences.prefs up -- use db_name.table_name
ON o.user_id = up._id
WHERE o.created_at > CURRENT_DATE - INTERVAL 'P1D'
SELECT o._id, up.notification_settingsWe've even made a small scale-factor TPC-H data-set available for you to play with using our 'Play' UI
For more information, and how to get started attaching secondary databases, see 'Databases in XTDB'.
We're really keen to see what you build with this - we think it's a really powerful way to decouple your data and applications.
This has meant a couple of minor breaking configuration changes - see:
- Log configuration changelog
- Kafka configuration changelog
- Storage configuration changelog
- EDN configuration changelog
Additionally, we've made some changes to repeatable queries to support multiple databases:
WATERMARK->AWAIT_TOKEN- see 'Transaction consistency'- We've added
SNAPSHOT_TOKENin addition toSNAPSHOT_TIME- we'd recommend using the former for repeatable queries where possible.
Our current roadmap for this feature is as follows (usual 'subject to change' caveat):
- Multi-partition tx-logs for secondary databases - horizontal write scaling.
- Read-only secondaries - just listening in.
- Removing the requirement to have XT-specific transaction logs - bring your own topics
Client driver support
We've been hard at work improving the support for XTDB through language-native PostgreSQL drivers - we now support ten languages: C/C++, C#, Clojure, Elixir, Go, Java, Kotlin, Node.js, PHP, Python and Ruby.
See 'Language Drivers' for the up-to-date list, and also our 'driver-examples' repository.
OpenID Connect (OIDC) authentication
2.1 adds support for OpenID Connect (OIDC) authentication to XTDB's built-in authentication system - you can now configure XTDB to authenticate users via an OIDC provider, such as Keycloak, Auth0, or Okta. This is likely to become the primary authentication method in XTDB going forward, so that users can leverage existing identity infrastructure rather than attempting to mirror and maintain roles in XTDB.
We'll be adding support for more OIDC authentication methods, as well as OIDC-based authorization/role-mapping in future releases.
See OIDC for more information and how to get started.
Observability
We've heard your feedback regarding observability in XTDB loud and clear, and so 2.1 brings a number of improvements here.
XTDB now supports OpenTelemetry-backed tracing for query introspection and performance analysis. Traces are sent via the OTLP (OpenTelemetry Protocol) HTTP endpoint to your tracing backend (e.g., Grafana Tempo, Jaeger, etc).
tracer:
# -- required
# Enable OpenTelemetry tracing.
enabled: true
# OTLP HTTP endpoint for sending traces.
# (Can be set as an !Env value)
endpoint: "https://localhost:4318/v1/traces"
# -- optional
# Service name identifier for traces.
# (Can be set as an !Env value)
# serviceName: "xtdb"Tracing provides detailed introspection into query execution, including:
- Per-query execution times for performance analysis.
- Information on which queries were executed, available through the xtdb.query span attributes.
- Lower-level operation timings, revealing how time is distributed across individual query operations.
See the tracing guide for details on how to get started.
Within the database itself, we've made the EXPLAIN plans much prettier, and added EXPLAIN ANALYZE, which provides detailed timing information for each step of the query plan:
EXPLAIN ANALYZE
SELECT o.o_orderpriority, COUNT(*) AS order_count
FROM orders AS o
WHERE o.o_orderdate >= DATE '1993-07-01'
AND o.o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
AND EXISTS (
FROM lineitem AS l
WHERE l.l_orderkey = o.o_orderkey
AND l.l_commitdate < l.l_receiptdate
)
ORDER BY o.o_orderpriority;
-- depth | op | total_time | time_to_first_block | block_count | row_count
-- ----------------------+-----------+---------------+---------------------+-------------+-----------
-- -> | project | "PT0.757754S" | "PT0.757717S" | 1 | 5
-- -> | order-by | "PT0.757752S" | "PT0.757713S" | 1 | 5
-- -> | project | "PT0.757475S" | "PT0.757423S" | 1 | 5
-- -> | group-by | "PT0.757474S" | "PT0.75742S" | 1 | 5
-- -> | project | "PT0.757327S" | "PT0.670638S" | 256 | 2539
-- -> | semi-join | "PT0.757259S" | "PT0.670606S" | 256 | 2539
-- -> | project | "PT0.656491S" | "PT0.000828S" | 1024 | 189646
-- -> | rename | "PT0.656324S" | "PT0.000816S" | 1024 | 189646
-- -> | select | "PT0.655886S" | "PT0.000792S" | 1024 | 189646
-- -> | scan | "PT0.65443S" | "PT0.00054S" | 1024 | 299814
-- -> | rename | "PT0.087767S" | "PT0.000705S" | 256 | 2765
-- -> | scan | "PT0.087687S" | "PT0.000694S" | 256 | 2765Stability/Performance
All that said, the majority of our work in 2.1 has been focused on the stability and performance of XT. As is...
2.0.0 - we're live!
We're live! 🚀
We're really pleased to announce the generally-available release of XTDB 2.0.0:
- 'HTAP': 'hybrid transactional/analytical processing', OLTP + OLAP - reduce your dependency on fragile networks of ETL jobs.
- Full, across-time queries: using SQL:2011's bitemporal primitives and/or 'XTQL' to answer "what did we know, and when?".
- No history/audit tables or triggers required: use it as a regular update-in-place database (you're free to UPDATE and DELETE normally again!), safe in the knowledge that your history is there when you need it.
- Zero-cost, full database snapshots: immediately go back to any point in time without needing to schedule or store periodic snapshots/copies.
- Postgres wire-compatible: use all of your existing client drivers (e.g. psql, JDBC, Postgres.js, VS Code's SQLTools) and BI tooling (e.g. Metabase).
- Decoupled storage/compute: current/frequently-accessed data stays hot on the compute nodes; historical data stays available on cheap, durable, commodity object storage (e.g. AWS S3).
- Full nested-data support with schema inference: arrays and maps are first-class citizens in XTDB - no distinct JSON type, timestamps-as-strings etc, just insert your documents and we'll infer the schema!
- New columnar storage & query engine, open data format: XTDB's files are in the open Apache Arrow format, optimised for data locality and minimal (de-)serialisation.
- Free and open source: MPL licensed (OSI/FSF approved).
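As a small sketch of the nested-data point above (the table and values are illustrative; RECORDS is XT's document-insert syntax):

```sql
-- nested arrays and maps are first-class values - no separate JSON column type
INSERT INTO people RECORDS {_id: 'jms',
                            name: 'James',
                            addresses: [{city: 'London'}, {city: 'Bristol'}]};

-- the schema is inferred on insert, so nested values come back as-is
SELECT p.name, p.addresses FROM people p;
```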
Get started by either:
- heading to https://play.xtdb.com - our online playground where you can try XTDB in your browser
- using the XTDB standalone Docker image, and your usual Postgres tooling:
docker run -d -p 5432:5432 --name xtdb ghcr.io/xtdb/xtdb:2.0.0
psql -h localhost xtdb
Then:
- To understand more about the value XTDB brings, visit the main XTDB website at https://xtdb.com
- Developer documentation is at https://docs.xtdb.com
- Issues/PRs and the source code are on our GitHub repo, as well as our public forum - subscribe here for release notifications.
- Join us on the XTDB Discord server for community support.
- Email us at hello@xtdb.com for bespoke support provided by JUXT, a leading software consultancy.
If you're interested in how this all works under-the-hood, I recently wrote a three-part blog series, "Building a Bitemporal Database".
I'd like to take this opportunity to give a massive thanks to everyone who's helped out through the early access programme - our Design Partners, the XTDB open-source community, and, of course, the XTDB team themselves.
If you've any questions or thoughts, please do get in touch - we'd love to hear from you!
James, Jeremy and the XTDB team
For those upgrading from the latest beta, a couple of changes to tell you about:
- We've removed the legacy /status endpoint on the HTTP server - this is now much better served by the metrics available on the healthz server (default: https://xtdb-host:8080/healthz).
- GCP and Azure are out of labs - their Maven group is now com.xtdb (artifacts xtdb-google-cloud and xtdb-azure respectively).
- The HTTP server moves into labs: com.xtdb.labs:xtdb-http-server, reflecting our development focus on the Postgres wire-compatible API.
v2.0.0-beta9
A couple of breaking changes to tell you about here, but first: COPY!
COPY
XTDB now supports Postgres's COPY functionality, adding the ability to quickly load large quantities of data into XT.
COPY my_table FROM STDIN WITH (FORMAT = 'transit-json')
- If you're using JDBC, you'll want to check out the CopyManager documentation, accessible through the XTDB connection using getCopyAPI.
- If you're using Clojure, and XT's JDBC wrapper, you can use xtdb.next.jdbc/copy-in.
- If you're using Postgres.js, check out its copy API.
XTDB supports Transit initially - either 'transit-json' or 'transit-msgpack'. We're looking to add support for Arrow too - this will reduce the (de-)serialisation required in transaction submission to near-zero.
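For the JDBC route, the shape is roughly as follows - a hedged sketch, in which the connection URL, table name, and input file are all illustrative (PGConnection/getCopyAPI/copyIn are the standard Postgres JDBC classes the notes refer to):

```clojure
(require '[clojure.java.io :as io]
         '[next.jdbc :as jdbc])

(import '(org.postgresql PGConnection))

;; sketch: bulk-load a transit-json file via the Postgres CopyManager
(with-open [conn (jdbc/get-connection "jdbc:xtdb://localhost:5432/xtdb")]
  (let [copy-mgr (.getCopyAPI (.unwrap conn PGConnection))]
    (with-open [in (io/input-stream "my-table.transit.json")]
      (.copyIn copy-mgr
               "COPY my_table FROM STDIN WITH (FORMAT = 'transit-json')"
               in))))
```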
Clojure API implementation
Previously, the XTDB Clojure API used an internal API to submit transactions and run queries. This worked well, but it meant that there were two code-paths - one that you'd use on an in-process node, and one when you graduated to a remote node - opening up the possibility for differences between the two.
We now implement the Clojure API as a client of the Postgres wire protocol, so that you're running the same code regardless of whether your node is local or remote. As a result, we've also been able to remove some of the abstractions that were previously required because of the two code-paths.
There are two related breaking changes to tell you about:
- The Clojure HTTP client is no longer available - we'd encourage users to expose the Postgres wire-server port instead and connect via xt/client. (The server is still available, for users connecting directly via HTTP.)
- xt/execute-tx will now throw an exception if the transaction fails to commit, in line with both JDBC and next.jdbc.
Elsewhere:
A couple of smaller changes to draw your attention to:
- XT now has xt.metrics_gauges, xt.metrics_timers and xt.metrics_counters tables containing various monitoring metrics for the node. In a production system, you'll most likely still want to set up Prometheus with your choice of dashboard - but, at a pinch, these tables contain the raw metrics.
- (breaking) We now send java.util.Dates over pgwire as timestamps-with-tz. This is a deviation from JDBC tradition, which treats them as timestamps-without-tz.
- XT now uses a lesser-known Cognitect library called 'anomalies' to return more structured errors - you should see this either as ex-data from the xtdb.api functions, or in the 'detail' messages in your Postgres error responses.
As always, for a full list of closed cards, see the milestone.
Cheers!
James & the XT team
v2.0.0-beta8
Here's beta8 🚀
We've got some XTQL changes to tell you about, some improvements to the type-handling in the XT JDBC driver, and some small breaking changes.
Let's get started!
XTQL-in-SQL
Previously, SQL and XTQL have lived in two separate worlds in XTDB. With this release, XTQL is brought back onto the main stage: XTQL queries are now a first-class part of our SQL syntax, and can be sent through the JDBC driver!
Here we're taking advantage of SQL's relational algebra roots - the ability to compose larger queries out of smaller sub-queries. Each relational algebra operator (e.g. select, project, join, aggregate) has a common structure - relation(s) in, relation out - and it's this homogeneity (or, to give it the algebraic term, 'closure' under its operations) that means we can simply add XTQL as another source operator.
It's included as a string within SQL - most commonly, using dollar-delimited strings:
-- as a top-level query - like SQL's `VALUES`
XTQL $$
(-> (from :my-table [...])
...)
$$
-- as part of a larger SQL query - again, like SQL's `VALUES`
FROM (XTQL $$
(-> (from :my-table [...])
...)
$$) my_table
JOIN ...
WHERE ...
ORDER BY ...
This means that you can use XTQL in your favourite SQL BI tooling - like Metabase.
We'd love to hear how you make use of this new XTQL reach!
To integrate XTQL into SQL queries, we've had to make one non-trivial breaking change to XTQL syntax: given SQL's parameters are positional rather than named, we've brought XTQL in line too. In your query, you now use fn to name the parameters, as follows:
(xt/q node ['(fn [uid]
(-> (from :users [{:xt/id uid} given-name family-name])
...))
"jms"])
;; or, using Clojure's `#{}` reader macro, in simple cases:
(xt/q node ['#(-> (from :users [{:xt/id %} given-name family-name])
...)
"jms"])XTDB JDBC driver
Previously, the XTDB JDBC driver was simply a very thin wrapper around the Postgres JDBC driver - an indulgence so that you could specify jdbc:xtdb://... as your connection string rather than jdbc:postgresql://....
Now, it actually earns its keep!
Through the changes in beta8, our JDBC driver now understands the wide range of types available in XTDB:
- post-Java 8 java.time types, bringing JDBC into the 21st century - including timestamps that preserve timezone information.
- XTDB's first-class collections - maps, vectors, sets.
- XTDB's extension types: keywords, URIs.
If you're using our JDBC driver - either directly or through libraries like Clojure's next.jdbc - you can now submit these directly as query arguments, and you should also transparently see these types in your result set.
Finally, if you're using an in-memory node, this now implements javax.sql.DataSource, so you can pass this directly to libraries like next.jdbc or HikariCP.
(require '[next.jdbc :as jdbc]
'[xtdb.node :as xtn])
(with-open [node (xtn/start-node {})]
(jdbc/execute! node ["INSERT INTO users RECORDS ?"
{:xt/id "jms", :given-name "James", :family-name "Henderson"}])
  (jdbc/execute! node ["SELECT * FROM users WHERE _id = ?" "jms"]))
This also involved a couple of breaking changes:
- Obviously the change itself is breaking - if you're calling .getObject on the returned result-set, you'll get a richer type than PgObject or java.sql.Timestamp.
- If you were using our next.jdbc helper functions, you'll likely no longer need to - and indeed, most of them have been removed (particularly ->pg-obj).
- Our JDBC driver is now part of the xtdb-api artifact - you'll need to remove xtdb-jdbc from your dependency manager.
Operational improvements
In beta8, we've also been working on operational improvements, largely around backup/restore and disaster recovery - these are now documented in the operations manual.
One to draw to your attention, though, is the new 'log epoch' safety net: if your transaction log is corrupted/lost/misplaced, XTDB will now smoke-test the log topic on startup, and refuse to start if there are messages missing.
To resolve this, you now have the option to keep whatever data has been persisted in XTDB's object store, but start with a new, empty topic. Let the XTDB cluster know what's happened by incrementing the log epoch, and XTDB will continue from the most recently persisted block.
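In config terms, the recovery looks something like this - a sketch only: the cluster/topic names are illustrative, and the exact placement of the epoch key should be checked against the log configuration reference:

```yaml
log: !Kafka
  cluster: 'my-kafka'
  topic: 'xtdb.log-v2'  # the new, empty topic
  epoch: 1              # incremented from the previous epoch (key name assumed)
```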
SQL changes
There are a few quality-of-life SQL changes to tell you about:
- (breaking) FROM my_table FOR VALID_TIME FROM NULL TO <to>, FROM my_table FOR VALID_TIME TO <to> (i.e. FROM <from> elided), and similarly UPDATE|DELETE <table> FOR PORTION OF VALID_TIME FROM ... TO ..., will now return/update data from the start of time, rather than from now.
  If the whole clause is elided (e.g. FROM my_table, UPDATE|DELETE <table> SET ...), the default remains for VT 'now -> ∞', as before.
- FOR VALID_TIME AS OF|FROM now support parameters and/or full expressions - e.g.:
  FROM my_table FOR VALID_TIME AS OF ?
  FROM my_table FOR VALID_TIME FROM (NOW - INTERVAL 'P30D') TO NOW
- In BEGIN, you can set various options within the scope of the transaction:
  BEGIN READ WRITE WITH (system_time = ?, timezone = 'Europe/Berlin'); ... COMMIT;
  BEGIN READ ONLY WITH (snapshot_time = ?, clock_time = '2025-05-01T00:00:00Z'); ... COMMIT;
  These, too, support parameters and/or full expressions.
- We've added URI functions in the standard library - no need to regex your way out of this one any more!
Elsewhere
So much for beta8 being a small release!
As always, for a full list of closed cards, see the milestone.
Cheers!
James & the XT team
v2.0.0-beta7
We're really proud to announce the release of XTDB 2.0.0-beta7 - this is a significant change to our underlying indices which expands the variety of temporal datasets that XT can performantly support.
Our primary focus in beta7 has been to improve XTDB's support for frequently changing entities, in order to better handle time-series shaped data, and there are also some removals to tell you about.
Frequently-changing entities (aka 'time series') support - migration required
From the beginning of the XT 2.x line, we have largely prioritised update patterns where each individual entity typically only undergoes a handful of updates through their lifetime - e.g. trades/orders progressing through a small state machine, or user profiles where the user may infrequently update their details - optimising for large numbers of those entities.
In this release, we've now significantly improved support for entities where this assumption doesn't hold.
Particularly:
- sensor readings - where we'd model each sensor with an ID; each reading as a version of that entity with a small validity window.
- trade pricing feeds - where we'd model each ticker with an ID; each individual update would be valid from now until corrected, but expecting the next correction to come very shortly afterwards.
To do this, we've made significant changes to our underlying indices.
XTDB nodes collaborate to maintain a log-structured merge tree (LSM tree) in the shared object storage, in order to balance write and read speed. Previously, we sharded deeper levels of this tree by the primary key (PK) of each entity - the deeper the level, the more shards. This doesn't work for these use cases, though: by only sharding by PK, we coalesce together all the versions of each entity - fine when each entity has (say) fewer than twenty versions; not so great when each entity has hundreds, thousands, millions of versions.
In beta7, at level 1 of the LSM tree, we've swapped the PK sharding for partitioning by the 'recency' of the entity versions. This takes inspiration from generational garbage collection, which posits that new data is likely to be quickly superseded; older data that's still current is likely to remain so into the future as well - newer data is collected quickly; older data doesn't have to be collected so frequently. (Level 2 and deeper continue to be sharded by PK, as before.)
The impact of this is that the LSM tree is effectively split into two halves:
- A historical half, predominantly sharded by the recency of the version. This is likely to be relatively shallow but wide, with lots of time buckets that are seldom written to after their window expires.
- A current half, sharded by PK. This is likely to grow deeper rather than wide, and is likely to be non-existent in the above use cases - no individual version ever remains current for long enough to make it into this half of the tree.
As a result, sensor reading and trade pricing feed data should now find that we can very quickly prune the majority of the historical versions, focusing on the specific window of interest to each query.
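To make the sensor-reading shape concrete (a sketch - table and column names are illustrative):

```sql
-- each reading supersedes the previous version of the same entity,
-- so its valid-time window closes as soon as the next reading arrives
INSERT INTO readings RECORDS {_id: 'sensor-42', temp: 19.7};
INSERT INTO readings RECORDS {_id: 'sensor-42', temp: 19.9};

-- scan the full history; beta7's recency partitioning lets queries that
-- only touch a narrow valid-time window prune the rest
SELECT r._id, r.temp, r._valid_from, r._valid_to
FROM readings FOR ALL VALID_TIME AS r;
```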
Migration
Given this is a change to XT's storage format, users upgrading to beta7 are required to migrate their existing clusters. First, ensure your nodes are upgraded to beta6.6 for forwards compatibility.
To migrate, there are two options:
-
If you have an infinite-retention log, you can simply stand beta7 nodes up in the usual green/blue manner. These will then re-index the log from scratch, which may take a while on large datasets.
This is the easiest approach, assuming you have a relatively small dataset.
-
If you have a larger dataset, or a finite-retention log, you'll need to run our migration tool. This is executed as a one-time task - you can leave your beta6 nodes running and run the migration tool in the background.
To run the migration tool:
- If you run XTDB through its CLI, run the beta7 artifact supplying your usual configuration arguments, but additionally add --migrate-from 5.
- If you run XTDB through a Docker container orchestration platform (e.g. Kubernetes), run the beta7 version of your usual Docker image as a one-shot task, and override the Docker command to add the --migrate-from 5 flag.
Once the migration tool has finished (it'll notify you, and then exit), you can deploy your beta7 nodes and decommission your beta6 nodes in the usual green/blue manner.
The rollback procedure is as follows:
- Turn off any beta7 nodes if you've started them
- Turn on beta6.6 nodes
- Delete the v06 directory in your object store.
Arbitrary precision decimals
We've also added initial support for arbitrary-precision decimals (BigDecimals, in JVM parlance) - these can be inserted as BigDecimal/BigInteger values in your prepared statements.
This was prioritised thanks to requests from our Design Partners - if you have features/functionality you'd like to see, please do consider joining our Design Partner programme to help us shape the future of XT.
Removals
There are a couple of removals in beta7 - these were experimental features for which we haven't seen any take-up from our design partners or community, so we've elected to reduce our API surface area ahead of GA.
- The XTQL DML operators (:insert-into, :delete, :erase, :assert-exists, :assert-not-exists) have been removed due to lack of uptake - we've received barely any (if any) questions and issues related to these. :put-docs, :delete-docs and :erase-docs, as well as query-side XTQL, will remain into GA.
  Longer term, we currently intend to inline XTQL within our SQL language (e.g. FROM XTQL (-> (unify ...) (return ...) (limit 10))) to allow it to be queried through the various Postgres drivers, rather than requiring a separate Clojure API.
- Transaction function support has also been removed - of the users who've gotten in touch with us, people who previously had no choice but to use transaction functions (in XTDB 1.x, because we didn't have anything else) now prefer to use standard SQL DML (e.g. UPDATE, ASSERT).
- We no longer support Java <21 in the in-process XTDB nodes nor our client libraries - if you use Java <21, you are still able to use our (Postgres-compatible) JDBC driver against an XTDB Docker image/dev-container.
We have very much been guided by our Design Partners and active community members in making these decisions - if you would like to get involved in shaping the future direction of XTDB, please do get in touch through the below channels!
Elsewhere
As always, for a complete list of the issues resolved in this release (all 85 of them!), see the associated milestone.
We'd love to hear what you think - please do get in touch:
- via email: hello@xtdb.com
- on the web: https://discuss.xtdb.com
- on our new Discord server
As always, a massive thanks for your support throughout this 2.x line - we're turning onto the home straight now!
James & the XT Team
1.24.5
What's new in 1.24.5:
- RocksDB within the `xtdb-rocksdb` module has been upgraded to 9.10.0 (was 7.7.3)
- Introduction of an experimental `kv-cache` module, which you can use in combination with a KV store (e.g. another local RocksDB instance) as a durable disk cache for the document store, for situations where the default in-memory cache is too constraining
Cheers!
XT Team
2.0.0-beta6
We're pleased to announce the release of XTDB 2.0.0-beta6 🚀
Coming up in the release notes: we have some breaking configuration changes, but largely this release has been driven by the real-world practical usage of our Design Partners (thanks folks!) - the sorts of changes that don't make for exciting release notes, but the ones you appreciate when you're actually working with the database day-to-day.
2.0.0-beta7, however, is a different story - we'll be bringing you the first database migration since we released beta1, in order to better support a wider variety of temporal data shapes. More details below!
While you're here, one date for your diary: Jeremy will be giving a talk on "Streamlining Regulatory Compliance and Reporting with XTDB: The Future of Bitemporal Data Management" - 19th February at 15:00 UK/16:00 CET/10:00 ET. Come join us for a deep dive into how XTDB can transform your data management strategy - making data reconciliation, reporting, and analysis simpler, more accurate, and cost-effective. Hope to see you there, of course - but if you can't make it, do still sign up, and we'll send you the recording.
Breaking change: Kafka configuration
As part of this release, we've simplified the log by merging the two log topics into one.
- If you're configuring using YAML:

  ```yaml
  # before
  txLog: !Kafka
    txTopic: "xtdb-tx-topic"
    filesTopic: "xtdb-files-topic"
    autoCreateTopics: true

  # becomes
  log: !Kafka
    topic: "xtdb-log-topic"
    autoCreateTopic: true
  ```
- If you're using the environment variables in the various Docker images, `XTDB_TX_TOPIC` -> `XTDB_LOG_TOPIC`; `XTDB_FILES_TOPIC` is removed.
You can then safely remove the file topic.
Breaking change: BEGIN AT SYSTEM_TIME
We've streamlined the UX for specifying the 'basis' of SQL transactions.
tl;dr: the breaking change is `BEGIN AT SYSTEM_TIME TIMESTAMP '...'` -> `BEGIN READ WRITE WITH (SYSTEM_TIME = TIMESTAMP '...')`
'Basis' is an important concept in XTDB - we take great pride that our queries run on a completely immutable snapshot of the database, so that you can re-run queries later and know that you'll receive the same results. This isn't just within a single transaction, nor does it require creating expensive snapshots manually ahead-of-time, nor does it require separate nodes - at any point, you can ask the same XTDB node to run as if it were last Tuesday at 4pm, even if you didn't know you'd need to at the time.
When a transaction starts, we fix a basis for all of the queries/operations that run within it. It has sensible defaults - you shouldn't need to specify it for most transactions - but you can manually set them if need be.
There are two dimensions to the basis: firstly, CLOCK_TIME - this ensures that any references to wall-clock time within the queries can be repeated. This defaults to the wall-clock time on the node at the start of the transaction, and you can see the current value with SHOW CLOCK_TIME. As well as being used for NOW() et al, this is also used for the default valid-time for each referenced table.
Secondly, SNAPSHOT_TIME - this influences which transactions are visible to the queries. This is a strict upper bound - even if you ask for a later system time in your query, you will not see any updates after this time.
SNAPSHOT_TIME defaults to the latest completed transaction on the node serving the query, and can be read with SHOW SNAPSHOT_TIME.
In summary, by default, you'll see new transactions and scheduled updates come through as you'd expect, but when the time comes(!) you have control over exactly what's visible if need be.
To set these values, simply start your transaction with `BEGIN READ ONLY WITH (SNAPSHOT_TIME = TIMESTAMP '...', CLOCK_TIME = TIMESTAMP '...')`.
(h/t to Rich Hickey's 2012 "The Database as a Value" talk which was very inspirational to us in this area 🙏)
Elsewhere
- `EXPLAIN <SQL query>` now gets you the query plan for the query. It's not in the most end-user-readable format at the moment (we can definitely make improvements here), but if you've got a poorly performing query it's a good start.
- We've added support for a 'read-only' port - for use-cases where you know you'll only want to read data (e.g. Metabase or similar), you can expose the read-only port and lock down the normal read-write one.

  ```yaml
  server:
    port: 5432
    readOnlyPort: 5433
  ```
- We've relaxed the SQL timestamp literal syntax - you can now optionally omit the time part if it's midnight. For example, `TIMESTAMP '2025-01-01Z'` gets you a TZ-aware timestamp.
- In the unlikely case of an unrecoverable error that halts transaction processing (to prevent corruption of your data), XT will now save a crash log to your object store. We obviously hope nobody ever sees this one - but if you do, you/we should have more diagnostic information to get back up and running.
-
Full list (44 cards), as always, in the milestone
Beta7 preview
In this iteration, we're particularly looking at optimising data where the number of distinct entities is relatively low, but each entity has a significant number of updates through its history - think market pricing data or sensor readings. The proposed index structure makes a cleaner separation of current and historical data (both explicitly superseded and data that's still valid but not as recent, like old orders/trades), so we're expecting to see improved OLTP as-of-now query performance as well as being able to quickly identify the partitions required to serve a historical query.
We're also naturally taking this opportunity to apply all of the disk format simplifications and improvements that we've had queued up for a while, like keeping more metadata about the data contained within each file and more table-level statistics, so that should speed things up as well!
There'll likely be two options for migration:
- (easier) If your log still contains all of your transactions - retaining the full log is no longer mandatory in XT2, but we're aware some folks still do - just spin up a new node in the background and let it replay.
- Otherwise, we'll be releasing a migration tool from beta6 -> beta7 - spin this up in the background as a one-shot task, let it run, then you'll be able to do the usual green/blue release for the nodes themselves.
If you'd like early access to these changes, please do let us know!
Beyond that, we're very much in the final stretch towards our General Availability (GA) release - hence the iteration on our underlying data structures before launch. Thank you to everyone who's worked with us throughout this beta phase - your feedback and guidance have been instrumental in steering XTDB v2 towards becoming what we hope is a very exciting and valuable open-source tool.
Get in touch!
As always, we'd love to hear your thoughts - you can get in touch either through GitHub, https://discuss.xtdb.com, or hello@xtdb.com.
Cheers!
James & the XT Team
2.0.0-beta5
We're pleased to announce the release of 2.0.0-beta5! 🚀
In amongst the usual array of bugfixes and minor improvements, 2.0.0-beta5 adds "template-friendly SQL", as well as a 4x reduction in per-transaction overhead for small transactions through the Postgres wire-server.
Template-friendly SQL
Hands up if you've ever struggled to get the right number of commas (or ANDs) when generating a SQL SELECT (or WHERE) clause 😆 🖐
SQL was originally intended to be a language written by hand, and was created long (long) before modern languages had decent support for string templating. While SQL itself is incredibly powerful, adapting it to dynamic contexts in application code has often felt like trying to fit a square peg into a round hole. From string concatenation bugs to unexpected syntax errors caused by stray commas, many developers have experienced the frustration of working with SQL in dynamic templates.
A common workaround has been to introduce tools or libraries to generate SQL programmatically, but these come with their own set of challenges. Developers often find themselves unsure about the exact SQL being generated and sometimes need to cajole these tools into producing the desired query, adding extra complexity and reducing transparency.
With XTDB, we're introducing template-friendly SQL, a modern take on making SQL generation smoother and more forgiving in templated environments, particularly for languages and libraries that already offer strong support for string templating. Whether you're using Kotlin, Python, JavaScript, or Clojure libraries like YeSQL and HugSQL which emphasize 'getting back to SQL', this feature integrates seamlessly into workflows focused on dynamic SQL generation.
Here's what's new:
-
Flexible Comma Placement: Forget the days of meticulously managing commas in your SELECT and WHERE clauses. Taking inspiration from Clojure's "commas as whitespace," trailing commas, multiple commas in a row, or even empty predicates are gracefully handled without causing syntax errors. You can now safely generate SQL without needing additional libraries or worrying about “did I add that comma in the right place?”
```python
query = f"""
  SELECT * FROM my_table
  WHERE
    {"col1 = 'value1'," if condition1 else ''}
    {"col2 = 'value2'," if condition2 else ''}
"""
# The extra commas are ignored, making templates more forgiving and human-readable.
```
-
Query Pipelining: SQL's top-level structure is relatively constrained - it can only have one each of SELECT, FROM, WHERE and GROUP BY, and if you require multiple steps, you need to use subqueries.
Take this query to calculate a frequency distribution, for example:
```sql
SELECT order_count, COUNT(*) AS freq
FROM (SELECT customer, COUNT(*) AS order_count
      FROM orders
      GROUP BY customer) counts
GROUP BY order_count
ORDER BY order_count DESC
```
With XTDB, you can now:
- optionally move the SELECT to the place in the query where it logically runs
- run multiple aggregations in a pipeline
- elide the GROUP BY in this common case
```sql
FROM orders
SELECT customer, COUNT(*) AS order_count
SELECT order_count, COUNT(*) AS freq
ORDER BY order_count DESC
```
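A pleasant consequence for templating: a pipelined query is just an ordered sequence of clauses, so dynamic generation becomes simple list manipulation (a Python sketch; `include_freq` is a hypothetical application flag):

```python
# Assemble a pipelined query clause by clause - no subquery nesting required.
steps = ["FROM orders",
         "SELECT customer, COUNT(*) AS order_count"]

include_freq = True  # hypothetical application flag
if include_freq:
    steps += ["SELECT order_count, COUNT(*) AS freq",
              "ORDER BY order_count DESC"]

query = "\n".join(steps)
assert query.splitlines()[0] == "FROM orders"
assert len(query.splitlines()) == 4
```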
To clarify: as always, the standard SQL structure still works (so your existing tooling shouldn't be affected) - this is an opt-in feature.
These enhancements align with XTDB's philosophy of developer-first tooling, reducing friction and making your code cleaner and more robust. Whether you're dynamically generating queries or simply writing complex SQL templates, template-friendly SQL lets you focus on the logic rather than the syntax.
We'd love to hear how this feature improves your workflow. Try it out and share your feedback with us!
Transaction ingestion performance
We've also spent some time improving XTDB's ingestion performance through the Postgres wire-server.
Our recommendation has always been (and continues to be) to batch your inserts wherever possible - the right granularity will obviously depend on your exact use-case, but in practice batches on the order of ~1k rows seem to be a sweet spot. (This generally applies to whatever database you're using - XT is no different!)
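If your client code currently inserts row-by-row, a generic chunking helper is often all you need to adopt that recommendation (a Python sketch; how you submit each batch depends on your client library):

```python
from itertools import islice

def batches(iterable, size=1000):
    """Yield successive lists of at most `size` items each."""
    it = iter(iterable)
    while chunk := list(islice(it, size)):
        yield chunk

# e.g. 2500 documents split into batches of 1000, 1000 and 500,
# each of which you'd then submit as a single multi-row INSERT
sizes = [len(b) for b in batches(range(2500))]
assert sizes == [1000, 1000, 500]
```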
Of course, though, there will be times when this isn't possible, and we'd like to make that case faster too.
So, since beta4, we've added in a micro-benchmark to measure this performance - ingesting 100k small documents at various batch sizes. On my desktop, we've seen the following improvements:
- batches of 10: 24s -> 6s
- batches of 100: 8s -> 1.8s
- batches of 1000: 6s -> 1.2s
How? Well:
- we found a case in Apache Arrow where one of the vectors had an O(n²) operation in its copy implementation
- we've spent some time optimising our low-level Postgres wire implementation - down to the buffers, bytes and sockets
- we're now able to recognise another class of INSERT queries that don't need to read the database to make their writes.
Again, do give it a spin, and let us know what you find on your hardware!
Elsewhere
Two breaking changes to tell you about:
- If you are using XT in your Clojure process, we've moved our time reader-macros off the `#time` namespace, to avoid clashing with Henry's library. You can either depend on that library specifically, or replace `#time/` with `#xt/`.
- The Postgres wire-server and FlightSQL server now start on an unused port by default (rather than 5432/9832), to avoid clashing with other processes you may have running. To restore the old behaviour, you'll need to explicitly specify the port in your configuration. (The official Docker images remain on the original ports.)
As always, we'd love to hear your thoughts - you can get in touch either through GitHub, https://discuss.xtdb.com, or hello@xtdb.com.
Cheers!
James & the XT Team
2.0.0-beta4
Welcome to 2.0.0-beta4 🚀
This release contains a couple of larger changes for us to tell you about, a few breaking changes, and a few bugfixes. As always, for a full list, see the GitHub milestone.
PATCH
This release introduces a new PATCH operation: give it some documents; existing documents will be merged, new documents will be inserted:
```sql
INSERT INTO users RECORDS {_id: 'jms', given_name: 'James'}

PATCH INTO users RECORDS {_id: 'jms', last_name: 'Henderson'} [, ...]

SELECT * FROM users;
-- => {_id: 'jms', given_name: 'James', last_name: 'Henderson'}
```

The XTQL equivalent:

```clojure
[:patch-docs :users {:xt/id "jms", :last-name "Henderson"}, & ...]
```

Patch also works across time - given `PATCH INTO users FOR VALID_TIME FROM <time> TO <time> RECORDS ...` (`[:patch-docs {:into :users, :valid-from <time>, :valid-to <time>} ...]`), it will upsert documents for the given period of time.
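For intuition, at a single point in time the merge behaves like a shallow map merge keyed on `_id` - a Python sketch of the observable behaviour (not XTDB internals):

```python
existing = {'_id': 'jms', 'given_name': 'James'}
patch = {'_id': 'jms', 'last_name': 'Henderson'}

# PATCH keeps the existing fields and adds/overwrites the patched ones
merged = {**existing, **patch}
assert merged == {'_id': 'jms', 'given_name': 'James', 'last_name': 'Henderson'}
```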
Monitoring
We've been hard at work on XT's operational story - have a look at these two guides for AWS and Azure.
Here are a couple of examples of the Grafana dashboards that are available in the repo.
Breaking changes:
- [#3683] Connections through Postgres tools must now specify `xtdb` as the database name.
- [#3864] XT 2.0.0-beta4 nodes now cope with the incoming system time not always increasing. As a result, in production deployments where this may happen, this release must not be upgraded in a blue/green manner - shut down all your beta3 nodes, then start up beta4 nodes.
  - `submit-tx` now returns just the transaction id, because the system-time of the transaction isn't fixed until it's indexed - `execute-tx` still returns the system-time.
  - `:after-tx` -> `:after-tx-id`
- [#3877] `:at-tx` -> `:snapshot-time`, accepting only the system-time; `SETTING BASIS` -> `SETTING SNAPSHOT_TIME`
Elsewhere:
- [#3852] We've added initial authentication support through `CREATE USER`! We're currently planning authorization support with our design partners (expect a bigger announcement once that's landed) - please do chime in if you'd like to join the discussion.
- [#3844] `xtdb-jdbc` library of helper functions for use with Clojure's next.jdbc
- [#3855] `PREPARE` and `EXECUTE` commands to prepare statements in clients that don't otherwise support it.
- [#3242] Handling multiple statements in a single message (`;`-delimited)
- [#3917] `SHOW WATERMARK` and `SET WATERMARK` to show and set the watermark of a connection.
- `/healthz/started` endpoint waits for the node to catch up before returning a 200 status.
- and many more...
As always, let us know what you think, or if you run into any issues! And, if we don't speak before, hope you all have a great Christmas and a happy and healthy 2025 🎄
James, Jeremy & the XT Team