k_yrs_go is a database server for YJS documents. It works on top of Postgres and Redis.
k_yrs_go uses binary Redis queues as I/O buffers for YJS document updates, and uses the following PG table to store the updates:
CREATE TABLE IF NOT EXISTS k_yrs_go_yupdates_store (
id TEXT PRIMARY KEY,
doc_id TEXT NOT NULL,
data BYTEA NOT NULL
);
CREATE INDEX IF NOT EXISTS k_yrs_go_yupdates_store_doc_id_idx ON k_yrs_go_yupdates_store (doc_id);
Rows in k_yrs_go_yupdates_store undergo compaction when the state for a document is fetched, if the number of update rows for the doc_id is greater than 100. Compaction happens in a serializable transaction, and the combined yupdate is inserted into the table only when the number of deleted yupdates equals the number that was fetched from the db.
Reads and Writes also happen in serializable transactions. From everything I have read about databases, Reads, Writes, and Compactions in k_yrs_go should therefore be consistent with each other.
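The compaction rule described above can be sketched in TypeScript against an in-memory stand-in for the table. This is only an illustration and not the server's actual code: `Row`, `compactIfNeeded`, `newId`, and the injected `merge` function are hypothetical names, and copying the `Map` merely imitates the rollback behaviour of a serializable transaction. With yjs, the role of `merge` could be played by `Y.mergeUpdates`.

```typescript
// Hypothetical in-memory model of a row in k_yrs_go_yupdates_store.
type Row = { id: string; docId: string; data: Uint8Array };

// Threshold taken from the text above: more than 100 rows triggers compaction.
const COMPACTION_THRESHOLD = 100;

// Compacts all rows of `docId` into one row when there are more than
// `threshold` of them. Returns true only if the compaction was "committed".
function compactIfNeeded(
  store: Map<string, Row>,
  docId: string,
  merge: (updates: Uint8Array[]) => Uint8Array, // assumed merge helper
  newId: string,
  threshold: number = COMPACTION_THRESHOLD,
): boolean {
  const fetched = Array.from(store.values()).filter((r) => r.docId === docId);
  if (fetched.length <= threshold) return false;

  const combined = merge(fetched.map((r) => r.data));

  // Work on a copy so a failed check leaves the store untouched,
  // imitating a transaction rollback.
  const next = new Map(store);
  let deleted = 0;
  for (const row of fetched) {
    if (next.delete(row.id)) deleted++;
  }

  // Insert the combined update only if every fetched row was deleted.
  if (deleted !== fetched.length) return false;
  next.set(newId, { id: newId, docId, data: combined });

  // "Commit": replace the contents of the store with the new state.
  store.clear();
  next.forEach((row, id) => store.set(id, row));
  return true;
}
```

In this single-threaded model the deleted count always matches the fetched count; the check matters in a real database, where a concurrent transaction may have removed some of the rows in the meantime.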
Max document size supported is 1 GB. The test suite contains a 'large doc'
test which tests persistence for 100MB docs.
import axios from 'axios';
import * as Y from 'yjs';
import { v4 as uuid } from 'uuid'; // assumed source of `uuid()`
const api = axios.create({ baseURL: env.SERVER_URL }); // `env.SERVER_URL` is where `k_yrs_go/server` is deployed
const docId = uuid();
const ydoc = new Y.Doc();
// WRITE: persist every local update to the server
ydoc.on('update', async (update: Uint8Array, origin: any, doc: Y.Doc) => {
  await api.post<Uint8Array>(`/docs/${docId}/updates`, update, {headers: {'Content-Type': 'application/octet-stream'}});
});
// READ
const response = await api.get<ArrayBuffer>(`/docs/${docId}/updates`, { responseType: 'arraybuffer' });
const update = new Uint8Array(response.data);
const ydoc2 = new Y.Doc();
Y.applyUpdate(ydoc2, update);
git clone --recurse-submodules git@github.com:kapv89/k_yrs_go.git
- Install docker and docker-compose
- Make sure you can run docker without sudo.
- Install go
- Install rust
- Install node.js v20.10.0+
- Install tsx globally: npm i -g tsx
- Install turbo: npm i -g turbo
cd k_yrs_go
npm ci
turbo run dev
turbo run test
[Images: latencies.png, system_config.png]
Seems to be very fast on my system.
Tests are written in TypeScript with actual YJS docs. They can be found in test/test.ts.
To run the tests against the prod binary:
- First start the dev infra:
turbo run dev#dev
- Run the production binary on dev infra:
turbo run server
- Run the test suite:
turbo run test
If you want to supply custom env-params to tests:
- First start the dev infra:
turbo run dev#dev
- Run the production binary on dev infra:
turbo run server
cd test
- Supply env-params and run the npm run test command. Example:
RW_ITERS=3 COMPACTION_ITERS=0 CONSISTENCY_SIMPLE_ITERS=0 CONSISTENCY_LOAD_TEST_ITERS=0 npm run test
Available env params to tweak tests are (with default values):
{
RW_ITERS: 1,
RW_Y_OPS_WAIT_MS: 0,
COMPACTION_ITERS: 1,
COMPACTION_YDOC_UPDATE_INTERVAL_MS: 0,
COMPACTION_YDOC_UPDATE_ITERS: 10000,
COMPACTION_Y_OPS_WAIT_MS: 0,
CONSISTENCY_SIMPLE_ITERS: 1,
CONSISTENCY_SIMPLE_READTIMEOUT_MS: 0,
CONSISTENCY_SIMPLE_YDOC_UPDATE_ITERS: 10000,
CONSISTENCY_LOAD_TEST_ITERS: 1,
CONSISTENCY_LOAD_YDOC_UPDATE_ITERS: 10000,
CONSISTENCY_LOAD_YDOC_UPDATE_TIMEOUT_MS: 2,
CONSISTENCY_LOAD_READ_PER_N_WRITES: 5,
CONSISTENCY_LOAD_YDOC_READ_TIMEOUT_MS: 3,
}
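How test/test.ts actually reads these values is not shown here, so the following is only a minimal sketch of one way to overlay environment variables on such defaults. `DEFAULTS` lists a subset of the params above, and `readParams` is a hypothetical helper, not part of the project.

```typescript
// Default values copied from the list above (subset shown for brevity).
const DEFAULTS = {
  RW_ITERS: 1,
  COMPACTION_ITERS: 1,
  CONSISTENCY_SIMPLE_ITERS: 1,
  CONSISTENCY_LOAD_TEST_ITERS: 1,
};

type Params = typeof DEFAULTS;

// Hypothetical helper: returns the defaults, overridden by any values
// present in `env`. Rejects anything that is not a non-negative integer.
function readParams(
  env: Record<string, string | undefined>,
  defaults: Params = DEFAULTS,
): Params {
  const out: Params = { ...defaults };
  for (const key of Object.keys(defaults) as (keyof Params)[]) {
    const raw = env[key];
    if (raw === undefined || raw === '') continue; // keep the default
    const n = Number(raw);
    if (!Number.isInteger(n) || n < 0) {
      throw new Error(`${key} must be a non-negative integer, got "${raw}"`);
    }
    out[key] = n;
  }
  return out;
}
```

In a Node.js script this could be called as `readParams(process.env)`, so that `RW_ITERS=3 COMPACTION_ITERS=0 npm run test` overrides only those two values.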
There are 5 types of tests:
The Read-Write test checks persistence of 2 operations on a simple list, and ensures that reading them back is consistent. Relevant env params (with default values) are:
{
RW_ITERS: 1, // number of times the read-write test suite should be run
RW_Y_OPS_WAIT_MS: 0, // ms of wait between (rw) operations on yjs docs
}
The Compaction test writes a large number of updates to a yjs doc, then performs the following checks:
- Checks that the number of rows in the k_yrs_go_yupdates_store table for the test doc_id is > 100 after the writes.
- Fetches 2 yjs updates for doc_id within the same millisecond (while compaction is happening), loads them into 2 other yjs docs, and checks that these new yjs docs are consistent with the original yjs doc and that both responses within the same millisecond are consistent with each other.
- Checks that the number of rows in the k_yrs_go_yupdates_store table for the test doc_id is <= 100 (compaction has happened).
Relevant env params (with default values) are:
{
COMPACTION_ITERS: 1, // number of times the compaction test-suite should be run
COMPACTION_YDOC_UPDATE_INTERVAL_MS: 0, // ms of wait between performing 2 update operations to the test yjs doc
COMPACTION_YDOC_UPDATE_ITERS: 10000, // number of updates to be performed on the test yjs doc
COMPACTION_Y_OPS_WAIT_MS: 0, // ms of wait between different compaction stages
}
In the simple consistency test, the following steps happen in sequence in a loop:
- An update is written to test yjs doc, and gets persisted to the db server
- State of doc is read back from the db server, and applied to a new yjs doc
- The new yjs doc is compared to be consistent with the test yjs doc
- Go back to #1
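The loop above can be sketched generically in TypeScript. This is not the code from test/test.ts: `runSimpleConsistency` and its callback parameters are hypothetical, and the yjs and HTTP details are injected so the sketch stays self-contained.

```typescript
const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical write-then-read loop. Returns the index of the first update
// after which the remote state disagreed with the local one, or -1 if every
// read was consistent.
async function runSimpleConsistency<U, S>(
  updates: U[],
  applyLocal: (u: U) => S,            // apply to the test doc, return its state
  persist: (u: U) => Promise<void>,   // send the update to the db server
  readRemote: () => Promise<S>,       // fetch state and load it into a new doc
  isConsistent: (local: S, remote: S) => boolean,
  readTimeoutMs: number = 0,          // plays the role of CONSISTENCY_SIMPLE_READTIMEOUT_MS
): Promise<number> {
  for (let i = 0; i < updates.length; i++) {
    const local = applyLocal(updates[i]);
    await persist(updates[i]);
    if (readTimeoutMs > 0) await sleep(readTimeoutMs);
    const remote = await readRemote();
    if (!isConsistent(local, remote)) return i;
  }
  return -1;
}
```

With yjs, `persist` could POST the update as in the client example, and `readRemote` could apply the fetched bytes to a fresh `Y.Doc` before comparing contents.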
Relevant env params (with default values) are:
{
CONSISTENCY_SIMPLE_ITERS: 1, // number of times the simple consistency test should be run
CONSISTENCY_SIMPLE_READTIMEOUT_MS: 0, // ms to wait before reading yjs doc state from db server after a write to test yjs doc
CONSISTENCY_SIMPLE_YDOC_UPDATE_ITERS: 10000, // number of updates to be applied to the test yjs doc
}
The load consistency test tries to find the limits of how consistent writes and reads are for a frequently updated document which is also frequently fetched. This is important for scenarios where a new user requests a document that is being frequently updated by multiple other users, and you need to ensure that they get the latest state.
Relevant env params (with default values) are:
{
CONSISTENCY_LOAD_TEST_ITERS: 1, // number of times the load consistency test should be run
CONSISTENCY_LOAD_YDOC_UPDATE_ITERS: 10000, // number of updates to be applied to the test yjs doc
CONSISTENCY_LOAD_YDOC_UPDATE_TIMEOUT_MS: 2, // ms to wait before applying an update to the test yjs doc
CONSISTENCY_LOAD_READ_PER_N_WRITES: 5, // number of writes after which consistency of a read from the db server should be checked
CONSISTENCY_LOAD_YDOC_READ_TIMEOUT_MS: 3, // ms to wait after an update before reading yjs doc state from db server and verifying its consistency
}
I wasn't able to reach better (and stable) consistency-under-load numbers than these on my local machine.
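As a rough illustration of how the load-test parameters above could interact, here is a simplified, sequential TypeScript sketch. It is an assumption about the shape of the test, not the code in test/test.ts: `runLoadConsistency` and its callbacks are hypothetical names.

```typescript
const delay = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

interface LoadOptions {
  updateTimeoutMs: number; // wait before each write
  readPerNWrites: number;  // verify a read after every N writes
  readTimeoutMs: number;   // wait after the write before reading
}

// Hypothetical load loop: writes continuously and checks a read from the
// server only after every Nth write. Returns how many checks ran and how
// many of them found an inconsistency.
async function runLoadConsistency<S>(
  totalWrites: number,
  writeOnce: (i: number) => Promise<S>, // apply + persist update i, return local state
  readRemote: () => Promise<S>,
  isConsistent: (local: S, remote: S) => boolean,
  opts: LoadOptions,
): Promise<{ checks: number; failures: number }> {
  let checks = 0;
  let failures = 0;
  for (let i = 1; i <= totalWrites; i++) {
    await delay(opts.updateTimeoutMs);
    const local = await writeOnce(i);
    if (i % opts.readPerNWrites !== 0) continue; // not a checked write
    await delay(opts.readTimeoutMs);
    const remote = await readRemote();
    checks++;
    if (!isConsistent(local, remote)) failures++;
  }
  return { checks, failures };
}
```

Lowering `readTimeoutMs` or `updateTimeoutMs` in such a loop puts more pressure on the write path, which is the kind of tuning the default values above reflect.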
- The large doc test writes data to the test yjs doc until it reaches 100MB in size, all the while persisting updates to the server.
- Then it reads the data back from server twice - once before compaction, and once after compaction, and creates 2 new yjs docs.
- The 3 yjs docs are compared to be consistent with each other.
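The write phase of the steps above can be sketched as a loop that only measures the document every N iterations, since measuring is expensive for large docs. `fillUntilSize` and its callbacks are hypothetical, and treating 1 MB as 1024 * 1024 bytes is an assumption. With yjs, `currentSizeBytes` could for example return `Y.encodeStateAsUpdate(doc).byteLength`.

```typescript
// Hypothetical write loop: keeps writing until the measured size reaches
// `maxSizeMb`, measuring only every `checkEvery` iterations.
// Returns the number of writes performed.
function fillUntilSize(
  writeOnce: () => void,          // perform one write on the test doc
  currentSizeBytes: () => number, // expensive size measurement
  maxSizeMb: number,              // plays the role of LARGE_DOC_MAX_DOC_SIZE_MB
  checkEvery: number,             // plays the role of LARGE_DOC_CHECK_DOC_SIZE_PER_ITER
): number {
  if (!Number.isInteger(checkEvery) || checkEvery < 1) {
    throw new Error('checkEvery must be a positive integer');
  }
  const maxBytes = maxSizeMb * 1024 * 1024; // assumption: binary megabytes
  let iters = 0;
  while (true) {
    writeOnce();
    iters++;
    // Skip the expensive measurement on most iterations.
    if (iters % checkEvery === 0 && currentSizeBytes() >= maxBytes) {
      return iters;
    }
  }
}
```

Because the size is sampled only periodically, the doc may overshoot the target by up to `checkEvery - 1` writes.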
Relevant env params (with default values) are:
{
LARGE_DOC_TEST_ITERS: 1, // number of times the large doc test should be run
LARGE_DOC_MAX_DOC_SIZE_MB: 100, // max size of test yjs doc. the test doc will be written to till it reaches this size
LARGE_DOC_CHECK_DOC_SIZE_PER_ITER: 10000, // checking yjs doc size becomes an expensive operation quickly as it grows more than 10MB, hence it is checked per these many iters
LARGE_DOC_YDOC_WRITE_INTERVAL_MS: 0, // interval between 2 write operations to the test yjs doc
LARGE_DOC_YDOC_READ_TIMEOUT_MS: 0 // time to wait before reading yjs doc back after all write requests have completed
}
If you are running the dev setup, stop it. It's gonna be useless after build
runs because the C ffi files will get refreshed.
turbo run build
Server binary will be available at server/server. You can deploy this server binary in a horizontally scalable manner, like a normal API server over a Postgres DB and a Redis DB, and things will work correctly.
You can see an example of running in prod in server/server.sh. Tweak it however you like.
You'll also need the following generated files co-located with the server binary, in a directory named db:
server/db/libyrs.a
server/db/libyrs.h
server/db/libyrs.so
Directory structure for running the prod binary using server.sh should look something like this:
deployment/
|- .env
|- server
|- server.sh
|- db/
   |- libyrs.a
   |- libyrs.h
   |- libyrs.so
If you want to run the prod binary with default dev infra, you can do the following:
- Spin up dev infra:
turbo run dev#dev
- Run the prod server binary
turbo run server
- Optionally, run the test-suite
turbo run test
See the file server/.env. You can tweak it however you want.
To make sure tests still run after tweaking server/.env, you'd need to tweak test/.env to match.
Relevant ones are:
SERVER_PORT=3000
PG_URL=postgres://dev:dev@localhost:5432/k_yrs_dev?sslmode=disable
REDIS_URL=redis://localhost:6379
DEBUG=true
REDIS_QUEUE_MAX_SIZE=1000