graft sqld #465

MarinPostma · 2023-10-17T10:20:19Z

graft sqld into libsql-server

This is needed for Atlas integration. Signed-off-by: Piotr Jastrzebski <[email protected]>

Co-authored-by: Doug Stevenson <[email protected]>

Signed-off-by: Piotr Jastrzebski <[email protected]>

Improve readmes

When registering WAL methods and later opening a connection with them, sqld is currently subject to race condition, where multiple threads can register methods by the same name, and then look up the same instance of them. That's dangerous, because our methods hold state. The short-term solution for this problem is to make register+find globally atomic. A proper solution will allow libsql_open() to take a pointer to user state, making any race conditions moot.

If the WAL methods are not deregistered from a global libSQL table before dropping them, we end up with having a dangling pointer in the libSQL structure. It often worked, because the freed memory just didn't manage to get overwritten yet, but it was a time bomb nonetheless. The scenario was as follows: 1. Fiber A registers WAL methods W1 2. Fiber A frees WAL methods W1, but does not deregister them from the libSQL global data structure 3. Fiber B registers WAL methods W2, and registration involves iterating over the data structure that contains a dangling pointer, after the methods were freed by (2.)

Secure mode has a performance and overhead penalty, and we don't really need to be robust against memory reuse attacks in sqld. Simple measurements show that secure mode can bump resident memory usage as much as 250%, plus there's the cpu cost of encrypting freelists and other features that they offer.

Ref: #169

libSQL has no notion of checking how much physical memory is available, if there's swap space configured, etc. Normally it limits its own allocations to 2GiB, but on small edge machines it can already prove to be too much to handle. To remedy that, two command-line options are added to sqld - `soft_heap_limit_mb` and `hard_heap_limit_mb`. Soft heap limit hints to libSQL that it should not allocate once the limit is reached and put pressure on shrinking caches, but it will choose to overallocate rather than fail with SQLITE_NOMEM error. Hard limit bails out once an allocation is attempted past the limit. While we could automatically detect the hardware setup on boot, command-line options are good enough for a start to avoid out-of-memory (OOM) conditions. A good rule of thumb would be to set the soft limit to, say, 70% of available physical memory, and hard limit to 90%, leaving some leeway for the operating system. Local tests imply that libSQL is really good at keeping allocations above even soft limit, with a negligible error margin.

Due to how sqld, one of the prime users of bottomless, works, we should ignore passive checkpoints instead of upgrading them to full checkpoints. Reason is, sqld uses short-standing connections per each request, and we certainly don't want to checkpoint multiple times per second.

It can be recovered once we decide to make it production-ready, and meanwhile let's drop it so that we end up with cleaner code.

* make WalHook stateless The WalHook cannot be statefull. Instead, it ties a WalHook implementation to some context, and provides a way to safely register methods, and retrieve the passed context fromt the Wal pointer. * update LibsqlDb to use new WalHooks * logger hook * replica hook * glue stuff

* Propose a stateful Hrana-over-HTTP API * Separate WebSocket specific parts of Hrana into a submodule * Implement the stream mechanism for HTTP API v2 * Implement stream expiration * Refactor and document * Report a nice error when wrong step index is used in BatchCond * Small fixes * Address comments from code review * Fixup Cargo.lock * Add the --http-self-url command-line argument

Since the expected size of the values vector is known, let's allocate its memory once instead of relying on reallocations to grow the vector.

* wal_hook: add on_savepoint_undo and on_checkpoint They will be overwritten along with bottomless integration * treewide: integrate bottomless as WAL hook From now on, bottomless can be enabled in sqld as a regular WalHook. TODO: the main functionality, i.e. actually replicating frames is not hooked to anything yet. It will be done, but it should be heavily considered to just queue writes with an unbounded channel (which allows sync insertions), so that everything happens in the background. * treewide: replicate and checkpoint frames with bottomless Right now the replication and checkpointing is sync and blocking - that will need to be rewritten to async later.

* introduce QueryResultBuilder * hrana query result builder * http query result builder * posgres result builder * update Database trait * update all modules to use QueryResultBuilder * add tests * cleanup unused types * bump toolchain * cargo fmt * remove postgres * review edits

* add savepoint and release stmt categories * forbid savepoints in legacy http

This reverts be7386 and removes decoding limits for the `ReplicationLogProxyClient` so that when replication requests get forwarded to the primary and it returns a full batch of `1024` frames it does not trigger the 4mb decoding limit set by default for tonic gRPC clients.

* Group all checks together in the same workflow Reusing artifacts ends up being faster than parallelizing with the current CI resources. * Set same RUSTFLAGS in both jobs && remove unnecessary steps * Cache rust artifacts

This will allow ARMv7 architecture users to compile sqld. Wasmtime currently does not have support for that arch.

* wip * add prometheus metrics probes * expose metrics admin endpoint * move metric route to /metrics * use histogram macro

* Optimize runtime container layers * Use Github Actions cache for Docker * Move rm -rf /var/lib/apt/lists/* to its own line

* add same thread injector * refactor Frame API; primary return commit frame for snapshots * hook new injector * add test assets * handle fatal replication error * use persistent replication table * add doc commen to snapshot frames_iter_from * review fixes * fmt * handle potential injector execute error

While testing multitenancy, it's more convenient to use a wildcard domain rather than edit /etc/hosts manually.

* connection: fix various comment, code var and doc typos * replication: fix various comment, code var and doc typos * namespace: fix various comment, code var and doc typos * metrics: fix typo in metric docstring * hrana/stmt: fix code typo verion -> version * query_result_builder: fix code typo weigths -> weights * h2c: fix typo in docstring * query_analysis: fix typo in docstring

* Add embedded replica test * sqld: Add `extended_code` support

For specific edge cases, a 32MiB threshold is a little small. Let's bump it to at least 256MiB of data and inform about vacuuming with info log level.

This reverts commit 61b0406.

There's no reason I'm aware of to only make checkpoint-interval-s option effective if bottomless is enabled. Botomlessless instances should also be able to checkpoint periodically.

haaawk and others added 30 commits May 17, 2023 14:18

Preserve original SQL string for create table (#401)

fdad86f

This is needed for Atlas integration. Signed-off-by: Piotr Jastrzebski <[email protected]>

Added --version flag (#402)

31a1d00

Co-authored-by: Doug Stevenson <[email protected]>

Bump sqld version to 0.15.0

e8f608c

Signed-off-by: Piotr Jastrzebski <[email protected]>

Improve readmes

8809db9

Merge pull request #404 from CodingDoug/rework-readmes

974c67e

Improve readmes

fix atomic file swap on osx (#413)

1f2bd23

trace SQL commands (#416)

dd8e3b4

sqld: bump rusqlite to revision without memory leak for Wasm UDF (#408)

b9da6b5

Ref: #169

Improve user guide (#418)

c8241fd

Improve formatting of SQLite errors in Hrana (#417)

5f14c5b

split large transactions (#423)

1ef830c

treewide: temporarily nuke mwal_backend (#429)

a1d8510

It can be recovered once we decide to make it production-ready, and meanwhile let's drop it so that we end up with cleaner code.

add debug tools (#430)

bc84a41

remove bottomless tests (#431)

91d5f83

busyproof sqld (#436)

3bd099a

fix off by one replication bug (#440)

4f13fac

libsql: allocate a vector for column values once (#425)

1dfa2f6

Since the expected size of the values vector is known, let's allocate its memory once instead of relying on reallocations to grow the vector.

Prebuilt docker container instructions (#435)

ff0ceb2

bump sqlite3 parser to 0.8.0 (#443)

5642ae3

remove cargo chef from Dockerfile (#441)

c2661e8

avoid panicking when adress already in use (#442)

42bf2a6

MarinPostma and others added 25 commits October 6, 2023 13:59

remove useless docker build in CI (#746)

22fad81

Remove encoding message limit (#747)

2b7b2d4

hrana: Properly catch stmt errors in batch (#748)

1e2d7d1

bump sqld to v0.21.8 (#749)

cd80e60

enable savepoint (#472)

8ff275c

* add savepoint and release stmt categories * forbid savepoints in legacy http

add retry to s3 uploads in bottomles (#750)

4b109fe

bump sqld to v0.21.9 (#753)

7f0fdcf

verify after db restore is done (#751)

57ba103

Make CI faster (#754)

c685bb6

* Group all checks together in the same workflow Reusing artifacts ends up being faster than parallelizing with the current CI resources. * Set same RUSTFLAGS in both jobs && remove unnecessary steps * Cache rust artifacts

sqld: make wasm opt-out (#758)

6f841a6

This will allow ARMv7 architecture users to compile sqld. Wasmtime currently does not have support for that arch.

use anyhow bail to return error (#756)

81bdd50

prom metrics (#760)

9312755

* wip * add prometheus metrics probes * expose metrics admin endpoint * move metric route to /metrics * use histogram macro

disable metrics for tests (#769)

e83af25

Make cargo-chef effective for CI builds (#763)

6cc36c7

* Optimize runtime container layers * Use Github Actions cache for Docker * Move rm -rf /var/lib/apt/lists/* to its own line

docs: add a wildcard domain mention to the user guide (#771)

e855161

While testing multitenancy, it's more convenient to use a wildcard domain rather than edit /etc/hosts manually.

Update README.md (#772)

73b9203

sqld: Add extended_code support (#768)

2417e6f

* Add embedded replica test * sqld: Add `extended_code` support

error: fix typo and normalize messages with a Title starting word (#773)

7cf962c

libsql: vacuum less often (#778)

8652fe6

For specific edge cases, a 32MiB threshold is a little small. Let's bump it to at least 256MiB of data and inform about vacuuming with info log level.

Revert "Same-thread frame injection (#695)" (#775)

85b3358

This reverts commit 61b0406.

sqld: honor checkpoint-interval-s even without bottomless (#781)

e2aeecd

There's no reason I'm aware of to only make checkpoint-interval-s option effective if bottomless is enabled. Botomlessless instances should also be able to checkpoint periodically.

Merge remote-tracking branch 'sqld/main'

98b0b2b

MarinPostma added this pull request to the merge queue Oct 17, 2023

Merged via the queue into main with commit 37b695b Oct 17, 2023
7 checks passed

MarinPostma deleted the graft-sqld branch October 17, 2023 10:22

MarinPostma restored the graft-sqld branch October 17, 2023 10:23

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

graft sqld #465

graft sqld #465

MarinPostma commented Oct 17, 2023 •

edited

Loading

graft sqld #465

graft sqld #465

Conversation

MarinPostma commented Oct 17, 2023 • edited Loading

MarinPostma commented Oct 17, 2023 •

edited

Loading