-
Notifications
You must be signed in to change notification settings - Fork 316
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
graft sqld #465
Merged
Merged
graft sqld #465
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This is needed for Atlas integration. Signed-off-by: Piotr Jastrzebski <[email protected]>
Co-authored-by: Doug Stevenson <[email protected]>
Signed-off-by: Piotr Jastrzebski <[email protected]>
Improve readmes
When registering WAL methods and later opening a connection with them, sqld is currently subject to race condition, where multiple threads can register methods by the same name, and then look up the same instance of them. That's dangerous, because our methods hold state. The short-term solution for this problem is to make register+find globally atomic. A proper solution will allow libsql_open() to take a pointer to user state, making any race conditions moot.
If the WAL methods are not deregistered from a global libSQL table before dropping them, we end up with having a dangling pointer in the libSQL structure. It often worked, because the freed memory just didn't manage to get overwritten yet, but it was a time bomb nonetheless. The scenario was as follows: 1. Fiber A registers WAL methods W1 2. Fiber A frees WAL methods W1, but does not deregister them from the libSQL global data structure 3. Fiber B registers WAL methods W2, and registration involves iterating over the data structure that contains a dangling pointer, after the methods were freed by (2.)
Secure mode has a performance and overhead penalty, and we don't really need to be robust against memory reuse attacks in sqld. Simple measurements show that secure mode can bump resident memory usage as much as 250%, plus there's the cpu cost of encrypting freelists and other features that they offer.
libSQL has no notion of checking how much physical memory is available, if there's swap space configured, etc. Normally it limits its own allocations to 2GiB, but on small edge machines it can already prove to be too much to handle. To remedy that, two command-line options are added to sqld - `soft_heap_limit_mb` and `hard_heap_limit_mb`. Soft heap limit hints to libSQL that it should not allocate once the limit is reached and put pressure on shrinking caches, but it will choose to overallocate rather than fail with SQLITE_NOMEM error. Hard limit bails out once an allocation is attempted past the limit. While we could automatically detect the hardware setup on boot, command-line options are good enough for a start to avoid out-of-memory (OOM) conditions. A good rule of thumb would be to set the soft limit to, say, 70% of available physical memory, and hard limit to 90%, leaving some leeway for the operating system. Local tests imply that libSQL is really good at keeping allocations above even soft limit, with a negligible error margin.
Due to how sqld, one of the prime users of bottomless, works, we should ignore passive checkpoints instead of upgrading them to full checkpoints. Reason is, sqld uses short-standing connections per each request, and we certainly don't want to checkpoint multiple times per second.
It can be recovered once we decide to make it production-ready, and meanwhile let's drop it so that we end up with cleaner code.
* make WalHook stateless The WalHook cannot be statefull. Instead, it ties a WalHook implementation to some context, and provides a way to safely register methods, and retrieve the passed context fromt the Wal pointer. * update LibsqlDb to use new WalHooks * logger hook * replica hook * glue stuff
* Propose a stateful Hrana-over-HTTP API * Separate WebSocket specific parts of Hrana into a submodule * Implement the stream mechanism for HTTP API v2 * Implement stream expiration * Refactor and document * Report a nice error when wrong step index is used in BatchCond * Small fixes * Address comments from code review * Fixup Cargo.lock * Add the --http-self-url command-line argument
Since the expected size of the values vector is known, let's allocate its memory once instead of relying on reallocations to grow the vector.
* wal_hook: add on_savepoint_undo and on_checkpoint They will be overwritten along with bottomless integration * treewide: integrate bottomless as WAL hook From now on, bottomless can be enabled in sqld as a regular WalHook. TODO: the main functionality, i.e. actually replicating frames is not hooked to anything yet. It will be done, but it should be heavily considered to just queue writes with an unbounded channel (which allows sync insertions), so that everything happens in the background. * treewide: replicate and checkpoint frames with bottomless Right now the replication and checkpointing is sync and blocking - that will need to be rewritten to async later.
* introduce QueryResultBuilder * hrana query result builder * http query result builder * posgres result builder * update Database trait * update all modules to use QueryResultBuilder * add tests * cleanup unused types * bump toolchain * cargo fmt * remove postgres * review edits
* add savepoint and release stmt categories * forbid savepoints in legacy http
This reverts be7386 and removes decoding limits for the `ReplicationLogProxyClient` so that when replication requests get forwarded to the primary and it returns a full batch of `1024` frames it does not trigger the 4mb decoding limit set by default for tonic gRPC clients.
* Group all checks together in the same workflow Reusing artifacts ends up being faster than parallelizing with the current CI resources. * Set same RUSTFLAGS in both jobs && remove unnecessary steps * Cache rust artifacts
This will allow ARMv7 architecture users to compile sqld. Wasmtime currently does not have support for that arch.
* wip * add prometheus metrics probes * expose metrics admin endpoint * move metric route to /metrics * use histogram macro
* Optimize runtime container layers * Use Github Actions cache for Docker * Move rm -rf /var/lib/apt/lists/* to its own line
* add same thread injector * refactor Frame API; primary return commit frame for snapshots * hook new injector * add test assets * handle fatal replication error * use persistent replication table * add doc commen to snapshot frames_iter_from * review fixes * fmt * handle potential injector execute error
While testing multitenancy, it's more convenient to use a wildcard domain rather than edit /etc/hosts manually.
* connection: fix various comment, code var and doc typos * replication: fix various comment, code var and doc typos * namespace: fix various comment, code var and doc typos * metrics: fix typo in metric docstring * hrana/stmt: fix code typo verion -> version * query_result_builder: fix code typo weigths -> weights * h2c: fix typo in docstring * query_analysis: fix typo in docstring
* Add embedded replica test * sqld: Add `extended_code` support
For specific edge cases, a 32MiB threshold is a little small. Let's bump it to at least 256MiB of data and inform about vacuuming with info log level.
There's no reason I'm aware of to only make checkpoint-interval-s option effective if bottomless is enabled. Botomlessless instances should also be able to checkpoint periodically.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
graft sqld into
libsql-server