diff --git a/COPYING b/COPYING
index edc06c580..8a666e874 100644
--- a/COPYING
+++ b/COPYING
@@ -1,12 +1,7 @@
-Code in this repository is non-free. Portions of this code include or
-are derivate works of code published under the BSD 3-clause license. The
-license below applies ONLY TO THOSE PORTIONS. Code authored by employees
-or contractors of EQ Alpha Technology are not licensed for use without express
-written permission of EQ Alpha Technology. All rights are reserved.
-
 Copyright (c) 2006-2020, Salvatore Sanfilippo
 Copyright (C) 2019-2021, John Sully
 Copyright (C) 2020-2021, EQ Alpha Technology Ltd.
+Copyright (C) 2022 Snap Inc. All rights reserved.
 
 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions are met:
diff --git a/README.md b/README.md
index b00867b41..f69ef58ba 100644
--- a/README.md
+++ b/README.md
@@ -2,16 +2,20 @@
 ![CI](https://github.com/JohnSully/KeyDB/workflows/CI/badge.svg?branch=unstable) [![StackShare](http://img.shields.io/badge/tech-stack-0690fa.svg?style=flat)](https://stackshare.io/eq-alpha-technology-inc/eq-alpha-technology-inc)
 
+##### KeyDB is now a part of Snap Inc! Check out the announcement [here](https://docs.keydb.dev/news/2022/05/12/keydb-joins-snap)
+
+##### [Release v6.3.0](https://github.com/EQ-Alpha/KeyDB/releases/tag/v6.3.0) is here with major improvements as we consolidate our Open Source and Enterprise offerings into a single BSD-3 licensed project. See our [roadmap](https://docs.keydb.dev/docs/coming-soon) for details.
+
 ##### Want to extend KeyDB with Javascript? Try [ModJS](https://github.com/JohnSully/ModJS)
 
 ##### Need Help? Check out our extensive [documentation](https://docs.keydb.dev).
 
-##### NEW!!! KeyDB now has a Slack Community Workspace. Click [here](https://docs.keydb.dev/slack/) to learn more and join the KeyDB Community Slack workspace.
+##### KeyDB is on Slack.
Click [here](https://docs.keydb.dev/slack/) to learn more and join the KeyDB Community Slack workspace.
 
 What is KeyDB?
 --------------
 
-KeyDB is a high performance fork of Redis with a focus on multithreading, memory efficiency, and high throughput. In addition to multithreading, KeyDB also has features only available in Redis Enterprise such as [Active Replication](https://github.com/JohnSully/KeyDB/wiki/Active-Replication), [FLASH storage](https://github.com/JohnSully/KeyDB/wiki/FLASH-Storage) support, and some not available at all such as direct backup to AWS S3.
+KeyDB is a high-performance fork of Redis with a focus on multithreading, memory efficiency, and high throughput. In addition to performance improvements, KeyDB offers features such as Active Replication, FLASH Storage, and Subkey Expires. KeyDB has an MVCC architecture that allows you to execute queries such as KEYS and SCAN without blocking the database and degrading performance.
 
 KeyDB maintains full compatibility with the Redis protocol, modules, and scripts. This includes the atomicity guarantees for scripts and transactions. Because KeyDB keeps in sync with Redis development KeyDB is a superset of Redis functionality, making KeyDB a drop in replacement for existing Redis deployments.
 
@@ -30,48 +34,75 @@ KeyDB has a different philosophy on how the codebase should evolve. We feel tha
 Because of this difference of opinion features which are right for KeyDB may not be appropriate for Redis. A fork allows us to explore this new development path and implement features which may never be a part of Redis. KeyDB keeps in sync with upstream Redis changes, and where applicable we upstream bug fixes and changes. It is our hope that the two projects can continue to grow and learn from each other.
 
+Project Support
+-------------------
+
+The KeyDB team maintains this project as part of Snap Inc. KeyDB is used by Snap as part of its caching infrastructure and is fully open sourced.
There is no separate commercial product and no paid support options available. We really value collaborating with the open source community and welcome PRs, bug reports, and open discussion. For community support or to get involved further with the project, check out our community support options [here](https://docs.keydb.dev/docs/support) (Slack, forum, meetup, GitHub issues). Our team monitors these channels regularly.
+
+
 Additional Resources
--------------------
 
-Check out KeyDB's [Docker Image](https://hub.docker.com/r/eqalpha/keydb)
+Try the KeyDB [Docker Image](https://hub.docker.com/r/eqalpha/keydb)
 
 Join us on [Slack](https://docs.keydb.dev/slack/)
 
-Post to the [Community Forum](https://community.keydb.dev)
+Learn more using KeyDB's extensive [documentation](https://docs.keydb.dev)
 
-Learn more through KeyDB's [Documentation & Learning Center](https://docs.keydb.dev)
+Post to our [Community Forum](https://community.keydb.dev)
+
+See the [KeyDB Roadmap](https://docs.keydb.dev/docs/coming-soon) for what's in store
 
 Benchmarking KeyDB
 ------------------
 
-Please note keydb-benchmark and redis-benchmark are currently single threaded and too slow to properly benchmark KeyDB. We recommend using a redis cluster benchmark tool such as [memtier](https://github.com/RedisLabs/memtier_benchmark). Please ensure your machine has enough cores for both KeyDB and memteir if testing locally. KeyDB expects exclusive use of any cores assigned to it.
+Please note keydb-benchmark and redis-benchmark are currently single threaded and too slow to properly benchmark KeyDB. We recommend using a redis cluster benchmark tool such as [memtier](https://github.com/RedisLabs/memtier_benchmark). Please ensure your machine has enough cores for both KeyDB and memtier if testing locally. KeyDB expects exclusive use of any cores assigned to it.
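To make the memtier recommendation above concrete, a typical invocation against a locally running instance might look like the following. The host, port, and load parameters here are illustrative assumptions, not recommendations from this project:

```
memtier_benchmark -s 127.0.0.1 -p 6379 --threads 8 --clients 50 --ratio 1:1 --test-time 60
```

Note that `--threads` here refers to memtier's own load-generation threads; size it so memtier and KeyDB are not competing for the same cores, per the exclusivity note above.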
-For more details on how we benchmarked KeyDB along with performance numbers check out our blog post: [Redis Should Be Multithreaded](https://medium.com/@john_63123/redis-should-be-multi-threaded-e28319cab744?source=friends_link&sk=7ce8e9fe3ec8224a4d27ef075d085457)
 
 New Configuration Options
 -------------------------
 
-With new features comes new options:
+With new features come new options. All other configuration options behave as you'd expect. Your existing configuration files should continue to work unchanged.
 
+```
 server-threads N
 server-thread-affinity [true/false]
+```
+The number of threads used to serve requests. This should be related to the number of queues available in your network hardware, *not* the number of cores on your
+machine. Because KeyDB uses spinlocks to reduce latency, setting this too high will reduce performance. We recommend using 4 here. By default this is set to two.
 
-The number of threads used to serve requests. This should be related to the number of queues available in your network hardware, *not* the number of cores on your machine. Because KeyDB uses spinlocks to reduce latency; making this too high will reduce performance. We recommend using 4 here. By default this is set to one.
-
- scratch-file-path /path
+```
+min-clients-per-thread 50
+```
+The minimum number of clients on a thread before KeyDB assigns new connections to a different thread. Tuning this parameter is a tradeoff between locking overhead and distributing the workload over multiple cores.
 
-If you would like to use the [FLASH backed](https://github.com/JohnSully/KeyDB/wiki/FLASH-Storage) storage this option configures the directory for KeyDB's temporary files. This feature relies on snapshotting to work so must be used on a BTRFS filesystem. ZFS may also work but is untested. With this feature KeyDB will use RAM as a cache and page to disk as necessary. NOTE: This requires special compilation options, see Building KeyDB below.
-
- db-s3-object /path/to/bucket
+```
+replica-weighting-factor 2
+```
+KeyDB will attempt to balance clients across threads evenly; however, replica clients are usually much more expensive than normal clients, so KeyDB will try to assign fewer clients to threads with a replica. The weighting factor is intended to help tune this behavior. A replica weighting factor of 2 means we treat a replica as the equivalent of two normal clients. Adjusting this value may improve performance when replication is used. The best weighting is workload specific: read-heavy workloads should set this to 1, while very write-heavy workloads may benefit from higher numbers.
 
-If you would like KeyDB to dump and load directly to AWS S3 this option specifies the bucket. Using this option with the traditional RDB options will result in KeyDB backing up twice to both locations. If both are specified KeyDB will first attempt to load from the local dump file and if that fails load from S3. This requires the AWS CLI tools to be installed and configured which are used under the hood to transfer the data.
+```
+active-client-balancing yes
+```
+Should KeyDB make active attempts at balancing clients across threads? This can impact performance when accepting new clients. If disabled, the kernel still makes a best effort to distribute connections across threads with SO_REUSEPORT, but it will not be as fair. By default this is enabled.
+```
 active-replica yes
-
+```
 If you are using active-active replication set `active-replica` option to “yes”. This will enable both instances to accept reads and writes while remaining synced. [Click here](https://docs.keydb.dev/docs/active-rep/) to see more on active-rep in our docs section. There are also [docker examples]( https://docs.keydb.dev/docs/docker-active-rep/) on docs.
 
-All other configuration options behave as you'd expect. Your existing configuration files should continue to work unchanged.
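Collected in one place, the threading and replication options described above might look like this in a keydb.conf. The values mirror the recommendations in the text, not the defaults; adjust for your hardware:

```
server-threads 4
server-thread-affinity true
min-clients-per-thread 50
replica-weighting-factor 2
active-client-balancing yes
active-replica yes
```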
+```
+multi-master-no-forward no
+```
+Avoid forwarding RREPLAY messages to other masters? WARNING: This setting is dangerous! You must be certain all masters are connected to each other in a true mesh topology or data loss will occur! This option can be used to reduce multi-master bus traffic.
+
+
+```
+ db-s3-object /path/to/bucket
+```
+If you would like KeyDB to dump and load directly to AWS S3, this option specifies the bucket. Using this option with the traditional RDB options will result in KeyDB backing up twice to both locations. If both are specified, KeyDB will first attempt to load from the local dump file and if that fails load from S3. This requires the AWS CLI tools to be installed and configured, which are used under the hood to transfer the data.
+
 Building KeyDB
 --------------
@@ -104,6 +135,10 @@ To append a suffix to KeyDB program names, use:
 ***Note that the following dependencies may be needed:
 % sudo apt-get install autoconf autotools-dev libnuma-dev libtool
 
+To build with TLS support, use:
+
+    % make BUILD_TLS=yes
+
 Running the tests with TLS enabled (you will need `tcl-tls` installed):
 
@@ -270,24 +305,6 @@ KeyDB works by running the normal Redis event loop on multiple threads. Network
 Unlike most databases the core data structure is the fastest part of the system. Most of the query time comes from parsing the REPL protocol and copying data to/from the network.
 
-Future work:
- - Allow rebalancing of connections to different threads after the connection
- - Allow multiple readers access to the hashtable concurrently
-
-Docker Build
-------------
-Build the latest binaries from the github unstable branch within a docker container. Note this is built for Ubuntu 18.04.
-Simply make a directory you would like to have the latest binaries dumped in, then run the following commmand with your updated path:
-```
-$ docker run -it --rm -v /path-to-dump-binaries:/keydb_bin eqalpha/keydb-build-bin
-```
-You should receive the following files: keydb-benchmark, keydb-check-aof, keydb-check-rdb, keydb-cli, keydb-sentinel, keydb-server
-
-If you are looking to enable flash support with the build (make MALLOC=memkind) then use the following command:
-```
-$ docker run -it --rm -v /path-to-dump-binaries:/keydb_bin eqalpha/keydb-build-bin:flash
-```
-Please note that you will need libcurl4-openssl-dev in order to run keydb. With flash version you may need libnuma-dev and libtool installed in order to run the binaries. Keep this in mind especially when running in a container. For a copy of all our Dockerfiles, please see them on [docs]( https://docs.keydb.dev/docs/dockerfiles/).
 
 Code contributions
 -----------------
diff --git a/pkg/deb/master_changelog b/pkg/deb/master_changelog
index c180411db..5c90577c0 100644
--- a/pkg/deb/master_changelog
+++ b/pkg/deb/master_changelog
@@ -1,3 +1,17 @@
+keydb (6:6.3.0-1distribution_placeholder) codename_placeholder; urgency=medium
+
+ * This release open sources KeyDB Enterprise features into the open source project along with PSYNC for active replication
+ * Partial synchronization for active replication is introduced
+ * MVCC introduced into codebase from KeyDB Enterprise
+ * Async commands added: GET, MGET. These will see performance improvements
+ * KEYS and SCAN commands will no longer be blocking calls
+ * Async Rehash implemented for additional stability and performance
+ * IStorage interface added
+ * In-process background saving (forkless) to comply with maxmemory setting
+ * See v6.3.0 tagged release notes on github for a detailed explanation of these changes
+
+-- Ben Schermel Wed, 11 May 2022 20:00:37 +0000
+
 keydb (6:6.2.2-1distribution_placeholder) codename_placeholder; urgency=medium
 
 * Acquire lock in module.cpp to fix module test break
diff --git a/pkg/rpm/keydb_build/keydb.spec b/pkg/rpm/keydb_build/keydb.spec
index b3f7a2639..9cdb16a03 100755
--- a/pkg/rpm/keydb_build/keydb.spec
+++ b/pkg/rpm/keydb_build/keydb.spec
@@ -27,7 +27,7 @@
 getent group keydb &> /dev/null || \
 groupadd -r keydb &> /dev/null
 getent passwd keydb &> /dev/null || \
 useradd -r -g keydb -d /var/lib/keydb -s /sbin/nologin \
--c 'KeyDB Enterprise Database Server' keydb &> /dev/null
+-c 'KeyDB Database Server' keydb &> /dev/null
 exit 0
 
 #postinstall scriptlet (using /bin/sh):
diff --git a/src/asciilogo.h b/src/asciilogo.h
index f4fbd360e..8cc69a76a 100644
--- a/src/asciilogo.h
+++ b/src/asciilogo.h
@@ -32,7 +32,7 @@ const char *ascii_logo =
 " _ \n"
 " _-(+)-_ \n"
 " _-- / \\ --_ \n"
-" _-- / \\ --_ KeyDB Enterprise %s (%s/%d) %s bit \n"
+" _-- / \\ --_ KeyDB %s (%s/%d) %s bit \n"
 " __-- / \\ --__ \n"
 " (+) _ / \\ _ (+) Running in %s mode\n"
 " | -- / \\ -- | Port: %d\n"
diff --git a/src/config.cpp b/src/config.cpp
index eaa1e2dd9..fc7799e17 100644
--- a/src/config.cpp
+++ b/src/config.cpp
@@ -2777,7 +2777,7 @@ standardConfig configs[] = {
 createBoolConfig("disable-thp", NULL, MODIFIABLE_CONFIG, g_pserver->disable_thp, 1, NULL, NULL),
 createBoolConfig("cluster-allow-replica-migration", NULL, MODIFIABLE_CONFIG, g_pserver->cluster_allow_replica_migration, 1, NULL, NULL),
 createBoolConfig("replica-announced", NULL, MODIFIABLE_CONFIG,
g_pserver->replica_announced, 1, NULL, NULL), - createBoolConfig("enable-async-commands", NULL, MODIFIABLE_CONFIG, g_pserver->enable_async_commands, 1, NULL, NULL), + createBoolConfig("enable-async-commands", NULL, MODIFIABLE_CONFIG, g_pserver->enable_async_commands, 0, NULL, NULL), createBoolConfig("multithread-load-enabled", NULL, MODIFIABLE_CONFIG, g_pserver->multithread_load_enabled, 0, NULL, NULL), createBoolConfig("active-client-balancing", NULL, MODIFIABLE_CONFIG, g_pserver->active_client_balancing, 1, NULL, NULL), diff --git a/src/db.cpp b/src/db.cpp index 5b67a4198..0ad3df99f 100644 --- a/src/db.cpp +++ b/src/db.cpp @@ -3263,10 +3263,11 @@ bool redisDbPersistentData::prefetchKeysAsync(client *c, parsed_command &command dictEntry **table; __atomic_load(&c->db->m_pdict->ht[iht].table, &table, __ATOMIC_RELAXED); if (table != nullptr) { - dictEntry *de = table[hT]; + dictEntry *de; + __atomic_load(&table[hT], &de, __ATOMIC_ACQUIRE); while (de != nullptr) { _mm_prefetch(dictGetKey(de), _MM_HINT_T2); - de = de->next; + __atomic_load(&de->next, &de, __ATOMIC_ACQUIRE); } } if (!dictIsRehashing(c->db->m_pdict)) diff --git a/src/dict.cpp b/src/dict.cpp index e280d6071..b29c0e24b 100644 --- a/src/dict.cpp +++ b/src/dict.cpp @@ -128,6 +128,7 @@ int _dictInit(dict *d, dictType *type, d->pauserehash = 0; d->asyncdata = nullptr; d->refcount = 1; + d->noshrink = false; return DICT_OK; } @@ -204,7 +205,7 @@ int dictMerge(dict *dst, dict *src) if (dictSize(dst) == 0) { - std::swap(*dst, *src); + dict::swap(*dst, *src); std::swap(dst->pauserehash, src->pauserehash); return DICT_OK; } @@ -212,7 +213,7 @@ int dictMerge(dict *dst, dict *src) size_t expectedSize = dictSize(src) + dictSize(dst); if (dictSize(src) > dictSize(dst) && src->asyncdata == nullptr && dst->asyncdata == nullptr) { - std::swap(*dst, *src); + dict::swap(*dst, *src); std::swap(dst->pauserehash, src->pauserehash); } @@ -402,7 +403,7 @@ int dictRehash(dict *d, int n) { 
dictAsyncRehashCtl::dictAsyncRehashCtl(struct dict *d, dictAsyncRehashCtl *next) : dict(d), next(next) {
     queue.reserve(c_targetQueueSize);
-    __atomic_fetch_add(&d->refcount, 1, __ATOMIC_RELEASE);
+    __atomic_fetch_add(&d->refcount, 1, __ATOMIC_ACQ_REL);
     this->rehashIdxBase = d->rehashidx;
 }
@@ -446,6 +447,9 @@ dictAsyncRehashCtl *dictRehashAsyncStart(dict *d, int buckets) {
 }
 
 void dictRehashAsync(dictAsyncRehashCtl *ctl) {
+    if (ctl->abondon.load(std::memory_order_acquire)) {
+        ctl->hashIdx = ctl->queue.size();
+    }
     for (size_t idx = ctl->hashIdx; idx < ctl->queue.size(); ++idx) {
         auto &wi = ctl->queue[idx];
         wi.hash = dictHashKey(ctl->dict, dictGetKey(wi.de));
@@ -455,6 +459,9 @@ void dictRehashAsync(dictAsyncRehashCtl *ctl) {
 }
 
 bool dictRehashSomeAsync(dictAsyncRehashCtl *ctl, size_t hashes) {
+    if (ctl->abondon.load(std::memory_order_acquire)) {
+        ctl->hashIdx = ctl->queue.size();
+    }
     size_t max = std::min(ctl->hashIdx + hashes, ctl->queue.size());
     for (; ctl->hashIdx < max; ++ctl->hashIdx) {
         auto &wi = ctl->queue[ctl->hashIdx];
@@ -465,6 +472,23 @@ bool dictRehashSomeAsync(dictAsyncRehashCtl *ctl, size_t hashes) {
     return ctl->hashIdx < ctl->queue.size();
 }
 
+
+void discontinueAsyncRehash(dict *d) {
+    // We inform our async rehashers and the completion function the results are to be
+    // abandoned. We keep the asyncdata linked in so that dictEntry's are still added
+    // to the GC list. This is because we can't guarantee when the other threads will
+    // stop looking at them.
+    if (d->asyncdata != nullptr) {
+        auto adata = d->asyncdata;
+        while (adata != nullptr && !adata->abondon.load(std::memory_order_relaxed)) {
+            adata->abondon = true;
+            adata = adata->next;
+        }
+        if (dictIsRehashing(d))
+            d->rehashidx = 0;
+    }
+}
+
 void dictCompleteRehashAsync(dictAsyncRehashCtl *ctl, bool fFree) {
     dict *d = ctl->dict;
     assert(ctl->done);
@@ -786,6 +810,8 @@ int _dictClear(dict *d, dictht *ht, void(callback)(void *)) {
         if (callback && (i & 65535) == 0) callback(d->privdata);
 
         if ((he = ht->table[i]) == NULL) continue;
+        dictEntry *deNull = nullptr;
+        __atomic_store(&ht->table[i], &deNull, __ATOMIC_RELEASE);
         while(he) {
             nextHe = he->next;
             if (d->asyncdata && (ssize_t)i < d->rehashidx) {
diff --git a/src/dict.h b/src/dict.h
index 72d50dd2c..9fbe8ce4f 100644
--- a/src/dict.h
+++ b/src/dict.h
@@ -110,12 +110,16 @@ struct dictAsyncRehashCtl {
     std::atomic<bool> abondon { false };
 
     dictAsyncRehashCtl(struct dict *d, dictAsyncRehashCtl *next);
+    dictAsyncRehashCtl(const dictAsyncRehashCtl&) = delete;
+    dictAsyncRehashCtl(dictAsyncRehashCtl&&) = delete;
     ~dictAsyncRehashCtl();
 };
 #else
 struct dictAsyncRehashCtl;
 #endif
 
+void discontinueAsyncRehash(dict *d);
+
 typedef struct dict {
     dictType *type;
     void *privdata;
@@ -125,6 +129,24 @@ typedef struct dict {
     dictAsyncRehashCtl *asyncdata;
     int16_t pauserehash; /* If >0 rehashing is paused (<0 indicates coding error) */
     uint8_t noshrink = false;
+
+#ifdef __cplusplus
+    dict() = default;
+    dict(dict &) = delete; // No Copy Ctor
+
+    static void swap(dict& a, dict& b) {
+        discontinueAsyncRehash(&a);
+        discontinueAsyncRehash(&b);
+        std::swap(a.type, b.type);
+        std::swap(a.privdata, b.privdata);
+        std::swap(a.ht[0], b.ht[0]);
+        std::swap(a.ht[1], b.ht[1]);
+        std::swap(a.rehashidx, b.rehashidx);
+        // Never swap refcount - they are attached to the specific dict obj
+        std::swap(a.pauserehash, b.pauserehash);
+        std::swap(a.noshrink, b.noshrink);
+    }
+#endif
 } dict;
 
 /* If safe is set to 1 this is a safe iterator, that means, you can
call
diff --git a/src/evict.cpp b/src/evict.cpp
index 20ebc9058..719e7a761 100644
--- a/src/evict.cpp
+++ b/src/evict.cpp
@@ -423,7 +423,7 @@ int getMaxmemoryState(size_t *total, size_t *logical, size_t *tofree, float *lev
     if (fPreSnapshot)
         maxmemory = static_cast<size_t>(maxmemory * 0.9);   // derate memory by 10% since we won't be able to free during snapshot
     if (g_pserver->FRdbSaveInProgress())
-        maxmemory = static_cast<size_t>(maxmemory*1.5);
+        maxmemory = static_cast<size_t>(maxmemory*1.2);
 
     /* We may return ASAP if there is no need to compute the level. */
     int return_ok_asap = !maxmemory || mem_reported <= maxmemory;
diff --git a/src/networking.cpp b/src/networking.cpp
index e8e929a20..5b1fe5894 100644
--- a/src/networking.cpp
+++ b/src/networking.cpp
@@ -1810,6 +1810,8 @@ int writeToClient(client *c, int handler_installed) {
     is a replica, so only attempt to do so if that's the case. */
     if (c->flags & CLIENT_SLAVE && !(c->flags & CLIENT_MONITOR) && c->replstate == SLAVE_STATE_ONLINE) {
         std::unique_lock repl_backlog_lock (g_pserver->repl_backlog_lock);
+        // Ensure all writes to the repl backlog are visible
+        std::atomic_thread_fence(std::memory_order_acquire);
 
         while (clientHasPendingReplies(c)) {
             long long repl_end_idx = getReplIndexFromOffset(c->repl_end_off);
@@ -2077,8 +2079,6 @@ int handleClientsWithPendingWrites(int iel, int aof_state) {
         * that may trigger write error or recreate handler. */
         if ((flags & CLIENT_PROTECTED) && !(flags & CLIENT_SLAVE)) continue;
 
-        //std::unique_lock<decltype(c->lock)> lock(c->lock);
-
         /* Don't write to clients that are going to be closed anyway. */
         if (c->flags & CLIENT_CLOSE_ASAP) continue;
 
@@ -2096,6 +2096,7 @@ int handleClientsWithPendingWrites(int iel, int aof_state) {
         /* If after the synchronous writes above we still have data to
          * output to the client, we need to install the writable handler.
*/
+        std::unique_lock<decltype(c->lock)> lock(c->lock);
         if (clientHasPendingReplies(c)) {
             if (connSetWriteHandlerWithBarrier(c->conn, sendReplyToClient, ae_flags, true) == C_ERR) {
                 freeClientAsync(c);
diff --git a/src/rdb.cpp b/src/rdb.cpp
index 9c74914d2..397701abe 100644
--- a/src/rdb.cpp
+++ b/src/rdb.cpp
@@ -1222,6 +1222,8 @@ int rdbSaveInfoAuxFields(rio *rdb, int rdbflags, rdbSaveInfo *rsi) {
     sdsstring val = sdsstring(sdsempty());
     for (auto &msi : rsi->vecmastersaveinfo) {
+        if (msi.masterhost == nullptr)
+            continue;
         val = val.catfmt("%s:%I:%s:%i:%i;",
             msi.master_replid,
             msi.master_initial_offset,
             msi.masterhost.get(),
@@ -3047,7 +3049,9 @@ void rdbLoadProgressCallback(rio *r, const void *buf, size_t len) {
         (r->keys_since_last_callback >= g_pserver->loading_process_events_interval_keys))) {
         rdbAsyncWorkThread *pwthread = reinterpret_cast<rdbAsyncWorkThread*>(r->chksum_arg);
-        bool fUpdateReplication = (g_pserver->mstime - r->last_update) > 1000;
+        mstime_t mstime;
+        __atomic_load(&g_pserver->mstime, &mstime, __ATOMIC_RELAXED);
+        bool fUpdateReplication = (mstime - r->last_update) > 1000;
 
         if (fUpdateReplication) {
             listIter li;
diff --git a/src/replication.cpp b/src/replication.cpp
index fd713e55b..0ef43485b 100644
--- a/src/replication.cpp
+++ b/src/replication.cpp
@@ -264,9 +264,8 @@ void resizeReplicationBacklog(long long newsize) {
         newsize = CONFIG_REPL_BACKLOG_MIN_SIZE;
     if (g_pserver->repl_backlog_size == newsize) return;
 
-    std::unique_lock repl_backlog_lock (g_pserver->repl_backlog_lock);
-
     if (g_pserver->repl_backlog != NULL) {
+        std::unique_lock repl_backlog_lock(g_pserver->repl_backlog_lock);
         /* What we actually do is to flush the old buffer and realloc a new
          * empty one. It will refill with new data incrementally.
* The reason is that copying a few gigabytes adds latency and even @@ -357,7 +356,7 @@ void freeReplicationBacklog(void) { void feedReplicationBacklog(const void *ptr, size_t len) { serverAssert(GlobalLocksAcquired()); const unsigned char *p = (const unsigned char*)ptr; - + std::unique_lock repl_backlog_lock(g_pserver->repl_backlog_lock, std::defer_lock); if (g_pserver->repl_batch_idxStart >= 0) { /* We are lower bounded by the lowest replica offset, or the batch offset start if not applicable */ @@ -417,6 +416,8 @@ void feedReplicationBacklog(const void *ptr, size_t len) { // We need to update a few variables or later asserts will notice we dropped data g_pserver->repl_batch_offStart = g_pserver->master_repl_offset + len; g_pserver->repl_lowest_off = -1; + if (!repl_backlog_lock.owns_lock()) + repl_backlog_lock.lock(); // we need to acquire the lock if we'll be overwriting data that writeToClient may be reading } } } @@ -5599,6 +5600,9 @@ void flushReplBacklogToClients() serverAssert(g_pserver->master_repl_offset - g_pserver->repl_batch_offStart <= g_pserver->repl_backlog_size); serverAssert(g_pserver->repl_batch_idxStart != g_pserver->repl_backlog_idx); + // Repl backlog writes must become visible to all threads at this point + std::atomic_thread_fence(std::memory_order_release); + listIter li; listNode *ln; listRewind(g_pserver->slaves, &li); diff --git a/src/server.cpp b/src/server.cpp index f356bb96e..a480476e3 100644 --- a/src/server.cpp +++ b/src/server.cpp @@ -73,8 +73,8 @@ #endif int g_fTestMode = false; -const char *motd_url = "http://api.keydb.dev/motd/motd_server_pro.txt"; -const char *motd_cache_file = "/.keydb-enterprise-server-motd"; +const char *motd_url = "http://api.keydb.dev/motd/motd_server.txt"; +const char *motd_cache_file = "/.keydb-server-motd"; /* Our shared "common" objects */ @@ -598,7 +598,7 @@ struct redisCommand redisCommandTable[] = { 0,NULL,1,1,1,0,0,0}, {"hget",hgetCommand,3, - "read-only fast async @hash", + "read-only fast @hash", 
0,NULL,1,1,1,0,0,0}, {"hmset",hsetCommand,-4, @@ -606,7 +606,7 @@ struct redisCommand redisCommandTable[] = { 0,NULL,1,1,1,0,0,0}, {"hmget",hmgetCommand,-3, - "read-only fast async @hash", + "read-only fast @hash", 0,NULL,1,1,1,0,0,0}, {"hincrby",hincrbyCommand,4, @@ -630,15 +630,15 @@ struct redisCommand redisCommandTable[] = { 0,NULL,1,1,1,0,0,0}, {"hkeys",hkeysCommand,2, - "read-only to-sort async @hash", + "read-only to-sort @hash", 0,NULL,1,1,1,0,0,0}, {"hvals",hvalsCommand,2, - "read-only to-sort async @hash", + "read-only to-sort @hash", 0,NULL,1,1,1,0,0,0}, {"hgetall",hgetallCommand,2, - "read-only random async @hash", + "read-only random @hash", 0,NULL,1,1,1,0,0,0}, {"hexists",hexistsCommand,3, @@ -650,7 +650,7 @@ struct redisCommand redisCommandTable[] = { 0,NULL,1,1,1,0,0,0}, {"hscan",hscanCommand,-3, - "read-only random async @hash", + "read-only random @hash", 0,NULL,1,1,1,0,0,0}, {"incrby",incrbyCommand,3, @@ -2109,8 +2109,10 @@ void databasesCron(bool fMainThread) { aeAcquireLock(); } - dictCompleteRehashAsync(serverTL->rehashCtl, true /*fFree*/); - serverTL->rehashCtl = nullptr; + if (serverTL->rehashCtl->done.load(std::memory_order_relaxed)) { + dictCompleteRehashAsync(serverTL->rehashCtl, true /*fFree*/); + serverTL->rehashCtl = nullptr; + } } serverAssert(serverTL->rehashCtl == nullptr); @@ -5532,6 +5534,8 @@ sds genRedisInfoString(const char *section) { } unsigned int lruclock = g_pserver->lruclock.load(); + ustime_t ustime; + __atomic_load(&g_pserver->ustime, &ustime, __ATOMIC_RELAXED); info = sdscatfmt(info, "# Server\r\n" "redis_version:%s\r\n" @@ -5574,7 +5578,7 @@ sds genRedisInfoString(const char *section) { supervised, g_pserver->runid, g_pserver->port ? 
g_pserver->port : g_pserver->tls_port,
-            (int64_t)g_pserver->ustime,
+            (int64_t)ustime,
             (int64_t)uptime,
             (int64_t)(uptime/(3600*24)),
             g_pserver->hz.load(),
@@ -6272,10 +6276,7 @@ sds genRedisInfoString(const char *section) {
         if (sections++) info = sdscat(info,"\r\n");
         info = sdscatprintf(info,
             "# KeyDB\r\n"
-            "variant:enterprise\r\n"
-            "license_status:%s\r\n"
             "mvcc_depth:%d\r\n",
-            "OK",
             mvcc_depth
         );
     }
@@ -7607,7 +7608,7 @@ int main(int argc, char **argv) {
         try {
             loadDataFromDisk();
         } catch (ShutdownException) {
-            exit(EXIT_SUCCESS);
+            _Exit(EXIT_SUCCESS);
         }
         if (g_pserver->cluster_enabled) {
@@ -7724,7 +7725,9 @@ int main(int argc, char **argv) {
     g_pserver->garbageCollector.shutdown();
     delete g_pserver->m_pstorageFactory;
 
-    return 0;
+    // Don't return because we don't want to run any global dtors
+    _Exit(EXIT_SUCCESS);
+    return 0; // Ensure we're well formed even though this won't get hit
 }
 
 /* The End */
diff --git a/src/server.h b/src/server.h
index 386f8e754..2a56255ec 100644
--- a/src/server.h
+++ b/src/server.h
@@ -1080,7 +1080,7 @@ class dict_iter : public dict_const_iter
     dict_iter()
         : dict_const_iter(nullptr)
     {}
-    explicit dict_iter(nullptr_t)
+    explicit dict_iter(std::nullptr_t)
         : dict_const_iter(nullptr)
     {}
     explicit dict_iter(dict *d, dictEntry *de)
@@ -1904,7 +1904,8 @@ struct MasterSaveInfo {
             selected_db = 0;
         }
         masterport = mi.masterport;
-        masterhost = sdsstring(sdsdup(mi.masterhost));
+        if (mi.masterhost)
+            masterhost = sdsstring(sdsdup(mi.masterhost));
         masterport = mi.masterport;
     }
diff --git a/src/snapshot.cpp b/src/snapshot.cpp
index de3398818..dca1071d4 100644
--- a/src/snapshot.cpp
+++ b/src/snapshot.cpp
@@ -26,17 +26,6 @@ class LazyFree : public ICollectable
     std::vector<dictEntry*> vecde;
 };
 
-void discontinueAsyncRehash(dict *d) {
-    if (d->asyncdata != nullptr) {
-        auto adata = d->asyncdata;
-        while (adata != nullptr) {
-            adata->abondon = true;
-            adata = adata->next;
-        }
-        d->rehashidx = 0;
-    }
-}
-
 const redisDbPersistentDataSnapshot
*redisDbPersistentData::createSnapshot(uint64_t mvccCheckpoint, bool fOptional) { serverAssert(GlobalLocksAcquired());
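The fences added in this patch (a release fence in flushReplBacklogToClients after backlog writes, paired with an acquire fence in writeToClient before backlog reads) follow the standard C++ publication pattern: a release fence before a relaxed store of a flag synchronizes with an acquire fence after a relaxed load that observes that store. A minimal standalone sketch of the idiom, using generic names rather than KeyDB's actual types:

```cpp
#include <atomic>
#include <thread>

int data = 0;                       // plain (non-atomic) shared state
std::atomic<bool> ready{false};     // publication flag

// Returns the value the consumer observes after the acquire fence.
int publish_and_read() {
    std::thread producer([] {
        data = 42;  // plain write, published by the fence below
        std::atomic_thread_fence(std::memory_order_release);
        ready.store(true, std::memory_order_relaxed);
    });
    int seen = 0;
    std::thread consumer([&] {
        while (!ready.load(std::memory_order_relaxed)) {}      // spin until flag is set
        std::atomic_thread_fence(std::memory_order_acquire);   // pairs with release fence
        seen = data;  // guaranteed to observe 42
    });
    producer.join();
    consumer.join();
    return seen;
}
```

The fence form is useful when, as in the repl backlog code, many plain writes need to be published at once and per-write release stores would be too costly.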