This is a back-ported release of a soft durability flush interval feature.
Funding for this was provided by ZeroTier, a company that makes network virtualization software that allows networks to be created and managed effortlessly over both LAN and WAN.
This release is called v2.4.0-srh-extra. Its cluster protocol is not compatible with the official 2.4.0 binaries. The file format is, however, downgrade compatible. You can switch to this new version, try it out, and then switch back if you don't like it.
You may also upgrade from v2.3.6-srh-extra or v2.3.5-srh-extra and keep your settings.
What's new? The same changes as in the previous v2.3.6-srh-extra release, now applied on top of 2.4.0, plus a bugfix:
- Everything in the official 2.4.0 release is here
- Soft durability writes now get their data flushed every 5 seconds, instead of being flushed immediately. If you write and rewrite the same documents over and over again, you might see your I/O bandwidth decrease by a factor of 1000. Or more! (Or less.)
- A new field in the table configuration called 'user_value' has been added. It's initially set to the empty object, { }. You can do the following things with it:
  - Set it to { "srh/flush_interval": 23.3 }. This will change the flush interval to 23.3 seconds, instead of 5 seconds.
  - Set it to { "srh/flush_interval": 0.01 }, if you want your flushes to happen relatively quickly.
  - Set it to { "srh/flush_interval": "default" }. This sets the flush interval to the default value, 5 seconds.
  - Set it to { "srh/flush_interval": "never" }. This makes flushing "never" happen, under ideal conditions.
  - Set it to whatever you like, as long as it's an object. Add new fields to the object for your own operational needs.
- The bugfix: In v2.3.6-srh-extra, the server would crash (and then crash on startup!) if you set 'user_value' to something other than an object. In this release, the server both (a) recovers if 'user_value' got set to something other than an object (which might permit you to upgrade from v2.3.6-srh-extra after putting that into a broken state), and (b) prohibits you from setting the field to anything but an object. The rationale is paternalistic: You don't want to walk yourself into a corner by setting 'user_value' to something you can't add another field to.
- Another change: the fix for rethinkdb#6819 is included.
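To see where a figure like "a factor of 1000" could come from, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that all rewrites of the same document within one flush interval end up costing a single disk write at flush time. The function name and the workload numbers are hypothetical and are not measurements of this release.

```javascript
// Hypothetical model: every rewrite of the same document within one
// flush interval is coalesced into a single disk write at flush time.
function estimatedWriteReduction(rewritesPerSecond, flushIntervalSeconds) {
  // Number of writes that would each reach the disk if flushed immediately.
  const immediateWrites = rewritesPerSecond * flushIntervalSeconds;
  // The interval still costs one write, so the factor is never below 1.
  return Math.max(1, immediateWrites);
}

// 200 rewrites per second of the same document, default 5-second interval.
console.log(estimatedWriteReduction(200, 5)); // 1000
```

Under this model, a document that is rewritten less than once per interval sees no reduction at all, which is the "(Or less.)" case.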
Under cache memory pressure, a flush could still happen sooner than the flush interval demands (even if set to "never"). Setting "never", or a very long flush interval, is only a good idea for tables that comfortably fit into memory. Otherwise, those tables will tend to hog memory that could be put to better use.
Example queries (in JavaScript) for configuring the flush interval of a table:
r.table('foo').config().update({'user_value': {'srh/flush_interval': 23.3}})
r.table('foo').config().update({'user_value': {'srh/flush_interval': 0.01}})
r.table('foo').config().update({'user_value': {'srh/flush_interval': 'default'}})
r.table('foo').config().update({'user_value': {'srh/flush_interval': 'never'}})
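The four settings above can be summarized by a small client-side helper that maps a user value to the interval it asks for. This is only a sketch of the semantics described in these notes, not the server's code: the function name is hypothetical, and how the server treats values other than a number, "default", or "never" is not stated here, so the sketch simply throws for them.

```javascript
// Default flush interval described in these release notes.
const DEFAULT_FLUSH_INTERVAL_SECONDS = 5;

// Hypothetical helper: the flush interval (in seconds) that a given
// 'user_value' object requests. Infinity stands in for "never".
function effectiveFlushInterval(userValue) {
  const setting = userValue['srh/flush_interval'];
  if (setting === undefined || setting === 'default') {
    return DEFAULT_FLUSH_INTERVAL_SECONDS;
  }
  if (setting === 'never') {
    return Infinity;
  }
  if (typeof setting === 'number' && setting >= 0) {
    return setting;
  }
  // Assumption: anything else is treated as invalid in this sketch.
  throw new Error('unrecognized srh/flush_interval: ' + JSON.stringify(setting));
}

console.log(effectiveFlushInterval({}));                               // 5
console.log(effectiveFlushInterval({ 'srh/flush_interval': 23.3 }));   // 23.3
console.log(effectiveFlushInterval({ 'srh/flush_interval': 'never' })); // Infinity
```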
An example that sets the flush interval to 23.3 seconds, with some other user data:
r.table('foo').config().update({
  'user_value': {'srh/flush_interval': 23.3, 'john/blah': "john's data"}
})
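If your application already holds the current user value, one way to change only the flush interval while keeping your own fields is to build the new object on the client first. The helper below is a hypothetical sketch in plain JavaScript; it does not talk to the server.

```javascript
// Hypothetical helper: return a copy of userValue with only the flush
// interval changed; every other field is carried over untouched.
function withFlushInterval(userValue, interval) {
  return { ...userValue, 'srh/flush_interval': interval };
}

const current = { 'srh/flush_interval': 23.3, 'john/blah': "john's data" };
const next = withFlushInterval(current, 'never');

// next can then be passed as the 'user_value' field of a config update.
console.log(next);
```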
It's not possible to set the user value to whatever you want:
r.table('foo').config().update({'user_value': "hello world"}) /* error message */
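Since the server rejects anything but an object, an application could check the value before sending the query. The function below is a hypothetical client-side guard, not part of any RethinkDB driver; it assumes that arrays and null do not count as objects, matching the way ReQL distinguishes objects from other types.

```javascript
// Hypothetical client-side guard: true only for plain (non-array,
// non-null) objects, which is what 'user_value' is allowed to be.
function isAcceptableUserValue(value) {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

console.log(isAcceptableUserValue({ 'srh/flush_interval': 23.3 })); // true
console.log(isAcceptableUserValue('hello world'));                  // false
```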
To set the user value back to what it was initially:
r.table('foo').config().update({'user_value': { }})
You might also want to set the default durability to "soft":
r.table('foo').config().update({
  'durability': 'soft',
  'user_value': {'srh/flush_interval': 'never'}
})
Warning: If you set the user value, then downgrade to RethinkDB 2.4.0, and then upgrade back to 2.4.0-srh-extra, you might lose your user value configuration. (By design, you will lose your user value configuration if you modify the table's configuration at all while running RethinkDB 2.4.0.)
You might also notice some performance improvements in secondary index creation, bringing up new replicas, and running unit tests.
You might also notice performance regressions. Please report them to https://github.com/srh/rethinkdb/issues. My take is, if you're running the official RethinkDB 2.4.0, this release is far more likely to improve performance than hurt it, so it's worth using 2.4.0-srh-extra.
Some users (running certain workloads) of v2.3.6-srh-extra reported severe memory leaks. Note that users have also reported memory leaks with v2.3.6 proper. The leak might be exacerbated by this release somehow. Or it might be fixed in v2.4.0. Time will tell.
Binaries are currently available for Ubuntu 19.10/18.04/16.04/14.04, Debian Buster/Stretch/Jessie, and CentOS 6/7/8. Let me know if you'd like your platform to be added. And let me know if you even noticed this release and use it, or think you might be interested in using it.
Note: The CentOS 8 and Stretch builds are based off the tag https://github.com/srh/rethinkdb/releases/v2.4.0-srh-extra-2, which adds a simple compilation fix.
These changes will (presumably) get pushed upstream for RethinkDB 2.5.