Aggregate design (locks) #329
josecelano asked this question in Q&A
I want to ask some questions about the way Torrust Tracker handles concurrency.
Introduction
The Torrust Tracker uses locks to handle concurrency.

In this discussion, I'm going to focus only on the "tracker entries", which are stored in the torrents attribute. That attribute contains the core tracker data: each entry holds the information about a given torrent in the tracker.

Even though the discussion is about this attribute, all the questions, conclusions, etcetera might apply to other attributes too: keys, whitelist, ...

The torrents attribute is a RwLock<std::collections::BTreeMap<InfoHash, torrent::Entry>>. Each entry contains the torrent's peer list plus some aggregated stats. Both the list of torrents and the list of torrent peers are BTreeMaps.
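As a rough sketch of that shape (the type and field names below are illustrative stand-ins, not necessarily the exact ones in the Torrust Tracker codebase):

```rust
use std::collections::BTreeMap;
use std::sync::RwLock;

// Illustrative stand-ins for the real tracker types.
type InfoHash = [u8; 20];
type PeerId = [u8; 20];

struct Peer {
    // ip, port, uploaded, downloaded, left, event, ...
}

// Hypothetical shape of a torrent entry: the peer list plus some stats.
struct Entry {
    peers: BTreeMap<PeerId, Peer>,
    completed: u32,
}

// The whole torrent map lives behind a single RwLock.
struct Tracker {
    torrents: RwLock<BTreeMap<InfoHash, Entry>>,
}
```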
The whole data structure is wrapped with a RwLock, which means only one peer can announce at a time. When the app receives an announce request, the request is handled by the Tracker struct, which tries to acquire a write lock to add the peer to the peer list of the torrent announced by the peer.

TL;DR
There are two main questions:

1. Why do we need locks, and what kind of inconsistency are we trying to avoid?
2. The RwLock currently covers all the torrent data. Could we reduce the amount of data covered by the lock? For example, a lock for updating a single peer in a torrent peer list. That would only serialize concurrent updates to that one peer, and I think we do not need more than that, because we should not get simultaneous announce requests from the same peer.

Why do we need locks?
The application uses concurrency, which means some threads (or tasks) could be trying to announce peers at the same time.
There is only one big object in memory, the RwLock<std::collections::BTreeMap<InfoHash, torrent::Entry>>, and all requests try to update that same object. The BTreeMap allows only one writer at a time (a Rust rule): you cannot hold two mutable references into the same BTreeMap at once, for example by calling BTreeMap::get_mut() twice on the same key and keeping both references alive.

To synchronize writers (requests that are trying to update the same peer), we use the RwLock. That lock guarantees that only one request at a time updates the map, and therefore any given peer. The problem is that we are forcing a write sequence for every update or insert, limiting the number of inserts/updates that can be done simultaneously.
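A simplified sketch of that write path under the current single-lock design (the method and type names below are mine, not the actual Torrust Tracker API):

```rust
use std::collections::BTreeMap;
use std::sync::RwLock;

// Illustrative types, not the exact Torrust Tracker ones.
type InfoHash = [u8; 20];
type PeerId = [u8; 20];

#[derive(Default)]
struct Peer {}

#[derive(Default)]
struct Entry {
    peers: BTreeMap<PeerId, Peer>,
}

struct Tracker {
    torrents: RwLock<BTreeMap<InfoHash, Entry>>,
}

impl Tracker {
    // Hypothetical announce write path: every insert or update has to take
    // the single write lock, so all announces are applied one at a time.
    fn upsert_peer(&self, info_hash: InfoHash, peer_id: PeerId, peer: Peer) {
        // Exclusive access to the *whole* torrent map.
        let mut torrents = self.torrents.write().unwrap();
        let entry = torrents.entry(info_hash).or_default();
        // A re-announce from the same peer simply replaces the previous value.
        entry.peers.insert(peer_id, peer);
    }
}
```

Any other announce arriving while this runs, even one for a different torrent, blocks on the same torrents.write() call.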
What kind of inconsistency are we trying to avoid?

There are many invariants. For example, a peer with 0 bytes left to download is considered a seeder. The nature of BTreeMaps and Rust enforces most of these rules on its own: a BTreeMap cannot contain duplicate keys, and you cannot hold two mutable references into the same map at the same time. So, in this case, the locks do not enforce the invariants directly; they only coordinate the writes so that two requests never mutate the map at the same time.
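For instance, the compiler itself rejects holding two mutable references into the same map at the same time:

```rust
use std::collections::BTreeMap;

fn main() {
    let mut peers: BTreeMap<u32, String> = BTreeMap::new();
    peers.insert(1, "peer-1".to_string());

    let first = peers.get_mut(&1).unwrap();
    // Uncommenting the next line does not compile while `first` is still in use:
    // error[E0499]: cannot borrow `peers` as mutable more than once at a time
    // let second = peers.get_mut(&1).unwrap();
    first.push_str(" (updated)");
}
```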
Problem: What's the problem with locks and the current solution?
There are two problems:

1. We could receive two announce requests at the same time from two different peers; the second request has to wait until the first one is finished, even if the requests are for different torrents.
2. RwLocks are better for scenarios where you have many readers and few writers, which is not our case.

@Power2All has pointed me to an alternative solution like Crossbeam Channels.

Process to find (hopefully) a better-performing solution
I think we could follow a multi-phase solution. This should be a long-term goal, keeping in mind that performance is only one of the aspects we might be interested in improving.
Aggregate design
As I've mentioned, we could try changing the data structures to reduce the amount of data the lock needs to cover. For example, we could split each entry so that the peer list and the stats have their own locks.
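A rough illustrative sketch of that idea (hypothetical types, not the actual Torrust Tracker code):

```rust
use std::collections::BTreeMap;
use std::sync::RwLock;

type InfoHash = [u8; 20];
type PeerId = [u8; 20];

struct Peer {}

struct Stats {
    completed: u32,
}

// The lock no longer wraps the whole entry: the peer list and the stats
// can be locked (and therefore updated) independently.
struct Entry {
    peers: RwLock<BTreeMap<PeerId, Peer>>,
    stats: RwLock<Stats>,
}

// The outer map would still need some synchronization for inserting
// and removing torrents.
struct Tracker {
    torrents: RwLock<BTreeMap<InfoHash, Entry>>,
}
```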
This would not be a big improvement because I do not see a reason to update stats without updating peers, but we could have a lock per key:
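Again, as an illustrative sketch with hypothetical types:

```rust
use std::collections::HashMap;
use std::sync::RwLock;

type InfoHash = [u8; 20];

struct Entry {
    // peers, stats, ...
}

// One lock per torrent entry instead of a single lock around everything.
// Whether this actually buys us anything is discussed right below.
struct Tracker {
    torrents: HashMap<InfoHash, RwLock<Entry>>,
}
```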
This option looks like it would allow updating peers in parallel, since the lock is only for a single key. But when you acquire the lock and reference an item as mutable, you are borrowing the whole HashMap, so actually, I think the result would be the same.

The problem with Rust (if I'm not wrong) is that you cannot apply mutability to only a subset of a dynamically sized data structure, at least for this kind of collection (HashMap, BTreeMap, ...).
So the question is, even if we design aggregates as small as possible to reduce contention, would there be a way to implement those aggregates in memory?
Looking at @Power2All's implementation using crossbeam channels, I've realized that maybe the crossbeam channels are, in effect, the aggregates. The BTreeMap stays mutable all the time, and there is only one thread updating that data structure. Synchronization is implemented with the channel: it allows performing just one update at a time. The good thing about this implementation is that you can update whatever you want in the data structure, because there is only one writer. The data structure is like a database table, and the channel is the aggregate. I guess you could do the same with locks.

You can have one thread per aggregate, which acts as a kind of in-memory database table. The channel (or lock) is the aggregate: you can only update the state through it. I suppose channels are a more natural solution once you have to communicate from N threads to the main one (the aggregate).
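A minimal sketch of that single-writer pattern as I understand it, using the crossbeam-channel crate (this is my reading of the idea, not @Power2All's actual implementation):

```rust
use std::collections::BTreeMap;
use std::thread;

use crossbeam_channel::{unbounded, Sender};

type InfoHash = [u8; 20];
type PeerId = [u8; 20];

// Hypothetical update messages; a real tracker would carry the full peer data.
enum Command {
    UpsertPeer { info_hash: InfoHash, peer_id: PeerId },
    RemovePeer { info_hash: InfoHash, peer_id: PeerId },
}

// The "aggregate": a single thread owns the map and can mutate it freely
// without any lock. Every other thread only sends commands over the channel,
// so updates are applied one at a time, in the order they are received.
fn spawn_writer() -> Sender<Command> {
    let (tx, rx) = unbounded::<Command>();
    thread::spawn(move || {
        let mut torrents: BTreeMap<InfoHash, BTreeMap<PeerId, ()>> = BTreeMap::new();
        for command in rx {
            match command {
                Command::UpsertPeer { info_hash, peer_id } => {
                    torrents.entry(info_hash).or_default().insert(peer_id, ());
                }
                Command::RemovePeer { info_hash, peer_id } => {
                    if let Some(peers) = torrents.get_mut(&info_hash) {
                        peers.remove(&peer_id);
                    }
                }
            }
        }
    });
    tx
}

fn main() {
    let writer = spawn_writer();
    // Request handlers can clone the sender and push updates concurrently;
    // the writer thread applies them sequentially.
    writer
        .send(Command::UpsertPeer { info_hash: [0; 20], peer_id: [1; 20] })
        .unwrap();
}
```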
Diagrams
I've created some diagrams to understand the current architecture and other alternatives.
Torrust Tracker Aggregates.pdf
cc @da2ce7 @WarmBeer @Power2All