Have a dummy in-RAM coordinator for the functional tests #402

darosior · 2022-05-10T20:19:17Z

#393 further increases the load on the coordinators and the Postgre backend. I have considered a number of things (a thread pool in the coordinator, polling vaults in parrallel in the sigfetcher, a cache for get_sigs result in the functional tests, ...). I think in the end we just have to face the fact that our protocol is (purposefully) inefficient and signature fetching is slow. Still, this is a small improvement on the state of things that helps to prevent timeouts in #393.

Using a single Postgre backend and a regular coordinator per instance
worked for a time, but it's getting to its limits. The functional tests,
when ran in parrallel, put an unreasonable load on the Postgre backend.
This leads to spurious failures and timeouts that are hard to deal with
in the coordinatord or revaultd binaries themselves without
optimizing for functional tests.. Which isn't a realistic load.

I tried to implement a cache to get_sigs requests using a proxy between
the revaultds and the Coordinator in a single instance. Although it
(partially) solved the timeouts in the RPC commands that hit the
coordinator, it slowed tests down by a lot.

Instead, here we implement a fully in-RAM coordinator for the lowest
possible response time. After all, we don't need to persist the data for
the functional tests and we can afford a few more MB of RAM usage.
Of course, we keep the possibility of running using the real
coordinator, but we would probably not do so in the CI that is already
too slow. 3 tests would be skipped, though. But not the most important
ones, and we don't only rely on CI anyways.
As a by-product, we can now run the functional tests without having to
start a Postgre backend beforehand. I think it can help new contributors
a lot.

darosior · 2022-05-10T20:35:22Z

Master, -n 20: 45 passed, 5 skipped in 335.05s (0:05:35)
Master, -n 10: 45 passed, 5 skipped in 412.12s (0:06:52)

This branch, -n 10: 45 passed, 5 skipped in 290.53s (0:04:50)
This branch, -n 20: 45 passed, 5 skipped in 251.27s (0:04:11)

darosior · 2022-05-10T20:43:14Z

TODO: update the README for the tests

tests/test_framework/revault_network.py

Using a single Postgre backend and a regular coordinator per instance worked for a time, but it's getting to its limits. The functional tests, when ran in parrallel, put an unreasonable load on the Postgre backend. This leads to spurious failures and timeouts that are hard to deal with in the `coordinatord` or `revaultd` binaries themselves without optimizing for functional tests.. Which isn't a realistic load. I tried to implement a cache to get_sigs requests using a proxy between the revaultds and the Coordinator in a single instance. Although it (partially) solved the timeouts in the RPC commands that hit the coordinator, it slowed tests down by a lot. Instead, here we implement a fully in-RAM coordinator for the lowest possible response time. After all, we don't need to persist the data for the functional tests and we can afford a few more MB of RAM usage. Of course, we keep the possibility of running using the real coordinator, but we would probably not do so in the CI that is already too slow. 3 tests would be skipped, though. But not the most important ones, and we don't only rely on CI anyways. As a by-product, we can now run the functional tests without having to start a Postgre backend beforehand. I think it can help new contributors a lot.

We can afford shorter intervals with the dummy coordinator.

darosior · 2022-05-19T10:05:18Z

tests/test_framework/coordinatord.py

+            if method == "sig":
+                # FIXME: the field name is a misimplementation!
+                # TODO: mutex
+                if params["id"] not in self.sigs:
+                    self.sigs[params["id"]] = {}
+                self.sigs[params["id"]][params["pubkey"]] = params["signature"]
+                # TODO: remove this useless response from the protocol
+                resp = {"result": {"ack": True}, "id": request["id"]}
+                self.send_msg(client_fd, client_noise, json.dumps(resp))
+
+            elif method == "get_sigs":
+                # FIXME: the field name is a misimplementation of the protocol!
+                txid = params["id"]
+                sigs = self.sigs.get(txid, {})
+                resp = {"result": {"signatures": sigs}, "id": request["id"]}
+                self.send_msg(client_fd, client_noise, json.dumps(resp))
+
+            elif method == "set_spend_tx":
+                for outpoint in params["deposit_outpoints"]:
+                    self.spend_txs[outpoint] = params["transaction"]
+                # TODO: remove this useless response from the protocol
+                resp = {"result": {"ack": True}, "id": request["id"]}
+                self.send_msg(client_fd, client_noise, json.dumps(resp))


Need to update the messages after revault_net update!

darosior force-pushed the dummy_coordinator branch 2 times, most recently from 028fe41 to f86fd92 Compare May 10, 2022 20:32

darosior force-pushed the dummy_coordinator branch 4 times, most recently from daa2f85 to 919f725 Compare May 12, 2022 10:01

darosior commented May 12, 2022

View reviewed changes

tests/test_framework/revault_network.py Outdated Show resolved Hide resolved

darosior force-pushed the dummy_coordinator branch 3 times, most recently from ee11b60 to 4ce781c Compare May 12, 2022 14:38

darosior added 8 commits May 13, 2022 10:34

ci: increase task timeout

28e670d

ci: don't run in VERBOSE, we don't use it

4a1480d

ci: increase the number of workers

dbcbcff

qa: stop daemons in parrallel in revault_network fixture

87e9b25

qa: decrease the coordinator poll interval

356389d

We can afford shorter intervals with the dummy coordinator.

qa: don't compile nor use the coordinator, run the dummy one

995e4e5

qa: update the README to make Postgre optional, and some fixups

2179813

darosior force-pushed the dummy_coordinator branch from 4ce781c to 2179813 Compare May 13, 2022 08:56

darosior commented May 19, 2022

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Have a dummy in-RAM coordinator for the functional tests #402

Have a dummy in-RAM coordinator for the functional tests #402

darosior commented May 10, 2022 •

edited

Loading

darosior commented May 10, 2022

darosior commented May 10, 2022

darosior May 19, 2022

Have a dummy in-RAM coordinator for the functional tests #402

Are you sure you want to change the base?

Have a dummy in-RAM coordinator for the functional tests #402

Conversation

darosior commented May 10, 2022 • edited Loading

darosior commented May 10, 2022

darosior commented May 10, 2022

darosior May 19, 2022

Choose a reason for hiding this comment

darosior commented May 10, 2022 •

edited

Loading