Review discussion on new changes for stateless agents #1
0xsuryansh started this conversation in General
Moving the PR conversation here
How do we know what is up to date and what's not in the DB? And how do we know about intents that are created between agent restarts?
To solve these problems, I think we need to persist the last block we have indexed and start from there on the next run.
There would also need to be some mechanism to continue processing unfinished intents (or other events that we care about) after restart.
To me this looks like building around an event-sourcing pattern, where it's better to store the sequence of state-changing events. Applications persist events in an event store, which is a database of events.
Whenever the state of an entity changes, a new event is appended to the list of events. On reboot, the application reconstructs the current state by replaying the events.
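As a rough illustration of the replay idea (the `IntentEvent` and `IntentState` types below are hypothetical, not anything from our codebase):

```rust
// Minimal event-sourcing sketch: state is never stored directly; it is
// reconstructed by replaying an append-only event log.
#[derive(Debug, Clone)]
enum IntentEvent {
    Created { id: u64 },
    Fulfilled { id: u64 },
}

#[derive(Debug, Default)]
struct IntentState {
    open: std::collections::HashSet<u64>,
}

impl IntentState {
    fn apply(&mut self, event: &IntentEvent) {
        match event {
            IntentEvent::Created { id } => { self.open.insert(*id); }
            IntentEvent::Fulfilled { id } => { self.open.remove(id); }
        }
    }

    // On reboot, replay every persisted event in order to rebuild state.
    fn replay(events: &[IntentEvent]) -> Self {
        let mut state = Self::default();
        for e in events {
            state.apply(e);
        }
        state
    }
}

fn main() {
    let log = vec![
        IntentEvent::Created { id: 1 },
        IntentEvent::Created { id: 2 },
        IntentEvent::Fulfilled { id: 1 },
    ];
    let state = IntentState::replay(&log);
    println!("open intents: {:?}", state.open); // only intent 2 remains open
}
```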
I would also add that later, when we scale, we could use a distributed cache with built-in replication, so that if one cache node fails, another can take over with minimal impact.
In the case of complete cache failure we can fall back to the database, perhaps with a write-back policy for the data and the last indexed block.
Counter-argument to myself: I don't think we need to store the events (again); they are already stored on the blockchain. I believe it's more useful to store the state.
```rust
let in_progress_intents = self.db_client.get_in_progress_intents().await?;
```
It's a bit strange how this is implemented: a new (infinite?) intent stream is created for every in-progress intent.
I think it should work like this:
- It queries the db for any events whose handling is not finished, and handles them again. (This requires that event handling is actually idempotent, since there's a chance the handling did finish but wasn't recorded in the db.)
- The indexer starts from the block after the last block it has indexed. (Only one block number needs to be in the db: the last block we have indexed.)

It's much like having a WAL.
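A minimal sketch of this restart sequence, using an in-memory stand-in for the db (`MockDb`, `unhandled_events`, `mark_handled` are illustrative names, not our real API):

```rust
use std::collections::HashMap;

// Hypothetical in-memory stand-in for the db.
#[derive(Default)]
struct MockDb {
    last_indexed_block: u64,
    // event id -> handled?
    events: HashMap<u64, bool>,
}

impl MockDb {
    fn unhandled_events(&self) -> Vec<u64> {
        self.events
            .iter()
            .filter(|&(_, &handled)| !handled)
            .map(|(&id, _)| id)
            .collect()
    }
    fn mark_handled(&mut self, id: u64) {
        self.events.insert(id, true);
    }
}

// Handling must be idempotent: it may run again for an event whose
// handling finished but wasn't recorded before a crash.
fn handle_event(db: &mut MockDb, id: u64) {
    // ... perform the (idempotent) side effects for the event ...
    db.mark_handled(id);
}

// Returns the block to resume indexing from.
fn startup(db: &mut MockDb) -> u64 {
    // 1. Re-handle anything not recorded as finished.
    for id in db.unhandled_events() {
        handle_event(db, id);
    }
    // 2. Resume from the block after the last one indexed.
    db.last_indexed_block + 1
}

fn main() {
    let mut db = MockDb::default();
    db.last_indexed_block = 100;
    db.events.insert(1, true);
    db.events.insert(2, false);
    let start_block = startup(&mut db);
    println!("resume indexing at block {start_block}"); // block 101
}
```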
On bounded channel:
Using a bounded channel between the indexer and the handler is desirable so the indexer can wait a bit if the handler cannot keep up. (This is often called backpressure.)
Artemis doesn't support bounded channels/backpressure (it uses pub/sub, which will lose events if the handler cannot keep up). So it might actually be necessary to move away from Artemis here.
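For illustration, std's `sync_channel` shows the blocking behavior a bounded channel gives us: `send` waits when the buffer is full instead of dropping events (tokio's bounded `mpsc::channel` behaves analogously in async code).

```rust
use std::sync::mpsc::sync_channel;
use std::thread;
use std::time::Duration;

fn main() {
    // Bounded channel with capacity 2: the indexer blocks on `send`
    // when the handler falls behind, instead of dropping events.
    let (tx, rx) = sync_channel::<u64>(2);

    let indexer = thread::spawn(move || {
        for block in 0..10u64 {
            tx.send(block).unwrap(); // blocks while the buffer is full
        }
        // tx dropped here; the receiver's iterator then ends.
    });

    let handler = thread::spawn(move || {
        let mut handled = Vec::new();
        for block in rx {
            thread::sleep(Duration::from_millis(5)); // deliberately slow
            handled.push(block);
        }
        handled
    });

    indexer.join().unwrap();
    let handled = handler.join().unwrap();
    assert_eq!(handled, (0..10).collect::<Vec<_>>()); // nothing was lost
    println!("handled all {} blocks", handled.len());
}
```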
The WAL is for the agent, at the application level; it's not the db's WAL. It's also not a file: it's stored in the db, e.g. as the in_progress field of intents.
There would be a bounded channel between the indexer and the handler.
The handler may also choose to restore/maintain some in-memory state (CCMM would probably work like this). In this case it may be desirable to postpone starting the indexer, so that db state doesn't change while the handler is restoring state.
The handler may also save and emit tasks for the next stage of handlers to perform.
On second thought, even for `get_unhandled_events` we should postpone starting the indexer so that we don't get duplicated events. Having a channel in between will make things a bit more complicated. If we don't need the concurrency, we can use just a single task.
This is all to make sure that on arbitrary interruption and restart, we won't miss any events or any handling of the events.
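A sketch of the single-task alternative, with illustrative names (`fetch_events` and `handle` are placeholders for the real indexing and handling calls):

```rust
// Hypothetical single-task loop: index a block, handle its events,
// record progress, repeat. No channel, no startup-ordering concern.
struct Agent {
    last_indexed_block: u64,
}

impl Agent {
    fn fetch_events(&self, _block: u64) -> Vec<u64> {
        Vec::new() // stand-in for querying the chain for this block
    }

    fn handle(&mut self, _event: u64) {
        // idempotent handling, recorded in the db
    }

    fn run_once(&mut self) {
        let block = self.last_indexed_block + 1;
        for event in self.fetch_events(block) {
            self.handle(event);
        }
        // Persist progress only after all events are handled, so a crash
        // between steps re-processes the block rather than skipping it.
        self.last_indexed_block = block;
    }
}

fn main() {
    let mut agent = Agent { last_indexed_block: 100 };
    agent.run_once();
    println!("indexed through block {}", agent.last_indexed_block);
}
```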