Scope out possible options for state management services #293

Open
logan-markewich opened this issue Oct 5, 2024 · 3 comments

@logan-markewich
Collaborator

There are several ways we can offer state management, and several services that could perform the task:

  • Do we allow users to hook into an existing KV store such as MongoDB or Redis?
  • Do we keep the implementation internal and hidden from the user?
  • If we keep it hidden:
    • Do we build a service/server to handle state management?
    • Do we spin up a dedicated MongoDB or Redis instance ourselves?
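
To make the first two options concrete, here is a minimal sketch of how they could coexist: the user may pass in their own store, and otherwise an internal default is used. All names here (`KVStore`, `InternalStore`, `StateManager`) are hypothetical and not part of llama_deploy; the in-memory dict only stands in for whatever backend would actually be chosen.

```python
from typing import Any, Dict, Optional, Protocol


class KVStore(Protocol):
    """Hypothetical minimal contract a user-supplied store would satisfy."""

    def get(self, key: str) -> Optional[Any]: ...
    def put(self, key: str, value: Any) -> None: ...


class InternalStore:
    """Stand-in for a store the framework manages itself (the "hidden" option)."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def get(self, key: str) -> Optional[Any]:
        return self._data.get(key)

    def put(self, key: str, value: Any) -> None:
        self._data[key] = value


class StateManager:
    """Uses the user's store when one is given, otherwise the internal one."""

    def __init__(self, store: Optional[KVStore] = None) -> None:
        self.user_supplied = store is not None
        self._store: KVStore = store if store is not None else InternalStore()

    def save(self, session_id: str, state: Dict[str, Any]) -> None:
        # Namespace keys so a shared backend does not collide with other data.
        self._store.put(f"state:{session_id}", state)

    def load(self, session_id: str) -> Dict[str, Any]:
        return self._store.get(f"state:{session_id}") or {}
```

With this shape, `StateManager()` covers the hidden case and `StateManager(my_store)` covers the bring-your-own case, so the two questions above are not necessarily mutually exclusive.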
@logan-markewich logan-markewich moved this to Todo in Framework Oct 5, 2024
@logan-markewich logan-markewich self-assigned this Oct 5, 2024
@jonpspri
Contributor

Is this related to #273? Given that we allow users to choose their queuing technology, we should probably also allow them to choose their KV technology. My plan is to use Redis for both, for example. My more immediate concern is that we'll be double-writing the state: once to the queue to return it to the control plane, and then again to the KV store. We may want to support some sort of retrieval pointer so the service can retrieve the state (in)directly from the KV store. This approach adds some complexity, but it could still be manageable.
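
A rough sketch of the retrieval-pointer idea, assuming the state is written once to the kv store and the queue message carries only a key. The dict and deque are stand-ins for a real kv store and message queue, and `publish_result`, `consume_result`, and the `state_ref` field are hypothetical names, not existing llama_deploy APIs.

```python
import json
import uuid
from collections import deque
from typing import Any, Deque, Dict

kv_store: Dict[str, str] = {}  # stand-in for Redis, MongoDB, etc.
queue: Deque[str] = deque()    # stand-in for the message queue


def publish_result(task_id: str, state: Dict[str, Any]) -> str:
    """Write the state once to the kv store and enqueue only a pointer to it."""
    state_key = f"state:{task_id}:{uuid.uuid4().hex}"
    kv_store[state_key] = json.dumps(state)
    queue.append(json.dumps({"task_id": task_id, "state_ref": state_key}))
    return state_key


def consume_result() -> Dict[str, Any]:
    """Pop the next message and resolve its pointer against the kv store."""
    message = json.loads(queue.popleft())
    return json.loads(kv_store[message["state_ref"]])
```

The trade-off is the added complexity mentioned above: the consumer now needs access to the kv store, and something has to decide when a referenced entry can be deleted.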

@masci
Member

masci commented Oct 11, 2024

Thanks for the comment, @jonpspri. Yes, this is related to "context storaging" as described in #273.

I advocate separating the context store (let's not call it kv for now, as I'm not sure that is an actual requirement) from the message queue and making it private. My reasoning:

  • I would rather not multiply the effort, as we already do to support different storage backends for the message queue
  • I would like to dictate the requirements for the context storage without the limitations of providing a generic component that has to work on top of different storage backends (which are supposed to implement a pubsub kind of system in the first place, and might not be ideal for saving the context)
  • I would like the storage to stay "private" in a dedicated instance, limiting the potential issues of sharing the storage backend with other apps/environments (this is a weak point, as we already do this for the message queue)

I don't see any reason why the context storage couldn't be a generic component that provides one interface and several concrete implementations relying on the same set of storage backends as the message queue, but I'd like to clear up those points first.
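
For illustration only, "one interface and several concrete implementations" could look roughly like the sketch below. `ContextStore` and both implementations are hypothetical names; SQLite is used as the second backend purely because it ships with Python, not because it is one of the message queue backends.

```python
import json
import sqlite3
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional


class ContextStore(ABC):
    """Hypothetical interface: the requirements are dictated here, once."""

    @abstractmethod
    def save(self, session_id: str, context: Dict[str, Any]) -> None: ...

    @abstractmethod
    def load(self, session_id: str) -> Optional[Dict[str, Any]]: ...


class InMemoryContextStore(ContextStore):
    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def save(self, session_id: str, context: Dict[str, Any]) -> None:
        self._data[session_id] = json.dumps(context)

    def load(self, session_id: str) -> Optional[Dict[str, Any]]:
        raw = self._data.get(session_id)
        return None if raw is None else json.loads(raw)


class SQLiteContextStore(ContextStore):
    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS context "
            "(session_id TEXT PRIMARY KEY, body TEXT NOT NULL)"
        )

    def save(self, session_id: str, context: Dict[str, Any]) -> None:
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO context (session_id, body) VALUES (?, ?)",
                (session_id, json.dumps(context)),
            )

    def load(self, session_id: str) -> Optional[Dict[str, Any]]:
        row = self._conn.execute(
            "SELECT body FROM context WHERE session_id = ?", (session_id,)
        ).fetchone()
        return None if row is None else json.loads(row[0])


def roundtrip(store: ContextStore) -> bool:
    """Contract check every backend must pass: last write wins, missing is None."""
    store.save("s1", {"step": 1})
    store.save("s1", {"step": 2})
    return store.load("s1") == {"step": 2} and store.load("nope") is None
```

A shared check like `roundtrip` is one way to keep the requirements under the project's control even when multiple backends exist, though each extra backend still adds maintenance effort.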

@jonpspri
Contributor

To be candid, I saw a PR to add a context storage capability to the Control Plane via the REST interface, and it turned me away from the entire llama_deploy project. I preferred the clean approach where the message queue was the communication channel between the control plane and the services, while the REST interface served as the communication interface for the rest of the world.

I would also argue that supporting multiple storage backends will be necessary for a truly production-ready system. In my ideal world, I could work with and tune my existing data store standards rather than having to worry about yet another persistence system. I get that this is not ideal from a DevOps perspective, but it is likely where things will end up.

Persistence is complex! --sigh--
