
Enable context serialization in workflows #16250

Merged
merged 13 commits on Oct 2, 2024

Conversation

logan-markewich
Collaborator

@logan-markewich logan-markewich commented Sep 27, 2024

This is an initial stab at serializing the context in workflows:

wf = Workflow(allow_pickle=True)

handler = wf.run()
_ = await handler

# capture the context state as a plain dict...
state_dict = get_state_dict(handler)
# ...and rebuild a handler from that dict later
new_handler = get_handler_from_state_dict(wf, state_dict)

This would allow for a few use cases (both of which are prime candidates for usage in llama-deploy and other deployment scenarios):

  • storing the context between runs
  • stopping and resuming a workflow mid-run (probably during stepwise execution)

Some considerations on this

  • the serialization logic could be its own module, to allow users to customize it. I'm not sure whether we want to do that or not, though
  • I'm not entirely convinced where this API should live -- right now it's a util function hiding the actual operations on the context
  • maybe the pickling fallback should be opt-in?
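To make the last point concrete, an opt-in pickle fallback could look roughly like the sketch below: try JSON first, and only fall back to pickle when the user explicitly asked for it. The class and parameter names here (`JsonPickleSerializer`, `allow_pickle`) are illustrative assumptions, not the PR's actual API.

```python
# Hypothetical sketch of an opt-in pickle fallback serializer.
# JSON is attempted first; pickle is used only when allow_pickle=True.
import base64
import json
import pickle
from typing import Any


class JsonPickleSerializer:
    def __init__(self, allow_pickle: bool = False) -> None:
        self.allow_pickle = allow_pickle

    def serialize(self, value: Any) -> str:
        try:
            return json.dumps(value)
        except (TypeError, ValueError):
            if not self.allow_pickle:
                raise
            # fall back to pickle, base64-encoded so the result is still a str
            return base64.b64encode(pickle.dumps(value)).decode("utf-8")

    def deserialize(self, data: str) -> Any:
        try:
            return json.loads(data)
        except ValueError:
            if not self.allow_pickle:
                raise
            return pickle.loads(base64.b64decode(data))
```

With `allow_pickle=False` (the default), non-JSON-serializable values fail loudly instead of silently round-tripping through pickle, which keeps the unsafe path opt-in.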

@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Sep 27, 2024
@logan-markewich logan-markewich changed the title [WIP] Enable context serialization in workflows Enable context serialization in workflows Sep 30, 2024
llama-index-core/llama_index/core/workflow/handler.py (outdated review thread, resolved)
llama-index-core/llama_index/core/workflow/context.py (outdated review thread, resolved)
llama-index-core/llama_index/core/workflow/context.py (outdated review thread, resolved)
@@ -40,6 +83,83 @@ def __init__(self, workflow: "Workflow", stepwise: bool = False) -> None:
# Step-specific instance
self._events_buffer: Dict[Type[Event], List[Event]] = defaultdict(list)

# keep track of all the event classes that are accepted by the workflow
self._event_classes: Dict[str, Type[Event]] = {}
for step_func in self._workflow._get_steps().values():
Member

Instead of keeping a list of class objects, we could serialize a class with its qualified name (see https://github.com/deepset-ai/haystack/blob/main/haystack/core/serialization.py#L74) that later we can pass to importlib with something like https://github.com/deepset-ai/haystack/blob/main/haystack/core/serialization.py#L195
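The qualified-name approach suggested here (modeled loosely on haystack's serialization helpers) can be sketched in a few lines: store `"module.ClassName"` alongside the payload, and re-import the class at deserialization time via importlib. The helper names below are illustrative, not necessarily what the PR ended up calling them.

```python
# Sketch: serialize a class as its qualified name, re-import it later.
import importlib
from typing import Type


def qualified_name(cls: Type) -> str:
    # e.g. "llama_index.core.workflow.events.StopEvent"
    return f"{cls.__module__}.{cls.__qualname__}"


def import_from_qualified_name(name: str) -> Type:
    # split off the final attribute and import the containing module
    module_name, _, class_name = name.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)
```

This avoids keeping a registry of class objects in the context, at the cost of requiring the class to be importable in the deserializing process.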

Collaborator Author

interesting, will take a look!

Collaborator Author

I have an approach working with this... slightly concerned about security issues, but I suppose if you are importing something already on your machine, and that's an issue for you, you have other problems 😅

llama-index-core/llama_index/core/workflow/context.py Outdated Show resolved Hide resolved
@logan-markewich
Collaborator Author

logan-markewich commented Oct 1, 2024

Ok, the latest pushes address Messi's suggestions:

  • make serializers multiple classes
  • don't attach serializers to class instances
  • don't hide the serialization machinery too much

I also added one unit test after merging in the latest changes to the stepwise API, but it's broken (spooky, CI/CD is caching stuff, I guess?) -- I really want to serialize a run midway through, but so far it's not working 🤔 I haven't been able to track down why yet.

@nerdai
Contributor

nerdai commented Oct 2, 2024

Looks like all checks passed? Guess it wasn't any of the checked-in unit tests that were failing? @logan-markewich

Contributor

@nerdai nerdai left a comment


Looks good to me! A couple minor nits.

module_class = import_module_from_qualified_name(data["qualified_name"])
return module_class.from_dict(data["value"])
except Exception as e:
breakpoint()
Contributor

Do we need a breakpoint here? Do we want to raise instead?

Collaborator Author

oh lol leftover debugging, whoops

raise ValueError(f"Failed to deserialize value for key {key}: {e}")
return deserialized_globals
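Put together, the corrected shape of this loop (stray `breakpoint()` removed, `ValueError` raised on failure, as settled in the exchange above) might look like the following minimal sketch. Here `json.loads` stands in for the PR's actual per-value deserialization and qualified-name import machinery:

```python
# Sketch of the deserialization loop after the review fix: failures
# raise a ValueError naming the offending key instead of dropping
# into a debugger. json.loads is a stand-in for the real deserializer.
import json
from typing import Any, Dict


def deserialize_globals(globals_data: Dict[str, str]) -> Dict[str, Any]:
    deserialized_globals: Dict[str, Any] = {}
    for key, raw in globals_data.items():
        try:
            deserialized_globals[key] = json.loads(raw)
        except Exception as e:
            raise ValueError(f"Failed to deserialize value for key {key}: {e}")
    return deserialized_globals
```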

def to_dict(self, serializer: Optional[BaseSerializer] = None) -> Dict[str, Any]:
Contributor

minor nit: wondering if it would be useful to make this into at least a TypedDict?
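The TypedDict the reviewer has in mind could look something like the sketch below. The field names are illustrative guesses at what a serialized context might carry, not the keys the PR actually emits.

```python
# Hypothetical TypedDict giving the serialized-context dict a typed shape,
# so callers of to_dict() get key checking from static type checkers.
from typing import Dict, List, TypedDict


class SerializedContext(TypedDict):
    globals: Dict[str, str]       # serialized shared state, keyed by name
    streaming_queue: List[str]    # serialized events awaiting delivery
    queues: Dict[str, List[str]]  # per-step serialized event queues
    stepwise: bool                # whether the run was in stepwise mode
```

At runtime a TypedDict is still a plain dict (no validation happens), but `to_dict() -> SerializedContext` documents the contract and lets mypy flag typo'd keys.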


@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Oct 2, 2024
@logan-markewich logan-markewich merged commit ba2cc90 into main Oct 2, 2024
10 checks passed
@logan-markewich logan-markewich deleted the logan/workflow_serialize branch October 2, 2024 17:10
raspawar pushed a commit to raspawar/llama_index that referenced this pull request Oct 7, 2024