Enable context serialization in workflows #16250
Conversation
```diff
@@ -40,6 +83,83 @@ def __init__(self, workflow: "Workflow", stepwise: bool = False) -> None:
         # Step-specific instance
         self._events_buffer: Dict[Type[Event], List[Event]] = defaultdict(list)

+        # keep track of all the event classes that are accepted by the workflow
+        self._event_classes: Dict[str, Type[Event]] = {}
+        for step_func in self._workflow._get_steps().values():
```
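The registry that loop builds can be sketched standalone like so (the event class names here are illustrative, not the workflow's actual events):

```python
from typing import Dict, Type


class Event:
    """Minimal stand-in for the workflow Event base class."""


class StartEvent(Event):
    pass


class StopEvent(Event):
    pass


# Keep track of every event class the workflow accepts, keyed by name,
# so serialized events can be mapped back to their classes on load.
event_classes: Dict[str, Type[Event]] = {}
for cls in (StartEvent, StopEvent):
    event_classes[cls.__name__] = cls
```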
Instead of keeping a list of class objects, we could serialize a class with its qualified name (see https://github.com/deepset-ai/haystack/blob/main/haystack/core/serialization.py#L74), which we could later pass to importlib with something like https://github.com/deepset-ai/haystack/blob/main/haystack/core/serialization.py#L195
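The qualified-name approach being suggested boils down to a pair of helpers like the following (a minimal sketch; the helper names are assumptions, not the haystack or llama_index functions):

```python
import importlib
from typing import Any, Type


def qualified_name(cls: Type[Any]) -> str:
    """Serialize a class reference as its fully qualified 'module.Class' string."""
    return f"{cls.__module__}.{cls.__name__}"


def import_from_qualified_name(name: str) -> Type[Any]:
    """Re-import a class from the string produced by qualified_name()."""
    module_name, _, class_name = name.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)
```

This avoids holding class objects in the serialized payload, at the cost of the import-time trust concern discussed below.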
interesting, will take a look!
I have an approach working with this... slightly concerned about security issues, but I suppose if you are importing something already on your machine, and that's an issue for you, you have other problems 😅
Ok, latest pushes address Messi's suggestions
I also added one unit test after merging in the latest changes to the stepwise API, but it's broken (spooky CI/CD is caching stuff, I guess?) -- I really want to serialize a run mid-way through, but so far it's not working 🤔 Although I haven't been able to track down why yet
Looks like all checks passed? Guess it wasn't any of the checked-in unit tests that were failing? @logan-markewich
Looks good to me! A couple minor nits.
```python
        module_class = import_module_from_qualified_name(data["qualified_name"])
        return module_class.from_dict(data["value"])
    except Exception as e:
        breakpoint()
```
Do we need `breakpoint()` here? Do we want to raise instead?
oh lol leftover debugging, whoops
```python
            raise ValueError(f"Failed to deserialize value for key {key}: {e}")
        return deserialized_globals

    def to_dict(self, serializer: Optional[BaseSerializer] = None) -> Dict[str, Any]:
```
minor nit: wondering if it would be useful to make this into at least a `TypedDict`?
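A `TypedDict` for that return shape could look something like this (a sketch only; the field names are inferred from the `qualified_name` / `value` keys in the diff above, and the type name is made up):

```python
from typing import Any, Dict, TypedDict


class SerializedValue(TypedDict):
    """Hypothetical typed shape for one serialized context entry."""

    qualified_name: str    # e.g. "my_pkg.events.MyEvent", re-imported on load
    value: Dict[str, Any]  # the object's own to_dict() output
```

Static checkers would then flag missing or misspelled keys without changing the runtime behavior, since a `TypedDict` is still a plain `dict` at runtime.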
This is an initial stab at serializing the context in workflows.
This would allow for a few use cases (both of which are prime candidates for usage in llama-deploy and other deployment scenarios).
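The round-trip this PR is after could look roughly like this (a self-contained toy sketch; `Context`, `to_dict`, and `from_dict` here are illustrative stand-ins, not the actual llama_index API):

```python
import json
from typing import Any, Dict, Optional


class Context:
    """Toy stand-in for a workflow context holding per-run state."""

    def __init__(self, data: Optional[Dict[str, Any]] = None) -> None:
        self._globals: Dict[str, Any] = dict(data or {})

    def set(self, key: str, value: Any) -> None:
        self._globals[key] = value

    def get(self, key: str) -> Any:
        return self._globals[key]

    def to_dict(self) -> Dict[str, Any]:
        # Everything stored must be JSON-serializable in this toy version.
        return {"globals": dict(self._globals)}

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "Context":
        return cls(data["globals"])


# Pause: snapshot the context to a JSON string (e.g. to hand to llama-deploy)
ctx = Context()
ctx.set("query", "what is the weather?")
payload = json.dumps(ctx.to_dict())

# Resume: rebuild the context elsewhere and continue the run
restored = Context.from_dict(json.loads(payload))
```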
Some considerations on this