Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Message passing with more information #291

Merged
merged 17 commits into from
Dec 13, 2024

Conversation

unkcpz
Copy link
Member

@unkcpz unkcpz commented Dec 1, 2024

  • Simplipy _create_state_instance so it only need to do real object creation.
  • Killed state message is a message type
  • Make message not global variable and able to attach with more information.
  • pause/play/state all using build method to build the message.

@unkcpz unkcpz marked this pull request as draft December 1, 2024 11:30
@unkcpz
Copy link
Member Author

unkcpz commented Dec 1, 2024

A lot of single quotes to double quotes changes came from my IDE linter, should be reduced if #289 get merged and this PR will contains much less line changes.

@unkcpz unkcpz force-pushed the Message-passing-with-more-information branch 2 times, most recently from ed2a043 to 96c5842 Compare December 1, 2024 21:12
Copy link

codecov bot commented Dec 1, 2024

Codecov Report

Attention: Patch coverage is 92.13483% with 7 lines in your changes missing coverage. Please review.

Project coverage is 90.90%. Comparing base (14b7c1a) to head (b1d9159).
Report is 5 commits behind head on master.

Files with missing lines Patch % Lines
src/plumpy/processes.py 84.00% 4 Missing ⚠️
src/plumpy/base/state_machine.py 86.67% 2 Missing ⚠️
src/plumpy/process_comms.py 97.37% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #291      +/-   ##
==========================================
+ Coverage   90.77%   90.90%   +0.13%     
==========================================
  Files          22       22              
  Lines        2990     2999       +9     
==========================================
+ Hits         2714     2726      +12     
+ Misses        276      273       -3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@unkcpz unkcpz force-pushed the Message-passing-with-more-information branch from 43e1520 to 737adf3 Compare December 1, 2024 23:56
@unkcpz unkcpz added pr/blocked PR is blocked by another PR that should be merged first and removed pr/blocked PR is blocked by another PR that should be merged first labels Dec 2, 2024
@unkcpz unkcpz force-pushed the Message-passing-with-more-information branch from 737adf3 to 1b2864c Compare December 2, 2024 08:34
@unkcpz unkcpz force-pushed the Message-passing-with-more-information branch 3 times, most recently from 2ffdab2 to d0e4e73 Compare December 2, 2024 08:41
@unkcpz unkcpz marked this pull request as ready for review December 2, 2024 08:41
@unkcpz unkcpz force-pushed the Message-passing-with-more-information branch from b218122 to e3c2ae8 Compare December 2, 2024 08:50
src/plumpy/base/state_machine.py Outdated Show resolved Hide resolved
src/plumpy/process_comms.py Outdated Show resolved Hide resolved
@@ -300,16 +314,28 @@ def _fire_state_event(self, hook: Hashable, state: Optional[State]) -> None:
def on_terminated(self) -> None:
"""Called when a terminal state is entered"""

def transition_to(self, new_state: Union[Hashable, State, Type[State]], *args: Any, **kwargs: Any) -> None:
def transition_to(self, new_state: State | type[State] | None, **kwargs: Any) -> None:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Build new_state with Hashable and from State class are the same, I made the use of transition_to more consistent by allow only passing the State class or an existed state.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wrong about this, using Hashable key to get the state type is more generic, it allows to redefine the state class dynamically for the specific state machine by overriding the STATES_MAP. I'll revert changes back.

src/plumpy/base/state_machine.py Outdated Show resolved Hide resolved
@unkcpz unkcpz requested a review from khsrali December 2, 2024 09:17
@unkcpz unkcpz requested a review from sphuber December 2, 2024 09:18
@unkcpz unkcpz force-pushed the Message-passing-with-more-information branch from b1d9159 to e5c74ad Compare December 2, 2024 12:53
@unkcpz unkcpz marked this pull request as draft December 3, 2024 00:49
@unkcpz unkcpz force-pushed the Message-passing-with-more-information branch from 7c105bd to 17a541a Compare December 3, 2024 09:53
Comment on lines 1218 to 1220
state_class = self.get_states_map()[process_states.ProcessState.KILLED]
new_state = self._create_state_instance(state_class, msg=msg)
self.transition_to(new_state)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compare to the one liner of original implementation, the new one seems more verbose. The state object needs to be created from the state tag with the parameters directly and passed to the transition_to. But, in my opinion, this API is a bit readable. The transition_to now only need to deal with the case where the passed in state is an object, instead of decide whether needs to create the state instance case by case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, to make this line more readable:

state_class = self.get_states_map()[process_states.ProcessState.KILLED]

may I suggest to somehow simplify it?

maybe this way:

@classmethod
def get_states_class(cls, state) -> Type[State]:
    return cls.get_states_map()[state]

state_class = self.get_states_class(process_states.ProcessState.KILLED)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I did it by putting it into _create_state_instance, and it gets better. THanks.

@unkcpz unkcpz marked this pull request as ready for review December 3, 2024 09:58
Copy link
Contributor

@khsrali khsrali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @unkcpz, I did a first round, mainly asked some questions.

@@ -31,7 +45,7 @@ class StateEntryFailed(Exception): # noqa: N818
Failed to enter a state, can provide the next state to go to via this exception
"""

def __init__(self, state: Hashable = None, *args: Any, **kwargs: Any) -> None:
def __init__(self, state: State, *args: Any, **kwargs: Any) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this change and what does it mean?
also please note the default value None is taken away, is this backward compatible?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function inside plumpy is only called:

    @super_check
    def on_finish(self, result: Any, successful: bool) -> None:
        """Entering the FINISHED state."""
        if successful:
            validation_error = self.spec().outputs.validate(self.outputs)
            if validation_error:
                state_cls = self.get_states_map()[process_states.ProcessState.FINISHED]
                finished_state = state_cls(self, result=result, successful=False)
                raise StateEntryFailed(finished_state)

I did the same to create the state first and then pass to the function for consistency as transition_to.

also please note the default value None is taken away, is this backward compatible?

It depends, I think we don't have a clear public API list for plumpy. Since aiida-core didn't use this directly, it does not break backward compatibility of aiida-core.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends, I think we don't have a clear public API list for plumpy. Since aiida-core didn't use this directly, it does not break backward compatibility of aiida-core.

ok, agreed, I'm also not so strict in this case, since so far aiida-core is the only known dependent of this plumpy

src/plumpy/base/state_machine.py Outdated Show resolved Hide resolved
) -> None:
# If we are creating, then reraise instead of failing.
if final_state == process_states.ProcessState.CREATED:
raise exception.with_traceback(trace)

self.transition_to(process_states.ProcessState.EXCEPTED, exception, trace)
state_class = self.get_states_map()[process_states.ProcessState.EXCEPTED]
new_state = self._create_state_instance(state_class, exception=exception, trace_back=trace)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wherever self._create_state_instance is called also self.get_states_map() is called just before it.
Maybe it would make more sense to move that get_states_map() logic directly inside _create_state_instance() ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good point, I did this for future factoring after and in the end I arrive at only accept the ProcessState enum type as argument instead of accepting state object, you can have a look at https://github.com/unkcpz/plumpy/blob/6bfb87df69889f0082e8ecbe7732d24ad69e64c2/src/plumpy/base/state_machine.py#L159-L164

Is that more clear? I was plan to include change for a later release and am working on more refactoring for some interfaces.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, thank you. It makes sense to write a comment in the definition of _create_state_instance, so to understand why you made this choice of design.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I do the other way as you suggested, passing the process_states.ProcessState.EXCEPTED directly to the _create_state_instance and having self.get_states_map call from inside.


class KillMessage:
@classmethod
def build(cls, message: str | None = None, force: bool = False) -> MessageType:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd still call it force_kill for clarity

Suggested change
def build(cls, message: str | None = None, force: bool = False) -> MessageType:
def build(cls, message: str | None = None, force_kill: bool = False) -> MessageType:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@unkcpz , maybe you forgot to update this 😅


class KillMessage:
@classmethod
def build(cls, message: str | None = None, force: bool = False) -> MessageType:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note, aiida-core sends a kill message here and then here, how the flag should be send over?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

message = 'Killed through `verdi process kill`'
control.kill_processes(processes, all_entries=all_entries, timeout=timeout, wait=wait, message=message)

For this one, the message should be build first in to a KillMessage and then send over.

message = KillMessage.build(message= 'Killed through `verdi process kill`')
control.kill_processes(processes, all_entries=all_entries, timeout=timeout, wait=wait, message=message)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the one in _perform_action, it is the same, but I'd prefer to have the argument passing close to the function call of kill_process

_perform_actions(processes, controller.kill_process, 'kill', 'killing', timeout, wait, msg=message)

change it to

message = KillMessage.build(message=message, force=force) 
_perform_actions(processes, functool.partial(controller.kill_process, msg=message), 'kill', 'killing', timeout, wait)

Copy link
Contributor

@khsrali khsrali Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but now aiida-core needs to import KillMessage from plumpy only for adding a flag. wouldn't be easier that msg itself could be a dictionary?

message = {'msg' : 'Killed through `verdi process kill`', 
           'options' : { 
                      'force_kill':True
                       }}
control.kill_processes(processes, controller.kill_process, .., msg=message)

and then plumpy would well receive all options listed, if any.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but now aiida-core needs to import KillMessage from plumpy only for adding a flag

Why it is a problem for importing KillMessage? I think message is a good interface that can flexibly support backward compatibility. We make all four message type public APIs of plumpy and when using process communicator it requires sending over through this message. In the future even if we want to change the real message to other type we can do it without break users' code.

wouldn't be easier that msg itself could be a dictionary?

It is a dictionary returned by build but construct a dictionary by hand is error-prone. Using the build function of the message, the function signature will tell developers which options are allowed to be passed. We can make this works also by dataclass, but then that is overkill for a message in my opinion.

Copy link
Contributor

@khsrali khsrali Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well this is the twisted things that we were always complaining about, import from plumpy to only make a dictionary only for an input of a plumpy function itself...
I mean you see my point 😄 🤌

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My 2¢'s: I think it is rarely preferable to have a free-form dictionary to specify inputs to an API, especially if it is nested, because you will always have to start reading the docs/code to know how it is structured. Using the other approach, (data) classes, explicit function/method arguments etc. these can be auto inspected by an IDE and are much easier to use. The fact that you have to import something is in my eyes not really a problem. Anyway to use a library you have to import something to do anything.

Now in this case, ideally the force_kill argument would have been directly available alongside the message to whatever method/function is called. But if this is not possible with the current design and it has to be incorporated in the message itself, than the KillMessage approach would be a good alternative. I would perhaps just consider making it a proper dataclass to get automatic type validation, e.g.

from dataclass import dataclass

@dataclass
class KillMessage:

    message: str | None = None
    force_kill: bool = False

    @classmethod
    def build(...):
        ....

Actually, if using a dataclass, why not just use the constructor to build the instance instead of using the build classmethod. KillMessage(message="something, force_kill=True) is nicer than KillMessage.build(message="something, force_kill=True) I think

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

import from plumpy to only make a dictionary only for an input of a plumpy function itself

I won't say "only", encapsulate it in a function as a contract also serve as an input validator.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @sphuber, I just saw your comment, I had my browser open for during the weekend and I directly reply to Ali's comment without refresh.

Yes, thought about dataclass, and I think it can work. I use build into a dict mostly for the reason to keep the message the same as what was passing as before, since this is the message that should also go through de/serialize by the en/decoder and I am not 100% confident with that.

I think msgpack will take care of the dataclass for sure, but since the en/decoder used was the yaml one, I am not so sure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same here 😄 I had the page open without refreshing, so I didn't notice Sebastiaan's comment.

KillMessage(message="something", force_kill=True) is nicer than KillMessage.build(message="something", force_kill=True) I think

I see and agree with this suggestion, ✔️

@unkcpz unkcpz requested a review from khsrali December 9, 2024 12:09
src/plumpy/base/state_machine.py Outdated Show resolved Hide resolved

class KillMessage:
@classmethod
def build(cls, message: str | None = None, force: bool = False) -> MessageType:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My 2¢'s: I think it is rarely preferable to have a free-form dictionary to specify inputs to an API, especially if it is nested, because you will always have to start reading the docs/code to know how it is structured. Using the other approach, (data) classes, explicit function/method arguments etc. these can be auto inspected by an IDE and are much easier to use. The fact that you have to import something is in my eyes not really a problem. Anyway to use a library you have to import something to do anything.

Now in this case, ideally the force_kill argument would have been directly available alongside the message to whatever method/function is called. But if this is not possible with the current design and it has to be incorporated in the message itself, than the KillMessage approach would be a good alternative. I would perhaps just consider making it a proper dataclass to get automatic type validation, e.g.

from dataclass import dataclass

@dataclass
class KillMessage:

    message: str | None = None
    force_kill: bool = False

    @classmethod
    def build(...):
        ....

Actually, if using a dataclass, why not just use the constructor to build the instance instead of using the build classmethod. KillMessage(message="something, force_kill=True) is nicer than KillMessage.build(message="something, force_kill=True) I think

Copy link
Contributor

@khsrali khsrali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @unkcpz, I added more minor comments. I think overall looks good, I think we should really consider the suggestion by Sebastiaan

KillMessage(message="something", force_kill=True)

I think that would simplify the api and also makes it more readable & usable in aiida-core, in general.

@@ -31,7 +45,7 @@ class StateEntryFailed(Exception): # noqa: N818
Failed to enter a state, can provide the next state to go to via this exception
"""

def __init__(self, state: Hashable = None, *args: Any, **kwargs: Any) -> None:
def __init__(self, state: State, *args: Any, **kwargs: Any) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends, I think we don't have a clear public API list for plumpy. Since aiida-core didn't use this directly, it does not break backward compatibility of aiida-core.

ok, agreed, I'm also not so strict in this case, since so far aiida-core is the only known dependent of this plumpy

src/plumpy/base/state_machine.py Outdated Show resolved Hide resolved
src/plumpy/base/state_machine.py Outdated Show resolved Hide resolved

class KillMessage:
@classmethod
def build(cls, message: str | None = None, force: bool = False) -> MessageType:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same here 😄 I had the page open without refreshing, so I didn't notice Sebastiaan's comment.

KillMessage(message="something", force_kill=True) is nicer than KillMessage.build(message="something", force_kill=True) I think

I see and agree with this suggestion, ✔️

) -> None:
# If we are creating, then reraise instead of failing.
if final_state == process_states.ProcessState.CREATED:
raise exception.with_traceback(trace)

self.transition_to(process_states.ProcessState.EXCEPTED, exception, trace)
state_class = self.get_states_map()[process_states.ProcessState.EXCEPTED]
new_state = self._create_state_instance(state_class, exception=exception, trace_back=trace)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, thank you. It makes sense to write a comment in the definition of _create_state_instance, so to understand why you made this choice of design.


class KillMessage:
@classmethod
def build(cls, message: str | None = None, force: bool = False) -> MessageType:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@unkcpz , maybe you forgot to update this 😅

Comment on lines 1218 to 1220
state_class = self.get_states_map()[process_states.ProcessState.KILLED]
new_state = self._create_state_instance(state_class, msg=msg)
self.transition_to(new_state)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, to make this line more readable:

state_class = self.get_states_map()[process_states.ProcessState.KILLED]

may I suggest to somehow simplify it?

maybe this way:

@classmethod
def get_states_class(cls, state) -> Type[State]:
    return cls.get_states_map()[state]

state_class = self.get_states_class(process_states.ProcessState.KILLED)

src/plumpy/process_comms.py Outdated Show resolved Hide resolved
src/plumpy/base/state_machine.py Outdated Show resolved Hide resolved
src/plumpy/base/state_machine.py Outdated Show resolved Hide resolved
@unkcpz
Copy link
Member Author

unkcpz commented Dec 11, 2024

KillMessage(message="something, force_kill=True) is nicer than KillMessage.build(message="something, force_kill=True) I think

According to chatgpt, if the dataclass used with yaml loader for serialize/deserialize, it need to be passed as dict. I think KillMessage.build is more versatile then dataclass if we want to change the message protocol in the future, for example to support msgpack. The build method can still be called but return the type that conform with the new message protocol. With using dataclass, we somehow limit ourself that message must be the dataclass, which is not ideal.

@unkcpz
Copy link
Member Author

unkcpz commented Dec 11, 2024

Discussed with @khsrali, I think we agree on the API which seems more user friendly for construct different messages. By having a collection type MessageBuilder and have different message construct as class methods, this bundle the messages in one place and provide a reasoning way to call MessageBuilder.kill to construct kill message for instance.

MessageType = Dict[str, Any]


class MessageBuilder:
    """MessageBuilder will construct different messages that can passing over communicator."""

    @classmethod
    def play(cls, message: str | None = None) -> MessageType:
        """The play message send over communicator."""
        return {
            INTENT_KEY: Intent.PLAY,
            MESSAGE_KEY: message,
        }

    @classmethod
    def pause(cls, message: str | None = None) -> MessageType:
        """The pause message send over communicator."""
        return {
            INTENT_KEY: Intent.PAUSE,
            MESSAGE_KEY: message,
        }

    @classmethod
    def kill(cls, message: str | None = None, force_kill: bool = False) -> MessageType:
        """The kill message send over communicator."""
        return {
            INTENT_KEY: Intent.KILL,
            MESSAGE_KEY: message,
            FORCE_KILL_KEY: force_kill,
        }

    @classmethod
    def status(cls, message: str | None = None) -> MessageType:
        """The status message send over communicator."""
        return {
            INTENT_KEY: Intent.STATUS,
            MESSAGE_KEY: message,
        }

@unkcpz unkcpz force-pushed the Message-passing-with-more-information branch from b6cd8b3 to ba7c821 Compare December 11, 2024 17:52
@unkcpz unkcpz requested a review from khsrali December 11, 2024 17:52
@unkcpz unkcpz force-pushed the Message-passing-with-more-information branch 3 times, most recently from bcd64bc to d725a51 Compare December 11, 2024 19:39
@khsrali
Copy link
Contributor

khsrali commented Dec 11, 2024

Thanks a lot @unkcpz, looks good!
So now, we can use it like below in here in aiida-core:

msg = MessageBuilder.kill(text='killed through `verdi process kill` ', force_kill = True)
_perform_actions(processes, controller.kill_process, 'kill', 'killing', timeout, wait, msg=message)

Can you please open a matching PR in aiida-core ?
Thanks again

@unkcpz
Copy link
Member Author

unkcpz commented Dec 12, 2024

Can you please open a matching PR in aiida-core ?

Sure! I think we can merge this PR and continue with #288. Then we make a 0.23.0 release. I'll then open a PR and tested in aiida-core?

Copy link
Contributor

@khsrali khsrali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @unkcpz, looks very good to me.

If you want we can wait a bit, in case @sphuber has more comments..
Then we can merge it.

@unkcpz
Copy link
Member Author

unkcpz commented Dec 12, 2024

Thanks @khsrali. Yes, I'll wait till tomorrow if Seb has any more comments to add.

@unkcpz unkcpz merged commit f760b4a into aiidateam:master Dec 13, 2024
7 checks passed
@unkcpz unkcpz deleted the Message-passing-with-more-information branch December 13, 2024 09:10
unkcpz added a commit to unkcpz/plumpy that referenced this pull request Dec 14, 2024
…on (aiidateam#291)

The messages to passing over rabbitmq for process control is build dynamically and able to carry more information. In the old implementation, the messages ace global dictionary variables and when the message need to change by copy which is error-prone. 
This commit introduce the `MessageBuilder` with class methods for creating kill/pause/status/play messages. For "kill" message, I also add support for passing the `force_kill` option.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants