Does atomate2 (or any other package specific to a job) need to be installed on the JFR server? #41
Comments
I had this question too. I think the answer from @gpetretto / @davidwaroquiers at the time was that the runner is also intended to be run locally (i.e., the JFR server is just for the database). I find this a little counter-intuitive and think we need to do a better job of separating compute environments from workflow environments.
I should say that we usually don't have a separation between the "local" machine and the one running the daemon. So up to now we have always had only two entities:
I guess the problem that you are raising is not so trivial. First, it does not happen during the check_out (since the state is CHECKED_OUT, the checkout has already happened). The problem is when uploading. See:
The problem is that uploading requires resolving the job references, so the
Ok, I think this should be possible to fix. It is relatively easy to get the references without needing to deserialise the whole thing. I might already have some code in jobflow to do that. I'll have a look at where the full job object is used. Perhaps we can get away without deserialising any attributes of the job.
If you already have this part, I will give it a try. Since the object is often deserialized in the Runner to access the content as properties rather than by navigating through nested dictionaries, I will need to go through the points where this happens. I would also add one potential downside of splitting the machines where the jf command and the daemon are executed: in some cases I check if the Runner is active before proceeding with an action, e.g. jobflow-remote/src/jobflow_remote/cli/job.py, line 263 in 138a7b1.
Anyway, this is not a mandatory check, but the idea is to help the user avoid running actions that may lead to inconsistencies.
Here is the code that can find references. You can pass a serialised dict to this and it will return
It is quite convenient to have a centralised server running the daemon; I suppose a future update could have the daemon also run a server that can be connected to.
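(The code embed did not survive in this thread. Below is a minimal sketch of the idea, not the original snippet, assuming references serialise as dicts carrying an "@class": "OutputReference" marker, as in jobflow's MSONable encoding.)

```python
def find_reference_dicts(obj):
    """Collect serialised OutputReference markers from a raw document.

    Sketch only (not the original jobflow code): walks nested
    dicts/lists and returns every dict that looks like a serialised
    OutputReference, without deserialising anything else.
    """
    refs = []
    if isinstance(obj, dict):
        if obj.get("@class") == "OutputReference":
            refs.append(obj)
        else:
            for value in obj.values():
                refs.extend(find_reference_dicts(value))
    elif isinstance(obj, (list, tuple)):
        for item in obj:
            refs.extend(find_reference_dicts(item))
    return refs
```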
I have worked on avoiding deserializing everything that requires external packages in the Runner. There were several points that I needed to modify, as the code was relying heavily on the object's attributes and methods, in some cases just for convenience, in others to avoid reimplementing functionality.

```diff
--- a/src/jobflow/core/reference.py
+++ b/src/jobflow/core/reference.py
@@ -170,7 +170,6 @@ class OutputReference(MSONable):
         data = cache[self.uuid][index]
 
         # decode objects before attribute access
-        data = MontyDecoder().process_decoded(data)
 
         # re-cache data in case other references need it
         cache[self.uuid][index] = data
@@ -180,7 +179,7 @@
             data = (
                 data[attr]
                 if attr_type == "i" or isinstance(data, dict)
-                else getattr(data, attr)
+                else data.get(attr)
             )
 
         return data
```

I tested it with a couple of workflows (including atomate2) but I am not sure if there are cases that I am not considering. I think that these changes in jobflow-remote are beneficial independently of the centralised server approach, since they will also help to reduce the
Sounds good to me. I would say such an addition in jobflow would be reasonable, with the deserialize switch argument. What do you think @utf?
Thanks so much for looking into this. I think adding a deserialise option sounds like a good solution and should work in most cases. This won't work if the user tries to do something like
To make things flexible (your example with structure.volume will not work if jobflow-remote always avoids the deserialization), we should then have a config option in jobflow-remote to say "I want to always serialize/deserialize objects (or not)". What should be the default? I would say to avoid deserialization, but I am not so sure. Consider when (in the future) some jobs (e.g. supercell generation, refine_structure, or things like this) are actually executed on the jobflow-remote server (instead of being submitted to a queue of a cluster). Then you would actually need things to be deserialized.
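A minimal sketch of what such a switch could look like (hypothetical helper and signature, not the actual jobflow/jobflow-remote API):

```python
from monty.json import MontyDecoder

def resolve_attrs(data, attributes, deserialize=True):
    """Resolve a chain of (attr_type, attr) pairs on a stored output.

    Hypothetical helper: with deserialize=True the document is decoded
    into Python objects, so attribute access and derived properties
    work; with deserialize=False the raw dict is navigated by key,
    which never imports external packages.
    """
    if deserialize:
        # rebuild the full Python objects (may import e.g. pymatgen)
        data = MontyDecoder().process_decoded(data)
    for attr_type, attr in attributes:
        if attr_type == "i" or isinstance(data, dict):
            data = data[attr]
        else:
            data = getattr(data, attr)
    return data
```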
Indeed, I did not think of derived properties. As @davidwaroquiers said, there is no way to tell beforehand whether the deserialization is needed or not.
```diff
--- a/src/jobflow/core/reference.py
+++ b/src/jobflow/core/reference.py
@@ -170,18 +170,13 @@ class OutputReference(MSONable):
         data = cache[self.uuid][index]
 
         # decode objects before attribute access
-        data = MontyDecoder().process_decoded(data)
 
         # re-cache data in case other references need it
         cache[self.uuid][index] = data
 
         for attr_type, attr in self.attributes:
             # i means index else use attribute access
-            data = (
-                data[attr]
-                if attr_type == "i" or isinstance(data, dict)
-                else getattr(data, attr)
-            )
+            data = data[attr]
 
         return data
```

If it fails because of a
What do you think?
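(The sentence above is cut off, but the idea appears to be: try cheap dict access first and only deserialize on failure. A hedged sketch of that fallback, hypothetical code rather than the actual implementation:)

```python
from monty.json import MontyDecoder

def resolve_with_fallback(data, attributes):
    # Sketch of the "deserialize only on failure" idea: try plain key
    # lookups on the raw dict; if an attribute is not a stored key
    # (e.g. a derived property such as structure.volume), decode the
    # objects and retry with attribute access.
    try:
        resolved = data
        for _attr_type, attr in attributes:
            resolved = resolved[attr]
        return resolved
    except (KeyError, TypeError, IndexError):
        resolved = MontyDecoder().process_decoded(data)
        for attr_type, attr in attributes:
            if attr_type == "i" or isinstance(resolved, dict):
                resolved = resolved[attr]
            else:
                resolved = getattr(resolved, attr)
        return resolved
```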
This can be closed as it was fixed in #47, right? @gpetretto
I don't think this is completely solved. Quoting from the PR:
Some time has passed, but I think we never decided how to handle the case of derived properties. The point is that there are cases where it is mandatory to deserialize, otherwise the references cannot be resolved. The options should still be those mentioned above: #41 (comment)
My setup is as follows:
I thought I would only need to install atomate2 on the HPC remote and the local computer, since all atomate2-specific code will only get executed on these machines. However, when trying to run a workflow in this configuration, I get an error on checkout.
Is this the intended behaviour?