-
Notifications
You must be signed in to change notification settings - Fork 176
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve error checking, StaticFW doesn't pass vasp input set params correctly(?) #295
Comments
I just did a quick grep and it seems like this is in This kind of thing would make for a good test and is a good example of how using atomate to make test files for atomate can be dangerous. Maybe the correct solution for this is for FiretaskBase to raise if any keyword arguments to the initializer of the subclass are not in |
Hi all
Thus, if you take the time to fill out Thoughts? |
I think that's part of the problem here. I think allowing flexibility in FireTaskBase in general is good, but in core atomate specifically can bite us. We could be more strict here. The calculations I was running this weekend I had to cross-reference with allowed kwargs from DictSet regardless, and this is very error prone (I missed a plural in the first instance, and no warning was raised). As part of #290, I'm still thinking about ways we can add additional checks ... maybe some kind of validate_dictionary_keys convenience function. For example, INCAR can only take a set of known keys, vasp_input_set_params can only take a set of known keys, we could validate both -- or at least warn if the key isn't known, while still allowing flexibility for power users. |
I agree with the second two points, but not the first. If required parameters don’t need to be specified, doesn’t that mean they aren’t actually required, but optional? Required parameters should be required and optional should be optional? It should be super simple and obvious to the user - no magic:
#2 would catch this issue. |
No, but I think you are getting at something similar. I am saying that |
I see what you mean now. I think it's reasonable to ask that Firetask developers be explicit about the Firetask's API in the form of required and optional parameters. It's a little extra effort when writing the Firetask, but that extra effort will be worth it if the Firetask will be relatively unchanged (as many are now) since they will be read by users (Firework developers) many more times than they were written. |
I already wrote why optional parameters should not be required, please read my explanation from before. Matt seems to get it. Note that there is a difference between requiring something in FireWorks (requires careful thought) versus having a convention in atomate. |
I hate to jump into the fray here, but in my opinion, everything should as far as possible be required. The fewer opportunities there are for people to be lazy, the more robust the code. Re @computron about people wanting to pass kwargs to objects within FireTasks/Works, there has to be some sane guidelines on what should be allowed. My suggested guidelines are pretty simple: if the called method has pretty much the same purpose as an existing method, or is a subclass, kwargs passthrough is reasonable. E.g., if you are writing a PDPlotter with a plot method, then passing through **kwargs to matplotlib.pyplot.plot is fine. Otherwise, the preference should be to use a proper dict arg/kwarg. E.g., allowing passthrough from FireTask to VaspInputSets would be a bad idea (this does not seem to be the case in atomate, and something like WriteVaspInputTask actually has a proper vasp_input_params). kwarg passthrough in atomate's FireWorks basically goes to the parent FireWork superclass, which is fine. There can be only one kwarg passthrough and no other objects are allowed to use it. So kwarg passthrough should be used very judiciously. In short, I would strongly discourage use of kwargs passthrough in most cases. In which case, the default should be to require everyone to specify all required and optional parameters unless there is great reason not to do so. You should also specify a rule that if a variable is named something in the core FireTasks, that parameter naming should be carried through. E.g., |
I have thus gone my original suggestion. If a developer specifies required_params, FireWorks will enforce that required_params are in place (this was already there before). If a developer specified optional_params for their FireTask, FireWorks will enforce that a user does not put any kwarg parameters (outside required or optional params) when initializing that FireTask. Note there are good reasons not to require developers to add |
Also, after making the FWs change, looks like @shyuep 's PyTask fails the tests since this is one of the few FireTasks that was actually doing the whole "pass-through" thing - meaning, PyTask allows a user to put in kwargs that are not explicitly listed in required_params or in optional_params. I've opened a separate ticket on FireWorks for this where we can resolve. |
@computron I think we can have an Atomate FiretaskBase class which other atomate firetasks must subclass. The default would be to enforce all parameters, unless someone turn on a flag called |
If it doesn't make sense to be more strict in FireWorks, I'm on board with atomate having a more strict FiretaskBase in atomate v2. That would be a particularly nice feature since we can catch bugs at the time the Fireworks and workflows are instantiated, rather than at runtime. Anything we can do that direction would be valuable. |
I don't think it makes sense to introduce a new FireTaskBase in atomate, that is just silly. It creates yet another thing for people to learn about - anyone outside this group of people will be utterly confused as to why atomate FireTasks should be subclassing some other base class than the normal one. If you really want to police this, I would suggest just adding a unit test that goes through all the FireTasks and makes sure Note that even if we did set up a different FireTaskBase, e.g. AtomateFireTaskBase (which I think there are many reasons not to), there'd be no automated way to detect that someone subclassed the original FireTaskBase by mistake. |
(unless you also did an automated test to make sure all FireTasks subclassed AtomateFireTaskBase - which at that point, you might as well just have that test check whether required_params and optional_params are present instead of checking the class that it is subclassing) |
The problem with just trying to test it is that it's hard to test the entire surface of arguments that every Firework tries to send to a Firetask. A plain test is fine, but what about a Workflow that has control flow that sets an option in a Firework that gets propagated to a Firetask (the All of atomate has 66 classes that inherit from FIretaskBase. They do not get added very frequently and since all things should go through PRs, it shouldn't be a problem to check it by hand, although (as you said) automated testing is possible. |
It should be impossible to write class DoSomethingFW(Firework):
def __init__(...):
# do stuff
tasks = []
...
tasks.append(MyCustomFiretask(..., bad_input_arg=foo))
super(DoSomethingFW, self).__init___(...) If But yeah, it does seem silly to have |
@bocklund That is already the case with the FWS changes I just pushed.... Look at the recent changes I pushed FWS recently and have been talking about since several comments up.
|
Sorry I missed that, thanks for highlighting it |
@computron I don't think it is silly to have a separate AtomateFireTaskBase that subclasses FireTaskBase and just adds a level of check. This is in fact frequently done in many other classes. The requirements of FireWorks and Atomate are different. Your solution of writing a test to check that all required/optional args are specified is impossible to implement (go ahead and tell me how you are going to inspect that a FireWork is calling a FireTask with the right parameters within a unittest framework, other than having FireWorks itself do the policing). It is also inflexible since exceptions (e.g., PyTask) has to be hard-coded into the test. My suggested solution merely forces the error to come up during runtime (whether in unittest or in normal use) when wrong args/kwargs are passed to an AtomateFireTask, like practically all methods where you supply garbage args/kwargs. e.g., if you supply a Anyway, I have said my piece. |
I think you are misunderstanding.
My suggestion was just to make sure that required_params and optional_params are defined (not None). It is up to the developer to make sure that those parameters are correct.
This is true, but it's not much different than having to manually review which FireTasks must subclass AtomateFireTaskBase and which ones are OK to subclass FireTaskBase. In my suggestions, at least those decisions are at least encoded in a unit test. i.e. we enforce that all FireTasks in atomate must include checks on required_params and optional_params, unless we make a real exception. This is in-line with your own suggestion to only allow flexible optional_params on a very-case-by-case basis. Note that in the current atomate, I don't think a single exception is needed. So it's not like the unit test is going to be littered with exceptions.
See my message to Brandon. This is already the case in the recent changes I pushed to FireWorks. As long as the user supplies required_params and/or optional_params, garbage arguments will throw an error at Runtime. No unit test is needed. For example, you can try to pull the most recent Fireworks (bleeding edge master) and just try to instantiate any atomate Firetask with garbage arguments. The code will fail immediately. No unit test is needed. The only Firetasks that are exceptions to this "fail immediately" behavior are the ones where the person who writes the FireTask does not supply required_params or optional_params. In those cases, the strict checking is skipped. For example, PyTask supplies required_params but not optional_params. FireWorks will thus make sure all required_params are present, but since there is no optional_params, it will not check for extraneous params (since PyTask wants to allow any parameter through its auto_kwarg argument). In short, just check the most recent FireWorks, it already has this. |
Also, since it seems multiple people seem to have missed my comment made on FireWorks changes, I'll explicitly paste the new unit test that it's in FireWorks now that does parameter checking:
Hopefully this will avoid any further questions / confusion on the same topic. |
@computron I did understand your changes. The fact is that it requires someone to have actually specified required and optional parameters for there to have any checks. My suggestion is a minor mod that even if someone did not supply required and optional params, garbage arguments are not allowed by default in Atomate. This is the subtle difference. In Fireworks, this would be allowed. |
We both want atomate FireTasks to include basic argument checking to help prevent problems like what Matt saw. You want to have a separate base class that is almost the same as FireTaskBase, but in which the default required_params and optional_params are essentially both What I want is for everyone in atomate to specify In your case, to enforce that people are following the rules, you must somehow make sure they are subclassing AtomateFiretaskBase. Either at PR time or through a unit test. In my case, to enforce that people are following the rules, I must somehow make sure they are specifying "required_params" and "optional_params". Either at PR time or through a unit test. Functionally, they are both the same. The difficulty of enforcement seems the same to me. I don't know how to convince you that my way will be more intuitive to most people. But, if you feel strongly about your way then you can open an issue to modify the way that it works. If other developers agree that it is more intuitive then I will switch it. |
It should be possible to write a test that asserts that |
@computron Functionally they are the same, but the intuition and ease of bad coding are different. Say I am a lazy coder who just wants to write something that will pass all the unit tests. I write BadFireTask and just set required_params and optional_params to []. This will basically allow me to pass any kind of garbage to the FireTask. Atomate's BDFL sees my FireTask and interprets that I have done my job to specify required_params and optional_params (even though both are []) and merges. Yay, everyone is happy, until Horton the Elephant comes along and stumbles on the Who bug above that is nigh impossible to catch otherwise. In my world, everyone subclasses AtomateFireTaskBase, which is in Atomate itself. I don't have to refer to FireWorks doc to find out what AtomateFireTaskBase is. There is no way I can be lazy and write required_params = [] and optional_params = [] to pass tests, because if something is not in there, it is not allowed. If I instead writes Anyway, that's just my world view. It comes from experience that making a choice is always a cognitive load. Whether you opt-in vs opt-out of organ donations, or 401k participation, for example. Studies have time and time shown again that the path of least resistance must push people to the desired behavior. |
|
@bocklund Ah yes, sorry my mistake. I meant if required and optional args are not specified. |
Brandon is correct. There is no way you can write required_params = [] and optional_params = [] to pass tests in my solution, since in that case if someone passes in any argument at all to the Firetask the code will throw a RuntimeError. I am done speaking on this thread, as there is only so many times I can say the same thing ... additional comments will not get any responses from me. |
If they aren't specified, then the default I think the solution implemented should make everyone here happy. |
Am I reading this incorrectly, or does this just not work for custom input set params? The keyword arg
vasp_input_set_params
should bevasp_input_params
?atomate/atomate/vasp/fireworks/core.py
Line 137 in 4ad8076
atomate/atomate/vasp/firetasks/write_inputs.py
Line 53 in 4ad8076
Just lost some CPU time because of this. I think how we pass around keyword args really needs to be re-thought, it's very brittle and doesn't scale well when you have a lot of nested classes.
At the very least we need more error checking, it's a shame my FW made it all the way to being computed before I realized. I imagine it may be possible to use this option and never realize your settings weren't picked up properly.
Will have to have a read through to see if other FWs are affected.
The text was updated successfully, but these errors were encountered: