Changes to the atomate drone have caused issues for MP production: it is not feasible to re-ingest every time the drone changes, and aggregations become difficult when a key changes (the most recent example being the delta_volume_as_percent key).
The atomate drone is quite stable now, so hopefully this issue will be minimal going forward. However, I suggest two changes:
The task doc needs a schema (jsonschema). We already have tools for building up schemas that we've been using, so this just needs to be implemented, although it would add a dependency (the jsonschema package itself). JSON Schema is a good fit because it is a standard and can easily be attached to a Mongo collection to perform on-the-fly validation. It is also not necessarily strict: additional keys can be added and a doc will still validate, since the schema only ensures that a set of known keys is present. By adopting this standard schema, we can also remove the ad-hoc validation currently in the drone.
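To make the "not necessarily strict" point concrete, here is a minimal sketch using the jsonschema package. The schema is an illustrative subset only: the key names, types, and the `task_doc_errors` helper are assumptions for the example, not the actual atomate task doc layout.

```python
from jsonschema import Draft7Validator

# Illustrative subset of a task doc schema. The keys and types below are
# assumptions for this example, not the real atomate schema.
TASK_SCHEMA = {
    "type": "object",
    "required": ["task_id", "state", "analysis"],
    "properties": {
        "task_id": {"type": "integer"},
        "state": {"type": "string"},
        "analysis": {
            "type": "object",
            "required": ["delta_volume_as_percent"],
            "properties": {"delta_volume_as_percent": {"type": "number"}},
        },
    },
    # True is already the default; written out to show unknown keys are allowed.
    "additionalProperties": True,
}

validator = Draft7Validator(TASK_SCHEMA)


def task_doc_errors(doc):
    """Return a sorted list of validation error messages (empty if valid)."""
    return sorted(error.message for error in validator.iter_errors(doc))


# Valid: all known keys are present, and the extra key is tolerated.
good_doc = {
    "task_id": 1,
    "state": "successful",
    "analysis": {"delta_volume_as_percent": 1.2},
    "extra_key": "still validates",
}

# Invalid: the required analysis key is missing (hypothetical older key name).
bad_doc = {
    "task_id": 2,
    "state": "successful",
    "analysis": {"delta_volume_percent": 1.2},
}

print(task_doc_errors(good_doc))
print(task_doc_errors(bad_doc))
```

For on-the-fly validation, MongoDB accepts a `$jsonSchema` document as a collection validator. Its dialect differs somewhat from standard JSON Schema (for example, it adds `bsonType`), so the same schema dict may need small adjustments before being attached.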
We should have code for database migrations. Whenever the schema does change, even if it's just a trivial key rename, we could have a script, e.g. migrate_tasks_0.65_to_0.66.py or similar, that serves as the central source of authority for how to process old task docs.
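As a rough sketch of what such a migration script might contain, the function below handles a single key rename on an in-memory task doc. The old key name `delta_volume_percent`, the `schema_version` field, and the function name are all hypothetical, chosen only to illustrate the idea.

```python
import copy


def migrate_task_0_65_to_0_66(doc):
    """Return a migrated copy of a task doc; the input is left untouched.

    Hypothetical migration: renames analysis.delta_volume_percent to
    analysis.delta_volume_as_percent and stamps an assumed schema_version.
    """
    new_doc = copy.deepcopy(doc)
    analysis = new_doc.get("analysis")
    if isinstance(analysis, dict) and "delta_volume_percent" in analysis:
        analysis["delta_volume_as_percent"] = analysis.pop("delta_volume_percent")
    new_doc["schema_version"] = "0.66"  # assumed field, not in current task docs
    return new_doc


old_doc = {"task_id": 1, "analysis": {"delta_volume_percent": 1.2}}
new_doc = migrate_task_0_65_to_0_66(old_doc)
print(new_doc)
```

The function is idempotent, so running it on a doc that has already been migrated leaves it unchanged. For a pure rename across a whole collection, MongoDB's `$rename` update operator could do the same thing server-side; keeping the logic in a script still gives one place that documents each change.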