Would users ever require a different hub model-output directory structure?
#53
Replies: 2 comments 3 replies
-
The concepts of teams/models and rounds are central to hubs, so I can't imagine a hub objecting to using those. So I don't think we need to support "less" than the above structure. I could imagine a hub might want to allow for additional levels of file structure, e.g. using further file path information, as is standard in Arrow datasets, to allow for rapid queries on target id or output type variables. For example, a hub might want to specify that files are saved in paths like On the data load side, support for this is almost out of the box from Apache Arrow, right? But it might take some more thought to specify how the file paths should be organized in hub metadata. I would personally file support for this type of thing as a relatively low-priority future enhancement so that we can focus on higher-priority validation stuff for now.
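To make the idea of "further file path information" concrete, here is a minimal sketch of how Hive-style `key=value` directory names (the convention Arrow datasets can read as partition fields) might be pulled out of a file path. It uses only the Python standard library rather than Arrow itself. The function name `parse_partitions`, the example path, and the field name `output_type` in the path are all hypothetical and are not taken from any hub specification.

```python
from pathlib import PurePosixPath


def parse_partitions(rel_path: str) -> dict:
    """Split a relative file path into Hive-style partition fields.

    Directory segments of the form ``key=value`` are collected under
    "partitions"; all other directory segments are kept, in order,
    under "unkeyed". The final path component is treated as the file name.
    """
    *dirs, filename = PurePosixPath(rel_path).parts
    result = {"file": filename, "unkeyed": [], "partitions": {}}
    for segment in dirs:
        key, sep, value = segment.partition("=")
        if sep and key and value:
            result["partitions"][key] = value
        else:
            result["unkeyed"].append(segment)
    return result


# Hypothetical path: directory and file names are illustrative only.
example = "model-output/team1-modelA/output_type=quantile/2023-01-02.parquet"
info = parse_partitions(example)
print(info["partitions"])
```

A reader that understands this convention can skip whole directories when a query filters on a partition field, which is where the speed-up for queries on target id or output type would come from. How such a layout should be declared in hub metadata is left open here.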
-
I think this is a matter of philosophy of the project. If we state up front that the goal here is not to claim to be exhaustive, but instead to do what seems to be appropriate based on our experience, we can leave an opening for other variants if/when they arise. That said, this feels like a case where prematurely generalizing without a specific driving use case could lead to difficulty, so I'd say stick with the current structure until something else arises.
-
Currently we specify the partitioning (directory structure) of the model-output directory only in the documentation and assume it is fixed. This obviously makes things simple for hubUtils. I am wondering, however, whether there would be any situation where this structure might not meet user needs, and whether an option for flexibility, i.e. letting the user set a different partitioning structure, would be required?