Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent import paths for sibling classes across rail* repos #23

Open
aimalz opened this issue Jul 7, 2023 · 9 comments
Open

Inconsistent import paths for sibling classes across rail* repos #23

aimalz opened this issue Jul 7, 2023 · 9 comments
Assignees

Comments

@aimalz
Copy link
Collaborator

aimalz commented Jul 7, 2023

Related to #19, there are inconsistencies among import locations for analogous functionality between the rail repositories. For example, rail_pzflow defines the FlowHandle in src/rail/tools/flow_handle.py but the other DataHandle subclasses in rail_base are in src/rail/core/data.py. This issue is for establishing consistent locations for analogous subclasses of the same parent class that will be present in the standalone repos, either the way we already do for src/rail/creation/engines and src/rail/estimation/algos (and the same for evaluation once #14 is completed) or another way that's compatible with the namespace setup.

@eacharles
Copy link
Collaborator

eacharles commented Jul 7, 2023 via email

@aimalz
Copy link
Collaborator Author

aimalz commented Jul 14, 2023

  • rail_astro_tools/src/rail/tools/utilPhotometry.py (whose contents I'd expect to find in src/rail/core/photometry_utils.py)
  • rail_astro_tools/src/rail/creation/degradation (which for consistency with src/rail/creation/engines, src/rail/estimation/algos, and src/rail/evaluation/metrics should probably be called rail_astro_tools/src/rail/creation/degraders)
  • rail_bpz/src/rail/bpz/utils.py (with actual content that may or may not belong in rail_bpz/src/rail/estimation/algos/_bpz_utils.py, although the self-import I don't understand is a hint that it's related to namespace magic and might not be subject to this issue at all)
  • rail_fsps/src/rail/added_examples/sed_gen_fsps-demo.ipynb (which I think can straightforwardly be moved to rail/examples/creation_examples/sed_gen_fsps-demo.ipynb as per Gather demo notebooks from older standalones rail#48)
  • rail_fsps/src/rail/creation/sed_generation/generator.py (which is an analogue to Creator so would should be at the same level as rail_base/src/rail/creation/engine.py and suggests that the contents of rail_base/src/rail/creation/engine.py should be split into separate modules; also, it would be more consistent if the current generator.py were called sed_generator.py)
  • rail_fsps/src/rail/creation/sed_generator.py (which belongs in src/rail/engines; also, it would be more consistent if the current sed_generator.py were called fsps_gen.py or something like that)
  • rail_pzflow/src/rail/tools/flow_handle.py (I'm not totally clear on why rail_base's ModelDict/ModelHandle aren't sufficient for this, but if we anticipate standalones to have to define their own DataHandles, we should group them in rail_base/src/rail/core/handles, starting with the current DataHandle subclasses in rail_base/src/rail/core/data.py.)

@aimalz aimalz self-assigned this Jul 14, 2023
@eacharles
Copy link
Collaborator

eacharles commented Jul 17, 2023 via email

@eacharles
Copy link
Collaborator

Ok, it looks like the this has all been address in other PRs.

@aimalz
Copy link
Collaborator Author

aimalz commented Oct 31, 2023

To recap, it looks like the status is as follows (and please correct me if I'm wrong here):

  • rail_astro_tools/src/rail/tools/utilPhotometry.py --> rail_astro_tools/src/rail/core/photometry_utils.py is completed
  • rail_fsps sounds like there are no objections to the proposed changes. In fact, @torluca already took a stab at most of them, and the remaining ones might be blocked by some FSPS-specific data issues IIRC. I'll follow up and make an issue if any actionable consistency improvements remain.

The remaining items are related to extensibility of the code to more underlying algos so their necessity largely depends on how likely we think it is that these will come up again as the railiverse expands.

  • rail_pzflow/src/rail/tools/flow_handle.py -- sorry for not being clearer in my original description! I wasn't suggesting moving content from the standalones into rail_base, because that would indeed make no sense. The proposal is to create a src/rail/core/handles directory in rail_base and the other repos that define custom datahandles, to move the current datahandles defined in rail_base/src/rail/core/data.py to separate modules in that new path in the rail_base repo, and to move custom datahandles from other repos into that path within their own repo. The motivation for this change is that new algorithms in their standalone repos are fairly likely to need custom datahandles, so it should be straightforward to find and access them under the namespace setup, particularly since datahandle types are very visible to users when troubleshooting pipelines.
  • rail_bpz/src/rail/bpz/utils.py -- thanks for the confirmation of how it works! I think this comes down to whether we expect other algorithms to also need special structure for handling paths, or if it's just BPZ, and, if it is going to come up again, how user-facing it is. My sense is that if there wasn't a way to eliminate this path issue for BPZ, there probably won't be a way to do it for other algos that need specialized input files in hardcoded directories (for relevance, I think this is related to though perhaps not exactly the same as what's come up with FSPS recently), so we'd benefit from setting ourselves up for generalizability sooner rather than later. The fact that its presence in demos is stated in "utils" import problem #21 as a reason why the change could be disruptive is a strong indication that it's sufficiently user-facing to merit consideration (and if the "disruption" is isolated to the demos, I'm happy to take care of it). A src/rail/core/path_utils directory would be the most general approach (as in the above proposal for a common import path to custom datahandles) if we have to worry about users being able to figure out how to access this kind of thing for multiple standalone algos.

@eacharles
Copy link
Collaborator

I'm sorry, but it looks to me like both of these changes would be highly disruptive and I'm not seeing that we would gain much. Perhaps someone else can weigh in.

@eacharles
Copy link
Collaborator

To wit, 1) I don't see a problem with having a number of rail_xxx/src/rail/xxx/utils.py files that set specific things for specific packages.
2) moving where the data handles live would break literally everything. It just doesn't seem worth it to me.

@sschmidt23
Copy link
Collaborator

My two cents:
-I haven't looked at the fps structure very much, so I can't say much on that one.

-Do we really see lots of custom data handles being a big issue? I did not expect a huge number of those, so the flow_handle thing seems a bit like overkill to me, particularly from a cost/benefit analysis standpoint.

-For the bpz_utils, I think this is fairly unique to BPZ, it was done this way because the existing code handled file I/O with paths relative to an environment variable base path, and doing it this way was minimally disruptive to that existing code. So, I think it will not be a common occurrence in the future, and we can just let this one instance sit as-is and not worry about it too much.

@ztq1996
Copy link
Contributor

ztq1996 commented Apr 17, 2024

I am moving this out of the v1 tracker since we have decided not to make this change before v1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants