Modify carton_program_search to accept initial query #23

Merged
merged 4 commits into from
Jun 26, 2024
Changes from 3 commits
29 changes: 19 additions & 10 deletions python/valis/db/queries.py
@@ -243,7 +243,9 @@ def carton_program_map(key: str = 'program') -> dict:
     return mapping


-def carton_program_search(name: str, name_type: str) -> peewee.ModelSelect:
+def carton_program_search(name: str,
+                          name_type: str,
+                          query: peewee.ModelSelect | None = None) -> peewee.ModelSelect:
     """ Perform a search on either carton or program

     Parameters
@@ -252,22 +254,29 @@ def carton_program_search(name: str, name_type: str) -> peewee.ModelSelect:
         Either the carton name or the program name
     name_type: str
         Which type you are searching on, either 'carton' or 'program'
+    query : ModelSelect
+        An initial query to extend. If ``None``, a new query with all the unique
+        ``sdss_id``s is created.

     Returns
     -------
     peewee.ModelSelect
         the ORM query
     """
-    model = vizdb.SDSSidFlat.select(peewee.fn.DISTINCT(vizdb.SDSSidFlat.sdss_id))\
-        .join(targetdb.Target,
-              on=(targetdb.Target.catalogid == vizdb.SDSSidFlat.catalogid))\
-        .join(targetdb.CartonToTarget)\
-        .join(targetdb.Carton)\
-        .where(getattr(targetdb.Carton, name_type) == name)
-    return vizdb.SDSSidStacked.select().join(
-        model, on=(model.c.sdss_id == vizdb.SDSSidStacked.sdss_id)
-    )

+    if query is None:
+        query = vizdb.SDSSidFlat.select(peewee.fn.DISTINCT(vizdb.SDSSidFlat.sdss_id))
+
+    query = (query.join(
+        vizdb.SDSSidFlat,
+        on=(vizdb.SDSSidFlat.sdss_id == vizdb.SDSSidStacked.sdss_id))
+        .join(targetdb.Target,
+              on=(targetdb.Target.catalogid == vizdb.SDSSidFlat.catalogid))
+        .join(targetdb.CartonToTarget)
+        .join(targetdb.Carton)
+        .where(getattr(targetdb.Carton, name_type) == name))
Contributor

We'll likely need to do this for many of the query functions we have, as we add them to the main search. Is this how you'd recommend modifying them? Did you try the select_extend method? If so, how did that compare?

Relatedly, should we be writing our queries differently to make this kind of single-use or dynamic extension easier?

Member Author

select_extend() only helps if we want to return more columns than the initial query selects. Do you want this function to also return the program and carton columns? As written, the function can be called with any query that starts from the SDSSidStacked model, and it will restrict that query to the given carton or program.

The problem with the original query was that it would do a subquery to return all the unique sdss_ids and then subset to only those in the program or carton. That's a very expensive query and I think not necessary.
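
To make the difference in query shape concrete, here is a minimal sketch using plain SQL through the standard-library sqlite3 module rather than peewee. The table and column names are simplified, hypothetical stand-ins for the vizdb/targetdb models, and the data is invented. The first statement mirrors the removed code (a DISTINCT subquery joined back to the stacked table); the second mirrors the new approach of adding joins to a query that already starts from the stacked table. It only shows that the two shapes select the same ids on a toy table; it says nothing about timing on the real database.

```python
import sqlite3

# Toy schema; all names here are assumptions made for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sdss_id_stacked (sdss_id INTEGER PRIMARY KEY);
CREATE TABLE sdss_id_flat (sdss_id INTEGER, catalogid INTEGER);
CREATE TABLE target (pk INTEGER PRIMARY KEY, catalogid INTEGER);
CREATE TABLE carton (pk INTEGER PRIMARY KEY, carton TEXT, program TEXT);
CREATE TABLE carton_to_target (target_pk INTEGER, carton_pk INTEGER);

INSERT INTO sdss_id_stacked VALUES (1), (2), (3);
INSERT INTO sdss_id_flat VALUES (1, 10), (1, 11), (2, 20), (3, 30);
INSERT INTO target VALUES (100, 10), (101, 11), (102, 20), (103, 30);
INSERT INTO carton VALUES (1, 'carton_a', 'prog_x'), (2, 'carton_b', 'prog_y');
INSERT INTO carton_to_target VALUES (100, 1), (101, 1), (102, 2), (103, 1);
""")

# Joins shared by both shapes: flat -> target -> carton_to_target -> carton.
JOINS = """
JOIN target t ON t.catalogid = f.catalogid
JOIN carton_to_target c2t ON c2t.target_pk = t.pk
JOIN carton c ON c.pk = c2t.carton_pk
"""

# Shape of the removed code: DISTINCT subquery, then join back to stacked.
subquery_sql = f"""
SELECT s.sdss_id FROM sdss_id_stacked s
JOIN (SELECT DISTINCT f.sdss_id FROM sdss_id_flat f {JOINS}
      WHERE c.carton = ?) m
  ON m.sdss_id = s.sdss_id
ORDER BY s.sdss_id
"""

# Shape of the new code: extend a query that starts from the stacked table.
# DISTINCT is added here only so the two results can be compared directly.
extended_sql = f"""
SELECT DISTINCT s.sdss_id FROM sdss_id_stacked s
JOIN sdss_id_flat f ON f.sdss_id = s.sdss_id
{JOINS}
WHERE c.carton = ?
ORDER BY s.sdss_id
"""

ids_subquery = [row[0] for row in conn.execute(subquery_sql, ("carton_a",))]
ids_extended = [row[0] for row in conn.execute(extended_sql, ("carton_a",))]
print(ids_subquery, ids_extended)
```

Without the DISTINCT in the second statement, an sdss_id with several matching catalogids would appear more than once, which is worth keeping in mind when extending an arbitrary initial query.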

Contributor

Eventually I'd like to add the ability in the main search to specify additional columns to return. These columns in principle could come from any table, which might make things more complicated, but that can be addressed later.

Right now only a single carton or program can be selected, so it probably doesn't make sense to return the carton or program name in that case. I'd like to eventually move to a multi-select option for program/carton, in which case it would be nice to have those values returned. I think the query would have to be rewritten anyway.

Member Author

Yes, for multiple cartons this query would need to be rewritten. It's fairly easy to change the .where(getattr(targetdb.Carton, name_type) == name) to .where(getattr(targetdb.Carton, name_type).in_(name)), but I'd wait until that functionality is implemented in the API/Zora, since the IN statement is less efficient than ==.

But I did test adding .select_extend(targetdb.Carton.carton) for the carton and program after carton_program_search has been called and that works fine.
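
As a rough sketch of the single-value versus multi-select filter discussed here, the following uses sqlite3 and plain SQL instead of peewee. The helper names carton_filter and search, the table layout, and the data are all hypothetical. A single name produces an equality test and a list produces an IN clause, and the carton and program columns are returned alongside, which is the role select_extend would play in the ORM version.

```python
import sqlite3

def carton_filter(column, names):
    """Build a (where_sql, params) pair for one name or a list of names.

    `column` is restricted to 'carton' or 'program' so that it can be
    interpolated into the SQL text safely; values are always bound as params.
    """
    if column not in ("carton", "program"):
        raise ValueError(f"unsupported column: {column!r}")
    if isinstance(names, str):
        # Single value: plain equality, as in the current query.
        return f"{column} = ?", [names]
    names = list(names)
    placeholders = ", ".join("?" for _ in names)
    # Several values: an IN clause, as a multi-select would need.
    return f"{column} IN ({placeholders})", names

# Toy table with invented rows, standing in for the carton table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE carton (carton TEXT, program TEXT);
INSERT INTO carton VALUES
    ('carton_a', 'prog_x'), ('carton_b', 'prog_y'), ('carton_c', 'prog_x');
""")

def search(column, names):
    """Return (carton, program) rows matching the filter, ordered by carton."""
    where, params = carton_filter(column, names)
    sql = f"SELECT carton, program FROM carton WHERE {where} ORDER BY carton"
    return conn.execute(sql, params).fetchall()

print(search("carton", "carton_a"))
print(search("carton", ["carton_a", "carton_b"]))
```

Keeping the equality branch for the single-name case means the existing behaviour is unchanged until a multi-select is actually exposed.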


+    return query

 def get_targets_obs(release: str, obs: str, spectrograph: str) -> peewee.ModelSelect:
     """ Return all targets with spectra from a given observatory
7 changes: 4 additions & 3 deletions python/valis/routes/query.py
@@ -92,9 +92,10 @@ async def main_search(self, body: SearchModel):
             query = get_targets_by_sdss_id(body.id)

         # build the program/carton query
-        elif body.program or body.carton:
-            query = carton_program_search(body.program or body.carton, 'program' if body.program else 'carton')
-
+        if body.program or body.carton:
+            query = carton_program_search(body.program or body.carton,
+                                          'program' if body.program else 'carton',
+                                          query=query)
         # append query to pipes
         query = append_pipes(query)
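
The switch from elif to if means the carton/program filter now narrows whatever query was built earlier instead of replacing it. A minimal, ORM-free sketch of that pattern follows. The Query class is a hypothetical stand-in for peewee.ModelSelect that only records filter strings, and by_sdss_id and this main_search are simplified illustrations, not the functions in valis.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Query:
    """Hypothetical stand-in for an ORM query: records filters, runs nothing."""
    source: str
    filters: tuple = ()

    def where(self, condition: str) -> "Query":
        # Return a new Query so earlier queries are never mutated.
        return Query(self.source, self.filters + (condition,))

def by_sdss_id(sdss_id: int) -> Query:
    """Start a query restricted to one sdss_id."""
    return Query("sdss_id_stacked").where(f"sdss_id = {sdss_id}")

def carton_program_search(name, name_type, query=None):
    """Restrict `query` to a carton or program; start a new query if None."""
    if name_type not in ("carton", "program"):
        raise ValueError(f"unsupported name_type: {name_type!r}")
    if query is None:
        query = Query("sdss_id_stacked")
    return query.where(f"{name_type} = {name!r}")

def main_search(sdss_id=None, program=None, carton=None):
    """Build the base query first, then layer the carton/program filter on it."""
    query = by_sdss_id(sdss_id) if sdss_id is not None else None
    if program or carton:
        query = carton_program_search(program or carton,
                                      'program' if program else 'carton',
                                      query=query)
    return query

combined = main_search(sdss_id=5, carton="carton_a")
standalone = main_search(program="prog_x")
print(combined.filters)
print(standalone.filters)
```

Each search function written this way can be chained in any order, since every one accepts the query built so far and returns an extended copy.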
