Support the data-metrics study having a per-study build option #308

mikix · 2024-10-22T15:09:20Z

Problem Statement
The data-metrics study (already a special case in a lot of ways) needs one more special capability: per-study builds.

What I mean is, we want a user to be able to ask "what are the data metrics of the cohort selected by the covid study?". This would help determine whether a study's cohort data is well formed and of good quality.

Since this will require special Library support in a few places, I wanted to file a tracking issue for the various pieces of this. Maybe these should be separate issues - but I wanted to leave open the option to discuss the whole approach here too.

Manifests/Inventories

We'll probably want the Library to start writing out inventory tables (a list of resource IDs in the study cohort) somewhere, so that data-metrics can read them and scope its investigation down to that set of IDs rather than the whole database.

Maybe just patient & encounter IDs? Or could do it for all resources.

I don't know what table naming approach makes sense. Maybe study_name__lib_manifest_patients?
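To make the naming idea concrete, here is a minimal sketch of how such a table name could be derived. The helper name `manifest_table_name`, the simple pluralization rule, and the character check on the study name are all assumptions for illustration, not existing Library behavior; only the `__lib_manifest_` infix comes from the suggestion above.

```python
import re


def manifest_table_name(study_name: str, resource: str) -> str:
    """Build a hypothetical inventory table name for one resource type.

    Example shape: study_name__lib_manifest_patients
    """
    # Assumption: study names are lowercase identifiers (letters, digits, underscores).
    if not re.fullmatch(r"[a-z][a-z0-9_]*", study_name):
        raise ValueError(f"unexpected study name: {study_name!r}")
    # Assumption: the table suffix is the lowercased resource name plus "s".
    return f"{study_name}__lib_manifest_{resource.lower()}s"


if __name__ == "__main__":
    for resource in ("Patient", "Encounter"):
        print(manifest_table_name("covid", resource))
```

If inventories end up covering all resources rather than just patients and encounters, the same helper would work as long as the pluralization rule holds for each resource name.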

~~Cleaning~~ (solved by #309)

Another concern is that the Library likes to auto-clean a study prefix during build. If the data-metrics study is making per-study little mini-builds in a custom prefix (maybe data_metrics_study_name__*), we'll need to tell the Library to only clean that custom prefix.

Library code has the option for custom prefix cleaning. We just need to tell it which prefix.

Since that would be dynamic (likely based on some runtime option like --option study:study_name), we'd need the Library to call some study-based Python code for the prefix.
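As a rough sketch of what that study-based Python might do: parse `key:value` pairs like the `study:study_name` example above and derive the custom prefix from them. The function names `parse_options` and `custom_prefix` are hypothetical, and how the Library would actually hand options to study code is an assumption here.

```python
def parse_options(pairs: list[str]) -> dict[str, str]:
    """Turn ["study:covid"] style arguments into a dict (hypothetical helper)."""
    options = {}
    for pair in pairs:
        key, sep, value = pair.partition(":")
        if not sep or not key or not value:
            raise ValueError(f"expected key:value, got {pair!r}")
        options[key] = value
    return options


def custom_prefix(options: dict[str, str]) -> str:
    """Derive the per-study prefix, following the data_metrics_study_name idea."""
    return f"data_metrics_{options['study']}"


if __name__ == "__main__":
    opts = parse_options(["study:covid"])
    print(custom_prefix(opts))
```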

Maybe that could be more generic and have a manifest hook for some early Python that would allow editing the manifest definition (of which, study prefix is but one option).

mikix · 2024-10-23T13:13:44Z

After talking it over, Matt and I are leaning toward handling the cleaning part by adding something like this to manifest.toml:

prefix_generator = 'gen-my-prefix.py'

This would let the study return a string to use as a prefix, which the Library would require to consist only of [a-zA-Z_] characters or similar. Very custom, but a scoped-down approach.
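For illustration, the Library-side check on the returned string could look something like the sketch below. It assumes the generator hands back its prefix as text (possibly with a trailing newline) and that the allowed set is exactly letters and underscores; the function name `validate_prefix` is hypothetical, and the real rule may differ ("or similar").

```python
import re

# Assumption: only ASCII letters and underscores are allowed, per the
# [a-zA-Z_] suggestion in this thread.
_PREFIX_PATTERN = re.compile(r"[a-zA-Z_]+")


def validate_prefix(raw: str) -> str:
    """Return the cleaned prefix, or raise if it has disallowed characters."""
    prefix = raw.strip()
    if not _PREFIX_PATTERN.fullmatch(prefix):
        raise ValueError(f"invalid study prefix: {prefix!r}")
    return prefix


if __name__ == "__main__":
    print(validate_prefix("data_metrics_covid\n"))
```

Rejecting anything outside a small character set keeps the prefix safe to interpolate into table names when cleaning.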

mikix · 2024-10-29T14:56:14Z

The cleaning portion has been solved by #309

mikix added the enhancement New feature or request label Oct 22, 2024
