Refactor Evaluator/MetricEvaluator #14

Closed · aimalz opened this issue May 26, 2023 · 7 comments
Labels: good first issue (Good for newcomers), help wanted (Extra attention is needed)

@aimalz (Collaborator) commented on May 26, 2023

Riffing off @hdante's point in #11, the evaluation subpackage has some unexpected behavior that could make it unnecessarily computationally intensive. Currently the Evaluator in src/rail/evaluation is a meta-metric that evaluates all the specific metrics, while MetricEvaluator, the base class for individual metrics, lives in src/rail/evaluation/metrics. This is the reverse of how creation and estimation are organized, and it obscures the fact that Evaluator is not the base class but a shortcut for running all the subclasses that constitute metrics. We should refactor this (and propagate the changes through the unit tests and demos) so that the base class with the basic name sits outside the metrics directory and the meta-metric, under an appropriately descriptive name, sits inside it, mirroring the structures of creation and estimation (see the sketch below). This will make it more straightforward for users to avoid costly metrics calculations unless they truly need them.
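A minimal sketch of the proposed layout, written as plain Python; the module paths and the meta-metric name (full_suite.py, FullSuiteEvaluator) are illustrative placeholders, not part of the issue:

```python
# src/rail/evaluation/evaluator.py -- the base class gets the basic name
class Evaluator:
    """Base class for a single-metric evaluation stage."""

    def evaluate(self, estimate, reference):
        raise NotImplementedError


# src/rail/evaluation/metrics/full_suite.py -- the "run everything" stage
# gets a descriptive name and lives under metrics/, mirroring how
# creation and estimation are laid out.
class FullSuiteEvaluator(Evaluator):
    """Convenience stage that runs every registered Evaluator subclass.

    Users opt into this explicitly instead of paying for every metric
    when they only need one.
    """

    def evaluate(self, estimate, reference):
        return {
            cls.__name__: cls().evaluate(estimate, reference)
            for cls in Evaluator.__subclasses__()
            if cls is not FullSuiteEvaluator
        }
```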

@eacharles (Collaborator) commented on May 26, 2023 via email

@aimalz (Collaborator, author) commented on Aug 4, 2023

@yanzastro had a very relevant thought on metrics for pipelines where true redshifts are not available: a pairwise difference between qp.Ensemble.ancil arrays of point estimates, where rows in the Ensemble could be, for example, cells in a SOM (as per LSSTDESC/rail_som#1), once LSSTDESC/rail_som#2 produces per-cell photo-z PDF estimates.
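One plausible reading of that suggestion, sketched with plain numpy under the assumption that each qp.Ensemble carries a named point-estimate array in its ancil table (the key "zmode" and the function name below are placeholders):

```python
import numpy as np

def pairwise_point_estimate_diff(ens_a, ens_b, key="zmode"):
    """Row-by-row difference of point estimates stored in two Ensembles' ancil tables.

    Rows could be SOM cells, so this compares cell-wise point estimates
    between two ensembles covering the same cells.
    """
    a = np.asarray(ens_a.ancil[key]).ravel()
    b = np.asarray(ens_b.ancil[key]).ravel()
    if a.shape != b.shape:
        raise ValueError("Ensembles must have the same number of rows")
    return a - b
```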

@aimalz (Collaborator, author) commented on Aug 17, 2023

Just catching this up with notes from the retreat before I get too far into implementing it in the corresponding branch: the plan is to define classes of metrics based on the inputs they take (the three logical pairings of point values and PDFs, with the estimate always first and the truth/reference always second). This lets users specify the subset of metrics they want to calculate as keyword config options while keeping the number of metric output files low (one per class), with the added bonus that the paired input types alone tell you which metrics each class provides. Here's an outline of what goes where (a rough sketch follows the list):

  • PointToPointEvaluator: bias, scatter, "robust" versions thereof, IQR, MAD, various "outlier rate" definitions
  • ProbToProbEvaluator: KLD, KS, AD, CvM, Brier, EMD/Wasserstein
  • ProbToPointEvaluator: CDE Loss, CRPS, PIT (maybe with statistics thereof, but maybe not?)
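
A rough sketch of how that split might look in code; the class bodies and metric name lists are illustrative only, and the real stages would plug into the RAIL stage/configuration machinery rather than being plain classes:

```python
class Evaluator:
    """Shared base: subclasses declare which metrics they can compute."""

    metric_names: tuple = ()

    def evaluate(self, estimate, reference, metrics=None):
        """Compute the requested subset of metrics (default: all) and return a dict."""
        raise NotImplementedError


class PointToPointEvaluator(Evaluator):
    # point estimates vs. point truths
    metric_names = ("bias", "robust_bias", "scatter", "robust_scatter",
                    "iqr", "mad", "outlier_rate")


class ProbToProbEvaluator(Evaluator):
    # PDF estimates vs. reference PDFs
    metric_names = ("kld", "ks", "ad", "cvm", "brier", "emd")


class ProbToPointEvaluator(Evaluator):
    # PDF estimates vs. point truths
    metric_names = ("cde_loss", "crps", "pit")
```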

This plan also calls for a standalone stage that generates point estimates from PDFs, overcoming the current restriction that the only way to make point estimates in a pipeline is to do so at estimation time (a rough sketch follows the note below).

  • PointEstReducer (or somesuch): median, mode, mean, RBPE, sample

(Note that this functionality has some implications for the formatting of point estimates calculated by an Estimator; @drewoldag and I will outline these in detail in a design document shortly.)
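For that standalone stage, a minimal sketch that assumes only that a qp.Ensemble can evaluate its PDFs on a grid via ens.pdf(grid); the function name is hypothetical, and the RBPE and sampling reductions are omitted:

```python
import numpy as np

def reduce_to_point_estimates(ens, grid, how="mode"):
    """Collapse each PDF in `ens` to a single point estimate on `grid`."""
    pdfs = np.atleast_2d(ens.pdf(grid))          # shape (n_objects, n_grid)
    if how == "mode":
        return grid[np.argmax(pdfs, axis=1)]
    if how == "mean":
        norm = np.trapz(pdfs, grid, axis=1)
        return np.trapz(pdfs * grid, grid, axis=1) / norm
    if how == "median":
        cdfs = np.cumsum(pdfs, axis=1)           # rough Riemann-sum CDF
        cdfs = cdfs / cdfs[:, -1:]
        return np.array([np.interp(0.5, c, grid) for c in cdfs])
    raise ValueError(f"unknown point-estimate reduction: {how}")
```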

@eacharles (Collaborator) commented on Aug 17, 2023 via email

@joezuntz (Contributor) commented:

Hi all. Alex asked me about stages that can have variable numbers of inputs. I'll have a think about this and get back to you.

@ztq1996 (Contributor) commented on Mar 21, 2024

I want to revisit this next Monday (3/25). Since the codebase has changed, the functionality of the current Evaluator class in src/evaluation/evaluator.py should be fulfilled by the single evaluator class in the eac branch; that frees up the superclass so we can build the metric evaluator classes on top of it.

@eacharles (Collaborator) commented:

Done with #98.
