Refactor Evaluator/MetricEvaluator #14
Comments
I think you might be better off leaving it as it is and putting in code to select which metrics to calculate. Having all the metrics in a single file will be very useful; having them scattered across a number of small files will be annoying.
@yanzastro had a very relevant thought on metrics for pipelines where true redshifts are not available: a pairwise difference between qp.Ensemble.ancil arrays of point estimates, where rows in the Ensemble could be cells in a SOM, for example, as per LSSTDESC/rail_som#1, once LSSTDESC/rail_som#2 produces photo-z PDF estimates per cell.
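For concreteness, here is a minimal numpy sketch of that pairwise point-estimate comparison; the function name, the summary statistics, and the assumption that each row corresponds to one SOM cell are illustrative, not an existing RAIL API.

```python
import numpy as np

def pairwise_point_difference(z_estimate, z_reference):
    """Compare two arrays of per-cell point estimates (e.g. stored in
    qp.Ensemble.ancil) without needing per-object true redshifts."""
    delta = np.asarray(z_estimate) - np.asarray(z_reference)
    scatter = float(np.std(delta))
    return {
        "mean_offset": float(np.mean(delta)),
        "scatter": scatter,
        "outlier_rate": float(np.mean(np.abs(delta) > 3 * scatter)),
    }

# Example: 100 SOM cells whose point estimates differ slightly between two runs
rng = np.random.default_rng(42)
z_ref = rng.uniform(0.1, 1.5, size=100)
z_est = z_ref + rng.normal(0.0, 0.02, size=100)
print(pairwise_point_difference(z_est, z_ref))
```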
Just catching this up with notes from the retreat before I get too far into implementing it in the corresponding branch: the plan is to define classes of metrics based on what input they take (the 3 logical pairings of point values and PDFs, where the estimate is always first and the truth/reference is always second). This will give users the opportunity to specify the subset of metrics they want to calculate as keyword config options while still maintaining a low number of output files of metrics (one per class), with the added bonus of knowing which metrics can be accessed through each class just based on what the paired input types are. Here's an outline of what's going to go where:
PointToPointEvaluator: bias, scatter, "robust" versions thereof, IQR, MAD, various "outlier rate" definitions
ProbToProbEvaluator: KLD, KS, AD, CvM, Brier, EMD/Wasserstein
ProbToPointEvaluator: CDE Loss, CRPS, PIT
This plan also calls for some way to generate point estimates from PDFs as a standalone stage, thereby overcoming the restriction that the only way to make point estimates in a pipeline right now is to do it at the time of estimation.
(Note that this functionality has some implications for the formatting of point estimates calculated by an Estimator; @drewoldag and I will outline these in detail in a design document shortly.)
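As a rough illustration of the outline above (and not the final RAIL API): the class name below follows the outline, while the constructor signature, the keyword config option, and the toy numpy metrics are assumptions. ProbToProbEvaluator and ProbToPointEvaluator would follow the same pattern for their input-type pairings.

```python
import numpy as np

class PointToPointEvaluator:
    """Groups metrics that compare point estimates to point truths/references."""

    _metrics = {
        "bias": lambda dz: float(np.mean(dz)),
        "scatter": lambda dz: float(np.std(dz)),
        "mad": lambda dz: float(np.median(np.abs(dz - np.median(dz)))),
    }

    def __init__(self, metrics=None):
        # `metrics` stands in for a keyword config option selecting which
        # metrics to compute; None means "compute all of them".
        self.metrics = list(self._metrics) if metrics is None else metrics

    def evaluate(self, estimate, truth):
        # Estimate is always first, truth/reference is always second.
        dz = np.asarray(estimate) - np.asarray(truth)
        return {name: self._metrics[name](dz) for name in self.metrics}

# Only the requested metrics are computed, and all results for this
# input-type pairing still land in a single output.
evaluator = PointToPointEvaluator(metrics=["bias", "scatter"])
print(evaluator.evaluate([0.50, 0.61, 0.72], [0.48, 0.60, 0.75]))
```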
I think you might consider keeping it all in a single class, and just giving it the option of comparing against truth or a distribution or both.
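For comparison, a minimal sketch of this single-class alternative, with optional truth and/or reference inputs; the class name, keyword arguments, and choice of metrics are illustrative assumptions, not anything currently in rail.

```python
import numpy as np
from scipy.stats import wasserstein_distance

class SingleEvaluator:
    """One evaluator that compares against point truths, a reference sample,
    or both, depending on which optional inputs are supplied."""

    def evaluate(self, estimate, truth=None, reference=None):
        results = {}
        est = np.asarray(estimate)
        if truth is not None:
            dz = est - np.asarray(truth)
            results["bias"] = float(np.mean(dz))
            results["scatter"] = float(np.std(dz))
        if reference is not None:
            # Distribution-level comparison, e.g. an earth mover's distance
            # between the estimated and reference point-estimate samples.
            results["emd"] = float(wasserstein_distance(est, np.asarray(reference)))
        return results

print(SingleEvaluator().evaluate([0.50, 0.61, 0.72], truth=[0.48, 0.60, 0.75]))
```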
Hi all. Alex asked me about stages that can have variable numbers of inputs. I'll have a think about this and get back to you.
I want to revisit this next Monday (3/25), since the codebase has changed, the functionality of the current Evaluator class in
done with #98
Riffing off @hdante's point in #11, the evaluation subpackage has some unexpected behavior that could make it unnecessarily computationally intensive. Currently the Evaluator in src/rail/evaluation is a meta-metric that evaluates all the specific metrics, and MetricEvaluator is the base class for individual metrics but lives in src/rail/evaluation/metrics. This is backwards of how creation and estimation are set up and doesn't convey that Evaluator is not the base class but actually a shortcut to running all the subclasses that constitute metrics. We should refactor this (and propagate changes through the unit tests and demos) to have the base class with the basic name outside the metrics directory and the metametric with an appropriately descriptive name in the metrics directory, mirroring the structures of creation and estimation. This will make it more straightforward for users to avoid costly metrics calculations unless they truly need them.
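A minimal sketch of the proposed layout; the module paths mirror creation and estimation as described above, but the EvaluateAll name and the method signatures are placeholders for illustration, not the actual refactor.

```python
# src/rail/evaluation/evaluator.py (proposed): the base class keeps the basic name.
class Evaluator:
    """Base class that individual metric evaluators would inherit from."""

    def evaluate(self, estimate, reference):
        raise NotImplementedError


# src/rail/evaluation/metrics/evaluate_all.py (proposed): the meta-metric gets a
# descriptive name and lives alongside the individual metrics.
class EvaluateAll(Evaluator):
    """Shortcut that runs every registered metric subclass, i.e. what the
    current Evaluator in src/rail/evaluation actually does."""

    def __init__(self, metric_classes):
        self.metrics = [cls() for cls in metric_classes]

    def evaluate(self, estimate, reference):
        return {type(m).__name__: m.evaluate(estimate, reference) for m in self.metrics}
```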