Is there a way to print the threshold learned by MacroBase when running MAD classifier? I want to know what values of a metric are considered outliers.
ganesh-srinivas changed the title from "How can I print threshold beyond which MAD/MCD classifies metric(s) as outliers?" to "How can I print threshold beyond which MAD classifies a metric as outliers?" on May 2, 2018
Is it correct to use the following formulas to calculate the upper and lower thresholds learned by the MAD classifier for a metric?
upper_threshold_for_metric_value = median + score_percentile_cutoff*MAD
lower_threshold_for_metric_value = median - score_percentile_cutoff*MAD
How
I was able to derive this formula from a function in legacy/src/main/java/macrobase/analysis/stats/MAD.java:
    public double score(Datum datum) {
        double point = datum.metrics().getEntry(0);
        return Math.abs(point - median) / (MAD);
    }
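Since the score is the distance from the median in units of MAD, a point scores above a cutoff c exactly when it lies outside the interval [median - c*MAD, median + c*MAD], provided MAD > 0. Here is a minimal, standalone sketch of that reasoning. The class `MadThresholds` and its method names are hypothetical and not part of MacroBase; it assumes an unscaled MAD and treats the cutoff as a score value. The input data and the cutoff of 3.0 are made up for illustration.

```java
// Illustrative sketch only: MadThresholds and its methods are hypothetical names,
// not MacroBase APIs. Assumes an unscaled MAD that is greater than zero.
final class MadThresholds {

    // Median of a sorted copy of the input (the caller's array is left untouched).
    static double median(double[] values) {
        double[] sorted = values.clone();
        java.util.Arrays.sort(sorted);
        int n = sorted.length;
        return (n % 2 == 1) ? sorted[n / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    // Median absolute deviation around the given median.
    static double mad(double[] values, double median) {
        double[] deviations = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            deviations[i] = Math.abs(values[i] - median);
        }
        return median(deviations);
    }

    // Same rule as the quoted score(): distance from the median in MAD units.
    static double score(double point, double median, double mad) {
        return Math.abs(point - median) / mad;
    }

    // Returns {lower, upper}; points outside this interval score above the cutoff.
    static double[] thresholds(double median, double mad, double scoreCutoff) {
        return new double[] { median - scoreCutoff * mad, median + scoreCutoff * mad };
    }

    public static void main(String[] args) {
        double[] data = {1.0, 2.0, 3.0, 4.0, 100.0};        // made-up values
        double med = median(data);                           // 3.0
        double madValue = mad(data, med);                    // deviations 2,1,0,1,97 -> 1.0
        double[] t = thresholds(med, madValue, 3.0);         // made-up cutoff
        System.out.println("lower=" + t[0] + " upper=" + t[1]); // lower=0.0 upper=6.0
    }
}
```

With real data, `median`, `MAD`, and the cutoff would be the values MacroBase logs, so only `thresholds` is needed to turn them into metric bounds.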
Verification
I printed the values of median, MAD, and score_percentile_cutoff by setting the logging level to TRACE and running a batch query on sensor_data_demo_db_version.txt (with two extra rows carrying infinitesimally small values of power_drain).
The calculations give upper_threshold_for_metric_value = 0.8634457399999994 and lower_threshold_for_metric_value = -0.2608399236802713. These are very close to the observed thresholds: 1010 of the 1012 outliers have power_drain > 0.864, and the remaining 2 have power_drain <= -0.260865.