Automated outlier detection for adjust sum dialog #18723

Merged
merged 2 commits into from
Feb 28, 2024

karwosts
Contributor

Proposed change

Add a button in the adjust-sum dialog to automatically search statistics for outliers (the top 10 largest change values in the stat history).

I decided to try this because I find the current process for finding values to change somewhat tedious. You may have an erroneous energy value but only know what day it occurred on, not the time. You then have to scan the entire day by manually adjusting the datetime picker in 20-30 minute steps across the full 24 hour period, switching between changing minutes, changing hours, and changing am/pm until you find it.

In all cases where I've had to do this, it was because some value had such a large change that it would be instantly recognizable just by sorting the statistics by magnitude of change. This button approximates that process and lets you locate the bad values quickly with one click.
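
The idea of sorting the statistics by magnitude of change can be sketched as follows. This is a minimal illustration, not the code in this PR: the `StatChange` shape and the `topOutliers` helper are hypothetical names, and ranking by absolute value is an assumption about what "magnitude" means here.

```typescript
// Hypothetical shape for illustration; not the frontend's actual statistics type.
interface StatChange {
  start: number; // period start, epoch milliseconds
  change: number; // change in the sum over the period
}

// Return the n entries with the largest absolute change, largest first.
// The input array is copied so the caller's ordering is left untouched.
function topOutliers(stats: StatChange[], n = 10): StatChange[] {
  return [...stats]
    .sort((a, b) => Math.abs(b.change) - Math.abs(a.change))
    .slice(0, n);
}

const sampleStats: StatChange[] = [
  { start: 0, change: 0.4 },
  { start: 3600000, change: 1250.0 }, // the kind of spike the dialog is used to fix
  { start: 7200000, change: 0.6 },
  { start: 10800000, change: -0.5 },
];

const worst = topOutliers(sampleStats, 2);
console.log(worst.map((s) => s.change));
```

With a list like this, the spike surfaces first regardless of when in the day it happened, which is what removes the need to step through the datetime picker.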

[Image: detect-outliers]

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New feature (thank you!)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests

Example configuration

Additional information

  • This PR fixes or closes issue: fixes #
  • This PR is related to issue or discussion:
  • Link to documentation pull request:

Checklist

  • The code change is tested and works locally.
  • There is no commented out code in this PR.
  • Tests have been added to verify that the new code works.

If user exposed functionality or configuration variables are added/changed:

Comment on lines +445 to +448
addOutlier(c.hour);
} else {
c.fiveMin.forEach((s) => {
addOutlier(s);
Member

@bramkragten bramkragten Nov 24, 2023

I don't think you can process 5min and hour the same way; an hour change is normally already 12 times as big?
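
One way to address this concern would be to compare rates rather than raw changes, so that periods of different length are on the same scale. The sketch below is an assumption for illustration only and is not what this PR implements; `PeriodChange` and `changePerHour` are hypothetical names.

```typescript
// Hypothetical shape for illustration: a statistics period with explicit bounds.
interface PeriodChange {
  start: number; // epoch milliseconds
  end: number; // epoch milliseconds
  change: number; // change in the sum over the period
}

const HOUR_MS = 3600000;

// Scale a period's change to an hourly rate.
// Multiplying before dividing keeps the arithmetic exact for simple inputs.
function changePerHour(p: PeriodChange): number {
  return (p.change * HOUR_MS) / (p.end - p.start);
}

const hourly: PeriodChange = { start: 0, end: 3600000, change: 12 };
const fiveMinute: PeriodChange = { start: 0, end: 300000, change: 1 };

// Both periods describe the same consumption rate once normalized.
console.log(changePerHour(hourly), changePerHour(fiveMinute));
```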

};

// If an hour has no five minute data, add the hour value
// Otherwise, add the 5 minute values and ignore the hour value
Member

An hour can have only half of the 5 minute data, right? So we could miss a bunch?

Contributor Author

I thought about this a bit and I'm not sure if I can really think of a case where there would be a problem here. Possibly a minor edge case during the single hour where 5 minute data has been partially purged? But that just seems really unlikely to be a problem.

I'm not really sure why we're even dealing with 5 minute data here at all as it is just temporary, but I was just sort of trying to mimic how the dialog already handled overlapping 5min/hour data.
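
The selection rule discussed in this thread (use the hour value only when the hour has no five minute data) can be sketched as below. The `hour` and `fiveMin` field names follow the diff fragment quoted above; the `Stat` and `HourBucket` types, the `collectCandidates` helper, and the exact emptiness check are assumptions for illustration.

```typescript
// Hypothetical types for illustration.
interface Stat {
  start: number; // epoch milliseconds
  change: number;
}
interface HourBucket {
  hour: Stat; // the hourly statistic
  fiveMin: Stat[]; // five minute statistics that fall inside that hour, if any
}

// If an hour has no five minute data, take the hour value.
// Otherwise take the five minute values and ignore the hour value.
function collectCandidates(buckets: HourBucket[]): Stat[] {
  const out: Stat[] = [];
  for (const c of buckets) {
    if (c.fiveMin.length === 0) {
      out.push(c.hour);
    } else {
      out.push(...c.fiveMin);
    }
  }
  return out;
}

const buckets: HourBucket[] = [
  { hour: { start: 0, change: 5 }, fiveMin: [] },
  // Partially covered hour: only two of twelve five minute rows exist.
  {
    hour: { start: 3600000, change: 900 },
    fiveMin: [
      { start: 3600000, change: 0.1 },
      { start: 3900000, change: 0.2 },
    ],
  },
];

const candidateStats = collectCandidates(buckets);
console.log(candidateStats.map((s) => s.change));
```

The second bucket shows the edge case raised in review: when only part of an hour has five minute rows, the hour value is skipped, so a spike in the uncovered part of that hour would not be examined.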

@bramkragten
Member

I think this should really be done on the backend, and we should maybe even raise issues for it if the value is way different than the normal value.

const numOutliers = 10;

// Track the top 10 values.
const addOutlier = (s) => {
Member

I would say an outlier is not the biggest 10 numbers, but should be x% higher or lower than the mean change value?

Contributor Author

I'm not sure the distinction would matter that much for the typical use case here, but I'm not opposed to doing that (it's just more CPU work required to calculate the mean).

Member

I agree it probably doesn't add much
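
For comparison, the mean-based alternative raised in this thread might look like the sketch below. It was not adopted in this PR; `deviatesFromMean` and its `pct` parameter are hypothetical, and measuring the deviation relative to the absolute mean is an assumed reading of "x% higher or lower than the mean change value".

```typescript
// Return the changes that differ from the mean by more than pct percent of the mean.
function deviatesFromMean(changes: number[], pct: number): number[] {
  if (changes.length === 0) {
    return [];
  }
  // The extra pass over the data mentioned in the thread.
  const mean = changes.reduce((a, b) => a + b, 0) / changes.length;
  const limit = Math.abs(mean) * (pct / 100);
  return changes.filter((c) => Math.abs(c - mean) > limit);
}

// Mean is 20, so with pct = 200 the limit is 40: only the 96 deviates enough.
const flagged = deviatesFromMean([1, 1, 1, 1, 96], 200);
console.log(flagged);
```

Note that a large spike also pulls the mean toward itself, so the threshold has to be chosen with that in mind; a fixed top-10 list avoids picking a threshold at all.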

@bramkragten bramkragten merged commit 763c672 into home-assistant:dev Feb 28, 2024
13 checks passed
@karwosts karwosts deleted the adjust-sum-outliers branch February 28, 2024 13:39