added docstring to _remove_outliers
mcneela committed Feb 22, 2024
1 parent bc4c747 commit 9c1010a
Showing 1 changed file with 18 additions and 0 deletions.
18 changes: 18 additions & 0 deletions src/openqdc/datasets/base.py
@@ -182,6 +182,24 @@ def _remove_outliers(
avg_fn: str = "median",
num_stds: float = 3.0,
) -> np.array:
"""
Removes outliers from the dataset (based on formation_E)
that are outside the range of
```
avg_fn(formation_E) +- num_stds * formation_E.std()
```
Args:
formation_E : np.array
numpy array holding the formation energies for the dataset
avg_fn : str
str specifying the averaging function to be used (current options are "mean" and "median")
num_stds : float
the number of standard deviations to use for outlier detection
Returns:
np.array of formation energies with the outliers removed
"""
assert avg_fn in BaseDataset.avg_options.keys(), (
f"{avg_fn} is not a valid option, should be one of {list(BaseDataset.avg_options.keys())}"
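The rule described in the docstring can be illustrated with a minimal standalone sketch. It is not the body of `_remove_outliers`, which the diff does not show. The `AVG_OPTIONS` mapping is a hypothetical stand-in for `BaseDataset.avg_options`, the free function `remove_outliers` is named for illustration only, and treating the range boundaries as inclusive is an assumption.

```python
import numpy as np

# Hypothetical stand-in for BaseDataset.avg_options (the real mapping is not shown in the diff).
AVG_OPTIONS = {"mean": np.mean, "median": np.median}


def remove_outliers(formation_E: np.ndarray, avg_fn: str = "median", num_stds: float = 3.0) -> np.ndarray:
    """Keep values within avg_fn(formation_E) +- num_stds * formation_E.std()."""
    if avg_fn not in AVG_OPTIONS:
        raise ValueError(f"{avg_fn} is not a valid option, should be one of {list(AVG_OPTIONS)}")
    center = AVG_OPTIONS[avg_fn](formation_E)
    half_width = num_stds * formation_E.std()
    # Assumption: values exactly on the boundary are kept.
    keep = np.abs(formation_E - center) <= half_width
    return formation_E[keep]


# Ten identical energies plus one extreme value; the extreme value lies outside the range.
energies = np.array([0.0] * 10 + [100.0])
filtered = remove_outliers(energies, avg_fn="median", num_stds=3.0)
print(filtered.shape)
```

Raising `ValueError` instead of using `assert` is a design choice in this sketch: assertions are skipped under `python -O`, and `assert(condition, message)` with parentheses around both parts tests a non-empty tuple, which is always true.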