Minor fixes (#366)
* Minor typo fixes

* Fixed some broken links

* Fixed with correct link
NeoKish authored Feb 24, 2024
1 parent ed391d9 commit 3246e76
Showing 3 changed files with 5 additions and 5 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -45,7 +45,7 @@
NannyML is an open-source python library that allows you to **estimate post-deployment model performance** (without access to targets), detect data drift, and intelligently link data drift alerts back to changes in model performance. Built for data scientists, NannyML has an easy-to-use interface, interactive visualizations, is completely model-agnostic and currently supports all tabular use cases, classification and **regression**.

The core contributors of NannyML have researched and developed multiple novel algorithms for estimating model performance: [confidence-based performance estimation (CBPE)](https://nannyml.readthedocs.io/en/stable/how_it_works/performance_estimation.html#confidence-based-performance-estimation-cbpe) and [direct loss estimation (DLE)](https://nannyml.readthedocs.io/en/stable/how_it_works/performance_estimation.html#direct-loss-estimation-dle).
- The nansters also invented a new approach to detect [multivariate data drift](https://nannyml.readthedocs.io/en/stable/how_it_works/data_reconstruction.html) using PCA-based data reconstruction.
+ The nansters also invented a new approach to detect [multivariate data drift](https://nannyml.readthedocs.io/en/stable/how_it_works/multivariate_drift.html#data-reconstruction-with-pca) using PCA-based data reconstruction.

If you like what we are working on, be sure to become a Nanster yourself, join our [community slack](https://join.slack.com/t/nannymlbeta/shared_invite/zt-16fvpeddz-HAvTsjNEyC9CE6JXbiM7BQ) <img src="https://raw.githubusercontent.com/NannyML/nannyml/main/media/slack.png" height='15'> and support us with a GitHub <img src="https://raw.githubusercontent.com/NannyML/nannyml/main/media/github.png" height='15'> star ⭐.

@@ -98,9 +98,9 @@ NannyML can also **track the realised performance** of your machine learning mod

### 2. Data drift detection

- To detect **multivariate feature drift** NannyML uses [PCA-based data reconstruction](https://nannyml.readthedocs.io/en/main/how_it_works/data_reconstruction.html). Changes in the resulting reconstruction error are monitored over time and data drift alerts are logged when the reconstruction error in a certain period exceeds a threshold. This threshold is calculated based on the reconstruction error observed in the reference period.
+ To detect **multivariate feature drift** NannyML uses [PCA-based data reconstruction](https://nannyml.readthedocs.io/en/stable/how_it_works/multivariate_drift.html#data-reconstruction-with-pca). Changes in the resulting reconstruction error are monitored over time and data drift alerts are logged when the reconstruction error in a certain period exceeds a threshold. This threshold is calculated based on the reconstruction error observed in the reference period.

- <p><img src="https://raw.githubusercontent.com/NannyML/nannyml/main/docs/_static/butterfly-multivariate-drift.svg"></p>
+ <p><img src="https://raw.githubusercontent.com/NannyML/nannyml/main/docs/_static/how-it-works/butterfly-multivariate-drift-pca.svg"></p>

NannyML utilises statistical tests to detect **univariate feature drift**. We have just added a bunch of new univariate tests including Jensen-Shannon Distance and L-Infinity Distance, check out the [comprehensive list](https://nannyml.readthedocs.io/en/stable/how_it_works/univariate_drift_detection.html#methods-for-continuous-features). The results of these tests are tracked over time, properly corrected to counteract multiplicity and overlayed on the temporal feature distributions. (It is also possible to visualise the test-statistics over time, to get a notion of the drift magnitude.)
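For orientation, here is a minimal sketch of how the PCA-reconstruction drift check described in this section is typically driven from Python. The DataFrames, column names and chunk size are placeholders, and the documentation linked above remains the authoritative reference for the calculator's API.

```python
import nannyml as nml
import pandas as pd

# Placeholder data: a reference period (used to fit PCA and set the alert
# threshold) and an analysis period to monitor for drift.
reference_df = pd.read_parquet("reference.parquet")
analysis_df = pd.read_parquet("analysis.parquet")
feature_columns = ["feature_1", "feature_2", "feature_3"]  # placeholder names

calc = nml.DataReconstructionDriftCalculator(
    column_names=feature_columns,
    timestamp_column_name="timestamp",
    chunk_size=5000,
)
calc.fit(reference_df)                 # learn the PCA projection and thresholds
results = calc.calculate(analysis_df)  # reconstruction error per chunk, with alerts
print(results.to_df())
```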

2 changes: 1 addition & 1 deletion docs/how_it_works/estimation_of_standard_error.rst
@@ -205,7 +205,7 @@ Through a simple application of error propagation:
which means that the standard error of the sum is the standard error of the mean multiplied by sample size.
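As a quick check of that statement, assuming $n$ independent observations with common standard deviation $\sigma$:

$$\operatorname{SE}\!\left(\sum_{i=1}^{n} x_i\right) = \sqrt{n}\,\sigma = n \cdot \frac{\sigma}{\sqrt{n}} = n \cdot \operatorname{SE}(\bar{x}).$$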


- Stnadard Deviation
+ Standard Deviation
------------------

The standard error of the variance of a random variable is given by the following exact formula:
2 changes: 1 addition & 1 deletion docs/how_it_works/multivariate_drift.rst
@@ -177,7 +177,7 @@ The classifier cross validation part uses the data created and consists of the f

- Optionally, hyperparameter tuning is performed. The hyperparameters learnt during
this step will be used in the model training steps below. If hyperparameter tuning
- is not requested, user specified hyperpatameters can be used instead of the default LightGBM optioms.
+ is not requested, user specified hyperparameters can be used instead of the default LightGBM options.
- Stratified split is used to split the data into validation folds
- For each split NannyML trains an `LGBMClassifier` and saves its predicted
scores in the validation fold.
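The bullets above sketch the cross-validation loop of the domain-classifier approach to multivariate drift. The snippet below is an illustrative reimplementation of that idea rather than NannyML's actual code: hyperparameter tuning is omitted, features are assumed to be numeric already, and summarising the out-of-fold scores as an AUROC is one natural choice of drift measure.

```python
import numpy as np
import pandas as pd
from lightgbm import LGBMClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict

def domain_classifier_drift(reference: pd.DataFrame, chunk: pd.DataFrame,
                            features: list[str]) -> float:
    """Return an AUROC measuring how separable a data chunk is from reference data."""
    X = pd.concat([reference[features], chunk[features]], ignore_index=True)
    y = np.concatenate([np.zeros(len(reference)), np.ones(len(chunk))])

    # Stratified splits keep the reference/chunk ratio constant in every fold.
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

    # Train an LGBMClassifier per split and keep its predicted scores
    # for the held-out (validation) rows only.
    scores = cross_val_predict(LGBMClassifier(), X, y, cv=cv,
                               method="predict_proba")[:, 1]

    # An AUROC near 0.5 means the chunk is indistinguishable from reference;
    # higher values indicate multivariate drift.
    return roc_auc_score(y, scores)
```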
