Reweighted-FastLTS is a robust regression algorithm that lets you detect anomalous observations, such as vertical outliers and leverage points.
An existing Python implementation of FastLTS, by Michele Cappellari, is designed for the analysis of datasets with 3 predictors (p). Inspired by Cappellari's work and by the research of Prof. Peter Rousseeuw, I implemented a Python version of Reweighted-FastLTS that supports (i) p predictors with p < n, where n is the number of observations, and (ii) datasets with n < 600.
The attributes of the Reweighted-FastLTS Python class are the same as those returned by ltsReg in R.
I still have some doubts about the FastMCD step. I used MinCovDet from scikit-learn, and the location and covariance matrix it returns differ from those obtained in R. As a consequence, the robust distances differ as well.
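To make the comparison concrete, here is a minimal sketch of how robust distances can be computed from MinCovDet. The synthetic data, the planted outliers, and the variable names are illustrative assumptions and are not part of the Reweighted-FastLTS class.

```python
import numpy as np
from sklearn.covariance import MinCovDet

# Synthetic data (assumption): 75 rows, 3 predictors, first 10 rows shifted.
rng = np.random.default_rng(0)
X = rng.normal(size=(75, 3))
X[:10] += 8.0

# Fit the MCD estimator; random_state fixes the random initial subsets.
mcd = MinCovDet(random_state=0).fit(X)

# mahalanobis() returns SQUARED distances, so take the square root.
rd = np.sqrt(mcd.mahalanobis(X))

# The same distances computed by hand from location_ and covariance_.
diff = X - mcd.location_
precision = np.linalg.inv(mcd.covariance_)
rd_manual = np.sqrt(np.einsum("ij,jk,ik->i", diff, precision, diff))

print(mcd.location_)
print(mcd.covariance_)
```

scikit-learn exposes both the raw estimates (`raw_location_`, `raw_covariance_`) and the reweighted ones (`location_`, `covariance_`), so when comparing against R it may help to check which pair is being compared. Whether this accounts for the differences described above has not been verified here.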
Below I report the results of some tests. In each table, the left column shows the results obtained with Reweighted-FastLTS and the right column shows the results obtained with ltsReg from R's robustbase package. The datasets used are the Hawkins-Bradu-Kass (HBK) data and the stackloss data.
| HBK data | Reweighted-FastLTS | ltsReg |
| --- | --- | --- |
| alpha | 0.5 | 0.5 |
| quan | 40 | 40 |
| raw_coefficents | [0.27835867, 0.04327558, -0.10558377] | [0.27835868, 0.04327561, -0.10558381] |
| raw_intercept | -0.62325114 | -0.6232511 |
| raw_scale | 0.8535975675079938 | 0.8543587 |
| raw_correction_factor | 1.2752919 | 1.275292 |
| | | |
| coefficents | [0.08137871, 0.03990183, -0.05166559] | [0.08137871, 0.03990181, -0.05166558] |
| intercept | -0.18046165 | -0.18046163 |
| scale | 0.744041162494403 | 0.7440412 |
| chn_factor | 1.34586238 | 1.345862 |
| correction_factor | 1.01626593 | 1.016266 |
| | | |
| good leverage points | [11, 12, 13, 14] | [11, 12, 13, 14] |
| bad leverage points | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] | [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] |
| vertical outliers | | |
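The quan row is the size of the subset over which the squared residuals are minimized. As a rough sketch, the function below mirrors the formula used by robustbase's h.alpha.n helper, as far as I understand it. The Python name `h_alpha_n` is a hypothetical helper introduced for illustration, and p is assumed to count the intercept (3 predictors + 1).

```python
import math


def h_alpha_n(alpha: float, n: int, p: int) -> int:
    """Subset size for a given alpha, n observations, p coefficients.

    Assumption: p includes the intercept.
    """
    n2 = (n + p + 1) // 2
    return math.floor(2 * n2 - n + 2 * (n - n2) * alpha)


# HBK data: n = 75 observations, 3 predictors + intercept.
print(h_alpha_n(0.5, 75, 4))  # 40

# Stackloss data: n = 21 observations, 3 predictors + intercept.
print(h_alpha_n(0.5, 21, 4))  # 13
```

With alpha = 1 the function returns n, so the fit uses every observation.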
| Stackloss data | Reweighted-FastLTS | ltsReg |
| --- | --- | --- |
| alpha | 0.5 | 0.5 |
| quan | 13 | 13 |
| raw_coefficents | [0.7409212, 0.39152664, 0.01113465] | [0.74092106, 0.39152672, 0.01113454] |
| raw_intercept | -37.32334 | -37.32332647 |
| raw_scale | 1.863142881084838 | 1.863146 |
| raw_correction_factor | 1.88416645 | 1.884166 |
| | | |
| coefficents | [0.7976856, 0.5773405, -0.06706011] | [0.79768556, 0.57734046, -0.06706018] |
| intercept | -37.652466 | -37.65245890 |
| scale | 1.9218770928830033 | 1.921877 |
| chn_factor | 1.48689415 | 1.486894 |
| correction_factor | 1.14467424 | 1.144674 |
| | | |
| good leverage points | [2, 15, 16, 17, 18, 19] | [2, 15, 16, 17, 18, 19] |
| bad leverage points | [1, 3, 21] | [1, 3, 21] |
| vertical outliers | [4] | [4] |
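The last three rows follow from combining standardized residuals with robust distances. The sketch below shows one common way to do that split. The function name `classify_points`, the 2.5 residual cutoff, and the 0.975 chi-square quantile are assumptions based on the usual regression diagnostic plot, not necessarily the exact rule used by the Reweighted-FastLTS class.

```python
import numpy as np
from scipy.stats import chi2


def classify_points(std_resid, robust_dist, n_predictors,
                    resid_cut=2.5, quantile=0.975):
    """Label observations (1-based indices) by residual and distance size."""
    std_resid = np.asarray(std_resid, dtype=float)
    robust_dist = np.asarray(robust_dist, dtype=float)

    # Distance cutoff: square root of a chi-square quantile (assumed rule).
    dist_cut = np.sqrt(chi2.ppf(quantile, df=n_predictors))

    big_resid = np.abs(std_resid) > resid_cut
    big_dist = robust_dist > dist_cut
    idx = np.arange(1, len(std_resid) + 1)

    return {
        "good leverage points": idx[~big_resid & big_dist].tolist(),
        "bad leverage points": idx[big_resid & big_dist].tolist(),
        "vertical outliers": idx[big_resid & ~big_dist].tolist(),
    }


# Toy values (made up for illustration, not taken from the tables).
labels = classify_points(
    std_resid=[0.3, 4.0, -0.5, 5.0],
    robust_dist=[1.0, 1.2, 6.0, 7.0],
    n_predictors=3,
)
print(labels)
# {'good leverage points': [3], 'bad leverage points': [4], 'vertical outliers': [2]}
```

Because the split depends on the robust distances, any difference in the MCD location and covariance can move a point between the leverage and non-leverage groups.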