Locally Selective Combination in Parallel Outlier Ensembles (LSCP): a fully unsupervised framework to selectively combine base detectors by emphasizing data locality.
Zhao, Y., Nasrullah, Z., Hryniewicki, M.K. and Li, Z. LSCP: Locally Selective Combination in Parallel Outlier Ensembles. SIAM International Conference on Data Mining (SDM), 2019.
Please cite the paper as:
@inproceedings{zhao2019lscp,
title={{LSCP:} Locally Selective Combination in Parallel Outlier Ensembles},
author={Zhao, Yue and Nasrullah, Zain and Hryniewicki, Maciej K and Li, Zheng},
booktitle={Proceedings of the 2019 {SIAM} International Conference on Data Mining, {SDM} 2019},
pages={585--593},
month = {May},
year={2019},
address = {Calgary, Canada},
organization={SIAM},
url={https://doi.org/10.1137/1.9781611975673.66},
doi={10.1137/1.9781611975673.66}
}
PDF for Personal Use | SIAM Page | Presentation Slides | API Documentation | Example with PyOD
Update (May 9th, 2019): Published version is available for download.
Update (Jan 23th, 2019): Camera-ready version is available for download.
Update (Dec 25th, 2018): LSCP has been officially released in Python Outlier Detection (PyOD) V0.6.6.
Update (Dec 21th, 2018): LSCP has been accepted at SDM 2019. Acceptance rate 22.7% (90/397).
Update (Dec 6th, 2018): LSCP has been included as part of Python Outlier Detection (PyOD), to be released in pyod V0.6.6.
- Two versions of codes are provided:
- Demo version (demo_lof.py) is created for the fast reproduction of the experiment results. The demo version only compares the baseline algorithms with LSCP algorithms.
- Production version (Python Outlier Detection (PyOD)) is released with full optimization and testing as a framework. The purpose of this version is to be used in real applications, which should require fewer dependencies and faster execution.
- It is understood that there are small variations in the results due to the random process, e.g., splitting the training and test sets. Thus, running demo codes would only result in similar results to the paper but not the exactly same results.
In unsupervised outlier ensembles, the absence of ground truth makes the combination of base outlier detectors a challenging task. Specifically, existing parallel outlier ensembles lack a reliable way of selecting competent base detectors, affecting accuracy and stability, during model combination. In this paper, we propose a framework---called Locally Selective Combination in Parallel Outlier Ensembles (LSCP)---which addresses the issue by defining a local region around a test instance using the consensus of its nearest neighbors in randomly selected feature subspaces. The top-performing base detectors in this local region are selected and combined as the model's final output. Four variants of the LSCP framework are compared with seven widely used parallel frameworks. Experimental results demonstrate that one of these variants, LSCP_AOM, consistently outperforms baselines on the majority of twenty real-world datasets.
The experiment codes are writen in Python 3.6 and built on a number of Python packages:
- numpy>=1.13
- numba>=0.35
- scipy>=0.19
- scikit_learn>=0.19
Batch installation is possible using the supplied "requirements.txt" with pip or conda.
pip install -r requirements.txt
20 datasets are used (see dataset folder):
Datasets | #Sample Dimension | Dimension | #Outliers | # Outlier Perc |
---|---|---|---|---|
Annthyroid | 7200 | 6 | 534 | 7.41 |
Arrhythmia | 452 | 274 | 66 | 14.60 |
Breastw | 683 | 9 | 239 | 34.99 |
Cardio | 1831 | 21 | 176 | 9.61 |
Letter | 1600 | 32 | 100 | 6.25 |
MNIST | 7603 | 100 | 700 | 9.21 |
Musk | 3062 | 166 | 97 | 3.17 |
PageBlocks | 5393 | 10 | 510 | 9.46 |
Pendigits | 6870 | 16 | 156 | 2.27 |
Pima | 768 | 8 | 268 | 34.90 |
Satellite | 6435 | 36 | 2036 | 31.64 |
Satimage-2 | 5803 | 36 | 71 | 1.22 |
Shuttle | 49097 | 9 | 3511 | 7.15 |
SpamSpace | 4207 | 57 | 1679 | 39.91 |
Stamps | 340 | 9 | 31 | 9.12 |
Thyroid | 3772 | 6 | 93 | 2.47 |
Vertebral | 240 | 6 | 30 | 12.50 |
Vowels | 1456 | 12 | 50 | 3.43 |
Wbc | 378 | 30 | 21 | 5.56 |
Wilt | 4819 | 5 | 257 | 5.33 |
All datasets are accessible from http://odds.cs.stonybrook.edu/ and http://www.dbs.ifi.lmu.de/research/outlier-evaluation/DAMI/.
Citation Suggestion for the datasets please refer to:
Shebuti Rayana (2016). ODDS Library [http://odds.cs.stonybrook.edu]. Stony Brook, NY: Stony Brook University, Department of Computer Science.
Campos, G.O., Zimek, A., Sander, J., Campello, R.J., Micenková, B., Schubert, E., Assent, I. and Houle, M.E., 2016. On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study. Data Mining and Knowledge Discovery, 30(4), pp.891-927.
Experiments could be reproduced by running demo_lof.py directly. You could simply download/clone the entire repository and execute the code by
python demo_lof.py
Two evaluation methods are introduced and the result would be saved into "results" folder:
- The area under receiver operating characteristic curve (ROC)
- mean Average Precision (mAP)
Table 2: ROC-AUC scores (average of 30 independent trials, highest score highlighted in bold)
Datasets | LSCP_A | LSCP_MOA | LSCP_M | LSCP_AOM | GG_A | GG_MOA | GG_M | GG_AOM | GG_WA | GG_TH | GG_FB |
---|---|---|---|---|---|---|---|---|---|---|---|
Annthyroid | 0.7548 | 0.7590 | 0.7849 | 0.7520 | 0.7642 | 0.7660 | 0.7769 | 0.7730 | 0.7632 | 0.7552 | 0.7854 |
Arrhythmia | 0.7746 | 0.7715 | 0.7729 | 0.7763 | 0.7758 | 0.7749 | 0.7656 | 0.7690 | 0.7758 | 0.7313 | 0.7709 |
Breastw | 0.6553 | 0.7044 | 0.7236 | 0.7845 | 0.7362 | 0.7140 | 0.6590 | 0.6838 | 0.7453 | 0.6285 | 0.3935 |
Cardio | 0.8691 | 0.8908 | 0.8491 | 0.9013 | 0.8770 | 0.8865 | 0.8798 | 0.8903 | 0.8782 | 0.8830 | 0.8422 |
Letter | 0.7818 | 0.7954 | 0.8361 | 0.7867 | 0.7925 | 0.8031 | 0.8434 | 0.8300 | 0.7908 | 0.8001 | 0.7640 |
MNIST | 0.8576 | 0.8623 | 0.7812 | 0.8633 | 0.8557 | 0.8588 | 0.8349 | 0.8553 | 0.8563 | 0.8272 | 0.8468 |
Musk | 0.9950 | 0.9970 | 0.9931 | 0.9981 | 0.9937 | 0.9960 | 0.9960 | 0.9970 | 0.9953 | 0.9958 | 0.7344 |
PageBlocks | 0.9349 | 0.9343 | 0.8687 | 0.9488 | 0.9443 | 0.9440 | 0.9240 | 0.9371 | 0.9453 | 0.9418 | 0.9284 |
Pendigits | 0.8238 | 0.8656 | 0.7238 | 0.8744 | 0.8378 | 0.8509 | 0.8488 | 0.8622 | 0.8425 | 0.8548 | 0.8034 |
Pima | 0.7059 | 0.6991 | 0.6640 | 0.7061 | 0.7030 | 0.7003 | 0.6730 | 0.6856 | 0.7037 | 0.6349 | 0.6989 |
Satellite | 0.5814 | 0.6106 | 0.6006 | 0.6015 | 0.5881 | 0.5992 | 0.6258 | 0.6220 | 0.5876 | 0.6101 | 0.5818 |
Satimage-2 | 0.9852 | 0.9931 | 0.9878 | 0.9935 | 0.9872 | 0.9907 | 0.9909 | 0.9925 | 0.9880 | 0.9881 | 0.9181 |
Shuttle | 0.5392 | 0.5551 | 0.5373 | 0.5514 | 0.5439 | 0.5504 | 0.5612 | 0.5602 | 0.5413 | 0.5561 | 0.3702 |
SpamSpace | 0.3792 | 0.4594 | 0.4305 | 0.4744 | 0.4487 | 0.4377 | 0.4060 | 0.4128 | 0.4580 | 0.4104 | 0.3312 |
Stamps | 0.8888 | 0.8719 | 0.8525 | 0.8985 | 0.8946 | 0.8927 | 0.8559 | 0.8763 | 0.8953 | 0.8904 | 0.8715 |
Thyroid | 0.9579 | 0.9624 | 0.9413 | 0.9700 | 0.9656 | 0.9647 | 0.9385 | 0.9510 | 0.9665 | 0.9644 | 0.8510 |
Vertebral | 0.3324 | 0.3662 | 0.4306 | 0.3478 | 0.3433 | 0.3467 | 0.3662 | 0.3614 | 0.3442 | 0.3678 | 0.3385 |
Vowels | 0.9276 | 0.9185 | 0.9238 | 0.9199 | 0.9265 | 0.9275 | 0.9313 | 0.9271 | 0.9261 | 0.9299 | 0.9148 |
WBC | 0.9379 | 0.9344 | 0.9242 | 0.9451 | 0.9421 | 0.9409 | 0.9321 | 0.9367 | 0.9420 | 0.9314 | 0.9407 |
Wilt | 0.5275 | 0.5517 | 0.6550 | 0.4286 | 0.5101 | 0.5358 | 0.6384 | 0.6056 | 0.5037 | 0.5586 | 0.5868 |
Table 3: mAP scores (average of 30 independent trials, highest score highlighted in bold)
Datasets | LSCP_A | LSCP_MOA | LSCP_M | LSCP_AOM | GG_A | GG_MOA | GG_M | GG_AOM | GG_WA | GG_TH | GG_FB |
---|---|---|---|---|---|---|---|---|---|---|---|
Annthyroid | 0.2283 | 0.2375 | 0.2349 | 0.2453 | 0.2301 | 0.2395 | 0.2413 | 0.2516 | 0.2306 | 0.2277 | 0.1864 |
Arrhythmia | 0.3780 | 0.3744 | 0.3790 | 0.3796 | 0.3766 | 0.3769 | 0.3690 | 0.3722 | 0.3766 | 0.3468 | 0.3707 |
Breastw | 0.4334 | 0.4766 | 0.4728 | 0.5655 | 0.4995 | 0.4849 | 0.4249 | 0.4577 | 0.5085 | 0.4366 | 0.2854 |
Cardio | 0.3375 | 0.3960 | 0.3197 | 0.4117 | 0.3516 | 0.3708 | 0.3666 | 0.3864 | 0.3535 | 0.3629 | 0.3643 |
Letter | 0.2302 | 0.2396 | 0.3346 | 0.2407 | 0.2388 | 0.2473 | 0.3160 | 0.2867 | 0.2372 | 0.2416 | 0.2193 |
MNIST | 0.3933 | 0.3974 | 0.3353 | 0.3979 | 0.3911 | 0.3941 | 0.3701 | 0.3896 | 0.3918 | 0.3836 | 0.3928 |
Musk | 0.8478 | 0.8773 | 0.8433 | 0.9240 | 0.8245 | 0.8718 | 0.8479 | 0.8806 | 0.8608 | 0.8629 | 0.5806 |
PageBlocks | 0.5805 | 0.5707 | 0.4684 | 0.6360 | 0.6043 | 0.6016 | 0.5297 | 0.5733 | 0.6077 | 0.6064 | 0.6094 |
Pendigits | 0.0709 | 0.0893 | 0.0625 | 0.0944 | 0.0777 | 0.0823 | 0.0834 | 0.0895 | 0.0780 | 0.0832 | 0.0834 |
Pima | 0.5092 | 0.5045 | 0.4716 | 0.5142 | 0.5089 | 0.5054 | 0.4813 | 0.4920 | 0.5095 | 0.4599 | 0.5094 |
Satellite | 0.4077 | 0.4268 | 0.4223 | 0.4196 | 0.4047 | 0.4139 | 0.4385 | 0.4352 | 0.4047 | 0.4031 | 0.4049 |
Satimage-2 | 0.3477 | 0.6248 | 0.3994 | 0.6249 | 0.3959 | 0.5089 | 0.5344 | 0.5922 | 0.4159 | 0.4114 | 0.4851 |
Shuttle | 0.1228 | 0.1296 | 0.1167 | 0.1330 | 0.1297 | 0.1316 | 0.1239 | 0.1294 | 0.1293 | 0.1316 | 0.0549 |
SpamSpace | 0.3326 | 0.3615 | 0.3592 | 0.3665 | 0.3572 | 0.3521 | 0.3379 | 0.3413 | 0.3612 | 0.3601 | 0.3079 |
Stamps | 0.3596 | 0.3310 | 0.3193 | 0.3779 | 0.3694 | 0.3660 | 0.3144 | 0.3387 | 0.3706 | 0.3638 | 0.3535 |
Thyroid | 0.3544 | 0.3955 | 0.2638 | 0.4651 | 0.4045 | 0.4123 | 0.2850 | 0.3488 | 0.4130 | 0.4071 | 0.1186 |
Vertebral | 0.0948 | 0.1020 | 0.1230 | 0.0988 | 0.0971 | 0.0975 | 0.1029 | 0.1000 | 0.0972 | 0.1067 | 0.0965 |
Vowels | 0.3913 | 0.3678 | 0.3482 | 0.3539 | 0.3783 | 0.3790 | 0.3760 | 0.3732 | 0.3784 | 0.3783 | 0.3340 |
WBC | 0.6033 | 0.5983 | 0.5472 | 0.6131 | 0.6097 | 0.6069 | 0.5579 | 0.5925 | 0.6105 | 0.6045 | 0.5933 |
Wilt | 0.0518 | 0.0557 | 0.0770 | 0.0423 | 0.0493 | 0.0523 | 0.0715 | 0.0633 | 0.0486 | 0.0537 | 0.0591 |
In this work, we propose four variants of a novel unsupervised outlier detection framework called Locally Selective Combination in Parallel Outlier Ensembles (LSCP). Unlike traditional combination approaches, LSCP identifies the top-performing base detectors for each test instance relative to its local region. To validate its effectiveness, the proposed framework is assessed on 20 real-world datasets and is shown to be superior to baseline algorithms. The ensemble approach LSCP_AOM demonstrates the best performance achieving the highest detection score on 13/20 datasets with respect to ROC-AUC and 14/20 datasets with respect to mAP. Theoretical considerations under the bias-variance framework and visualizations are also provided for LSCP to provide a holistic overview of the framework. Since LSCP demonstrates the promise of data locality, future work can extend this exploration by investigating the use of heterogeneous base detectors and more reliable pseudo ground truth generation methods.