First release of SKADA!
The following algorithms are currently implemented.
Domain adaptation algorithms
- Sample reweighting methods (Gaussian [1], Discriminant [2], KLIEPReweight [3], DensityRatio [4], TarS [21], KMMReweight [25])
- Sample mapping methods (CORAL [5], Optimal Transport DA OTDA [6], LinearMonge [7], LS-ConS [21])
- Subspace methods (SubspaceAlignment [8], TCA [9], Transfer Subspace Learning [27])
- Other methods (JDOT [10], DASVM [11], OT Label Propagation [28])
Any method that can be cast as an adaptation of the input data can be used in one of two ways:
- as a full Classifier/Regressor estimator,
- or as an `Adapter` (a scikit-learn transformer) that can be used in a DA pipeline with `make_da_pipeline`.
Refer to the examples below and visit the gallery for more details.
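For instance, an `Adapter` can be chained with any scikit-learn estimator in a single pipeline. The snippet below is a minimal sketch of this second usage, assuming the `CORALAdapter` class and the `make_shifted_datasets` toy-data helper; parameter values are purely illustrative.

```python
from sklearn.linear_model import LogisticRegression

from skada import CORALAdapter, make_da_pipeline
from skada.datasets import make_shifted_datasets

# Toy source/target data: sample_domain encodes which domain each sample belongs to
X, y, sample_domain = make_shifted_datasets(
    n_samples_source=20, n_samples_target=20, shift="covariate_shift", random_state=42
)

# Chain an Adapter (feature-alignment step) with a standard scikit-learn classifier
pipe = make_da_pipeline(CORALAdapter(), LogisticRegression())

# Fit on the pooled source + target samples; sample_domain identifies which
# samples belong to the source domain and which to the target domain
pipe.fit(X, y, sample_domain=sample_domain)

# Predict as with any scikit-learn pipeline (in practice, on target samples)
y_pred = pipe.predict(X)
```

The same pipeline idiom applies to the reweighting, mapping and subspace methods listed above; only the Adapter step changes.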
Deep learning domain adaptation algorithms
- Deep Correlation alignment (DeepCORAL [12])
- Deep joint distribution optimal transport (DeepJDOT [13])
- Divergence minimization (MMD/DAN [14])
- Adversarial/discriminator based DA (DANN [15], CDAN [16])
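The deep estimators follow the same `fit(X, y, sample_domain=...)` convention but wrap a PyTorch module (see the new deep API introduced in #45). Below is a rough sketch for DeepCORAL; the class name `DeepCoral` and the `layer_name`/`reg` arguments are assumptions about the released interface and may differ.

```python
import numpy as np
from torch import nn

from skada.datasets import make_shifted_datasets
from skada.deep import DeepCoral  # class name assumed for the DeepCORAL [12] estimator

X, y, sample_domain = make_shifted_datasets(
    n_samples_source=20, n_samples_target=20, shift="covariate_shift", random_state=0
)
n_features, n_classes = X.shape[1], len(np.unique(y))

# Small network with a named feature block: DeepCORAL aligns the second-order
# statistics of this block's activations between source and target batches.
module = nn.Sequential()
module.add_module("feature", nn.Sequential(nn.Linear(n_features, 16), nn.ReLU()))
module.add_module("classifier", nn.Linear(16, n_classes))

model = DeepCoral(
    module,
    layer_name="feature",  # layer whose activations are aligned (assumed argument name)
    reg=1.0,               # weight of the CORAL alignment loss (assumed argument name)
    batch_size=32,
    max_epochs=5,
    train_split=False,
)
model.fit(X.astype(np.float32), y, sample_domain=sample_domain)
```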
DA metrics
- Importance Weighted [17]
- Prediction entropy [18]
- Soft neighborhood density [19]
- Deep Embedded Validation (DEV) [20]
- Circular Validation [11]
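These scorers make it possible to select models and hyperparameters without target labels. Below is a minimal sketch of applying one of them to a fitted DA pipeline; the call signature with `sample_domain` is an assumption based on the scikit-learn scorer protocol, and the toy-data helper is the same as above.

```python
from sklearn.linear_model import LogisticRegression

from skada import CORALAdapter, make_da_pipeline
from skada.datasets import make_shifted_datasets
from skada.metrics import ImportanceWeightedScorer

X, y, sample_domain = make_shifted_datasets(
    n_samples_source=20, n_samples_target=20, shift="covariate_shift", random_state=0
)
pipe = make_da_pipeline(CORALAdapter(), LogisticRegression())
pipe.fit(X, y, sample_domain=sample_domain)

# Importance-weighted validation [17]: source validation errors are reweighted by an
# estimate of the target/source density ratio, so no target labels are required.
scorer = ImportanceWeightedScorer()
score = scorer(pipe, X, y, sample_domain=sample_domain)  # scorer call signature assumed
print(score)
```

The other scorers (prediction entropy, soft neighborhood density, DEV, circular validation) are intended to be used the same way, for instance inside `GridSearchCV` or `cross_validate`.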
References
[1] Shimodaira Hidetoshi. "Improving predictive inference under covariate shift by weighting the log-likelihood function." Journal of statistical planning and inference 90, no. 2 (2000): 227-244.
[2] Sugiyama Masashi, Taiji Suzuki, and Takafumi Kanamori. "Density-ratio matching under the Bregman divergence: a unified framework of density-ratio estimation." Annals of the Institute of Statistical Mathematics 64 (2012): 1009-1044.
[3] Sugiyama Masashi, Taiji Suzuki, Shinichi Nakajima, Hisashi Kashima, Paul Von Bünau, and Motoaki Kawanabe. "Direct importance estimation for covariate shift adaptation." Annals of the Institute of Statistical Mathematics 60 (2008): 699-746.
[4] Sugiyama Masashi, and Klaus-Robert Müller. "Input-dependent estimation of generalization error under covariate shift." (2005): 249-279.
[5] Sun Baochen, Jiashi Feng, and Kate Saenko. "Correlation alignment for unsupervised domain adaptation." Domain adaptation in computer vision applications (2017): 153-171.
[6] Courty Nicolas, Flamary Rémi, Tuia Devis, and Alain Rakotomamonjy. "Optimal transport for domain adaptation." IEEE Transactions on Pattern Analysis and Machine Intelligence 39, no. 9 (2017): 1853-1865.
[7] Flamary, R., Lounici, K., & Ferrari, A. (2019). Concentration bounds for linear monge mapping estimation and optimal transport domain adaptation. arXiv preprint arXiv:1905.10155.
[8] Fernando, B., Habrard, A., Sebban, M., & Tuytelaars, T. (2013). Unsupervised visual domain adaptation using subspace alignment. In Proceedings of the IEEE international conference on computer vision (pp. 2960-2967).
[9] Pan, S. J., Tsang, I. W., Kwok, J. T., & Yang, Q. (2010). Domain adaptation via transfer component analysis. IEEE transactions on neural networks, 22(2), 199-210.
[10] Courty, N., Flamary, R., Habrard, A., & Rakotomamonjy, A. (2017). Joint distribution optimal transportation for domain adaptation. Advances in neural information processing systems, 30.
[11] Bruzzone, L., & Marconcini, M. (2009). Domain adaptation problems: A DASVM classification technique and a circular validation strategy. IEEE transactions on pattern analysis and machine intelligence, 32(5), 770-787.
[12] Sun, B., & Saenko, K. (2016). Deep coral: Correlation alignment for deep domain adaptation. In Computer Vision–ECCV 2016 Workshops: Amsterdam, The Netherlands, October 8-10 and 15-16, 2016, Proceedings, Part III 14 (pp. 443-450). Springer International Publishing.
[13] Damodaran, B. B., Kellenberger, B., Flamary, R., Tuia, D., & Courty, N. (2018). Deepjdot: Deep joint distribution optimal transport for unsupervised domain adaptation. In Proceedings of the European conference on computer vision (ECCV) (pp. 447-463).
[14] Long, M., Cao, Y., Wang, J., & Jordan, M. (2015, June). Learning transferable features with deep adaptation networks. In International conference on machine learning (pp. 97-105). PMLR.
[15] Ganin, Y., Ustinova, E., Ajakan, H., Germain, P., Larochelle, H., Laviolette, F., ... & Lempitsky, V. (2016). Domain-adversarial training of neural networks. Journal of machine learning research, 17(59), 1-35.
[16] Long, M., Cao, Z., Wang, J., & Jordan, M. I. (2018). Conditional adversarial domain adaptation. Advances in neural information processing systems, 31.
[17] Sugiyama, M., Krauledat, M., & Müller, K. R. (2007). Covariate shift adaptation by importance weighted cross validation. Journal of Machine Learning Research, 8(5).
[18] Morerio, P., Cavazza, J., & Murino, V. (2017). Minimal-entropy correlation alignment for unsupervised deep domain adaptation. arXiv preprint arXiv:1711.10288.
[19] Saito, K., Kim, D., Teterwak, P., Sclaroff, S., Darrell, T., & Saenko, K. (2021). Tune it the right way: Unsupervised validation of domain adaptation via soft neighborhood density. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 9184-9193).
[20] You, K., Wang, X., Long, M., & Jordan, M. (2019, May). Towards accurate model selection in deep unsupervised domain adaptation. In International Conference on Machine Learning (pp. 7124-7133). PMLR.
[21] Zhang, K., Schölkopf, B., Muandet, K., Wang, Z. (2013). Domain Adaptation under Target and Conditional Shift. In International Conference on Machine Learning (pp. 819-827). PMLR.
[22] Loog, M. (2012). Nearest neighbor-based importance weighting. In 2012 IEEE International Workshop on Machine Learning for Signal Processing, pages 1–6. IEEE (https://arxiv.org/pdf/2102.02291.pdf)
[23] Bruzzone, L., & Marconcini, M. (2010). Domain adaptation problems: A DASVM classification technique and a circular validation strategy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(5), 770-787. (https://rslab.disi.unitn.it/papers/R82-PAMI.pdf)
[24] Loog, M. (2012). Nearest neighbor-based importance weighting. In 2012 IEEE International Workshop on Machine Learning for Signal Processing, pages 1–6. IEEE (https://arxiv.org/pdf/2102.02291.pdf)
[25] J. Huang, A. Gretton, K. Borgwardt, B. Schölkopf and A. J. Smola. Correcting sample selection bias by unlabeled data. In NIPS, 2007. (https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=07117994f0971b2fc2df95adb373c31c3d313442)
[26] Long, M., Wang, J., Ding, G., Sun, J., and Yu, P. (2014). Transfer joint matching for unsupervised domain adaptation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1410–1417
[27] Si, S., Tao, D., & Geng, B. (2010). Bregman divergence-based regularization for transfer subspace learning. IEEE Transactions on Knowledge and Data Engineering.
[28] Solomon, J., Rustamov, R., Guibas, L., & Butscher, A. (2014, January). Wasserstein propagation for semi-supervised learning. In International Conference on Machine Learning (pp. 306-314). PMLR.
What's Changed
- Update previously used dataset fixture by @kachayev in #117
- Remove masked inputs only if estimator does not accept `sample_domain` by @kachayev in #123
- Fix DiscriminatorReweightDensity and ReweightDensity by @antoinedemathelin in #118
- [TO_REVIEW] _find_y_type return enum by @YanisLalou in #125
- [MRG] Implement CircularValidation as a scorer by @YanisLalou in #124
- No need for mark-as-final operation by @kachayev in #128
- [MRG] Add TarS method by @antoinecollas in #93
- Selector to avoid filtering out masked samples when fitting transformer by @kachayev in #129
- [MRG] Fix tests random seed and TarS by @antoinecollas in #131
- [MRG] DA deep methods with new API by @tgnassou in #45
- [MRG] Add LS-ConS method by @antoinecollas in #103
- Precommit with ruff+codespell by @agramfort in #130
- Ignore deep/* tests when torch is not installed by @kachayev in #133
- [MRG] Separate Lint and Tests + codecov configuration by @rflamary in #135
- `SelectSource` and `SelectTarget` selectors by @kachayev in #142
- `SelectSourceTarget` selector by @kachayev in #145
- [MRG] Implementation of 1NN reweighting and reweighting example implementation by @BuenoRuben in #108
- [MRG] Add predict_proba for jdot by @YanisLalou in #153
- Update readme and add unique references in docstring by @ambroiseodt in #154
- [MRG] CircularValidation: Re-encode y labels before training the estimator on y_source_pred by @YanisLalou in #155
- [MRG] Fix Tars & MMDSConS by @YanisLalou in #156
- [MRG] Add the auto/scale mode in KLIEP by @YanisLalou in #157
- [TO_REVIEW] Make DeepEmbeddedValidation scorer to work with the new API by @YanisLalou in #47
- [MRG] Add predict_proba to DA_SVM by @YanisLalou in #158
- [MRG] Update How-to with advanced pipelines examples by @rflamary in #144
- [MRG] Propagate adaptation output through multiple steps by @kachayev in #149
- [MRG] Implementation of the TJM method by @BuenoRuben in #140
- [TO_REVIEW] Small bug fix check_X_y_domain() + source_target_merge() by @YanisLalou in #165
- Add frank-wolfe solver (v2) by @antoinedemathelin in #167
- [MRG] Add kwargs to DASVM predict/predict_proba + add score func by @YanisLalou in #160
- Allow multi dim inputs through the pipeline by @kachayev in #170
- Fix sign for source/targets in the HowTo docs by @kachayev in #172
- Make deep DA methods compatible with GPU by @Florent-Michel in #174
- [MRG] Make `make_da_pipeline` work with deep methods by @tgnassou in #159
- Fix issue 171 on deep_coral_loss by @Florent-Michel in #182
- Update documentation by @apmellot in #176
- Fix CDAN input to domain_classifier and gpu compatibility by @Florent-Michel in #178
- [MRG] Fix issue of the circular validation with deep models by @YanisLalou in #169
- [MRG] Major update of Adapter API by @kachayev in #184
- [MRG] Add TransferSubspaceLearning method by @antoinecollas in #181
- [MRG] OT Label propagation methods (classical and target shift) by @rflamary in #195
- [MRG] Fix pack when y is a string by @YanisLalou in #197
- [MRG] Add officehome dataset by @YanisLalou in #196
- [MRG] Add Amazon review dataset to skada by @YanisLalou in #185
- [TO_REVIEW] Add n_neighbours as an arg of NN RW by @YanisLalou in #199
- [MRG][FIX] Add allo_source parameter to JDOTClassifier by @rflamary in #202
- [MRG] Fix Circular validation for NO_DA_TARGET_ONLY by @YanisLalou in #201
- [MRG] MMDLS handle case where we have X_source and no y_source by @YanisLalou in #200
- [MRG] Clean datasets references by @ambroiseodt in #203
- [MRG] Incorporate sklearn 1.5.0 changes that are incompatible by @kachayev in #208
- [MRG] Fix subspace alignment by @antoinecollas in #206
- [MRG] Fix Density Reweight + Deep embedded validation + ImportanceWeightedScorer by @YanisLalou in #205
- FIX test collection don't run examples by @tomMoral in #209
- Add a cov_shift_center parameter to move the center of the covariate shift by @antoinedemathelin in #210
- [WIP] Center data when using transform of coral by @antoinecollas in #211
- Add n_iter_max to OTLabelProp by @antoinecollas in #212
- Fix test_cv np.bincount error by @YanisLalou in #213
- [FIX][ENH] Fix and add more explanation to "DA validation procedures" examples by @apmellot in #183
- [FIX][ENH] Improve "DA methods" examples by @vloison in #188
- [MRG] Release 0.3.0, pyproject.toml and Citations files by @rflamary in #215
New Contributors
- @antoinecollas made their first contribution in #93
- @tgnassou made their first contribution in #45
- @agramfort made their first contribution in #130
- @ambroiseodt made their first contribution in #154
- @Florent-Michel made their first contribution in #174
- @apmellot made their first contribution in #176
- @tomMoral made their first contribution in #209
- @vloison made their first contribution in #188
Full Changelog: 0.2.3...0.3.0