TabPFN

TabPFN is a foundation model for tabular data that outperforms traditional methods while being dramatically faster. This repository contains the core PyTorch implementation with CUDA optimization.

⚠️ Major Update: Version 2.0 is a complete codebase overhaul with a new architecture and features. The previous version is available at v1.0.0 and can be installed with pip install "tabpfn<2" (the quotes keep the shell from treating < as a redirect).

📚 For detailed usage examples and best practices, check out the Interactive Colab Tutorial.

🌐 TabPFN Ecosystem

Choose the right TabPFN implementation for your needs:

  • TabPFN Client: Easy-to-use API client for cloud-based inference
  • TabPFN Extensions: Community extensions and integrations
  • TabPFN (this repo): Core implementation for local deployment and research
  • TabPFN UX: No-code TabPFN usage

Try our Interactive Colab Tutorial to get started quickly.

🏁 Quick Start

Installation

# Simple installation
pip install tabpfn

# Local development installation
git clone https://github.com/PriorLabs/TabPFN.git
pip install -e "TabPFN[dev]"

Basic Usage

from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

from tabpfn import TabPFNClassifier

# Load data
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

# Initialize a classifier
clf = TabPFNClassifier()
clf.fit(X_train, y_train)

# Predict probabilities
prediction_probabilities = clf.predict_proba(X_test)
print("ROC AUC:", roc_auc_score(y_test, prediction_probabilities[:, 1]))

# Predict labels
predictions = clf.predict(X_test)
print("Accuracy:", accuracy_score(y_test, predictions))

Best Results

For optimal performance, use the AutoTabPFNClassifier or AutoTabPFNRegressor for post-hoc ensembling. These can be found in the TabPFN Extensions repository. Post-hoc ensembling combines multiple TabPFN models into an ensemble.

Steps for Best Results:

  1. Install the extensions:

    git clone https://github.com/priorlabs/tabpfn-extensions.git
    pip install -e tabpfn-extensions
  2. Fit the post-hoc ensemble classifier:

    from tabpfn_extensions.post_hoc_ensembles.sklearn_interface import AutoTabPFNClassifier

    clf = AutoTabPFNClassifier(max_time=120)  # 120 seconds tuning time
    clf.fit(X_train, y_train)
    predictions = clf.predict(X_test)

See https://colab.research.google.com/drive/1SHa43VuHASLjevzO7y3-wPCxHY18-2H6#scrollTo=49sMXWT5DYzj&line=1&uniqifier=1
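
To make the idea of post-hoc ensembling more concrete, here is a minimal sketch of one common technique, greedy ensemble selection on validation-set probabilities. This README does not describe how AutoTabPFNClassifier builds its ensemble, so this is an illustration of the general idea only, not the library's implementation. The function names (`log_loss`, `greedy_ensemble_weights`, `blend`) and the model names and numbers in the demo are hypothetical.

```python
import math
from collections import Counter


def log_loss(y_true, probs):
    """Mean negative log-likelihood for binary labels and positive-class probabilities."""
    eps = 1e-12
    total = 0.0
    for y, p in zip(y_true, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= math.log(p) if y == 1 else math.log(1.0 - p)
    return total / len(y_true)


def average(prob_lists):
    """Element-wise mean of several probability vectors."""
    return [sum(column) / len(prob_lists) for column in zip(*prob_lists)]


def greedy_ensemble_weights(y_val, model_probs, n_rounds=10):
    """Greedy selection with replacement.

    In each round, add the model whose inclusion gives the lowest validation
    log loss for the averaged prediction. A model's weight is the fraction of
    rounds in which it was picked.
    """
    chosen = []
    for _ in range(n_rounds):
        best = min(
            model_probs,
            key=lambda name: log_loss(
                y_val, average([model_probs[m] for m in chosen + [name]])
            ),
        )
        chosen.append(best)
    counts = Counter(chosen)
    return {name: counts[name] / n_rounds for name in model_probs}


def blend(weights, model_probs):
    """Weighted average of the per-model probabilities."""
    n_samples = len(next(iter(model_probs.values())))
    return [
        sum(weights[name] * probs[i] for name, probs in model_probs.items())
        for i in range(n_samples)
    ]


# Hypothetical validation labels and positive-class probabilities from three
# already-fitted models (made-up numbers, for illustration only).
y_val = [1, 0, 1, 1, 0, 0]
model_probs = {
    "model_a": [0.9, 0.2, 0.6, 0.7, 0.4, 0.1],
    "model_b": [0.6, 0.1, 0.9, 0.5, 0.2, 0.3],
    "model_c": [0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
}
weights = greedy_ensemble_weights(y_val, model_probs)
print("weights:", weights)
print("ensemble log loss:", log_loss(y_val, blend(weights, model_probs)))
```

In practice the per-model probabilities would come from `predict_proba` on a held-out validation split, and the resulting weights would then be applied to each model's test-set probabilities.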

📜 License

Prior Labs License (Apache 2.0 with additional attribution requirement)

📚 Citation

@article{hollmann2025tabpfn,
 title={Accurate predictions on small data with a tabular foundation model},
 author={Hollmann, Noah and M{\"u}ller, Samuel and Purucker, Lennart and
         Krishnakumar, Arjun and K{\"o}rfer, Max and Hoo, Shi Bin and
         Schirrmeister, Robin Tibor and Hutter, Frank},
 journal={Nature},
 year={2025},
 month={01},
 day={09},
 doi={10.1038/s41586-024-08328-6},
 publisher={Springer Nature},
 url={https://www.nature.com/articles/s41586-024-08328-6},
}

🤝 Join Our Community

We're building the future of tabular machine learning and would love your involvement:

  1. Connect & Learn:

  2. Contribute:

    • Report bugs or request features
    • Submit pull requests
    • Share your research and use cases
  3. Stay Updated: Star the repo and join Discord for the latest updates

🛠️ Development

  1. Set up the environment:

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    git clone https://github.com/PriorLabs/TabPFN.git
    cd TabPFN
    pip install -e ".[dev]"
    pre-commit install

  2. Before committing:

    pre-commit run --all-files

  3. Run tests:

    pytest tests/

Built with ❤️ by Prior Labs - Copyright (c) 2025 Prior Labs GmbH