This repository contains the code for our project based on:

**What Can Transformers Learn In-Context? A Case Study of Simple Function Classes**
Shivam Garg\*, Dimitris Tsipras\*, Percy Liang, Gregory Valiant
Paper: http://arxiv.org/abs/2208.01066

## Getting started

You can start by checking out our notebooks:

  • The original.ipynb notebook contains code from the original paper to train models, plot the pre-computed metrics, and evaluate them on new data.

  • The polynomial-regression.ipynb notebook tests the ability of a transformer to learn fixed-degree or descending-degree polynomial regression (see the first sketch after this list).

  • The knn-classification.ipynb notebook tests the ability of a transformer to learn kNN-based classification. It runs very slowly due to the clustering and k-nearest-neighbor operations.

  • To recreate our models, run the polynomial-regression.ipynb notebook. This trains the full model from scratch, which takes around two hours. You can vary the args.training.task_kwargs["degree"] parameter to change the degree of the polynomial, or the args.training.task parameter to change the learned task (see the second sketch after this list).

  • To create smaller models, run the polynomial-regression.ipynb notebook and decrease the args.training.train_steps parameter to reduce the number of training steps (also shown in the second sketch below).
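
As a rough illustration of the polynomial tasks above, here is a minimal sketch of how one in-context prompt for the fixed-degree task could be sampled. This assumes scalar inputs; the variable names are ours, not the notebook's:

```python
import numpy as np

# Minimal sketch (our illustration, not the notebook's code): draw a random
# polynomial f of the chosen degree, then build the (x_i, f(x_i)) pairs that
# the transformer sees in context.
degree = 3
coeffs = np.random.randn(degree + 1)   # random coefficients define f
xs = np.random.randn(40)               # in-context inputs
ys = np.polyval(coeffs, xs)            # targets: f evaluated at each x_i
prompt = list(zip(xs, ys))             # (x_i, y_i) pairs fed to the model
```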
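
And a sketch of the argument overrides mentioned above. The args object comes from the notebook's own config loading; the task name string used here is an assumption, not a value confirmed by the notebook:

```python
# Sketch of the overrides described above; edit the corresponding cells in
# polynomial-regression.ipynb before launching training.
args.training.task = "polynomial_regression"  # assumed task name string
args.training.task_kwargs["degree"] = 5       # degree of the polynomial class
args.training.train_steps = 10_000            # fewer steps -> much shorter training run
```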

To recompute our plots:

  • The review.ipynb notebook contains code to plot the metrics of our pre-trained models and evaluate them on new data.

To cite the original paper:

```bibtex
@InProceedings{garg2022what,
    title={What Can Transformers Learn In-Context? A Case Study of Simple Function Classes},
    author={Shivam Garg and Dimitris Tsipras and Percy Liang and Gregory Valiant},
    year={2022},
    booktitle={arXiv preprint}
}
```