LM-compositionality

This repo contains the code for the paper "Are representations built from the ground up? An empirical examination of local composition in language models".

How to access datasets

Penn Treebank

Please download a copy of the Penn Treebank to your own machine. You can then run src/generate_data_treebank.py to generate the data files from it. The path to the treebank files should look something like this: [...]/treebank_3/parsed/mrg/. The dataset is automatically split into 10 shards, so to build the full dataset you need to run the script once per shard index, from -i=0 through -i=9.

python3 -m src.generate_data_treebank --model=[bert,roberta,deberta,gpt2] -i=[0...9] --layer=12 --emb_type=[CLS,avg] [--cuda]
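Since the command above has to be repeated for every shard, a small helper can build the ten invocations. This is only a sketch: the flags are copied from the command line shown above, while the names `shard_commands`, `MODELS` and `NUM_SHARDS` are hypothetical and not part of the repo.

```python
import shlex

# Values taken from the README's command line; the names below are illustrative only.
MODELS = ("bert", "roberta", "deberta", "gpt2")
NUM_SHARDS = 10  # the README states the dataset is sharded into 10 parts


def shard_commands(model, emb_type, layer=12, cuda=False):
    """Return one argument list per shard for src.generate_data_treebank."""
    if model not in MODELS:
        raise ValueError(f"unknown model: {model!r}")
    commands = []
    for shard in range(NUM_SHARDS):
        cmd = [
            "python3", "-m", "src.generate_data_treebank",
            f"--model={model}",
            f"-i={shard}",
            f"--layer={layer}",
            f"--emb_type={emb_type}",
        ]
        if cuda:
            cmd.append("--cuda")
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    # Print the commands so they can be inspected before running anything.
    for cmd in shard_commands("bert", "avg"):
        print(shlex.join(cmd))
```

To actually execute them, each argument list could be passed to `subprocess.run(cmd, check=True)` from the repository root.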

Afterwards, you can generate the embeddings with this script:

python3 -m src.calc_compositionality_scores -i=[0...9] --emb_type=[CLS,avg] --layer=12 --model=[bert,roberta,deberta,gpt2] --full

The embeddings will be saved to data/binary_child_embs_[model]_[i]_[emb_type]_full_True_layer_12.npz. They can be combined again by running python3 src/utils/combine_data.py.
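Before combining, it can help to check that all ten shard files exist. The sketch below only reconstructs the filename pattern quoted above; `embedding_path` and `missing_shards` are hypothetical helper names, and the `data` directory is assumed to be relative to the repository root.

```python
from pathlib import Path


def embedding_path(model, shard, emb_type, data_dir="data"):
    """Build the expected output path for one shard, following the README's pattern."""
    name = f"binary_child_embs_{model}_{shard}_{emb_type}_full_True_layer_12.npz"
    return Path(data_dir) / name


def missing_shards(model, emb_type, data_dir="data", num_shards=10):
    """Return the shard indices whose .npz file is not present on disk."""
    return [
        shard
        for shard in range(num_shards)
        if not embedding_path(model, shard, emb_type, data_dir).exists()
    ]


if __name__ == "__main__":
    print(missing_shards("bert", "avg"))
```

An empty list means every shard is present and the combine step should have all of its inputs.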

CHIP (Compositionality of Human-annotated Idiomatic Phrases)

To access this dataset, download the CSV file at data/qualtrics_results/chip_dataset.csv. Running python3 -m src.data.process_qualtrics_data will show details such as inter-annotator agreement and Spearman correlations between human judgments and model compositionality scores. There are 1001 phrases in total.

Each phrase is scored from 1 (not compositional) to 3 (fully compositional) by three annotators. Each row represents the judgments of a different annotator. An empty value for an annotator means that the annotator thought that the phrase didn't make sense (these were ignored for the analysis in the paper).
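As an illustration of how empty annotations can be skipped, the sketch below averages the ratings for a single phrase while ignoring blank cells. Averaging is an assumption made for this example; the README does not say how the paper aggregates the three judgments, and `mean_compositionality` is a hypothetical helper rather than a function from the repo.

```python
from statistics import mean


def mean_compositionality(ratings):
    """Average the 1-3 ratings for one phrase, skipping empty cells.

    `ratings` holds the raw cell values for one phrase (strings as read from
    a CSV file). An empty value means the annotator judged the phrase
    nonsensical, so it is ignored. Returns None if no rating remains.
    """
    valid = [float(r) for r in ratings if r is not None and str(r).strip() != ""]
    return mean(valid) if valid else None


if __name__ == "__main__":
    # Made-up ratings for illustration only, not taken from the dataset.
    print(mean_compositionality(["1", "3", ""]))
```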

Training probes

To train approximative probes, you can run the following script:

python3 -m src.models.fit_composition_functions [add,mult,w1,w2,linear,mlp,affine] --model=[bert,roberta,deberta,gpt2] --emb_type=[cls,avg] --use_binary --full [--use_control_task]

--use_control_task enables the anisotropy control setting, in which the probe predicts randomly selected vectors rather than the parent vector. For the normal task, omit this flag.
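The idea behind the probes and the control task can be sketched in plain Python. This is not the repo's implementation. It assumes that `add` and `mult` denote the element-wise sum and product of the two child embeddings, and that control targets are drawn at random from the same pool of parent embeddings; the README confirms neither detail, and all function names here are hypothetical.

```python
import math
import random


def compose_add(left, right):
    """Assumed meaning of `add`: element-wise sum of the two child vectors."""
    return [a + b for a, b in zip(left, right)]


def compose_mult(left, right):
    """Assumed meaning of `mult`: element-wise product of the two child vectors."""
    return [a * b for a, b in zip(left, right)]


def cosine(u, v):
    """Cosine similarity between a predicted vector and a target vector."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def control_targets(parent_vectors, seed=0):
    """Replace each true parent with a randomly selected vector from the pool.

    Drawing from the parent pool is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    return [rng.choice(parent_vectors) for _ in parent_vectors]


if __name__ == "__main__":
    # Toy vectors for illustration only.
    left, right, parent = [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]
    print(cosine(compose_add(left, right), parent))
```

If a probe scores nearly as well against the control targets as against the true parents, its apparent accuracy is likely due to the geometry of the embedding space rather than to composition.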

Producing figures

The scripts for producing figures can be found in src/visualizations.
