A lightly modified version of facebookresearch/llama that allows for saving the intermediate activations in order to run CCS.
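The idea of capturing intermediate activations during a forward pass can be sketched with a toy model. Everything below (the model, layer names, and save format) is an illustrative assumption for exposition; it is not this fork's actual interface:

```python
import numpy as np

class TinyMLP:
    """Toy 2-layer MLP that records its hidden activations during the
    forward pass, mimicking in spirit how the fork saves intermediate
    activations for later probing. Shapes and names are illustrative,
    not LLaMA's."""

    def __init__(self, d_in=4, d_hidden=8, d_out=2, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(size=(d_in, d_hidden))
        self.w2 = rng.normal(size=(d_hidden, d_out))
        self.saved = {}  # layer name -> captured activation

    def forward(self, x):
        h = np.tanh(x @ self.w1)         # intermediate activation
        self.saved["hidden"] = h.copy()  # capture it for later probing
        return h @ self.w2

model = TinyMLP()
x = np.ones((3, 4))           # batch of 3 toy inputs
out = model.forward(x)
# Persist the captured activations so a probe (e.g. CCS) can train on them
np.save("hidden_activations.npy", model.saved["hidden"])
```

In the real fork the same pattern applies at the transformer-layer level: run the model on each prompt, stash the hidden states, and write them to `--save_activations_path`.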
Some experiments for CS 229br: Foundations of Deep Learning, taught in Spring 2023 at Harvard University by Boaz Barak (with teaching fellows Gustaf Ahdritz and Gal Kaplun).
Make sure you have CUDA available, then install the dependencies with `pip install -r requirements.txt`.
To generate the BoolQ prompts, run `python generate_dataset.py ./data/boolq/prompts.csv --tokenizer_path $TARGET_FOLDER/tokenizer.model`.
To evaluate a LLaMA model on the saved dataset, set the variables accordingly and run `torchrun --nproc_per_node $MP example.py --ckpt_dir $TARGET_FOLDER/$MODEL_SIZE --tokenizer_path $TARGET_FOLDER/tokenizer.model --save_activations_path ./data/boolq --prompt_csv ./data/boolq/prompts.csv`.
Different model sizes require different model-parallel (MP) values:

| Model | MP |
|---|---|
| 7B | 1 |
| 13B | 2 |
| 33B | 4 |
| 65B | 8 |
Discovering Latent Knowledge in Language Models Without Supervision: The original paper by Collin Burns, Haotian Ye, et al. that proposes "Contrast-Consistent Search" (CCS).
- collin-burns/discovering_latent_knowledge: The corresponding repository.
- The initial release is reported to be quite buggy; see "Bugs of the Initial Release of CCS" by Fabien Roger.
- How "Discovering Latent Knowledge in Language Models Without Supervision" Fits Into a Broader Alignment Scheme
What Discovering Latent Knowledge Did and Did Not Find: A writeup by Fabien Roger on takeaways from the original paper.
- safer-ai/Exhaustive-CCS: The corresponding repository. Similar to Collin Burns's but with fewer bugs.
- Several experiments with CCS.
EleutherAI/elk: Contains many further innovations on top of CCS.
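For reference, the core CCS objective from the Burns et al. paper: a probe maps the activations of a statement and its negation to probabilities, and is trained without labels so that the two outputs are consistent (they sum to one) and confident (not both near 0.5). A minimal numpy sketch of the loss (the probe itself and its training loop are omitted):

```python
import numpy as np

def ccs_loss(p_pos, p_neg):
    """CCS objective: probe outputs for a statement (p_pos) and its
    negation (p_neg) should satisfy p_pos ~ 1 - p_neg (consistency)
    and should not both sit near 0.5 (confidence)."""
    consistency = (p_pos - (1.0 - p_neg)) ** 2
    confidence = np.minimum(p_pos, p_neg) ** 2
    return np.mean(consistency + confidence)

# A consistent, confident pair incurs near-zero loss:
low = ccs_loss(np.array([0.99]), np.array([0.01]))
# A degenerate "always 0.5" pair is penalized by the confidence term:
high = ccs_loss(np.array([0.5]), np.array([0.5]))
```

Minimizing this loss over contrast pairs of saved activations is what "running CCS" on the outputs of this fork amounts to.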