mega update: fix(TransformerLens): fix precision error in Llama
Hzfinfdu committed Aug 19, 2024
2 parents 04c8aed + 81a07f0 commit 654970b
Showing 89 changed files with 10,681 additions and 1,848 deletions.
4 changes: 3 additions & 1 deletion TransformerLens/.github/workflows/checks.yml
@@ -44,7 +44,6 @@ jobs:
strategy:
matrix:
python-version:
-- "3.8"
- "3.9"
- "3.10"
steps:
@@ -123,6 +122,7 @@ jobs:
notebook:
# - "Activation_Patching_in_TL_Demo"
# - "Attribution_Patching_Demo"
+- "ARENA_Content"
- "BERT"
- "Exploratory_Analysis_Demo"
# - "Grokking_Demo"
@@ -133,6 +133,8 @@ jobs:
- "Main_Demo"
# - "No_Position_Experiment"
- "Othello_GPT"
+- "Patchscopes_Generation_Demo"
+# - "T5"
steps:
- uses: actions/checkout@v3
- name: Install Poetry
33 changes: 16 additions & 17 deletions TransformerLens/README.md
@@ -10,20 +10,11 @@ CD](https://github.com/TransformerLensOrg/TransformerLens/actions/workflows/chec
[![Docs
CD](https://github.com/TransformerLensOrg/TransformerLens/actions/workflows/pages/pages-build-deployment/badge.svg)](https://github.com/TransformerLensOrg/TransformerLens/actions/workflows/pages/pages-build-deployment)

-A Library for Mechanistic Interpretability of Generative Language Models.
+A Library for Mechanistic Interpretability of Generative Language Models. Maintained by [Bryce Meyer](https://github.com/bryce13950) and created by [Neel Nanda](https://neelnanda.io/about)

[![Read the Docs
Here](https://img.shields.io/badge/-Read%20the%20Docs%20Here-blue?style=for-the-badge&logo=Read-the-Docs&logoColor=white&link=https://TransformerLensOrg.github.io/TransformerLens/)](https://TransformerLensOrg.github.io/TransformerLens/)

-| :exclamation: HookedSAETransformer Removed |
-|-----------------------------------------------|
-
-Hooked SAE has been removed from TransformerLens 2.0. The functionality is being moved to
-[SAELens](http://github.com/jbloomAus/SAELens). For more information on this release, please see the
-accompanying
-[announcement](https://transformerlensorg.github.io/TransformerLens/content/news/release-2.0.html)
-for details on what's new, and the future of TransformerLens.

This is a library for doing [mechanistic
interpretability](https://distill.pub/2020/circuits/zoom-in/) of GPT-2 Style language models. The
goal of mechanistic interpretability is to take a trained model and reverse engineer the algorithms
@@ -56,7 +47,7 @@ logits, activations = model.run_with_cache("Hello World")
## Key Tutorials

* [Introduction to the Library and Mech
-Interp](https://arena-ch1-transformers.streamlit.app/[1.2]_Intro_to_Mech_Interp)
+Interp](https://arena3-chapter1-transformer-interp.streamlit.app/[1.2]_Intro_to_Mech_Interp)
* [Demo of Main TransformerLens Features](https://neelnanda.io/transformer-lens-demo)

## Gallery
@@ -111,20 +102,20 @@ you would like to help, please try working on one! The standard answer to "why h
yet" is just that there aren't enough people! Key resources:

* [A Guide to Getting Started in Mechanistic Interpretability](https://neelnanda.io/getting-started)
-* [ARENA Mechanistic Interpretability Tutorials](https://arena-ch1-transformers.streamlit.app/) from
+* [ARENA Mechanistic Interpretability Tutorials](https://arena3-chapter1-transformer-interp.streamlit.app/) from
Callum McDougall. A comprehensive practical introduction to mech interp, written in
TransformerLens - full of snippets to copy and they come with exercises and solutions! Notable
tutorials:
* [Coding GPT-2 from
-scratch](https://arena-ch1-transformers.streamlit.app/[1.1]_Transformer_from_Scratch), with
+scratch](https://arena3-chapter1-transformer-interp.streamlit.app/[1.1]_Transformer_from_Scratch), with
accompanying video tutorial from me ([1](https://neelnanda.io/transformer-tutorial)
[2](https://neelnanda.io/transformer-tutorial-2)) - a good introduction to transformers
* [Introduction to Mech Interp and
-TransformerLens](https://arena-ch1-transformers.streamlit.app/[1.2]_Intro_to_Mech_Interp): An
+TransformerLens](https://arena3-chapter1-transformer-interp.streamlit.app/[1.2]_Intro_to_Mech_Interp): An
introduction to TransformerLens and mech interp via studying induction heads. Covers the
foundational concepts of the library
* [Indirect Object
-Identification](https://arena-ch1-transformers.streamlit.app/[1.3]_Indirect_Object_Identification):
+Identification](https://arena3-chapter1-transformer-interp.streamlit.app/[1.3]_Indirect_Object_Identification):
a replication of interpretability in the wild, that covers standard techniques in mech interp
such as [direct logit
attribution](https://dynalist.io/d/n2ZWtnoYHrU1s4vnFSAQ519J#z=disz2gTx-jooAcR0a5r8e7LZ),
@@ -156,10 +147,18 @@ discussions about eg supporting important new use cases, or if you want to make
contributions to the library and want a maintainer's opinion. We'd also love for you to come and
share your projects on the Slack!

+| :exclamation: HookedSAETransformer Removed |
+|-----------------------------------------------|
+
+Hooked SAE has been removed from TransformerLens in version 2.0. The functionality is being moved to
+[SAELens](http://github.com/jbloomAus/SAELens). For more information on this release, please see the
+accompanying
+[announcement](https://transformerlensorg.github.io/TransformerLens/content/news/release-2.0.html)
+for details on what's new, and the future of TransformerLens.

## Credits

-This library was created by **[Neel Nanda](https://neelnanda.io)** and is maintained by **Joseph
-Bloom**.
+This library was created by **[Neel Nanda](https://neelnanda.io)** and is maintained by **[Bryce Meyer](https://github.com/bryce13950)**.

The core features of TransformerLens were heavily inspired by the interface to [Anthropic's
excellent Garcon tool](https://transformer-circuits.pub/2021/garcon/index.html). Credit to Nelson