Releases: ndif-team/nnsight
v0.4.0.dev
Changelog
0.4.0.dev
released: 2024-12-05
Refer to the Colab for an interactive walkthrough of the new changes.
Breaking Changes
- The InterventionGraph now follows a sequential execution order. Module envoys are expected to be referenced following the model's architecture hierarchy. This means that out-of-order in-place operations will not take effect.
- Saved node values are automatically injected into their proxy references in the Python frames after graph execution. If you are calling `.value` in your code after tracing, this could lead to incorrect behavior.
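As an illustration of the value-injection change, a minimal sketch (assuming `lm` is a previously loaded `LanguageModel`; the layer index is illustrative):

```python
with lm.trace("Hello World!"):
    hs = lm.transformer.h[5].output.save()

# Pre-0.4 pattern: read the result via the proxy's `.value` after the trace.
# value = hs.value
# In 0.4, `hs` itself is replaced by the underlying value in this frame,
# so a trailing `.value` access may no longer behave as expected.
print(hs)
```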
New Features
- NNsightError. `nnsight` will now produce a more comprehensive and user-friendly explanation when an error occurs during execution as a result of operations defined during tracing. Error messages now point directly to the original traceback where the operation was initially defined. This feature can be toggled using `Config.APP.DEBUG`, which defaults to `true`.
- Traceable Python Control Flow. This feature adds Python's conditional and iterator block statements as traceable operations on `InterventionProxy`s. This feature can be toggled using `Config.APP.CONTROL_FLOW_HACKS`, which defaults to `true`. `Tracer.cond(…)` and `Tracer.iter(…)` are still supported.
```python
import nnsight
...

with lm.trace("Hello World!"):
    foo = nnsight.list([0, 1, 2, 3])
    for item in foo:
        if item % 2 == 0:
            nnsight.log(item)
```
>>> 0
>>> 2
- Value Injection. References to saved proxies are now automatically replaced by their node values after the execution of the `nnsight` backend. This feature can be toggled using `Config.APP.FRAME_INJECTION`, which defaults to `true`.
```python
import nnsight
...

with lm.trace("Hello World!"):
    foo = nnsight.list().save()

print(type(foo))
```
>>> <class 'list'>
- vLLM model support. vLLM models can now be traced and intervened on using `nnsight`. Simply call the `VLLM` constructor and pass it the key string of the desired vLLM model (https://docs.vllm.ai/en/latest/models/supported_models.html).
```python
from nnsight.modeling.vllm import VLLM

vllm_gpt2 = VLLM("gpt2",
                 tensor_parallel_size=2,
                 gpu_memory_utilization=0.5,
                 dispatch=True)

with vllm_gpt2.trace("The Eiffel Tower is located in the city of", temperature=0.0, top_p=1.0, max_tokens=1):
    hs = vllm_gpt2.transformer.h[5].output.save()
    logit = vllm_gpt2.logits.output.save()

print(vllm_gpt2.tokenizer.decode(logit.argmax(dim=-1)))
```
>>> " Paris"
Arguments to the `vllm.LLM` engine can be passed directly to the `nnsight` `VLLM` constructor, while arguments of `vllm.SamplingParams` (typically passed during generation) can be passed to the trace call or to individual invoker calls, which override any parameters specified in the trace call for that single batch.
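The per-batch override described above might look like the following sketch (prompts are hypothetical; `vllm_gpt2` is constructed as in the example above):

```python
with vllm_gpt2.trace(temperature=0.8, top_p=0.95, max_tokens=1) as tracer:
    with tracer.invoke("The Eiffel Tower is located in the city of"):
        # Uses the trace-level sampling parameters.
        logit_a = vllm_gpt2.logits.output.save()
    with tracer.invoke("Madison Square Garden is located in the city of", temperature=0.0, top_p=1.0):
        # Overrides the sampling parameters for this batch only (greedy decoding).
        logit_b = vllm_gpt2.logits.output.save()
```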
vLLM flattens the batch dimension; however, `nnsight` converts the indexing values so that interventions can still be easily carried out on different batches, similar to how this would be done on a non-vLLM model.
Tensor parallelism greater than 1 is supported with the vLLM integration.
- Trace Decorator. You can now decorate external functions to make them traceable within your `nnsight` experiment. This is required for user-defined functionality to be traced.
```python
import nnsight
...

@nnsight.trace
def my_func(value):
    print(value)

with lm.trace("Hello World!"):
    num = nnsight.int(5)
    my_func(num)
```
>>> 5
- IteratorEnvoy context. It is now easier to define interventions for multiple iterations of token generation by opening a context with `.all` or `.iter` on an envoy. The `.all` context applies the envoy-specific interventions at every forward pass, while `.iter` can be indexed and sliced to apply the interventions on specifically chosen iterations.
Ex: `Envoy.all()`
```python
import nnsight
...

with lm.generate("Hello", max_new_tokens=10):
    logits = nnsight.list().save()
    with lm.lm_head.all():
        logits.append(lm.lm_head.output)

print(len(logits))
```
>>> 10
Ex: `Envoy.iter`
```python
import nnsight
...

with lm.generate("Hello", max_new_tokens=10):
    logits = nnsight.list().save()
    with lm.lm_head.iter[5:8]:
        logits.append(lm.lm_head.output)

print(len(logits))
```
>>> 3
Known Issues
- Inline control flow is not supported.
Ex:
```python
with lm.trace("Hello World!"):
    foo = nnsight.list([0, 1, 2, 3]).save()
    [nnsight.log(item) for item in foo]
```
>>> Error
- Value Injection is not supported for proxies referenced within objects.
- The `vllm.LLM` engine performs `max_tokens` + 1 forward passes, which can lead to undesired behavior if you are running interventions on all iterations of multi-token generation.
Ex:
```python
with vllm_gpt2("Hello World!", max_tokens=10):
    logits = nnsight.list().save()
    with vllm_gpt2.logits.all():
        logits.append(vllm_gpt2.logits.output)

print(len(logits))
```
>>> 11 # expected: 10
- IteratorEnvoy contexts can produce undesired behavior for subsequent operations defined below them that are not dependent on `InterventionProxy`s.
Ex:
```python
with lm.generate("Hello World!", max_new_tokens=10):
    hs_4 = nnsight.list().save()
    with lm.transformer.h[4].all():
        hs_4.append(lm.transformer.h[4].output)
    hs_4.append(433)

print(len(hs_4))
```
>>> 20 # expected: 11
Important Considerations
- Remote execution is currently not available with this pre-release version!
- `Tracer.cond(…)` and `Tracer.iter(…)` are still supported.
- vLLM does not come as a pre-installed dependency of `nnsight`.
- `nnsight` supports `vllm==0.6.4.post1`.
- vLLM support only includes `cuda` and `auto` devices at the moment.
- vLLM models do not support gradients.
- The `@nnsight.trace` decorator does not enable user-defined operations to be executed remotely. Support for that is coming soon.
What's Changed
- docs: NNsight 0.3 guide by @AdamBelfki3 in #218
- When doing global patching on a class vs a fn need to by @JadenFiotto-Kaufman in #219
- Add envoy type hinting n nsight by @JadenFiotto-Kaufman in #220
- Update DiffusionModel with 0.3 and some other nice to haves. by @JadenFiotto-Kaufman in #221
- Renamed Envoy._module_path to .path by @JadenFiotto-Kaufman in #223
- Global patching cls by @JadenFiotto-Kaufman in #224
- Fix Global Patching of classes by @JadenFiotto-Kaufman in #226
- Change base torch version to 2.4 by @JadenFiotto-Kaufman in #230
- Default Tokenizer Padding Side by @AdamBelfki3 in #229
- doc: scan defaults to False by @AdamBelfki3 in #240
- Ton of optimizations for serialization: by @JadenFiotto-Kaufman in #242
- nnsight.version by @AdamBelfki3 in #247
- Display error message from backend by @MichaelRipa in #249
- feat (api): add patching for torch.cat() by @AdamBelfki3 in #253
- Add ability to call envoys / models outside of tracing context by @JadenFiotto-Kaufman in #263
- Streaming protocol by @JadenFiotto-Kaufman in #268
- Envoy all by @JadenFiotto-Kaufman in #269
- nnsight support for VLLM with Tensor Parallelism by @AdamBelfki3 in #270
- Addons methods by @JadenFiotto-Kaufman in #271
- Vllm by @JadenFiotto-Kaufman in #272
- Major Refactoring by @AdamBelfki3 in #278
- Envoy.iter! by @JadenFiotto-Kaufman in #274
- Local by @JadenFiotto-Kaufman in #285
- a very useful fix by @ainatersol in #289
- set up stop functionality by @ainatersol in #288
- envoy: remove duplicated method by @ainatersol in #287
- interleaving: support no nodes in graph by @ainatersol in #286
- Debugging by @AdamBelfki3 in #284
- Buf fixes by @JadenFiotto-Kaufman in #294
- Added ndif-api-key to POST header by @MichaelRipa in #292
- Updates to pyproject.toml by @MichaelRipa in #290
- Injecting .saved() values into their Frames locals so you dont need t… by @JadenFiotto-Kaufman in #297
- Graph Visualization 0.4 by @AdamBelfki3 in #283
- Hacks2.0 by @JadenFiotto-Kaufman in #300
- VLLM w/ Tensor Parallelism by @AdamBelfki3 in #293
- NNsight.VLLM tests by @AdamBelfki3 in #301
- `toml` dependency by @AdamBelfki3 in #302
New Con...
v0.3.7
What's Changed
- nnsight.version by @AdamBelfki3 in #247
- Display error message from backend by @MichaelRipa in #249
- Update the model name in the about page by @djw8605 in #248
- Bug fixes to the activation patching tutorial by @arunasank in #245
- Fix tokenizer kwarg in unifiedTransformer by @Butanium in #203
- Add <3.10 compatibility to remote execution by @Butanium in #259
- Update README.md by @MichaelRipa in #256
- feat (api): add patching for torch.cat() by @AdamBelfki3 in #253
- Fix new synthax erro in readme by @Butanium in #260
- Add ability to call envoys / models outside of tracing context by @JadenFiotto-Kaufman in #263
- Updated Tutorials for Website by @ebortz in #291
- Dev by @JadenFiotto-Kaufman in #295
New Contributors
- @djw8605 made their first contribution in #248
- @arunasank made their first contribution in #245
- @ebortz made their first contribution in #291
Full Changelog: v0.3.6...v0.3.7
v0.3.6
What's Changed
- doc: scan defaults to False by @AdamBelfki3 in #240
- Changed the /status page icon location in the website navbar by @mitroitskii in #232
- Ton of optimizations for serialization: by @JadenFiotto-Kaufman in #242
- Dev by @JadenFiotto-Kaufman in #243
Full Changelog: v0.3.5...v0.3.6
v0.3.5
What's Changed
- Fix logic in LaguageModel init. The meta model (from_config) was not … by @JadenFiotto-Kaufman in #235
Full Changelog: v0.3.4...v0.3.5
v0.3.4
What's Changed
- Change base torch version to 2.4 by @JadenFiotto-Kaufman in #230
- Default Tokenizer Padding Side by @AdamBelfki3 in #229
- Dev by @JadenFiotto-Kaufman in #231
Full Changelog: v0.3.3...v0.3.4
v0.3.3
What's Changed
- Fix Global Patching of classes by @JadenFiotto-Kaufman in #226
- Dev by @JadenFiotto-Kaufman in #227
Full Changelog: v0.3.2...v0.3.3
v0.3.2
What's Changed
- Renamed Envoy._module_path to .path by @JadenFiotto-Kaufman in #223
- Global patching cls by @JadenFiotto-Kaufman in #224
- Dev by @JadenFiotto-Kaufman in #225
Full Changelog: v0.3.1...v0.3.2
v0.3.1
What's Changed
- docs: NNsight 0.3 guide by @AdamBelfki3 in #218
- When doing global patching on a class vs a fn need to by @JadenFiotto-Kaufman in #219
- Add envoy type hinting n nsight by @JadenFiotto-Kaufman in #220
- Update DiffusionModel with 0.3 and some other nice to haves. by @JadenFiotto-Kaufman in #221
- Dev by @JadenFiotto-Kaufman in #222
Full Changelog: v0.3.0...v0.3.1
0.3
[ASCII art banner: NNSIGHT 0.3]
Changelog
0.3.0
released: 2024-08-29
We are excited to announce the release of nnsight `0.3`.
This version significantly enhances the library's remote execution capabilities. It improves the integration experience with the NDIF backend and allows users to define and execute optimized training loop workflows directly on the remote server, including LoRA and other PEFT methods.
Breaking Changes
- Module input access has a syntactic change:
  - Old: `nnsight.Envoy.input`
  - New: `nnsight.Envoy.inputs`
  - Note: `nnsight.Envoy.input` now provides access to the first positional argument of the module's input.
- `scan` & `validate` are set to `False` by default in the `Tracer` context.
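A sketch of the updated input-access pattern (assuming `lm` is a previously loaded `LanguageModel`; the layer index is illustrative):

```python
with lm.trace("Hello World!"):
    full_inputs = lm.transformer.h[5].inputs.save()  # all positional and keyword arguments (old `.input`)
    first_arg = lm.transformer.h[5].input.save()     # first positional argument only
```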
New Features
- Session context: efficiently package multi-tracing experiments into a single request, enabling faster, more scalable remote experimentation.
- Iterator context: define an intervention loop for iterative execution.
- Model editing: alter a model by setting default edits and interventions in an editing context, applied before each forward pass.
- Early stopping: interrupt a model's forward pass at a chosen module before execution completes.
- Conditional context: define interventions within a Conditional context, executed only when the specified condition evaluates to True.
- Scanning context: perform exclusive model scanning to gather important insights.
- `nnsight` builtins: define traceable Python builtins as part of the intervention graph.
- Proxy update: assign new values to existing proxies.
- In-Trace logging: add log statements to be called during intervention graph execution.
- Traceable function calls: make unsupported functions traceable by the intervention graph. Note that all PyTorch functions are now traceable by `nnsight` by default.
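A combined sketch of a few of these features (Session, Iterator, Conditional, and In-Trace logging), assuming `lm` is a previously loaded `LanguageModel`; the layer indices and condition are illustrative:

```python
import nnsight
...

with lm.session() as session:
    with session.iter([4, 5, 6]) as layer:          # Iterator context
        with lm.trace("Hello World!") as tracer:
            hs = lm.transformer.h[layer].output.save()
            with tracer.cond(hs[0].sum() > 0):      # Conditional context
                nnsight.log(layer)                  # In-Trace logging
```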
v0.2.21
Full Changelog: v0.2.20...v0.2.21