Add script to copy and process benchmark figures
hendrikvanantwerpen committed Oct 1, 2024
1 parent 0c66cab commit f05019b
Showing 10 changed files with 192 additions and 780 deletions.
10 changes: 0 additions & 10 deletions crates/bpe/.gitignore

This file was deleted.

8 changes: 4 additions & 4 deletions crates/bpe/README.md
@@ -203,16 +203,16 @@ If the requirement of correct BPE output can be relaxed, then the Greedy approac

Results for counting o200k tokens for random 10000 byte slices. The setup time of the interval encoder is comparable to backtracking. After setup counting of slices of the original data are approximately constant time.

-<img src="./benches/result/reports/counting-o200k/lines.svg" style="background-color: white" />
+![counting runtime comparison](./benches/result/counting-o200k.svg)

### Encoding results

Results for encoding o200k tokens for random 1000 bytes. The backtracking encoder consistently outperforms tiktoken by a constant factor.

-<img src="./benches/result/reports/encoding-o200k/lines.svg" style="background-color: white" />
+![encoding runtime comparison](./benches/result/encoding-o200k.svg)

### Incremental encoding results

-Results for incrementally encoding o200k tokens by appending 10000 random bytes. The appending encoder is slower by a constant factor but overall has similar performance curve as the backtracking encoder encoding all data at once.
+Results for incrementally encoding o200k tokens by appending 10000 random bytes. The appending encoder is slower by a constant factor but overall has similar performance curve as the backtracking encoder encoding all data at once.

-<img src="./benches/result/reports/appending-o200k/lines.svg" style="background-color: white" />
+![appending runtime comparison](./benches/result/appending-o200k.svg)
52 changes: 52 additions & 0 deletions crates/bpe/benches/result/appending-o200k.svg
48 changes: 48 additions & 0 deletions crates/bpe/benches/result/counting-o200k.svg
76 changes: 76 additions & 0 deletions crates/bpe/benches/result/encoding-o200k.svg
232 changes: 0 additions & 232 deletions crates/bpe/benches/result/reports/appending-o200k/lines.svg

This file was deleted.

217 changes: 0 additions & 217 deletions crates/bpe/benches/result/reports/counting-o200k/lines.svg

This file was deleted.
