Commit
feat: Released 1.2.0.
HavenDV committed Feb 18, 2024
1 parent ebd0aaf commit 58b2053
Showing 11 changed files with 126 additions and 78 deletions.
70 changes: 35 additions & 35 deletions README.md
@@ -34,45 +34,45 @@ You can view the reports for each version [here](benchmarks)
<!--BENCHMARKS_START-->
```
-BenchmarkDotNet v0.13.7, macOS Ventura 13.5.1 (22G90) [Darwin 22.6.0]
+BenchmarkDotNet v0.13.12, macOS Sonoma 14.2.1 (23C71) [Darwin 23.2.0]
Apple M1 Pro, 1 CPU, 10 logical and 10 physical cores
-.NET SDK 7.0.400
-[Host] : .NET 7.0.10 (7.0.1023.36312), Arm64 RyuJIT AdvSIMD
-DefaultJob : .NET 7.0.10 (7.0.1023.36312), Arm64 RyuJIT AdvSIMD
+.NET SDK 8.0.100
+[Host] : .NET 8.0.0 (8.0.23.53103), Arm64 RyuJIT AdvSIMD
+DefaultJob : .NET 8.0.0 (8.0.23.53103), Arm64 RyuJIT AdvSIMD
```
-| Method | Categories | Data | Mean | Ratio | Gen0 | Gen1 | Allocated | Alloc Ratio |
-|--------------------------- |------------ |-------------------- |---------------:|------:|---------:|---------:|----------:|------------:|
-| **SharpTokenV1_2_8_** | **CountTokens** | **1. (...)57. [19866]** | **1,450,007.0 ns** | **1.00** | **292.9688** | **146.4844** | **1846187 B** | **1.00** |
-| TiktokenSharpV1_0_6_ | CountTokens | 1. (...)57. [19866] | 977,818.9 ns | 0.67 | 250.0000 | 125.0000 | 1571155 B | 0.85 |
-| TokenizerLibV1_3_2_ | CountTokens | 1. (...)57. [19866] | 854,357.2 ns | 0.59 | 246.0938 | 85.9375 | 1547673 B | 0.84 |
-| Tiktoken_ | CountTokens | 1. (...)57. [19866] | 355,029.1 ns | 0.24 | 49.3164 | - | 309449 B | 0.17 |
-| | | | | | | | | |
-| **SharpTokenV1_2_8_** | **CountTokens** | **Hello, World!** | **1,722.2 ns** | **1.00** | **0.5264** | **-** | **3304 B** | **1.00** |
-| TiktokenSharpV1_0_6_ | CountTokens | Hello, World! | 6,291.2 ns | 3.65 | 2.1820 | 0.0305 | 13728 B | 4.15 |
-| TokenizerLibV1_3_2_ | CountTokens | Hello, World! | 604.0 ns | 0.35 | 0.2356 | - | 1480 B | 0.45 |
-| Tiktoken_ | CountTokens | Hello, World! | 247.0 ns | 0.14 | 0.0420 | - | 264 B | 0.08 |
-| | | | | | | | | |
-| **SharpTokenV1_2_8_** | **CountTokens** | **King(...)edy. [275]** | **15,377.1 ns** | **1.00** | **4.1199** | **0.1526** | **26008 B** | **1.00** |
-| TiktokenSharpV1_0_6_ | CountTokens | King(...)edy. [275] | 14,758.1 ns | 0.96 | 5.1117 | 0.1526 | 32096 B | 1.23 |
-| TokenizerLibV1_3_2_ | CountTokens | King(...)edy. [275] | 8,366.9 ns | 0.54 | 3.0823 | 0.1373 | 19344 B | 0.74 |
-| Tiktoken_ | CountTokens | King(...)edy. [275] | 3,838.6 ns | 0.25 | 0.6409 | - | 4032 B | 0.16 |
-| | | | | | | | | |
-| **SharpTokenV1_2_8_Encode** | **Encode** | **1. (...)57. [19866]** | **1,393,026.6 ns** | **1.00** | **292.9688** | **146.4844** | **1846187 B** | **1.00** |
-| TiktokenSharpV1_0_6_Encode | Encode | 1. (...)57. [19866] | 1,246,776.8 ns | 0.90 | 250.0000 | 125.0000 | 1571155 B | 0.85 |
-| TokenizerLibV1_3_2_Encode | Encode | 1. (...)57. [19866] | 852,519.6 ns | 0.61 | 246.0938 | 85.9375 | 1547673 B | 0.84 |
-| Tiktoken_Encode | Encode | 1. (...)57. [19866] | 378,546.7 ns | 0.27 | 59.5703 | 2.4414 | 375665 B | 0.20 |
-| | | | | | | | | |
-| **SharpTokenV1_2_8_Encode** | **Encode** | **Hello, World!** | **1,719.3 ns** | **1.00** | **0.5264** | **-** | **3304 B** | **1.00** |
-| TiktokenSharpV1_0_6_Encode | Encode | Hello, World! | 6,293.3 ns | 3.66 | 2.1820 | 0.0305 | 13728 B | 4.15 |
-| TokenizerLibV1_3_2_Encode | Encode | Hello, World! | 607.6 ns | 0.35 | 0.2356 | - | 1480 B | 0.45 |
-| Tiktoken_Encode | Encode | Hello, World! | 320.6 ns | 0.19 | 0.1135 | - | 712 B | 0.22 |
-| | | | | | | | | |
-| **SharpTokenV1_2_8_Encode** | **Encode** | **King(...)edy. [275]** | **15,444.0 ns** | **1.00** | **4.1199** | **0.1526** | **26008 B** | **1.00** |
-| TiktokenSharpV1_0_6_Encode | Encode | King(...)edy. [275] | 14,704.0 ns | 0.95 | 5.1117 | 0.1526 | 32096 B | 1.23 |
-| TokenizerLibV1_3_2_Encode | Encode | King(...)edy. [275] | 8,556.8 ns | 0.55 | 3.0823 | 0.1373 | 19344 B | 0.74 |
-| Tiktoken_Encode | Encode | King(...)edy. [275] | 4,136.4 ns | 0.27 | 0.8011 | - | 5056 B | 0.19 |
+| Method | Categories | Data | Mean | Median | Ratio | Gen0 | Gen1 | Gen2 | Allocated | Alloc Ratio |
+|--------------------------- |------------ |-------------------- |---------------:|---------------:|------:|---------:|---------:|-------:|----------:|------------:|
+| **SharpTokenV1_2_16_** | **CountTokens** | **1. (...)57. [19866]** | **1,554,552.0 ns** | **1,552,769.4 ns** | **1.00** | **292.9688** | **146.4844** | **-** | **1846147 B** | **1.00** |
+| TiktokenSharpV1_0_9_ | CountTokens | 1. (...)57. [19866] | 1,242,157.7 ns | 1,241,657.7 ns | 0.80 | 253.9063 | 117.1875 | 3.9063 | 1570786 B | 0.85 |
+| TokenizerLibV1_3_3_ | CountTokens | 1. (...)57. [19866] | 815,490.5 ns | 806,761.4 ns | 0.52 | 247.0703 | 98.6328 | 0.9766 | 1547678 B | 0.84 |
+| Tiktoken_ | CountTokens | 1. (...)57. [19866] | 311,744.2 ns | 311,591.0 ns | 0.20 | 49.3164 | - | - | 309449 B | 0.17 |
+| | | | | | | | | | | |
+| **SharpTokenV1_2_16_** | **CountTokens** | **Hello, World!** | **1,585.8 ns** | **1,586.5 ns** | **1.00** | **0.5188** | **0.0019** | **-** | **3264 B** | **1.00** |
+| TiktokenSharpV1_0_9_ | CountTokens | Hello, World! | 5,806.8 ns | 5,805.7 ns | 3.66 | 2.1286 | 0.0381 | 0.0076 | 13344 B | 4.09 |
+| TokenizerLibV1_3_3_ | CountTokens | Hello, World! | 766.2 ns | 766.7 ns | 0.48 | 0.2356 | - | - | 1480 B | 0.45 |
+| Tiktoken_ | CountTokens | Hello, World! | 210.9 ns | 210.2 ns | 0.13 | 0.0420 | - | - | 264 B | 0.08 |
+| | | | | | | | | | | |
+| **SharpTokenV1_2_16_** | **CountTokens** | **King(...)edy. [275]** | **13,851.9 ns** | **13,808.5 ns** | **1.00** | **4.1351** | **0.0153** | **-** | **25968 B** | **1.00** |
+| TiktokenSharpV1_0_9_ | CountTokens | King(...)edy. [275] | 13,387.6 ns | 13,395.3 ns | 0.97 | 5.0659 | 0.1984 | 0.0153 | 31712 B | 1.22 |
+| TokenizerLibV1_3_3_ | CountTokens | King(...)edy. [275] | 10,861.4 ns | 10,865.2 ns | 0.78 | 3.0975 | 0.1526 | 0.0153 | 19344 B | 0.74 |
+| Tiktoken_ | CountTokens | King(...)edy. [275] | 3,162.3 ns | 3,162.0 ns | 0.23 | 0.6447 | - | - | 4064 B | 0.16 |
+| | | | | | | | | | | |
+| **SharpTokenV1_2_16_Encode** | **Encode** | **1. (...)57. [19866]** | **1,327,775.1 ns** | **1,330,166.1 ns** | **1.00** | **294.9219** | **142.5781** | **1.9531** | **1846151 B** | **1.00** |
+| TiktokenSharpV1_0_9_Encode | Encode | 1. (...)57. [19866] | 1,016,985.4 ns | 994,095.3 ns | 0.80 | 250.0000 | 125.0000 | - | 1570772 B | 0.85 |
+| TokenizerLibV1_3_3_Encode | Encode | 1. (...)57. [19866] | 804,657.4 ns | 803,549.7 ns | 0.61 | 247.0703 | 108.3984 | 0.9766 | 1547678 B | 0.84 |
+| Tiktoken_Encode | Encode | 1. (...)57. [19866] | 331,107.8 ns | 331,142.1 ns | 0.25 | 59.5703 | 2.4414 | - | 375601 B | 0.20 |
+| | | | | | | | | | | |
+| **SharpTokenV1_2_16_Encode** | **Encode** | **Hello, World!** | **1,891.1 ns** | **1,894.6 ns** | **1.00** | **0.5188** | **0.0019** | **-** | **3264 B** | **1.00** |
+| TiktokenSharpV1_0_9_Encode | Encode | Hello, World! | 5,816.9 ns | 5,824.0 ns | 3.08 | 2.1210 | 0.0381 | - | 13344 B | 4.09 |
+| TokenizerLibV1_3_3_Encode | Encode | Hello, World! | 496.7 ns | 496.8 ns | 0.26 | 0.2356 | - | - | 1480 B | 0.45 |
+| Tiktoken_Encode | Encode | Hello, World! | 265.3 ns | 264.7 ns | 0.14 | 0.1030 | - | - | 648 B | 0.20 |
+| | | | | | | | | | | |
+| **SharpTokenV1_2_16_Encode** | **Encode** | **King(...)edy. [275]** | **17,497.7 ns** | **17,480.3 ns** | **1.00** | **4.1199** | **0.0305** | **-** | **25968 B** | **1.00** |
+| TiktokenSharpV1_0_9_Encode | Encode | King(...)edy. [275] | 13,374.0 ns | 13,348.4 ns | 0.76 | 5.0659 | 0.1984 | 0.0153 | 31712 B | 1.22 |
+| TokenizerLibV1_3_3_Encode | Encode | King(...)edy. [275] | 7,333.9 ns | 7,338.7 ns | 0.42 | 3.0899 | 0.1450 | 0.0076 | 19344 B | 0.74 |
+| Tiktoken_Encode | Encode | King(...)edy. [275] | 3,450.2 ns | 3,452.9 ns | 0.20 | 0.7973 | - | - | 5024 B | 0.19 |

<!--BENCHMARKS_END-->
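
A note on reading the Ratio and Alloc Ratio columns of the updated table: each value is relative to the baseline row of its group (the bold SharpToken row). As a rough check, assuming Ratio is close to the quotient of the two Mean values, the `Tiktoken_` CountTokens row for the `[19866]` input works out as shown here. BenchmarkDotNet derives Ratio from the measurement distribution rather than from the two reported means, so the quotient is only an approximation; for example, the `TiktokenSharpV1_0_9_Encode` row for the same input reports 0.80, while 1,016,985.4 / 1,327,775.1 is about 0.77.

```latex
\begin{aligned}
\text{Ratio} &\approx \frac{\text{Mean}_{\text{Tiktoken}}}{\text{Mean}_{\text{SharpToken}}}
  = \frac{311{,}744.2\ \text{ns}}{1{,}554{,}552.0\ \text{ns}} \approx 0.20 \\
\text{Alloc Ratio} &= \frac{309{,}449\ \text{B}}{1{,}846{,}147\ \text{B}} \approx 0.17 \\
\text{Speedup} &\approx \frac{1{,}554{,}552.0\ \text{ns}}{311{,}744.2\ \text{ns}} \approx 5.0
\end{aligned}
```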

41 changes: 41 additions & 0 deletions benchmarks/1.2.0.0_encode.md
@@ -0,0 +1,41 @@
```
BenchmarkDotNet v0.13.12, macOS Sonoma 14.2.1 (23C71) [Darwin 23.2.0]
Apple M1 Pro, 1 CPU, 10 logical and 10 physical cores
.NET SDK 8.0.100
[Host] : .NET 8.0.0 (8.0.23.53103), Arm64 RyuJIT AdvSIMD
DefaultJob : .NET 8.0.0 (8.0.23.53103), Arm64 RyuJIT AdvSIMD
```
| Method | Categories | Data | Mean | Median | Ratio | Gen0 | Gen1 | Gen2 | Allocated | Alloc Ratio |
|--------------------------- |------------ |-------------------- |---------------:|---------------:|------:|---------:|---------:|-------:|----------:|------------:|
| **SharpTokenV1_2_16_** | **CountTokens** | **1. (...)57. [19866]** | **1,554,552.0 ns** | **1,552,769.4 ns** | **1.00** | **292.9688** | **146.4844** | **-** | **1846147 B** | **1.00** |
| TiktokenSharpV1_0_9_ | CountTokens | 1. (...)57. [19866] | 1,242,157.7 ns | 1,241,657.7 ns | 0.80 | 253.9063 | 117.1875 | 3.9063 | 1570786 B | 0.85 |
| TokenizerLibV1_3_3_ | CountTokens | 1. (...)57. [19866] | 815,490.5 ns | 806,761.4 ns | 0.52 | 247.0703 | 98.6328 | 0.9766 | 1547678 B | 0.84 |
| Tiktoken_ | CountTokens | 1. (...)57. [19866] | 311,744.2 ns | 311,591.0 ns | 0.20 | 49.3164 | - | - | 309449 B | 0.17 |
| | | | | | | | | | | |
| **SharpTokenV1_2_16_** | **CountTokens** | **Hello, World!** | **1,585.8 ns** | **1,586.5 ns** | **1.00** | **0.5188** | **0.0019** | **-** | **3264 B** | **1.00** |
| TiktokenSharpV1_0_9_ | CountTokens | Hello, World! | 5,806.8 ns | 5,805.7 ns | 3.66 | 2.1286 | 0.0381 | 0.0076 | 13344 B | 4.09 |
| TokenizerLibV1_3_3_ | CountTokens | Hello, World! | 766.2 ns | 766.7 ns | 0.48 | 0.2356 | - | - | 1480 B | 0.45 |
| Tiktoken_ | CountTokens | Hello, World! | 210.9 ns | 210.2 ns | 0.13 | 0.0420 | - | - | 264 B | 0.08 |
| | | | | | | | | | | |
| **SharpTokenV1_2_16_** | **CountTokens** | **King(...)edy. [275]** | **13,851.9 ns** | **13,808.5 ns** | **1.00** | **4.1351** | **0.0153** | **-** | **25968 B** | **1.00** |
| TiktokenSharpV1_0_9_ | CountTokens | King(...)edy. [275] | 13,387.6 ns | 13,395.3 ns | 0.97 | 5.0659 | 0.1984 | 0.0153 | 31712 B | 1.22 |
| TokenizerLibV1_3_3_ | CountTokens | King(...)edy. [275] | 10,861.4 ns | 10,865.2 ns | 0.78 | 3.0975 | 0.1526 | 0.0153 | 19344 B | 0.74 |
| Tiktoken_ | CountTokens | King(...)edy. [275] | 3,162.3 ns | 3,162.0 ns | 0.23 | 0.6447 | - | - | 4064 B | 0.16 |
| | | | | | | | | | | |
| **SharpTokenV1_2_16_Encode** | **Encode** | **1. (...)57. [19866]** | **1,327,775.1 ns** | **1,330,166.1 ns** | **1.00** | **294.9219** | **142.5781** | **1.9531** | **1846151 B** | **1.00** |
| TiktokenSharpV1_0_9_Encode | Encode | 1. (...)57. [19866] | 1,016,985.4 ns | 994,095.3 ns | 0.80 | 250.0000 | 125.0000 | - | 1570772 B | 0.85 |
| TokenizerLibV1_3_3_Encode | Encode | 1. (...)57. [19866] | 804,657.4 ns | 803,549.7 ns | 0.61 | 247.0703 | 108.3984 | 0.9766 | 1547678 B | 0.84 |
| Tiktoken_Encode | Encode | 1. (...)57. [19866] | 331,107.8 ns | 331,142.1 ns | 0.25 | 59.5703 | 2.4414 | - | 375601 B | 0.20 |
| | | | | | | | | | | |
| **SharpTokenV1_2_16_Encode** | **Encode** | **Hello, World!** | **1,891.1 ns** | **1,894.6 ns** | **1.00** | **0.5188** | **0.0019** | **-** | **3264 B** | **1.00** |
| TiktokenSharpV1_0_9_Encode | Encode | Hello, World! | 5,816.9 ns | 5,824.0 ns | 3.08 | 2.1210 | 0.0381 | - | 13344 B | 4.09 |
| TokenizerLibV1_3_3_Encode | Encode | Hello, World! | 496.7 ns | 496.8 ns | 0.26 | 0.2356 | - | - | 1480 B | 0.45 |
| Tiktoken_Encode | Encode | Hello, World! | 265.3 ns | 264.7 ns | 0.14 | 0.1030 | - | - | 648 B | 0.20 |
| | | | | | | | | | | |
| **SharpTokenV1_2_16_Encode** | **Encode** | **King(...)edy. [275]** | **17,497.7 ns** | **17,480.3 ns** | **1.00** | **4.1199** | **0.0305** | **-** | **25968 B** | **1.00** |
| TiktokenSharpV1_0_9_Encode | Encode | King(...)edy. [275] | 13,374.0 ns | 13,348.4 ns | 0.76 | 5.0659 | 0.1984 | 0.0153 | 31712 B | 1.22 |
| TokenizerLibV1_3_3_Encode | Encode | King(...)edy. [275] | 7,333.9 ns | 7,338.7 ns | 0.42 | 3.0899 | 0.1450 | 0.0076 | 19344 B | 0.74 |
| Tiktoken_Encode | Encode | King(...)edy. [275] | 3,450.2 ns | 3,452.9 ns | 0.20 | 0.7973 | - | - | 5024 B | 0.19 |
24 changes: 15 additions & 9 deletions src/Directory.Packages.props
@@ -3,19 +3,25 @@
<ManagePackageVersionsCentrally>true</ManagePackageVersionsCentrally>
</PropertyGroup>
<ItemGroup>
-<PackageVersion Include="BenchmarkDotNet" Version="0.13.10" />
+<PackageVersion Include="BenchmarkDotNet" Version="0.13.12" />
<PackageVersion Include="DotNet.ReproducibleBuilds" Version="1.1.1" />
<PackageVersion Include="FluentAssertions" Version="6.12.0" />
<PackageVersion Include="GitHubActionsTestLogger" Version="2.3.3" />
-<PackageVersion Include="H.Resources.Generator" Version="1.5.1" />
-<PackageVersion Include="Microsoft.DeepDev.TokenizerLib" Version="1.3.2" />
-<PackageVersion Include="Microsoft.NET.Test.Sdk" Version="17.8.0" />
-<PackageVersion Include="MSTest.TestAdapter" Version="3.1.1" />
-<PackageVersion Include="MSTest.TestFramework" Version="3.1.1" />
-<PackageVersion Include="PolySharp" Version="1.13.2" />
-<PackageVersion Include="SharpToken" Version="1.2.14" />
+<PackageVersion Include="H.Resources.Generator" Version="1.6.0">
+<PrivateAssets>all</PrivateAssets>
+<IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>
+</PackageVersion>
+<PackageVersion Include="Microsoft.DeepDev.TokenizerLib" Version="1.3.3" />
+<PackageVersion Include="Microsoft.NET.Test.Sdk" Version="17.9.0" />
+<PackageVersion Include="MSTest.TestAdapter" Version="3.2.1" />
+<PackageVersion Include="MSTest.TestFramework" Version="3.2.1" />
+<PackageVersion Include="PolySharp" Version="1.14.1">
+<PrivateAssets>all</PrivateAssets>
+<IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>
+</PackageVersion>
+<PackageVersion Include="SharpToken" Version="1.2.16" />
<PackageVersion Include="System.ValueTuple" Version="4.5.0" />
<PackageVersion Include="TiktokenSharp" Version="1.0.9" />
-<PackageVersion Include="Verify.MSTest" Version="22.6.0" />
+<PackageVersion Include="Verify.MSTest" Version="23.2.0" />
</ItemGroup>
</Project>
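
For context on why only this file carries version numbers: with `ManagePackageVersionsCentrally` set to `true`, NuGet's Central Package Management expects project files to list `PackageReference` items without a `Version` attribute and resolves each version from `Directory.Packages.props`. A minimal sketch of such a project file follows; the project itself is hypothetical, since the actual `.csproj` package references are not shown in this commit.

```xml
<!-- Hypothetical project file, for illustration only. -->
<Project Sdk="Microsoft.NET.Sdk">
  <ItemGroup>
    <!-- No Version attribute: it is taken from the matching
         PackageVersion entry in Directory.Packages.props. -->
    <PackageReference Include="BenchmarkDotNet" />
    <PackageReference Include="SharpToken" />
  </ItemGroup>
</Project>
```

Under this setup, a release bump like the one above touches a single file instead of every project that uses the package.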
12 changes: 6 additions & 6 deletions src/benchmarks/Tiktoken.Benchmarks/Benchmarks.cs
@@ -30,15 +30,15 @@ public async Task GlobalSetup()

[Benchmark(Baseline = true)]
[BenchmarkCategory("Encode")]
-public List<int> SharpTokenV1_2_8_Encode() => _sharpToken.Encode(Data);
+public List<int> SharpTokenV1_2_16_Encode() => _sharpToken.Encode(Data);

[Benchmark]
[BenchmarkCategory("Encode")]
-public List<int> TiktokenSharpV1_0_6_Encode() => _tiktokenSharp.Encode(Data);
+public List<int> TiktokenSharpV1_0_9_Encode() => _tiktokenSharp.Encode(Data);

[Benchmark]
[BenchmarkCategory("Encode")]
-public IReadOnlyCollection<int> TokenizerLibV1_3_2_Encode() => _tokenizerLib!.Encode(Data, ArraySegment<string>.Empty);
+public IReadOnlyCollection<int> TokenizerLibV1_3_3_Encode() => _tokenizerLib!.Encode(Data, ArraySegment<string>.Empty);

[Benchmark]
[BenchmarkCategory("Encode")]
@@ -47,15 +47,15 @@ public async Task GlobalSetup()

[Benchmark(Baseline = true)]
[BenchmarkCategory("CountTokens")]
-public int SharpTokenV1_2_8_() => _sharpToken.Encode(Data).Count;
+public int SharpTokenV1_2_16_() => _sharpToken.Encode(Data).Count;

[Benchmark]
[BenchmarkCategory("CountTokens")]
-public int TiktokenSharpV1_0_6_() => _tiktokenSharp.Encode(Data).Count;
+public int TiktokenSharpV1_0_9_() => _tiktokenSharp.Encode(Data).Count;

[Benchmark]
[BenchmarkCategory("CountTokens")]
-public int TokenizerLibV1_3_2_() => _tokenizerLib!.Encode(Data, ArraySegment<string>.Empty).Count;
+public int TokenizerLibV1_3_3_() => _tokenizerLib!.Encode(Data, ArraySegment<string>.Empty).Count;

[Benchmark]
[BenchmarkCategory("CountTokens")]
@@ -2,7 +2,7 @@

<PropertyGroup>
<OutputType>Exe</OutputType>
-<TargetFramework>net7.0</TargetFramework>
+<TargetFramework>net8.0</TargetFramework>
<NoWarn>$(NoWarn);CS8002</NoWarn>
</PropertyGroup>
