A Go implementation of OpenAI's tiktoken tokenizer. The vocabularies are embedded in the package, so nothing has to be downloaded at runtime.
Install the module:

go get github.com/hupe1980/go-tiktoken
Usage:

package main

import (
	"fmt"
	"log"

	"github.com/hupe1980/go-tiktoken"
)

func main() {
	// Look up the encoding used by the given model.
	encoding, err := tiktoken.NewEncodingForModel("gpt-3.5-turbo")
	if err != nil {
		log.Fatal(err)
	}

	// Encode returns the token IDs together with the token strings.
	ids, tokens, err := encoding.Encode("Hello World", nil, nil)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println("IDs:", ids)
	fmt.Println("Tokens:", tokens)
}
Output:
IDs: [9906 4435]
Tokens: [Hello World]
For more usage examples, see _examples.
Supported encodings:
- ✅ o200k_base
- ✅ cl100k_base
- ✅ p50k_base
- ✅ p50k_edit
- ✅ r50k_base
- ✅ gpt2
- ✅ claude
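
To relate the encodings above to model names, here is a minimal, standard-library-only sketch of the kind of lookup that a call such as NewEncodingForModel presumably performs. The helper name encodingNameForModel and both tables are hypothetical. The model/encoding pairs follow the mapping in OpenAI's reference tiktoken and are an assumption here, not this package's actual table, which may cover more or different models.

```go
package main

import (
	"fmt"
	"strings"
)

// exactModels maps full model names to encoding names.
// Assumption: pairs taken from OpenAI's reference tiktoken, not from this package.
var exactModels = map[string]string{
	"gpt-4o":                 "o200k_base",
	"gpt-4":                  "cl100k_base",
	"gpt-3.5-turbo":          "cl100k_base",
	"text-embedding-ada-002": "cl100k_base",
	"text-davinci-003":       "p50k_base",
	"text-davinci-edit-001":  "p50k_edit",
	"davinci":                "r50k_base",
	"gpt2":                   "gpt2",
}

// prefixModels covers versioned model names such as "gpt-4o-mini".
// Assumption: same source as exactModels.
var prefixModels = []struct{ prefix, encoding string }{
	{"gpt-4o-", "o200k_base"},
	{"gpt-4-", "cl100k_base"},
	{"gpt-3.5-turbo-", "cl100k_base"},
}

// encodingNameForModel is a hypothetical helper. It returns the encoding name
// for a model and reports whether the model was recognized.
func encodingNameForModel(model string) (string, bool) {
	// An exact match takes precedence over a prefix match.
	if name, ok := exactModels[model]; ok {
		return name, true
	}
	for _, m := range prefixModels {
		if strings.HasPrefix(model, m.prefix) {
			return m.encoding, true
		}
	}
	return "", false
}

func main() {
	for _, model := range []string{"gpt-4o-mini", "gpt-3.5-turbo", "text-davinci-003", "unknown-model"} {
		name, ok := encodingNameForModel(model)
		fmt.Printf("%s -> %q (known: %t)\n", model, name, ok)
	}
}
```

Returning a boolean instead of falling back to a default encoding keeps an unrecognized model from silently producing token counts for the wrong vocabulary.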