Tolkien

GPT model token counter, with extra support for OpenAI's completion API.

Installation & usage

Add to your deps.edn:

 io.github.lukaszkorecki/tolkien {:git/tag "v0.1.4" :git/sha "65aeb98"}

And try it out:

(require '[tolkien.core :as token])

;; Simple token count
(token/count-tokens "gpt-3.5-turbo" "Hello World!")

;; Count chat-completion API request tokens:

(token/count-chat-completion-tokens {:model "gpt-3.5-turbo"
                                     :messages [{:role "system"
                                                 :content "You're a helpful assistant, but sometimes you make things up."}
                                                {:role "user"
                                                 :content "How many items are in this list? bananas, apples, raspberries"}]})

What is it?

If you're working with OpenAI's Chat Completion API, sooner or later you're going to run into the model's context length limit:

InvalidRequestError: This model’s maximum context length is 16385 tokens. However, your messages resulted in 18108 tokens. Please reduce the length of the messages.

Tolkien helps you get accurate token counts for plain strings and for Chat Completion API payloads.

⚠️ Chat Completion API payloads are notoriously hard to get accurate counts for - read this blog post explaining why.

Based on my experiments with real-life data (short, mid-size, and long-form text), Tolkien has an error margin of roughly 25 tokens: it will undercount or overcount by at most ~25 tokens, depending on which Chat Completion API features are used.
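As an example, a pre-flight check before sending a request could look like the sketch below. This is illustrative, not part of Tolkien's API: the 16385-token limit comes from the error message quoted above, the helper names (context-limit, error-margin, fits-context?) are made up for this example, and it assumes count-chat-completion-tokens returns the count as a number.

(require '[tolkien.core :as token])

;; Hypothetical pre-flight guard, not part of Tolkien's API.
;; context-limit is taken from the error message above; we also leave
;; headroom for the ~25-token counting error margin plus the tokens the
;; completion itself is allowed to use.
(def context-limit 16385)
(def error-margin 25)

(defn fits-context?
  "Returns true when the request, the counting error margin, and the
  reserved completion budget all fit within the model's context window."
  [request max-completion-tokens]
  (<= (+ (token/count-chat-completion-tokens request)
         error-margin
         max-completion-tokens)
      context-limit))

(fits-context? {:model "gpt-3.5-turbo"
                :messages [{:role "user"
                            :content "Hello!"}]}
               500)
;; => true when the request leaves enough room in the context window

Because the error margin is added on the safe side, a request that passes this check should not trip the InvalidRequestError shown above.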

Credits & acknowledgments
