A Go package providing a high-level abstraction to define functions with code (the usual way), with data (examples of inputs and expected outputs, which are then used with an AI model), or with a natural language description. It is the simplest yet still powerful way to use large language models (LLMs) in Go.
Features:
- A common interface to support code-defined, data-defined, and description-defined functions.
- Functions are strongly typed so inputs and outputs can be Go structs and values.
- Provides unofficial OpenAI, Groq, Anthropic and Ollama integrations for AI (LLM) models.
- Support for tool calling, which transparently calls into Go functions with Go structs and values as inputs and outputs. Recursion is possible.
- Uses adaptive rate limiting to maximize throughput of API calls made to integrated AI models.
- Provides a CLI tool `fun` which makes it easy to run data-defined and description-defined functions on files.
This is a Go package. You can add it to your project using `go get`:

```sh
go get gitlab.com/tozd/go/fun
```
It requires Go 1.23 or newer.
The releases page contains a list of stable versions of the `fun` tool. Each includes:

- Statically compiled binaries.
- Docker images.

You should just download/use the latest one.
The tool is implemented in Go. You can also use `go install` to install the latest stable (released) version:

```sh
go install gitlab.com/tozd/go/fun/cmd/go/fun@latest
```
To install the latest development version (`main` branch):

```sh
go install gitlab.com/tozd/go/fun/cmd/go/fun@main
```
See the full package documentation with examples on pkg.go.dev.
The `fun` tool calls a function on files. You can provide:

- Examples of inputs and expected outputs as files (pairs of files with the same basename but different file extensions).
- A natural language description of the function, a prompt.
- Input files on which to run the function.
- Files with input and output JSON Schemas to validate inputs and outputs, respectively.

You have to provide either example inputs and outputs or a prompt, but you can provide both.
`fun` has two sub-commands:

- `extract` supports extracting parts of one JSON file into multiple files using a GJSON query. Because `fun` calls the function on files, this is useful to preprocess a large JSON file to create files to then call the function on.
  - The query should return an array of objects with ID and data fields (by default named `id` and `data`).
- `call` then calls the function on files in the input directory and writes results into files in the output directory.
  - Corresponding output files will have the same basename as input files but with the output file extension (configurable), so it is safe to use the same directory for both input and output files.
  - `fun` calls the function only for files which do not yet exist in the output directory, so it is safe to run `fun` multiple times if a previous run of `fun` had issues or was interrupted.
  - `fun` supports splitting input files into batches so one run of `fun` can operate only on a particular batch. This is useful if you want to distribute execution across multiple machines.
  - If the output fails to validate against the JSON Schema, the output is stored into a file with the additional suffix `.invalid`. If calling the function fails for some other reason, the error is stored into a file with the additional suffix `.error`.
- `combine` combines multiple input directories into one output directory, keeping only those files which are equal in all input directories.
  - Provided input directories should be outputs from different models or different configurations, but all run on the same input files.
  - This allows decreasing false positives at the expense of having fewer outputs overall.
For details on all possible CLI arguments, run `fun --help`:

```sh
fun --help
```
If you have Go available, you can run it without installation:

```sh
go run gitlab.com/tozd/go/fun/cmd/go/fun@latest --help
```
Or with Docker:

```sh
docker run -i registry.gitlab.com/tozd/go/fun/branch/main:latest --help
```
The above command runs the latest development version (`main` branch). See the releases page for a Docker image of the latest stable version.
If you have a large JSON file with the following structure:

```json
{
  "exercises": [
    {
      "serial": 1,
      "text": "Ariel was playing basketball. 1 of her shots went in the hoop. 2 of the shots did not go in the hoop. How many shots were there in total?"
    },
    // ...
  ]
}
```
To create for each exercise a `.txt` file, with the filename based on the `serial` field (e.g., `1.txt`) and contents based on the `text` field, in the `data` output directory, you could run:

```sh
fun extract --input exercises.json --output data --out=.txt 'exercises.#.{id:serial,data:text}'
```
To solve all exercises, you can then run:

```sh
export ANTHROPIC_API_KEY='...'
echo "You MUST output only final number, nothing more." > prompt.txt
fun call --input data --output results --provider anthropic --model claude-3-haiku-20240307 --in .txt --out .txt --prompt prompt.txt
```
For the `data/1.txt` input file you should now get the `results/1.txt` output file with contents `3`.

The issue is that the function might sometimes output more than just the number. We can detect those cases by using a JSON Schema to validate outputs, e.g., a schema requiring the output to be an integer. When an output does not validate, a warning is shown and the corresponding output file is not created.
```sh
echo '{"type": "integer"}' > schema.json
fun call --input data --output results --provider anthropic --model claude-3-haiku-20240307 --in .txt --out .txt --prompt prompt.txt --output-schema schema.json
```
We can also use a JSON Schema to validate that the output is a string matching a regex:

```sh
echo '{"type": "string", "pattern": "^[0-9]+$"}' > schema.json
fun call --input data --output results --provider anthropic --model claude-3-haiku-20240307 --in .txt --out .txt --prompt prompt.txt --output-schema schema.json
```
There is also a read-only GitHub mirror available, if you need to fork the project there.
The project gratefully acknowledges the HPC RIVR consortium and EuroHPC JU for funding this project by providing computing resources of the HPC system Vega at the Institute of Information Science.
Funded by the European Union. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or European Commission. Neither the European Union nor the granting authority can be held responsible for them. Funded within the framework of the NGI Search project under grant agreement No 101069364.