diff --git a/docs/source/language/decoders.md b/docs/source/language/decoders.md
index 902d26a5..c991ebd2 100644
--- a/docs/source/language/decoders.md
+++ b/docs/source/language/decoders.md
@@ -6,6 +6,39 @@ LMQL also includes a library for array-based decoding `dclib`, which can be used
 
 In general, all LMQL decoding algorithms are model-agnostic and can be used with any LMQL-supported inference backend. For more information on the supported inference backends, see the [Models](./models.md) chapter.
 
+## Specifying The Decoding Algorithm
+
+Depending on the context, LMQL offers two ways to specify the decoding algorithm to use.
+
+**Queries with Decoding Clause**: The first option is to simply specify the decoding algorithm and its parameters as part of the query itself. This can be particularly useful if your choice of decoder is relevant and should be part of your program.
+
+```{lmql}
+
+name::specify-decoder
+beam(n=2)
+    "This is a query with a specified decoder: [RESPONSE]"
+from
+    "openai/text-ada-001"
+```
+
+**Specifying the Decoding Algorithm Externally**: The second option is to specify the decoding algorithm and its parameters externally, i.e. separately from the actual program code:
+
+```python
+import lmql
+
+@lmql.query(model="openai/text-davinci-003", decoder="sample", temperature=1.8)
+def tell_a_joke():
+    '''lmql
+    """A list of good dad jokes. A indicates the punchline:
+    Q:[JOKE]
+    A:[PUNCHLINE]""" where STOPS_AT(JOKE, "?") and STOPS_AT(PUNCHLINE, "\n")
+    '''
+
+tell_a_joke() # uses the decoder specified in @lmql.query(...)
+tell_a_joke(decoder="beam", n=2) # uses a beam search decoder with n=2
+```
+
+This is only possible when using LMQL from a Python program. For more information on this, also see the chapter on how to specify the [model to use for decoding](models.md).
 ## Supported Decoding Algorithms
 
 In general, the very first keyword of an LMQL query, specifies the decoding algorithm to use. For this, the following decoder keywords are available:
diff --git a/docs/source/language/models.md b/docs/source/language/models.md
index a63bf766..2e212110 100644
--- a/docs/source/language/models.md
+++ b/docs/source/language/models.md
@@ -4,6 +4,53 @@ LMQL is a high-level, front-end language for text generation. This means that LM
 
 Due to the modular design of LMQL, it is easy to add support for new models and backends. If you would like to propose or add support for a new model API or inference engine, please reach out to us via our [Community Discord](https://discord.com/invite/7eJP4fcyNT) or via [hello@lmql.ai](mailto:hello@lmql.ai).
 
+## Specifying The Model
+
+LMQL offers two ways to specify the model that is used as the underlying LLM:
+
+**Queries with `from` Clause**: The first option is to simply specify the model as part of the query itself. For this, you can use the `from` clause in combination with the indented syntax. This can be particularly useful if your choice of model is intentional and should be part of your program.
+
+```{lmql}
+
+name::specify-model
+argmax
+    "This is a query with a specified model: [RESPONSE]"
+from
+    "openai/text-ada-001"
+```
+
+**Specifying the Model Externally**: The second option is to specify the model and its parameters externally, i.e. separately from the actual program code:
+
+```python
+import lmql
+
+# uses 'chatgpt' by default
+@lmql.query(model="chatgpt")
+def tell_a_joke():
+    '''lmql
+    """A list of good dad jokes. A indicates the punchline
+    Q: How does a penguin build its house?
+    A: Igloos it together.
+    Q: Which knight invented King Arthur's Round Table?
+    A: Sir Cumference.
+    Q:[JOKE]
+    A:[PUNCHLINE]""" where STOPS_AT(JOKE, "?") and STOPS_AT(PUNCHLINE, "\n")
+    '''
+
+tell_a_joke() # uses chatgpt
+
+tell_a_joke(model="openai/text-davinci-003") # uses text-davinci-003
+
+```
+
+This is only possible when using LMQL from a Python program. When running in the playground, you can alternatively use the model dropdown available in the top right of the program editor:
+
+<center>
+    Screenshot of the model dropdown in the playground
+</center>
+
+## Available Model Backends
+
 ```{toctree}
 :maxdepth: 1
 
@@ -11,4 +58,10 @@ openai.md
 azure.md
 hf.md
 llama.cpp.md
-```
\ No newline at end of file
+```
+
+## Using Multiple Models
+
+LMQL currently supports the use of only one model per query. If you want to mix multiple models, the advised way is to use multiple queries that are executed in sequence, as sketched below. The main obstacle to supporting this is that different models produce differently scaled token probabilities, which means an end-to-end decoding process across models would be difficult to implement.
+
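+For illustration, a minimal sketch of this two-query approach could look as follows (the query functions, prompts and model choices here are only placeholders):
+
+```python
+import lmql
+
+# hypothetical two-step pipeline: summarize with one model, then translate with another
+@lmql.query(model="openai/text-davinci-003")
+def summarize(text):
+    '''lmql
+    "Summarize the following text in one sentence: {text}\n[SUMMARY]" where STOPS_AT(SUMMARY, "\n")
+    return SUMMARY
+    '''
+
+@lmql.query(model="chatgpt")
+def translate(text):
+    '''lmql
+    "Translate the following sentence to French: {text}\n[TRANSLATION]" where STOPS_AT(TRANSLATION, "\n")
+    return TRANSLATION
+    '''
+
+# each call runs a separate query with its own model; the result of the
+# first query is passed to the second one as a regular Python value
+summary = summarize("LMQL is a query language for large language models.")
+french = translate(summary)
+```
+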
+However, we are actively exploring ways to [support this in the future](https://github.com/eth-sri/lmql/issues/82).
\ No newline at end of file
diff --git a/docs/source/language/overview.md b/docs/source/language/overview.md
index 822abfef..2e97c18d 100644
--- a/docs/source/language/overview.md
+++ b/docs/source/language/overview.md
@@ -75,9 +75,9 @@ P(CLS)
 ```
 
 Instead of constraining `CLS` with a `where` expression, we now constrain it in the separate `distribution` clause. In LMQL, the `distribution` clause can be used to specify whether we want to additionally obtain the distribution over the possible values for a given variable. In this case, we want to obtain the distribution over the possible values for `CLS`.
 
-> Note, that to use the `distribution` clause, we have to make our choice of decoding algorithm explicit, by specifying `argmax` at the beginning of our code (see [Decoding Algorithms](./decoding.md) for more information).
+> **Extended Syntax**: Note that, to use the `distribution` clause, we have to make our choice of decoding algorithm explicit by specifying `argmax` at the beginning of our code (see [Decoding Algorithms](./decoding.md) for more information).
 >
-> In general, indenting your program and explicitly specifying e.g. `argmax` at the beginning of your code is optional, but recommended if you want to use the `distribution` clause. Throughout the documentation we will make use of both options.
+> In general, this extended form of LMQL syntax, i.e. indenting your program and explicitly specifying e.g. `argmax` at the beginning of your code, is optional, but recommended if you want to use the `distribution` clause. Throughout the documentation we will make use of both syntax variants.
 
 In addition to using the model to perform the `ANALYSIS`, LMQL now also scores each of the individually provided values for `CLS` and normalizes the resulting sequence scores into a probability distribution `P(CLS)` (printed to the Terminal Output of the Playground or Standard Output of the CLI).