
Merge branch 'main' into release-0.0.6.5-blog
lbeurerkellner committed Jul 14, 2023
2 parents 221d988 + b56d2c7 commit ee5d49f
Showing 3 changed files with 37 additions and 47 deletions.
72 changes: 30 additions & 42 deletions README.md
@@ -23,47 +23,24 @@
 </p>
 </div>
 
-LMQL is an open source programming language for large language models (LLMs) based on a *superset of Python*. LMQL goes beyond traditional templating languages by providing full Python support, yet a lightweight programming interface.
+LMQL is a programming language for large language models (LLMs) based on a *superset of Python*. LMQL offers a novel way of interweaving traditional programming with the ability to call LLMs in your code. It goes beyond traditional templating languages by integrating LLM interaction natively at the level of your program code.
+## Explore LMQL
 
-LMQL is designed to make working with language models like OpenAI, 🤗 Transformers more efficient and powerful through its advanced functionality, including multi-variable templates, conditional distributions, constraints, datatype constraints and control flow.
+An LMQL program reads like standard Python, but top-level strings are interpreted as query strings, i.e. they are passed to an LLM, where template variables like `[GREETINGS]` are completed by the model:
 
-Features:
+```python
+"Greet LMQL:[GREETINGS]\n" where stops_at(GREETINGS, ".") and not "\n" in GREETINGS
 
-- [X] **Python Syntax**: Write your queries using [familiar Python syntax](https://docs.lmql.ai/en/stable/language/overview.html), fully integrated with your Python environment (classes, variable captures, etc.)
-- [X] **Rich Control-Flow**: LMQL offers full Python support, enabling powerful [control flow and logic](https://docs.lmql.ai/en/stable/language/scripted_prompts.html) in your prompting logic.
-- [X] **Advanced Decoding**: Take advantage of advanced decoding techniques like [beam search, best_k, and more](https://docs.lmql.ai/en/stable/language/decoders.html).
-- [X] **Powerful Constraints Via Logit Masking**: Apply [constraints to model output](https://docs.lmql.ai/en/stable/language/constraints.html), e.g. to specify token length, character-level constraints, datatype and stopping phrases to get more control of model behavior.
-- [X] **Optimizing Runtime:** LMQL leverages speculative execution to enable faster inference, constraint short-circuiting, more efficient token use and [tree-based caching](https://lmql.ai/blog/release-0.0.6.html).
-- [X] **Sync and Async API**: Execute hundreds of queries in parallel with LMQL's [asynchronous API](https://docs.lmql.ai/en/stable/python/python.html), which enables cross-query batching.
-- [X] **Multi-Model Support**: Seamlessly use LMQL with [OpenAI API, Azure OpenAI, and 🤗 Transformers models](https://docs.lmql.ai/en/stable/language/models.html).
-- [X] **Extensive Applications**: Use LMQL to implement advanced applications like [schema-safe JSON decoding](https://github.com/microsoft/guidance#guaranteeing-valid-syntax-json-example-notebook), [algorithmic prompting](https://twitter.com/lbeurerkellner/status/1648076868807950337), [interactive chat interfaces](https://twitter.com/lmqllang/status/1645776209702182917), and [inline tool use](https://lmql.ai/#kv).
-- [X] **Library Integration**: Easily employ LMQL in your existing stack leveraging [LangChain](https://docs.lmql.ai/en/stable/python/langchain.html) or [LlamaIndex](https://docs.lmql.ai/en/stable/python/llama_index.html).
-- [X] **Flexible Tooling**: Enjoy an interactive development experience with [LMQL's Interactive Playground IDE](https://lmql.ai/playground), and [Visual Studio Code Extension](https://marketplace.visualstudio.com/items?itemName=lmql-team.lmql).
-- [X] **Output Streaming**: Stream model output easily via [WebSocket, REST endpoint, or Server-Sent Event streaming](https://github.com/eth-sri/lmql/blob/main/src/lmql/output/).
+if "Hi there" in GREETINGS:
+    "Can you reformulate your greeting in the speech of \
+    victorian-era English: [VIC_GREETINGS]\n" where stops_at(VIC_GREETINGS, ".")
 
-## Explore LMQL
+    "Analyse what part of this response makes it typically victorian:\n"
 
-A simple example program in LMQL looks like this:
+    for i in range(4):
+        "-[THOUGHT]\n" where stops_at(THOUGHT, ".")
 
-```python
-argmax
-   "Greet LMQL:[GREETINGS]\n"
-
-   if "Hi there" in GREETINGS:
-      "Can you reformulate your greeting in the speech of victorian-era English: [VIC_GREETINGS]\n"
-
-   "Analyse what part of this response makes it typically victorian:\n"
-
-   for i in range(4):
-      "-[THOUGHT]\n"
-
-   "To summarize:[SUMMARY]"
-from
-   "openai/text-davinci-003"
-where
-   stops_at(GREETINGS, ".") and not "\n" in GREETINGS and
-   stops_at(VIC_GREETINGS, ".") and
-   stops_at(THOUGHT, ".")
+    "To summarize:[SUMMARY]"
 ```
 
 Program Output:
@@ -72,19 +49,30 @@ Program Output:
 <br/>
 </div>
 
+LMQL allows you to express programs that contain both traditional algorithmic logic and LLM calls.
+At any point during execution, you can prompt an LLM on program variables in combination with standard natural language prompting, to leverage model reasoning capabilities in the context of your program.
+
+To better control LLM behavior, you can use the `where` keyword to specify constraints and data types for the generated text. This enables guidance of the model's reasoning process and constraining of intermediate outputs using an [expressive constraint language](https://docs.lmql.ai/en/stable/language/constraints.html).
+
+The main body of an LMQL program reads like standard Python (with control flow), where top-level strings are interpreted as model input with template variables like `[GREETINGS]`.
+Beyond this linear form of scripting, LMQL also supports a number of decoding algorithms to execute your program, such as `argmax`, `sample` or even advanced branching decoders like [beam search and `best_k`](https://docs.lmql.ai/en/stable/language/decoders.html).
 
-The `argmax` keyword in the beginning specifies the decoding algorithm used to generate tokens, e.g. `argmax`, `sample` or
-even advanced branching decoders like [beam search and `best_k`](https://docs.lmql.ai/en/stable/language/decoders.html).
+Learn more about LMQL by exploring the **[Example Showcase](https://lmql.ai)**, by running your own programs in our **[browser-based Playground IDE](https://lmql.ai/playground)**, or by reading the **[documentation](https://docs.lmql.ai)**.
 
-The `from` and `where` clauses specify the model and constraints that are employed during decoding.
+## Feature Overview
 
-Overall, this style of language model programming facilitates guidance of the model's reasoning process, and constraining
-of intermediate outputs using an [expressive constraint language](https://docs.lmql.ai/en/stable/language/constraints.html).
+LMQL is designed to make working with language models like OpenAI and 🤗 Transformers more efficient and powerful through its advanced functionality, including multi-variable templates, conditional distributions, constraints, datatypes and control flow.
 
-Learn more about LMQL by exploring our **[Example Showcase](https://lmql.ai)** or by running your own programs in our **[browser-based Playground IDE](https://lmql.ai/playground)**.
+- [X] **Python Syntax**: Write your queries using [familiar Python syntax](https://docs.lmql.ai/en/stable/language/overview.html), fully integrated with your Python environment (classes, variable captures, etc.)
+- [X] **Rich Control-Flow**: LMQL offers full Python support, enabling powerful [control flow and logic](https://docs.lmql.ai/en/stable/language/scripted_prompts.html) in your prompting logic.
+- [X] **Advanced Decoding**: Take advantage of advanced decoding techniques like [beam search, best_k, and more](https://docs.lmql.ai/en/stable/language/decoders.html).
+- [X] **Powerful Constraints Via Logit Masking**: Apply [constraints to model output](https://docs.lmql.ai/en/stable/language/constraints.html), e.g. to specify token length, character-level constraints, datatypes and stopping phrases to get more control over model behavior.
+- [X] **Optimizing Runtime:** LMQL leverages speculative execution to enable faster inference, constraint short-circuiting, more efficient token use and [tree-based caching](https://lmql.ai/blog/release-0.0.6.html).
+- [X] **Sync and Async API**: Execute hundreds of queries in parallel with LMQL's [asynchronous API](https://docs.lmql.ai/en/stable/python/python.html), which enables cross-query batching.
+- [X] **Multi-Model Support**: Seamlessly use LMQL with [OpenAI API, Azure OpenAI, and 🤗 Transformers models](https://docs.lmql.ai/en/stable/language/models.html).
+- [X] **Extensive Applications**: Use LMQL to implement advanced applications like [schema-safe JSON decoding](https://github.com/microsoft/guidance#guaranteeing-valid-syntax-json-example-notebook), [algorithmic prompting](https://twitter.com/lbeurerkellner/status/1648076868807950337), [interactive chat interfaces](https://twitter.com/lmqllang/status/1645776209702182917), and [inline tool use](https://lmql.ai/#kv).
+- [X] **Library Integration**: Easily employ LMQL in your existing stack leveraging [LangChain](https://docs.lmql.ai/en/stable/python/langchain.html) or [LlamaIndex](https://docs.lmql.ai/en/stable/python/llama_index.html).
+- [X] **Flexible Tooling**: Enjoy an interactive development experience with [LMQL's Interactive Playground IDE](https://lmql.ai/playground) and [Visual Studio Code Extension](https://marketplace.visualstudio.com/items?itemName=lmql-team.lmql).
+- [X] **Output Streaming**: Stream model output easily via [WebSocket, REST endpoint, or Server-Sent Event streaming](https://github.com/eth-sri/lmql/blob/main/src/lmql/output/).
 
 ## Getting Started

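As a quick aside to the `where` and decoder paragraphs in the README diff above: such a query can also be embedded in plain Python via LMQL's decorator API. The following is a minimal sketch, not taken from this commit, assuming the `@lmql.query` decorator from the linked Python API docs, an OpenAI key in the environment, and an illustrative model name; the exact result type may differ by version.

```python
import asyncio
import lmql

# Sketch only: the decorator compiles the docstring query into an
# executable, awaitable LMQL query function.
@lmql.query
async def greet():
    '''
    argmax
        "Greet LMQL:[GREETINGS]\n"
        "To summarize:[SUMMARY]"
    from
        "openai/text-davinci-003"
    where
        stops_at(GREETINGS, ".") and stops_at(SUMMARY, "\n")
    '''

# Queries are async so they can be batched across calls; here we run just one.
results = asyncio.run(greet())
print(results)
```
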
4 changes: 3 additions & 1 deletion docs/source/language/azure.md
@@ -48,4 +48,6 @@ You can then use this model in your query code as follows:
 name::use-azure-model
 argmax "Hello [WHO]" from my_azure_model
-```
+```
+
+Configuration parameters specified as part of an `lmql.model(...)` object take precedence over environment variables. The latter act only as a fallback, e.g. when `api_key=` is not specified as a keyword argument.
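
To make the precedence concrete, here is a minimal sketch of an explicit Azure configuration using the `lmql.model(...)` constructor referenced above; the endpoint, key, and model name are placeholders, not values from this repository:

```python
import lmql

# Sketch with placeholder values: keyword arguments passed here populate
# api_config and take precedence over OPENAI_* environment variables.
my_azure_model = lmql.model(
    "openai/gpt-3.5-turbo",
    api_type="azure-chat",                                # or "azure" for completion endpoints
    api_base="https://<your-resource>.openai.azure.com",  # falls back to OPENAI_API_BASE
    api_key="<your-azure-api-key>",                       # falls back to OPENAI_API_KEY
)
```
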
8 changes: 4 additions & 4 deletions src/lmql/runtime/bopenai/openai_api.py
@@ -107,13 +107,13 @@ def get_azure_config(model, api_config):
     api_type = api_config.get("api_type", os.environ.get("OPENAI_API_TYPE", ""))
 
     if (api_type == "azure" or api_type == "azure-chat"):
-        api_base = os.environ.get("OPENAI_API_BASE", api_config.get("api_base", None))
+        api_base = api_config.get("api_base", None) or os.environ.get("OPENAI_API_BASE", None)
         assert api_base is not None, "Please specify the Azure API base URL as 'api_base' or environment variable OPENAI_API_BASE"
-        api_version = os.environ.get("OPENAI_API_VERSION", api_config.get("api_version", "2023-05-15"))
-        deployment = os.environ.get("OPENAI_DEPLOYMENT", api_config.get("api_deployment", model))
+        api_version = api_config.get("api_version", None) or os.environ.get("OPENAI_API_VERSION", "2023-05-15")
+        deployment = api_config.get("api_deployment", None) or os.environ.get("OPENAI_DEPLOYMENT", model)
 
         deployment_specific_api_key = f"OPENAI_API_KEY_{deployment.upper()}"
-        api_key = os.environ.get(deployment_specific_api_key, None) or os.environ.get("OPENAI_API_KEY", api_config.get("api_key", None))
+        api_key = api_config.get("api_key", None) or os.environ.get(deployment_specific_api_key, None) or os.environ.get("OPENAI_API_KEY", None)
         assert api_key is not None, "Please specify the Azure API key as 'api_key' or environment variable OPENAI_API_KEY or OPENAI_API_KEY_<DEPLOYMENT>"
 
         is_chat = api_type == "azure-chat"
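
The behavioral change in this hunk is easiest to see in isolation. The following toy snippet, with hypothetical values not taken from the repository, contrasts the old and new lookup order for `api_base`:

```python
import os

# Hypothetical values illustrating the lookup-order fix:
os.environ["OPENAI_API_BASE"] = "https://from-env.example"  # environment fallback
api_config = {"api_base": "https://explicit.example"}       # explicit lmql.model(...) argument

# Old lookup: the environment variable, if set, shadows the explicit setting.
old = os.environ.get("OPENAI_API_BASE", api_config.get("api_base", None))
# New lookup: the explicit setting wins; the env var is only a fallback.
new = api_config.get("api_base", None) or os.environ.get("OPENAI_API_BASE", None)

print(old)  # https://from-env.example   (surprising)
print(new)  # https://explicit.example   (intended)
```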
