Set up a Rust-first repo structure #4

ErikKaum · 2024-08-15T16:35:47Z

PR is no longer in draft. The main goal of this PR is to have:

- a repo structure that we're all happy with
- confirmation that the bindings & integration of the Rust crate work
- a way to build the Python package, both for local dev and for a distribution build*

*The build process is not necessarily the most elegant one. We're choosing between two options, maturin and setuptools-rust:

- maturin doesn't support SCM-based versioning, but is in general easier to use: `maturin build` "just works".
- setuptools-rust allows us to use SCM-based versioning but requires more custom scripts. In particular, it doesn't play well with dependencies from parent directories. The workaround is to create a symlink and edit Cargo.toml, then build.
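To make the symlink workaround concrete, here is a minimal sketch of what such a pre-build step could look like. It is an illustration under assumptions, not the script used in this PR: the helper `vendor_parent_crate`, the directory names `core-crate` and `core-crate-link`, and the exact `path = "..."` dependency line are all hypothetical.

```python
import os
import tempfile
from pathlib import Path


def vendor_parent_crate(package_dir: Path, crate_dir: Path, link_name: str) -> Path:
    """Symlink a crate that lives outside the package and repoint Cargo.toml at the link.

    Hypothetical helper: names and layout are assumptions made for illustration.
    """
    # Relative target (e.g. "../../core-crate") so the link survives moving the repo.
    rel_target = os.path.relpath(crate_dir, package_dir)
    link = package_dir / link_name
    if not link.is_symlink():
        os.symlink(rel_target, link, target_is_directory=True)

    # Rewrite the path dependency so it no longer points at a parent directory.
    cargo_toml = package_dir / "Cargo.toml"
    manifest = cargo_toml.read_text()
    cargo_toml.write_text(
        manifest.replace(f'path = "{rel_target}"', f'path = "{link_name}"')
    )
    return link


# Demo on a throwaway directory tree that mimics a bindings/python layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    crate = root / "core-crate"
    package = root / "bindings" / "python"
    crate.mkdir()
    package.mkdir(parents=True)
    (crate / "Cargo.toml").write_text('[package]\nname = "core-crate"\n')
    (package / "Cargo.toml").write_text(
        '[dependencies]\ncore-crate = { path = "../../core-crate" }\n'
    )

    link = vendor_parent_crate(package, crate, "core-crate-link")
    link_resolves = (link / "Cargo.toml").is_file()
    rewritten_manifest = (package / "Cargo.toml").read_text()
```

After this step the package directory is self-contained from the build backend's point of view, which is the property the setuptools-rust route needs.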

Note: CI is currently not working with the new setup. I propose creating a new issue + PR where we address that separately.

Co-authored-by: Andrew Lapp <[email protected]>

Release Docker dispatch: https://github.com/lapp0/outlines/actions/runs/7994419887

- "Fails successfully": it got to the point where the only failure is an auth error.

`Error: buildx failed with: ERROR: denied: requested access to the resource is denied`

Not testing fetch PyPi. Changes are minimal between this version and the previous *working* main.

```
git diff e99d92d -- .github/workflows/release_pypi.yaml .github/workflows/release.yml | cat
diff --git a/.github/workflows/release.yml b/.github/workflows/release_pypi.yaml
similarity index 92%
rename from .github/workflows/release.yml
rename to .github/workflows/release_pypi.yaml
index e6bf1b1..597ebb7 100644
--- a/.github/workflows/release.yml
+++ b/.github/workflows/release_pypi.yaml
@@ -1,16 +1,17 @@
-name: Release
+name: Release PyPi

 on:
   release:
     types:
       - created
-
 jobs:
   release-job:
     name: Build and publish on PyPi
     runs-on: ubuntu-latest
+    environment: release
     steps:
-    - uses: actions/checkout@v2
+    - name: Checkout
+      uses: actions/checkout@v2
     - name: Set up Python
       uses: actions/setup-python@v2
       with:
```

Co-authored-by: Andrew Lapp <[email protected]>

We currently store the logits processor in the `LlamaCpp` instance. This causes issues when doing successive generations with different generators. In this PR we create a new `LlamaSequenceGenerator` instance every time we create a new generator and store the logits processor in that instance, which solves the issue. Fixes #700.
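A toy sketch of why per-generator storage helps. The classes below are stand-ins written for illustration, not the real `LlamaCpp` or `LlamaSequenceGenerator`, and the overwrite shown is only one way shared state can go wrong.

```python
class SharedModel:
    """Stand-in for a model wrapper that keeps the logits processor on itself."""

    def __init__(self):
        self.logits_processor = None


class SequenceGenerator:
    """Stand-in for a per-generator object that owns its own logits processor."""

    def __init__(self, model, logits_processor):
        self.model = model
        self.logits_processor = logits_processor


def make_generator_shared(model, processor):
    # Problematic design: every new generator overwrites the model's processor.
    model.logits_processor = processor
    return model


def make_generator_isolated(model, processor):
    # Design described above: a fresh instance per generator holds the processor.
    return SequenceGenerator(model, processor)


model = SharedModel()

# Shared storage: the first generator silently picks up the second one's processor.
shared_a = make_generator_shared(model, "regex-A")
shared_b = make_generator_shared(model, "regex-B")

# Isolated storage: each generator keeps the processor it was created with.
isolated_a = make_generator_isolated(model, "regex-A")
isolated_b = make_generator_isolated(model, "regex-B")
```

Here the processors are plain strings to keep the example short; in practice they would be callables that modify the logits.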

When integrating Outlines with vLLM I faced the following issues, which are fixed in this PR:

1. When calling `vllm.LLM.generate`, vLLM internally makes a `copy.deepcopy` of the vLLM `SamplingParams`, which includes the logits processor from Outlines (`RegexLogitsProcessor`, say). This requires everything to be pickleable, and `RegexLogitsProcessor.fsm.vocabulary` is a `dict_values` object, which isn't. The fix is easy: just convert it to a list. This doesn't affect how the `vocabulary` variable is used in the code.
2. `RegexLogitsProcessor` takes an `llm` argument, which the docstring states should be a `vllm.LLM` object, but then attempts to extract the underlying tokenizer via `llm.tokenizer.tokenizer`. The tokenizer of `vllm.LLM` currently lives in the `vllm.LLM.llm_engine.tokenizer.tokenizer` attribute, but this is a big mess and isn't backwards compatible with previous vLLM versions. Instead, vLLM has a convenience method, `vllm.LLM.get_tokenizer`, which fetches the tokenizer. To remain backwards compatible, in case people have supplied `vllm.LLM.llm_engine` directly to `RegexLogitsProcessor`, it falls back to a `tokenizer` or `tokenizer.tokenizer` attribute.

I also updated the vLLM example script, as it was outdated as well (it used the previous `_patched_apply_logits_processors`). Closes #704
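Both points can be reproduced without vLLM installed. The sketch below uses stand-in objects: `Processor` and `resolve_tokenizer` are hypothetical names for illustration, and the fallback order is one reading of the description above rather than the exact code in this PR.

```python
import copy
from types import SimpleNamespace


class Processor:
    """Stand-in for a logits processor that holds a vocabulary."""

    def __init__(self, vocabulary):
        self.vocabulary = vocabulary


vocab = {"a": 0, "b": 1}

# 1. A dict_values view cannot be pickled, so copy.deepcopy raises TypeError.
try:
    copy.deepcopy(Processor(vocab.values()))
    deepcopy_failed = False
except TypeError:
    deepcopy_failed = True

# Converting the view to a list makes the object deep-copyable.
copied = copy.deepcopy(Processor(list(vocab.values())))


# 2. Prefer the convenience method, then fall back to attributes.
def resolve_tokenizer(llm):
    if hasattr(llm, "get_tokenizer"):
        return llm.get_tokenizer()
    tokenizer = getattr(llm, "tokenizer", None)
    if tokenizer is None:
        raise ValueError("object exposes neither get_tokenizer() nor a tokenizer attribute")
    # Unwrap a nested tokenizer.tokenizer if present, else use the attribute itself.
    return getattr(tokenizer, "tokenizer", tokenizer)


# Stand-ins for an LLM object, an engine with a wrapped tokenizer, and a flat one.
llm_like = SimpleNamespace(get_tokenizer=lambda: "tok-from-method")
engine_like = SimpleNamespace(tokenizer=SimpleNamespace(tokenizer="tok-nested"))
flat_like = SimpleNamespace(tokenizer="tok-flat")
```

Calling `resolve_tokenizer` on each stand-in returns the tokenizer from the method, the nested attribute, and the flat attribute respectively.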

…pported by exllama (#729)

Refactored the `exl2` function in `exllamav2.py`. The new version offers the following benefits:

1. Auto-split support. You no longer need to split a large model over 2 GPUs manually; exllama will do it for you.
2. 8-bit cache support. The 8-bit cache lets you squeeze more context into the same GPU.
3. Additional exllamav2 improvements. Supports `low_mem` and `fasttensors`.
4. `num_experts` no longer needs to be passed in; it is optional.
5. Future support for the 4-bit cache. Whenever turbo updates the pip package, uncomment the 4-bit lines for 4-bit support.
6. Refactored function parameters. The `model_kwargs` dictionary is replaced by individual parameters. Combined with the documentation, this makes it easier for new users to understand which options they can select.

This PR updates `pyproject.toml`, `Manifest.in` and `Cargo.toml` to support compilation and installation of the Rust crate `outlines_core`. Note: building and installing rely on `pip` rather than the `build` library.

Co-authored-by: erikkaum <[email protected]>

rlouf and others added 30 commits February 20, 2024 10:20

Add links to Outlines' supporter

2dbc279

Dockerize and Add Release-Push Workflow (#688)

26d9f0c

Co-authored-by: Andrew Lapp <[email protected]>

fix minor typo

9c4e7f4

Generate text using any interegular FSM

7a21043

Add developer survey link

d85e67f

Return FSM final state if already in final state (#718)

c0b47a4

Fixes #716

Allow specifying own HF tokenizer object

c1b4ffa

Run commit hooks

c4de2e0

Put prompt_token_ids, attentions_mask and weights on the same device

42f465c

Fixing Enum with only one element being ignored (#721)

88dc97c

Fixes dottxt-ai/outlines#720

Escape JSON property names in regex

0488ad2

Simplify the transformers and llamacpp interfaces

a62ff00

Update the pick-odd-one example

4775309

Let people init OpenAI with client and tokenizer

9fd0f46

Stop generation at every FSM final state

c1851df

A recent change replaced the set of FSM final states with the state -1 that is used to represent an EOS token being generated. This could explain the issue reported in #605.
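A toy illustration of the difference, assuming a hand-written FSM for the regex `ab?`. `ToyFSM`, `EOS_STATE` and the `"<eos>"` token are hypothetical names for this sketch and are not Outlines' FSM interface.

```python
from typing import Dict, FrozenSet, Tuple

EOS_STATE = -1  # sentinel state meaning "an EOS token was generated"


class ToyFSM:
    """Minimal FSM over string tokens, written only for this illustration."""

    def __init__(self, transitions: Dict[Tuple[int, str], int], final_states: FrozenSet[int]):
        self.transitions = transitions
        self.final_states = final_states

    def next_state(self, state: int, token: str) -> int:
        if token == "<eos>":
            return EOS_STATE
        return self.transitions[(state, token)]

    def is_final_state(self, state: int) -> bool:
        # Treat both the EOS sentinel and every accepting state as final.
        return state == EOS_STATE or state in self.final_states


# FSM for the regex `ab?`: 0 --a--> 1 (accepting), 1 --b--> 2 (accepting).
fsm = ToyFSM({(0, "a"): 1, (1, "b"): 2}, frozenset({1, 2}))

state_after_a = fsm.next_state(0, "a")
only_eos_check = state_after_a == EOS_STATE            # misses the accepting state
full_check = fsm.is_final_state(state_after_a)         # recognises it
state_after_eos = fsm.next_state(state_after_a, "<eos>")
```

With only the `-1` check, a sequence that has reached an accepting state is not recognised as finished until an EOS token is actually sampled.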

Test llamacpp when successive regex-guided generations

05c1d56

Add Guide interface

11143df

Add integration for transformers via logits processors

5d97ee1

Restore FSM interface for backward compatibility

d47bd6b

Update the docstring of exl2

6484d8c

Pass 'model_kwargs for outlines.models.llamacpp as dict (#744)

5c15e8c

Fixes [this issue](dottxt-ai/outlines#743)

Add a function to convert utf8 regexps into regexps that operates over individual bytes

d7295a7

Support generating multi-byte utf8 characters

043117f

Make model_kwargs dictionary by default

c8566e8

Check if the given token is a string (#745)

aa0a35e

Some models use bytes as their tokens, such as Qwen (see: https://huggingface.co/Qwen/Qwen-7B/blob/ef3c5c9c57b252f3149c1408daf4d649ec8b6c85/tokenization_qwen.py#L136 )
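A minimal sketch of the kind of type check this implies, assuming a vocabulary whose keys may be either `str` or `bytes`. The helper `token_to_text` and the choice to decode with replacement characters are assumptions for illustration, not the logic of this commit.

```python
from typing import Union


def token_to_text(token: Union[str, bytes]) -> str:
    """Return a token as text, accepting both str and bytes tokens."""
    if isinstance(token, str):
        return token
    if isinstance(token, bytes):
        # A byte-level token may be a partial UTF-8 sequence, so decode leniently.
        return token.decode("utf-8", errors="replace")
    raise TypeError(f"unsupported token type: {type(token).__name__}")


text_token = token_to_text("hello")
bytes_token = token_to_text(b"hello")
partial_token = token_to_text("é".encode("utf-8")[:1])  # first byte of a 2-byte character
```

A partial multi-byte token decodes to the Unicode replacement character here, which is why byte-level vocabularies usually need dedicated handling rather than a plain decode.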

Add BibteX citation

f7cafe5

ErikKaum and others added 21 commits August 23, 2024 18:05

big reshuffle but this should work

0f7a766

take in everything from main

wrong name in outlines-core Cargo.toml

29c6f11

make example function more clearly an example

6b19692

run pre commit

68b6724

move test & benchmarks to bindings/python

f20620a

forgot that one

ec23be1

with maturin & incorporate latest changes

f1373e4

run pre-commit

8a003aa

change workdir for jobs

dd1b5c7

consolidate pyproject.toml

07cb81a

add setup.cfg back

ae386d2

fix paths in pre commit configuration

cc210cd

don't run full matrix and activate venv

2db0651

test without sccache

33c5e2e

pytest dependecy fix

2b8b4ad

fix coverage wrong path

7d9a8e8

debug .coverage path

2e6ba7f

more debug .coverage path

d1022ae

should add github.workspace?

c30cfb6

continue path debug

804c66a

brandonwillard force-pushed the repo-structure branch from 83003ec to 804c66a Compare August 23, 2024 23:13

brandonwillard force-pushed the main branch 2 times, most recently from 347e191 to c448997 Compare August 29, 2024 15:36

ErikKaum mentioned this pull request Aug 29, 2024

Update build, CI and release scripts #18

Closed

brandonwillard force-pushed the main branch 2 times, most recently from 2b07cfd to ed7cdf2 Compare September 10, 2024 22:20

brandonwillard force-pushed the main branch from e63edab to c39600f Compare September 23, 2024 23:25

brandonwillard closed this Sep 30, 2024

brandonwillard force-pushed the main branch from b656883 to bcf655a Compare September 30, 2024 22:02
