Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify the code in python/outlines-core #2

Closed
1 of 4 tasks
rlouf opened this issue Aug 14, 2024 · 3 comments · Fixed by #3
Closed
1 of 4 tasks

Simplify the code in python/outlines-core #2

rlouf opened this issue Aug 14, 2024 · 3 comments · Fixed by #3
Labels
Outlines Related to the integration with Outlines

Comments

@rlouf
Copy link
Member

rlouf commented Aug 14, 2024

The requirements are similar to those of #1, except we also need to write Python bindings. Where the interface lies is yet to be discussed; for now I suggest implementing the guides in this library.

  • Remove the logic that is unrelated to structured generation (Clean the repository #3)
  • Use adapt_tokenizer instead of TransformersTokenizer so we don't have to keep the models directory
  • Flatten the library
  • Remove .readthedoc.yml
@rlouf rlouf added the Outlines Related to the integration with Outlines label Aug 14, 2024
@rlouf rlouf assigned rlouf and lapp0 and unassigned rlouf and lapp0 Aug 14, 2024
@rlouf rlouf linked a pull request Aug 16, 2024 that will close this issue
@rlouf rlouf changed the title Use outlines-core in Outlines Simplify the code in python/outlines-core Aug 22, 2024
@rlouf rlouf unassigned lapp0 Sep 14, 2024
@brandonwillard
Copy link
Member

brandonwillard commented Oct 9, 2024

After #52, outlines-core no longer has tokenizer support, aside from the two copies of TransformerTokenizer in the test and benchmark code. What's the plan wrt. this?

If the plan is to use adapt_tokenizer to patch transformers tokenizers, it's not clear how that's an improvement over a custom tokenizer wrapper classes and a conditional transformers dependency, for example. In general, we could move TransformerTokenizer back to outlines-core and make transformers optional, then outlines-core will be usable with llama-based tokenizers and we won't need two copies for testing.

@rlouf
Copy link
Member Author

rlouf commented Oct 23, 2024

I agree, having integrated outlines-core in different libraries I think we should move TransformersTokenizer back to outlines-core; otherwise we have to duplicate this code wherever outlines-core is used.

@rlouf
Copy link
Member Author

rlouf commented Oct 23, 2024

Broke this down in #80 and #81

@rlouf rlouf closed this as not planned Won't fix, can't repro, duplicate, stale Oct 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Outlines Related to the integration with Outlines
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants