After #52, `outlines-core` no longer has tokenizer support, aside from the two copies of `TransformerTokenizer` in the test and benchmark code. What's the plan regarding this?
If the plan is to use `adapt_tokenizer` to patch `transformers` tokenizers, it's not clear how that's an improvement over, for example, a custom tokenizer wrapper class and a conditional `transformers` dependency. In general, we could move `TransformerTokenizer` back to `outlines-core` and make `transformers` optional; then `outlines-core` would be usable with llama-based tokenizers and we wouldn't need two copies for testing.
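To make the wrapper-class alternative concrete, here is a minimal sketch of what a conditional `transformers` dependency could look like. The names (`Tokenizer` protocol, `HAS_TRANSFORMERS`, the `convert_token_to_string` method) are illustrative assumptions, not the actual `outlines-core` API:

```python
# Hedged sketch: a tokenizer wrapper with an OPTIONAL `transformers`
# dependency. Attribute and method names here are hypothetical.
from typing import Dict, Protocol

try:
    import transformers  # optional, only needed for this wrapper
    HAS_TRANSFORMERS = True
except ImportError:
    HAS_TRANSFORMERS = False


class Tokenizer(Protocol):
    """Minimal interface the index-building code would need."""

    eos_token_id: int
    vocabulary: Dict[str, int]

    def convert_token_to_string(self, token: str) -> str: ...


class TransformerTokenizer:
    """Wraps a `transformers` tokenizer behind the minimal interface."""

    def __init__(self, model_name: str):
        if not HAS_TRANSFORMERS:
            # Fail only when the wrapper is actually used, so the
            # package itself imports fine without `transformers`.
            raise ImportError(
                "TransformerTokenizer requires the optional "
                "`transformers` package: pip install transformers"
            )
        self._tok = transformers.AutoTokenizer.from_pretrained(model_name)
        self.eos_token_id = self._tok.eos_token_id
        self.vocabulary = self._tok.get_vocab()

    def convert_token_to_string(self, token: str) -> str:
        return self._tok.convert_tokens_to_string([token])
```

Any object satisfying the protocol (e.g. a llama/SentencePiece wrapper) would work the same way, which is the point: the hard dependency stays out of the core package.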
The clean solution is actually to use the `tokenizers` crate to remove the dependency on `transformers` in the Python package. In the meantime, it is unreasonable to ask downstream libraries to implement their own version of `adapt_tokenizer`, since it is always required to use the package.
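For context, the kind of shim each downstream library would otherwise have to write looks roughly like this. This is a hedged sketch of the general pattern, not the actual `adapt_tokenizer` from outlines, and the attribute names are assumptions:

```python
# Illustrative adapt_tokenizer-style shim (hypothetical; the real
# function in outlines may differ). It normalizes any tokenizer that
# exposes `get_vocab()` into the attributes index construction expects.
from types import SimpleNamespace


def adapt_tokenizer(tokenizer):
    """Attach the attributes the FSM/index-building code would expect."""
    tokenizer.vocabulary = tokenizer.get_vocab()
    if not hasattr(tokenizer, "convert_token_to_string"):
        # Identity fallback: fine for plain word-level vocabs, but wrong
        # for byte-level (GPT-2) or SentencePiece ("▁"-prefixed) vocabs,
        # which is why every downstream copy tends to drift.
        tokenizer.convert_token_to_string = lambda token: token
    return tokenizer


# Stand-in for a third-party tokenizer with the common HF-like surface.
fake = SimpleNamespace(
    eos_token_id=2,
    get_vocab=lambda: {"hello": 0, "world": 1, "</s>": 2},
)
adapted = adapt_tokenizer(fake)
print(adapted.vocabulary["world"])            # 1
print(adapted.convert_token_to_string("hi"))  # hi
```

Every consumer reimplementing (and subtly diverging on) logic like the byte-level/SentencePiece handling above is exactly the duplication being objected to.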
*Originally posted by @brandonwillard in #2 (comment)*