
Fix transformers-related warning #148

Open
arashmath opened this issue Aug 26, 2024 · 1 comment

Dear lambeq developers,

I was playing around with the package and testing the parsing example in the Bobcat tutorial by simply running

from lambeq import BobcatParser

parser = BobcatParser()
diagram = parser.sentence2diagram('A sentence here.')
diagram.draw()

and realised I was getting the following warning, originating from the transformers library:

\lib\site-packages\transformers\tokenization_utils_base.py:1601: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be depracted in transformers v4.45, and will be then set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884

Following the suggestion by the Hugging Face developers in the issue linked above, if the parameter clean_up_tokenization_spaces=True is added to the tokenizer definition here:

tokenizer = AutoTokenizer.from_pretrained(model_dir)

the warning will be resolved.
I can confirm this worked for me locally.
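For reference, a minimal sketch of the change, assuming a transformers install; `model_dir` here is a hypothetical stand-in (a public checkpoint) for the path that lambeq's `BobcatParser` resolves internally:

```python
from transformers import AutoTokenizer

# Hypothetical stand-in for the model directory lambeq passes internally.
model_dir = "bert-base-uncased"

# Setting clean_up_tokenization_spaces explicitly silences the FutureWarning
# that transformers emits when the flag is left unset.
tokenizer = AutoTokenizer.from_pretrained(
    model_dir,
    clean_up_tokenization_spaces=True,
)
```

Extra keyword arguments to `AutoTokenizer.from_pretrained` are forwarded to the tokenizer's constructor, so the flag ends up stored on the tokenizer instance.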
Many thanks!


dimkart commented Sep 2, 2024

@arashmath Thank you very much for the suggestion, we'll add it in a future release.
