
Update transformers tests generation util v4.45.2 #1441

Merged
merged 23 commits into huggingface:main on Dec 11, 2024

Conversation

@malkomes (Contributor) commented Oct 18, 2024:

What does this PR do?

We sync the tests in test_utils.py with transformers v4.45.2.

To facilitate review, please check the PRs merged to this branch first:

The modifications are:

  • Update all the tests to transformers v4.45.2.
  • Mark tests that are not ready as expected failures (xfail).

@malkomes malkomes requested a review from regisss as a code owner October 18, 2024 14:40
@malkomes malkomes marked this pull request as draft October 18, 2024 14:41
@malkomes malkomes force-pushed the test_utils_update_transformers branch from ff28ba3 to 46b89e3 on October 23, 2024 15:40
@malkomes malkomes marked this pull request as ready for review October 23, 2024 16:00
@malkomes (Author):

Running

optimum-habana/tests/transformers/tests/generation# python -m pytest test_utils.py

on main gives us:

23 failed, 16 passed, 9 skipped, 14 warnings

This branch increases the number of tests and marks as xfail the tests that need to be updated:

32 passed, 26 skipped, 21 xfailed, 16 warnings
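
For reference, marking a test as an expected failure is done with pytest's xfail marker; this is a generic illustration, not a snippet from the PR:

    import pytest

    @pytest.mark.xfail(reason="not yet adapted to transformers v4.45.2")
    def test_not_ready_yet():
        # Placeholder body; the real tests live in test_utils.py.
        assert False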

@malkomes (Author):

Hello @yafshar, here's the PR to update the test_utils synced with main.

@yafshar (Contributor) commented Oct 23, 2024:

@malkomes thanks for the update. I will start checking the code.

torch.ones((batch_size, 1), dtype=torch.long, device=device) * decoder_start_token_id
)

if token_idx is not None:

Reviewer (Contributor):

@malkomes this is a bit confusing! Are you repeating the same ops again here? Why did you remove the if condition in the first place if you want to repeat the same operation again?

            decoder_start_token_id = (
                torch.ones((batch_size, 1), dtype=torch.long, device=device) * decoder_start_token_id
            )

@yafshar (Contributor), Oct 25, 2024:

The max_length padding should only happen when token_idx is not None.

@malkomes (Author):

Well, here we are adding logic to handle the case where we have a batch of tokens in decoder_start_token_id. And, indeed, there are different ways to implement this. In fact, in the other PR I did something a little bit "more cute", where I handled the decoder_start_token_id batch case and the static shapes with more code changes, different variable names, etc.

This time I realized that it's much easier to maintain the code if we keep it similar to the transformers code. It so happens that the max_length padding works for both the batch and non-batch cases if we do it afterwards.

I was hoping that my comments here would explain it: malkomes#2. Let me know if that helps or not. Maybe we should also add notes in the code, since it is a little bit confusing.
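
For reference, the idea is roughly this (a hypothetical sketch with assumed helper and parameter names, not the actual code in the diff):

    import torch

    def gaudi_prepare_decoder_input_ids(
        batch_size, decoder_start_token_id, device, token_idx=None, max_length=None, pad_token_id=0
    ):
        # Same as upstream transformers: broadcast a scalar start token to shape
        # (batch_size, 1); a batched start-token tensor is used as-is.
        if isinstance(decoder_start_token_id, int):
            decoder_input_ids = (
                torch.ones((batch_size, 1), dtype=torch.long, device=device) * decoder_start_token_id
            )
        else:
            decoder_input_ids = decoder_start_token_id.view(batch_size, -1)

        # Static-shape handling: pad up to max_length only when token_idx is set.
        # Doing the padding afterwards works for both the scalar and the batched case.
        if token_idx is not None and max_length is not None:
            pad_len = max_length - decoder_input_ids.shape[-1]
            if pad_len > 0:
                decoder_input_ids = torch.nn.functional.pad(
                    decoder_input_ids, (0, pad_len), value=pad_token_id
                )
        return decoder_input_ids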

Reviewer (Contributor):

Thanks for the explanation. I agree; for readability and maintainability, your separation makes sense!

input_ids = inputs_dict[self.input_name]
# TODO: @raushan or @gante, use `model.main_input_name` as the main input instead of relying on `input_ids`
input_ids = inputs_dict.pop(self.input_name)[:batch_size, :]
inputs_dict.pop("attention_mask", None)

Reviewer (Contributor):

Why are you removing attention_mask for these tests?

@yafshar (Contributor), Oct 25, 2024:

Is there a reason we don’t use filtered_inputs_dict similar to how it’s done in Transformers? Keeping the code closer to the upstream implementation would make it easier to maintain.

@malkomes (Author):

I see, you are looking at the most recent transformers version here.

I was updating it based on 4.45.2.

I think for this function there's no harm in using the latest version. Let me try.

@malkomes (Author):

There are several modifications from 4.45.2 to 4.46. We can work on that after this PR. What do you think?

Reviewer (Contributor):

@malkomes Thank you for checking this. Let's stick with version 4.45.2 for now. We can plan to update to 4.46 later; the next version has some other changes.

@malkomes (Author):

Sounds good! Yeah, 4.46 does a refactor of the tests, so it's better to handle 4.46 in a new PR, and I'm happy to do it once we finish this one. (:
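
For context, the two input-preparation styles under discussion look roughly like this (names and slicing are approximations, not verbatim from either transformers release):

    import torch

    def slice_inputs_v4_45(inputs_dict, input_name, batch_size):
        # v4.45.2-style: pop the main input and drop the attention mask explicitly.
        input_ids = inputs_dict.pop(input_name)[:batch_size, :]
        inputs_dict.pop("attention_mask", None)
        return input_ids, inputs_dict

    def slice_inputs_later(inputs_dict, input_name, batch_size):
        # Later style (>= v4.46): keep a filtered copy of every input, sliced to batch_size.
        filtered_inputs_dict = {
            k: v[:batch_size, ...] if isinstance(v, torch.Tensor) else v
            for k, v in inputs_dict.items()
        }
        return filtered_inputs_dict.pop(input_name), filtered_inputs_dict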


# It is important to set the eos_token_id to None to ensure that no sequences
# shorter than `max_length` can be generated
config.eos_token_id = None

Reviewer (Contributor):

@malkomes this is confusing again!

Above we are setting config.eos_token_id = [config.eos_token_id], and here you are un-setting it with config.eos_token_id = None?
If it needs to be set to None, we should remove the code above and just set config.pad_token_id there.

@malkomes (Author):

Great catch! That's something that came from the upstream version I was referring to. I'll fix it.
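
A minimal, hypothetical sketch of the direction discussed above — set pad_token_id from the existing eos token and then unset eos, instead of first wrapping eos_token_id in a list:

    def disable_early_stopping(config):
        # Reuse the existing eos token as the pad token if none is set.
        if config.pad_token_id is None and config.eos_token_id is not None:
            eos = config.eos_token_id
            config.pad_token_id = eos[0] if isinstance(eos, list) else eos
        # Unset eos so that no sequence shorter than `max_length` can be generated.
        config.eos_token_id = None
        return config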

eos_token_id=bart_model.config.eos_token_id,
**model_kwargs,
)
max_new_tokens = 20

Reviewer (Contributor):

Suggested change:
-    max_new_tokens = 20
+    # Controlling max_length via the configuration is deprecated in favor of max_new_tokens
+    max_new_tokens = 20


)
min_length = 10
input_len = input_ids.shape[-1]
out_gen = model.generate(input_ids=input_ids, min_length=min_length, max_new_tokens=20)

Reviewer (Contributor):

Same as above, please add a comment for the change.


@yafshar (Contributor) commented Oct 28, 2024:

@malkomes thanks a lot. I finished my first check. Would you please address the comments to wrap up and finish this PR?

@yafshar (Contributor) commented Oct 28, 2024:

@malkomes I just checked one of the failures, test_cfg_mixin. I think we need to enable it for optimum-habana and update the test. In this one, the gpt2 model does not have a pad token and will generate something completely wrong:

generated_text=['The dragon flew over Paris, landing on the outskirts of the city centre. He landed safely on the outskirts of Paris, landing on the outskirts of Paris, landing on the outskirts of Paris mosqu']

If I add the following to the test:

        if tokenizer.pad_token is None:
            tokenizer.pad_token = tokenizer.eos_token
            model.generation_config.pad_token_id = model.generation_config.eos_token_id

it will create:

generated_text=['The dragon flew over Paris, landing on the roof of the hotel where he was staying with his wife and children. He then proceeded to climb onto the roof of the hotel where he met his']

It's different from the original test, but it works on optimum-habana; it should probably be marked as x.
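
For illustration, the workaround in a minimal self-contained form (the prompt and generation settings are placeholders, not taken from the actual test):

    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # gpt2 ships without a pad token; reusing the eos token avoids the garbage
    # continuations seen above when padding is involved.
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token
        model.generation_config.pad_token_id = model.generation_config.eos_token_id

    inputs = tokenizer(["The dragon flew over Paris,"], return_tensors="pt", padding=True)
    output = model.generate(**inputs, max_new_tokens=20)
    print(tokenizer.batch_decode(output, skip_special_tokens=True))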

@malkomes (Author) commented Nov 4, 2024:

@yafshar I think I got all the changes. Let me know if I've missed something. I was mostly comparing this test_utils.py file with the transformers one tagged 4.45.2. Thanks again for your careful and great review.

@yafshar (Contributor) commented Nov 4, 2024:

@malkomes thanks for the updates.

With this PR:

>>> RUN_SLOW=true GAUDI2_CI=1 python -m pytest test_utils.py
test_utils.py .xssxx..xx........x.....sx.x...xxx.......xsxsx...x.xx.xxx..xxxxxs..sx.s.sssssss                                      [100%]
38 passed, 15 skipped, 26 xfailed, 19 warnings in 301.97s (0:05:01)

@yafshar (Contributor) left a review:

LGTM!

@regisss @ssarkar2 this PR is ready. Please check!

@malkomes (Author):

@regisss @ssarkar2 can you take a look at this proposal to sync the tests? Thanks!

@emascarenhas (Contributor) left a review:

Could you run through the slow tests with the latest 1.19 Docker image on a 1.19 host?

@yafshar (Contributor) commented Nov 27, 2024:

I ran the tests on 1.19:

>>> RUN_SLOW=true GAUDI2_CI=1 python -m pytest test_utils.py
test_utils.py .xssxx..xx........x.....sx.x...xxx.......xsxsx...x.xx.xxx..xxxxxs..sx.s.sssssss 
38 passed, 15 skipped, 26 xfailed, 19 warnings in 295.73s (0:04:55)

@regisss (Collaborator) left a review:

LGTM! I just left a couple of questions to better understand the changes to conftest.py and pyproject.toml 🙂

@malkomes (Author):

@regisss sorry for taking so long to reply. The last commit should have fixed it. Let me know if anything else is missing ;-)


The code quality check failed, please run make style.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@regisss (Collaborator) left a review:

LGTM! No worries @malkomes

@regisss regisss merged commit 2ba520a into huggingface:main Dec 11, 2024
4 checks passed
zzhang37 pushed a commit to zzhang37/optimum-habana that referenced this pull request Dec 11, 2024
Co-authored-by: Gustavo <gustavo.malkomes>
Co-authored-by: Yaser Afshar <[email protected]>
Co-authored-by: regisss <[email protected]>