Fix Tensor Dimension Mismatch in Padding Operation for Batch Processing #254

Open

wants to merge 2 commits into main
Conversation

sugary199

Previously, pad_embeds was constructed by repeating the pad_embed tensor along the wrong dimension, which led to a size mismatch when concatenating it with inputs['inputs_embeds']. The error message is as follows:

```
Process Process-1:
Traceback (most recent call last):
  File "/ML-A100/team/mm/shuyu/anaconda3/envs/intern_clean/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/ML-A100/team/mm/shuyu/anaconda3/envs/intern_clean/lib/python3.9/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/ML-A100/team/mm/shuyu/workspace/projects/InternLM-XComposer/cap_train.py", line 159, in inferCaptionsAndSave
    inputs = torch.cat([pad_embeds, inputs['inputs_embeds']], dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 57 but got size 1 for tensor number 1 in the list.
FINISHED!
```

Modification: specify dimension 1 (the sequence dimension) when repeating pad_embed to prepare pad_embeds.
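A minimal sketch of the shape mismatch and the fix; the shapes, hidden_dim, and pad_len below are illustrative assumptions, not the exact values used in cap_train.py:

```python
import torch

hidden_dim = 4096
pad_len = 57                                    # tokens needed to reach the batch max length
pad_embed = torch.randn(1, 1, hidden_dim)       # embedding of the pad token, (1, 1, hidden)
inputs_embeds = torch.randn(1, 23, hidden_dim)  # (batch, seq_len, hidden)

# Buggy: repeating along dim 0 yields shape (57, 1, 4096), so
# torch.cat(..., dim=1) fails because dim 0 (batch) no longer matches.
# pad_embeds = pad_embed.repeat(pad_len, 1, 1)

# Fixed: repeat along dim 1 (the sequence dimension) -> shape (1, 57, 4096).
pad_embeds = pad_embed.repeat(1, pad_len, 1)

padded = torch.cat([pad_embeds, inputs_embeds], dim=1)
print(padded.shape)  # torch.Size([1, 80, 4096])
```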

This issue was not triggered by the official examples because the difference in token counts across the batch was only 1. I therefore increased the difference in token counts between the two examples to reproduce it.
