Enable uploading multiple images in demo.py #232
I previously tried to feed multiple images manually using
I don't have this issue. Everything works fine. The quoted code is exactly the source of your issue. You can try my branch.
I tried your branch but am still having the issue. Here are my steps:
From this point on, asking "Please describe the [first/second] image" only gets me really weird descriptions that seem to mix up the two images. If you can get the model to describe both images at the same time (or do any reasoning on multiple images at once), maybe you can share the prompt you used. EDIT: I realized that the outcome is a bit random: sometimes the descriptions get mixed up and other times they don't, but I can't manage to get reliably accurate reasoning on multiple images.
@LFavano yeah, the IQ of MiniGPT-4 can fluctuate, especially for small models. I guess your prompt is also a bit misleading. The image embeddings are actually appended to the prompt, so the full embedding sequence the model sees is <embedding of "Please describe the first image provided, a second image is coming after"> + <embedding of the first image>. You can see why MiniGPT-4 sometimes gives confusing output: it gets confused as well.
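To make the ordering concrete, here is a minimal sketch (not MiniGPT-4's actual code) that uses plain Python lists as stand-ins for embedding tensors. The helper name `concat_embeddings`, the hidden size of 2, and all numeric values are assumptions made purely for illustration. It only shows that the image rows end up after the text rows in the sequence the model reads.

```python
def concat_embeddings(*segments):
    """Concatenate segments along the sequence axis.

    Each segment is a list of rows, one row (vector) per token.
    """
    sequence = []
    for segment in segments:
        sequence.extend(segment)
    return sequence


# Toy values only: 3 "text tokens" and 2 "image tokens", hidden size 2.
prompt_emb = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
image_emb = [[9.0, 9.0], [8.0, 8.0]]

# The image rows are appended after the prompt rows.
sequence = concat_embeddings(prompt_emb, image_emb)
print(len(sequence))
```

In this layout, a prompt that talks about "the first image" is read before any image rows appear, which matches the confusion described above.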
When an image is uploaded, it is converted to an embedding and then concatenated with the token embeddings. In the case of multiple images, is anyone here aware of a model that takes in multiple such image embeddings simultaneously and concatenates them in the same sequence, as a kind of one-pass inference?
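One way to picture such a one-pass layout is to split the prompt at an image placeholder and interleave the text segments with the image embeddings. The sketch below is only an illustration under stated assumptions: the placeholder string `<ImageHere>`, the toy embedder `embed_text`, and the helper `interleave` are hypothetical names, and plain lists stand in for tensors. Whether a given model reasons well over such a sequence is a separate question.

```python
PLACEHOLDER = "<ImageHere>"  # assumed placeholder string, for illustration only


def embed_text(text):
    """Toy text embedder: one 1-d vector per whitespace-separated word."""
    return [[float(len(word))] for word in text.split()]


def interleave(prompt, image_embs):
    """Build one sequence: text segment, image, text segment, image, ..., text segment."""
    segments = prompt.split(PLACEHOLDER)
    if len(segments) != len(image_embs) + 1:
        raise ValueError("number of placeholders must match number of images")
    mixed = []
    for segment, image in zip(segments[:-1], image_embs):
        mixed.extend(embed_text(segment))
        mixed.extend(image)
    mixed.extend(embed_text(segments[-1]))
    return mixed


# Toy image embeddings: 2 rows for the first image, 1 row for the second.
image_a = [[100.0], [101.0]]
image_b = [[200.0]]

prompt = "First image: <ImageHere> Second image: <ImageHere> Compare them."
sequence = interleave(prompt, [image_a, image_b])
print(len(sequence))
```

Each image then sits directly after the text that introduces it, so a phrase like "Second image:" is adjacent to the rows it refers to.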
I made a few changes to enable uploading multiple images. This should close #180.
The changes include:
One remaining problem is that multiple images cannot be uploaded all at once; I suspect this line is the cause:
MiniGPT-4/minigpt4/conversation/conversation.py
Line 133 in 22d8888
So for now, each uploaded image must be followed by a text input before another image can be uploaded.
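The turn-order constraint can be sketched as a small guard. This is a hypothetical helper, not code from this PR or from `conversation.py`; the class name `UploadOrderGuard` and its methods are assumptions used only to show the "image, then text, then image" rule.

```python
class UploadOrderGuard:
    """Hypothetical helper: reject an image upload that directly follows another image."""

    def __init__(self):
        self.turns = []  # list of (kind, payload) tuples in arrival order

    def upload_image(self, name):
        if self.turns and self.turns[-1][0] == "image":
            raise RuntimeError("send a text message before uploading another image")
        self.turns.append(("image", name))

    def send_text(self, text):
        self.turns.append(("text", text))


guard = UploadOrderGuard()
guard.upload_image("first.png")
guard.send_text("This is the first image.")
guard.upload_image("second.png")  # allowed: a text turn came in between

try:
    guard.upload_image("third.png")  # rejected: no text since the last image
    rejected = False
except RuntimeError:
    rejected = True
```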