Hello!
I am currently working on adapting my pre-trained Llama model for text embedding tasks using the LLM2Vec methodology. My objective is to configure the model to generate text embeddings directly from image inputs. I have been utilizing the Unsloth fine-tuning framework, as demonstrated in this Colab notebook.
Current Progress:
Model Output: The model successfully generates descriptive text for a given image input.
Desired Outcome: Instead of generating descriptive text, I would like the model to produce an embedding vector (or pooled token representation) directly from the image input, in the same space it uses for text; a rough sketch of what I have in mind follows below.
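For context, here is roughly the kind of extraction I am imagining: a single forward pass over the image input, then mean-pooling the last hidden states (the LLM2Vec-style pooling) instead of calling generate(). This is only a sketch of my intent, not a working solution; the model id, prompt, and pooling choice are assumptions for illustration, and I use plain transformers here rather than Unsloth to keep it short:

```python
import torch
from PIL import Image
from transformers import AutoProcessor, MllamaForConditionalGeneration

# Assumed model id for illustration; in practice I load the model through Unsloth.
model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"
model = MllamaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("example.jpg")  # placeholder image path
messages = [{"role": "user",
             "content": [{"type": "image"},
                         {"type": "text", "text": "Describe this image."}]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    # Instead of model.generate(...), take the decoder hidden states directly.
    outputs = model(**inputs, output_hidden_states=True)

last_hidden = outputs.hidden_states[-1]            # (1, seq_len, hidden_size)
mask = inputs["attention_mask"].unsqueeze(-1).to(last_hidden.dtype)
embedding = (last_hidden * mask).sum(dim=1) / mask.sum(dim=1)   # mean pooling
print(embedding.shape)                             # (1, hidden_size)
```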
Challenges Encountered:
Integration of LLM2Vec: I am unsure how to apply the LLM2Vec recipe (bidirectional attention, MNTP, and contrastive training) so that the model produces embeddings from image inputs; my current text-only starting point is sketched after this list.
Unsloth Framework Adaptation: Need guidance on modifying the Unsloth fine-tuning process to accommodate this functionality.
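For reference, this is the text-only LLM2Vec usage I am starting from, as I understand it from the llm2vec library's README (the model names and arguments below are from that README and may not be exact); the open question is how to get the same kind of encode() behaviour when the input is an image rather than a sentence:

```python
import torch
from llm2vec import LLM2Vec

# Text-only LLM2Vec usage (checkpoint names from the McGill-NLP releases,
# quoted from memory). This is the behaviour I want to reproduce for images.
l2v = LLM2Vec.from_pretrained(
    "McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp",
    peft_model_name_or_path="McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp-supervised",
    device_map="cuda",
    torch_dtype=torch.bfloat16,
)

text_embeddings = l2v.encode(["a photo of a dog playing in the park"])
print(text_embeddings.shape)  # (1, hidden_size)
```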
Request for Assistance:
I would greatly appreciate guidance on the following:
Model Configuration: Steps to adjust the Llama model architecture to generate text embeddings directly from image inputs using LLM2Vec.
Fine-Tuning Process: Recommendations on adapting the Unsloth fine-tuning process to support this objective; my current guess at what the training objective might look like is sketched after this list.
Implementation Examples: Any available examples or references that demonstrate similar adaptations.
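To make the fine-tuning question more concrete: my current guess is that the supervised/SimCSE stage of LLM2Vec would become a contrastive objective over pooled image embeddings and pooled caption embeddings, roughly like the plain-PyTorch sketch below. This is entirely my own assumption, not something Unsloth or LLM2Vec provides out of the box:

```python
import torch
import torch.nn.functional as F

def info_nce(image_emb: torch.Tensor, text_emb: torch.Tensor,
             temperature: float = 0.05) -> torch.Tensor:
    """Symmetric InfoNCE over a batch of paired (image, caption) embeddings.

    Both tensors are (batch, hidden_size); matching rows are positives and
    all other rows in the batch serve as in-batch negatives.
    """
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.T / temperature          # (batch, batch) cosine sims
    labels = torch.arange(logits.size(0), device=logits.device)
    return (F.cross_entropy(logits, labels) + F.cross_entropy(logits.T, labels)) / 2
```

Whether a loss like this can be dropped into an Unsloth training loop (and what it would replace in the standard vision fine-tuning notebook) is exactly what I am unsure about.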
Your expertise and support in this matter would be invaluable to the progression of my project.
Thank you for your assistance.
Best regards!