How to Adapt the Supervised Contrastive Training for Custom Dataset? #156

Franciscus-Carolus · 2024-12-16T04:16:16Z

Thank you for sharing the LLM2Vec code and the training methods! I’m interested in using my own dataset for supervised contrastive training, but I couldn't find a guide for adapting the training procedure to custom datasets. Could you explain which parts of the code and configuration need to change?

Thank you in advance for your help!

Franciscus-Carolus · 2024-12-28T10:10:30Z

My task is to retrieve sentences with similar styles. I am continuing the training based on McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp-supervised.

First, I merged the weights of LLaMA-3, LLM2Vec-Meta-Llama-3-8B-Instruct-mntp, and LLM2Vec-Meta-Llama-3-8B-Instruct-mntp-supervised to obtain a single model.
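
The merge step above could look roughly like the following sketch. It assumes that both LLM2Vec checkpoints are LoRA adapters that can be loaded with the `peft` library and folded into the base model one after another. The function name `merge_adapters`, its parameters, and the use of `trust_remote_code=True` are illustrative assumptions, not something taken from the LLM2Vec repository.

```python
def merge_adapters(base_model_path, adapter_paths, output_dir):
    """Fold a sequence of LoRA adapters into a base model and save the result.

    Hypothetical helper: adapters are applied in the order given, so the MNTP
    adapter should come before the supervised adapter.
    """
    # Heavy dependencies are imported here so that merely defining the
    # function does not require them to be installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModel, AutoTokenizer

    # Assumption: the base checkpoint may ship custom model code.
    model = AutoModel.from_pretrained(
        base_model_path,
        torch_dtype=torch.bfloat16,
        trust_remote_code=True,
    )

    for adapter_path in adapter_paths:
        # Attach the adapter, then bake its weights into the base model.
        model = PeftModel.from_pretrained(model, adapter_path)
        model = model.merge_and_unload()

    # Save the merged weights together with the tokenizer so the output
    # directory can be used directly as model_name_or_path.
    model.save_pretrained(output_dir)
    AutoTokenizer.from_pretrained(base_model_path).save_pretrained(output_dir)
    return output_dir
```

It would be called with the base model path, the two adapter identifiers named above (MNTP first, then supervised), and a local output directory. Whether the saved model keeps the bidirectional attention behaviour depends on how the checkpoint's custom code is packaged, so it is worth checking the embeddings of the merged model against the unmerged one.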

Key changes made:

train_configs/supervised/MetaLlama3.json:
(1) The model_name_or_path was updated to point to the merged model, and the peft_model_name_or_path was removed.
(2) The dataset name and path were updated.
(3) Other parameters were adjusted as needed.
llm2vec/dataset: A new Python script was created for the dataset (using E5Data.py as a template).
(1) A dataset similar to E5Data was created, where each entry contains a query, a positive sample, and a negative sample.
(2) The prompt was updated.
(3) Parameters in the class E5Data were modified as necessary.
(4) The data organization follows the first option in E5Data.py ("allnli_split2"), as my task involves retrieving similar sentences.
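
The two groups of changes listed above can be sketched in plain Python. This is only an illustration under stated assumptions: `adapt_config`, `Triplet`, and `build_triplets` are hypothetical names, the config keys `dataset_name` and `dataset_file_path` are assumed rather than confirmed, and the way the prompt is joined to the query is a simplification that may differ from what E5Data.py does.

```python
from dataclasses import dataclass


def adapt_config(config, merged_model_path, dataset_name, dataset_path):
    """Return a copy of a training config adjusted for a merged model."""
    new_config = dict(config)
    new_config["model_name_or_path"] = merged_model_path
    # The adapters are already merged, so no PEFT checkpoint is loaded.
    new_config.pop("peft_model_name_or_path", None)
    # Assumed key names for the dataset name and path.
    new_config["dataset_name"] = dataset_name
    new_config["dataset_file_path"] = dataset_path
    return new_config


@dataclass
class Triplet:
    """One contrastive training example."""
    query: str
    positive: str
    negative: str


def build_triplets(records, instruction):
    """Turn raw records into triplets, prefixing each query with the prompt.

    Each record is expected to be a dict with the keys "query", "positive",
    and "negative". Only the query receives the instruction here, which is
    an assumption suited to symmetric sentence retrieval.
    """
    triplets = []
    for record in records:
        triplets.append(
            Triplet(
                query=f"{instruction} {record['query']}",
                positive=record["positive"],
                negative=record["negative"],
            )
        )
    return triplets


# Illustrative values only; paths, names, and prompt are placeholders.
example_config = adapt_config(
    {"model_name_or_path": "old-model", "peft_model_name_or_path": "old-adapter"},
    merged_model_path="output/merged-model",
    dataset_name="StyleData",
    dataset_path="cache/style-data",
)
example_triplets = build_triplets(
    [{"query": "q1", "positive": "p1", "negative": "n1"}],
    instruction="Retrieve sentences written in a similar style:",
)
```

Keeping the config change as a pure function makes it easy to confirm that peft_model_name_or_path is really gone before launching a long training run.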

Is there anything I've missed?
