I'm interested in using the llm-foundry infrastructure to train LLMs for sequence classification/regression tasks. I currently have a fork of llm-foundry where I got this working (in a fairly hacky manner that definitely needs to be cleaned up) with the MPT models provided by the repository, by creating a new MPTForSequenceRegression class and an associated composer model. HuggingFace also has sequence classification versions of most of the LLMs that they have available, which would just require a composer wrapper.
Is there interest in having tooling for sequence classification/regression live upstream in llm-foundry? I'd be interested in cleaning up and upstreaming what I have so far, and probably also writing some documentation on finetuning for these tasks, if such patches would be accepted.
Hey @boomanaiden154, the approach seems right! You should still be able to use the base HuggingFaceModel in composer, and just add the head classes as you described. Support for sequence classification/regression would be great!
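For anyone following along, here is a rough sketch of what a "head class" for sequence regression could look like in plain PyTorch. This is not the code from the fork mentioned above and not an llm-foundry or composer API: the names SequenceRegressionHead and regression_loss are made up for illustration, and it assumes right-padded batches and pooling on the last non-padding token, which is one common choice for decoder-only models.

```python
import torch
import torch.nn as nn


class SequenceRegressionHead(nn.Module):
    """Hypothetical head: maps decoder hidden states to one score per sequence."""

    def __init__(self, d_model: int, n_outputs: int = 1):
        super().__init__()
        # A single linear projection from the pooled hidden state to the outputs.
        self.score = nn.Linear(d_model, n_outputs, bias=False)

    def forward(self, hidden_states: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, d_model); attention_mask: (batch, seq_len).
        # Assumption: right padding, so the last real token sits at mask.sum() - 1.
        last_idx = attention_mask.long().sum(dim=1) - 1
        batch_idx = torch.arange(hidden_states.size(0), device=hidden_states.device)
        pooled = hidden_states[batch_idx, last_idx]  # (batch, d_model)
        return self.score(pooled)  # (batch, n_outputs)


def regression_loss(preds: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    # Mean squared error over scalar predictions; swap in cross-entropy for classification.
    return nn.functional.mse_loss(preds.squeeze(-1), targets.float())


# Usage with random tensors standing in for a backbone's final hidden states.
head = SequenceRegressionHead(d_model=8)
hidden = torch.randn(2, 5, 8)
mask = torch.tensor([[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]])
preds = head(hidden, mask)
loss = regression_loss(preds, torch.tensor([0.5, -1.0]))
```

In a real setup the hidden states would come from the backbone model's final layer, and the combined model would then be wrapped for composer as discussed above. With left-padded batches the pooling index would need to change.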