Finetuning Speech Encoders further #28

King-Rafat · 2024-06-29T03:04:30Z

Hi,

I tried finetuning the Swahili speech encoder further, but BLEU only increased to 9.6 from a base score of 7.5 on your already finetuned encoder. I finetuned the speech encoder for 5 epochs with augmented data, using about 30 hours of data. The MSE loss in the last epoch was 1.5*10^-6. I am reluctant to try more epochs because the improvement is smaller than I had expected. Is there a different approach that might help achieve a better BLEU?

Also, what is the finetuned decoder model checkpoint that, according to the paper, does well for Swahili? When I try to use it, I get the following error, which I do not get with the normal decoder: "ValueError: The input sequence length must be less than or equal to the maximum sequence length (512), but is 513 instead". All my audio clips are 30 seconds or shorter.

Thank you for your time!
