Help Needed: Fairseq --user-dir Argument Issue in IndicTrans2 Training #106

Open
sasidhar791 opened this issue Dec 15, 2024 · 10 comments

@sasidhar791

Hi @PranjalChitale ,

I’m trying to fine-tune the IndicTrans2 model using fairseq-train, but I keep encountering the following error:

fairseq-train: error: argument --user-dir: invalid Optional value: ''

I’ve provided the --user-dir argument as the path to the model_configs directory (e.g., --user-dir C:/Users/sasid/Downloads/IndicTrans2/model_configs), but the training script fails with the above error.

I’ve verified the path and ensured Fairseq is installed properly. Could you please clarify the correct usage of the --user-dir argument for IndicTrans2 or suggest a fix for this issue?

Please let me know if a particular version of Fairseq needs to be installed.

Thanks in advance for your guidance!

Best regards,
Sasidhar Tade.

@PranjalChitale
Collaborator

Can you provide your complete training command?

As for the versions, I don't believe it's a version-related issue, but you should use the install.sh script provided in the repository to set up your environment.

@sasidhar791
Author

sasidhar791 commented Dec 15, 2024

Yes @PranjalChitale, here is the training command:

fairseq-train $exp_dir/final_bin
--max-source-positions=256
--max-target-positions=256
--source-lang=SRC
--target-lang=TGT
--max-update=1000000
--save-interval-updates=2500
--arch=$model_arch
--activation-fn gelu
--criterion=label_smoothed_cross_entropy
--label-smoothing=0.1
--optimizer adam
--adam-betas "(0.9, 0.98)"
--lr-scheduler=inverse_sqrt
--clip-norm 1.0
--warmup-init-lr 1e-07
--lr 5e-4
--warmup-updates 4000
--dropout 0.2
--save-dir $exp_dir/model
--keep-last-epochs 5
--keep-interval-updates 3
--patience 10
--skip-invalid-size-inputs-valid-test
--fp16
--user-dir /c/Users/sasid/Downloads/IndicTrans2/model_configs
--update-freq=32
--distributed-world-size 8
--num-workers 24
--max-tokens 1024
--eval-bleu
--eval-bleu-args "{"beam": 1, "lenpen": 1.0, "max_len_a": 1.2, "max_len_b": 10}"
--eval-bleu-detok moses
--eval-bleu-remove-bpe sentencepiece
--eval-bleu-print-samples
--best-checkpoint-metric bleu
--maximize-best-checkpoint-metric
--task translation

@sasidhar791
Author

And the error is not just with --user-dir. If I comment out --user-dir, I get "command not found" errors for other flags as well:

AssertionError: Must specify batch size either with --max-tokens or --batch-size
/c/Users/sasid/Downloads/IndicTrans2/train.sh: line 37: --update-freq=32: command not found

Is there any chance that all these errors are related?
I would be grateful for your guidance!

@PranjalChitale
Collaborator

To resolve the issue, please ensure that you add a backslash ("\") at the end of each line before continuing the command on the next line, as demonstrated in the finetune.sh script.

Also, it looks like the path you provided might not be correct according to Windows conventions. Double-check the path and make sure to enclose it in quotation marks to avoid any parsing issues.
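
The effect of a missing line continuation can be reproduced without Fairseq. The following is a minimal sketch, not taken from the repository: `count_args` is a hypothetical helper that only reports how many arguments it received, and `LC_ALL=C` is set only to keep the shell's error message in English.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# A trailing backslash joins the physical lines into one command,
# so the helper receives all three arguments.
joined=$(count_args final_bin \
  --max-tokens 1024)

# Without the backslash, bash ends the command at the newline and then
# tries to execute "--max-tokens" as a separate command.
split=$(LC_ALL=C bash -c 'true final_bin
--max-tokens 1024' 2>&1)

echo "joined: $joined argument(s)"
echo "split:  $split"
```

The second case fails in the same way as the "--update-freq=32: command not found" message quoted above: every flag after the first newline never reaches the program, which is also consistent with the assertion about --max-tokens being missing.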

@sasidhar791
Author

sasidhar791 commented Dec 15, 2024

Apologies for the confusion. The path I sent you was the one I tried after removing the quotation marks (""). Initially I tried it with the quotation marks, and that didn't work either.

fairseq-train $exp_dir/final_bin
--max-source-positions=256
--max-target-positions=256
--source-lang=SRC
--target-lang=TGT
--max-update=1000000
--save-interval-updates=2500
--arch=$model_arch
--activation-fn gelu
--criterion=label_smoothed_cross_entropy
--label-smoothing=0.1
--optimizer adam
--adam-betas "(0.9, 0.98)"
--lr-scheduler=inverse_sqrt
--clip-norm 1.0
--warmup-init-lr 1e-07
--lr 5e-4
--warmup-updates 4000
--dropout 0.2
--save-dir $exp_dir/model
--keep-last-epochs 5
--keep-interval-updates 3
--patience 10
--skip-invalid-size-inputs-valid-test
--fp16
--user-dir /c/Users/sasid/Downloads/IndicTrans2/model_configs
--update-freq=32
--distributed-world-size 8
--num-workers 24
--max-tokens 1024
--eval-bleu
--eval-bleu-args "{"beam": 1, "lenpen": 1.0, "max_len_a": 1.2, "max_len_b": 10}"
--eval-bleu-detok moses
--eval-bleu-remove-bpe sentencepiece
--eval-bleu-print-samples
--best-checkpoint-metric bleu
--maximize-best-checkpoint-metric
--task translation

Also, the path I provided is in MSYS2 format, since I'm working in that environment rather than in native Windows.
Please provide guidance on this.

@PranjalChitale
Collaborator

I had already suggested the fix to resolve your issue in my previous comment.

You need to add "\" at the end of each line for proper line continuation in the bash script; without it, these errors are expected.

Nevertheless, please just use the following command instead.

Assuming your paths are correct, this should work.

fairseq-train $exp_dir/final_bin \
--max-source-positions=256 \
--max-target-positions=256 \
--source-lang=SRC \
--target-lang=TGT \
--max-update=1000000 \
--save-interval-updates=2500 \
--arch=$model_arch \
--activation-fn gelu \
--criterion=label_smoothed_cross_entropy \
--label-smoothing=0.1 \
--optimizer adam \
--adam-betas "(0.9, 0.98)" \
--lr-scheduler=inverse_sqrt \
--clip-norm 1.0 \
--warmup-init-lr 1e-07 \
--lr 5e-4 \
--warmup-updates 4000 \
--dropout 0.2 \
--save-dir $exp_dir/model \
--keep-last-epochs 5 \
--keep-interval-updates 3 \
--patience 10 \
--skip-invalid-size-inputs-valid-test \
--fp16 \
--user-dir /c/Users/sasid/Downloads/IndicTrans2/model_configs \
--update-freq=32 \
--distributed-world-size 8 \
--num-workers 24 \
--max-tokens 1024 \
--eval-bleu \
--eval-bleu-args '{"beam": 1, "lenpen": 1.0, "max_len_a": 1.2, "max_len_b": 10}' \
--eval-bleu-detok moses \
--eval-bleu-remove-bpe sentencepiece \
--eval-bleu-print-samples \
--best-checkpoint-metric bleu \
--maximize-best-checkpoint-metric \
--task translation
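
An alternative that avoids continuation characters altogether is to collect the arguments in a bash array. This is a sketch of a general shell technique, not something proposed in this thread; the array name `args` is illustrative and only a few of the flags above are shown. Newlines and comments are allowed inside the array literal, and single quotes keep the JSON value for --eval-bleu-args intact as one argument.

```shell
#!/usr/bin/env bash
# Illustrative subset of the training flags, stored one element per word.
args=(
  final_bin
  --adam-betas "(0.9, 0.98)"   # quoted so the space stays inside one argument
  --max-tokens 1024
  # Single quotes preserve the inner double quotes that JSON requires.
  --eval-bleu-args '{"beam": 1, "lenpen": 1.0, "max_len_a": 1.2, "max_len_b": 10}'
)

echo "number of arguments: ${#args[@]}"
printf 'last argument: %s\n' "${args[6]}"

# The real invocation would expand the array with each element as one word:
# fairseq-train "${args[@]}"
```

Quoting the expansion as `"${args[@]}"` is what keeps each element a single argument, including the ones that contain spaces.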

@sasidhar791
Author

sasidhar791 commented Dec 15, 2024

Yes, I tried adding the backslash ("\"), but I'm getting the same error even with the correct paths in my command.

fairseq-train $exp_dir/final_bin \
--max-source-positions=256 \
--max-target-positions=256 \
--source-lang=SRC \
--target-lang=TGT \
--max-update=1000000 \
--save-interval-updates=2500 \
--arch=$model_arch \
--activation-fn gelu \
--criterion=label_smoothed_cross_entropy \
--label-smoothing=0.1 \
--optimizer adam \
--adam-betas "(0.9, 0.98)" \
--lr-scheduler=inverse_sqrt \
--clip-norm 1.0 \
--warmup-init-lr 1e-07 \
--lr 5e-4 \
--warmup-updates 4000 \
--dropout 0.2 \
--save-dir $exp_dir/model \
--keep-last-epochs 5 \
--keep-interval-updates 3 \
--patience 10 \
--skip-invalid-size-inputs-valid-test \
--fp16 \
--user-dir /c/Users/sasid/Downloads/IndicTrans2/model_configs \
--update-freq=32 \
--distributed-world-size 8 \
--num-workers 24 \
--max-tokens 1024 \
--eval-bleu \
--eval-bleu-args "{"beam": 1, "lenpen": 1.0, "max_len_a": 1.2, "max_len_b": 10}" \
--eval-bleu-detok moses \
--eval-bleu-remove-bpe sentencepiece \
--eval-bleu-print-samples \
--best-checkpoint-metric bleu \
--maximize-best-checkpoint-metric \
--task translation




Please suggest an alternative solution.

@sasidhar791
Author

sasidhar791 commented Dec 15, 2024

The backslashes in my earlier comments weren't visible because I pasted the command directly as text. Maybe that's why you thought I hadn't added them; apologies for that. I did try adding the backslashes, and it still wasn't working for me.

In my last comment, I provided the same command I'm using as a code block, with the backslashes ("\").

Please suggest a solution, since this is currently blocking my work.

@sasidhar791
Author

sasidhar791 commented Dec 16, 2024

Hello @PranjalChitale ,

I figured out the issue.
Although I give the path with backslashes:
C:\Users\sasid\Downloads\en-indic-exp\model_configs

--user-dir receives the path with forward slashes (/) instead:
'C:/Users/sasid/Downloads/en-indic-exp/model_configs'

Since the Windows convention uses backslashes, I'm getting this error.
I've tried both Git Bash and MSYS2, and I'm facing the same issue in both environments.

Please provide a solution; I would be truly grateful.
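
One thing worth ruling out when passing Windows paths through a bash-style shell is how bash itself treats backslashes. The sketch below shows general bash quoting behavior only; it is not confirmed as the cause in this thread, and the variable names are illustrative.

```shell
#!/usr/bin/env bash
# Unquoted: bash treats each backslash as an escape and removes it.
unquoted=C:\Users\sasid\Downloads\en-indic-exp\model_configs

# Single-quoted: the backslashes are kept literally.
quoted='C:\Users\sasid\Downloads\en-indic-exp\model_configs'

# Parameter expansion that replaces every backslash with a forward slash.
forward=${quoted//\\//}

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:   %s\n' "$quoted"
printf 'forward:  %s\n' "$forward"
```

The unquoted form loses its separators entirely, so a path written with backslashes needs quoting before it can reach the program unchanged.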

@PranjalChitale
Collaborator

PranjalChitale commented Dec 16, 2024

The easiest solution is to use relative paths rather than absolute ones.

To simplify things, just run the command from within the experiment directory.

This will help ensure paths are resolved correctly and should eliminate the issues you're encountering.

The command below assumes that model_configs is also placed inside $exp_dir.

cd $exp_dir

fairseq-train final_bin \
--max-source-positions=256 \
--max-target-positions=256 \
--source-lang=SRC \
--target-lang=TGT \
--max-update=1000000 \
--save-interval-updates=2500 \
--arch=$model_arch \
--activation-fn gelu \
--criterion=label_smoothed_cross_entropy \
--label-smoothing=0.1 \
--optimizer adam \
--adam-betas "(0.9, 0.98)" \
--lr-scheduler=inverse_sqrt \
--clip-norm 1.0 \
--warmup-init-lr 1e-07 \
--lr 5e-4 \
--warmup-updates 4000 \
--dropout 0.2 \
--save-dir model \
--keep-last-epochs 5 \
--keep-interval-updates 3 \
--patience 10 \
--skip-invalid-size-inputs-valid-test \
--fp16 \
--user-dir model_configs \
--update-freq=32 \
--distributed-world-size 8 \
--num-workers 24 \
--max-tokens 1024 \
--eval-bleu \
--eval-bleu-args '{"beam": 1, "lenpen": 1.0, "max_len_a": 1.2, "max_len_b": 10}' \
--eval-bleu-detok moses \
--eval-bleu-remove-bpe sentencepiece \
--eval-bleu-print-samples \
--best-checkpoint-metric bleu \
--maximize-best-checkpoint-metric \
--task translation
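
Before launching the command above, the relative-path layout can be checked with a few lines of shell. This is a minimal sketch under stated assumptions: the temporary directory created with `mktemp` is only a stand-in for a real experiment directory, and `missing` is an illustrative variable name.

```shell
#!/usr/bin/env bash
# Stand-in for a real experiment directory (assumption for this sketch).
exp_dir=$(mktemp -d)
mkdir -p "$exp_dir/final_bin" "$exp_dir/model_configs"

# Work from inside the experiment directory so relative paths resolve there.
cd "$exp_dir" || exit 1

# Confirm that the directories referenced by relative path are present.
missing=0
for dir in final_bin model_configs; do
  if [ -d "$dir" ]; then
    echo "found: $dir"
  else
    echo "missing: $dir"
    missing=1
  fi
done
```

With a real experiment directory, the `mktemp` and `mkdir` lines would be dropped and `exp_dir` set to the actual location; a non-zero `missing` then indicates that a relative path in the training command would not resolve.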

