fix overwrite bug when adding symbol to dictionary
The bug ignored the overwrite request: tokens that were meant to overwrite existing entries were instead appended to the end of the dictionary's symbols. For example, a dictionary with 50K tokens that already includes `<s>`, `</s>`, `<pad>` and `<unk>` tagged with `#fairseq:overwrite` ended up with 50004 tokens when loaded.
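The behaviour described above can be reproduced with a small, self-contained sketch. `ToyDictionary`, `add_symbol_buggy`, `add_symbol_fixed` and `build` are hypothetical names used only for illustration; this is neither fairseq's actual class nor the exact diff in this commit. The sketch assumes that the intended fix is to reuse the existing index when `overwrite=True` rather than appending a new entry.

```python
class ToyDictionary:
    """Hypothetical stand-in for a symbol dictionary (not fairseq's class)."""

    SPECIALS = ("<s>", "<pad>", "</s>", "<unk>")

    def __init__(self):
        self.symbols = []   # index -> token
        self.count = []     # index -> frequency
        self.indices = {}   # token -> index
        for tok in self.SPECIALS:
            self.add_symbol_fixed(tok)

    def _append(self, word, n):
        idx = len(self.symbols)
        self.indices[word] = idx
        self.symbols.append(word)
        self.count.append(n)
        return idx

    def add_symbol_buggy(self, word, n=1, overwrite=False):
        if word in self.indices and not overwrite:
            idx = self.indices[word]
            self.count[idx] += n
            return idx
        # Bug: an existing word with overwrite=True falls through
        # and is appended as a brand-new entry.
        return self._append(word, n)

    def add_symbol_fixed(self, word, n=1, overwrite=False):
        if word in self.indices:
            idx = self.indices[word]
            # Assumed fix: keep the existing slot and replace its count.
            self.count[idx] = n if overwrite else self.count[idx] + n
            return idx
        return self._append(word, n)


def build(method_name, vocab_size=50_000):
    """Simulate loading a vocab file whose special tokens carry the overwrite tag."""
    d = ToyDictionary()
    add = getattr(d, method_name)
    for tok in ToyDictionary.SPECIALS:
        add(tok, n=1, overwrite=True)
    for i in range(vocab_size - len(ToyDictionary.SPECIALS)):
        add(f"tok{i}")
    return d


buggy_size = len(build("add_symbol_buggy").symbols)
fixed_size = len(build("add_symbol_fixed").symbols)
print(buggy_size, fixed_size)
```

With the buggy method the four special tokens are stored twice, so the 50K-entry vocabulary grows by four; with the assumed fix the size stays at 50K.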