Some clarifying questions #3
Hi @kirk86. Thanks for reaching out. I'd be more than happy to help :)
Generally in the code, when we have the
Exactly! Actually, there is an example of adding new models in https://github.com/kazemnejad/pt_hf_base?tab=readme-ov-file#adding-a-new-model. Just as a reminder, this is mostly a research codebase; I tried to make it as clean as possible, but be prepared to find some inconsistencies here and there :)
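(Purely as an illustration, not the repo's actual API: adding a model boils down to a config file plus a model class. A minimal decoder-style sketch, where the class name, layer sizes, and the `from_scratch` helper are all hypothetical, might look like this.)

```python
# Illustrative sketch only: a small decoder model defined from a config,
# in the spirit of "add a config under configs/models and a class under models/".
from transformers import GPT2Config, GPT2LMHeadModel


class MyTinyDecoder(GPT2LMHeadModel):
    """Hypothetical custom decoder; the real codebase wires models up
    through its own config files under configs/models."""

    @classmethod
    def from_scratch(cls, vocab_size: int):
        # Trained from scratch: we reuse only the architecture, not pretrained weights.
        config = GPT2Config(vocab_size=vocab_size, n_layer=6, n_head=8, n_embd=512)
        return cls(config)


if __name__ == "__main__":
    model = MyTinyDecoder.from_scratch(vocab_size=32128)  # example vocabulary size
    print(sum(p.numel() for p in model.parameters()), "parameters")
```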
Since SCAN is already a well-established dataset with its splits publicly available (Lake et al., 2018), we don't generate it from scratch; rather, we use the already available files for it, which contain 16K examples. Yes, one can use those scripts to generate the dataset. P.S. I've added a brief walk-through of the codebase to the readme. You might find it useful: https://github.com/McGill-NLP/length-generalization?tab=readme-ov-file#code-structure
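(For reference, the original SCAN split files from Lake et al. (2018) use plain text lines of the form `IN: jump twice OUT: I_JUMP I_JUMP`, if I remember correctly. A rough reading sketch, with an illustrative file name, could be:)

```python
# Rough sketch for reading an original SCAN split file; the file name below is
# just an example, and the "source"/"target" field names are my own choice.
def read_scan_split(path: str):
    examples = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            # Split each line into the command (source) and action sequence (target).
            src, tgt = line.split(" OUT: ")
            examples.append({"source": src.removeprefix("IN: ").strip(),
                             "target": tgt.strip()})
    return examples


if __name__ == "__main__":
    train = read_scan_split("tasks_train_length.txt")  # hypothetical file name
    print(len(train), "training examples")  # roughly 16K for this split
```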
Hi @kazemnejad, sorry to bother you. For some reason the variable … In the logs I see some messages like the following:
The configs used for the run are the following:
Any ideas what might be going wrong? On another note, in the notebooks I couldn't find the test accuracy of these models; it's only shown based on mean rank. Which variable is that stored to in wandb? Is it …? If I go to the respective experiments directory, I see there's a file named …
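(In case it helps, a quick sketch of how one can inspect what a run actually logged via the wandb public API; the entity/project/run id and the metric key `pred/test_acc` are placeholders, since I don't know the exact key names used here.)

```python
# Sketch: pull logged metrics for a finished run through the wandb public API.
import wandb

api = wandb.Api()
run = api.run("my-entity/my-project/abc123")  # hypothetical run path

# Final scalar values live in the run summary; the key name is an assumption.
print(run.summary["pred/test_acc"])

# The full logged time series can be fetched as a dataframe.
history = run.history(keys=["pred/test_acc"])
print(history.tail())
```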
Hey @kirk86
In our paper, we only focus on sequence-to-sequence tasks. The classification tasks were considered in our early exploration, and that's why you can find their remnants in the code. However, they were left out for the rest of the project. Unfortunately, I don't think they're readily usable at the current stage of the codebase.
We were primarily interested in the per-bucket accuracy of these models. But yes, the …
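(A minimal sketch of what per-bucket accuracy amounts to, assuming each example carries a length-bucket id; how buckets are actually defined is up to the evaluation code, and the toy inputs below are made up.)

```python
# Per-bucket accuracy: group exact-match correctness by bucket id.
from collections import defaultdict


def per_bucket_accuracy(predictions, targets, buckets):
    correct, total = defaultdict(int), defaultdict(int)
    for pred, tgt, bucket in zip(predictions, targets, buckets):
        total[bucket] += 1
        correct[bucket] += int(pred == tgt)
    return {b: correct[b] / total[b] for b in sorted(total)}


print(per_bucket_accuracy(
    predictions=["I_JUMP", "I_WALK I_WALK", "I_RUN"],
    targets=["I_JUMP", "I_WALK I_WALK", "I_LOOK"],
    buckets=[1, 2, 1],
))
# -> {1: 0.5, 2: 1.0}
```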
Thanks for the reply.
In Figure F.5 there's a classification task; did you solve it as seq2seq or as regular classification (I'm assuming the latter)?
For all tasks in the paper, we only consider their seq2seq form.
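(Just to illustrate what "seq2seq form" means for a classification task: the label is verbalized as the target string. The field names below mirror the sample record quoted in the original question ("values"/"answer"), but the exact mapping in the codebase may differ.)

```python
# Sketch: cast a classification example into seq2seq form.
def to_seq2seq(example: dict) -> dict:
    source = " ".join(str(v) for v in example["values"])
    target = str(example["answer"])  # the label becomes the target sequence
    return {"source": source, "target": target}


print(to_seq2seq({"values": [1, 0, 1, 0, 0, 1], "answer": 1, "cat": 6}))
# -> {'source': '1 0 1 0 0 1', 'target': '1'}
```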
Thanks again for your reply.
As explained in Sec. 3, we report the results over three seeds. Please note that as these models are trained from scratch, the …
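(For reference, aggregating over seeds would look roughly like this; the numbers are made up.)

```python
# Sketch: report mean ± std over the three seeds.
import numpy as np

seed_accuracies = [0.81, 0.78, 0.84]  # hypothetical per-seed test accuracies
print(f"{np.mean(seed_accuracies):.3f} ± {np.std(seed_accuracies):.3f}")
```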
Thanks for the explanations,
I understand that the model is trained from scratch, but does that also mean that the T5 tokenizer is trained from scratch for each task as well, or did you simply …
We don't train the tokenizer from scratch; rather, we use the original T5 tokenizer.
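(Concretely, that amounts to loading the pretrained tokenizer while the model weights themselves are initialized from scratch; the checkpoint name below is illustrative.)

```python
# Sketch: reuse the original pretrained T5 tokenizer as-is.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-base")
batch = tokenizer(["jump twice", "walk left"], padding=True, return_tensors="pt")
print(batch["input_ids"].shape)
```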
Hi @kazemnejad,
Thanks for making things reproducible.
If you don't mind me asking a quick question.
In the data I see something like this:

```
{"values": [1, 0, 1, 0, 0, 1], "answer": 1, "cat": 6}
```

What does `cat` represent? Is it a category or something else?

If I remember correctly, somewhere in the paper it was stated that the models used are mostly decoder-type, so to add new models one has to add a config file in `configs/models` and then the actual model in the `models/` directory?

Finally, for some data like `scan` there are only 16K training samples, but for others like the arithmetic tasks there are 100K. Why this difference? Is there any particular reason?

If one wants to generate the data, I suppose it suffices to call `dataset_builders/make_..._dataset.py`, right?
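(A quick sketch of reading records like the one above from a JSONL split file; the file name is a placeholder, and what `cat` means is whatever the dataset builder assigns, which this thread doesn't spell out.)

```python
# Sketch: load JSONL records with "values"/"answer"/"cat" fields.
import json


def load_jsonl(path: str):
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]


for record in load_jsonl("train.jsonl"):  # hypothetical file name
    print(record["values"], record["answer"], record.get("cat"))
```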