run_classifier.py: error: unrecognized arguments: --load_model #56
Comments
Oh sorry, that's a README typo that I haven't changed. |
Thanks! It's working now. Do you have any ideas on how to improve the quality in terms of preprocessing? Is there a minimum or maximum input length? |
I tried running the following script. It runs smoothly, but the result is not accurate for the transformer model: it predicts the same class for all the reviews in binary_sst/test.csv. However, the mLSTM model gives good results. Has anyone else experienced this? Many thanks @raulpuric @Joerg99 |
@dadelani same experience with the pre-trained transformer_sst.clf |
That's rather peculiar; I'll be sure to take a look as soon as I can. |
Thank you @raulpuric. (See sentiment-discovery/generate.py, line 70 at fa52435.)
Here's what invoked the error:

```
michalmucha$ python -i generate.py --load_model pretrained/mlstm.pt
Creating mlstm
Traceback (most recent call last):
  File "generate.py", line 92, in <module>
    model.load_state_dict(sd)
  File "/Users/michalmucha/Python/reddit/sentiment-discovery/model/model.py", line 56, in load_state_dict
    self.decoder.load_state_dict(state_dict['decoder'], strict=strict)
  File "/Developer/anaconda3/envs/nlp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 769, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for Linear:
	size mismatch for weight: copying a param with shape torch.Size([257, 4096]) from checkpoint, the shape in current model is torch.Size([256, 4096]).
	size mismatch for bias: copying a param with shape torch.Size([257]) from checkpoint, the shape in current model is torch.Size([256]).
```

Unfortunately, after the change the model generates a sequence of random chars. Thanks for publishing this work |
Ahhh, that makes sense. The mLSTM was trained with an older version of the tokenizer; I suspect that all these inaccuracy issues are due to the model vocab mismatching the tokenizer by 1 position. |
OK, to fix your random-string problem: I think it's because the vocab size is 257, but Python expects characters to be 0-255 for chr and ord. To fix this you could manually increment/decrement tokens by 1 where appropriate. Alternatively, you can use our CharacterLevelTokenizer class, which should handle this for you automatically. |
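The ±1 shift described above can be sketched as follows. This is a minimal illustration with hypothetical helper names (`encode`/`decode`), not the repo's CharacterLevelTokenizer; the offset direction (ord +1 on input, chr -1 on output) is taken from the fix reported later in this thread and is otherwise an assumption:

```python
# Hypothetical sketch: the checkpoint's vocab has 257 entries, so byte b
# maps to token b + 1, while Python's chr/ord work in the 0-255 range.

def encode(text):
    """Map each character to its assumed model token id (byte value + 1)."""
    return [ord(c) + 1 for c in text]

def decode(token_ids):
    """Map model token ids back to characters (token id - 1)."""
    return "".join(chr(t - 1) for t in token_ids)
```

Round-tripping text through these two helpers should be lossless, which is a quick way to check the shift is applied consistently.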
@MichaMucha @dadelani I think that's a lack of documentation on my part. I think it's because you're not running with sentencepiece tokenization as in https://github.com/NVIDIA/sentiment-discovery#training-language-models--distributedfp16-training. Try adding these arguments: … Let me know if this works and I'll add it to the readme. |
@raulpuric thanks for looking into it. Unfortunately the binary classifier transformer is still not serving expected results. Here is an example invocation (basically a delicious copy-pasta of your suggestion):

```
python run_classifier.py \
  --load pretrained/transformer_sst.clf \
  --data test_comments.csv \
  --model "transformer" \
  --write-results wtf4.csv \
  --text-key "text" \
  --tokenizer-type SentencePieceTokenizer \
  --vocab-size 32000 \
  --tokenizer-path pretrained/ama_32k_tokenizer/ama_32k_tokenizer.model \
  --decoder-layers 12 \
  --decoder-embed-dim 768 \
  --decoder-ffn-embed-dim 3072 \
  --decoder-learned-pos \
  --decoder-attention-heads 8
```

Output with two obvious sentences I just noted down to verify easily:
For the record, the result is different than previously, and it also runs way quicker. I think I got lost in the number of argparse arguments and didn't notice that it would spend time training the tokenizer if I hadn't provided one. Will try the character ord bump. |
Character bump worked: ord +1, chr -1, and it goes on reviewing things :) |
@raulpuric, thanks for your response. @MichaMucha, which part of the code did you modify to prevent the model from generating random characters? I also had this issue but could not fix it; I'm currently using the older version of the code published over a year ago: https://github.com/Athenagoras/sentiment-discovery. It was trained on mLSTM, and the generate function (visualize.py) works well, with a few PyTorch compatibility issues that can be easily fixed. Also, did you generate the text using the mLSTM or the transformer pre-trained model? |
Hi @dadelani, I went to … then there are three uses of … Hope this helps. I saw you're using … |
Sorry the code has gotten so out of sync, we tried to incorporate our latest work with the old codebase. |
Thanks a lot @MichaMucha, the generate.py code produces good sentences after the character modification for the mLSTM pretrained model, but the transformer is still having issues. Did you try generating texts using the transformer pretrained model? Does the generate.py code support transformer.pt? @raulpuric |
Nope, only the LSTM at the moment. If you'd like to generate with transformers it will take some modifications, or you can try to use the huggingface evaluation code for GPT-2. |
I see, thanks for the suggestion @raulpuric. Thanks for releasing the code & models for sentiment-discovery |
Here is the notebook. Great work! I can imagine how big the effort is to maintain it. Feel free to modify. |
After spending several hours reading the code, I finally found the reason why the sentiment classifier always gives a probability around 0.5. Just add an extra argument … |
Hello, I want to know what the files generated after the program runs mean. Three files are produced: clf_results.npy, clf_results.npy.std.npy, and clf_results.npy.prob.npy. I want to know how to convert them into sentiment. |
@wangzyi54 you should add … |
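Assuming clf_results.npy.prob.npy holds per-example positive-class probabilities (an assumption; the thread never spells out the file layout), a minimal sketch of turning those probabilities into sentiment labels:

```python
def probs_to_sentiment(probs, threshold=0.5):
    """Map per-example positive-class probabilities to sentiment labels.

    `probs` is any iterable of floats, e.g. the array loaded from the
    clf_results.npy.prob.npy file mentioned above (assumed layout).
    """
    return ["positive" if p > threshold else "negative" for p in probs]

# Hypothetical usage with the files the script writes:
# import numpy as np
# probs = np.load("clf_results.npy.prob.npy")
# labels = probs_to_sentiment(probs)
```

The 0.5 threshold mirrors the "probability around 0.5" behaviour discussed earlier; adjust it if your use case needs a different operating point.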
@zhaochaocs do you have any explanation for this? |
@imomayiz hi, do you encounter the problem that … |
@ArronChan what command is this? |
@imomayiz |
@imomayiz any idea?? OAO |
@ArronChan you need the right version of PyTorch. Refer to issue 63: |
@tderrmann Yeah, I installed torch==1.0.1 and torchvision==0.2.2 and finally it works, thank you so much! But its accuracy seems bad. Does each row of the result, from left to right, represent anger, anticipation, disgust, fear, joy, sadness, surprise, trust? |
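If the eight output columns do follow the SemEval-style order listed in the question above (an assumption the thread never explicitly confirms), mapping one result row to emotion names could be sketched as:

```python
# Assumed column order, taken from the question above; verify against the
# training data before relying on it.
EMOTIONS = ["anger", "anticipation", "disgust", "fear",
            "joy", "sadness", "surprise", "trust"]

def row_to_emotions(row, threshold=0.5):
    """Return the names of emotions whose probability exceeds the threshold
    for one row of eight per-emotion scores."""
    return [name for name, p in zip(EMOTIONS, row) if p > threshold]
```

For example, a row with high scores only in the first and fifth positions would map to anger and joy.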
@ArronChan make sure to use the tokenizer file from google drive with the pretrained model |
@tderrmann How should I use that? (pytorch041) C:\Users\Arron\Desktop\sentiment-discovery>Traceback (most recent call last): |
The results are much better with the tokenizer model, so it's better to use something like this: |
@pandeconscious I am not sure how you evaluated the performance of the model. I also tried adding the tokenizer; after evaluating on data/semeval/val.csv, the results are still terrible with respect to balanced accuracy and F1 score for each emotion category: only slightly better than random, and much worse than the claimed results. |
@YipengUva Can you please share the exact command you are running and also the F1 scores. On data/semeval/val.csv, the F1 scores that I get are the following:
|
@pandeconscious Thanks very much for your reply. The F1 score you got is really good for such a difficult task. The command I used is:

```
python3 run_classifier.py \
  --load pretrained_downloads/transformer_semeval.clf \
  --text-key Tweet \
  --data data/semeval/val.csv \
  --model transformer \
  --write-results results/semeval/val_result.csv \
  --tokenizer-type SentencePieceTokenizer \
  --vocab-size 32000 \
  --tokenizer-path pretrained_downloads/ama_32k_tokenizer.model
```

The performance I got is as follows.
Thanks a lot. |
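For anyone reproducing the F1 comparison discussed above, per-category binary F1 can be computed with a small helper like this (a generic sketch, not code from the repo; apply it to each emotion column of the true and predicted label arrays separately):

```python
def f1(y_true, y_pred):
    """Binary F1 score from two parallel 0/1 label lists for one category."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision or recall is zero, so F1 is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Comparing these per-category scores against a majority-class or random baseline is the quickest way to confirm whether the model is really doing "only slightly better than random".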
Getting this error when I run this: `init MultiLayerBinaryClassifier with layers [4096, 2048, 1024, 8] and dropout 0.3` … Can someone please help? |
@YipengUva The command seems to be correct. Please DM me; I'll be happy to sit with you for some time to look into the issue if you are still facing it. |
@saum7800 Which torch version are you using? If I remember correctly, this issue was discussed earlier. Just uninstall the current version and reinstall with … |
Thanks a lot @pandeconscious. That was indeed the problem. The right versions to use are torch==1.0.1 and torchvision==0.2.2, as mentioned here. Thanks once again. |
I'm using torch==1.6.0 and torchvision==0.7.0 and got it to work with the following changes to [loaders.py](https://github.com/NVIDIA/sentiment-discovery/blob/master/data_utils/loaders.py):

```python
class DataLoader(data.DataLoader):
    ...
        self.dataset = dataset
        # added lines
        self._dataset_kind = 1
        self._IterableDataset_len_called = len(self.dataset)
        self.generator = None
        self.multiprocessing_context = None
    ...
```
|
Thanks very much, Edison. I will give a try.
Best regards, Yipeng
…On Sep 8, 2020, at 12:10 am, Edison Chee wrote:
I'm using torch==1.6.0 and torchvision==0.7.0 and got it to work with the following changes to [loaders.py](https://github.com/NVIDIA/sentiment-discovery/blob/master/data_utils/loaders.py):

```python
class DataLoader(data.DataLoader):
    ...
        self.dataset = dataset
        # added lines
        self._dataset_kind = 1
        self._IterableDataset_len_called = len(self.dataset)
        self.generator = None
        self.multiprocessing_context = None
    ...
```
|
I am not sure if you have tested the performance. My previous results, and the results after trying your method, are not that good.
Regards, Yipeng
|
I'd like to classify my data with a pretrained model.
I followed the instructions on the readme page and tried to run one of these commands:
```
python3 run_classifier.py --load_model ama_sst.pt                               # classify Binary SST
python3 run_classifier.py --load_model ama_sst_16.pt --fp16                     # run classification in fp16
python3 run_classifier.py --load_model ama_sst.pt --text-key --data <path.csv>  # classify your own dataset
```
But it caused this error: `run_classifier.py: error: unrecognized arguments: --load_model`
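Per the maintainer's reply at the top of this thread, `--load_model` was a README typo; the script's actual flag is `--load` (as used in the working invocations later in the thread). A corrected form of the commands above would look like this, where `<key>` and `<path.csv>` are placeholders for your own text column name and dataset path:

```shell
python3 run_classifier.py --load ama_sst.pt                               # classify Binary SST
python3 run_classifier.py --load ama_sst_16.pt --fp16                     # run classification in fp16
python3 run_classifier.py --load ama_sst.pt --text-key <key> --data <path.csv>  # classify your own dataset
```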