Has anyone reproduced the result? #13

Open
qiaopt opened this issue Jan 10, 2019 · 8 comments

Comments

@qiaopt

qiaopt commented Jan 10, 2019

I have tried to reproduce the result, but got a QWK much lower than the one reported in the paper. Here is my log for prompt 1, fold_0:

log.txt

Did I do something wrong?

@DamonCC

DamonCC commented Sep 5, 2019

I also got a similar result, with Kappa scores ranging from 0.5 to 0.6. I used fold 0 and prompt 1, with all other parameters left at their default values.

@ghost

ghost commented Sep 14, 2019

How did you get the final result? From the source code, it looks like the author runs 50 epochs on one fold and takes the final dev and test scores. Should I run 50 epochs separately on each fold and average the test results to get the final experimental result?
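One possible reading of the protocol asked about here, offered as an assumption and not as the authors' confirmed procedure: train each fold separately, take the test QWK at the epoch with the best dev QWK, and average those test scores over the folds. The sketch below shows a plain-Python quadratic weighted kappa plus that selection and averaging step. The function names (`quadratic_weighted_kappa`, `select_test_score`, `mean_test_qwk`) are hypothetical and do not come from the repo.

```python
def quadratic_weighted_kappa(y_true, y_pred, min_rating, max_rating):
    """QWK between two lists of integer ratings in [min_rating, max_rating]."""
    n = max_rating - min_rating + 1
    if n < 2:
        raise ValueError("need at least two rating levels")
    # Confusion matrix of observed (true, predicted) rating pairs.
    observed = [[0] * n for _ in range(n)]
    for t, p in zip(y_true, y_pred):
        observed[t - min_rating][p - min_rating] += 1
    total = len(y_true)
    hist_true = [sum(row) for row in observed]
    hist_pred = [sum(observed[i][j] for i in range(n)) for j in range(n)]
    num = 0.0
    den = 0.0
    for i in range(n):
        for j in range(n):
            weight = (i - j) ** 2 / (n - 1) ** 2
            expected = hist_true[i] * hist_pred[j] / total
            num += weight * observed[i][j]
            den += weight * expected
    if den == 0.0:
        # Both raters used a single identical rating; treat as full agreement.
        return 1.0
    return 1.0 - num / den


def select_test_score(dev_qwks, test_qwks):
    """Return the test QWK at the epoch whose dev QWK is highest."""
    best_epoch = max(range(len(dev_qwks)), key=dev_qwks.__getitem__)
    return test_qwks[best_epoch]


def mean_test_qwk(folds):
    """Average the selected test QWK over folds.

    `folds` is a list of (dev_qwks, test_qwks) pairs, one pair per fold,
    each holding one score per epoch.
    """
    scores = [select_test_score(dev, test) for dev, test in folds]
    return sum(scores) / len(scores)
```

Under this assumed protocol, a single run on fold_0 yields only one of the per-fold scores, so it is not directly comparable to a number averaged over all folds.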

@jkdufair

I am also trying to replicate the results and getting similar outcomes. I've also tried varying the seed as described in the paper, to no avail.

@jkdufair

I was able to replicate the results, with QWKs in the 0.8 range. You'll need to use the embeddings file, as described here. Additionally, in the file downloaded from the link in the FAQ, the embedding values are separated by commas, but this repo expects them to be separated by spaces. I converted the file with:

sed -ri ':a;s/(\ [^,]*),/\1 /;ta' embeddings.w2v.txt

@kavehtp Perhaps you want to update the FAQ to reflect this?

Thanks for making this repo available!
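For anyone without GNU sed, the comma-to-space conversion described above can be sketched in Python. This is a minimal equivalent under the assumption that each line holds a word, a space, and then comma-separated values. Like the sed expression, it only replaces commas that come after the first space, so a comma inside the word token itself is left alone. The function names and the output path are hypothetical.

```python
def commas_to_spaces(line):
    """Replace every comma after the first space with a space."""
    head, sep, tail = line.partition(" ")
    return head + sep + tail.replace(",", " ")


def convert_file(src_path, dst_path):
    """Write a space-separated copy of a comma-separated embeddings file."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(commas_to_spaces(line))
```

A call such as `convert_file("embeddings.w2v.txt", "embeddings.spaces.txt")` writes to a new file instead of editing in place, so the original download stays available for comparison.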

@jkdufair

Also, my understanding from the paper is that the best results used a combination of CNN & RNN (LSTM). When I was able to replicate, I passed --cnndim 50 as well. I do not believe CNN is defaulted in parameters.

@nahos

nahos commented Mar 11, 2020

What versions of Python, Theano, Keras, and TensorFlow did you use? I am facing issues with TensorFlow.

@NNNNNaaaaaa

NNNNNaaaaaa commented Aug 10, 2020

Also, my understanding from the paper is that the best results used a combination of CNN & RNN (LSTM). When I was able to replicate, I passed --cnndim 50 as well. I do not believe CNN is defaulted in parameters.

Even with --cnndim 50 and the embeddings file, my highest QWK is still 0.556 for prompt 1, fold_0. Did you use any other parameters? And how did you handle words tagged "<unk>", "<num>", and "<pad>"? Thank you so much!

@philiphaddad97

Did anyone get the same result as mentioned in the paper?
