Adding new embeddings to a trained model. #58
This is the image of the full error: https://imgur.com/a/4v8VoSJ
Hi @S4ltedF1sh, the model is loaded here: emnlp2017-bilstm-cnn-crf/neuralnets/BiLSTM.py, line 611 in b709f58
What you need to call on your bilstm-models is this function:
It is important to call it before the buildModel method is invoked. Best,
Hi @nreimers,
If I understand correctly, the buildModel method is called only once, before training starts, and is not invoked when a trained model is loaded. So where should I call the setmapping method when I load my trained model? Or is it only possible to add more embeddings before training? I checked the code, and there is a cap on the maximum number of features, which I assume corresponds to the index of the token in the embedding file (line 105, BiLSTM.py, buildModel function):
Because of input_dim=self.embeddings.shape[0], I think the layer is capped at the current size of the embedding file, so no more embeddings can be added after training. Is that right? Many thanks in advance,
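The cap described here can be illustrated without Keras: an embedding layer behaves like a lookup table with input_dim rows, so an index at or beyond the row count has no vector to return. A minimal sketch in plain Python, assuming a hypothetical `lookup` helper and toy sizes (4827 rows mirrors the range in the error message from this thread; the dimension of 3 is arbitrary):

```python
# Toy stand-in for an embedding layer: a table with a fixed number of rows.
# `lookup` is a hypothetical helper, not part of the repository or of Keras.
NUM_ROWS = 4827   # plays the role of input_dim = embeddings.shape[0]
DIM = 3           # arbitrary embedding size for this sketch

table = [[0.0] * DIM for _ in range(NUM_ROWS)]


def lookup(table, index):
    """Return the row for `index`, or raise if it lies outside the table."""
    if not 0 <= index < len(table):
        raise IndexError(f"index {index} is not in [0, {len(table)})")
    return table[index]


last_row = lookup(table, 4826)   # last valid row

try:
    lookup(table, 4837)          # index of a token added after the table was sized
    out_of_range = False
    message = ""
except IndexError as err:
    out_of_range = True
    message = str(err)
```

Adding rows to the embedding file alone does not change the size of a table that was already built, which matches the behaviour reported in this thread.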
Hi @S4ltedF1sh. However, it is important that you add these new embeddings before tokens = Embedding(...) is invoked. The buildModel method is invoked when training or inference is started. Best,
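One way to read this advice: append the new vectors to the embedding matrix and register the new tokens in the token-to-index mapping first, and only then construct the embedding layer, so that its input dimension reflects the enlarged matrix. The sketch below uses plain Python lists and a hypothetical `extend_embeddings` helper; the names `embeddings`, `word2idx`, and the sample tokens are illustrative assumptions, not the repository's API.

```python
def extend_embeddings(embeddings, word2idx, new_vectors):
    """Append vectors for unseen tokens and return updated copies.

    embeddings:  list of rows (one list of floats per known token)
    word2idx:    dict mapping token -> row index in `embeddings`
    new_vectors: dict mapping token -> vector for the tokens to add
    """
    dim = len(embeddings[0])
    embeddings = [row[:] for row in embeddings]   # do not mutate the inputs
    word2idx = dict(word2idx)
    for token, vector in new_vectors.items():
        if token in word2idx:
            continue                              # keep the already trained row
        if len(vector) != dim:
            raise ValueError(
                f"{token!r}: expected {dim} values, got {len(vector)}"
            )
        word2idx[token] = len(embeddings)         # next free row index
        embeddings.append(list(vector))
    return embeddings, word2idx


# Illustrative data: two known rows, one new poem line to add.
old_embeddings = [[0.0, 0.0], [0.1, 0.2]]
old_word2idx = {"<pad>": 0, "line_a": 1}

new_embeddings, new_word2idx = extend_embeddings(
    old_embeddings, old_word2idx, {"line_b": [0.3, 0.4]}
)
# An embedding layer built afterwards would need len(new_embeddings) as its
# input dimension; otherwise the new index is out of range again.
```

Existing rows keep their indices in this sketch, so anything learned for them stays aligned; only the new rows are appended at the end.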
Hi, I'm currently using this model for sentiment analysis of poems. I trained the model on a set of poems, treating each line as a token, so each line has its own embedding in the embedding file. The problem is that after training, I want to use the model on unseen poems whose embeddings are not in the embedding file. When I added their embeddings to the embedding file and ran the model, it returned this error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[0,4] = 4837 is not in [0, 4827) [[Node: word_embeddings/embedding_lookup = GatherV2[Taxis=DT_INT32, Tindices=DT_INT32, Tparams=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](word_embeddings/embeddings/read, _arg_words_input_0_0, word_embeddings/embedding_lookup/axis)]]
From this I assume that after training, the model's embedding size is fixed and no further embeddings can be added. So how can I add new embeddings to the model, or how can I use the model to predict on unseen poems?