Train a new model without using the pretrained model weights #19

Open
ishita1995 opened this issue Dec 7, 2018 · 5 comments

Comments

@ishita1995

I am trying to train a new model using the torchmoji architecture without loading the pre-trained weights. In the torchmoji_transfer function I instantiate the TorchMoji class but skip loading the weights. The model output comes out as NaN, so the loss is NaN as well.
Can you please help me understand where I am going wrong? I am pretty new to deep learning. Sorry in advance for the inconvenience.

@thomwolf
Member

thomwolf commented Dec 7, 2018

Hi, can you post a simple, self-contained code example that shows what is not working?

@ishita1995
Author

from __future__ import print_function
import example_helper
import json
import torch
from torchmoji.model_def import torchmoji_transfer
from torchmoji.global_variables import PRETRAINED_PATH, VOCAB_PATH, ROOT_PATH
from torchmoji.finetuning import (
     load_benchmark,
     finetune)


DATASET_PATH = '{}/data/emotion_data/raw.pickle'.format(ROOT_PATH)
nb_classes = 4

with open(VOCAB_PATH, 'r') as f:
    vocab = json.load(f)

# Load dataset.
data = load_benchmark(DATASET_PATH, vocab)

# Set up the model and finetune.
# None is passed as the weight path, so no pretrained weights are loaded.
model = torchmoji_transfer(nb_classes, None, extend_embedding=1412)
print(model)
model, acc = finetune(model, data['texts'], data['labels'], nb_classes, data['batch_size'], method='chain-thaw')
print('Acc: {}'.format(acc))

In this example, if I don't load the pretrained weights and just instantiate the TorchMoji class, the output comes out as NaN, even when I initialize the weights in __init_weights(). What am I doing wrong here?
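
One way to narrow down a NaN like this is to find the first submodule whose output contains NaN during a single forward pass. The snippet below is a minimal sketch of that idea using PyTorch forward hooks; it is not part of torchMoji. The helper name find_nan_modules is hypothetical, and the small nn.Sequential model with deliberately corrupted weights only stands in for the real model, whose inputs and layer names will differ.

```python
import torch
import torch.nn as nn


def find_nan_modules(model, *inputs):
    """Run one forward pass and return the names of submodules whose
    output contains NaN, in the order in which they finished running."""
    bad = []
    hooks = []

    def make_hook(name):
        def hook(module, args, output):
            # Outputs may be tensors or nested tuples/lists
            # (for example LSTM outputs or packed sequences).
            stack = [output]
            while stack:
                item = stack.pop()
                if isinstance(item, torch.Tensor):
                    if item.is_floating_point() and torch.isnan(item).any():
                        bad.append(name)
                        return
                elif isinstance(item, (tuple, list)):
                    stack.extend(item)
        return hook

    for name, module in model.named_modules():
        if name:  # skip the root module itself
            hooks.append(module.register_forward_hook(make_hook(name)))
    try:
        with torch.no_grad():
            model(*inputs)
    finally:
        # Always remove the hooks so the model is left unchanged.
        for handle in hooks:
            handle.remove()
    return bad


# Toy stand-in model (an assumption for illustration, not torchMoji).
toy = nn.Sequential(nn.Linear(3, 4), nn.Tanh(), nn.Linear(4, 2))
with torch.no_grad():
    toy[0].weight.fill_(float("nan"))  # simulate a bad initialization

bad_names = find_nan_modules(toy, torch.ones(1, 3))
print(bad_names)
```

The first name in the returned list is the earliest layer that produced NaN. If that layer's parameters are already NaN before any training step, the initialization is the likely suspect; if they are finite, the inputs to that layer are worth checking next.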

@graykode

Hello, I am also trying to train torchmoji without the pre-trained model, but my problem is that the loss does not decrease well.

@ishita1995
Author

@graykode I ultimately wrote my own script with the same architecture to replicate the results.

@DanielJuravski

Hi @ishita1995, can you share your implementation?
