A recurrent sequence-to-sequence, multi-domain generative conversational chatbot implemented in PyTorch
Overview:
- Handle loading and preprocessing of the Movie-Dialogs Corpus dataset
- Implement a sequence-to-sequence model with Luong attention mechanisms (scoring variants sketched just after this list)
- Jointly train the encoder and decoder models using mini-batches
- Implement a greedy-search decoding module (see the greedy decoding sketch below)
- Interact with the trained chatbot
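As a rough illustration of the Luong attention scoring variants (dot, general, and concat), here is a minimal sketch; the `Attn` class name and the exact tensor shapes are illustrative assumptions, not the repo's actual code:

```python
import torch
import torch.nn as nn

class Attn(nn.Module):
    """Luong attention (sketch): scores each encoder output against the
    current decoder hidden state, then softmaxes into attention weights."""
    def __init__(self, method, hidden_size):
        super().__init__()
        self.method = method  # one of "dot", "general", "concat"
        if method == "general":
            self.attn = nn.Linear(hidden_size, hidden_size)
        elif method == "concat":
            self.attn = nn.Linear(hidden_size * 2, hidden_size)
            self.v = nn.Parameter(torch.rand(hidden_size))

    def forward(self, hidden, encoder_outputs):
        # hidden: (1, batch, hidden); encoder_outputs: (seq_len, batch, hidden)
        if self.method == "dot":
            energy = torch.sum(hidden * encoder_outputs, dim=2)
        elif self.method == "general":
            energy = torch.sum(hidden * self.attn(encoder_outputs), dim=2)
        else:  # concat
            expanded = hidden.expand(encoder_outputs.size(0), -1, -1)
            energy = torch.sum(
                self.v * torch.tanh(
                    self.attn(torch.cat((expanded, encoder_outputs), dim=2))),
                dim=2,
            )
        # (seq_len, batch) -> (batch, 1, seq_len) attention weights
        return torch.softmax(energy.t(), dim=1).unsqueeze(1)
```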
The brain of the chatbot is a sequence-to-sequence (seq2seq) model, which takes a variable-length sequence as input and returns a variable-length sequence as output using a fixed-size model. We accomplish this by using two separate recurrent neural networks together. One RNN acts as an encoder, which encodes a variable-length input sequence into a fixed-length context vector. In theory, this context vector (the final hidden state of the RNN) contains semantic information about the query sentence that is input to the bot. The second RNN is a decoder, which takes an input word and the context vector and returns a guess for the next word in the sequence, along with a hidden state to use in the next iteration.
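The following is a minimal sketch of that encoder/decoder pairing plus a greedy-search decoding loop; the module names (`EncoderRNN`, `DecoderRNN`, `greedy_decode`) and details such as single-layer GRUs are assumptions for illustration, not the repo's exact implementation:

```python
import torch
import torch.nn as nn

class EncoderRNN(nn.Module):
    """Encodes a variable-length input sequence into per-step outputs
    plus a final hidden state (the context vector)."""
    def __init__(self, vocab_size, hidden_size):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size)

    def forward(self, input_seq, hidden=None):
        # input_seq: (seq_len, batch) of token ids
        embedded = self.embedding(input_seq)
        outputs, hidden = self.gru(embedded, hidden)
        return outputs, hidden  # hidden acts as the context vector

class DecoderRNN(nn.Module):
    """One decoding step: takes the previous word and hidden state,
    returns a distribution over the next word and the new hidden state."""
    def __init__(self, vocab_size, hidden_size):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, input_step, hidden):
        # input_step: (1, batch) -- a single time step
        embedded = self.embedding(input_step)
        output, hidden = self.gru(embedded, hidden)
        return torch.softmax(self.out(output.squeeze(0)), dim=1), hidden

def greedy_decode(encoder, decoder, input_seq, sos_token, eos_token, max_len=10):
    """Greedy search: at each step, feed the most likely word back in."""
    _, hidden = encoder(input_seq)
    input_step = torch.full((1, 1), sos_token, dtype=torch.long)
    tokens = []
    for _ in range(max_len):
        probs, hidden = decoder(input_step, hidden)
        _, next_word = probs.topk(1)  # pick the single most likely word
        if next_word.item() == eos_token:
            break
        tokens.append(next_word.item())
        input_step = next_word.view(1, 1)
    return tokens
```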
Example Output:
> hello?
Bot: hello.
> where am I?
Bot: you re in a hospital.
> who are you?
Bot: i m a lawyer.
> how are you doing?
Bot: i m fine.
> are you my friend?
Bot: no.
> you're under arrest
Bot: i m trying to help you !
> i'm just kidding
Bot: i m sorry.
> where are you from?
Bot: san francisco.
> it's time for me to leave
Bot: i know.
> goodbye
Bot: goodbye.
For a detailed explanation, go through the repo in the following order:
- Preparing the Data
- Defining the Models
- Training & Evaluation
This project implements ideas from the following papers:
- Sutskever et al., 2014. Sequence to Sequence Learning with Neural Networks.
- Luong et al., 2015. Effective Approaches to Attention-based Neural Machine Translation.
Acknowledgements: