Research paper link: https://arxiv.org/pdf/2009.05451.pdf
Paper Title: A Comparison of LSTM and BERT for Small Corpus
Given a small dataset, can we use a large pre-trained model like BERT and get better results than with simpler models?
-
Results show that a bidirectional LSTM model can achieve significantly higher accuracy than a BERT model on a small dataset.
-
A model's performance depends on the task and the data, so these factors should be taken into consideration before choosing a model.
An intent classification dataset is used. It has 150 intent classes with 100 training observations per class. For each intent, 20 validation and 30 test queries are provided. There are also out-of-scope queries that do not fall under any of the 150 intent classes.
| Dataset | Number of Utterances |
|---|---|
| Training | 15,101 |
| Validation | 3,101 |
| Test | 5,501 |
The models were also tested on smaller versions of the data, created by randomly splitting off X percent of it, where X = {25, 40, 50, 60, 70, 80, 90}.
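A minimal sketch of how such percentage subsets could be produced with scikit-learn. The stratification (keeping each intent's share constant) and all function and variable names here are assumptions, since the notes only say the split was random:

```python
from sklearn.model_selection import train_test_split

def take_percent(texts, labels, pct, seed=42):
    """Return a random subset containing pct% of the data, stratified by label."""
    subset_texts, _, subset_labels, _ = train_test_split(
        texts, labels,
        train_size=pct / 100.0,   # keep pct% of the observations
        stratify=labels,          # preserve the per-intent class balance (assumption)
        random_state=seed,
    )
    return subset_texts, subset_labels

# Toy stand-in for the real utterances/intents, just to make the sketch runnable.
texts = [f"utterance {i}" for i in range(1000)]
labels = [i % 10 for i in range(1000)]  # 10 dummy intent classes

for x in (25, 40, 50, 60, 70, 80, 90):
    sub_texts, sub_labels = take_percent(texts, labels, x)
    print(x, len(sub_texts))
```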
The number of out-of-scope utterances is very low compared to in-scope utterances, and the number of classes is high compared to the number of observations available for each intent. Only 1,200 out-of-scope utterances exist in the dataset: 100 in the training set, 100 in the validation set, and the remaining 1,000 in the test set.
-
In-scope accuracy
: is calculated over in-scope utterances only; out-of-scope utterances are removed before the accuracy metric is computed.
Overall accuracy
: is computed over all utterances, including out-of-scope ones; it is a better measure of overall model performance because it does not remove the challenging utterances from the result, and is therefore more realistic.
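A small sketch of the two metrics, assuming out-of-scope utterances carry a dedicated gold label; the label name `out_of_scope` and the toy intent names are placeholders, not necessarily the dataset's:

```python
import numpy as np

OOS = "out_of_scope"  # hypothetical label for out-of-scope queries

def overall_accuracy(y_true, y_pred):
    """Accuracy over all utterances, out-of-scope ones included."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return (y_true == y_pred).mean()

def in_scope_accuracy(y_true, y_pred):
    """Accuracy after dropping utterances whose gold label is out-of-scope."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    mask = y_true != OOS
    return (y_true[mask] == y_pred[mask]).mean()

y_true = ["book_flight", "weather", OOS, "weather"]
y_pred = ["book_flight", "weather", "weather", OOS]
print(in_scope_accuracy(y_true, y_pred))  # 2/3 ≈ 0.667
print(overall_accuracy(y_true, y_pred))   # 2/4 = 0.5
```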
In-scope accuracy results are higher than overall accuracy results.
The simplest LSTM model (1 bidirectional layer + 1 unidirectional layer) performed the best in terms of both overall accuracy and in-scope accuracy.
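A minimal Keras sketch of such an architecture; the embedding size, layer widths, vocabulary size, sequence length, and treating out-of-scope as a 151st class are all assumptions, not the paper's exact configuration:

```python
import tensorflow as tf

VOCAB_SIZE = 10_000   # assumed vocabulary size (not stated in these notes)
MAX_LEN = 30          # assumed maximum utterance length in tokens
NUM_CLASSES = 151     # 150 intents + 1 out-of-scope class (an assumption)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(MAX_LEN,)),
    tf.keras.layers.Embedding(VOCAB_SIZE, 128),
    # One bidirectional LSTM feeding one unidirectional LSTM, mirroring the
    # "1 bidirectional + 1 unidirectional" description above.
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64, return_sequences=True)),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```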
-
On the test set, the in-scope accuracy of the LSTM model was 69.65% and its overall accuracy was 70.08%, whereas the BERT model achieved 67.15% accuracy.