Unraveling the Dominance of Large Language Models Over Transformer Models for Bangla Natural Language Inference: A Comprehensive Study

Abstract

Natural Language Inference (NLI) plays a critical role in Natural Language Processing (NLP) as it uncovers entailment relationships between text pairings, crucial for Natural Language Understanding (NLU). NLI evaluates entailment between a premise and a hypothesis, where "entailment" denotes logical implication, "contradiction" indicates disagreement, and "neutral" suggests inadequate evidence. Despite the success of Large Language Models (LLMs), challenges persist in NLI, particularly in low-resource domains, model overconfidence, and capturing human judgment. This work investigates the efficacy of LLMs in low-resource languages like Bengali by comparing notable LLMs with state-of-the-art (SOTA) models using the XNLI dataset. Evaluations encompass zero-shot and few-shot scenarios, contrasting GPT-3.5 Turbo and Gemini 1.5 Pro against BanglaBERT, Bangla BERT Base, DistilBERT, mBERT, and SahajBERT. While LLMs exhibit promise, especially in few-shot settings, further research is warranted for a comprehensive understanding, particularly in languages like Bengali. This underscores the significance of continued exploration into LLM capabilities across diverse language contexts. Our implementations are now publicly accessible at https://github.com/fatemafaria142/Large-Language-Models-Over-Transformer-Models-for-Bangla-NLI.

Proposed Methodology

Dataset Availability

In this research paper, we have employed the XNLI dataset, which comprises Bengali-language instances for NLI. The dataset consists of a subset of the MultiNLI data, where each instance includes a premise (sentence1), a hypothesis (sentence2), and a classification label indicating contradiction (0), entailment (1), or neutral (2).

We utilized the following number of instances from each set:

Train set: 381,449 instances
Validation set: 2,419 instances
Test set: 4,895 instances

The dataset can be accessed from here.

Results

Comparative Analysis of PLMs for Different Performance Metrics

Model	Accuracy	Precision	Recall	F1-Score
BanglaBERT	0.8204	0.8222	0.8204	0.8203
Bangla BERT Base	0.6803	0.6907	0.6812	0.6833
DistilBERT	0.6320	0.6358	0.6320	0.6317
mBERT	0.6427	0.6496	0.6428	0.6153
sahajBERT	0.6708	0.6791	0.6709	0.6707

Comparative Analysis of LLMs for Different Performance Metrics

LLMs	Metric	Zero-shot	5-shot	10-shot	15-shot
GPT-3.5 Turbo	Accuracy	0.8503	0.8657	0.8756	0.9205
	Precision	0.7025	0.8683	0.8753	0.9219
	Recall	1.0	0.8624	0.8759	0.9204
	F1-Score	0.8254	0.8640	0.8748	0.9299
Gemini 1.5 Pro	Accuracy	0.7287	0.8625	0.8732	0.9146
	Precision	0.5556	0.8652	0.8763	0.9156
	Recall	0.7143	0.8652	0.8732	0.9146
	F1-Score	0.6251	0.8652	0.8701	0.9136

Contact Information

For any questions, collaboration opportunities, or further inquiries, please feel free to reach out:

Fatema Tuj Johora Faria
- Email: [email protected]
Mukaffi Bin Moin
- Email: [email protected]
Pronay Debnath
- Email: [email protected]
Asif Iftekher Fahim
- Email: [email protected]

Citation

@misc{faria2024unraveling,
      title={Unraveling the Dominance of Large Language Models Over Transformer Models for Bangla Natural Language Inference: A Comprehensive Study}, 
      author={Fatema Tuj Johora Faria and Mukaffi Bin Moin and Asif Iftekher Fahim and Pronay Debnath and Faisal Muhammad Shah},
      year={2024},
      eprint={2405.02937},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
Codes		Codes
LICENSE		LICENSE
NLI_with_page-0001.jpg		NLI_with_page-0001.jpg
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Unraveling the Dominance of Large Language Models Over Transformer Models for Bangla Natural Language Inference: A Comprehensive Study

Abstract

Table of Contents

Proposed Methodology

Dataset Availability

Results

Comparative Analysis of PLMs for Different Performance Metrics

Comparative Analysis of LLMs for Different Performance Metrics

Contact Information

Citation

About

Releases

Packages

Contributors 4

Languages

License

fatemafaria142/Large-Language-Models-Over-Transformer-Models-for-Bangla-NLI

Folders and files

Latest commit

History

Repository files navigation

Unraveling the Dominance of Large Language Models Over Transformer Models for Bangla Natural Language Inference: A Comprehensive Study

Abstract

Table of Contents

Proposed Methodology

Dataset Availability

Results

Comparative Analysis of PLMs for Different Performance Metrics

Comparative Analysis of LLMs for Different Performance Metrics

Contact Information

Citation

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 4

Languages

Packages