Multi-task learning aims to learn multiple tasks simultaneously while maximizing performance on one or all of them.
- Classification accuracy.
- Exact Match.
The Chinese Language Understanding Evaluation Benchmark (CLUE) evaluates models across a diverse range of existing natural language understanding tasks. Each model is scored by its average score across all tasks.
The state-of-the-art results can be seen on the public CLUE leaderboard.
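As a rough illustration of how per-task metrics can be combined into one benchmark score, the sketch below computes classification accuracy, exact match, and their unweighted mean. The task names, the toy data, the whitespace-stripping rule in `exact_match`, and the choice of an unweighted mean are all assumptions made for this example; the official CLUE evaluation scripts may normalize answers or aggregate scores differently.

```python
from statistics import mean


def accuracy(predictions, labels):
    """Fraction of predictions that equal the gold label."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal in length")
    return sum(p == g for p, g in zip(predictions, labels)) / len(labels)


def exact_match(predictions, references):
    """Fraction of predicted strings identical to the reference.

    Assumption: only leading/trailing whitespace is ignored.
    """
    if len(predictions) != len(references) or not references:
        raise ValueError("predictions and references must be non-empty and equal in length")
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)


def benchmark_score(task_scores):
    """Unweighted mean of the per-task scores (assumed aggregation)."""
    return mean(task_scores.values())


# Hypothetical tasks with toy data, for illustration only.
task_scores = {
    "toy_classification": accuracy([1, 0, 1, 1], [1, 0, 0, 1]),
    "toy_span_extraction": exact_match(["北京", "上海 "], ["北京", "广州"]),
}
overall = benchmark_score(task_scores)
print(task_scores, overall)
```

With an unweighted mean, every task contributes equally regardless of its dataset size, so a small task can move the overall score as much as a large one.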