For an in-depth description, check out the report.
Here is a quick look at the approach:
General Guidelines:
- We recommend running the code on Google Colab (here). You might face installation issues in other environments.
- The similarity features were calculated for the dataset "Data for Automatic Short Answer Grading". That step took about an hour to run, so it wasn't included in the report.
- Here is a quick link to the testing dataset we used, as a CSV file.
- The model is well suited to short answer grading; it does not work well on long texts.
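
To give a rough idea of what a similarity feature between a reference answer and a student answer can look like, here is a minimal sketch. It is an assumption for illustration only: it uses a plain bag-of-words cosine similarity, which is not necessarily the feature used in this project, and the function names `tokenize` and `cosine_similarity` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(reference, answer):
    """Bag-of-words cosine similarity between two short texts, in [0, 1].

    Illustrative stand-in only; the project's actual features may differ.
    """
    ref_counts = Counter(tokenize(reference))
    ans_counts = Counter(tokenize(answer))
    # Dot product over the reference vocabulary (missing tokens count as 0).
    dot = sum(ref_counts[t] * ans_counts[t] for t in ref_counts)
    ref_norm = math.sqrt(sum(c * c for c in ref_counts.values()))
    ans_norm = math.sqrt(sum(c * c for c in ans_counts.values()))
    norm = ref_norm * ans_norm
    # An empty text has no direction, so define its similarity as 0.
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    reference = "Photosynthesis converts light energy into chemical energy."
    student = "Plants turn light energy into chemical energy."
    print(round(cosine_similarity(reference, student), 3))
```

A score like this could serve as one input column for a grading model. Word-overlap measures of this kind tend to be more informative on short answers than on long texts.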