- Build a similarity-check API using NLP, then run and deploy it with Docker and Docker Compose.
Document similarity (or distance between documents) is one of the central themes in information retrieval. How do humans usually decide how similar two documents are? Documents are usually treated as similar if they are semantically close and describe similar concepts. On the other hand, "similarity" can also be used in the context of duplicate detection. We will review several common approaches.
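As a small illustration of the duplicate-detection sense of similarity, the sketch below computes cosine similarity over word counts using only the Python standard library. The names `tokenize` and `cosine_similarity` are illustrative and not part of this project. This measure captures word overlap only, not semantic closeness.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text1: str, text2: str) -> float:
    """Cosine of the angle between the two word-count vectors (0.0 to 1.0)."""
    a, b = Counter(tokenize(text1)), Counter(tokenize(text2))
    # Only tokens present in both texts contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)
```

Two texts with the same words in any order score close to 1.0, and texts with no words in common score 0.0.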
The objective of this API is to measure the similarity of two texts (plagiarism check).
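A minimal sketch of how the core comparison could be done with spaCy, which this project lists as a dependency. The model name `en_core_web_sm` and the helper names `load_model` and `text_similarity` are assumptions and are not taken from this repository. spaCy and the chosen model must be installed before the functions are called.

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def load_model(name: str = "en_core_web_sm"):
    """Load a spaCy pipeline once and reuse it for later calls."""
    import spacy  # imported lazily, so this module loads even without spaCy

    return spacy.load(name)


def text_similarity(text1: str, text2: str, model_name: str = "en_core_web_sm") -> float:
    """Return spaCy's similarity score for two texts."""
    nlp = load_model(model_name)
    # Doc.similarity compares the two documents' vector representations.
    return nlp(text1).similarity(nlp(text2))
```

The small English models ship without static word vectors, so spaCy warns that their similarity scores are less reliable. A model with vectors, such as `en_core_web_md`, is likely a better fit for this kind of check.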
Resource | URL (path) | Method | Parameters | Status codes |
---|---|---|---|---|
Register a user | /register | POST | username, password | 200: OK, 301: invalid username |
Detect similarity of docs | /detect | POST | username, password, text1, text2 | 200: OK (returns similarity), 301: invalid username, 302: invalid password, 303: out of tokens |
Refill | /refill | POST | username, admin_pw, refill_amount | 200: OK, 301: invalid username, 304: invalid admin_pw |
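The status codes in the table can be read as a small decision procedure. The sketch below shows one possible reading of it in plain Python, without Flask or MongoDB. The in-memory `USERS` dict, the `ADMIN_PW` placeholder, the starting quota `INITIAL_TOKENS`, the PBKDF2 password hashing, and the choices that 301 on `/register` means "username already taken", that each `/detect` call costs one token, and that `/refill` adds to the balance are all assumptions, not confirmed by this project.

```python
import hashlib
import hmac
import os

USERS = {}              # hypothetical in-memory stand-in for the MongoDB collection
ADMIN_PW = "change-me"  # placeholder admin password (assumption)
INITIAL_TOKENS = 2      # assumed starting quota for each new user


def _hash(password, salt):
    """Derive a password hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def _password_ok(user, password):
    """Compare hashes in constant time."""
    return hmac.compare_digest(user["hash"], _hash(password, user["salt"]))


def register(username, password):
    # 301: empty or already-registered username (assumed meaning).
    if not username or username in USERS:
        return {"status": 301, "msg": "Invalid username"}
    salt = os.urandom(16)
    USERS[username] = {"salt": salt, "hash": _hash(password, salt),
                       "tokens": INITIAL_TOKENS}
    return {"status": 200, "msg": "OK"}


def detect(username, password, text1, text2, similarity):
    """`similarity` is any callable taking two texts and returning a score."""
    user = USERS.get(username)
    if user is None:
        return {"status": 301, "msg": "Invalid username"}
    if not _password_ok(user, password):
        return {"status": 302, "msg": "Invalid password"}
    if user["tokens"] <= 0:
        return {"status": 303, "msg": "Out of tokens"}
    user["tokens"] -= 1  # assumed cost: one token per request
    return {"status": 200, "similarity": similarity(text1, text2)}


def refill(username, admin_pw, refill_amount):
    user = USERS.get(username)
    if user is None:
        return {"status": 301, "msg": "Invalid username"}
    if not hmac.compare_digest(admin_pw.encode(), ADMIN_PW.encode()):
        return {"status": 304, "msg": "Invalid admin password"}
    user["tokens"] += refill_amount  # assumed: refill adds to the balance
    return {"status": 200, "msg": "OK"}
```

In a Flask app, each function would sit behind its POST route and its return value would be sent back as JSON. The checks run in the same order as the codes appear in the table, so an unknown username is reported before a wrong password.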
- spaCy (spacy.io) is an open-source software library for advanced natural language processing, written in Python and Cython, and it is easy to use from Python.
Download the spaCy model from here
- Flask framework: see how to install and run Flask here for more details.
- PyMongo: a Python distribution containing tools for working with MongoDB. Download and install PyMongo from here.
- Docker-compose.yml