This project contains my solution to a case study that I was given during an interview. The goal of the case study is to group company names that refer to the same entity, using a dataset of manually written names. The resulting algorithm should be able to map company names to their possible common entities.
This project contains:
- A Jupyter notebook that documents my approach to this case study.
Author: Simon Berlendis
Date: 27/09/2021
This project was developed using the following environment and packages:
- Python 3.8.10
- numpy-1.17.4
- pandas-1.3.3
- scipy-1.6.3
- nltk-3.6.2
- scikit-learn-0.24.2
- cleanco
- difflib (part of the Python standard library)
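
To give a rough idea of the kind of name grouping the case study is about, here is a minimal sketch that uses only `difflib` from the list above. It is not the approach documented in the notebook: the `normalize` and `group_names` helpers, the `LEGAL_SUFFIXES` set, and the `0.85` threshold are all hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical set of legal suffixes to strip (an assumption, not from the notebook).
LEGAL_SUFFIXES = {"inc", "ltd", "llc", "corp", "co", "gmbh", "sa"}


def normalize(name):
    """Lowercase, replace punctuation with spaces, drop trailing legal suffixes."""
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower()
    )
    tokens = cleaned.split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def group_names(names, threshold=0.85):
    """Greedily put each name in the first group whose representative is similar.

    Similarity is the SequenceMatcher ratio between normalized names.
    The threshold value is an illustrative assumption.
    """
    groups = []  # list of (normalized representative, [original names])
    for name in names:
        key = normalize(name)
        for representative, members in groups:
            if SequenceMatcher(None, key, representative).ratio() >= threshold:
                members.append(name)
                break
        else:
            # No existing group is close enough, so start a new one.
            groups.append((key, [name]))
    return [members for _, members in groups]


names = ["Acme Inc.", "ACME, Inc", "Acmee Inc", "Globex Ltd", "Globex Ltd."]
for group in group_names(names):
    print(group)
```

With these sample names, the three Acme variants end up in one group and the two Globex variants in another. A greedy pass like this depends on input order and on the chosen threshold, so it should be read as a starting point rather than a complete solution.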