Welcome to Whosapp! 🚀 This project was developed as part of the "Fundamentals of Artificial Intelligence" course at the University of Salerno. Our objective is to create a machine learning model capable of identifying the authors of WhatsApp chats. Read on to learn how to use the model! 📱
The project is organized into the following directories:
- `configs/`: Contains configuration files, including:
  - `configs/alias.json`: Alias configuration
  - `configs/config.json`: Feature configuration
- `data/`: Holds project data, with subdirectories:
  - `data/rawdata/`: Raw data storage
  - `data/dataset/`: Dataset used for training
  - `data/wordlist/`: Wordlist used for processing
- `frontend/`: Contains the project's frontend components
- `logs/`: Stores project logs, categorized into sections
- `models/`: Houses the project's machine learning models
- `src/`: Hosts the project's source code
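As a quick sanity check of the layout above, a minimal Python sketch could report which of the listed directories are missing from a checkout. The helper name `missing_directories` is hypothetical and not part of the repository; only the directory names come from the list above. Some directories (for example `logs/` or `models/`) may only appear after the scripts have run, so a missing entry is not necessarily an error.

```python
from pathlib import Path

# Directories named in the project layout above.
EXPECTED_DIRS = [
    "configs",
    "data/rawdata",
    "data/dataset",
    "data/wordlist",
    "frontend",
    "logs",
    "models",
    "src",
]


def missing_directories(root):
    """Return the expected directories that do not exist under root."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    # Run from the repository root; prints nothing if the layout is complete.
    for name in missing_directories("."):
        print(f"missing: {name}/")
```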
- Create the following folders:
  - `data/rawdata/`
  - `configs/`
- In the `configs/` folder, create the file `configs/alias.json`, which holds the alias configuration. The file must be in the following format:

      {
        "Username": ["Alias1", "Alias2", "Alias3"],
        "Username2": ["Alias1", "Alias2", "Alias3"]
      }
- Clone the repository and install the requirements:

      git clone https://github.com/danlig/WhosApp.git
      cd WhosApp
      pip install -r requirements.txt
- Upload the chat you want to analyze to the `data/rawdata/` folder
- Run the `py src/pipeline.py` script. For more information, run `py src/pipeline.py -h`.
- Run the `py src/main.py` script to load and use the model
- Finally, run `node frontend/index.js` to start the project's frontend
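Before running the pipeline, it can help to confirm that `configs/alias.json` matches the format shown in the setup steps. The following is a minimal sketch, not the project's own loader: the function names `load_aliases` and `alias_to_username` are hypothetical, and it assumes that aliases are the different display names under which one person appears in exported chats, so each alias should belong to exactly one username.

```python
import json


def load_aliases(path):
    """Load alias.json and check it maps each username to a list of strings."""
    with open(path, encoding="utf-8") as f:
        aliases = json.load(f)
    if not isinstance(aliases, dict):
        raise ValueError("alias.json must contain a JSON object")
    for user, names in aliases.items():
        if not isinstance(names, list) or not all(isinstance(n, str) for n in names):
            raise ValueError(f"aliases for {user!r} must be a list of strings")
    return aliases


def alias_to_username(aliases):
    """Invert the mapping so each alias points to its username."""
    lookup = {}
    for user, names in aliases.items():
        for name in names:
            # Assumption (not from the README): an alias belongs to one user only.
            if name in lookup and lookup[name] != user:
                raise ValueError(f"alias {name!r} is assigned to two users")
            lookup[name] = user
    return lookup
```

Calling `alias_to_username(load_aliases("configs/alias.json"))` from the repository root would then give a flat alias-to-username dictionary, or raise a `ValueError` that points at the malformed entry.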
- `py src/new_dataset.py`: Creates a new dataset from the raw data in `data/rawdata/` and saves it in `data/dataset/` in .parquet format
- `py src/new_model.py`: Creates a new model from the dataset created with `py src/new_dataset.py` and saves it in `models/` in .joblib format
- `py src/pipeline.py`: Creates a new dataset and model from the raw data in `data/rawdata/` and saves them in `data/dataset/` and `models/` respectively
- `py src/test_features.py`: Tests the features that you have configured in the `configs/config.json` file

For more information on how to use these scripts, run `py src/<script_name>.py -h`.
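To check what the scripts produced, a small sketch like the one below could locate the most recently written dataset and model. The helper `newest_file` is hypothetical, and the file naming inside `data/dataset/` and `models/` is not documented here, so the sketch only relies on the `.parquet` and `.joblib` extensions mentioned above.

```python
from pathlib import Path


def newest_file(directory, pattern):
    """Return the most recently modified file matching pattern, or None."""
    candidates = [p for p in Path(directory).glob(pattern) if p.is_file()]
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)


if __name__ == "__main__":
    # Paths are relative to the repository root.
    print("latest dataset:", newest_file("data/dataset", "*.parquet"))
    print("latest model:  ", newest_file("models", "*.joblib"))
```

The returned paths could then be opened with `pandas.read_parquet` and `joblib.load` respectively, assuming both packages are installed through `requirements.txt`.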