Recommender systems are a very popular topic in e-commerce. They are frameworks and engines which help us to implement a recommender system easier. One of these engines is LightFM which is a satisfiable python framework.
This project is part of a 4-person team project, each person has worked on a special package my package has been LightFM.
You can see our final report in Final_Report_Group8.pdf.
The goal is to implement a recommender system with LightFM with 3 of Kaggle datasets. 3 data set and their related files are:
files:
- collabrative_restaurant.py (main file)
- geoplaces2.json
- rating_final_U1011.json
- userprofile.json
- resturant_json.py
files:
- book.py (main file)
- test_train.py
- book_wrap_vs_BPR.py
- book_eval_K_OS.py
- mainbookup_lim1.json (10000 rows of main data in randomly picked)
files:
- size_test&trainWRAPvsBPR.py
- size_test&trainWRAP.py
- random_pick.py
- renttherunway_lim.json (10000 rows of main data in randomly picked)
- sizeRs.py (main file)
I used Conda in Pycharm and install LightFM with:
conda install -c conda-forge lightfm
conda install -c conda-forge/label/gcc7 lightfm
conda install -c conda-forge/label/cf201901 lightfm
conda install -c conda-forge/label/cf202003 lightfm
One of the most important challenges is how to give the data to the package. First, read the Json file and create a dataset of lightfm :
f = open('rating_final_U1011.json', )
ff = open('userprofile.json', )
df = open(r'geoplaces2.json')
data_User = json.load(ff)
data_item = json.load(df)
data = json.load(f)
dataset = Dataset()
Then fit the dataset with your data:
dataset.fit((x['userID'] for x in data),
(x['placeID'] for x in data), (x['budget'] for x in data_User),(x['price'] for x in data_item))
Now it's possible to create the matrixes:
(interactions, weights) = dataset.build_interactions(((x['userID'], x['placeID']) for x in data))
print(repr(interactions))
user_interactions = dataset.build_user_features((x['userID'], [x['budget']]) for x in data_User)
print(repr(user_interactions))
item_interactions = dataset.build_item_features((x['placeID'], [x['price']]) for x in data_item)
print(repr(item_interactions))
This package is a model base package define and fit package:
alpha = 1e-05
epochs = 70
num_components = 32
model = LightFM(no_components=num_components,
loss='warp',
learning_schedule='adadelta',
user_alpha=alpha,
item_alpha=alpha)
For testing and validating the model you need to split data to test and train like in test_train.py.
Testing learning_schedule adadelta vs adagrad for cloth dataset:
Testing loss WARP vs BPR for cloth dataset:
http://www2.informatik.uni-freiburg.de/~cziegler/BX/
https://making.lyst.com/lightfm/docs/home.html
https://github.com/lyst/lightfm
Reach out to me at [email protected].
Thanks @alirezaomidi 😅