RecommenderSystem_lightFM 😊

Overview:

Recommender systems are a very popular topic in e-commerce. Several frameworks and engines make it easier to implement one. One of these is LightFM, a capable Python framework.
This project is part of a four-person team project in which each person worked on a different package; my package was LightFM.
You can see our final report in Final_Report_Group8.pdf.

Goal:

The goal is to implement a recommender system with LightFM on three Kaggle datasets. The three datasets and their related files are:

1.Restaurant

files:

  • collabrative_restaurant.py (main file)
  • geoplaces2.json
  • rating_final_U1011.json
  • userprofile.json
  • resturant_json.py

2.Book

files:

  • book.py (main file)
  • test_train.py
  • book_wrap_vs_BPR.py
  • book_eval_K_OS.py
  • mainbookup_lim1.json (10,000 rows picked at random from the main data)

3.Cloth

files:

  • size_test&trainWRAPvsBPR.py
  • size_test&trainWRAP.py
  • random_pick.py
  • renttherunway_lim.json (10,000 rows picked at random from the main data)
  • sizeRs.py (main file)
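
The 10,000-row files listed above are random subsets of the full datasets. A minimal sketch of how such a subset could be drawn is shown here. It is not the repository's random_pick.py; the function name `pick_random_rows` and the toy records are hypothetical, and it assumes the records are already loaded into a list.

```python
import random


def pick_random_rows(rows, n, seed=None):
    """Return n rows drawn without replacement (all rows if fewer than n exist)."""
    rng = random.Random(seed)
    if len(rows) <= n:
        return list(rows)
    return rng.sample(rows, n)


# Hypothetical records standing in for the full dataset
all_rows = [{"user_id": i, "item_id": i % 7} for i in range(50)]
subset = pick_random_rows(all_rows, 10, seed=0)
print(len(subset))
```

Passing a fixed `seed` makes the subset reproducible between runs.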

Installation:

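I used Conda in PyCharm and installed LightFM from conda-forge. The first command is usually sufficient; the others install from alternative conda-forge labels: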

conda install -c conda-forge lightfm
conda install -c conda-forge/label/gcc7 lightfm
conda install -c conda-forge/label/cf201901 lightfm
conda install -c conda-forge/label/cf202003 lightfm 

Data and LightFM:

One of the most important challenges is getting the data into the form the package expects. First, read the JSON files and create a LightFM Dataset:

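import json
from lightfm.data import Dataset

# Load the ratings, the user profiles and the restaurant (item) records
with open('rating_final_U1011.json') as f:
    data = json.load(f)
with open('userprofile.json') as ff:
    data_User = json.load(ff)
with open('geoplaces2.json') as df:
    data_item = json.load(df)

dataset = Dataset()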

Then fit the dataset with your data:

dataset.fit(users=(x['userID'] for x in data),
            items=(x['placeID'] for x in data),
            user_features=(x['budget'] for x in data_User),
            item_features=(x['price'] for x in data_item))

Now it's possible to create the matrices:

# User-item interaction matrix built from the (userID, placeID) pairs
(interactions, weights) = dataset.build_interactions(((x['userID'], x['placeID']) for x in data))
print(repr(interactions))
# User feature matrix: each user's budget
user_interactions = dataset.build_user_features((x['userID'], [x['budget']]) for x in data_User)
print(repr(user_interactions))
# Item feature matrix: each restaurant's price level
item_interactions = dataset.build_item_features((x['placeID'], [x['price']]) for x in data_item)
print(repr(item_interactions))

LightFM is a model-based package: you define a model and then fit it. Define the model:

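from lightfm import LightFM

alpha = 1e-05          # L2 penalty on user and item features
epochs = 70            # number of training epochs, passed to model.fit()
num_components = 32    # dimensionality of the latent embeddings
model = LightFM(no_components=num_components,
                loss='warp',
                learning_schedule='adadelta',
                user_alpha=alpha,
                item_alpha=alpha)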

To test and validate the model, split the data into training and test sets, as in test_train.py.
Testing learning_schedule adadelta vs adagrad on the cloth dataset:

Screenshot (994)

Testing loss WARP vs BPR on the cloth dataset:

Screenshot (995)

Resources:

http://www2.informatik.uni-freiburg.de/~cziegler/BX/
https://making.lyst.com/lightfm/docs/home.html
https://github.com/lyst/lightfm

Support:

Reach out to me at [email protected].
Thanks @alirezaomidi 😅