Spotify_Information_Retrieval

A Python project implementing semantic focused search on podcast documents from Spotify (TREC collection).

To run this project with a demonstration index, please follow these steps:

Clone this repository
Set up the Python virtual environment using IR_venv.yml (Windows users, please see the note at the bottom of this README).
Launch the virtual environment

(optional - to re-create the BM25 index using the files found in Sampled_docs)

cd to the location /Spotify_Information_Retrieval/src/indexing/
run python BM25_create_demo.py
cd to the location /Spotify_Information_Retrieval/src/
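
For readers unfamiliar with the BM25 ranking that the demonstration index is built for, the following is a minimal, generic sketch of BM25 scoring over a few toy documents. It is not the code in BM25_create_demo.py; the function name `bm25_scores`, the parameter defaults `k1=1.5` and `b=0.75`, and the sample sentences are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenised document in `docs` against the tokenised `query`."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF; the "+ 1" inside the log keeps it non-negative.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            # Term-frequency saturation with document-length normalisation.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy "transcripts", invented for this example.
docs = [
    "the host talks about music streaming".split(),
    "a podcast episode about cooking pasta".split(),
    "music and podcast recommendations".split(),
]
scores = bm25_scores("music podcast".split(), docs)
# Document indices ordered from highest to lowest score.
ranking = sorted(range(len(docs)), key=scores.__getitem__, reverse=True)
print(ranking)
```

The third toy document contains both query terms, so it ranks first; the other two each match one term.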

(to launch the Spotify transcript search engine graphical user interface)

run python main.py

(to run the evaluation scripts (with options for 4 types of search strategies))

run python evaluation.py
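
This README does not say which metrics evaluation.py reports. As a generic illustration of what evaluating a ranked result list at a cutoff k can look like, here is a small sketch of precision at k. The function name `precision_at_k` and the episode ids are hypothetical and not taken from the project.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k retrieved ids that are judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return hits / k


# Hypothetical episode ids and relevance judgements, for illustration only.
ranked = ["ep3", "ep7", "ep1", "ep9"]
relevant = {"ep3", "ep1", "ep4"}
p_at_2 = precision_at_k(ranked, relevant, k=2)
p_at_4 = precision_at_k(ranked, relevant, k=4)
print(p_at_2, p_at_4)
```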

(to run the unit tests)

First edit the 'evaluation.py' file to comment out the final 2 lines:
evaulate = Evaluate(k=100 ,use_synonym=False, expansion=True, train_test='test')

evaulate.evaluate()

Don't forget to revert this change after running the unit tests.

Replace the documents in /Spotify_Information_Retrieval/Sampled_docs with the files ts1.json and ts2.json that are in /Spotify_Information_Retrieval/Testing/Sampled_docs_testing.
Move the 'testing_index.pkl' and 'unittest.metadata.csv' files into /Spotify_Information_Retrieval/Files/Local_pickles.

run unit_testing.py
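
The final two lines of 'evaluation.py' have to be commented out because top-level statements run whenever a file is imported, presumably including when the tests import it. A common Python idiom that avoids this kind of manual edit is the `if __name__ == "__main__":` guard. The sketch below shows the idiom only; it is not how the repository is currently organised, and the `Evaluate` class here is a hypothetical stand-in with the same constructor arguments as the call shown above.

```python
class Evaluate:
    """Hypothetical stand-in for the project's Evaluate class."""

    def __init__(self, k, use_synonym, expansion, train_test):
        self.k = k
        self.use_synonym = use_synonym
        self.expansion = expansion
        self.train_test = train_test

    def evaluate(self):
        return f"evaluating top {self.k} on the {self.train_test} set"


def main():
    evaluate = Evaluate(k=100, use_synonym=False, expansion=True, train_test='test')
    return evaluate.evaluate()


# Runs only when the file is executed directly (python evaluation.py),
# not when another module, such as a test file, imports it.
if __name__ == "__main__":
    print(main())
```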

NOTE FOR WINDOWS USERS: If the .yml file does not work for creating a virtual environment, install the following packages via conda or your installer of choice:

pandas 1.5.2
matplotlib 3.7.1
nltk 3.7
scikit-learn 1.2.0
feedparser
pysimplegui 4.60.4
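
After installing the packages by hand, it can be useful to confirm that the versions match the list above. The following is a small sketch using the standard library's `importlib.metadata`; the helper names `installed_version` and `check_packages` are made up for this example, and the distribution name `PySimpleGUI` (the capitalisation used on PyPI) is an assumption, since the list above gives it in lower case.

```python
from importlib import metadata

# Versions listed in the note above; None means no version is pinned there.
REQUIRED = {
    "pandas": "1.5.2",
    "matplotlib": "3.7.1",
    "nltk": "3.7",
    "scikit-learn": "1.2.0",
    "feedparser": None,
    "PySimpleGUI": "4.60.4",  # assumed distribution name
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_packages(required, lookup=installed_version):
    """Map each package to 'ok', 'missing', or a version-mismatch message."""
    report = {}
    for name, wanted in required.items():
        found = lookup(name)
        if found is None:
            report[name] = "missing"
        elif wanted is not None and found != wanted:
            report[name] = f"found {found}, expected {wanted}"
        else:
            report[name] = "ok"
    return report


if __name__ == "__main__":
    for name, status in check_packages(REQUIRED).items():
        print(f"{name}: {status}")
```

Run it inside the activated environment; any line not ending in "ok" points to a package that still needs installing or pinning.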
