This is a Python 3 project. Install the packages listed in the requirements file, or configure your IDE to use it.
Then, in a Python REPL, run import nltk followed by nltk.download('stopwords') to fetch the NLTK stopwords corpus.
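The stopwords download can also be scripted so that it only runs when the corpus is missing. The following is a minimal sketch, not part of the project: the helper name ensure_stopwords and its optional nltk_module parameter are assumptions made for illustration, while nltk.data.find and nltk.download are the standard NLTK calls.

```python
import importlib


def ensure_stopwords(nltk_module=None):
    """Download the NLTK 'stopwords' corpus only if it is not installed yet.

    Returns True if a download was triggered, False if the corpus was found.
    `nltk_module` is a hypothetical hook for passing a stand-in module;
    by default the real nltk package is imported lazily.
    """
    nltk_module = nltk_module or importlib.import_module("nltk")
    try:
        # Raises LookupError when the resource is not on any NLTK data path.
        nltk_module.data.find("corpora/stopwords")
        return False
    except LookupError:
        nltk_module.download("stopwords")
        return True
```

Calling ensure_stopwords() once before running the tests or the main should have the same effect as the manual REPL step.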
The project includes 4 main folders:
sources - Dataset of text files
tests - a small set of tests built with pytest
Before running the tests:
update the paths in <project_path>/tests/tests_params
generic_tests_params[paths]
To run the tests, execute:
pytest <project_path>/TF_IDF/tests
tfidf - source code for the TF-IDF implementation
TF_IDF_LOGS - example logs created by running the tfidf main
In addition: main - runs tf_idf with the following arguments: --maxwords <project_path>/TF_IDF/sources/ <project_path>/TF_IDF/sources/
Readme - current file
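
For readers new to TF-IDF, here is a generic sketch of the technique in plain Python. It is not the code in the tfidf folder, which may differ in weighting, smoothing, or stopword handling. The function name tf_idf and the sample sentences are assumptions made for illustration. It uses term frequency = count / document length and inverse document frequency = log(N / number of documents containing the term).

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per document.

    `documents` is a list of token lists (already split into words).
    """
    n_docs = len(documents)
    # Document frequency: the number of documents each term appears in.
    doc_freq = Counter(term for doc in documents for term in set(doc))

    scores = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            # tf * idf; a term present in every document gets log(1) = 0.
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical sample input, unrelated to the files in the sources folder.
docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "birds sing".split(),
]
scores = tf_idf(docs)
```

Terms that occur in few documents (such as "mat" above) score higher than terms shared by many documents (such as "cat"), which is the property TF-IDF is used for.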
A log folder will be created in ~/ (the location can be changed in tf_idf_logger.py, line 8).
Regards, Amir