diff --git a/README.md b/README.md
index 58aff95..286c0d4 100644
--- a/README.md
+++ b/README.md
@@ -1,8 +1,8 @@
 # Automatic-Website-Categorizers
 
 These programs are designed to categorize specified websites.
 
-# Title Searcher
+## Title Searcher
 (Powered by Yandex.Translate: http://translate.yandex.com/)
 
 Download [GoogleScraper](https://github.com/NikolaiT/GoogleScraper):
@@ -19,7 +19,7 @@
 Run titles_searcher.py on that output.txt file:
 
     python2.7 titles_searcher.py output.txt output_from_extract.txt
 
 (The first entry on the command line after 'titles_searcher.py' should be the name of the input file that you are processing, which is the output from the desired GoogleScraper run. The entry after that should be the name of the output file that you want to create. You will need to put the file lists.py in the same directory as titles_searcher.py.)
-# Common Crawl Title Searcher
+## Common Crawl Title Searcher
 (Powered by Yandex.Translate: http://translate.yandex.com/)
 
 Open a command prompt and enter the following:
@@ -32,7 +32,7 @@
     python2.7 titles_searcher_common.py output.txt
 
 (The first entry on the command line after 'titles_searcher_common.py' should be the name of the output file that you want to create.)
-# Description Searcher
+## Description Searcher
 (Powered by Yandex.Translate: http://translate.yandex.com/)
 
 Download the prerequisites:
@@ -46,9 +46,9 @@
 Create a text file called websites.txt with all of the websites that you want to search.
 
 This will output results in the form: website address category.
 
-# Aggregator Identifier
+## Aggregator Identifier
 
 Create a text file called websites.txt with all of the websites that you want to search, then run aggregators.py on it. You will need to install BeautifulSoup and GoogleScraper as described above for the Title Searcher and Description Searcher programs. This program will output two lists: one of the sites that were identified as aggregators, and one of the sites that were not.
 
-# Twitter Handle/Facebook Page Extractor
+## Twitter Handle/Facebook Page Extractor
 
 Create a text file called websites.txt with all of the websites that you want to search, then run twitterandfacebook.py on it. This program will output the Twitter handles and Facebook pages that were found for the searched websites. At this point, some minor editing of the output is required.
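For reference, a minimal sketch of the kind of title matching titles_searcher.py performs, assuming the GoogleScraper output file holds one page title per line and that lists.py maps category names to keyword lists. The CATEGORY_KEYWORDS table, the tab-separated output, and the Yandex.Translate endpoint are all assumptions for illustration, not the repository's actual interface:

    # titles_searcher_sketch.py -- hypothetical, not the repository's actual code.
    # Reads a GoogleScraper output file (assumed: one page title per line),
    # translates each title to English with the Yandex.Translate HTTP API
    # (assumed endpoint), and tags it with the first matching category.
    import sys
    import requests

    YANDEX_URL = "https://translate.yandex.net/api/v1.5/tr.json/translate"
    YANDEX_KEY = "YOUR_API_KEY"  # assumption: you have a Yandex.Translate key

    # Assumption: lists.py exposes something like this mapping.
    CATEGORY_KEYWORDS = {
        "news": ["news", "daily", "times"],
        "sports": ["sport", "league", "match"],
    }

    def translate(text):
        # Translate to English; fall back to the original text on any failure.
        try:
            resp = requests.get(YANDEX_URL, params={
                "key": YANDEX_KEY, "text": text, "lang": "en"})
            return resp.json()["text"][0]
        except Exception:
            return text

    def categorize(title):
        lowered = title.lower()
        for category, keywords in CATEGORY_KEYWORDS.items():
            if any(word in lowered for word in keywords):
                return category
        return "unknown"

    if __name__ == "__main__":
        infile, outfile = sys.argv[1], sys.argv[2]
        with open(infile) as src, open(outfile, "w") as dst:
            for line in src:
                title = translate(line.strip())
                dst.write("%s\t%s\n" % (title, categorize(title)))

It would be invoked the same way as the real script, e.g. python2.7 titles_searcher_sketch.py output.txt output_from_extract.txt.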
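Likewise, a sketch of the lookup behind the Description Searcher's "website address category" output, assuming it reads websites.txt and inspects each page's meta description with BeautifulSoup. The KEYWORDS table here is a stand-in, not the repository's real category lists:

    # descriptions_sketch.py -- hypothetical illustration only.
    import requests
    from bs4 import BeautifulSoup

    KEYWORDS = {"shop": "commerce", "news": "news", "blog": "blog"}  # stand-in

    def meta_description(url):
        # Fetch the page and return its meta description, or "" if absent.
        if not url.startswith("http"):
            url = "http://" + url
        html = requests.get(url, timeout=10).text
        soup = BeautifulSoup(html, "html.parser")
        tag = soup.find("meta", attrs={"name": "description"})
        return tag["content"] if tag and tag.has_attr("content") else ""

    with open("websites.txt") as f:
        for site in (line.strip() for line in f if line.strip()):
            desc = meta_description(site).lower()
            category = "unknown"
            for word, cat in KEYWORDS.items():
                if word in desc:
                    category = cat
                    break
            print("%s %s" % (site, category))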
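The Aggregator Identifier's two-list output can be pictured as a simple partition of websites.txt; is_aggregator below is a placeholder predicate, since the signal aggregators.py actually uses is not described above:

    # aggregators_sketch.py -- hypothetical two-list partition.
    def is_aggregator(site):
        # Placeholder heuristic; the real aggregators.py logic is not shown here.
        return "news" in site

    with open("websites.txt") as f:
        sites = [line.strip() for line in f if line.strip()]

    aggregators = [s for s in sites if is_aggregator(s)]
    non_aggregators = [s for s in sites if not is_aggregator(s)]

    print("Aggregators: %s" % aggregators)
    print("Not aggregators: %s" % non_aggregators)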
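Finally, a sketch of the link scanning that twitterandfacebook.py's description implies: collecting twitter.com and facebook.com links from each page's anchors. The real script's heuristics may differ, which is why its output needs the minor hand-editing mentioned above:

    # twitterandfacebook_sketch.py -- hypothetical version of the extractor.
    import requests
    from bs4 import BeautifulSoup

    def social_links(url):
        # Return (twitter_links, facebook_links) found in the page's anchors.
        if not url.startswith("http"):
            url = "http://" + url
        soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
        twitter, facebook = set(), set()
        for a in soup.find_all("a", href=True):
            href = a["href"]
            if "twitter.com/" in href:
                twitter.add(href)
            elif "facebook.com/" in href:
                facebook.add(href)
        return twitter, facebook

    with open("websites.txt") as f:
        for site in (line.strip() for line in f if line.strip()):
            tw, fb = social_links(site)
            print("%s twitter=%s facebook=%s" % (site, ",".join(tw), ",".join(fb)))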