Web Scraping and NLP

Central London Data Science Project Nights

Event page - https://www.meetup.com/central_london_data_science/events/247384261/

Natural language processing (NLP) is a popular field of data science that focuses on analysing unstructured text, i.e. free-form blocks of text. At a previous event, we used word frequency analysis to predict whether a comment scraped from YouTube came from a troll. A more common use case is sentiment analysis, which evaluates how negative or positive a piece of text is. This is a useful feature when assessing the objectivity of texts such as news articles.
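To make "word frequency analysis" concrete, here is a minimal sketch using only the Python standard library. The function name `word_frequencies` and the sample comment are made up for illustration; this is not the code used at the previous event, just one simple way to count words.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count how often each lowercase word appears in a block of text."""
    # Very simple tokeniser (an assumption): runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


# Hypothetical comment, invented for this example.
comment = "You are wrong, wrong, wrong and you know it"
frequencies = word_frequencies(comment)

# The two most frequent words and their counts.
print(frequencies.most_common(2))  # [('wrong', 3), ('you', 2)]
```

Counts like these can then be used as input features for a classifier, which is the general idea behind frequency-based text prediction.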

In this meetup we will show you how to scrape text from websites using Python (with a Python library called BeautifulSoup) and then how to perform NLP on the scraped text. By the end of the event, we aim to have everyone automatically analysing the text of different websites using a scraping-to-NLP pipeline.
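The shape of such a pipeline can be sketched in a few lines. This is an illustration under stated assumptions, not the notebook's code. To stay dependency-free it extracts text with the standard library's `html.parser` as a stand-in for BeautifulSoup (where the rough equivalent would be `BeautifulSoup(html, "html.parser").get_text()`). It also works on a hard-coded HTML string rather than a downloaded page and scores sentiment with a tiny made-up word list. The names `TextExtractor`, `html_to_text`, `sentiment_score`, `POSITIVE`, `NEGATIVE`, and `SAMPLE_HTML` are all hypothetical.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Step 1 (scraping): turn an HTML document into plain text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Toy sentiment lexicon (an assumption); real lexicons are far larger.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "awful", "hate", "terrible"}


def sentiment_score(text):
    """Step 2 (NLP): score text from -1.0 (negative) to 1.0 (positive)."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    total = positives + negatives
    return 0.0 if total == 0 else (positives - negatives) / total


# Stand-in for a downloaded page, invented for this example.
SAMPLE_HTML = """
<html><head><style>p { color: red; }</style></head>
<body><h1>Review</h1><p>The venue was great but the coffee was awful.</p>
<p>Overall I love these events.</p></body></html>
"""

text = html_to_text(SAMPLE_HTML)
score = sentiment_score(text)
print(text)
print(round(score, 2))
```

Keeping the scraping step and the NLP step as separate functions means either one can be swapped out, for example replacing `html_to_text` with a BeautifulSoup-based version, without touching the other.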

Getting Started

Work through the Web Scraping and NLP.ipynb notebook. If you get stuck, you can look at the [COMPLETED] Web Scraping and NLP.ipynb notebook.

As a backup if you can't get it running on your computer

The notebook is also published as a Kaggle kernel:

Create a Kaggle account if you haven't already
Go to https://www.kaggle.com/zackakil/web-scraping-nlp-cldspn/notebook
Fork the kernel

Have fun,

Zack.
