Web Scraping and NLP

Event page - https://www.meetup.com/central_london_data_science/events/247384261/

Natural language processing (NLP) is a popular field of data science that focuses on analysing unstructured text, i.e. free-form blocks of text. At a previous event, we used word frequency analysis to predict whether a comment scraped from YouTube came from a troll. A more common use case is sentiment analysis, which evaluates how negative or positive a piece of text is. This is useful when assessing the objectivity of texts such as news articles.
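To make the idea of sentiment analysis concrete, here is a minimal sketch of a word-list approach. The word lists and the `sentiment_score` function are hypothetical and chosen purely for illustration; this is not necessarily the method used in the notebook, and real sentiment tools rely on much larger lexicons or trained models.

```python
import re

# Hypothetical, tiny word lists chosen for illustration only.
POSITIVE_WORDS = {"good", "great", "love", "excellent", "happy"}
NEGATIVE_WORDS = {"bad", "awful", "hate", "terrible", "sad"}


def sentiment_score(text):
    """Return a score in [-1, 1]: below 0 leans negative, above 0 leans positive."""
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    positive = sum(word in POSITIVE_WORDS for word in words)
    negative = sum(word in NEGATIVE_WORDS for word in words)
    matched = positive + negative
    if matched == 0:
        return 0.0  # no opinion words found, so treat the text as neutral
    return (positive - negative) / matched


print(sentiment_score("I love this video, it is great"))
print(sentiment_score("Awful and bad, but the music was good"))
```

A score near zero from a scheme like this could mean the text is neutral or simply that none of its words appear in the lists, which is one reason larger lexicons matter in practice.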

In this meetup we will show you how to scrape text from websites using Python (with a Python library called 'BeautifulSoup') and then how to perform NLP on the scraped text. By the end of the event, we aim to have everyone analysing the text of different websites automatically using a scraping-to-NLP pipeline.
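As a rough sketch of what a scraping-to-NLP pipeline can look like, the example below pulls paragraph text out of HTML and counts word frequencies. It is an illustration under stated assumptions, not the notebook's code: `SAMPLE_HTML`, `extract_paragraph_text` and `word_counts` are hypothetical names, and the HTML is an inline string so nothing is downloaded. If BeautifulSoup is not installed, the sketch falls back to the standard library's `html.parser`.

```python
import re
from collections import Counter
from html.parser import HTMLParser

try:
    from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4
except ImportError:
    BeautifulSoup = None  # fall back to the standard library below

# Hypothetical page content; a real pipeline would download this first.
SAMPLE_HTML = """
<html><body>
  <h1>Review</h1>
  <p>The film was great, really great.</p>
  <p>The ending was bad.</p>
  <script>var ignored = "great";</script>
</body></html>
"""


class _ParagraphText(HTMLParser):
    """Standard-library fallback that collects the text inside <p> tags."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "p" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract_paragraph_text(html):
    """Scraping step: keep only the visible paragraph text."""
    if BeautifulSoup is not None:
        soup = BeautifulSoup(html, "html.parser")
        return " ".join(p.get_text(" ", strip=True) for p in soup.find_all("p"))
    parser = _ParagraphText()
    parser.feed(html)
    return " ".join(" ".join(parser.chunks).split())


def word_counts(text):
    """NLP step: lowercase the text and count each word."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


text = extract_paragraph_text(SAMPLE_HTML)
counts = word_counts(text)
print(text)
print(counts.most_common(3))
```

To run this against a live page, the inline string would be replaced by HTML fetched over HTTP, for example with `urllib.request.urlopen` from the standard library.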

Getting Started

Work through the Web Scraping and NLP.ipynb notebook. If you get stuck, you can look at the [COMPLETED] Web Scraping and NLP.ipynb notebook.

As a backup if you can't get it running on your computer

The notebook is also published as a Kaggle kernel:

  1. Create a Kaggle account if you haven't already

  2. Go to https://www.kaggle.com/zackakil/web-scraping-nlp-cldspn/notebook

  3. Fork the kernel

Have fun,

Zack.