https://etherpad.wikimedia.org/p/sentiment_analysis_with_R
A CSV file containing 5000 book reviews web-scraped from Amazon in 2018.
The State of the Union is an annual address by the President of the United States before a joint session of Congress. In it, the President reviews the previous year and lays out his legislative agenda for the coming year.
This dataset contains the full text of the State of the Union address from 1989 (Reagan) to 2017 (Trump).
This is a nice, clean set of texts, well suited to exploring Natural Language Processing techniques:
- Topic modelling: Which topics have become more popular over time? Which have become less popular?
- Sentiment analysis: Are there differences in tone between different Presidents? Presidents from different parties?
- Parsing: Can you train or implement a parser to automatically extract the syntactic relationships between words?
- Authorship identification: Can you correctly identify the author of a previously unseen address?
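
As a starting point for the sentiment analysis question above, here is a minimal sketch of lexicon-based scoring, written in Python for illustration. Everything in it is an assumption rather than part of the dataset or the workshop material: the word lists are tiny and invented, the sample sentences are placeholders rather than quotes from any address, and the function names are hypothetical. A real analysis would swap in a published sentiment lexicon and the actual address texts.

```python
import re
from collections import defaultdict

# Toy word lists, invented for illustration only.
# A real analysis would use a published sentiment lexicon.
POSITIVE = {"good", "great", "strong", "hope", "peace", "prosper"}
NEGATIVE = {"bad", "weak", "fear", "war", "crisis", "threat"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text):
    """Return (positive - negative) / total tokens, or 0.0 for empty text."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)


def mean_score_by_speaker(addresses):
    """Average the per-address score for each speaker.

    `addresses` is a list of (speaker, text) pairs.
    """
    scores = defaultdict(list)
    for speaker, text in addresses:
        scores[speaker].append(sentiment_score(text))
    return {s: sum(v) / len(v) for s, v in scores.items()}


# Placeholder sentences, not taken from the dataset.
sample = [
    ("Speaker A", "We have great hope and a strong economy"),
    ("Speaker B", "We face a crisis and the threat of war"),
]
print(mean_score_by_speaker(sample))
```

Dividing by the token count keeps long and short addresses comparable. Grouping the same scores by party instead of by speaker only requires changing the first element of each pair.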