Introduction to Text Analysis with Python and the Natural Language ToolKit (NLTK)
Michelle A. McSweeney, PhD
Rachel Rakov
Digital technologies have made vast amounts of text available to researchers, and this same technological moment has provided us with the capacity to analyze that text. The first step in that analysis is to transform texts designed for human consumption into a form a computer can analyze as well. Using Python and the Natural Language ToolKit package (commonly called NLTK), this workshop introduces strategies to turn qualitative texts into quantitative objects. Through that process, we will present a variety of strategies for simple analysis of text-based data.
By the end of this workshop, you will be able to:
* Identify strategies for transforming texts into numbers
* Explain what a concordance is, how to find one, and why it matters
* Compare frequency distributions of words in a text to quantify the narrative arc
* Explain what stop words are and why they are often removed
* Remove stop words in a variety of languages
* Use Part-of-Speech tagging to gather insights about a text
* Transform any document that you have (or have access to) in a .txt format into a text that can be analyzed computationally
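
To give a rough sense of what "transforming texts into numbers" can look like, here is a minimal sketch that uses only the Python standard library rather than NLTK itself. The sample text, the stop-word list, and the helper names (tokenize, word_frequencies, concordance) are all hypothetical and chosen for illustration. NLTK offers fuller versions of these ideas (for example nltk.FreqDist and the concordance method of nltk.Text), and the workshop's own materials may approach them differently.

```python
import re
from collections import Counter

# Hypothetical sample text and stop-word list, for illustration only.
SAMPLE = (
    "The whale surfaced near the ship. The crew watched the whale, "
    "and the whale watched the crew."
)
STOP_WORDS = {"the", "and", "a", "of", "near"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(tokens, stop_words=frozenset()):
    """Count tokens, skipping any that appear in stop_words."""
    return Counter(t for t in tokens if t not in stop_words)


def concordance(tokens, target, width=2):
    """Return each occurrence of target with `width` tokens of context per side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == target:
            start = max(0, i - width)
            lines.append(" ".join(tokens[start:i + width + 1]))
    return lines


tokens = tokenize(SAMPLE)
freqs = word_frequencies(tokens, STOP_WORDS)

# The most frequent words once stop words are removed.
print(freqs.most_common(3))

# Every occurrence of "whale" shown in its immediate context.
for line in concordance(tokens, "whale"):
    print(line)
```

To try this on a document of your own, one option is to read a .txt file into a string (for example with open("myfile.txt", encoding="utf-8").read(), where the file name is a placeholder) and pass that string to tokenize in place of SAMPLE.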