Note: This is my first project! 😊
This repository contains a Python toolkit for text analysis. It provides various functions and tools for processing and analyzing text data, including:
- Tokenization and sentence segmentation
- Text cleaning (lowercasing, stop word removal, punctuation removal)
- Lemmatization
- Named Entity Recognition (NER)
- Sentiment Analysis
- Frequency Analysis
- Topic Modeling
- Dispersion Plots
- Word Cloud Generation
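The cleaning and frequency-analysis steps above can be sketched in plain Python. This is a minimal, dependency-free illustration (the tiny stop-word list here is just for the example; the toolkit itself would use a fuller list, e.g. from NLTK):

```python
import re
from collections import Counter

# Tiny illustrative stop-word list, not the one the toolkit uses.
STOP_WORDS = {"the", "a", "an", "and", "of", "in", "to", "is", "over"}

def clean_and_tokenize(text):
    """Lowercase, strip punctuation, split into word tokens, drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def frequency_analysis(tokens, top_n=3):
    """Return the top_n most common tokens with their counts."""
    return Counter(tokens).most_common(top_n)

sample = "The quick brown fox jumps over the lazy dog. The dog sleeps."
tokens = clean_and_tokenize(sample)
print(frequency_analysis(tokens))  # → [('dog', 2), ('quick', 1), ('brown', 1)]
```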
To use this toolkit, follow these steps:
1. Clone the repository to your local machine:
   git clone https://github.com/your-username/text-analysis-toolkit.git
2. Install the required Python packages by running:
   pip install -r requirements.txt
3. Place your text files (e.g., "Syrian.txt" and "MyCountry.txt") in the root directory of the project.
4. Modify the main() function in text_analysis.py to process your specific text data and analysis tasks.
5. Run the script with the input file paths as arguments:
   python text_analysis.py Syrian.txt MyCountry.txt
6. Explore the results, including charts, dispersion plots, and word clouds generated by the toolkit.
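If you are adapting main() for your own files, a minimal sketch of what the command-line entry point might look like is shown below (the actual analysis steps in text_analysis.py will differ; only read_book is a real toolkit function, and its exact signature is an assumption here):

```python
import sys

def read_book(file_path):
    """Load the full text of a file (UTF-8 assumed)."""
    with open(file_path, encoding="utf-8") as f:
        return f.read()

def main(file_paths):
    """Process each input file in turn; real analysis calls go here."""
    for path in file_paths:
        text = read_book(path)
        # Placeholder: print a word count instead of the full analysis.
        print(f"{path}: {len(text.split())} words")

if __name__ == "__main__":
    main(sys.argv[1:])  # e.g. python text_analysis.py Syrian.txt MyCountry.txt
```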
Here's a brief overview of how to use the functions provided by the toolkit:
1. 📕 read_book(file_path): Read and load your text data from the specified file.
2. 📈 perform_sentiment_analysis(): Analyze the sentiment of your text.
3. 👪 find_most_common_names(): Find the most common person names in the text.
4. 📊 create_name_frequency_chart(): Create bar charts of the most common names.
5. 📈 create_dispersion_plot(): Generate dispersion plots for specific words.
6. 📄 perform_topic_modeling_on_real_words(): Perform topic modeling on your text data.
7. ☁️ create_word_cloud(): Create word clouds to visualize word frequency.
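To give a feel for how name-finding can work, here is a deliberately naive sketch in the spirit of find_most_common_names(). It counts capitalized, non-sentence-initial words as candidate names; the toolkit's real NER step would use a proper model (e.g. from NLTK or spaCy) instead:

```python
import re
from collections import Counter

def find_most_common_names(text, top_n=3):
    """Naive stand-in for NER: treat capitalized words that do not start
    a sentence as candidate names. A real implementation would use an
    NER model rather than this heuristic."""
    names = []
    for sentence in re.split(r"[.!?]+\s*", text):
        words = sentence.split()
        for word in words[1:]:  # skip the sentence-initial word
            if word and word[0].isupper() and word[1:].islower():
                names.append(word.strip(",;:"))
    return Counter(names).most_common(top_n)

sample = "Alice met Bob in Paris. Later Alice wrote to Bob and Carol."
print(find_most_common_names(sample))  # → [('Bob', 2), ('Paris', 1), ('Alice', 1)]
```

Note the heuristic also picks up place names like "Paris", which is exactly the kind of false positive a real NER model avoids.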
If you'd like to contribute to this project or have suggestions for improvement, please don't hesitate to reach out! I'm learning as I go and appreciate your input.
This project is licensed under the MIT License - see the LICENSE file for details.
Copyright (c) 2023 Cristina Matacuta