ProText-Analyzer v1.0.0
Pre-release
📖 Overview
ProText-Analyzer extracts textual data from online articles and analyzes it. The project focuses on sentiment analysis and readability assessment, reporting scores that describe the tone and complexity of each article.
🎯 Objectives
- Extract article content from a list of URLs.
- Perform sentiment and readability analysis.
- Present results in a structured format for easy interpretation.
🚀 Features
- Data Extraction:
  - Retrieves article titles and bodies from specified URLs.
  - Saves content in organized text files for further analysis.
- Text Analysis:
  - Sentiment Analysis: Calculates positive, negative, polarity, and subjectivity scores using TextBlob.
  - Readability Metrics: Evaluates average sentence length and percentage of complex words, and computes the FOG index.
  - Word-Level Metrics: Measures total word count, syllable count per word, and average word length, and identifies personal pronouns.
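The readability and word-level metrics listed above can be sketched as follows. This is a minimal illustration, not the project's actual implementation: it assumes the standard Gunning formula, FOG = 0.4 × (average sentence length + percentage of complex words), treats a word as complex when it has three or more syllables, and uses a rough vowel-group heuristic in place of syllapy. The names `count_syllables`, `readability`, and `PERSONAL_PRONOUNS` are hypothetical.

```python
import re

# Hypothetical pronoun list for illustration; the project's own list may differ.
PERSONAL_PRONOUNS = {
    "i", "me", "my", "mine", "we", "us", "our", "ours",
    "you", "your", "yours", "he", "him", "his", "she", "her", "hers",
    "it", "its", "they", "them", "their", "theirs",
}


def count_syllables(word: str) -> int:
    """Approximate syllables by counting vowel groups (a heuristic, not syllapy)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Discount a trailing silent "e" ("make"), but keep "-le" endings ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> dict:
    """Compute readability and word-level metrics for a block of English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = [count_syllables(w) for w in words]
    complex_count = sum(1 for s in syllables if s >= 3)  # assumed threshold
    avg_sentence_length = len(words) / len(sentences)
    pct_complex = 100 * complex_count / len(words)

    return {
        "word_count": len(words),
        "complex_word_count": complex_count,
        "avg_sentence_length": avg_sentence_length,
        "pct_complex_words": pct_complex,
        "fog_index": 0.4 * (avg_sentence_length + pct_complex),
        "syllables_per_word": sum(syllables) / len(words),
        "avg_word_length": sum(len(w) for w in words) / len(words),
        "personal_pronouns": sum(1 for w in words if w.lower() in PERSONAL_PRONOUNS),
    }
```

Calling `readability(article_text)` on the text of one extracted article returns a dictionary whose keys map onto the readability columns of the output. The vowel-group heuristic will miscount some words, so a dictionary-based counter is likely the better choice where accuracy matters.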
🛠️ Technologies Used
- Programming Language: Python
- Libraries:
  - requests: for fetching HTML content
  - beautifulsoup4: for parsing HTML
  - textblob: for sentiment analysis
  - spacy: for advanced text processing
  - syllapy: for counting syllables
  - pandas: for data manipulation and analysis
📥 Installation
To get started with ProText-Analyzer, follow these steps:
- Clone the repository:
    git clone https://github.com/rubydamodar/ProText-Analyzer.git
    cd ProText-Analyzer
- Install the required libraries:
    pip install requests beautifulsoup4 textblob spacy syllapy pandas
📄 Usage
- Prepare your URLs in the Input.xlsx file.
- Run the data extraction script:
    python dataextraction.py
- Analyze the extracted text using the provided analysis functions.
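To make the extraction step concrete, here is a rough sketch of pulling a title and body out of a page's HTML and saving them to a text file. It does not reproduce dataextraction.py: to stay dependency-free it uses Python's standard-library `html.parser` as a stand-in for beautifulsoup4, and it assumes the title sits in the first `<h1>` and the body in `<p>` elements, which will not hold for every site. `ArticleParser`, `extract_article`, and `save_article` are hypothetical names.

```python
from html.parser import HTMLParser
from pathlib import Path


class ArticleParser(HTMLParser):
    """Collect the text of <h1> and <p> elements (assumed article layout)."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self.paragraphs = []
        self._current = None  # tag currently being captured, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "p") and self._current is None:
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            # Collapse runs of whitespace left over from the markup.
            text = " ".join("".join(self._buffer).split())
            if text:
                target = self.titles if tag == "h1" else self.paragraphs
                target.append(text)
            self._current = None


def extract_article(html: str) -> tuple[str, str]:
    """Return (title, body) for one page of HTML."""
    parser = ArticleParser()
    parser.feed(html)
    parser.close()
    title = parser.titles[0] if parser.titles else ""
    return title, "\n".join(parser.paragraphs)


def save_article(out_dir, url_id: str, title: str, body: str) -> Path:
    """Write one article to <out_dir>/<url_id>.txt and return the path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{url_id}.txt"
    path.write_text(f"{title}\n\n{body}\n", encoding="utf-8")
    return path
```

In practice the HTML string would come from something like `requests.get(url).text`, called once per URL read from Input.xlsx. The sketch relies on explicit closing tags, so pages with unclosed `<p>` elements would need a more forgiving parser such as beautifulsoup4.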
📝 Output Structure
The results are saved in a structured format (CSV or Excel) containing the following variables:
- Positive Score
- Negative Score
- Polarity Score
- Subjectivity Score
- Average Sentence Length
- Percentage of Complex Words
- FOG Index
- Average Number of Words Per Sentence
- Complex Word Count
- Word Count
- Syllable Count Per Word
- Personal Pronouns
- Average Word Length
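As an illustration of the output structure, the variables above can be written as one row per article using Python's built-in `csv` module. This is a sketch under assumptions: the column names are copied from the list above, but their order, the `write_results` helper, and the choice of `csv` rather than pandas are illustrative and not taken from the project.

```python
import csv

# Column names copied from the list above; the order is an assumption.
COLUMNS = [
    "Positive Score",
    "Negative Score",
    "Polarity Score",
    "Subjectivity Score",
    "Average Sentence Length",
    "Percentage of Complex Words",
    "FOG Index",
    "Average Number of Words Per Sentence",
    "Complex Word Count",
    "Word Count",
    "Syllable Count Per Word",
    "Personal Pronouns",
    "Average Word Length",
]


def write_results(rows, fileobj) -> None:
    """Write a header plus one CSV row per article.

    Each row is a dict keyed by column name; missing columns are left blank,
    and unknown keys raise ValueError (the csv.DictWriter default).
    """
    writer = csv.DictWriter(fileobj, fieldnames=COLUMNS)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
```

A typical call would be `with open("output.csv", "w", newline="", encoding="utf-8") as f: write_results(rows, f)`, where `output.csv` is a placeholder file name.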
💡 Future Enhancements
- Implement advanced NLP techniques for improved sentiment analysis.
- Extend support for multiple languages.
- Enhance user interface and error handling.
🤝 Acknowledgments
- Thank you to all contributors and libraries that made this project possible.
📧 Contact
For any inquiries or collaborations, feel free to reach out at [email protected].