Basic Named Entity Recognition and Word Frequency without the use of NLP libraries
This project analyzes a text file, such as a book or a single chapter, and reports the longest sentence, the longest word, the named entities, and the 10 most frequently used words.
- Installation
- Usage
- Features
- Contributing
- License
- Clone the repository:
git clone https://github.com/cristinamatacuta/ManualTextProcessing
python processing.py input_file_path
Replace input_file_path with the path to the text file you want to analyze.
- 📚 Longest Sentence: Find and display the longest sentence in the text.
- 📗 Longest Word: Identify and display the longest word in the text.
- 🌆 Named Entities A-L and M-Z: Extract named entities, excluding stop words, and list them in two alphabetical groups: A to L and M to Z.
- 📊 Top 10 Most Used Words: Display the top 10 most frequently used words in the text.
📌 Note: The file containing stop words is included in the repository.
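The features above can be approximated with the Python standard library alone. The following is a minimal sketch, not the code in processing.py: the function names, the tiny STOP_WORDS set, and the heuristics (sentences end at '.', '!' or '?'; a named entity is a capitalized word that is not the first word of its sentence; stop words are also dropped from the frequency count) are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word set for illustration only; the repository ships
# its own stop-word file, which would normally be loaded instead.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "it", "was", "he", "she"}


def split_sentences(text):
    """Naive split: a sentence ends at '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(text):
    """Return ASCII alphabetic tokens, keeping internal apostrophes (e.g. don't)."""
    return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", text)


def longest_sentence(text):
    """Sentence with the most word tokens (first one wins on ties)."""
    return max(split_sentences(text), key=lambda s: len(tokenize(s)))


def longest_word(text):
    """Token with the most characters (first one wins on ties)."""
    return max(tokenize(text), key=len)


def top_words(text, n=10):
    """The n most frequent lowercase words, ignoring stop words (an assumption)."""
    words = (w.lower() for w in tokenize(text))
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)


def named_entities(text):
    """Capitalized, non-sentence-initial, non-stop words, split into A-L and M-Z."""
    entities = set()
    for sentence in split_sentences(text):
        # Skip the first token: it is capitalized whether or not it is a name.
        for token in tokenize(sentence)[1:]:
            if token[0].isupper() and token.lower() not in STOP_WORDS:
                entities.add(token)
    a_to_l = sorted(e for e in entities if e[0].upper() <= "L")
    m_to_z = sorted(e for e in entities if e[0].upper() > "L")
    return a_to_l, m_to_z
```

For the text "Alice met Bob in Paris. Then Alice and Bob walked to the old market near Notre Dame! Was it far?", named_entities returns (['Alice', 'Bob', 'Dame'], ['Notre', 'Paris']). Skipping sentence-initial words avoids treating "Then" as an entity, at the cost of missing names that only ever appear at the start of a sentence.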
If you'd like to contribute to this project or have suggestions for improvement, please don't hesitate to reach out! I'm learning as I go and appreciate your input.
This project is licensed under the MIT License - see the LICENSE file for details.
Copyright (c) 2023 Cristina Matacuta