This script performs text processing tasks, including sentence segmentation, word tokenization, and analysis of sentence lengths and longest words.
To use this script, run it from the command line with the path to your text file as an argument:
python script.py path/to/your/text.txt
Segments the text into sentences, splitting wherever a full stop is followed by a space.
Tokenizes words, removing punctuation for analysis.
Ranks sentences by length and prints the ordinal positions (their places in the original text) of the six longest.
Identifies and prints the five longest unique words in the text.
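The steps listed above can be sketched roughly as follows. This is a minimal illustration, not the script's actual source: the function names are hypothetical, sentence length is assumed to be measured in words, words are assumed to be compared case-insensitively, and ties are broken by original order (sentences) or alphabetically (words).

```python
import re


def split_sentences(text):
    """Split text after each full stop that is followed by whitespace."""
    parts = re.split(r"(?<=\.)\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Return the words in a sentence; \\w+ skips punctuation."""
    return re.findall(r"\w+", sentence)


def longest_sentence_positions(sentences, n=6):
    """Return 1-based positions of the n longest sentences.

    Assumption: length is the number of words, not characters.
    """
    order = sorted(
        range(len(sentences)),
        key=lambda i: -len(tokenize(sentences[i])),
    )
    return [i + 1 for i in order[:n]]


def longest_unique_words(text, n=5):
    """Return the n longest distinct words.

    Assumption: words are lowercased before de-duplication,
    and ties in length are broken alphabetically.
    """
    unique = {w.lower() for w in tokenize(text)}
    return sorted(unique, key=lambda w: (-len(w), w))[:n]
```

For example, calling split_sentences on the contents of the input file and passing the result to longest_sentence_positions would give the positions to print, while longest_unique_words can be called on the full text directly.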
This script is written in Python and uses the built-in re module for regular expressions.
If you'd like to contribute or improve the script, please follow the standard GitHub Fork and Pull Request workflow.
This project is licensed under the MIT License - see the LICENSE file for details.