This Python script demonstrates text summarization using the Natural Language Toolkit (NLTK). It generates a summary of input text by selecting the top-ranked sentences based on word frequency.
- Python 3.x
- NLTK library
Install NLTK using pip if you haven't already:
pip install nltk
- Run the script in a Python environment.
- When prompted, enter the text you want to summarize.
- The script generates a summary of the entered text and prints it.
- Input Text: The script prompts the user to input the text they want to summarize.
- Preprocessing: The input text is split into sentences using NLTK's `sent_tokenize` function. Each sentence is then processed to remove punctuation and stopwords and to stem the remaining words.
- Word Frequency: Word frequencies across the input text are counted using NLTK's `FreqDist` class.
- Sentence Scoring: Each sentence is scored as the sum of the frequencies of the words it contains.
- Summary Generation: The top-ranked sentences are selected to form the summary. By default, the script keeps the highest-scoring third of the sentences.
- Output: The script prints the generated summary.
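
The steps above can be sketched roughly as follows. This is a minimal illustration, not the script itself: the names `preprocess`, `summarize`, and `fraction` are hypothetical, and keeping the selected sentences in their original order is an assumption. Note that `sent_tokenize` and the stopword list rely on NLTK data packages (the Punkt tokenizer models and the `stopwords` corpus), which are usually fetched with `nltk.download`. So that the sketch still runs where NLTK or its data is missing, it falls back to a simple regex splitter, a tiny stopword set, and no stemming.

```python
import re
import string

try:
    from nltk import FreqDist
    from nltk.corpus import stopwords
    from nltk.stem import PorterStemmer
    from nltk.tokenize import sent_tokenize

    # Both calls raise LookupError if the NLTK data packages are not installed.
    STOPWORDS = set(stopwords.words("english"))
    sent_tokenize("Probe sentence. Another one.")
    stem = PorterStemmer().stem
except (ImportError, LookupError):
    # Fallback (an assumption for this sketch, not part of the original script).
    # FreqDist subclasses collections.Counter, so Counter is a fair stand-in.
    from collections import Counter as FreqDist

    STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it"}

    def stem(word):
        return word  # no stemming without NLTK

    def sent_tokenize(text):
        return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def preprocess(sentence):
    """Lowercase, strip punctuation, drop stopwords, and stem what is left."""
    table = str.maketrans("", "", string.punctuation)
    words = sentence.lower().translate(table).split()
    return [stem(w) for w in words if w not in STOPWORDS]


def summarize(text, fraction=1 / 3):
    """Return the highest-scoring `fraction` of sentences, in original order."""
    sentences = sent_tokenize(text)
    if not sentences:
        return ""
    processed = [preprocess(s) for s in sentences]
    # Word frequencies over the whole (preprocessed) text.
    freq = FreqDist(w for words in processed for w in words)
    # A sentence's score is the sum of the frequencies of its words.
    scores = [sum(freq[w] for w in words) for words in processed]
    keep = max(1, int(len(sentences) * fraction))
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return " ".join(sentences[i] for i in sorted(ranked[:keep]))


if __name__ == "__main__":
    print(summarize(input("Enter the text to summarize: ")))
```

Because a sentence's score is a plain sum, longer sentences tend to score higher; dividing by the sentence length is a common variation if that bias is unwanted.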