A command line application for fast and efficient spell checking, synonym suggestions and text metrics.

Text Inspector

A quick and lightweight text analysis tool

Text Inspector is a quick, easy-to-use command line tool for analyzing English-language text, written in Python using NLTK and pyspellchecker. It provides spell checking, synonym suggestions and text metrics, and can process plain text files or read from user input.

It aims to provide a quick and lightweight command line alternative to more comprehensive tools. The application targets writers who want to quickly gain a deeper understanding of a text without getting distracted by flashy user interfaces or browser extensions.

Text Inspector includes an import/export feature, which allows for storing and recovering texts from previous sessions.

Text Inspector mockups

Features

Existing features

Import texts from storage

  • On starting the application, you can decide if you want to import texts from the database. If you have previously used Text Inspector and exported your texts, you can enter your recovery key to restore your texts. The texts will then be available from the text selection menu.
  • For demonstration purposes, you can enter examples as the recovery key, which will import some example texts from the database.

Importing texts

Text selection

  • From the text selection menu, you can select a text, either by loading it from storage or by creating a new text. The option to load a text will only be available if you have already created a new text item or if you have imported texts from the database.
  • When you decide to load an existing text, you can preview the available texts before selecting one. You can also delete texts you don't need anymore from this menu.

Text selection

Text creation

  • When you create a new text item, you will be asked to provide a title for the text. Next, you can choose to enter the text from the command line or to provide a text file.
  • If you decide to enter text from the command line, you can paste the text or enter it manually. To save your input, enter Done! on a new line and press Enter. (Alternatively, you can try pressing Ctrl-D (or Ctrl-Z on Windows) on a new line.)
  • Providing a text file only works if you are running Text Inspector locally on your machine.
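The input handling described above can be sketched roughly as follows. This is a minimal illustration, not the code from run.py: the names SENTINEL, stdin_lines and read_text are hypothetical, and the line source is passed in as an iterable so the logic can be tried without a terminal.

```python
from typing import Iterable, Iterator

SENTINEL = "Done!"  # end marker described above


def stdin_lines() -> Iterator[str]:
    """Yield lines typed by the user until end-of-file (Ctrl-D or Ctrl-Z)."""
    while True:
        try:
            yield input()
        except EOFError:
            return


def read_text(lines: Iterable[str]) -> str:
    """Collect lines until the sentinel appears on a line of its own."""
    collected = []
    for line in lines:
        if line.strip() == SENTINEL:
            break
        collected.append(line)
    return "\n".join(collected)


# Interactive use would look like: text = read_text(stdin_lines())
sample = read_text(["First line", "Second line", "Done!", "ignored"])
print(sample)
```

Accepting both the sentinel and end-of-file means the same function works in terminals where Ctrl-D is not passed through.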

User input

Text processing

  • Once you have created or selected a text, you can select one of four options:
    • Spell check: This will check your text for spelling errors and display suggestions for each mistake found. You can decide to accept a correction, provide a custom suggestion, or skip to the next mistake.
    • Suggest synonyms: This will check the text for repeatedly used words and suggest synonyms for each word. This feature is meant to provide insight into frequently used words in the text and does not provide the option to replace the original words with the suggested synonyms (may be added in the future). You will have to do that yourself using your favorite text editor.
    • Text metrics: This will display metrics for the selected text:
      • Total word count
      • Unique word count
      • Sentence count
      • Longest/shortest sentence
      • Average words per sentence
      • Frequently used words (lemmatized and very common words not included)
    • Save text: This will save changes made to the text and return to the text selection menu.
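As a rough illustration of the text metrics listed above, here is a standard-library sketch. It is an assumption-laden stand-in, not the tool's implementation: Text Inspector uses NLTK for tokenizing and lemmatizing, whereas this version uses naive regular expressions, a tiny hypothetical STOP_WORDS set, and no lemmatization.

```python
import re
from collections import Counter

# Tiny stand-in stop word list (assumption); the real tool relies on NLTK data.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "is", "it", "in"}
WORD_RE = re.compile(r"[A-Za-z']+")


def tokenize(text):
    """Return lowercase word tokens using a naive regular expression."""
    return WORD_RE.findall(text.lower())


def text_metrics(text, top_n=3):
    """Compute simple metrics for a text (no lemmatization in this sketch)."""
    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    words = tokenize(text)
    frequent = Counter(w for w in words if w not in STOP_WORDS)

    def word_len(sentence):
        return len(tokenize(sentence))

    return {
        "total_words": len(words),
        "unique_words": len(set(words)),
        "sentence_count": len(sentences),
        "longest_sentence": max(sentences, key=word_len, default=""),
        "shortest_sentence": min(sentences, key=word_len, default=""),
        "avg_words_per_sentence": len(words) / len(sentences) if sentences else 0.0,
        "frequent_words": frequent.most_common(top_n),
    }


metrics = text_metrics("The cat sat. The cat ran away quickly! Is it a cat?")
print(metrics["total_words"], metrics["sentence_count"])
```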

Text processing

Exporting texts

When you exit the application from the text selection menu, you can decide whether to store your text items in the database. If you do, you will be given a recovery key, which you can use to restore your saved texts on your next visit.

The current version of Text Inspector uses Google Sheets to store the text items, and your texts will be stored in plain text. Please make sure your exported texts do not contain any sensitive information!
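The export and import round trip can be pictured with the sketch below. Everything in it is an assumption made for illustration: an in-memory dictionary stands in for the Google Sheets storage, the function names are hypothetical, and the recovery key format (a short random hex string) is not taken from the project.

```python
import secrets

# In-memory stand-in for the remote storage (the real tool uses Google Sheets).
storage = {}


def export_texts(texts):
    """Store a copy of the texts under a fresh random key and return the key."""
    key = secrets.token_hex(4)  # key format is an assumption
    storage[key] = dict(texts)
    return key


def import_texts(key):
    """Return the texts stored under the key, or an empty dict if unknown."""
    return dict(storage.get(key.strip(), {}))


recovery_key = export_texts({"My essay": "Some plain text."})
restored = import_texts(recovery_key)
print(restored)
```

Because the texts are stored as plain text, the key only identifies them; it does not protect their contents.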

Exporting texts

Future features

  • Read input from URL: Let the user provide a URL to a text file as an alternative to command line input or reading a local file.
  • User dictionary: Let the user add words to a custom dictionary, serving as a white-list for the spell check feature.
  • Let the user accept or reject synonym suggestions.
  • Add a readability score to the text metrics.
  • Add support for languages other than English.

Design

Data model

The application is based on a class as the primary data model. For each text item created by the user, the application creates an instance of the Text class, which stores the title and the text contents as instance attributes. Furthermore, the class provides the central functionality of the application by supplying methods for spell checking, synonym suggestion and text metrics.
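A minimal sketch of such a data model might look like the following. Only the class name Text and the idea of storing a title and the text contents come from the description above; the attribute names and the two small methods are hypothetical placeholders for the real spell check, synonym and metrics methods.

```python
from dataclasses import dataclass


@dataclass
class Text:
    """A text item with a title and its contents (attribute names assumed)."""

    title: str
    content: str

    def word_count(self) -> int:
        """Count whitespace-separated words in the text."""
        return len(self.content.split())

    def preview(self, length: int = 40) -> str:
        """Return a shortened version of the text, e.g. for a selection menu."""
        if len(self.content) <= length:
            return self.content
        return self.content[:length].rstrip() + "..."


item = Text("Demo", "one two three")
print(item.title, item.word_count(), item.preview(7))
```

Keeping the processing methods on the class means each menu action can operate on the currently selected instance.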

Workflow

The application has a basic workflow with two main menus:

  • Text selection menu: The text selection menu allows the user to create new text items and manage existing texts.
  • Text processing menu: The text processing menu allows the user to perform different tasks on the currently selected text, such as spell checking, synonym suggestion and displaying text metrics.

Flowchart of application workflow

Flowchart of the initial project scope

Security and privacy

  • All user input is validated to make sure that it is in the expected format and doesn't contain unexpected characters or values. After validation, user input is only processed as a string to prevent security issues like code injection.
  • Before exporting texts to the database, users will see a warning, telling them that the texts should not contain any sensitive information, since they will be stored in plain text. The Google spreadsheet used as text storage is not publicly accessible.
  • The credentials for the Google Drive API are not included in the repository. An example credentials file has been included for reference.
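The kind of input validation described in the first point can be sketched as below, using a numbered menu as the example. The function name and its return convention are assumptions for illustration, not the checks used in run.py.

```python
from typing import Optional


def parse_menu_choice(raw: str, option_count: int) -> Optional[int]:
    """Return the chosen option number, or None if the input is invalid."""
    cleaned = raw.strip()
    # Reject empty input and anything that is not a plain ASCII digit string.
    if not (cleaned.isascii() and cleaned.isdigit()):
        return None
    choice = int(cleaned)
    # Reject numbers outside the range of available options.
    return choice if 1 <= choice <= option_count else None


print(parse_menu_choice("2", 4))
print(parse_menu_choice("seven", 4))
```

The input is only ever compared and converted, never evaluated, so unexpected characters simply lead to a rejected choice.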

Installation

  • Clone the repository: git clone https://github.com/nacht-falter/text-inspector.git
  • For a local installation, you will only need three files:
    • run.py → The main application file
    • requirements.txt → A list of dependencies
    • creds.json → Google Drive API credentials (not included in the repository, you need to create it yourself)
  • The remaining files are only necessary for deployment to Heroku

Installing dependencies

  • Enter the folder: cd text-inspector and run: pip install -r requirements.txt to install all dependencies.
  • Then download the required NLTK modules:
     python3
     >>> import nltk
     >>> nltk.download(["punkt", "wordnet", "stopwords"])
    
  • Follow these instructions if you get an error message like:
     [nltk_data]     CERTIFICATE_VERIFY_FAILED] certificate verify failed:
     [nltk_data]     unable to get local issuer certificate (_ssl.c:997)>
    

Google API credentials

That's it!

You can now run the application: python3 run.py

Technologies Used

Languages

Libraries and other software

External Python libraries

The application uses the following external Python libraries:

Git

  • Git is used for version control: changes are committed with Git and pushed to GitHub from the command line.

GitHub

Lucidchart

  • Lucidchart was used to create the flowcharts for the application workflow.

Google Sheets and Google Drive

Am I Responsive

  • Am I Responsive Mockup Generator was used to create the mockup image in this README.

regex101

  • regex101 was used to build and test regular expressions.

Testing

PEP 8 Linter

The code in run.py passes through the Code Institute Python linter with no issues.

Test result.

Manual testing

  • All features of the application were thoroughly tested to ensure that they work as expected.
  • All user input validations were tested by giving invalid values, such as empty strings, out-of-bounds values or wrong data types.
  • The code has been tested in a local terminal on macOS and in the Code Institute Heroku terminal.
  • All user stories have been tested: User story test results

Bugs

  • Bug: Terminating command line input by pressing Ctrl-D (Ctrl-Z on Windows) does not work in the Heroku terminal.
    Fix: The process of reading lines is now terminated by typing Done! on a new line. In a local terminal it is still possible to use Ctrl-D or Ctrl-Z.
  • Bug: The spell checking function terminated early before the end of the text was reached, when iterating over the list of misspelled words.
    Fix: Use a list comprehension instead of a set to return misspelled words from spellchecker to ensure that the number of misspelled words and the number of matching words always match.
  • Bug: The export function was causing a NameError because it was checking for a variable, which is not always defined.
    Fix: Use a try-except statement to catch the NameError.
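The spell check fix can be illustrated with a small sketch. It does not call pyspellchecker; a hypothetical KNOWN_WORDS set stands in for the dictionary, purely to show why a set and a list comprehension give different counts when a misspelling occurs more than once.

```python
# Stand-in dictionary (assumption); the real tool queries pyspellchecker.
KNOWN_WORDS = {"this", "is", "a", "test", "of", "the", "checker"}

words = "this is a tset of the checker tset".split()

# A set collapses repeated misspellings into a single entry...
unique_misspelled = {w for w in words if w not in KNOWN_WORDS}

# ...while a list comprehension keeps one entry per occurrence, in text order.
all_misspelled = [w for w in words if w not in KNOWN_WORDS]

print(len(unique_misspelled), len(all_misspelled))
```

With one entry per occurrence, a loop over the misspelled words lines up with every matching position in the text.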

Deployment

  • The project was initially deployed to Heroku using the Code Institute mock terminal template. This live demo is no longer available.
  • To install NLTK on Heroku, a file named nltk.txt listing all NLTK modules to be installed needs to be present in the root directory of the repository.
  • The necessary steps to deploy the project are:
    • Clone or fork the repository.
    • Create a new app from the Heroku dashboard.
    • Go to the Settings tab and click on Reveal Config Vars in the Config Vars section.
    • Add a config var named CREDS and paste the contents of your creds.json file into the value field.
    • Add another config var named PORT with a value of 8000.
    • Add Python and NodeJS to the Buildpacks section (in that order).
    • Click on the Deploy tab and connect the Heroku app to the GitHub repository.
    • Choose the branch you want to deploy in the Manual deploy section and click on Deploy Branch.

Credits

The following resources were used in the project:

Code

Example texts

The example texts in the files example1.txt and example2.md and in the Google Sheets database were found at:

Acknowledgements

  • I would like to thank my Code Institute mentor Can for his continued support and helpful advice.
  • I would like to thank the Code Institute tutors for their support.
  • I would like to thank all friends and family members who have tested the application for their helpful feedback and suggestions.
