Google Translate for Images. Built for the CS1430 Final Project using Python. The poster attached at the bottom of the README includes more details.

jahaskell53/cv-finalproject

Computer Vision Final Project

Computer Vision (CS1430) Final Project by Jakobi Haskell, Anh Duong, Ayman Benjelloun Touimi & Adam Mroueh. Full Colab notebook here: https://colab.research.google.com/drive/1tCy18ThUYPCqvCGPS7Sx6-C2hfNiaveT?authuser=1#scrollTo=yYxtkRxM_Hdn

Description

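The project is a miniature version of Google Translate for images: it detects the text in an image, translates it, and overlays the translation on the original image.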

We wrote our own scripts to generate and prepare the data, including generating masks, so that it conforms to the COCO format. The dataset consists of thousands of images of lowercase alphabetical characters with randomly chosen sizes, colors, and fonts. Example: [image: Unknown] [image: Unknown-4]
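As a rough illustration of what a COCO-style annotation for one generated character could look like, here is a minimal sketch. It is not the project's actual script: the function names, the category ids (1 to 26 for a to z), and the choice of uncompressed RLE for the mask are all assumptions made for this example.

```python
import string

# Assumption: category ids 1..26 map to 'a'..'z'; the project's real ids may differ.
CATEGORIES = [{"id": i + 1, "name": ch} for i, ch in enumerate(string.ascii_lowercase)]


def mask_to_rle(mask):
    """Encode a binary mask (list of rows) as COCO uncompressed RLE.

    Runs are counted in column-major order, and the first count is always
    the number of leading zeros (which may be 0).
    """
    height, width = len(mask), len(mask[0])
    counts, run, current = [], 0, 0
    for x in range(width):
        for y in range(height):
            pixel = 1 if mask[y][x] else 0
            if pixel != current:
                counts.append(run)
                run, current = 0, pixel
            run += 1
    counts.append(run)
    return {"size": [height, width], "counts": counts}


def make_annotation(ann_id, image_id, char, mask):
    """Build one COCO-style annotation dict for a single character mask."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    x0, y0 = min(xs), min(ys)
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": string.ascii_lowercase.index(char) + 1,
        "bbox": [x0, y0, max(xs) - x0 + 1, max(ys) - y0 + 1],  # [x, y, width, height]
        "area": sum(1 for row in mask for p in row if p),
        "segmentation": mask_to_rle(mask),
        "iscrowd": 0,
    }


def make_dataset(images, annotations):
    """Assemble the top-level COCO dictionary."""
    return {"images": images, "annotations": annotations, "categories": CATEGORIES}


# A tiny hand-made 3x4 mask standing in for a rendered letter "c".
tiny_mask = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
]
example = make_annotation(ann_id=1, image_id=1, char="c", mask=tiny_mask)
```

In a real generator the mask would come from rendering the character with its random font, size, and color; the hand-made mask here only keeps the sketch free of dependencies.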

We then trained Mask-RCNN on character detection & classification:

[image: Unknown-2] [image: Unknown-3]
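To make the detection and classification step more concrete, the sketch below shows one way the raw predictions of such a model could be turned into a clean list of characters. It assumes the predictions arrive as parallel lists of boxes (`[x0, y0, x1, y1]`), integer labels (1 to 26 for a to z), and confidence scores; that layout, the thresholds, and the function names are assumptions for illustration, not the project's code.

```python
import string


def iou(a, b):
    """Intersection over union of two [x0, y0, x1, y1] boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def decode_detections(boxes, labels, scores, min_score=0.5, max_iou=0.5):
    """Keep confident, non-overlapping detections and map labels to letters.

    Detections are visited from highest to lowest score; one is dropped if it
    overlaps an already kept box by more than max_iou (class-agnostic NMS).
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if scores[i] < min_score:
            break  # all remaining scores are lower still
        if all(iou(boxes[i], k["box"]) <= max_iou for k in kept):
            kept.append({
                "char": string.ascii_lowercase[labels[i] - 1],  # assumed id mapping
                "box": list(boxes[i]),
                "score": scores[i],
            })
    return kept


# Hypothetical predictions: two overlapping boxes, one clean box, one weak box.
detections = decode_detections(
    boxes=[[0, 0, 10, 20], [1, 0, 11, 20], [30, 0, 40, 20], [60, 0, 70, 20]],
    labels=[1, 15, 2, 3],
    scores=[0.9, 0.6, 0.8, 0.3],
)
```

Suppressing overlaps across classes is a deliberate choice in this sketch: a single glyph should yield a single character even when the model is torn between two similar letters.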

Finally, we wrote our own parsing algorithm that groups the detected characters into words and the words into strings. These strings are then translated with the Google Translate API and overlaid on the original image using another algorithm we wrote.
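The grouping of characters into words and strings can be sketched as follows. This is one plausible approach, not the project's actual algorithm: it assumes each detection is a dict with a `char` and a `box` (`[x0, y0, x1, y1]`), clusters detections into lines by vertical centre, and starts a new word wherever the horizontal gap exceeds a fraction (`gap_factor`, a made-up parameter) of the median character width.

```python
from statistics import median


def group_into_lines(chars):
    """Cluster detections into text lines, ordered top to bottom."""
    lines = []
    for det in sorted(chars, key=lambda d: (d["box"][1] + d["box"][3]) / 2):
        cy = (det["box"][1] + det["box"][3]) / 2
        height = det["box"][3] - det["box"][1]
        # Same line if the centre is within half a character height
        # of the first character that opened the current line.
        if lines and abs(cy - lines[-1]["cy"]) < 0.5 * max(height, lines[-1]["h"]):
            lines[-1]["items"].append(det)
        else:
            lines.append({"cy": cy, "h": height, "items": [det]})
    return [line["items"] for line in lines]


def line_to_words(line, gap_factor=0.6):
    """Split one line into words using horizontal gaps between characters."""
    line = sorted(line, key=lambda d: d["box"][0])
    width = median(d["box"][2] - d["box"][0] for d in line)
    words, current = [], line[0]["char"]
    for prev, det in zip(line, line[1:]):
        gap = det["box"][0] - prev["box"][2]
        if gap > gap_factor * width:
            words.append(current)
            current = ""
        current += det["char"]
    words.append(current)
    return words


def parse_text(chars):
    """Return the detected text, one output line per detected text line."""
    return "\n".join(" ".join(line_to_words(l)) for l in group_into_lines(chars))


# Hypothetical detections for "hi yo" on one line and "ok" on the next,
# deliberately given out of reading order.
sample = [
    {"char": "k", "box": [12, 30, 22, 50]},
    {"char": "y", "box": [40, 0, 50, 20]},
    {"char": "h", "box": [0, 0, 10, 20]},
    {"char": "o", "box": [0, 30, 10, 50]},
    {"char": "o", "box": [52, 0, 62, 20]},
    {"char": "i", "box": [12, 0, 22, 20]},
]
parsed = parse_text(sample)
```

The resulting string is what would be passed on to the translation step; the translation call and the overlay are left out here because their details are not described in this README.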

[image: Screenshot 2023-07-07 at 11 24 02 PM]

Poster

Text Recognition, Translation, and Transformation with Mask-RCNN.pdf
