This library is a collaborative effort developed as the final group project for the COMS W4995: Design Using C++ course in Fall 2023, taught by Professor Bjarne Stroustrup.
Authors: Yana Botvinnik, Maitar Asher, Noam Zaid, Elvina Wibisono, Elifia Muthia
This is a gesture recognition library written in C++ that allows users to effortlessly create models capable of recognizing gestures in images, live video streams, or recordings. Note: as of December 2023, this library is tested on and compatible with macOS only. Our library is designed to run alongside Google's MediaPipe libraries. We've set up a server that leverages MediaPipe and is configured for macOS by default. The server plays a vital role in obtaining hand landmarks, a crucial step in our program that simplifies the classification problem: instead of dealing with numerous pixels for each image, our approach works with 21 landmarks per image.
This project takes inspiration from GRLib, a similar gesture recognition library written in Python, whose algorithm is nicely documented here.
- Gesture Recognition Tutorial: Explore the step-by-step tutorial on setting up the environment, processing data, training models, and making predictions.
- Gesture Recognition Manual: For more detailed information about each step, configuration options, and advanced features, refer to our comprehensive manual.
- Design Documentation: Read our design documentation for an in-depth understanding of the architecture, system components, and implementation details.
- ASL Alphabet Recognition Application Demo: See the tool in action here!
- Powerpoint Gesture Control Application Demo: See the tool in action here!
- Install Homebrew (a package manager to install library dependencies)
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
- Install OpenCV
brew install opencv
- Install bazelisk
brew install bazelisk
- Install opencv@3
brew install opencv@3
- Install ffmpeg
brew install ffmpeg
- Install Numpy
brew install numpy
- Install Xcode from the App Store
- Clone the modified Mediapipe repository
- Move into the newly cloned Mediapipe repository
cd mediapipe
- Build the file
bazel build -c opt --define MEDIAPIPE_DISABLE_GPU=1 mediapipe/mediapipe_samples/mediapipe_sample:mediapipe_sample
- Execute the following command to run the server
GLOG_logtostderr=1 bazel-bin/mediapipe/mediapipe_samples/mediapipe_sample/mediapipe_sample --calculator_graph_config_file=mediapipe/graphs/hand_tracking/hand_tracking_desktop_live.pbtxt
- Execute the following command to clone this repository
git clone https://github.com/maitarasher/Gesture-Recognition.git
- Optional: Customize your augmentation pipeline
You may add or remove stages in your pipeline. The code is located at: Gesture-Recognition/processing_data/processing.cpp
- Navigate to the Gesture-Recognition root directory
- Build the processing application
cd processing_data
mkdir build
cd build
cmake ..
make
- Run the processing application to generate a landmark representation of your data
./processing <training_images_dir_path> <output_folder>
- Navigate to the Gesture-Recognition root directory
- Build the application
cd gesture_asl
mkdir build
cd build
cmake ..
make
- Run the Application
./asl_application ../../data/asl
- Run the commands below to install the dependencies
brew install jsoncpp
brew install pkg-config
- Navigate to the Gesture-Recognition root directory
- Run the script to prepare the data
cd processing_data/processing_cocodataset
mkdir build
cd build
cmake ..
make
./coco_dataset_export <coco_folder_path> <output_folder>
- Navigate to the Gesture-Recognition root directory
- Build the application
cd gesture_pptx
mkdir build
cd build
cmake ..
make
./gesture_pptx ../../data/pptx <path_to_pptx>