Framework for developing graph convolutional networks for prediction of binary movement outcome

MovementOutcome

How to set up on a Windows machine

GPU activation

We strongly advise using a workstation with an NVIDIA GPU to speed up model training. To enable GPU use, follow these instructions (NB: skip this step if a similar GPU activation was performed while setting up the Markerless framework):

  1. Download Visual Studio 2017 Free Community Edition and install the program by following the necessary steps.
  2. Download CUDA Toolkit 11.1 Update 1 and follow instructions to perform installation.
  3. Copy the file 'ptxas.exe' in the folder 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\bin' to 'Desktop'.
  4. Download CUDA Toolkit 11.0 Update 1 and follow instructions to perform installation.
  5. Copy the file 'ptxas.exe' from 'Desktop' to the folder 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\bin'.
  6. Create a user at NVIDIA.com and download CUDNN 8.0.4.
  7. Open 'cudnn-11.0-windows-x64-v8.0.4.30.zip' in 'Downloads' and move the files in the folders 'bin', 'include', and 'lib' under 'cuda' to associated folders ('bin', 'include', and 'lib') in 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0'.
  8. Restart the computer.

Set up the MovementOutcome framework

To set up the MovementOutcome framework, follow these instructions:

  1. Download Anaconda and perform the installation (if you have not previously downloaded and installed Anaconda).
  2. Open a command prompt and clone the MovementOutcome framework: git clone https://github.com/DeepInMotion/MovementOutcome.git
  3. Navigate to the MovementOutcome folder: cd MovementOutcome
  4. Create the virtual environment movementoutcome: conda env create -f environment.yml

How to use on a Windows machine

Neural architecture search, cross-validation, and evaluation

This is a step-by-step procedure for using the MovementOutcome framework to search for, cross-validate, and evaluate graph convolutional networks (GCNs) suited to a particular dataset of individuals' movements associated with a specific movement outcome:

  1. Open a command prompt and activate the virtual environment: activate movementoutcome
  2. Navigate to the MovementOutcome folder: cd MovementOutcome
  3. Open the code library in a web browser: jupyter lab
  4. Create a new project folder under 'projects' with a specified name (e.g., 'im2021').
  5. Create a subfolder within your project folder with name 'searches' (e.g., 'im2021/searches'). Your results from neural architecture search (NAS), cross-validation, and evaluation will be stored in this folder.
  6. Create a subfolder within your project folder with name 'data' (e.g., 'im2021/data').
  7. Upload coordinate files and outcomes
  • Alternative a) If you have raw coordinate CSVs (e.g., generated by the Markerless framework) not sorted into cross-validation folds and test set: Create a subfolder 'raw' within 'data', and upload your raw coordinate files (i.e., with prefix 'orgcoords_') into a folder named 'coords' (e.g., 'im2021/data/raw/coords') and outcome file (i.e., 'outcomes.csv') into 'outcomes' folder (e.g., 'im2021/data/raw/outcomes'). The procedure will randomize the coordinate files into folders for cross-validation folds (e.g., 'val1') and test set (i.e., 'test') and preprocess the coordinate files to generate Numpy array files for datasets that are stored in the 'processed' subfolder (e.g., 'im2021/data/processed/test_coords.npy').
  • Alternative b) If you have previously determined the dataset split and generated separate Numpy array files for coordinates (e.g., 'test_coords.npy'), individual IDs (e.g., 'test_ids.npy'), and outcomes (e.g., 'test_labels.npy') of each dataset: Create a subfolder 'processed' within the 'data' folder (e.g., 'im2021/data/processed') and directly upload the three Numpy array files of each dataset into this folder.
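The file layout expected under 'processed' in alternative b) can be sketched as follows. This is an illustrative sketch only: the project name 'im2021', the dataset names ('val1', 'test'), and the helper function are hypothetical; only the file-naming pattern (coords/ids/labels per dataset) comes from the steps above.

```python
# Sketch of the 'processed' data layout described above (hypothetical helper).
# Each dataset (e.g., a cross-validation fold 'val1' or the 'test' set)
# contributes three Numpy array files: coordinates, individual IDs, and labels.
from pathlib import Path

def processed_files(project, datasets=("val1", "test")):
    root = Path(project) / "data" / "processed"
    return [root / f"{dataset}_{suffix}.npy"
            for dataset in datasets
            for suffix in ("coords", "ids", "labels")]

for path in processed_files("im2021"):
    print(path.as_posix())
```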
  8. Set choices for NAS, cross-validation, and/or evaluation in 'main.py':
  • Line 10: Set name of your project folder.
  • Line 22: Set name of the search. Hyperparameters of the search and all data related to individual search experiments will be stored inside a folder with the given search name within the 'searches' subfolder.
  • Line 25: Set search = True if you want to run NAS to find a suitable GCN, otherwise set search = False if you have previously run NAS.
  • Line 26: Set crossval = True if you want to cross-validate the GCN with highest performance (i.e., Area Under ROC Curve) on the NAS, otherwise use crossval = False to skip cross-validation.
  • Line 27: Set evaluate = True if you want to evaluate the GCN instances obtained from cross-validation on the test set, otherwise use evaluate = False. The evaluation uses the GCN instances as an ensemble whose final classification is based on the aggregated prediction across the instances.
  • Line 30-31: Define computational device responsible for the analysis (i.e., output_device) and number of workers responsible for data handling.
  • Line 34: Set reference to model script for defining GCN (e.g., model_script = 'models.gcn_search_model' for script 'gcn_search_model.py' in 'models').
  • Line 35: Set number of dimensions in coordinate files (e.g., input_dimensions = 2 for 2D coordinates).
  • Line 36: Set number of body parts in coordinate files (e.g., input_spatial_resolution = 19).
  • Line 37: Set number of time steps in a movement window (e.g., input_temporal_resolution = 150).
  • Line 38-40: Set additional fixed hyperparameters of a GCN.
  • Line 43-48: Define the human skeleton (e.g., body parts, neighboring body parts, and center of skeleton).
  • Line 52-57: Set biomechanical properties to use as input for GCN.
  • Line 60-63: Set temporal resolution of coordinate files and skeleton sequences (i.e., raw_frames_per_second and processed_frames_per_second) and options for preprocessing skeleton sequences with Butterworth filter.
  • Line 66-67: Set batch size for training and validation (e.g., trainval_batch_size = 32) and number of epochs taken into account for computing smoothed validation loss (e.g., loss_filter_size = 5).
  • Line 70: Set the number of positive samples required per negative sample (i.e., train_num_positive_samples_per_negative_sample) to compensate for unbalanced datasets. It should ideally be set to the number of individuals with a negative outcome divided by the number of individuals with a positive outcome.
  • Line 75-91: Adjust hyperparameters of the optimizer and data augmentation if desired. However, we suggest using the default values as a starting point as they have worked well across a wide variety of GCNs.
  • Line 94-101: Set preferences for the evaluation process, including portion of individuals in the test set (i.e., test_size), distance between subsequent movement windows (e.g., parts_distance = 75 for 50% overlapping windows with input_temporal_resolution = 150), scheme for aggregating predictions across movement windows, and prediction threshold for classifying an individual as positive outcome (e.g., prediction_threshold = 0.5).
  • Line 104-137: Specify the details of the NAS, including hyperparameters of the K-Best Search strategy (e.g., k and performance_threshold), choices and associated alternatives in the search space (i.e., search_space), training and validation sets of the search (i.e., search_train_dataset and search_val_dataset), number of epochs (search_num_epochs), and performance requirements per epoch (i.e., search_critical_epochs and search_critical_epoch_values).
  • Line 140-147: Specify the details for cross-validation, including number of validation folds (i.e., crossval_folds) and number of epochs for each cross-validation run (i.e., crossval_num_epochs).
  9. Save 'main.py' (with the chosen hyperparameter settings).
  10. Open a new terminal window from the jupyter lab tab in the web browser.
  11. Run NAS, cross-validation, and/or evaluation in the terminal window: python main.py
  12. The results of the NAS, cross-validation, and evaluation processes are stored in the folder of the current search within the 'searches' folder (e.g., 'im2021/searches/21092022 1522 IM2021').
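Two of the numeric hyperparameters above can be illustrated with a small sketch. The function names and values here are ours, not from 'main.py': the sketch only shows how parts_distance = 75 yields 50% overlapping windows when input_temporal_resolution = 150, and how the suggested imbalance ratio is computed from the outcomes.

```python
# Illustrative sketch (hypothetical helpers, not part of the framework).

def window_starts(num_frames, window_length, parts_distance):
    """Start indices of movement windows extracted from a skeleton sequence:
    consecutive windows are parts_distance frames apart."""
    return list(range(0, num_frames - window_length + 1, parts_distance))

def positive_per_negative(outcomes):
    """Suggested train_num_positive_samples_per_negative_sample: number of
    negative-outcome individuals per positive-outcome individual."""
    positives = sum(1 for outcome in outcomes if outcome == 1)
    negatives = len(outcomes) - positives
    return negatives / positives

# 50% overlap: window_length = 150, parts_distance = 75
print(window_starts(num_frames=450, window_length=150, parts_distance=75))
# -> [0, 75, 150, 225, 300]

# 2 positives among 10 individuals -> 4 negatives per positive
print(positive_per_negative([1, 0, 0, 0, 0, 1, 0, 0, 0, 0]))
# -> 4.0
```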

Skeleton-based prediction of movement outcome

To employ a cross-validated GCN obtained by NAS for prediction of movement outcome from raw coordinate files, we suggest the following steps:

  1. Set search details in prediction script (i.e., 'predict/prediction.py'):
  • Line 104: Set name of your project folder (e.g., 'im2021').
  • Line 114: Set name of the search used to obtain the GCN (e.g., '21092022 1522 IM2021').
  • Line 117: Set save = True if you want to save predicted risk of outcome, classification and associated certainty in CSV file, otherwise set save = False.
  • Line 118: Set visualize = True if you want to store class activation map visualization of body parts with highest contribution towards predicted risk of outcome, otherwise set visualize = False.
  • Line 121: Define computational device responsible for the analysis (i.e., output_device) and number of workers responsible for data handling.
  • Line 125: Set reference to model script for defining GCN (e.g., model_script = 'models.gcn_search_model' for script 'gcn_search_model.py' in 'models').
  • Line 126: Set number of dimensions in coordinate files (e.g., input_dimensions = 2 for 2D coordinates).
  • Line 127: Set number of body parts in coordinate files (e.g., input_spatial_resolution = 19).
  • Line 128: Set number of time steps in a movement window (e.g., input_temporal_resolution = 150).
  • Line 129-131: Set additional fixed hyperparameters of a GCN.
  • Line 134-139: Define the human skeleton (e.g., body parts, neighboring body parts, and center of skeleton).
  • Line 141: Set sample coordinates of human skeleton (i.e., sample_coords).
  • Line 144-149: Set biomechanical properties to use as input for GCN.
  • Line 152-155: Set temporal resolution of coordinate files and skeleton sequences (i.e., raw_frames_per_second and processed_frames_per_second) and options for preprocessing skeleton sequences with Butterworth filter.
  • Line 158-162: Set hyperparameters for the evaluation process, including batch size (i.e., evaluation_batch_size), distance between subsequent movement windows (e.g., parts_distance = 75 for 50% overlapping windows with input_temporal_resolution = 150), scheme for aggregating predictions across movement windows, and prediction threshold for classifying an individual as positive outcome (e.g., prediction_threshold = 0.5).
  • Line 165: Set the number of cross-validation folds (i.e., crossval_folds).
  2. Create a folder with the coordinate files that should be analyzed by the prediction model (e.g., 'coords').
  3. Run the prediction script on the coordinate files in the created folder, e.g.: python predict/prediction.py coords
  4. The results of the prediction model are stored in a folder with the same name as the folder of the coordinate files, inside the specific search folder within 'searches' (e.g., 'im2021/searches/21092022 1522 IM2021/coords').
  • The predicted risk of outcome, classification, and associated classification certainty of each individual are stored in a CSV file (e.g., 'im2021/searches/21092022 1522 IM2021/coords/individual1_results.csv').
  • Class activation map visualizations of body part contributions towards the risk of movement outcome are stored as a PNG file per individual (e.g., 'im2021/searches/21092022 1522 IM2021/coords/individual1_cam.png').

Tip: The ensemble script (i.e., 'predict/prediction_ensemble.py') performs prediction by combining the outputs of different GCNs obtained from separate searches.
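The window-aggregation step described above can be sketched as follows. This is an assumption-laden illustration, not the framework's implementation: the aggregation scheme (median here) is configurable in the scripts, and the certainty definition used below is our own guess at how a classification certainty could be derived from the aggregated risk.

```python
# Sketch of per-individual classification from per-window risk predictions.
# Assumptions: median aggregation and a simple distance-from-threshold
# notion of certainty; both are hypothetical choices for illustration.
from statistics import median

def classify_individual(window_risks, prediction_threshold=0.5, aggregate=median):
    risk = aggregate(window_risks)                    # aggregate across windows
    label = int(risk >= prediction_threshold)         # 1 = positive outcome
    certainty = risk if label == 1 else 1.0 - risk    # confidence in the label
    return risk, label, certainty

risk, label, certainty = classify_individual([0.2, 0.7, 0.9, 0.8])
print(risk, label, certainty)  # 0.75 1 0.75
```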
