Skip to content

KelianB/MonoFaceCompute

Repository files navigation

MonoFaceCompute

This repository aims to facilitate preprocessing of monocular human face videos, covering a range of commonly used outputs from semantic segmentation to face tracking. The goal is to provide a convenient and coherent repository for research work.

Computations include:

Pull requests for other computations are welcome!

Setup

  1. Pull the submodules: ./pull_submodules.sh
  2. Run the setup script: ./setup.sh to build a conda environment with all required dependencies.
  3. Download pretrained models and other required files: ./download_all_assets.sh
  4. Configure your dataset according to the examples in datasets.

One dataset consists of one or multiple monocular videos. Several parameters can be tweaked, such as the strategy for cropping the videos, what face tracker to use, what dimensions the crops should be resized to or what steps of the preprocessing pipeline to run.

This was tested on Ubuntu 22.04 with a NVIDIA A5000 GPU.

Usage

All computations are aggregated in a single entry point. Run the following command to process one dataset:

python process.py --datasets datasets/example.yaml

By default, the script will run the following steps:

  • Video extraction using FFMPEG
  • Face detection and cropping
  • Matting
  • Semantic segmentation
  • Landmarks detection
  • Tracking
  • Tracking refinement through a landmarks-based optimization

Dataset config

Supported fields of the dataset configuration files:

Parameter Help
base_dir Base directory from which to retrieve the video(s).
output_dir Where to save the processed data.
shape_sequence Name of the sequence to use for estimating face shape.
crop_mode fixed
crop_scale Scaling factor for the detected face boxes for cropping.
resize What size to resize the cropped image.
smooth_tracking Apply a low-pass filter to the optimized pose and expression values.
tracker What face tracker to use (DECA / EMOCA / FaceReconstruction / SMIRK).
shape_tracker Optionally specify a different face tracker for recovering shape parameters (DECA / EMOCA / FaceReconstruction / SMIRK).
steps What steps to launch (extract, crop, matte, segment, landmarks, track, optimize)
sequences Array of:
  • source: input video file, relative to base_dir (e.g. "1.mp4")
  • crop_mode: fixed / constant / smooth
  • face_selection_strategy: strategy to use for selecting a detection when there are multiple (max_confidence / leftmost / rightmost) (only used if crop_mode=constant or crop_mode=smooth)
  • fixed_crop: [center_x, center_y, size] (only used if crop_mode=fixed)

License

We refer to the individual submodules for their licensing information.
MonoFaceCompute itself is provided under a Attribution-NonCommercial-ShareAlike 4.0 license.