Code for the paper "Free-View Expressive Talking Head Video Editing" (ICASSP 2023)
Project Page: https://sky24h.github.io/websites/icassp2023_free-view_video-editing
Huggingface Demo: https://huggingface.co/spaces/sky24h/Free-View_Expressive_Talking_Head_Video_Editing
Python >= 3.9
pip install -r requirements.txt
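To keep the dependencies isolated, the requirements can be installed inside a virtual environment, for example (the environment name `.venv` is our choice here, not part of the repository):

```shell
# Create and activate an isolated environment (Python >= 3.9 required)
python3 -m venv .venv
source .venv/bin/activate

# Install the pinned dependencies
pip install -r requirements.txt
```

Any equivalent environment manager (e.g. conda) works as well, as long as the Python version constraint is met.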
Due to licensing issues, we cannot provide the full dataset. Instead, the video URLs and preprocessing scripts will be provided soon.
An example is provided in inference.sh:
bash inference.sh
Pretrained models will be automatically downloaded when running the code.
Please check inference.py for more details.
Training consists of two stages: 1) training the Multi-Attribute Discriminator to synchronize the audio, attributes, and video, and 2) training the Generator to produce the talking head video.
bash train_sync.sh
bash train_gen.sh
Please check the corresponding scripts for more details.
If you find this code useful, please cite our paper:
@inproceedings{Huang2023FETE,
author = {Huang, Yuantian and Iizuka, Satoshi and Fukui, Kazuhiro},
booktitle = {ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
title = {Free-View Expressive Talking Head Video Editing},
year = {2023},
pages = {1-5},
doi = {10.1109/ICASSP49357.2023.10095745},
}