CUDA acceleration for VMAF #163

Open
TychoRasch opened this issue Dec 21, 2023 · 13 comments

@TychoRasch

With release v3.0.0 of VMAF, CUDA support has been added.
https://github.com/Netflix/vmaf/releases/tag/v3.0.0

My understanding is that this speeds up the VMAF calculation enormously, with users on Reddit claiming a 10x speedup. Would it be possible to add an option to ab-av1 that enables CUDA acceleration for the VMAF calculation when the user has an NVIDIA GPU?

@alexheretic
Owner

Seems possible. What we'd probably need is:

  • How to call ffmpeg vmaf with CUDA, e.g. a CLI example.
  • How to detect whether CUDA is supported (and whether the system ffmpeg + vmaf supports it), so the above can be used automatically.
  • Someone with NVIDIA hardware to test (I don't have any).
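
For the detection point, one possible approach is to check whether the local ffmpeg build lists the `libvmaf_cuda` filter in its `ffmpeg -filters` output. The sketch below is only an illustration: the helper names `has_filter` and `ffmpeg_supports_vmaf_cuda` are hypothetical, and a positive result only shows that the filter was compiled in, not that a usable NVIDIA GPU and driver are present.

```python
import subprocess


def has_filter(filters_output: str, name: str) -> bool:
    """Return True if `name` is listed as a filter in `ffmpeg -filters` output."""
    for line in filters_output.splitlines():
        fields = line.split()
        # Filter rows look like: "<flags> <name> <in>-><out> <description...>"
        if len(fields) >= 3 and fields[1] == name and "->" in fields[2]:
            return True
    return False


def ffmpeg_supports_vmaf_cuda(ffmpeg: str = "ffmpeg") -> bool:
    """Best-effort check: does this ffmpeg binary list the libvmaf_cuda filter?"""
    try:
        result = subprocess.run(
            [ffmpeg, "-hide_banner", "-filters"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # Binary missing or failed to run: treat as "not supported".
        return False
    return has_filter(result.stdout, "libvmaf_cuda")


if __name__ == "__main__":
    print("libvmaf_cuda available:", ffmpeg_supports_vmaf_cuda())
```

A runtime probe (running a very short CUDA vmaf comparison and checking the exit code) would be a stricter test, at the cost of extra startup time.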

@alexdns1

I would like to see this implemented too, but unfortunately I don't think the ffmpeg part is ready. Maybe add an option to run vmaf from its own binary rather than via ffmpeg?

@TychoRasch
Author

@alexheretic I'll keep an eye out for the ffmpeg implementation and will update accordingly. I also have NVIDIA hardware to test with and will look into CUDA detection.

@alexdns1

@TychoRasch It looks like their Docker image is running a patched ffmpeg: https://github.com/Netflix/vmaf/blob/master/Dockerfile.cuda#L39

@alexdns1

@TychoRasch Correction: it looks like libvmaf_cuda is in the latest ffmpeg.

@zachron

zachron commented Jan 14, 2024

@alexheretic I know I was not part of the initial request, but I was looking for this and found a CLI example in the FFmpeg docs:
https://ffmpeg.org/ffmpeg-filters.html#libvmaf_005fcuda. I also have NVIDIA hardware and am willing to test it.

@alexdns1

> @alexheretic I know I was not part of the initial request, but I was looking for this and found a CLI example in the FFmpeg docs: https://ffmpeg.org/ffmpeg-filters.html#libvmaf_005fcuda. I also have NVIDIA hardware and am willing to test it.

Works for me

@alexdns1

`/opt/ffmpeg_vmaf/bin/ffmpeg -hwaccel cuda -hwaccel_output_format cuda -codec:v h264_cuvid -i test.mp4 -hwaccel cuda -hwaccel_output_format cuda -codec:v h264_cuvid -i test.mp4 -filter_complex "
[0:v]scale_cuda=format=yuv420p[ref];
[1:v]scale_cuda=format=yuv420p[dis];
[dis][ref]libvmaf_cuda=log_fmt=json:log_path=output.json
" -f null -
ffmpeg version N-113111-g4fee63b241 Copyright (c) 2000-2023 the FFmpeg developers
built with gcc 11 (GCC)
configuration: --enable-nonfree --enable-ffnvcodec --enable-cuda-llvm --enable-cuda-nvcc --enable-libvmaf --enable-vapoursynth --enable-shared --prefix=/opt/ffmpeg_vmaf
libavutil 58. 36.100 / 58. 36.100
libavcodec 60. 36.100 / 60. 36.100
libavformat 60. 20.100 / 60. 20.100
libavdevice 60. 4.100 / 60. 4.100
libavfilter 9. 14.101 / 9. 14.101
libswscale 7. 6.100 / 7. 6.100
libswresample 4. 13.100 / 4. 13.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'test.mp4':
Metadata:
major_brand : mp42
minor_version : 0
compatible_brands: mp42mp41
creation_time : 2021-07-24T18:17:54.000000Z
Duration: 00:22:43.80, start: 0.000000, bitrate: 8266 kb/s
Stream #0:00x1: Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709, progressive), 1920x816, 7943 kb/s, 25 fps, 25 tbr, 25k tbn (default)
Metadata:
creation_time : 2021-07-24T18:17:54.000000Z
handler_name : ?Mainconcept Video Media Handler
vendor_id : [0][0][0][0]
encoder : AVC Coding
Stream #0:10x2: Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 317 kb/s (default)
Metadata:
creation_time : 2021-07-24T18:17:54.000000Z
handler_name : #Mainconcept MP4 Sound Media Handler
vendor_id : [0][0][0][0]
Input #1, mov,mp4,m4a,3gp,3g2,mj2, from 'test.mp4':
Metadata:
major_brand : mp42
minor_version : 0
compatible_brands: mp42mp41
creation_time : 2021-07-24T18:17:54.000000Z
Duration: 00:22:43.80, start: 0.000000, bitrate: 8266 kb/s
Stream #1:00x1: Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709, progressive), 1920x816, 7943 kb/s, 25 fps, 25 tbr, 25k tbn (default)
Metadata:
creation_time : 2021-07-24T18:17:54.000000Z
handler_name : ?Mainconcept Video Media Handler
vendor_id : [0][0][0][0]
encoder : AVC Coding
Stream #1:10x2: Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 317 kb/s (default)
Metadata:
creation_time : 2021-07-24T18:17:54.000000Z
handler_name : #Mainconcept MP4 Sound Media Handler
vendor_id : [0][0][0][0]
Stream mapping:
Stream #0:0 (h264_cuvid) -> scale_cuda:default (graph 0)
Stream #1:0 (h264_cuvid) -> scale_cuda:default (graph 0)
libvmaf_cuda:default (graph 0) -> Stream #0:0 (wrapped_avframe)
Stream #0:1 -> #0:1 (aac (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
Metadata:
major_brand : mp42
minor_version : 0
compatible_brands: mp42mp41
encoder : Lavf60.20.100
Stream #0:0: Video: wrapped_avframe, cuda(tv, bt709, progressive), 1920x832 [SAR 1:1 DAR 30:13], q=2-31, 200 kb/s, 25 fps, 25 tbn
Metadata:
encoder : Lavc60.36.100 wrapped_avframe
Stream #0:1(eng): Audio: pcm_s16le, 48000 Hz, stereo, s16, 1536 kb/s (default)
Metadata:
creation_time : 2021-07-24T18:17:54.000000Z
handler_name : #Mainconcept MP4 Sound Media Handler
vendor_id : [0][0][0][0]
encoder : Lavc60.36.100 pcm_s16le
frame= 4421 fps=553 q=-0.0 size=N/A time=00:02:56.84 bitrate=N/A speed=22.1x

[q] command received. Exiting.

[Parsed_libvmaf_cuda_2 @ 0x7fb038005080] VMAF score: 99.265020
[out#0/null @ 0x2237ac0] video:2212kB audio:35328kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
frame= 4718 fps=548 q=-0.0 Lsize=N/A time=00:03:08.41 bitrate=N/A speed=21.9x

tail -n 20 output.json
"max": 1.000000,
"mean": 0.999991,
"harmonic_mean": 0.999991
},
"integer_vif_scale3": {
"min": 0.999983,
"max": 1.000000,
"mean": 0.999991,
"harmonic_mean": 0.999991
},
"vmaf": {
"min": 97.422102,
"max": 100.000000,
"mean": 99.265020,
"harmonic_mean": 99.256884
}
},
"aggregate_metrics": {
}
}
`
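
For anyone scripting around a run like the one above, the pooled score can be read back from the JSON log instead of scraping stderr. This is a minimal sketch: the `"vmaf"` block with its `"mean"` field is visible in the tail of `output.json` above, but the enclosing top-level key `"pooled_metrics"` is not shown in that excerpt and is assumed here, and `pooled_vmaf_mean` is a hypothetical helper name.

```python
import json


def pooled_vmaf_mean(log_text: str) -> float:
    """Extract the pooled mean VMAF score from a libvmaf JSON log.

    Assumes the "vmaf" block sits under a top-level "pooled_metrics" key;
    that key is not visible in the excerpt above.
    """
    data = json.loads(log_text)
    return float(data["pooled_metrics"]["vmaf"]["mean"])


if __name__ == "__main__":
    # Read the log written via log_fmt=json:log_path=output.json
    with open("output.json", encoding="utf-8") as f:
        print(pooled_vmaf_mean(f.read()))
```

The value should match the `VMAF score` line that ffmpeg prints on exit.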

@alexheretic
Owner

I've added an experimental branch (#178) that throws in the example args for CUDA-accelerated vmaf.

This should be easier to test now that vmaf runs are simpler single-process calls (since #177). Since I can't test it myself, please let me know in the PR how the args should be changed, e.g. how -c:v should be determined.

The PR can be installed locally with cargo install --git https://github.com/alexheretic/ab-av1 --branch cuda-vmaf
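
On the open -c:v question, one option (not confirmed anywhere in this thread) is to map the input's codec name to the matching ffmpeg `*_cuvid` decoder and omit -c:v when there is no match, so that ffmpeg picks a decoder itself while `-hwaccel cuda` stays in place. The sketch below assumes the codec name comes from something like ffprobe's `codec_name` field; `CUVID_DECODERS` and `cuda_input_args` are hypothetical names, and whether the fallback path performs well has not been tested here.

```python
# Codec name (as reported by e.g. ffprobe's codec_name) -> ffmpeg cuvid decoder.
CUVID_DECODERS = {
    "h264": "h264_cuvid",
    "hevc": "hevc_cuvid",
    "av1": "av1_cuvid",
    "vp9": "vp9_cuvid",
    "vp8": "vp8_cuvid",
    "mpeg2video": "mpeg2_cuvid",
    "mpeg4": "mpeg4_cuvid",
    "vc1": "vc1_cuvid",
}


def cuda_input_args(path: str, codec_name: str) -> list[str]:
    """Build the per-input ffmpeg args used in the CUDA vmaf example above."""
    args = ["-hwaccel", "cuda", "-hwaccel_output_format", "cuda"]
    decoder = CUVID_DECODERS.get(codec_name)
    if decoder is not None:
        # Explicit cuvid decoder, as in the h264_cuvid example.
        args += ["-codec:v", decoder]
    # Otherwise leave decoder selection to ffmpeg.
    args += ["-i", path]
    return args


if __name__ == "__main__":
    print(cuda_input_args("test.mp4", "h264"))
```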

@sven-pke

sven-pke commented May 8, 2024

Here is the output I get using the above PR with a CUDA-enabled vmaf ffmpeg build:

D:\>ab-av1 vmaf --cuda --reference source.mov --distorted test.mkv
⠙ 00:00:00 ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- (vmaf running, eta 0s)
DEBUG: Using ffmpeg -filter_complex [0:v]scale_cuda=format=yuv444p10le,setpts=PTS-STARTPTS[dis];[1:v]scale_cuda=format=yuv444p10le,setpts=PTS-STARTPTS[ref];[dis][ref]libvmaf_cuda=:model=version=vmaf_4k_v0.6.1
Error: ffmpeg vmaf exit code -22
---stderr---
ffmpeg version N-115146-ga71e46383d-gf8715d0300+3 Copyright (c) 2000-2024 the FFmpeg developers
built with gcc 13.2.0 (Rev6, Built by MSYS2 project)
configuration: --pkg-config=pkgconf --cc='ccache gcc' --cxx='ccache g++' --ld='ccache g++' --extra-cxxflags=-fpermissive --extra-cflags=-Wno-int-conversion --disable-autodetect --enable-amf --enable-bzlib --enable-cuda --enable-cuvid --enable-d3d11va --enable-dxva2 --enable-iconv --enable-lzma --enable-nvenc --enable-zlib --enable-sdl2 --enable-ffnvcodec --enable-nvdec --enable-cuda-llvm --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-libdav1d --enable-libaom --disable-debug --enable-libfdk-aac --enable-fontconfig --enable-libass --enable-libbluray --enable-libfreetype --enable-libmfx --enable-libmysofa --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvo-amrwbenc --enable-libwebp --enable-libxml2 --enable-libzimg --enable-libshine --enable-gpl --enable-avisynth --enable-libxvid --enable-libopenmpt --enable-version3 --enable-librav1e --enable-libsrt --enable-libgsm --enable-libvmaf --enable-libsvtav1 --enable-chromaprint --enable-decklink --enable-frei0r --enable-libaribb24 --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfribidi --enable-libgme --enable-libilbc --enable-libsvthevc --enable-libsvtvp9 --enable-libkvazaar --enable-libmodplug --enable-librist --enable-librtmp --enable-librubberband --enable-libtesseract --enable-libxavs --enable-libzmq --enable-libzvbi --enable-openal --enable-libcodec2 --enable-ladspa --enable-libglslang --enable-vulkan --enable-libdavs2 --enable-libxavs2 --enable-libuavs3d --enable-libplacebo --enable-libjxl --enable-opencl --enable-opengl --enable-libnpp --enable-libopenh264 --enable-openssl --extra-cflags=-DLIBTWOLAME_STATIC --extra-cflags=-DCACA_STATIC --extra-cflags=-DMODPLUG_STATIC --extra-cflags=-DCHROMAPRINT_NODLL --extra-cflags=-DZMQ_STATIC --extra-libs=-lpsapi 
--extra-cflags=-DLIBXML_STATIC --extra-libs=-liconv --disable-w32threads --extra-cflags=-DKVZ_STATIC_LIB --enable-nonfree --extra-cflags='-IC:/PROGRA1/NVIDIA2/CUDA/v12.4/include' --extra-ldflags='-LC:/PROGRA1/NVIDIA2/CUDA/v12.4/lib/x64' --extra-cflags=-DAL_LIBTYPE_STATIC --extra-cflags='-ID:/media-autobuild_suite-master/local64/include' --extra-cflags='-ID:/media-autobuild_suite-master/local64/include/AL'
libavutil 59. 17.100 / 59. 17.100
libavcodec 61. 5.103 / 61. 5.103
libavformat 61. 3.103 / 61. 3.103
libavdevice 61. 2.100 / 61. 2.100
libavfilter 10. 2.101 / 10. 2.101
libswscale 8. 2.100 / 8. 2.100
libswresample 5. 2.100 / 5. 2.100
libpostproc 58. 2.100 / 58. 2.100
[AVFilterGraph @ 000001e293171400] No option name near ':model=version=vmaf_4k_v0.6.1'
[AVFilterGraph @ 000001e293171400] Error parsing a filter description around:
[AVFilterGraph @ 000001e293171400] Error parsing filterchain '[dis][ref]libvmaf_cuda=:model=version=vmaf_4k_v0.6.1' around:
Failed to set value '[0:v]scale_cuda=format=yuv444p10le,setpts=PTS-STARTPTS[dis];[1:v]scale_cuda=format=yuv444p10le,setpts=PTS-STARTPTS[ref];[dis][ref]libvmaf_cuda=:model=version=vmaf_4k_v0.6.1' for option 'filter_complex': Invalid argument
Error parsing global options: Invalid argument

I could be wrong, but I think the option should look like: libvmaf_cuda=model_path=vmaf_4k_v0.6.1
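
The `No option name near ':model=version=vmaf_4k_v0.6.1'` message suggests the parser tripped on the empty option in front of the first `:` in `libvmaf_cuda=:model=...`. One way to avoid emitting that leading separator is to join only the non-empty options, as in this sketch. `filter_with_options` is a hypothetical helper, not ab-av1's actual code, and whether this particular ffmpeg build then accepts the `model=version=...` form has not been confirmed in this thread.

```python
def filter_with_options(name: str, options: list[str]) -> str:
    """Render an ffmpeg filter as name=opt1:opt2, skipping empty options."""
    parts = [opt for opt in options if opt]
    if not parts:
        # No options at all: emit the bare filter name, with no trailing '='.
        return name
    return f"{name}={':'.join(parts)}"


if __name__ == "__main__":
    # An empty first option no longer produces "libvmaf_cuda=:model=..."
    print(filter_with_options("libvmaf_cuda", ["", "model=version=vmaf_4k_v0.6.1"]))
```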

@allrobot

Besides NVIDIA cards, there are also AMD cards and Intel integrated graphics. You might consider buying a second-hand graphics card, which would be cheaper.

Making full use of this hardware to accelerate VMAF calculations would be a good thing.

@baconsalad

Is there any intention of adding this to main? I see the cuda branch is quite dated now. There is already a way to encode with av1_nvenc (from #201), so this would be the final step for full hardware acceleration with NVIDIA cards.

@fernvenue

Any update here?


8 participants