VapourSynth-BilateralGPU

Copyright© 2021 WolframRhodium

Bilateral filter in CUDA for VapourSynth.

Description

The bilateral filter is a non-linear, edge-preserving, noise-reducing smoothing filter for images.

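The intensity value at each pixel is replaced by a weighted average of the intensity values of nearby pixels. The weights can be based on a Gaussian distribution and depend on both the spatial distance and the intensity difference between pixels, which is what preserves edges.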
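The weighted average described above can be sketched in plain Python. This is a slow reference illustration, not the plugin's CUDA kernel: the function name `bilateral_filter`, the fallback radius of three spatial sigmas, and the clamped border handling are all assumptions made for this example. The parameter names mirror the plugin's `sigma_spatial` and `sigma_color`, with pixel values assumed to lie in [0, 1].

```python
import math


def bilateral_filter(image, sigma_spatial=3.0, sigma_color=0.02, radius=None):
    """Reference bilateral filter on a 2D list of floats in [0, 1]."""
    if radius is None:
        # Assumption for this sketch only: cover about 3 spatial sigmas.
        radius = max(1, math.ceil(3 * sigma_spatial))
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            center = image[y][x]
            weighted_sum = 0.0
            weight_total = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    # Assumption: clamp coordinates so borders reuse edge pixels.
                    ny = min(max(y + dy, 0), height - 1)
                    nx = min(max(x + dx, 0), width - 1)
                    value = image[ny][nx]
                    # Gaussian weight on spatial distance.
                    spatial = math.exp(-(dx * dx + dy * dy) / (2 * sigma_spatial ** 2))
                    # Gaussian weight on intensity difference.
                    color = math.exp(-((value - center) ** 2) / (2 * sigma_color ** 2))
                    weight = spatial * color
                    weighted_sum += weight * value
                    weight_total += weight
            # The center pixel always contributes weight 1, so the total is never zero.
            out[y][x] = weighted_sum / weight_total
    return out


# A hard step edge: left half 0.2, right half 0.8.
step = [[0.2] * 4 + [0.8] * 4 for _ in range(4)]
smoothed = bilateral_filter(step, sigma_spatial=1.0, sigma_color=0.02, radius=2)
```

With a small `sigma_color`, pixels on the far side of the step receive a negligible weight, so the values on each side should stay close to 0.2 and 0.8 instead of blurring into each other.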

Special thanks to Kice for doing most of the work in the previous implementation.

Requirements

  • CPU with AVX2 support.

  • CUDA-enabled GPU(s) of compute capability 5.0 or higher (Maxwell+).

  • GPU driver >= v452.39 for GeForce or bilateral_rtc users, or >= v496.13 in general.

The plugin can run on older generations of GPUs, or on CPUs without AVX2 support, if it is compiled manually.

The _rtc version requires compute capability 3.5 or higher and GPU driver 465 or newer. It depends on nvrtc64_112_0.dll/libnvrtc.so.11.2 and nvrtc-builtins64_114.dll/libnvrtc-builtins.so.11.4.50.

Supported Formats

sample type: 8-16 bit integer or 32 bit float Gray/YUV/RGB input

Usage

core.{bilateralgpu, bilateralgpu_rtc}.Bilateral(clip clip, float[] sigma_spatial=3.0, float[] sigma_color=0.02, int[] radius=0, int device_id=0, int num_streams=4, bool use_shared_memory=True)
  • clip: The input clip.

  • sigma_spatial: (Default: 3.0) Filter sigma in the coordinate space. Use an array to assign it for each plane. If "sigma_spatial" for the second plane is not specified, it is derived from the sigma_spatial of the first plane and the sub-sampling.

  • sigma_color: (Default: 0.02) Filter sigma in the color space. Use an array to assign it for each plane; otherwise the same sigma_color is used for all planes. It is normalized internally, so that the same value gives similar results on clips with different bit depths.

  • radius: (Default: 0) Kernel window size. 0 = automatic calculation based on "sigma_spatial".

  • device_id: (Default: 0) CUDA device ID.

  • num_streams: (Default: 4) Number of CUDA streams. Multiple streams enable concurrent kernel execution and data transfer.

  • use_shared_memory: (Default: True) Use on-chip memory to reduce bandwidth requirements on memory operations.

  • The _rtc version has two experimental parameters:

    • block_x, block_y: (Default: 16, 8) Block size of the kernel launch configuration. Don't modify it unless you know what you are doing.
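As an illustration of the parameters above, the following sketch wraps the filter call in a small helper. The helper name `bilateral_denoise` and its `use_rtc` switch are hypothetical and exist only for this example; the only plugin API it relies on is the `Bilateral` function in the `bilateralgpu` and `bilateralgpu_rtc` namespaces, as given in the signature above.

```python
def bilateral_denoise(core, clip, sigma_spatial=3.0, sigma_color=0.02,
                      use_rtc=False, **extra):
    """Call Bilateral from the bilateralgpu or bilateralgpu_rtc namespace.

    `core` is a VapourSynth core object; `extra` forwards optional
    arguments such as radius, device_id or num_streams unchanged.
    """
    namespace = core.bilateralgpu_rtc if use_rtc else core.bilateralgpu
    return namespace.Bilateral(
        clip,
        sigma_spatial=sigma_spatial,
        sigma_color=sigma_color,
        **extra,
    )


# Typical use inside a VapourSynth script (requires the plugin and a CUDA GPU):
#
#   import vapoursynth as vs
#   core = vs.core
#   clip = core.std.BlankClip(format=vs.YUV420P16)
#   bilateral_denoise(core, clip, sigma_spatial=[3.0, 1.5, 1.5]).set_output()
```

Passing `core` in explicitly keeps the helper independent of how the script obtains its core object. The per-plane values in the commented call are arbitrary and only show the array form of `sigma_spatial`.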

Compilation

cmake -S . -B build -D CMAKE_BUILD_TYPE=Release -D CMAKE_CUDA_FLAGS="--threads 0 --use_fast_math -Wno-deprecated-gpu-targets" -D CMAKE_CUDA_ARCHITECTURES="50;61-real;75-real;86"

cmake --build build --config Release