KTGMC

KTGMC is a CUDA port of QTGMC.

Installation

Requires the following three files and AviSynthNeo:

KTGMC.avsi
KTGMC.dll
KNNEDI3.dll

Environment

Runs on NVIDIA GPUs with compute capability 3.5+.
Check your GPU support at NVIDIA’s CUDA GPU list (e.g. https://developer.nvidia.com/cuda-gpus).

Usage

Wrap processing with OnCPU and OnCUDA:

SetMemoryMax(2048, type=DEV_TYPE_CUDA)
srcfile="..."
LWLibavVideoSource(srcfile)
OnCPU(2).KTGMC(SourceMatch=3, Lossless=2, tr0=1, tr1=1, tr2=1).OnCUDA(2)

See AviSynthNeo for details on OnCPU, OnCUDA, and SetMemoryMax.

By default, the CUDA memory limit is 768MB, which is often insufficient.
Use SetMemoryMax to raise the CUDA memory to around 2GB.

Some GPUs do not have that much memory. In that case, choose a faster Preset, which simplifies processing and reduces memory use.
With Preset="Fast" or faster, the default 768MB usually does not cause a performance drop.
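
For a GPU with limited memory, the usage script could be reduced to something like the sketch below. It assumes the same source filter and the same OnCPU/OnCUDA values as the usage example; the only changes are that SetMemoryMax is left out and a faster Preset is chosen.

```avisynth
# Sketch for GPUs with little memory: SetMemoryMax is not called,
# so the default 768MB CUDA limit stays in effect.
srcfile="..."
LWLibavVideoSource(srcfile)
# Preset="Fast" simplifies processing, so the default limit is usually enough.
OnCPU(2).KTGMC(Preset="Fast").OnCUDA(2)
```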

Function

KTGMC(...)

CUDA version of QTGMC. Arguments are basically the same as QTGMC (see below for supported status). Both input and output are CUDA frames.

Additional arguments:

int useFlag = 0
- Available values:
  - 0: Normal processing
  - 1: Interpolate using previous field only
  - 2: Interpolate using next field only
- Intended for use with DecombUCF. Normally interpolation uses both previous and next fields; if either is corrupted, the output can be contaminated. With useFlag=1 or 2, you can avoid the corrupted field when generating interpolated frames.
int dev = 0
- Index of the GPU to use, counting from 0.
int analyzeBatch = 4
- Number of frames processed per KTGMC_MAnalyze call. Depending on image/block size, a single frame may not provide enough parallelism; batching multiple frames improves performance.
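
As an illustration of the additional arguments, a call might look like the sketch below. The values are assumptions chosen for the example: useFlag=1 is applied to the whole clip here, whereas in practice DecombUCF is expected to select it only where needed, and analyzeBatch=8 is an arbitrary value above the default, not a tuned recommendation.

```avisynth
srcfile="..."
LWLibavVideoSource(srcfile)
# useFlag=1: interpolate from the previous field only (avoids a corrupted next field).
# dev=0: run on the first GPU.
# analyzeBatch=8: hypothetical value; the default is 4.
OnCPU(2).KTGMC(Preset="Slower", useFlag=1, dev=0, analyzeBatch=8).OnCUDA(2)
```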

KTGMC limitations

Currently only YV12 (8‑bit) is supported.
Both width and height must be multiples of 4.
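
One way to catch unsupported input before it reaches KTGMC is to check the clip with AviSynth's built-in Assert. This is only a sketch: the error messages are made up for the example, and the source filter and Preset are taken from the usage section.

```avisynth
srcfile="..."
LWLibavVideoSource(srcfile)
# Stop at script load with a readable message instead of failing inside KTGMC.
Assert(IsYV12(), "KTGMC: input must be YV12 (8-bit)")
Assert(Width() % 4 == 0 && Height() % 4 == 0, "KTGMC: width and height must be multiples of 4")
OnCPU(2).KTGMC(Preset="Slower").OnCUDA(2)
```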

Many features are not yet implemented.
With all other parameters left at their defaults, Preset values from "Slower" through "Faster" are supported.
SourceMatch and Lossless are supported.

If you use unsupported features, a "Device unmatch" error is thrown.
Adjust parameters to use only supported features.

TR values up to 2 are supported. Noise reduction features are not supported.
For motion estimation, Overlap must be exactly half of Blocksize (i.e., 16 for Blocksize=32, 8 for Blocksize=16).
With Preset="Very Fast" or faster, Blocksize=32 and Overlap=8 are forced, which violates this rule and causes an error.
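
A call that satisfies the Overlap rule could look like the sketch below. It assumes that KTGMC accepts explicit Blocksize and Overlap values the way QTGMC does; whether this combination works with every supported Preset has not been verified here.

```avisynth
srcfile="..."
LWLibavVideoSource(srcfile)
# Overlap must be half of Blocksize: 16 for Blocksize=32.
OnCPU(2).KTGMC(Preset="Faster", Blocksize=32, Overlap=16).OnCUDA(2)
```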

For EDI, only NNEDI3 is supported. Note that with Preset="Faster" and SourceMatch of 1 or higher, KTGMC attempts to use Yadif, which is not supported.

Internal processing is optimized for forward frame retrieval; scrubbing backwards in an editor may be slow.

KNNEDI3(...)

CUDA version of NNEDI3. Arguments are the same as NNEDI3. CPU processing is also supported, so calling it on the CPU behaves like NNEDI3.

KNNEDI3 limitations

Only field=-2, dh=false are tested; others likely won’t work.
RGB and YUY2 are unsupported. YUV planar may work, but only YV12 has been tested.
Internal arithmetic can be int16 or float, but only int16 is supported, so 16-bit input will not work. Input of up to 15 bits may work.
If the fapprox&2 bit is not set, processing switches to float arithmetic and fails.
For pscrn, only values of 2 or higher are supported. opt and threads have no effect in CUDA mode.
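
A call that stays within these limits might look like the sketch below. It assumes KNNEDI3 is wrapped with OnCPU and OnCUDA in the same way as KTGMC in the usage section; the pscrn and fapprox values are NNEDI3's usual defaults, written out only to show that they meet the constraints.

```avisynth
srcfile="..."
LWLibavVideoSource(srcfile)
# field=-2 and dh=false are the only tested settings.
# pscrn=2 is the lowest supported value.
# fapprox=15 has the fapprox&2 bit set, so int16 arithmetic is used.
OnCPU(2).KNNEDI3(field=-2, dh=false, pscrn=2, fapprox=15).OnCUDA(2)
```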

KSMDegrain limitations

The following parameters accept only these values:

tr: 1, 2
pel: 1, 2
blksize: 8, 16, 32
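
For illustration, a KSMDegrain call using only the accepted values could be written as below. This page does not document how KSMDegrain is invoked, so the OnCPU/OnCUDA wrapping is an assumption carried over from the KTGMC usage example.

```avisynth
srcfile="..."
LWLibavVideoSource(srcfile)
# tr, pel and blksize are all taken from the accepted values listed above.
OnCPU(2).KSMDegrain(tr=2, pel=2, blksize=16).OnCUDA(2)
```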
