CUDASynth

These CUDA filters are packaged into DGDecodeNV, which is part of DGDecNV.
DJATOM
Posts: 32
Joined: Fri Oct 16, 2015 6:14 pm

Re: CUDASynth

Post by DJATOM » Mon Oct 08, 2018 2:14 pm

The avscompat layer seems to be working, but the speed is nearly the same as in "cpu" mode.
It is still relatively fast: about 65 fps (default settings, DGSource -> DGDenoise -> DGSharpen) and 105 fps (default settings, DGSource -> DGDenoise).
Hardware: GTX 750, i5-4670k

admin
Site Admin
Posts: 4002
Joined: Thu Sep 09, 2010 3:08 pm

Re: CUDASynth

Post by admin » Mon Oct 08, 2018 4:04 pm

Thanks for the results, DJ. Just out of interest I'd like to see a benchmark of a script for these three (no CUDASynth):

Avisynth+
Vapoursynth native
Vapoursynth avscompat

I seem to recall from earlier testing that both Vapoursynth approaches fell short of Avisynth+, but I haven't tried it recently.

DJATOM
Posts: 32
Joined: Fri Oct 16, 2015 6:14 pm

Re: CUDASynth

Post by DJATOM » Mon Oct 08, 2018 4:37 pm

OK, for now I've checked the same script:
ClearAutoloadDirs()
LoadPlugin("C:\322\x64\DGDecodeNV.dll")
DGSource("J:\Darling6\STREAM\EP16.dgi")
DGDenoise()
DGSharpen()
trim(0,6000)
and
ClearAutoloadDirs()
LoadPlugin("C:\322\x64\DGDecodeNV.dll")
DGSource("J:\Darling6\STREAM\EP16.dgi",fdst="gpu0")
DGDenoise(fsrc="gpu0",fdst="gpu0")
DGSharpen(fsrc="gpu0",fdst="cpu")
trim(0,6000)
So CUDASynth works in the native avs+.
C:\322>avs2yuv64 EP16.avs -o NUL
Avs2YUV 0.28
Script file: EP16.avs
Resolution: 1920x1080
Frames per sec: 24000/1001 (23.976)
Total frames: 6001
CSP: YV12
Progress Frames FPS Elapsed Remain
[100.0%] 6000/6001 86.72 0:01:09 0:00:00
Started: Tue Oct 9 00:24:34 2018
Finished: Tue Oct 9 00:25:43 2018
Elapsed: 0:01:09

C:\322>avs2yuv64 EP16.avs -o NUL
Avs2YUV 0.28
Script file: EP16.avs
Resolution: 1920x1080
Frames per sec: 24000/1001 (23.976)
Total frames: 6001
CSP: YV12
Progress Frames FPS Elapsed Remain
[100.0%] 6000/6001 102.32 0:00:58 0:00:00
Started: Tue Oct 9 00:26:26 2018
Finished: Tue Oct 9 00:27:25 2018
Elapsed: 0:00:59
I'll measure avscompat (without and with fsrc/fdst) soon; I need to close the browser to free up more GPU RAM for testing.
And since there are no native Vapoursynth versions of DGDenoise/DGSharpen, should I test them in avscompat with DGSource in native mode?
Last edited by DJATOM on Mon Oct 08, 2018 4:53 pm, edited 1 time in total.
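
For readers comparing the two avs2yuv64 runs above, here is a minimal sketch of the arithmetic. It assumes the first run (86.72 fps) is the plain script and the second (102.32 fps) is the fsrc/fdst script, in the order the scripts were posted; the post does not say so explicitly. The variable names are made up for illustration.

```python
# FPS figures copied from the two avs2yuv64 logs in the post above.
# Assumption: run 1 is the plain script, run 2 is the fsrc/fdst (CUDASynth) script.
fps_plain = 86.72
fps_cudasynth = 102.32

# Ratio of the two throughputs and the relative gain in percent.
speedup = fps_cudasynth / fps_plain
gain_percent = (speedup - 1.0) * 100.0

print(f"speedup: {speedup:.2f}x ({gain_percent:.0f}% more frames per second)")
```

Under that assumption the gain on this GTX 750 is roughly 18%, which is a single-run figure and not an average.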

gonca
Distinguished Member
Posts: 607
Joined: Sun Apr 08, 2012 6:12 pm

Re: CUDASynth

Post by gonca » Mon Oct 08, 2018 4:46 pm

LoadPlugin("C:/Program Files (Portable)/dgdecnv/x64 Binaries/DGDecodeNV.dll")
DGSource("I:\test.dgi", fieldop=0, fulldepth=True)
ConvertBits(10)
FPS 92.3
import vapoursynth as vs
core = vs.get_core()
core.std.LoadPlugin("C:/Program Files (Portable)/dgdecnv/x64 Binaries/DGDecodeNV.dll")
clip = core.dgdecodenv.DGSource(r'I:\test.dgi', fieldop=0, fulldepth=True)
clip = core.resize.Point(clip, format=vs.YUV420P10)
clip.set_output()
FPS 133.0
import vapoursynth as vs
core = vs.get_core()
core.avs.LoadPlugin("C:/Program Files (Portable)/dgdecnv/x64 Binaries/DGDecodeNV.dll")
clip = core.avs.DGSource("I:/test.dgi", fieldop=0, fulldepth=True)
clip = core.resize.Point(clip, format=vs.YUV420P10)
clip.set_output()
FPS 129.8
Source was a 4K clip

Edit
Avs compatibility is 2x faster with cudasynth than with the regular version: 4K sample with DGHDRtoSDR (default) and DGSharpen (default)

DJATOM
Posts: 32
Joined: Fri Oct 16, 2015 6:14 pm

Re: CUDASynth

Post by DJATOM » Mon Oct 08, 2018 4:52 pm

cudasynth in avscompat:
import vapoursynth as vs
core = vs.get_core()

core.avs.LoadPlugin(r'C:\322\x64\DGDecodeNV.dll')

clip = core.avs.DGSource(r'J:\Darling6\STREAM\EP16.dgi', fdst="gpu0")
clip = core.avs.DGDenoise(clip, fsrc="gpu0", fdst="gpu0")
clip = core.avs.DGSharpen(clip, fsrc="gpu0", fdst="cpu")
clip = core.std.Trim(clip, 0, 6000)
clip.set_output()
Image

no cudasynth in avscompat:
import vapoursynth as vs
core = vs.get_core()

core.avs.LoadPlugin(r'C:\322\x64\DGDecodeNV.dll')

clip = core.avs.DGSource(r'J:\Darling6\STREAM\EP16.dgi')
clip = core.avs.DGDenoise(clip)
clip = core.avs.DGSharpen(clip)
clip = core.std.Trim(clip, 0, 6000)
clip.set_output()
Image

native DGSource + avscompat DGDenoise and DGSharpen:
import vapoursynth as vs
core = vs.get_core()

core.std.LoadPlugin(r'C:\322\x64\DGDecodeNV.dll')
core.avs.LoadPlugin(r'C:\322\x64\DGDecodeNV.dll')

clip = core.dgdecodenv.DGSource(r'J:\Darling6\STREAM\EP16.dgi')
clip = core.avs.DGDenoise(clip)
clip = core.avs.DGSharpen(clip)
clip = core.std.Trim(clip, 0, 6000)
clip.set_output()
Image

I don't know why we get such results; at least I tried to compare with minimal differences in resource usage (browser closed, etc.).

gonca
Distinguished Member
Posts: 607
Joined: Sun Apr 08, 2012 6:12 pm

Re: CUDASynth

Post by gonca » Mon Oct 08, 2018 4:57 pm

DJATOM

Two items
Don't know if DGDenoise is actually cudasynth enabled yet
clip = core.avs.DGDenoise(clip, fsrc="gpu0", fdst="gpu0")
clip = core.avs.DGSharpen(clip, fsrc="gpu0", fdst="cpu")
should actually be
clip = core.avs.DGDenoise(clip, fsrc="gpu0", fdst="gpu1")
clip = core.avs.DGSharpen(clip, fsrc="gpu1", fdst="cpu")
to get the ping pong effect
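
The alternating gpu0/gpu1 naming suggested above can be sketched as a small helper that assigns fsrc/fdst labels along a filter chain. This only illustrates the labeling pattern from this post, which the thread itself does not confirm as required; `pingpong_labels` is a hypothetical name, not part of DGDecodeNV or Vapoursynth, and the sketch does not call either.

```python
def pingpong_labels(filter_names):
    """Return (name, fsrc, fdst) for each stage of a source -> filters chain.

    Intermediate stages alternate between "gpu0" and "gpu1";
    the last stage writes to "cpu". The source stage has no fsrc.
    """
    stages = []
    prev = None  # the source filter reads from the file, so it has no fsrc
    for i, name in enumerate(filter_names):
        is_last = i == len(filter_names) - 1
        dst = "cpu" if is_last else f"gpu{i % 2}"
        stages.append((name, prev, dst))
        prev = dst  # the next stage reads what this stage wrote
    return stages


chain = pingpong_labels(["DGSource", "DGDenoise", "DGSharpen"])
for name, fsrc, fdst in chain:
    print(name, "fsrc =", fsrc, "fdst =", fdst)
```

For the three-filter chain used in this thread, the helper reproduces the labels in the corrected lines above: DGSource writes to gpu0, DGDenoise reads gpu0 and writes gpu1, and DGSharpen reads gpu1 and writes to cpu.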

DJATOM
Posts: 32
Joined: Fri Oct 16, 2015 6:14 pm

Re: CUDASynth

Post by DJATOM » Mon Oct 08, 2018 5:04 pm

Oh, I thought gpu0/gpu1 was for a two-card setup (I have only one).

gonca
Distinguished Member
Posts: 607
Joined: Sun Apr 08, 2012 6:12 pm

Re: CUDASynth

Post by gonca » Mon Oct 08, 2018 5:09 pm

I only have one card as well
I think it has to do with the pipelines/kernels???

Try it and see if it makes a difference

DJATOM
Posts: 32
Joined: Fri Oct 16, 2015 6:14 pm

Re: CUDASynth

Post by DJATOM » Mon Oct 08, 2018 5:16 pm

Tried and...
Image

gonca
Distinguished Member
Posts: 607
Joined: Sun Apr 08, 2012 6:12 pm

Re: CUDASynth

Post by gonca » Mon Oct 08, 2018 5:28 pm

Could you check what your GPU usage is while running the script?
