CUDASynth

These CUDA filters are packaged into DGDecodeNV, which is part of DGDecNV.
hydra3333
Posts: 394
Joined: Wed Oct 06, 2010 3:34 am

Re: CUDASynth

Post by hydra3333 »

Graft, D. A., 2016, “Clauser-Horne/Eberhard inequality violation by a local model”, Advanced Science, Engineering and Medicine, 8: 496–502.
Nice work, Sir.
(I cited it without reading it, on the basis of not being smart enough to comprehend it :))
I really do like it here.
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: CUDASynth

Post by admin »

I'm sure you're smart enough; it's just a matter of studying the field first to avoid common errors.

That paper is probably my favorite. The idea hit me in the middle of coaching a swimming workout, and I developed the math on the back of the whiteboard while people were swimming a set. They asked me why I was suddenly in such a good mood! I had just refuted a large number of experiments purporting to prove quantum nonlocality.

I'm going to do something about TrueHD demuxing, then I will come back to CUDASynth. I can do it in parallel with my current physics paper. It's about EPR steering and Lüders projection. Whee!
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: CUDASynth

Post by admin »

I'm back to CUDASynth for a while. I had to update everything because the CUDASynth DGDecodeNV was at 2053.0.0.158 and the current DGDecNV is 2053.0.0.179. Note that the DGHDRtoSDR used below is 1.11 and needs to be upgraded to 1.12.

Here is a test script:

loadplugin("D:\Don\Programming\C++\Avisynth filters\CUDASynth\DGDecodeNV\x64\release\dgdecodenv.dll")
loadplugin("d:\don\Programming\C++\avisynth filters\CUDASynth\DGHDRtoSDR\x64\release\dghdrtosdr.dll")
dgsource("LG Chess 4K Demo.dgi",fulldepth=true,fdst="gpu0")
dghdrtosdr(impl="255",light=250,fsrc="gpu0",fdst="gpu1",fulldepth=true)
dgdenoise(fsrc="gpu1",fdst="gpu0",chroma=true)
dgsharpen(fsrc="gpu0")
trim(0,999)

For non-CUDASynth operation, all fsrc and fdst values are replaced with "cpu".
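
As a sketch of what that substitution looks like, here is the script above with every fsrc and fdst set to "cpu". This is derived only by applying the stated rule to the script above; it is an assumption that the "No CudaSynth" script used for the benchmark is otherwise identical, and the paths and DGI file name are simply copied from the original.

```avisynth
# Non-CUDASynth variant (assumed): same filter chain, but every frame
# source/destination is "cpu" instead of a GPU buffer ("gpu0"/"gpu1").
loadplugin("D:\Don\Programming\C++\Avisynth filters\CUDASynth\DGDecodeNV\x64\release\dgdecodenv.dll")
loadplugin("d:\don\Programming\C++\avisynth filters\CUDASynth\DGHDRtoSDR\x64\release\dghdrtosdr.dll")
dgsource("LG Chess 4K Demo.dgi",fulldepth=true,fdst="cpu")
dghdrtosdr(impl="255",light=250,fsrc="cpu",fdst="cpu",fulldepth=true)
dgdenoise(fsrc="cpu",fdst="cpu",chroma=true)
dgsharpen(fsrc="cpu")
trim(0,999)
```

The only difference from the CUDASynth script is where each filter reads and writes its frames, so any speed difference between the two comes from that frame hand-off rather than from different filter settings.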

Here are the test results, showing a very healthy FPS improvement of x3.6 (27.05 to 97.40 average FPS), enough to make the difference between non-real-time and real-time playback of this 59.94 fps clip:

-----
D:\Don\Programming\C++\Avisynth filters\CUDASynth\CUDASynth Test 1>avsmeter64 "LG Chess 4K Demo - No CudaSynth.avs"

AviSynth+ 0.1 (r2728, MT, x86_64) (0.1.0.0)

Number of frames: 1000
Length (hh:mm:ss.ms): 00:00:16.683
Frame width: 3840
Frame height: 2160
Framerate: 59.940 (60000/1001)
Colorspace: YUV420P16

Frames processed: 1000 (0 - 999)
FPS (min | max | average): 8.496 | 30.45 | 27.05
Memory usage (phys | virt): 316 | 1362 MiB
Thread count: 19
CPU usage (average): 12%

Time (elapsed): 00:00:36.974

D:\Don\Programming\C++\Avisynth filters\CUDASynth\CUDASynth Test 1>avsmeter64 "LG Chess 4K Demo.avs"

AviSynth+ 0.1 (r2728, MT, x86_64) (0.1.0.0)

Number of frames: 1000
Length (hh:mm:ss.ms): 00:00:16.683
Frame width: 3840
Frame height: 2160
Framerate: 59.940 (60000/1001)
Colorspace: YUV420P16

Frames processed: 1000 (0 - 999)
FPS (min | max | average): 54.09 | 103.0 | 97.40
Memory usage (phys | virt): 316 | 1356 MiB
Thread count: 19
CPU usage (average): 12%

Time (elapsed): 00:00:10.267
-----
Guest

Re: CUDASynth

Post by Guest »

Do you still have your 1080 Ti lying around?
It would be interesting to see two-card performance, if possible.
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: CUDASynth

Post by admin »

I don't have a system that properly supports two cards.
Guest

Re: CUDASynth

Post by Guest »

It was just an idea.
I am sure the average user doesn't have two cards installed either
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: CUDASynth

Post by admin »

It needs all that extra power and power cables, an extra PCIe x16 slot, and extra lanes on the CPU. I hope my next system will have all that. Anyway, expect about a x1.5 to x2.0 boost. ;)
Guest

Re: CUDASynth

Post by Guest »

I am sure the average user doesn't have two cards installed either
I am a pretty average user: I don't have two cards, and I don't know whether my system could handle them, for the reasons you mentioned.
Just curiosity.
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: CUDASynth

Post by admin »

Curiosity killed the cat. :lol:
Guest 2
Posts: 903
Joined: Mon Sep 20, 2010 2:18 pm

Re: CUDASynth

Post by Guest 2 »

I'd really love to see MVTools2 and KNLMeans ported to CudaSynth :bow:

(My most-used filter is SMDegrain ;) )
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: CUDASynth

Post by admin »

DGDenoise is my CUDASynth version of KNLMeans. It's included in DGDecodeNV.dll (standard and CUDASynth versions). Have you looked at it? It's already CUDASynth enabled. I will release the latest CUDASynth today.
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: CUDASynth

Post by admin »

CUDASynth 0.3:

http://rationalqm.us/misc/CUDASynth_0.3.rar

Make your DGI files with DGIndexNV 2053.0.0.179.
hydra3333
Posts: 394
Joined: Wed Oct 06, 2010 3:34 am

Re: CUDASynth

Post by hydra3333 »

:bravo:

Thank you ! :D
Guest 2
Posts: 903
Joined: Mon Sep 20, 2010 2:18 pm

Re: CUDASynth

Post by Guest 2 »

admin wrote:
Wed Aug 07, 2019 5:12 am
DGDenoise is my CUDASynth version of KNLMeans. It's included in DGDecodeNV.dll (standard and CUDASynth versions). Have you looked at it?
Yes, thanks, and it gives great results when used on its own. :salute:

I have tried to use it as a prefilter for SMDegrain, but I can't replicate the results of KNLMeans, as the latter is more "integrated" in the script and has some optimizations.
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: CUDASynth

Post by admin »

Guest 2 wrote:
Wed Aug 07, 2019 12:14 pm
I have tried to use it as a prefilter for SMDegrain, but I can't replicate the results of KNLMeans, as the latter is more "integrated" in the script and has some optimizations.
Lack of relevant explanation and details will guarantee that you are ignored. Tough love!
Guest 2
Posts: 903
Joined: Mon Sep 20, 2010 2:18 pm

Re: CUDASynth

Post by Guest 2 »

admin wrote:
Wed Aug 07, 2019 12:27 pm
Lack of relevant explanation and details will guarantee that you are ignored. Tough love!
Sorry, I was in a bit of a hurry.

I am talking about this part of SMDegrain.avsi:


# SMDegrain prefilters
 
function SMDegrain_prefilters (clip input, int "prefilter", bool "chroma", int "Chr", int "Chr2", int "bug_wa", bool "lsb", bool "lsb_in", bool "Interlaced", val "if5", int "pel", String "device_type", int "device_id", int "d", int "a", bool "slices", bool "planar", clip "inputP", clip "input8", clip "input8y", clip "inputY", val "input8h", float "h", String "knlm_params", String "cplace")
{
Interlaced   = Default( Interlaced  ,false)
slices       = default(slices, true)
if5          = Default( if5  ,interlaced ? (GetParity(input)                            ? true : false) : nop())
lsb_in       = Default( lsb_in  ,false)
lsb          = Default( lsb     ,lsb_in)
pel          = default( pel, (input.width () > 1099 ||  input.height() > (lsb_in ? 1199 : 599)) ? 1 : 2 )
sisphbd      = AvsPlusVersionNumber > 2294
chroma       = default( chroma, true)
planar       = Default( planar , input.isyuy2())
prefilter    = Default( prefilter, 3)
cplace    = Default( cplace, "mpeg2")
 
Chr          = Default(Chr,     chroma ? 3 : 1)
Chr2         = Default(Chr2,    chroma ? 3 : (prefilter==3 ? 2 : 1))
avs26        = !(VersionNumber() < 2.60)
 
Assert(!(!defined(inputP) && prefilter==-1), "prefilter must be between 0~4: "+string(prefilter))
lsb_native = sisphbd ? !(Input.BitsPerComponent() > 8 && (lsb)) : true
sisphbd ? Assert(lsb_native, "lsb hack is not Compatible with native high bit depth" ) : nop()
 
# Input preparation for: LSB_IN, Interlacing, Planar and MSuper optimization when pel=2
 
inputY  = defined(inputY )                    ? inputY  : planar      ? (lsb_in   ? Dither_YUY2toPlanar16(input)         : Interleaved2planar(input))                           : input
 
inputP  = defined(inputP )                    ? inputP  : !interlaced ? (pel == 2 ? inputY.AssumeFrameBased()            : inputY)                                             : \
                                                                        (if5      ? inputY.AssumeTFF().SeparateFields()  : inputY.AssumeBFF().SeparateFields())
 
input8h = defined(input8h) && isclip(input8h) ? input8h : lsb_in      ?             inputP. Ditherpost(mode=6, slice=slices)                                                    : nop()
input8y = defined(input8y)                    ? input8y : planar      ? (lsb_in   ? input8h.Dither_YUY2toInterleaved()   :  inputP)                                             : inputP
input8  = defined(input8 )                    ? input8  : lsb_in      ? (planar   ? input8y.Interleaved2planar()         : input8h)                                             : input8y
 
inputd  = (prefilter == 3 || prefilter == 4)  ? Interlaced ? if5 ? input.AssumeTFF().SeparateFields() : \
                                                                   input.AssumeBFF().SeparateFields() : \
                                                             input : nop()
 
bug_wa    = defined(bug_wa) ? bug_wa : interlaced && planar && chroma && !avs26 ? 2 : Chr # bug: crash prevention workaround
 
# The mt_merge() line for prefilter=3 should be swapped with a high bitdepth variant (Dither_merge16_8() ?) due to a 1 point limited range
# in both range ends, but then it won't work with planar sources. This isn't as critical since we are only trying to find motion vectors.
                     (prefilter==-1) ?  inputP                                                                                                                                      : \
                     (prefilter==0)  ?  input8.MinBlur(0,Chr,planar)                                               : \
                     (prefilter==1)  ?  input8.MinBlur(1,Chr,planar)                                               : \
                     (prefilter==2)  ?  input8.MinBlur(2,Chr,planar)                                               : \
                     (prefilter==3)  ?  (!planar && lsb ? Dither_merge16_8( inputd.Dfttest(sstring="0.0:4.0 0.2:9.0 1.0:15.0",tbsize=1,u=chroma,v=chroma,lsb=true,lsb_in=lsb_in,quiet=true), lsb_in?inputP:inputP.Dither_convert_8_to_16(),                                \
                                                                   lsb_in?inputP.Dither_lut16("x 4096 < 65535 x 19200 > 0 65535 x 4096 - 4.338916843220339 * - ? ?",u=1,v=1).Ditherpost(mode=6, slice=slices, u=Chr,   v=Chr)                                           \
                                                                         :inputd.mt_lut(      "x 16 < 255 x 75 > 0 255 x 16 - 255 75 16 - / * - ? ?",u=1,v=1), luma=chroma,                       u=Chr2,  v=Chr2)                                                       : \
                                 avs26 && planar && lsb ? Dither_merge16_8( inputd.Dfttest(sstring="0.0:4.0 0.2:9.0 1.0:15.0",tbsize=1,u=chroma,v=chroma,lsb=true,lsb_in=lsb_in,quiet=true).ConvertToYV16(), lsb_in?inputP:inputd.ConvertToYV16().Dither_convert_8_to_16(),                                \
                                                                   lsb_in?inputP.Dither_lut16("x 4096 < 65535 x 19200 > 0 65535 x 4096 - 4.338916843220339 * - ? ?",u=1,v=1).Ditherpost(mode=6, slice=slices, u=Chr,   v=Chr)                                           \
                                                                         :inputd.ConvertToYV16().mt_lut(      "x 16 < 255 x 75 > 0 255 x 16 - 255 75 16 - / * - ? ?",u=1,v=1), luma=chroma,                       u=Chr2,  v=Chr2).ConvertToYUY2().Interleaved2planar(!chroma)                                                       : \
                                           avs26 && planar ? mt_merge     (        Dfttest(!lsb_in?inputd:input8y,sstring="0.0:4.0 0.2:9.0 1.0:15.0",tbsize=1,u=chroma,v=chroma,dither=1).ConvertToYV16(),!lsb_in?input8y.Planar2Interleaved(!chroma).ConvertToYV16():input8y.ConvertToYV16(),                                                   \
                                                                          input8.Planar2Interleaved(!chroma).ConvertToYV16().mt_lut(      "x 16 < 255 x 75 > 0 255 x 16 - 4.322033898305085 * - ? ?",u=1,v=1), luma=planar?false:chroma,          u=bug_wa,v=bug_wa).ConvertToYUY2().Interleaved2planar(!chroma)                                                    : \
                                                     avs26 ? mt_merge     (Dfttest(inputd,sstring="0.0:4.0 0.2:9.0 1.0:15.0",tbsize=1,u=chroma,v=chroma,dither=1),input8,   \
                                                                          input8.mt_lut("x 16 scalef < range_max x 75 scalef > 0 range_max x 16 scalef - range_max 75 scalef 16 scalef - / * - ? ?",use_expr=2,u=1,v=1), luma=chroma, cplace=cplace, u=chr,v=chr)                                                   : \
                                                             mt_merge     (planar ? Dfttest(!lsb_in?inputd:input8y,sstring="0.0:4.0 0.2:9.0 1.0:15.0",tbsize=1,u=chroma,v=chroma,dither=1).Interleaved2planar(!chroma)                          : \
                                                                                    Dfttest(!lsb_in?inputd:input8 ,sstring="0.0:4.0 0.2:9.0 1.0:15.0",tbsize=1,u=chroma,v=chroma,dither=1),input8,   \
                                                                          input8.mt_lut(      "x 16 < 255 x 75 > 0 255 x 16 - 4.322033898305085 * - ? ?",u=1,v=1), luma=chroma,          u=bug_wa,v=bug_wa))                                                   : \
                     (prefilter==4)  ?  planar ? inputd.SMDegrain_KNLMeansCL(lsb=lsb, lsb_in=lsb_in, device_type=device_type, device_id=device_id, chroma=chroma, h=h, d=d, a=a, knlm_params=knlm_params).Interleaved2planar(!chroma) : \
                                                 inputd.SMDegrain_KNLMeansCL(lsb=lsb, lsb_in=lsb_in, device_type=device_type, device_id=device_id, chroma=chroma, h=h, d=d, a=a, knlm_params=knlm_params) : \
                                         Assert(false,    "prefilter must be between -1~4: "+string(prefilter))
}
And this above all:


# SMDegrain_KNLMeansCL
 
function SMDegrain_KNLMeansCL (clip input, String "device_type", int "device_id", bool "chroma", bool "lsb", bool "lsb_in", float "h", int "d", int "a", String "knlm_params")
{
d            = Default( d ,0)
a            = Default( a ,1)
h            = Default( h ,7.0)
deviceid     = Default( device_id ,0)
knlm_params  = default(knlm_params, "")
chroma       = Default( chroma  ,true)
lsb_in       = Default( lsb_in  ,false)
lsb          = Default( lsb     ,lsb_in)
 
                                              sisphbd = AvsPlusVersionNumber > 2294
                                              fullchr = sisphbd ? input.is444() : input.isyv24()
                                              chr420  = sisphbd ? input.is420() : input.isyv12()
                                              nochr   = sisphbd ?   input.isy() : input.isy8()
                                              chrlsb  = chroma && !fullchr && !nochr
                                              NL_in   = lsb && !lsb_in ? input.Dither_convert_8_to_16() : input
                                              cnl     = chrlsb ? "Y" : input.isrgb() ? "auto" : chroma ? "YUV" : "Y"
 
NL_in = !chrlsb && input.isyuy2() ? NL_in.converttoyv16() : NL_in
 
                               chrlsb ? eval("""
                                             # In a more lucid state I could probably have laid out this block much better... or not...
 
                                             NL_W    = width(NL_in)
                                             Uclip   = sisphbd ? ExtractU(NL_in) : UToY8(NL_in)
                                             Vclip   = sisphbd ? ExtractV(NL_in) : VToY8(NL_in)
                                             NL_lsb  = (chr420 ? StackVertical(  lsb ? StackVertical(Dither_get_msb(uclip),Dither_get_msb(vclip)) : uclip,\
                                                                                 lsb ? StackVertical(Dither_get_lsb(uclip),Dither_get_lsb(vclip)) : vclip) : \
                                                                 StackHorizontal(uclip,vclip))
 
                                             nlc     = StackHorizontal(sisphbd ? ConvertToY(NL_in) : ConvertToY8(NL_in),NL_lsb)
 
                                             nlc     = Eval("nlc.KNLMeansCL(D=d, A=a, h=h,stacked=lsb_in || lsb,device_type=device_type,device_id=deviceid,channels=cnl" + knlm_params + ")")
 
                                             uvh = lsb_in || lsb ? uclip.height()/2 : uclip.height()
                                             uvw = uclip.width()
 
                                             nly = nlc.crop(0,0,chr420 ? -uvw : -(uvw+uvw),0)
 
                                             nlu = chr420 ? lsb_in || lsb ? StackVertical(Dither_get_msb(nlc).crop(NL_W,0,0,-uvh),Dither_get_lsb(nlc).crop(NL_W,0,0,-uvh)) : nlc.crop(NL_W,0,0,-uvh)                : \
                                                             nlc.crop(NL_W    ,0,-uvw,0)
                                             nlv = chr420 ? lsb_in || lsb ? StackVertical(Dither_get_msb(nlc).crop(NL_W,uvh, 0,0),Dither_get_lsb(nlc).crop(NL_W,uvh, 0,0)) : nlc.crop(NL_W,uvh, 0,0)                : \
                                                             nlc.crop(NL_W+uvw,0,   0,0)
                                             YToUV(nlu, nlv, nly)
                                            """) :   Eval("NL_in.KNLMeansCL(D=d, A=a, h=h,stacked=lsb_in || lsb,device_type=device_type,device_id=deviceid,channels=cnl" + knlm_params + ")")
 
!lsb && lsb_in ? Ditherpost(mode=6,slice=false) : last
input.isyuy2() ? converttoyuy2() : last
}
I am a script noob and totally unable to recreate such procedures for DGDenoise :|
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: CUDASynth

Post by admin »

OK, got it. I'll try to find some time to see what is feasible here.
fidodkk
Posts: 4
Joined: Fri Jul 20, 2018 10:30 am

Re: CUDASynth

Post by fidodkk »

Oh, a build is available.

I would love to play with it when the weather gets colder here in northern Europe.
hydra3333
Posts: 394
Joined: Wed Oct 06, 2010 3:34 am

Re: CUDASynth

Post by hydra3333 »

admin wrote:
Wed Aug 07, 2019 8:29 am
CUDASynth 0.3:

http://rationalqm.us/misc/CUDASynth_0.3.rar

Make your DGI files with DGIndexNV 2053.0.0.179.
Hello. There have been new releases since DGIndexNV 2053.0.0.179, and I wonder whether 0.3 is compatible?

And ... perhaps even a CUDASynth v1.0 release, say in 2020?

Edit: I unfortunately muddied the waters over at https://forum.videohelp.com/threads/393 ... ost2565866 ... if you wish to stick your oar in to correct me, that's up to you.
PS great work all round, mate.
Rocky
Posts: 3557
Joined: Fri Sep 06, 2019 12:57 pm

Re: CUDASynth

Post by Rocky »

0.3 should work with the current DGIndexNV. If there is a problem, please advise.

Thanks for the kind words. I'll pop in over there and see what's cooking.
Natasha
Posts: 150
Joined: Wed Nov 20, 2019 11:11 am

Re: CUDASynth

Post by Natasha »

hydra3333, you are real man for me. Would love to know you better. Send PM we make music together. Intimate theme. I'm hot, no?

your wish my command,
Natasha
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: CUDASynth

Post by admin »

Natasha, please find a dating site.
Natasha
Posts: 150
Joined: Wed Nov 20, 2019 11:11 am

Re: CUDASynth

Post by Natasha »

Forum Mr. Big! Big boy, which site you hang at? We can hang out.
hydra3333
Posts: 394
Joined: Wed Oct 06, 2010 3:34 am

Re: CUDASynth

Post by hydra3333 »

I'd love to liaise with young Natty, however my liaising days have been over for about 40 years :)
Natasha is welcome to dream though !
hydra3333
Posts: 394
Joined: Wed Oct 06, 2010 3:34 am

Re: CUDASynth

Post by hydra3333 »

And, wishing you a Happy Christmas !!