
Re: CUDASynth

Posted: Tue May 21, 2019 10:00 pm
by hydra3333
Graft, D. A., 2016, “Clauser-Horne/Eberhard inequality violation by a local model”, Advanced Science, Engineering and Medicine, 8: 496–502.
Nice work, Sir.
(I cited it without reading it, on the basis that I'm not smart enough to comprehend it :))

Re: CUDASynth

Posted: Tue May 21, 2019 10:49 pm
by admin
I'm sure you're smart enough; it's just a matter of studying the field first to avoid common errors.

That paper is probably my favorite. The idea hit me in the middle of coaching a swimming workout, and I developed the math on the back of the whiteboard while people were swimming a set. They asked me why I was suddenly in such a good mood! Just refuted a large number of experiments purporting to prove quantum nonlocality.

I'm going to do something about TrueHD demuxing, then I will come back to CUDASynth. I can do it in parallel with my current physics paper. It's about EPR steering and Lüders projection. Whee!

Re: CUDASynth

Posted: Sat Aug 03, 2019 12:55 pm
by admin
I'm back to CUDASynth for a while. I had to update everything because the CUDASynth DGDecodeNV was at 2053.0.0.158 and the current DGDecNV is 2053.0.0.179. Note that the DGHDRtoSDR used below is 1.11 and needs to be upgraded to 1.12.

Here is a test script:

loadplugin("D:\Don\Programming\C++\Avisynth filters\CUDASynth\DGDecodeNV\x64\release\dgdecodenv.dll")
loadplugin("d:\don\Programming\C++\avisynth filters\CUDASynth\DGHDRtoSDR\x64\release\dghdrtosdr.dll")
dgsource("LG Chess 4K Demo.dgi",fulldepth=true,fdst="gpu0")
dghdrtosdr(impl="255",light=250,fsrc="gpu0",fdst="gpu1",fulldepth=true)
dgdenoise(fsrc="gpu1",fdst="gpu0",chroma=true)
dgsharpen(fsrc="gpu0")
trim(0,999)

For non-CUDASynth operation all fsrc and fdst are replaced with "cpu".
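
A minimal sketch of what that non-CUDASynth variant would look like, assuming the only change is substituting "cpu" for each fsrc and fdst value that appears in the test script above. The actual "LG Chess 4K Demo - No CudaSynth.avs" is not shown here, so this is a reconstruction, not the benchmarked file:

```avisynth
# Reconstructed non-CUDASynth variant (assumption: only fsrc/fdst values differ).
# Every frame is handed between filters through CPU memory instead of staying on the GPU.
loadplugin("D:\Don\Programming\C++\Avisynth filters\CUDASynth\DGDecodeNV\x64\release\dgdecodenv.dll")
loadplugin("d:\don\Programming\C++\avisynth filters\CUDASynth\DGHDRtoSDR\x64\release\dghdrtosdr.dll")
dgsource("LG Chess 4K Demo.dgi",fulldepth=true,fdst="cpu")
dghdrtosdr(impl="255",light=250,fsrc="cpu",fdst="cpu",fulldepth=true)
dgdenoise(fsrc="cpu",fdst="cpu",chroma=true)
dgsharpen(fsrc="cpu")
trim(0,999)
```

The filter chain and all other parameters are unchanged, so any FPS difference between the two scripts comes from where the frames live between filters.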

Here are the test results showing a very healthy FPS improvement of x3.6 (97.40 vs. 27.05 average), enough to make the difference between non-real-time and real-time playback of this 59.94 fps source:

-----
D:\Don\Programming\C++\Avisynth filters\CUDASynth\CUDASynth Test 1>avsmeter64 "LG Chess 4K Demo - No CudaSynth.avs"

AviSynth+ 0.1 (r2728, MT, x86_64) (0.1.0.0)

Number of frames: 1000
Length (hh:mm:ss.ms): 00:00:16.683
Frame width: 3840
Frame height: 2160
Framerate: 59.940 (60000/1001)
Colorspace: YUV420P16

Frames processed: 1000 (0 - 999)
FPS (min | max | average): 8.496 | 30.45 | 27.05
Memory usage (phys | virt): 316 | 1362 MiB
Thread count: 19
CPU usage (average): 12%

Time (elapsed): 00:00:36.974

D:\Don\Programming\C++\Avisynth filters\CUDASynth\CUDASynth Test 1>avsmeter64 "LG Chess 4K Demo.avs"

AviSynth+ 0.1 (r2728, MT, x86_64) (0.1.0.0)

Number of frames: 1000
Length (hh:mm:ss.ms): 00:00:16.683
Frame width: 3840
Frame height: 2160
Framerate: 59.940 (60000/1001)
Colorspace: YUV420P16

Frames processed: 1000 (0 - 999)
FPS (min | max | average): 54.09 | 103.0 | 97.40
Memory usage (phys | virt): 316 | 1356 MiB
Thread count: 19
CPU usage (average): 12%

Time (elapsed): 00:00:10.267
-----

Re: CUDASynth

Posted: Sat Aug 03, 2019 3:12 pm
by Guest
Do you still have your 1080ti lying around?
It would be interesting to see two-card performance, if possible.

Re: CUDASynth

Posted: Sun Aug 04, 2019 5:53 am
by admin
I don't have a system that properly supports two cards.

Re: CUDASynth

Posted: Sun Aug 04, 2019 6:32 am
by Guest
It was just an idea.
I am sure the average user doesn't have two cards installed either

Re: CUDASynth

Posted: Sun Aug 04, 2019 10:40 am
by admin
Need all that extra power and power cables, extra PCIe x16 slot, and extra lanes on the CPU. I hope my next system will have all that. Anyway, expect about a x1.5 to x2.0 boost. ;)

Re: CUDASynth

Posted: Sun Aug 04, 2019 10:57 am
by Guest
I am sure the average user doesn't have two cards installed either
I am a pretty average user, don't have two cards and don't know if my system could handle it for the reasons you mentioned
Just curiosity

Re: CUDASynth

Posted: Sun Aug 04, 2019 11:06 am
by admin
Curiosity killed the cat. :lol:

Re: CUDASynth

Posted: Wed Aug 07, 2019 2:06 am
by Guest 2
I'd really love to see MVTools2 and KNLMeans ported to CudaSynth :bow:

(My most-used filter is SMDegrain ;) )

Re: CUDASynth

Posted: Wed Aug 07, 2019 5:12 am
by admin
DGDenoise is my CUDASynth version of KNLMeans. It's included in DGDecodeNV.dll (standard and CUDASynth versions). Have you looked at it? It's already CUDASynth enabled. I will release the latest CUDASynth today.

Re: CUDASynth

Posted: Wed Aug 07, 2019 8:29 am
by admin
CUDASynth 0.3:

http://rationalqm.us/misc/CUDASynth_0.3.rar

Make your DGI files with DGIndexNV 2053.0.0.179.

Re: CUDASynth

Posted: Wed Aug 07, 2019 9:28 am
by hydra3333
:bravo:

Thank you ! :D

Re: CUDASynth

Posted: Wed Aug 07, 2019 12:14 pm
by Guest 2
admin wrote:
Wed Aug 07, 2019 5:12 am
DGDenoise is my CUDASynth version of KNLMeans. It's included in DGDecodeNV.dll (standard and CUDASynth versions). Have you looked at it?
Yes, thanks and it has great results when used on its own. :salute:

I have tried to use it as a prefilter for SMDegrain, but I can't replicate the results I get with KNLMeans, as the latter is more "integrated" in the script and has some optimizations.

Re: CUDASynth

Posted: Wed Aug 07, 2019 12:27 pm
by admin
Guest 2 wrote:
Wed Aug 07, 2019 12:14 pm
I have tried to use it as prefilter for SMDegrain but I can't replicate the same results of KNLMeans as the latest is more "integrated" in the script and has some optimizations.
Lack of relevant explanation and details will guarantee that you are ignored. Tough love!

Re: CUDASynth

Posted: Thu Aug 08, 2019 5:43 am
by Guest 2
admin wrote:
Wed Aug 07, 2019 12:27 pm
Lack of relevant explanation and details will guarantee that you are ignored. Tough love!
Sorry, I was in a bit of a hurry.

I am talking about this part of SMDegrain.avsi:

Code:

# SMDegrain prefilters
 
function SMDegrain_prefilters (clip input, int "prefilter", bool "chroma", int "Chr", int "Chr2", int "bug_wa", bool "lsb", bool "lsb_in", bool "Interlaced", val "if5", int "pel", String "device_type", int "device_id", int "d", int "a", bool "slices", bool "planar", clip "inputP", clip "input8", clip "input8y", clip "inputY", val "input8h", float "h", String "knlm_params", String "cplace")
{
Interlaced   = Default( Interlaced  ,false)
slices       = default(slices, true)
if5          = Default( if5  ,interlaced ? (GetParity(input)                            ? true : false) : nop())
lsb_in       = Default( lsb_in  ,false)
lsb          = Default( lsb     ,lsb_in)
pel          = default( pel, (input.width () > 1099 ||  input.height() > (lsb_in ? 1199 : 599)) ? 1 : 2 )
sisphbd      = AvsPlusVersionNumber > 2294
chroma       = default( chroma, true)
planar       = Default( planar , input.isyuy2())
prefilter    = Default( prefilter, 3)
cplace    = Default( cplace, "mpeg2")
 
Chr          = Default(Chr,     chroma ? 3 : 1)
Chr2         = Default(Chr2,    chroma ? 3 : (prefilter==3 ? 2 : 1))
avs26        = !(VersionNumber() < 2.60)
 
Assert(!(!defined(inputP) && prefilter==-1), "prefilter must be between 0~4: "+string(prefilter))
lsb_native = sisphbd ? !(Input.BitsPerComponent() > 8 && (lsb)) : true
sisphbd ? Assert(lsb_native, "lsb hack is not Compatible with native high bit depth" ) : nop()
 
# Input preparation for: LSB_IN, Interlacing, Planar and MSuper optimization when pel=2
 
inputY  = defined(inputY )                    ? inputY  : planar      ? (lsb_in   ? Dither_YUY2toPlanar16(input)         : Interleaved2planar(input))                           : input
 
inputP  = defined(inputP )                    ? inputP  : !interlaced ? (pel == 2 ? inputY.AssumeFrameBased()            : inputY)                                             : \
                                                                        (if5      ? inputY.AssumeTFF().SeparateFields()  : inputY.AssumeBFF().SeparateFields())
 
input8h = defined(input8h) && isclip(input8h) ? input8h : lsb_in      ?             inputP. Ditherpost(mode=6, slice=slices)                                                    : nop()
input8y = defined(input8y)                    ? input8y : planar      ? (lsb_in   ? input8h.Dither_YUY2toInterleaved()   :  inputP)                                             : inputP
input8  = defined(input8 )                    ? input8  : lsb_in      ? (planar   ? input8y.Interleaved2planar()         : input8h)                                             : input8y
 
inputd  = (prefilter == 3 || prefilter == 4)  ? Interlaced ? if5 ? input.AssumeTFF().SeparateFields() : \
                                                                   input.AssumeBFF().SeparateFields() : \
                                                             input : nop()
 
bug_wa    = defined(bug_wa) ? bug_wa : interlaced && planar && chroma && !avs26 ? 2 : Chr # bug: crash prevention workaround
 
# The mt_merge() line for prefilter=3 should be swapped with a high bitdepth variant (Dither_merge16_8() ?) due to a 1 point limited range
# in both range ends, but then it won't work with planar sources. This isn't as critical since we are only trying to find motion vectors.
                     (prefilter==-1) ?  inputP                                                                                                                                      : \
                     (prefilter==0)  ?  input8.MinBlur(0,Chr,planar)                                               : \
                     (prefilter==1)  ?  input8.MinBlur(1,Chr,planar)                                               : \
                     (prefilter==2)  ?  input8.MinBlur(2,Chr,planar)                                               : \
                     (prefilter==3)  ?  (!planar && lsb ? Dither_merge16_8( inputd.Dfttest(sstring="0.0:4.0 0.2:9.0 1.0:15.0",tbsize=1,u=chroma,v=chroma,lsb=true,lsb_in=lsb_in,quiet=true), lsb_in?inputP:inputP.Dither_convert_8_to_16(),                                \
                                                                   lsb_in?inputP.Dither_lut16("x 4096 < 65535 x 19200 > 0 65535 x 4096 - 4.338916843220339 * - ? ?",u=1,v=1).Ditherpost(mode=6, slice=slices, u=Chr,   v=Chr)                                           \
                                                                         :inputd.mt_lut(      "x 16 < 255 x 75 > 0 255 x 16 - 255 75 16 - / * - ? ?",u=1,v=1), luma=chroma,                       u=Chr2,  v=Chr2)                                                       : \
                                 avs26 && planar && lsb ? Dither_merge16_8( inputd.Dfttest(sstring="0.0:4.0 0.2:9.0 1.0:15.0",tbsize=1,u=chroma,v=chroma,lsb=true,lsb_in=lsb_in,quiet=true).ConvertToYV16(), lsb_in?inputP:inputd.ConvertToYV16().Dither_convert_8_to_16(),                                \
                                                                   lsb_in?inputP.Dither_lut16("x 4096 < 65535 x 19200 > 0 65535 x 4096 - 4.338916843220339 * - ? ?",u=1,v=1).Ditherpost(mode=6, slice=slices, u=Chr,   v=Chr)                                           \
                                                                         :inputd.ConvertToYV16().mt_lut(      "x 16 < 255 x 75 > 0 255 x 16 - 255 75 16 - / * - ? ?",u=1,v=1), luma=chroma,                       u=Chr2,  v=Chr2).ConvertToYUY2().Interleaved2planar(!chroma)                                                       : \
                                           avs26 && planar ? mt_merge     (        Dfttest(!lsb_in?inputd:input8y,sstring="0.0:4.0 0.2:9.0 1.0:15.0",tbsize=1,u=chroma,v=chroma,dither=1).ConvertToYV16(),!lsb_in?input8y.Planar2Interleaved(!chroma).ConvertToYV16():input8y.ConvertToYV16(),                                                   \
                                                                          input8.Planar2Interleaved(!chroma).ConvertToYV16().mt_lut(      "x 16 < 255 x 75 > 0 255 x 16 - 4.322033898305085 * - ? ?",u=1,v=1), luma=planar?false:chroma,          u=bug_wa,v=bug_wa).ConvertToYUY2().Interleaved2planar(!chroma)                                                    : \
                                                     avs26 ? mt_merge     (Dfttest(inputd,sstring="0.0:4.0 0.2:9.0 1.0:15.0",tbsize=1,u=chroma,v=chroma,dither=1),input8,   \
                                                                          input8.mt_lut("x 16 scalef < range_max x 75 scalef > 0 range_max x 16 scalef - range_max 75 scalef 16 scalef - / * - ? ?",use_expr=2,u=1,v=1), luma=chroma, cplace=cplace, u=chr,v=chr)                                                   : \
                                                             mt_merge     (planar ? Dfttest(!lsb_in?inputd:input8y,sstring="0.0:4.0 0.2:9.0 1.0:15.0",tbsize=1,u=chroma,v=chroma,dither=1).Interleaved2planar(!chroma)                          : \
                                                                                    Dfttest(!lsb_in?inputd:input8 ,sstring="0.0:4.0 0.2:9.0 1.0:15.0",tbsize=1,u=chroma,v=chroma,dither=1),input8,   \
                                                                          input8.mt_lut(      "x 16 < 255 x 75 > 0 255 x 16 - 4.322033898305085 * - ? ?",u=1,v=1), luma=chroma,          u=bug_wa,v=bug_wa))                                                   : \
                     (prefilter==4)  ?  planar ? inputd.SMDegrain_KNLMeansCL(lsb=lsb, lsb_in=lsb_in, device_type=device_type, device_id=device_id, chroma=chroma, h=h, d=d, a=a, knlm_params=knlm_params).Interleaved2planar(!chroma) : \
                                                 inputd.SMDegrain_KNLMeansCL(lsb=lsb, lsb_in=lsb_in, device_type=device_type, device_id=device_id, chroma=chroma, h=h, d=d, a=a, knlm_params=knlm_params) : \
                                         Assert(false,    "prefilter must be between -1~4: "+string(prefilter))
}
And this above all:

Code:

# SMDegrain_KNLMeansCL
 
function SMDegrain_KNLMeansCL (clip input, String "device_type", int "device_id", bool "chroma", bool "lsb", bool "lsb_in", float "h", int "d", int "a", String "knlm_params")
{
d            = Default( d ,0)
a            = Default( a ,1)
h            = Default( h ,7.0)
deviceid     = Default( device_id ,0)
knlm_params  = default(knlm_params, "")
chroma       = Default( chroma  ,true)
lsb_in       = Default( lsb_in  ,false)
lsb          = Default( lsb     ,lsb_in)
 
                                              sisphbd = AvsPlusVersionNumber > 2294
                                              fullchr = sisphbd ? input.is444() : input.isyv24()
                                              chr420  = sisphbd ? input.is420() : input.isyv12()
                                              nochr   = sisphbd ?   input.isy() : input.isy8()
                                              chrlsb  = chroma && !fullchr && !nochr
                                              NL_in   = lsb && !lsb_in ? input.Dither_convert_8_to_16() : input
                                              cnl     = chrlsb ? "Y" : input.isrgb() ? "auto" : chroma ? "YUV" : "Y"
 
NL_in = !chrlsb && input.isyuy2() ? NL_in.converttoyv16() : NL_in
 
                               chrlsb ? eval("""
                                             # In a more lucid state I could probably have laid out this block much better... or not...
 
                                             NL_W    = width(NL_in)
                                             Uclip   = sisphbd ? ExtractU(NL_in) : UToY8(NL_in)
                                             Vclip   = sisphbd ? ExtractV(NL_in) : VToY8(NL_in)
                                             NL_lsb  = (chr420 ? StackVertical(  lsb ? StackVertical(Dither_get_msb(uclip),Dither_get_msb(vclip)) : uclip,\
                                                                                 lsb ? StackVertical(Dither_get_lsb(uclip),Dither_get_lsb(vclip)) : vclip) : \
                                                                 StackHorizontal(uclip,vclip))
 
                                             nlc     = StackHorizontal(sisphbd ? ConvertToY(NL_in) : ConvertToY8(NL_in),NL_lsb)
 
                                             nlc     = Eval("nlc.KNLMeansCL(D=d, A=a, h=h,stacked=lsb_in || lsb,device_type=device_type,device_id=deviceid,channels=cnl" + knlm_params + ")")
 
                                             uvh = lsb_in || lsb ? uclip.height()/2 : uclip.height()
                                             uvw = uclip.width()
 
                                             nly = nlc.crop(0,0,chr420 ? -uvw : -(uvw+uvw),0)
 
                                             nlu = chr420 ? lsb_in || lsb ? StackVertical(Dither_get_msb(nlc).crop(NL_W,0,0,-uvh),Dither_get_lsb(nlc).crop(NL_W,0,0,-uvh)) : nlc.crop(NL_W,0,0,-uvh)                : \
                                                             nlc.crop(NL_W    ,0,-uvw,0)
                                             nlv = chr420 ? lsb_in || lsb ? StackVertical(Dither_get_msb(nlc).crop(NL_W,uvh, 0,0),Dither_get_lsb(nlc).crop(NL_W,uvh, 0,0)) : nlc.crop(NL_W,uvh, 0,0)                : \
                                                             nlc.crop(NL_W+uvw,0,   0,0)
                                             YToUV(nlu, nlv, nly)
                                            """) :   Eval("NL_in.KNLMeansCL(D=d, A=a, h=h,stacked=lsb_in || lsb,device_type=device_type,device_id=deviceid,channels=cnl" + knlm_params + ")")
 
!lsb && lsb_in ? Ditherpost(mode=6,slice=false) : last
input.isyuy2() ? converttoyuy2() : last
}
I am a script noob and totally unable to recreate such procedures in order to adopt DGDenoise :|

Re: CUDASynth

Posted: Thu Aug 08, 2019 10:42 am
by admin
OK, got it. I'll try to find some time to see what is feasible here.

Re: CUDASynth

Posted: Wed Aug 28, 2019 6:23 pm
by fidodkk
Oh, a build is accessible.

I would love to play with it when the weather gets colder here in northern Europe.

Re: CUDASynth

Posted: Wed Nov 20, 2019 7:23 pm
by hydra3333
admin wrote:
Wed Aug 07, 2019 8:29 am
CUDASynth 0.3:

http://rationalqm.us/misc/CUDASynth_0.3.rar

Make your DGI files with DGIndexNV 2053.0.0.179.
Hello. There have been new releases since DGIndexNV 2053.0.0.179, and I wonder whether 0.3 is compatible with them?

And ... perhaps even a CUDASynth v1.0 release, say in 2020?

edit: I unfortunately muddied the waters over at https://forum.videohelp.com/threads/393 ... ost2565866 ... if you wished to stick your oar in to correct me, that's up to you.
PS great work all round, mate.

Re: CUDASynth

Posted: Wed Nov 20, 2019 8:16 pm
by Rocky
0.3 should work with the current DGIndexNV. If there is a problem please advise.

Thanks for the kind words. I'll pop in over there and see what's cooking.

Re: CUDASynth

Posted: Tue Dec 10, 2019 12:24 pm
by Natasha
hydra3333, you are real man for me. Would love to know you better. Send PM we make music together. Intimate theme. I'm hot, no?

your wish my command,
Natasha

Re: CUDASynth

Posted: Tue Dec 10, 2019 1:41 pm
by admin
Natasha, please find a dating site.

Re: CUDASynth

Posted: Tue Dec 10, 2019 8:18 pm
by Natasha
Forum Mr. Big! Big boy, which site you hang at? We can hang out.

Re: CUDASynth

Posted: Thu Dec 19, 2019 11:38 pm
by hydra3333
I'd love to liaise with young Natty, however my liaising days have been over for about 40 years :)
Natasha is welcome to dream though !

Re: CUDASynth

Posted: Tue Dec 24, 2019 1:09 am
by hydra3333
And, wishing you a Happy Christmas !!