Re: CUDASynth

Posted: Mon Sep 17, 2018 4:40 pm
by Guest
I am just happy you never give up
We keep getting new and better avs/vpy tools to use
Thank you
:bravo:

Re: CUDASynth

Posted: Tue Sep 18, 2018 9:34 am
by admin
I have CUDASynth-enabled DGHDRtoSDR and have some preliminary performance numbers for your enjoyment. The source is the same as previously used: 3840x2160 59.94 fps HDR10. The script with GPU pipelining is:

dgsource("LG Chess 4K Demo.dgi",fulldepth=true,fdst="gpu0")
dghdrtosdr(impl="255",light=250,fsrc="gpu0") # outputs YV12
prefetch(4)

Not pipelined on GPU: 80 fps, CPU 13%
Pipelined on GPU: 204 fps, CPU 8%

Quite a substantial performance boost!
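
For anyone who wants to check the arithmetic behind those two numbers, here is a minimal sketch. The fps figures come from the post above; the `speedup` helper is a hypothetical name used only for illustration and is not part of any DG tool.

```python
def speedup(baseline_fps, new_fps):
    """Return (ratio, percent_increase) of new_fps relative to baseline_fps."""
    ratio = new_fps / baseline_fps
    percent_increase = (ratio - 1.0) * 100.0
    return ratio, percent_increase

# Figures from the post: 80 fps without GPU pipelining, 204 fps with it.
ratio, pct = speedup(80, 204)
print(f"{ratio:.2f}x the baseline, a {pct:.0f}% increase")

# The source is 59.94 fps, so this is the multiple of real-time playback speed.
print(f"{204 / 59.94:.1f}x real-time")
```

Note that "2.55x the baseline" and "155% increase" describe the same measurement; which one gets quoted is a matter of convention.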

Re: CUDASynth

Posted: Tue Sep 18, 2018 3:34 pm
by Guest
Nice boost, about 2.5x

Re: CUDASynth

Posted: Tue Sep 18, 2018 5:08 pm
by Guest
Any ETA on public testing?
No rush, just getting antsy

Re: CUDASynth

Posted: Tue Sep 18, 2018 8:25 pm
by admin
Still have some things to finish up: VapourSynth support, fdst parameter for DGHDRtoSDR, CUDASynth-enabling DGDenoise, documentation, source code example. And I'm timesharing with DGIndex MKV support. Hang in there.

BTW, the CPU reduction is also important as it leaves more CPU for encoding.

It's hard to find 2080 Ti's:

https://www.nowinstock.net/computers/vi ... rtx2080ti/

Puts the lie to some of the whining at other forums by people saying it's too expensive, nobody wants it, nVidia are dirty rotten criminal capitalists, how dare they make a GPU I can't afford, blah blah blah.

gonca, what's the fastest most powerful Threadripper likely to be available within a few months?

I saved a lot of dough doing my own bathroom remodeling so I have the ready green to dish out for the best hardware. :twisted:

Re: CUDASynth

Posted: Tue Sep 18, 2018 9:00 pm
by Guest

Re: CUDASynth

Posted: Tue Sep 18, 2018 9:40 pm
by admin
Looks good. Any suggestions for a compatible mobo?

Re: CUDASynth

Posted: Wed Sep 19, 2018 4:48 am
by Guest

Re: CUDASynth

Posted: Wed Sep 19, 2018 12:08 pm
by admin
Thank you!

Re: CUDASynth

Posted: Wed Sep 19, 2018 3:37 pm
by Guest
The other option you might want to consider, seeing as you have this thing about GPUs (NVidia), is to drop the CPU to the 16 core/ 32 thread version and maybe go with 2 GPUs
or
(note: here goes the budget)
Get the 32 core CPU and 2 GPUs

Re: CUDASynth

Posted: Wed Sep 19, 2018 4:14 pm
by admin
gonca wrote:
Wed Sep 19, 2018 3:37 pm
(note: here goes the budget)
Get the 32 core CPU and 2 GPUs
That sounds good. I don't have a budget. I'm getting older so I'm going to blow it all on hardware and travel adventures. :P

BestBuy went to preorder on the 2080 Ti but by the time I saw the alert it was all taken. Nobody wants these things. ;)

Thanks for making this thread go into flames (the icon on the forum list).

Re: CUDASynth

Posted: Wed Sep 19, 2018 4:28 pm
by Guest
Thanks for making this thread go into flames
That is how I learned about the hardware side of things.
Making things go up in flames
burn baby burn.jpg

Re: CUDASynth

Posted: Wed Sep 19, 2018 5:03 pm
by admin
Wow, parallel ATA connectors. :wow: What's the CPU, a 386?

The first processor I coded for was an 8080. The OS was CP/M. It had a 100K floppy drive that raised and lowered the head on each sector access (the infamous head-loading solenoid), causing a pleasing bang-bang-bang that the neighbors loved. I clearly remember tossing that system (Heathkit H8) in the dumpster when I upgraded to a 386-based system. :lol:

Re: CUDASynth

Posted: Wed Sep 19, 2018 5:38 pm
by Guest
8080 and 386
Those were the days, never to be seen again, thank whichever supreme deity for that small favor
Gee-sh, now I am getting politically correct

Re: CUDASynth

Posted: Wed Sep 19, 2018 8:26 pm
by admin
If using the word "God" is politically incorrect, then we are done. Praise the Lord!

Re: CUDASynth

Posted: Sun Sep 30, 2018 11:31 am
by admin
Status report...

CUDASynth-enabling of DGDenoise is complete. It was an involved thing because I had to code and test all combinations of fulldepth=true/false x chroma=true/false x fsrc=cpu/gpu0/gpu1 x fdst=cpu/gpu0/gpu1. That is a total of 2x2x3x3 = 36 combinations, each with its unique mix of CUDA kernel launches and pitched 2D memcpys. It all seems to be working, thankfully. Don't ever accuse me of not being persistent.
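
To make the size of that test matrix concrete, here is a small sketch that enumerates the combinations. The option names and values are the ones listed in the post; the `OPTIONS` table and the `all_combinations` helper are hypothetical and say nothing about how the actual DGDenoise tests are organized.

```python
from itertools import product

# Option values as listed in the post; the dict layout is illustrative only.
OPTIONS = {
    "fulldepth": (True, False),
    "chroma": (True, False),
    "fsrc": ("cpu", "gpu0", "gpu1"),
    "fdst": ("cpu", "gpu0", "gpu1"),
}

def all_combinations(options):
    """Return every combination of option values as a list of dicts."""
    names = list(options)
    return [dict(zip(names, values)) for values in product(*options.values())]

combos = all_combinations(OPTIONS)
print(len(combos))  # 2 * 2 * 3 * 3 = 36
```

Each dict in `combos` could then drive one test case, for example by formatting its entries into a dgdenoise() call in a generated script.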

There are four things to do now:

1) Ensure all filters still work fine when the source filter does not declare GPU buffers and a lock, i.e., when the source filter is not CUDASynth-enabled. This is context creation/management stuff. I also want to add an integer pipeline ID that can be specified in the script, thereby allowing for multiple pipelines to run simultaneously.

2) Thorough code review and any needed refactoring.

3) Native VapourSynth support.

4) Documentation and code sample (open source).

I'm going to hold off on 3) for now, do the others and then give y'all the new toy to play around with.

Have to open source the CUDA filter framework to recruit others to develop compatible filters. The core filters should be CUDASynth-enabled as well.

Re: CUDASynth

Posted: Mon Oct 01, 2018 5:31 am
by hydra3333
:hat: :bravo: Thank you very much.

Re: CUDASynth

Posted: Mon Oct 01, 2018 8:44 pm
by admin
You're welcome, hydra3333!

Item 1) above is finished, but without the pipeline ID. That is going to need some deeper thinking, as it must allow any filter to create ping-pong buffers, etc. Let's hold off on that for now. So tomorrow, I want to make the documentation and source code example and get it into your hands. Code review can be done in parallel with your testing.

Consider this script:

dgsource("LG Chess 4K Demo.dgi",fulldepth=true,fdst="gpu0")
dghdrtosdr(light=250,fsrc="gpu0",fdst="gpu1",fulldepth=true)
dgdenoise(fsrc="gpu1",fdst="gpu0",chroma=true)
dgsharpen(fsrc="gpu0")
prefetch(2)

The source is 3840x2160 59.94 fps HDR10. The resulting frame rate is 95 fps, about 1.6x real-time. If the pipeline is not used (all fsrc and fdst set to "cpu"), then the frame rate is 30 fps, half of real-time. So for this real-world example CUDASynth more than triples performance (95/30 is roughly 3.2x). Gotta love it, am I wrong?
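
One way to read the script is as a chain in which each filter's fsrc names the location its upstream filter wrote to with fdst. The sketch below checks that chain for the script above. It rests on two assumptions that are not confirmed in this thread: that fsrc and fdst are expected to match in this way, and that dgsharpen, which sets no fdst in the script, delivers its frames to the CPU. The `PIPELINE` table and the `find_mismatches` helper are hypothetical and are not part of CUDASynth.

```python
# Each tuple is (filter name, fsrc, fdst), transcribed from the script.
# None as fsrc marks the source filter. The "cpu" fdst on dgsharpen is an
# assumption, because the script does not set it.
PIPELINE = [
    ("dgsource", None, "gpu0"),
    ("dghdrtosdr", "gpu0", "gpu1"),
    ("dgdenoise", "gpu1", "gpu0"),
    ("dgsharpen", "gpu0", "cpu"),
]

def find_mismatches(stages):
    """Return (upstream, downstream) name pairs whose fdst and fsrc disagree."""
    mismatches = []
    for (up_name, _, up_dst), (down_name, down_src, _) in zip(stages, stages[1:]):
        if up_dst != down_src:
            mismatches.append((up_name, down_name))
    return mismatches

print(find_mismatches(PIPELINE))  # an empty list means every hand-off lines up
```

Under those assumptions the frames go gpu0, gpu1, gpu0 and only return to system memory after the last filter, which would be consistent with the lower CPU load reported earlier in the thread.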

Re: CUDASynth

Posted: Tue Oct 02, 2018 5:00 am
by Guest
Gotta love it, am I wrong?
You are right

Re: CUDASynth

Posted: Tue Oct 02, 2018 11:48 am
by admin
The source code example filter is done. I used DGSharpen (debundled from DGDecodeNV) and removed the licensing and CUDA code encryption. Now I just have to write some documentation.

Re: CUDASynth

Posted: Wed Oct 03, 2018 12:06 pm
by admin
I am happy to announce CUDASynth 0.1:

http://rationalqm.us/misc/CUDASynth_0.1.rar

Testing and feedback will be appreciated.

Re: CUDASynth

Posted: Wed Oct 03, 2018 3:48 pm
by Guest
Got a couple of things to finish off and then the testing will begin

Re: CUDASynth

Posted: Wed Oct 03, 2018 5:43 pm
by Guest
Speed looks good
Attachments: cudasynth.log (719.6 KiB), test.log (244.36 KiB)
Can't see the results of the cudasynth file in VDub or MPC-HC, so I will run a quick encode and check

Re: CUDASynth

Posted: Wed Oct 03, 2018 6:34 pm
by Guest
No visible issues that I could see on the encoded file
CPU usage is down and, surprisingly, so are GPU and VPU load.
Speed is right up there though
Looks good

Re: CUDASynth

Posted: Wed Oct 03, 2018 8:24 pm
by admin
Thanks for the test results, gonca. Now to get some critical mass we need to make more CUDASynth-enabled filters. Feel free to suggest possibilities. If there are any good open source ones it would not be hard to port them.