Port Cube

These CUDA filters are packaged into DGDecodeNV, which is part of DGDecNV.
Levi
Posts: 52
Joined: Sat Apr 18, 2020 6:12 pm

Post by Levi »

Sherman
Posts: 578
Joined: Mon Jan 06, 2020 10:19 pm

Post by Sherman »

The malcontents should learn to code. It's so easy!
Curly
Posts: 716
Joined: Sun Mar 15, 2020 11:05 am

Post by Curly »

Easy 4 u 2 say.
Baltasar
Posts: 60
Joined: Tue Nov 02, 2021 9:51 am

Post by Baltasar »

This worked for me.

Rocky
Posts: 3621
Joined: Fri Sep 06, 2019 12:57 pm

Post by Rocky »

There are ample resources for learning CUDA. We published sample source code for a CUDA filter, and documented our full dialog with nVidia during the development of DGDecNV. Everything you need is in the nVidia SDKs, API documentation, and developer forum. Focus, persistence, and attention to detail!

The zimg/avsresize authors...maybe they'd be willing to work with us to port their stuff to CUDA. I'd be more than happy to help. I could add some CUDASynth magic to eliminate extra PCIe transfers, etc. Their filters can remain standalone and fully in their control. We'll help for free. That seems like the most likely path to reach the promised land.
Rocky
Posts: 3621
Joined: Fri Sep 06, 2019 12:57 pm

Post by Rocky »

Guest 2 wrote:
Wed Aug 10, 2022 10:08 am
P.S: Will you add tetrahedral to AVSCube too?
sekrit-twc has added tetrahedral to timecube so after it is ported to AVS we can ditch DGCube and stop all the associated nonsense. ;)
Guest 2
Posts: 903
Joined: Mon Sep 20, 2010 2:18 pm

Post by Guest 2 »

Rocky wrote:
Sun Jul 02, 2023 7:55 am
Then we can ditch DGCube and stop all the associated nonsense. ;)
Not everybody has a 56 core CPU. Someone still relies on GPU. (me)
Rocky
Posts: 3621
Joined: Fri Sep 06, 2019 12:57 pm

Post by Rocky »

Oh OK. We'll keep it then.
Guest 2
Posts: 903
Joined: Mon Sep 20, 2010 2:18 pm

Post by Guest 2 »

Rocky wrote:
Sun Jul 02, 2023 4:34 pm
Oh OK. We'll keep it then.
https://github.com/rigaya/NVEnc/blob/ma ... ram2value2

lut3d=<string>
Apply a 3D LUT to an input video. Currently supports .cube file only.

lut3d_interp=<string>
nearest, trilinear, tetrahedral, pyramid, prism
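
For readers unfamiliar with the `tetrahedral` mode listed above, here is a minimal Python sketch of tetrahedral interpolation in a 3D LUT. It is not taken from NVEnc, timecube, or DGCube; the function names and the `lut[r][g][b]` layout are assumptions made for illustration (real .cube files need parsing and index-order handling first). The idea is to sort the fractional offsets inside the lattice cell, which selects one of six tetrahedra, and then blend only four lattice nodes instead of the eight used by trilinear.

```python
def make_identity_lut(n):
    """Hypothetical helper: an n*n*n LUT that maps every colour to itself."""
    step = 1.0 / (n - 1)
    return [[[(r * step, g * step, b * step) for b in range(n)]
             for g in range(n)] for r in range(n)]


def tetrahedral(lut, rgb):
    """Interpolate normalized rgb (each in [0, 1]) through lut[r][g][b]."""
    n = len(lut)
    base, frac = [], []
    for v in rgb:
        pos = min(max(v, 0.0), 1.0) * (n - 1)   # position in lattice units
        i = min(int(pos), n - 2)                # lower corner of the cell
        base.append(i)
        frac.append(pos - i)                    # offset inside the cell

    # Sorting the offsets (largest first) picks the enclosing tetrahedron.
    order = sorted(range(3), key=lambda axis: frac[axis], reverse=True)
    f = [frac[a] for a in order]
    weights = [1.0 - f[0], f[0] - f[1], f[1] - f[2], f[2]]

    # Walk from the lower corner to the opposite corner, one axis at a time.
    corner = list(base)
    out = [0.0, 0.0, 0.0]
    for step, w in enumerate(weights):
        node = lut[corner[0]][corner[1]][corner[2]]
        for c in range(3):
            out[c] += w * node[c]
        if step < 3:
            corner[order[step]] += 1
    return tuple(out)
```

With an identity LUT the output should match the input up to rounding, which makes a convenient sanity check before loading a real table.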

I think this could ease my pain.

I could use NVEnc to encode to a lossless intermediate and then proceed with x265.

Do you think it is now feasible to use the source code to implement it in DGCube? :)
Rocky
Posts: 3621
Joined: Fri Sep 06, 2019 12:57 pm

Post by Rocky »

Guest 2 wrote:
Tue Jul 04, 2023 3:52 am
Do you think it is now feasible to use the source code to implement it in DGCube? :)
Implement what? Please be specific and precise.

And what is your pain exactly? You cannot implement some desired processing? Or you can but it doesn't run as fast as you'd like?
Guest 2
Posts: 903
Joined: Mon Sep 20, 2010 2:18 pm

Post by Guest 2 »

Rocky wrote:
Tue Jul 04, 2023 7:42 am
Implement what? Please be specific and precise.
Nothing, Rocky. I think I won't bug you again about DGCube.

I have eased my pains by applying the LUT with NVEnc to an intermediate lossless HEVC and then encoding it with standard x265. Easy peasy and unbelievably fast.

Thanks again for having implemented HEVC 4:4:4 decoding :)
Rocky
Posts: 3621
Joined: Fri Sep 06, 2019 12:57 pm

Post by Rocky »

Guest 2 wrote:
Tue Jul 04, 2023 12:55 pm
I have eased my pains...
Won't you have the grace to explain what your pains are, after I explicitly asked and after all I've done for you over the years?
Guest 2
Posts: 903
Joined: Mon Sep 20, 2010 2:18 pm

Post by Guest 2 »

Rocky wrote:
Tue Jul 04, 2023 1:28 pm
Won't you have the grace to explain what your pains are, after I explicitly asked and after all I've done for you over the years?
As I told many times, my CPU is too old and slow to comfortably apply zimg conversion and have a correct PQ to HLG transformation, using DGCube.

I have squeezed my brain to find a workaround and I am testing NVEnc to do the whole job except the final encode.

It does everything in HW, fast and clean, tetrahedral included.

1080p SDR to HLG 160.65 fps
2160p PQ to HLG 42.90 fps

The only issue is storage requirements but I can cope with that.
Rocky
Posts: 3621
Joined: Fri Sep 06, 2019 12:57 pm

Post by Rocky »

It's not just storage, it's the time to write out massive lossless streams and then read them again. And having to encode twice. That's not fast and it's certainly not clean. Nevertheless I'm happy you have what you consider to be an adequate workaround for your pains.

"As I told many times"
That feels rude and unfriendly. And who knows what "comfortable" means for you?

"my CPU is too old and slow"
When you upgrade your HW for HEVC lossless, think too about upgrading your CPU.
Rocky
Posts: 3621
Joined: Fri Sep 06, 2019 12:57 pm

Post by Rocky »

Hehe, I have successfully ported sekrit-twc's latest vscube to AVS+. Still have to fix up some loose ends but it's running fine with tetrahedral and all cpu modes. It took just less than 3 hours to port.
Rocky
Posts: 3621
Joined: Fri Sep 06, 2019 12:57 pm

Post by Rocky »

Here is a test release of the AVS+ support for sekrit-twc's latest vscube. Refer to the user manual for syntax and examples. Your test results will be greatly appreciated. My testing shows the AVS+ version with prefetch(6) to be faster than the Vapoursynth version.

https://rationalqm.us/cube/AVSCube_test.rar

Say thank you.
Sherman
Posts: 578
Joined: Mon Jan 06, 2020 10:19 pm

Post by Sherman »

Rocky wrote:
Fri Jul 07, 2023 10:05 am
It took just less than 3 hours to port.
You're slipping, Rocky. Do you need to get some young blood involved?
Natasha
Posts: 150
Joined: Wed Nov 20, 2019 11:11 am

Post by Natasha »

Sherman wrote:
Sat Jul 08, 2023 12:59 pm
young blood
The best kind!
Rocky
Posts: 3621
Joined: Fri Sep 06, 2019 12:57 pm

Post by Rocky »

The timecube+AVS support test build was relocated:

https://rationalqm.us/cube/

All the cube stuff is now together in directory cube.
Rocky
Posts: 3621
Joined: Fri Sep 06, 2019 12:57 pm

Post by Rocky »

Here is an updated test build for AVS+ support for sekrit-twc's vscube. It includes sekrit-twc's bug fix for the AVX2 support of tetrahedral mode.

https://rationalqm.us/cube/AVSCube_test.rar
Rocky
Posts: 3621
Joined: Fri Sep 06, 2019 12:57 pm

Post by Rocky »

Guest 2 wrote:
Wed Aug 31, 2022 12:18 pm
I see tiny discrepancies between the product of external (identical to AVSCube) and of internal processing (look at graphs, mostly).
Well guys, I finally wrapped my rodent brain around this stuff and discovered the reasons for this. My matrices are off. I did research and now fully understand how to generate the matrices for any combination of:

8 vs. 16 bits
limited vs full range for input and output
601 vs. 709 vs. 2020 space
constant vs. non-constant luminance

Working through that for DGHDRtoSDR(), I saw that the equations I was using (can't even remember where I got them) were off by enough to account for discrepancies.

So I will fix that and, more importantly, I will fix DGCube's internal conversions and properly extend them to support all needed conversions. This will eliminate the need for external conversions using zimg, greatly improving performance.
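
As an illustration of how such matrices can be derived for the non-constant-luminance case, here is a minimal Python sketch. It is not DGCube's or zimg's code; the function names are hypothetical, and it assumes the usual luma coefficients from BT.601/BT.709/BT.2020 and the standard limited-range scaling (219 and 224 steps, scaled by 2^(bits-8)). Constant luminance is not covered, since it cannot be expressed as a single matrix.

```python
# Luma coefficients (Kr, Kb) from BT.601, BT.709 and BT.2020.
LUMA_COEFFS = {
    "601": (0.299, 0.114),
    "709": (0.2126, 0.0722),
    "2020": (0.2627, 0.0593),
}


def ycbcr_to_rgb(space, bits, limited):
    """Return (matrix, offsets) for integer Y'CbCr codes -> normalized R'G'B'.

    rgb[i] = sum(matrix[i][j] * (code[j] - offsets[j]) for j in range(3))
    """
    kr, kb = LUMA_COEFFS[space]
    kg = 1.0 - kr - kb
    scale = 1 << (bits - 8)
    if limited:
        y_range, c_range = 219.0 * scale, 224.0 * scale
        y_off = 16.0 * scale
    else:
        y_range = c_range = float((1 << bits) - 1)
        y_off = 0.0
    c_off = float(1 << (bits - 1))          # chroma is centred at mid-code

    y = 1.0 / y_range
    cr_to_r = 2.0 * (1.0 - kr) / c_range    # R' = Y' + 2(1-Kr) Cr
    cb_to_b = 2.0 * (1.0 - kb) / c_range    # B' = Y' + 2(1-Kb) Cb
    matrix = [
        [y, 0.0, cr_to_r],
        [y, -kb * cb_to_b / kg, -kr * cr_to_r / kg],  # G' from the luma equation
        [y, cb_to_b, 0.0],
    ]
    return matrix, [y_off, c_off, c_off]


def convert(code, matrix, offsets):
    """Apply the matrix to one (Y, Cb, Cr) code triple."""
    return [sum(m * (c - o) for m, c, o in zip(row, code, offsets))
            for row in matrix]
```

For example, `convert((235, 128, 128), *ycbcr_to_rgb("709", 8, True))` should come out at reference white, and the same function covers the 16-bit and full-range variants by changing the arguments.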
Rocky
Posts: 3621
Joined: Fri Sep 06, 2019 12:57 pm

Post by Rocky »

Well guys, my optimism was premature. I was under the impression that all gamma-related stuff would be implemented in the LUT. But looking at the script (the one that revealed discrepancies) shows that the specified gamma inverse is being applied to create linear RGB to be passed to the script. So it is not enough for me to fix the coefficients in the YUV->RGB->YUV conversions. I also have to implement all the gamma stuff. And who knows, maybe also primaries stuff. So it's back to having to recreate the whole of z_ConvertFormat() if we are to have everything on the GPU. I'm not going to do that as it is a massive undertaking with zero benefit for me.

If you are wondering about DGHDRtoSDR(), everything is fine, as it does the needed gamma processing for PQ/HLG->709.
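
To give a sense of what "the gamma stuff" involves for PQ input, here is a minimal Python sketch of the SMPTE ST 2084 (PQ) transfer function and its inverse. It is an illustration only, not code from DGCube, DGHDRtoSDR(), or zimg, and the function names are hypothetical. Values are normalized so that 1.0 linear corresponds to 10000 nits.

```python
# Constants from SMPTE ST 2084.
M1 = 2610.0 / 16384.0          # 0.1593017578125
M2 = 2523.0 / 4096.0 * 128.0   # 78.84375
C1 = 3424.0 / 4096.0           # 0.8359375
C2 = 2413.0 / 4096.0 * 32.0    # 18.8515625
C3 = 2392.0 / 4096.0 * 32.0    # 18.6875


def pq_to_linear(e):
    """PQ EOTF: non-linear signal e in [0, 1] -> linear light in [0, 1]."""
    p = e ** (1.0 / M2)
    return (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1.0 / M1)


def linear_to_pq(l):
    """Inverse EOTF: linear light l in [0, 1] -> PQ signal in [0, 1]."""
    p = l ** M1
    return ((C1 + C2 * p) / (1.0 + C3 * p)) ** M2
```

A full PQ to HLG path would still need the HLG curve and any primaries or tone-mapping steps on top of this, which is the part that makes recreating the whole conversion chain a large job.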
hydra3333
Posts: 406
Joined: Wed Oct 06, 2010 3:34 am

Post by hydra3333 »

OK and thanks for looking into it. :salute:

At a guess, I suppose it also means no GPU HDRAGC? Or even a hybrid?
I really do like it here.
Rocky
Posts: 3621
Joined: Fri Sep 06, 2019 12:57 pm

Post by Rocky »

No m8 it has no relevance for HDRAGC and curves-type stuff. I am still developing my own curves filter.
hydra3333
Posts: 406
Joined: Wed Oct 06, 2010 3:34 am

Post by hydra3333 »

Beaut, thanks bloke.
I really do like it here.