Port Cube

These CUDA filters are packaged into DGDecodeNV, which is part of DGDecNV.
Sherman
Moose Approved
Posts: 360
Joined: Mon Jan 06, 2020 10:19 pm

Post by Sherman »

I can multitask. Nothing worthwhile is easy. Balti told me that.
Britney
Curly Approved
Posts: 80
Joined: Sun Aug 09, 2020 3:24 pm

Post by Britney »

Thrilled to help! I'll rub your shoulders, Rocky. There, that's better.
new_guy
Posts: 20
Joined: Fri Jan 15, 2021 11:12 am

Post by new_guy »

Can you rub mine too?
Britney
Curly Approved
Posts: 80
Joined: Sun Aug 09, 2020 3:24 pm

Post by Britney »

Eeeuww, no way.
Rocky
Moose Approved
Posts: 2418
Joined: Fri Sep 06, 2019 12:57 pm

Post by Rocky »

Thank you, Brit! I found the bug that was kicking my patootie. Should be plain sailing now.

EDIT: But that wasn't the only bug, see next post.
Rocky
Moose Approved
Posts: 2418
Joined: Fri Sep 06, 2019 12:57 pm

Post by Rocky »

Ha, plain sailing. Famous last words. Two full days of agonizing over why my conversions were producing an over-saturated mess, even though I took the YUV<-->RGB conversions directly from DGHDRtoSDR (minus the depth reduction). Guys, I tried EVERYTHING. Well, except for one little thing that shouldn't have mattered. ;)

So, I was going to try writing out the intermediate RGB to see if that was already messed up, or if it was a bug in the final RGB->YUV. Of course, to do that you cannot have an in-place filter (get src, read/write src, return src) because the received and returned formats would be different. Am I boring you? You need to do get src, read src, write dst, return dst. So I did that with dst = NewVideoFrameP() etc. but still I left in the final RGB->YUV and writing YUV just to test my dst handling. Whoa, suddenly the output was correct.
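
For readers who have not written filters, the difference between the two patterns can be sketched with a toy model in Python. This is only an illustration: Frame, filter_in_place and filter_to_new_frame are hypothetical names, not the Avisynth+ API, and the "processing" is a placeholder.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """Toy stand-in for a video frame: a format tag plus sample data."""
    pixel_type: str
    data: list


def filter_in_place(src: Frame) -> Frame:
    # get src, read/write src, return src.
    # The returned frame is the received one, so its format cannot change.
    for i, v in enumerate(src.data):
        src.data[i] = v // 2  # placeholder processing
    return src


def filter_to_new_frame(src: Frame, out_type: str) -> Frame:
    # get src, read src, write dst, return dst.
    # dst is a separate frame, so it may carry a different format.
    dst = Frame(out_type, [0] * len(src.data))
    for i, v in enumerate(src.data):
        dst.data[i] = v // 2  # placeholder processing
    return dst
```

With the second pattern the source frame is never modified, which is what makes it possible to return a frame in another format.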

It shouldn't have made a difference unless Avisynth+ is doing something different with MakeWritable() etc. than I am used to for in-place filters. But I don't care about that, just that it is working.

That was all done in C code. Now I'll port it to the CUDA kernel. That should be plain sailing.
Natasha
Mosquito
Posts: 111
Joined: Wed Nov 20, 2019 11:11 am

Post by Natasha »

Where are your teeth?
Sherman
Moose Approved
Posts: 360
Joined: Mon Jan 06, 2020 10:19 pm

Post by Sherman »

You should know. Remember the accident?
Curly
Moose Approved
Posts: 188
Joined: Sun Mar 15, 2020 11:05 am

Post by Curly »

Not every day I can say I told you so. Obvious! Nyuck.
Boris
Posts: 64
Joined: Sun Nov 10, 2019 2:55 pm

Post by Boris »

Just wait til the lights go pfizz. It's over.
Wonder Woman
Curly Approved
Posts: 26
Joined: Sun Feb 07, 2021 10:46 am

Post by Wonder Woman »

I know you missed me but here I am ready to slay dragons.
new_guy
Posts: 20
Joined: Fri Jan 15, 2021 11:12 am

Post by new_guy »

Slay my dragon.
tormento
DG Approved/Curly Approved/Moose Approved
Posts: 685
Joined: Mon Sep 20, 2010 2:18 pm

Post by tormento »

Rocky wrote:
Tue Aug 16, 2022 4:38 pm
That was all done in C code. Now I'll port it to the CUDA kernel. That should be plain sailing.
Eager to test it. :salute:
tormento
DG Approved/Curly Approved/Moose Approved
Posts: 685
Joined: Mon Sep 20, 2010 2:18 pm

Post by tormento »

Perhaps you already know this, but LUT cubes can work in RGB or Studio RGB, i.e. full range or limited (TV) range.

For PQ to HLG with a full-range LUT, such as the BBC ones, you would need:

#From 4:2:0 16bit planar Narrow Range to RGB Planar 16bit Full Range
z_ConvertFormat(pixel_type="RGBP16", colorspace_op="2020:st2084:2020:limited=>rgb:st2084:2020:full", resample_filter_uv="spline64", dither_type="error_diffusion")

#From PQ to HLG with 16bit precision
DGCube("7a_HLG_PQ1000_mode-nar_in-nar_out-nar_nocomp.cube", fullrange=true)

#From RGB 16bit planar Full Range to YUV420 10bit planar Narrow Range with dithering
z_ConvertFormat(pixel_type="YUV420P10", colorspace_op="rgb:std-b67:2020:full=>2020:std-b67:2020:limited", resample_filter_uv="spline64", dither_type="error_diffusion")

For PQ to HLG with a limited-range LUT, such as the Warner ones, you would need:

#From 4:2:0 16bit planar Narrow Range to RGB Planar 16bit Narrow Range
z_ConvertFormat(pixel_type="RGBP16", colorspace_op="2020:st2084:2020:limited=>rgb:st2084:2020:limited", resample_filter_uv="spline64", dither_type="error_diffusion")

#From PQ to HLG with 16bit precision
DGCube("WarnerBros_PQToHLG_MaxCLL_2508.cube", fullrange=true)

#From RGB 16bit planar Narrow Range to YUV420 10bit planar Narrow Range with dithering
z_ConvertFormat(pixel_type="YUV420P10", colorspace_op="rgb:std-b67:2020:limited=>2020:std-b67:2020:limited", resample_filter_uv="spline64", dither_type="error_diffusion")

I hope the internal conversion in your upcoming release can take this difference into account.

PS: I have some doubts about the fullrange switch in the DGCube command in the second case.
Rocky
Moose Approved
Posts: 2418
Joined: Fri Sep 06, 2019 12:57 pm

Post by Rocky »

Let's test what we have and I'll look into the range stuff in parallel.

Please re-download DGCube.zip to get support for direct YUV420P16 input, as received from DGSource() for high-bit-depth sources. RGBP16 is still supported, and so is Vapoursynth. The DGCube text document was updated and includes relevant sample scripts.

Here is a very simple script:

DGSource("THE GREAT WALL.dgi")
DGCube("PQ_to_BT709_slope.cube", fullrange=false, interp="tetrahedral")

Your test results will be appreciated.

https://rationalqm.us/misc/DGCube.zip

DG said we might be able to afford dentures for me later this year. Gnawing acorns is a challenge with my little stubs.
Natasha
Mosquito
Posts: 111
Joined: Wed Nov 20, 2019 11:11 am

Post by Natasha »

That new_guy is really desperate. I tasted his blood one night. Sour!
tormento
DG Approved/Curly Approved/Moose Approved
Posts: 685
Joined: Mon Sep 20, 2010 2:18 pm

Post by tormento »

Rocky wrote:
Wed Aug 17, 2022 11:36 am
Let's test what we have and I'll look into the range stuff in parallel.
In the examples you write:

loadplugin("...\dgdecodenv.dll")
loadplugin("...\dgcube.dll")
dgsource("THE GREAT WALL.dgi")
DGCube("PQ_to_BT709_slope.cube", fullrange=false, interp="tetrahedral")

What are the output format and range? Does it still need a z_ConvertFormat call to get a proper output color space?

Why did you put fullrange=false? What would happen with true?

As you can see I am mostly interested in PQ to HLG transformation.
Rocky
Moose Approved
Posts: 2418
Joined: Fri Sep 06, 2019 12:57 pm

Post by Rocky »

Oh sure. Depends on at least three things. Is the LUT made for limited or full-range input/output? Is the source limited or full range? Should the output be limited or full range?

For the example, I assumed the LUT is made for full range. The source here I assume has limited range. So I set fullrange=false to expand it to full in DGCube. The output then would presumably be full range.

Not being au fait with the current practice for use of 3D LUTs, I would appeal to users to say what they need. There are 3 considerations, previously alluded to:

The source range.
The LUT assumed ranges.
The output range.

Everything is possible but what do we seek?

Maybe for DGCube, specifying the input range and the output range is enough. Intermediate processing would assume a full-range LUT.
Curly
Moose Approved
Posts: 188
Joined: Sun Mar 15, 2020 11:05 am

Post by Curly »

tormento wrote:
Wed Aug 17, 2022 12:21 pm
As you can see I am mostly interested in PQ to HLG transformation.
Good move. PQ sucks rocks.

Kernel stuff is free, so make it as general as possible. Generally speaking by the General. Am I wrong?
tormento
DG Approved/Curly Approved/Moose Approved
Posts: 685
Joined: Mon Sep 20, 2010 2:18 pm

Post by tormento »

Rocky wrote:
Wed Aug 17, 2022 12:41 pm
Depends on at least three things. Is the LUT made for limited or full-range input/output? Is the source limited or full range? Should the output be limited or full range?
1) Input, in my case, is PQ video from UHD, so we can assume it's limited range.

2) My LUT wants full range.

3) The output should be limited range.

So, how should I write the script, with that in mind?

My findings: the script

SetFilterMTMode("DEFAULT_MT_MODE", 2)
LoadPlugin("D:\Eseguibili\Media\DGDecNV\DGDecodeNV.dll")
LoadPlugin("D:\Eseguibili\Media\DgCube\DGCube.dll")
DGSource("F:\In\2_0446 Akira\akira.dgi",ct=48,cb=48,cl=0,cr=0)
propClearAll()
DGCube("D:\Programmi\Media\AviSynth+\cube\1a_PQ1000_HLG_mode-nar_in-nar_out-nar_nocomp.cube", fullrange=true)
#From RGB 16bit planar Full Range to YUV420 10bit planar Narrow Range with dithering
z_ConvertFormat(pixel_type="YUV420P10", colorspace_op="rgb:std-b67:2020:full=>2020:std-b67:2020:limited", resample_filter_uv="spline64", dither_type="error_diffusion")
ConvertBits(32)
BM3D_CUDA(sigma=3, radius=2)
BM3D_VAggregate(radius=2)
fmtc_bitdepth (bits=10,dmode=8)
neo_f3kdb(range=15, Y=65, Cb=40, Cr=40, grainY=0, grainC=0, sample_mode=2, blur_first=true, dynamic_grain=false, mt=false, keep_tv_range=true)
Prefetch(3)
gives this error on the z_ConvertFormat line:

YUV color family cannot have RGB matrix coefficients
The same with fullrange=false.
tormento
DG Approved/Curly Approved/Moose Approved
Posts: 685
Joined: Mon Sep 20, 2010 2:18 pm

Post by tormento »

Curly wrote:
Wed Aug 17, 2022 12:57 pm
Good move. PQ sucks rocks.
Indeed. :mrgreen:
Rocky
Moose Approved
Posts: 2418
Joined: Fri Sep 06, 2019 12:57 pm

Post by Rocky »

The way it is designed to work is that if fullrange is false then the pixels are scaled up to fullrange on input and scaled back down to limited on output. In between the LUT is applied. So the fullrange parameter has the meaning "when false we specify that the input is limited and the output should be limited".
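
To make that concrete, here is a minimal Python sketch of the flow for one 16-bit RGB pixel. It is an illustration under stated assumptions, not DGCube's actual code: the names limited_to_full, full_to_limited and apply_lut are hypothetical, the LUT is passed in as a plain function on normalized values, and the limited-range code values (16*256 for black, 235*256 for white) are the conventional 16-bit ones rather than values taken from the filter.

```python
LIMITED_BLACK = 16 * 256    # assumed 16-bit limited-range black level
LIMITED_WHITE = 235 * 256   # assumed 16-bit limited-range white level
FULL_MAX = 65535            # 16-bit full-range maximum


def clamp01(x):
    """Clamp a normalized value to [0, 1]."""
    return min(max(x, 0.0), 1.0)


def limited_to_full(v):
    """Map a limited-range code value to a normalized full-range value."""
    return clamp01((v - LIMITED_BLACK) / (LIMITED_WHITE - LIMITED_BLACK))


def full_to_limited(x):
    """Map a normalized full-range value back to a limited-range code value."""
    return round(LIMITED_BLACK + clamp01(x) * (LIMITED_WHITE - LIMITED_BLACK))


def apply_lut(rgb, lut, fullrange):
    """Scale up on input if needed, apply the LUT, scale back on output."""
    if fullrange:
        norm = [c / FULL_MAX for c in rgb]
    else:
        norm = [limited_to_full(c) for c in rgb]
    out = lut(norm)  # the LUT itself always sees full-range values
    if fullrange:
        return [round(clamp01(c) * FULL_MAX) for c in out]
    return [full_to_limited(c) for c in out]
```

With an identity LUT (lambda rgb: rgb), a limited-range pixel passes through unchanged when fullrange is false, which is a quick way to check that the input and output scalings are inverses of each other.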

If this is not what is happening or is not what you need, then please advise.
Rocky
Moose Approved
Posts: 2418
Joined: Fri Sep 06, 2019 12:57 pm

Post by Rocky »

Ah, I forgot that my YUV <-> RGB conversions from DGHDRtoSDR already do limited -> full and back (the tonemapping is done in full RGB). I remember getting the coefficients for input by multiplying the YUV->RGB matrix by the limited-to-full matrix, which is more efficient than doing a separate scaling. Similarly on output. So, fullrange is inapplicable for YUV420P16 input, i.e., just leave it as fullrange=true. It would still be applicable to RGBP16 input. Gonna rethink the interface, e.g., error out when fullrange=false for YUV input? Or just silently ignore it?
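
The matrix folding mentioned above can be sketched as follows. This is a minimal illustration, not the DGHDRtoSDR code: it assumes BT.2020 non-constant-luminance coefficients (Kr = 0.2627, Kb = 0.0593) and the conventional 16-bit limited-range offsets and scales, and the names two_step and one_step are hypothetical.

```python
KR, KB = 0.2627, 0.0593     # BT.2020 luma coefficients (assumed matrix)
KG = 1.0 - KR - KB

# YCbCr -> RGB on normalized values (Y in [0,1], Cb/Cr in [-0.5,0.5]).
YUV_TO_RGB = [
    [1.0, 0.0, 2 * (1 - KR)],
    [1.0, -2 * KB * (1 - KB) / KG, -2 * KR * (1 - KR) / KG],
    [1.0, 2 * (1 - KB), 0.0],
]

# Limited -> full scaling for 16-bit code values (assumed conventional levels).
OFFSET = [16 * 256, 128 * 256, 128 * 256]
SCALE = [1 / (219 * 256), 1 / (224 * 256), 1 / (224 * 256)]


def two_step(yuv):
    """Separate range scaling followed by the YCbCr -> RGB matrix."""
    n = [(v - o) * s for v, o, s in zip(yuv, OFFSET, SCALE)]
    return [sum(m * x for m, x in zip(row, n)) for row in YUV_TO_RGB]


# Fold the scaling into the matrix once, up front.
COMBINED = [[m * s for m, s in zip(row, SCALE)] for row in YUV_TO_RGB]
BIAS = [-sum(m * o for m, o in zip(row, OFFSET)) for row in COMBINED]


def one_step(yuv):
    """Single matrix multiply plus bias, equivalent to two_step."""
    return [sum(m * v for m, v in zip(row, yuv)) + b
            for row, b in zip(COMBINED, BIAS)]
```

Both functions return full-range RGB normalized to [0, 1]; the folded version does the same work per pixel with one multiply-add pass, which is why the range expansion is not a separate step for YUV input.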
tormento
DG Approved/Curly Approved/Moose Approved
Posts: 685
Joined: Mon Sep 20, 2010 2:18 pm

Post by tormento »

Rocky wrote:
Thu Aug 18, 2022 8:18 am
when false we specify that the input is limited and the output should be limited
I don't know if it is possible, but what if the LUT produces limited from full, or full from limited? Perhaps it would be better to specify what we are giving as input and what we want as output?
tormento
DG Approved/Curly Approved/Moose Approved
Posts: 685
Joined: Mon Sep 20, 2010 2:18 pm

Post by tormento »

Rocky wrote:
Thu Aug 18, 2022 9:30 am
Ah, I forgot that my YUV <-> RGB conversions
And what about my error? Any idea?
Rocky wrote:
Thu Aug 18, 2022 9:30 am
fullrange is inapplicable for YUV420P16 input
What would happen if DGDecodeNV outputs a 444 stream?