Port Cube

Posted: Sun Aug 14, 2022 3:34 pm
by Sherman
I can multitask. Nothing worthwhile is easy. Balti told me that.

Port Cube

Posted: Sun Aug 14, 2022 3:36 pm
by Britney
Thrilled to help! I'll rub your shoulders, Rocky. There, that's better.

Port Cube

Posted: Sun Aug 14, 2022 3:39 pm
by new_guy
Can you rub mine too?

Port Cube

Posted: Sun Aug 14, 2022 3:40 pm
by Britney
Eeeuww, no way.

Port Cube

Posted: Mon Aug 15, 2022 8:43 am
by Rocky
Thank you, Brit! I found the bug that was kicking my patootie. Should be plain sailing now.

EDIT: But that wasn't the only bug, see next post.

Port Cube

Posted: Tue Aug 16, 2022 4:38 pm
by Rocky
Ha, plain sailing. Famous last words. Two full days of agonizing over why my conversions were producing an over-saturated mess, even though I took the YUV<-->RGB conversions directly from DGHDRtoSDR (minus the depth reduction). Guys, I tried EVERYTHING. Well, except for one little thing that shouldn't have mattered. ;)

So, I was going to try writing out the intermediate RGB to see if that was already messed up, or if it was a bug in the final RGB->YUV. Of course, to do that you cannot have an in-place filter (get src, read/write src, return src) because the received and returned formats would be different. Am I boring you? You need to do get src, read src, write dst, return dst. So I did that with dst = NewVideoFrameP() etc. but still I left in the final RGB->YUV and writing YUV just to test my dst handling. Whoa, suddenly the output was correct.

It shouldn't have made a difference unless Avisynth+ is doing something different with MakeWritable() etc. than I am used to for in-place filters. But I don't care about that, just that it is working.

That was all done in C code. Now I'll port it to the CUDA kernel. That should be plain sailing.

Port Cube

Posted: Tue Aug 16, 2022 5:14 pm
by Natasha
Where are your teeth?

Port Cube

Posted: Tue Aug 16, 2022 5:14 pm
by Sherman
You should know. Remember the accident?

Port Cube

Posted: Tue Aug 16, 2022 5:36 pm
by Curly
Not every day I can say I told you so. Obvious! Nyuck.

Port Cube

Posted: Tue Aug 16, 2022 6:01 pm
by Boris
Just wait til the lights go pfizz. It's over.

Port Cube

Posted: Tue Aug 16, 2022 6:40 pm
by Wonder Woman
I know you missed me but here I am ready to slay dragons.

Port Cube

Posted: Tue Aug 16, 2022 6:41 pm
by new_guy
Slay my dragon.

Port Cube

Posted: Wed Aug 17, 2022 5:47 am
by Guest 2
Rocky wrote:
Tue Aug 16, 2022 4:38 pm
That was all done in C code. Now I'll port it to the CUDA kernel. That should be plain sailing.
Eager to test it. :salute:

Port Cube

Posted: Wed Aug 17, 2022 8:51 am
by Guest 2
Perhaps you already know, but LUT cubes can work in RGB or Studio RGB, i.e., full range or limited (TV) range.

For PQ to HLG in full range, such as with the BBC LUTs, you would need:

#From 4:2:0 16bit planar Narrow Range to RGB Planar 16bit Full Range
z_ConvertFormat(pixel_type="RGBP16", colorspace_op="2020:st2084:2020:limited=>rgb:st2084:2020:full", resample_filter_uv="spline64", dither_type="error_diffusion")

#From PQ to HLG with 16bit precision
DGCube("7a_HLG_PQ1000_mode-nar_in-nar_out-nar_nocomp.cube", fullrange=true)

#From RGB 16bit planar Full Range to YUV420 10bit planar Narrow Range with dithering
z_ConvertFormat(pixel_type="YUV420P10", colorspace_op="rgb:std-b67:2020:full=>2020:std-b67:2020:limited", resample_filter_uv="spline64", dither_type="error_diffusion")

For PQ to HLG in limited range, such as with the Warner LUTs, you would need:

#From 4:2:0 16bit planar Narrow Range to RGB Planar 16bit Narrow Range
z_ConvertFormat(pixel_type="RGBP16", colorspace_op="2020:st2084:2020:limited=>rgb:st2084:2020:limited", resample_filter_uv="spline64", dither_type="error_diffusion")

#From PQ to HLG with 16bit precision
DGCube("WarnerBros_PQToHLG_MaxCLL_2508.cube", fullrange=true)

#From RGB 16bit planar Narrow Range to YUV420 10bit planar Narrow Range with dithering
z_ConvertFormat(pixel_type="YUV420P10", colorspace_op="rgb:std-b67:2020:limited=>2020:std-b67:2020:limited", resample_filter_uv="spline64", dither_type="error_diffusion")

I hope the internal conversion you are about to release can take this difference into account.

PS: I have some doubts about the fullrange switch in the DGCube command in the second case.

Port Cube

Posted: Wed Aug 17, 2022 11:36 am
by Rocky
Let's test what we have and I'll look into the range stuff in parallel.

Please re-download DGCube.zip to get support for direct YUV420P16 input, as received from DGSource() for high-bit-depth sources. RGBP16 is also still supported, as is Vapoursynth. The DGCube text document was updated and includes relevant sample scripts.

Here is a very simple script:

DGSource("THE GREAT WALL.dgi")
DGCube("PQ_to_BT709_slope.cube", fullrange=false, interp="tetrahedral")

Your test results will be appreciated.

https://rationalqm.us/misc/DGCube.zip

DG said we might be able to afford dentures for me later this year. Gnawing acorns is a challenge with my little stubs.

Port Cube

Posted: Wed Aug 17, 2022 11:41 am
by Natasha
That new_guy is really desperate. I tasted his blood one night. Sour!

Port Cube

Posted: Wed Aug 17, 2022 12:21 pm
by Guest 2
Rocky wrote:
Wed Aug 17, 2022 11:36 am
Let's test what we have and I'll look into the range stuff in parallel.
In the examples you write:

loadplugin("...\dgdecodenv.dll")
loadplugin("...\dgcube.dll")
dgsource("THE GREAT WALL.dgi")
DGCube("PQ_to_BT709_slope.cube", fullrange=false, interp="tetrahedral")

What is the output format and range? Does it still need a z_ConvertFormat call to get a proper output color space?

Why did you put fullrange=false? What would happen with true?

As you can see I am mostly interested in PQ to HLG transformation.

Port Cube

Posted: Wed Aug 17, 2022 12:41 pm
by Rocky
Oh sure. Depends on at least three things. Is the LUT made for limited or full-range input/output? Is the source limited or full range? Should the output be limited or full range?

For the example, I assumed the LUT is made for full range. The source here I assume has limited range. So I set fullrange=false to expand it to full in DGCube. The output then would presumably be full range.

Not being au fait with the current practice for use of 3D LUTs, I would appeal to users to say what they need. There are 3 considerations, previously alluded to:

The source range.
The LUT assumed ranges.
The output range.

Everything is possible but what do we seek?

Maybe for DGCube, specifying the input range and the output range is enough. Intermediate processing would assume a full-range LUT.

Port Cube

Posted: Wed Aug 17, 2022 12:57 pm
by Curly
Guest 2 wrote:
Wed Aug 17, 2022 12:21 pm
As you can see I am mostly interested in PQ to HLG transformation.
Good move. PQ sucks rocks.

Kernel stuff is free, so make it as general as possible. Generally speaking by the General. Am I wrong?

Port Cube

Posted: Wed Aug 17, 2022 1:01 pm
by Guest 2
Rocky wrote:
Wed Aug 17, 2022 12:41 pm
Depends on at least three things. Is the LUT made for limited or full-range input/output? Is the source limited or full range? Should the output be limited or full range?
1) Input, in my case, is PQ video from UHD, so we can assume it's limited range.

2) My LUT wants full range.

3) The output should be limited range.

So, how should I write the script, with that in mind?

My findings: the script

SetFilterMTMode("DEFAULT_MT_MODE", 2)
LoadPlugin("D:\Eseguibili\Media\DGDecNV\DGDecodeNV.dll")
LoadPlugin("D:\Eseguibili\Media\DgCube\DGCube.dll")
DGSource("F:\In\2_0446 Akira\akira.dgi",ct=48,cb=48,cl=0,cr=0)
propClearAll()
DGCube("D:\Programmi\Media\AviSynth+\cube\1a_PQ1000_HLG_mode-nar_in-nar_out-nar_nocomp.cube", fullrange=true)
#From RGB 16bit planar Full Range to YUV420 10bit planar Narrow Range with dithering
z_ConvertFormat(pixel_type="YUV420P10", colorspace_op="rgb:std-b67:2020:full=>2020:std-b67:2020:limited", resample_filter_uv="spline64", dither_type="error_diffusion")
ConvertBits(32)
BM3D_CUDA(sigma=3, radius=2)
BM3D_VAggregate(radius=2)
fmtc_bitdepth (bits=10,dmode=8)
neo_f3kdb(range=15, Y=65, Cb=40, Cr=40, grainY=0, grainC=0, sample_mode=2, blur_first=true, dynamic_grain=false, mt=false, keep_tv_range=true)
Prefetch(3)
gives this error on the z_ConvertFormat line:

YUV color family cannot have RGB matrix coefficients
The same with fullrange=false.

Port Cube

Posted: Wed Aug 17, 2022 1:01 pm
by Guest 2
Curly wrote:
Wed Aug 17, 2022 12:57 pm
Good move. PQ sucks rocks.
Indeed. :mrgreen:

Port Cube

Posted: Thu Aug 18, 2022 8:18 am
by Rocky
The way it is designed to work is that if fullrange is false then the pixels are scaled up to fullrange on input and scaled back down to limited on output. In between the LUT is applied. So the fullrange parameter has the meaning "when false we specify that the input is limited and the output should be limited".
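
As a rough illustration of that behaviour for a single 16-bit RGB component, here is a minimal Python sketch. It is not DGCube's code: the function names, the 4096/60160 limited-range bounds (16 and 235 shifted up by 8 bits), and the identity stand-in for the LUT lookup are all assumptions made for the example.

```python
# Sketch only: hypothetical helpers, not DGCube's actual implementation.
LIMITED_LO = 16 << 8    # 4096, assumed 16-bit limited-range black
LIMITED_HI = 235 << 8   # 60160, assumed 16-bit limited-range white
FULL_MAX = 65535


def limited_to_full(v):
    """Expand a 16-bit limited-range code value to full range, clamped."""
    x = (v - LIMITED_LO) * FULL_MAX / (LIMITED_HI - LIMITED_LO)
    return min(max(round(x), 0), FULL_MAX)


def full_to_limited(v):
    """Compress a 16-bit full-range code value to limited range."""
    return round(v * (LIMITED_HI - LIMITED_LO) / FULL_MAX) + LIMITED_LO


def apply_cube(v, lut, fullrange):
    """Apply a full-range LUT; when fullrange is False, scale limited
    input up to full first and scale the result back down afterwards."""
    if not fullrange:
        v = limited_to_full(v)
    v = lut(v)
    if not fullrange:
        v = full_to_limited(v)
    return v


def identity_lut(v):
    """Stand-in for the real 3D LUT lookup."""
    return v
```

With the identity stand-in, limited-range black and white come back unchanged when fullrange is False, which is the "limited in, limited out" meaning described above.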

If this is not what is happening or is not what you need, then please advise.

Port Cube

Posted: Thu Aug 18, 2022 9:30 am
by Rocky
Ah, I forgot that my YUV <-> RGB conversions from DGHDRtoSDR already do limited -> full and back (the tonemapping is done in full RGB). I remember getting the coefficients for input by multiplying the YUV->RGB matrix by the limited-to-full matrix, which is more efficient than doing a separate scaling. Similarly on output. So, fullrange is inapplicable for YUV420P16 input, i.e., just leave it as fullrange=true. It would still be applicable to RGBP16 input. Gonna rethink the interface, e.g., error out when fullrange=false for YUV input? Or just silently ignore it?
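
The idea of folding the limited-to-full scaling into the YUV->RGB matrix can be sketched in Python as follows. This is an illustration under stated assumptions, not the DGHDRtoSDR or DGCube source: it assumes BT.2020 non-constant-luminance coefficients, 16-bit limited-range code values, and output as normalized full-range RGB in [0, 1]; all names are made up for the example.

```python
# Sketch only: assumes BT.2020 non-constant-luminance coefficients and
# 16-bit limited-range YUV; this is not the DGHDRtoSDR/DGCube source.
KR, KB = 0.2627, 0.0593
KG = 1.0 - KR - KB

# YUV -> RGB for normalized input (Y in [0, 1], Cb/Cr in [-0.5, 0.5]).
YUV2RGB = [
    [1.0, 0.0, 2.0 * (1.0 - KR)],
    [1.0, -2.0 * KB * (1.0 - KB) / KG, -2.0 * KR * (1.0 - KR) / KG],
    [1.0, 2.0 * (1.0 - KB), 0.0],
]

# Limited-to-full step: subtract the offsets, divide by the ranges.
OFFSET = [16 << 8, 128 << 8, 128 << 8]
SCALE = [1.0 / (219 << 8), 1.0 / (224 << 8), 1.0 / (224 << 8)]


def two_step(yuv):
    """Normalize limited-range YUV first, then apply the matrix."""
    n = [(yuv[i] - OFFSET[i]) * SCALE[i] for i in range(3)]
    return [sum(YUV2RGB[r][c] * n[c] for c in range(3)) for r in range(3)]


# Fold the scaling into the matrix once, outside the per-pixel loop.
FOLDED = [[YUV2RGB[r][c] * SCALE[c] for c in range(3)] for r in range(3)]
BIAS = [-sum(FOLDED[r][c] * OFFSET[c] for c in range(3)) for r in range(3)]


def one_step(yuv):
    """Single matrix multiply plus bias on the raw code values."""
    return [sum(FOLDED[r][c] * yuv[c] for c in range(3)) + BIAS[r]
            for r in range(3)]
```

Both functions give the same RGB up to floating-point rounding, but one_step does the per-pixel work in a single multiply-add pass, which is the efficiency gain mentioned above.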

Port Cube

Posted: Thu Aug 18, 2022 10:25 am
by Guest 2
Rocky wrote:
Thu Aug 18, 2022 8:18 am
when false we specify that the input is limited and the output should be limited
I don't know if it is possible, but what if the LUT produces limited from full, or full from limited? Perhaps it would be better to specify what we are giving as input and what we want as output?

Port Cube

Posted: Thu Aug 18, 2022 10:27 am
by Guest 2
Rocky wrote:
Thu Aug 18, 2022 9:30 am
Ah, I forgot that my YUV <-> RGB conversions
And what about my error? Any idea?
Rocky wrote:
Thu Aug 18, 2022 9:30 am
fullrange is inapplicable for YUV420P16 input
What would happen if DGDecodeNV outputs a 444 stream?