CUDASynth

These CUDA filters are packaged into DGDecodeNV, which is part of DGDecNV.
DAE avatar
Guest 2
Posts: 903
Joined: Mon Sep 20, 2010 2:18 pm

CUDASynth

Post by Guest 2 »

Rocky wrote:
Mon Jan 29, 2024 1:49 am
Thanks, m8. I'm gonna put out a test version with HDRtoSDR today, and then start working on DGDenoise() integration.
:bow:

When you have some spare time, have a look at https://github.com/WolframRhodium/Vapou ... A/issues/7

It seems AVS support is getting orphaned. It's a very good filter; please evaluate whether it's feasible to run it internally.
User avatar
Rocky
Posts: 3621
Joined: Fri Sep 06, 2019 12:57 pm

CUDASynth

Post by Rocky »

I heard BM3D sucks. Is it all just hype or can you prove otherwise?
DAE avatar
Guest 2
Posts: 903
Joined: Mon Sep 20, 2010 2:18 pm

CUDASynth

Post by Guest 2 »

Rocky wrote:
Mon Jan 29, 2024 7:55 am
I heard BM3D sucks. Is it all just hype or can you prove otherwise?
My experience and my eyes. I find BM3D more precise and less blurry, and it can go temporal too.

NLMeans and BM3D often work best together, applying BM3D first and then NLMeans.

You can find an entire thread on https://forum.doom9.org/showthread.php?t=172172

I stopped using it because the AVS build became obsolete.
User avatar
Rocky
Posts: 3621
Joined: Fri Sep 06, 2019 12:57 pm

CUDASynth

Post by Rocky »

If you make it hard for me I won't do anything. Show me a sample and a script and tell me why it is better.
DAE avatar
Guest 2
Posts: 903
Joined: Mon Sep 20, 2010 2:18 pm

CUDASynth

Post by Guest 2 »

Rocky wrote:
Mon Jan 29, 2024 9:07 am
If you make it hard for me I won't do anything. Show me a sample and a script and tell me why it is better.
Let me finish this batch encode and I will post proper examples.
User avatar
Rocky
Posts: 3621
Joined: Fri Sep 06, 2019 12:57 pm

CUDASynth

Post by Rocky »

Thank you.
DAE avatar
Guest 2
Posts: 903
Joined: Mon Sep 20, 2010 2:18 pm

CUDASynth

Post by Guest 2 »

Rocky wrote:
Mon Jan 29, 2024 4:50 pm
Thank you.
I have found something serious:

https://www.aanda.org/articles/aa/pdf/2 ... 278-19.pdf

I know you will like it. :salute:

For something more practical, see these images: https://forum.doom9.org/showthread.php? ... ost1794468

P.S.: I am not one of the lucky owners, but is it possible to access the denoising that the Tensor Cores apply to raytraced images?
User avatar
Rocky
Posts: 3621
Joined: Fri Sep 06, 2019 12:57 pm

CUDASynth

Post by Rocky »

I do like it, thank you.

Just talking about BM3D vs. NLMeans because DGDenoise is NLMeans-based...

BM3D is clearly better with stationary noise, while NLMeans is a bit better with non-stationary noise. I'd argue that our use cases are closer to non-stationary.

I'll look at your practical cases and report back.

"the denoising that Tensor Cores apply to raytraced images"

I don't know what they are doing. Do you?
User avatar
Rocky
Posts: 3621
Joined: Fri Sep 06, 2019 12:57 pm

CUDASynth

Post by Rocky »

Your link says:

"V-BM3D is the best among 3 in general, but it suffers from some kind of "liquefying" low frequency artifacts which does not exist in NLMeans"

It also says:

"quality of high frequency filtering: NLMeans > V-BM3D"

His "general quality" assessment is subjective, and honestly, the NL means fish looks better to me.

Each filter wins at different frequencies. There's no knock-out punch for any of them, IMHO.

He doesn't state if the noise is stationary or non-stationary. Are all the filters doing spatio-temporal, etc.?

One improvement for DGDenoise would be to add temporal processing.

You're gonna have to do much better if you want me to invest a lot of time in this.
User avatar
hydra3333
Posts: 406
Joined: Wed Oct 06, 2010 3:34 am

CUDASynth

Post by hydra3333 »

Jogs memory, slips a cog, fails to recall properly, guesses ...
one wonders what may be up for consideration next in cudasynth ... dgsharpen, dgdeblock ?
did I mention dgdeblock ? :)
I really do like it here.
User avatar
Rocky
Posts: 3621
Joined: Fri Sep 06, 2019 12:57 pm

CUDASynth

Post by Rocky »

DGDenoise then DGSharpen.

I never heard of DGDeblock. ;)
DAE avatar
Guest 2
Posts: 903
Joined: Mon Sep 20, 2010 2:18 pm

CUDASynth

Post by Guest 2 »

Rocky wrote:
Wed Jan 31, 2024 6:57 am
You're gonna have to do much better if you want me to invest a lot of time in this.
The reason is that usually you apply BM3D first and then KNLMeansCL, to get the best of both worlds.

Having both in CUDA would be easier and faster for us.
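
For example, the kind of chain I mean (just a sketch, assuming the AviSynth builds of BM3D_CUDA and KNLMeansCL are installed; the sigma/h values are placeholders, not tuned recommendations):

ConvertBits(bits=32)            # BM3D_CUDA wants 32-bit float input
BM3D_CUDA(sigma=3.0, radius=2)  # BM3D pass, made temporal with radius=2
BM3D_VAggregate(radius=2)       # aggregate the temporal buffer back to ordinary frames
ConvertBits(bits=16)
KNLMeansCL(d=1, a=2, h=1.2)     # NLMeans pass afterwards to clean up what BM3D leaves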
DAE avatar
Guest 2
Posts: 903
Joined: Mon Sep 20, 2010 2:18 pm

CUDASynth

Post by Guest 2 »

Rocky wrote:
Wed Jan 31, 2024 6:53 am
I don't know what they are doing. Do you?
Watch Two Minute Papers on

https://www.youtube.com/channel/UCbfYPy ... upoX8nvctg

The owner is a researcher who loves CGI and holds Nvidia's researchers in high esteem.

In a recent video he explained very well why light tracing needs denoising.
User avatar
Rocky
Posts: 3621
Joined: Fri Sep 06, 2019 12:57 pm

CUDASynth

Post by Rocky »

Just wasted time watching stupid little fake things running around. Thanks! :roll:

Please give a link to the noise video. And I'm looking for the denoising algorithm, not just why it is needed.
User avatar
Rocky
Posts: 3621
Joined: Fri Sep 06, 2019 12:57 pm

CUDASynth

Post by Rocky »

Guest 2 wrote:
Wed Jan 31, 2024 8:09 am
The reason is that usually you apply first BM3D then KNLMeansCL to have the best of both worlds.
You have to support things and not make unsupported claims. And don't just link some idiot saying he does that. Show that it is better and explain why.
User avatar
Sherman
Posts: 578
Joined: Mon Jan 06, 2020 10:19 pm

CUDASynth

Post by Sherman »

Guys, I had a brainstorm! Now pay attention.

We can implement a filter consisting of every known denoising algorithm run in succession. That way we would be assured of getting the best of every world. Isn't science easy?
Sherman Peabody
Director of Linux Development
User avatar
Britney
Posts: 145
Joined: Sun Aug 09, 2020 3:24 pm

CUDASynth

Post by Britney »

Could be worth a try.
User avatar
Rocky
Posts: 3621
Joined: Fri Sep 06, 2019 12:57 pm

CUDASynth

Post by Rocky »

Started integrating DGDenoise. 8-)
User avatar
Rocky
Posts: 3621
Joined: Fri Sep 06, 2019 12:57 pm

CUDASynth

Post by Rocky »

Got it working! The challenge was to have the same denoising in both DGSource() and DGDenoise() without clashing or requiring duplicated cu files.

Still have some loose ends to tie up. Also, I see that the latest nVidia code is different. Maybe it's better than the older version we use. I'll check.
User avatar
Rocky
Posts: 3621
Joined: Fri Sep 06, 2019 12:57 pm

CUDASynth

Post by Rocky »

DG is trying to spend all his money. He just bought us an RTX 4090 and a 1200W PSU.

I would have said don't waste your money as the PCIe bus is the bottleneck. But in CUDASynth world that is no longer true. 8-)
DAE avatar
Guest 2
Posts: 903
Joined: Mon Sep 20, 2010 2:18 pm

CUDASynth

Post by Guest 2 »

Rocky wrote:
Sun Feb 04, 2024 10:30 am
Got it working! The challenge was to have the same denoising in both DGSource() and DGDenoise() without clashing or requiring duplicated cu files.
:bow:

I have thought about your idea of the INI file. Perhaps it's feasible if you introduce a small panel in DGIndex to control the filters, saving the default demux (and later decode) parameters there in its INI. Then you could pass the wanted filter to DGSource too, keeping some parameters variable, as you do with crop.
User avatar
Rocky
Posts: 3621
Joined: Fri Sep 06, 2019 12:57 pm

CUDASynth

Post by Rocky »

Yes, I agree with that. If we get too many filters we'll have to do something. But rather than an INI file, maybe leverage the existing script template generation.
User avatar
Rocky
Posts: 3621
Joined: Fri Sep 06, 2019 12:57 pm

CUDASynth

Post by Rocky »

There's another benefit from the CUDASynth way for our denoising. I'll explain. There are two things that can slow it down: PCIe transfers and the window size of the algorithm. With the reduction of the PCIe overhead, we have some leeway to increase the window size, and therefore the quality.

So I changed the choice of window sizes from the current 5/7/9 to 9/11/15, yielding a noticeable improvement in quality without losing much overall speed. I'm gonna rename the parameter choices to low/medium/high quality. Understand that low is the same as the previous highest quality. Here are the speeds for 1920x1080 (decoding plus denoising):

low: 518 fps
medium: 270 fps
high: 173 fps

On the very noisy Nostalghi, all the quality levels look fine, but high looks amazing.

The highest level is still 7 times real time, so maybe we should have an ultra quality level as well. That could be window size 25, giving 65 fps, still twice real time.
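
Back-of-the-envelope, assuming the denoising work per pixel grows roughly with the window area, going from window 9 to 25 is about (25/9)² ≈ 7.7 times the work, which squares with the drop from 518 fps at low to around 65 fps.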

I'm also probably going to ditch the blend/cblend options and just always use 0.0 blend, which means never mixing the original pixels back in with the filtered pixels (the blend is just a lerp: output = blend * original + (1 - blend) * filtered). I never saw the point of that, because it just adds back the noise you just took out. Eliminating the lerp also raises the fps by a small amount, and there's less possible crud on the parameter list, making coding and maintenance easier.
User avatar
Rocky
Posts: 3621
Joined: Fri Sep 06, 2019 12:57 pm

CUDASynth

Post by Rocky »

After getting awesome performance from our algorithm, I thought I'd try out BM3D (test 10). I got this script online:

ConvertBits(bits=32)            # BM3D_CUDA requires 32-bit float input
BM3D_CUDA(sigma=0.5, radius=2)  # radius=2 enables temporal filtering
BM3D_VAggregate(radius=2)       # needed to finish the temporal pass
ConvertBits(bits=16)            # back to 16-bit for the rest of the chain

When I tried it there was no perceptible denoising at all. I had to bump sigma to 25 to get something even starting to be comparable to ours. What's up with that? Even then the denoising was poor, with artifacts. And worse than that, the speed was 25 fps. It's not surprising, given the temporal smoothing, the bit-depth conversions, the extra filter, and all the PCIe bus overhead.

OK, temporal filtering was enabled in that script, so I disabled it by setting radius=0 and ditching BM3D_VAggregate(). The denoising was understandably even worse, and the speed was 136 fps. Even at low quality, ours is 518 fps with better results.
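
In other words, the spatial-only version boils down to this (sigma still at 25 as in the temporal test; radius=0, no aggregation step):

ConvertBits(bits=32)
BM3D_CUDA(sigma=25, radius=0)  # radius=0 = purely spatial, so BM3D_VAggregate() is not needed
ConvertBits(bits=16)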

To turn on chroma, the doc says you need YUV444PS input. The insanity just multiplies.

So tell me, what's to like here? It seems to me that the only thing this filter has going for it is the name. :?
User avatar
hydra3333
Posts: 406
Joined: Wed Oct 06, 2010 3:34 am

CUDASynth

Post by hydra3333 »

Guest 2 wrote:
Sun Feb 04, 2024 11:39 pm
Perhaps it's feasible if you introduce a small panel in DGIndex to control filters and you can save there in its ini the default demux (and later decode) parameters. So you can pass the wanted filter to DGSource too having some parameters variable, such as you do with crop.
Hmm, just a thought: perhaps, as an option, dgsource could specify a .ini file to use? That could be what you meant and I missed it.
I really do like it here.