DGDecomb
Re: DGDecomb
Well, to be honest, my experience has been that the current implementation causes as many problems as it solves. I may look into it a little bit more before I decide to keep it or ditch it. If anybody has some relevant clips, feel free to submit them.
Re: DGDecomb
Sorry for vacillating, but I have resurrected my edge-based approach. I found a way to produce a really good edge map from one field. The directional values are also very good. I just couldn't give it up because I knew that Faroudja's DCDi works so well. It doesn't mean I won't pursue full mo-comp later, but I want to try to bring this idea to fruition first.
Re: DGDecomb
Out of curiosity: Has Faroudja's DCDi algo ever been released/licensed as a SW package for PC use, or was it only available as HW (chip) solution in standalone players and/or TV sets?
Edit:
Never mind. I found the answer in Wikipedia.
Re: DGDecomb
Interestingly, Faroudja got acquired by STMicroelectronics, where I worked for 15 years. Sadly, after retiring I no longer have access to STMicro technology documents. I did find the original Faroudja patent, however, and it is clear that what I am attempting now is quite different. They determined the best correlation angle through a pixel to be interpolated by calculating the difference between the pixels above and below the current line at various angles. But the best correlation is not always the correct angle to interpolate. I am detecting the actual edges and their orientations.
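For the curious, the correlation-angle scheme described above can be sketched in a few lines. This is purely illustrative (a toy Python reading of the patent description in this post, not anything from the patent text or from DG's code): for each missing pixel, try a few horizontal shifts between the lines above and below, keep the shift with the smallest difference, and average along that direction.

```python
def interpolate_line(above, below, max_shift=2):
    """Correlation-angle interpolation in the spirit of the Faroudja
    patent description: for each missing pixel, pick the angle
    (horizontal shift) with the best correlation between the lines
    above and below, then average along that direction."""
    w = len(above)
    out = []
    for x in range(w):
        best_diff, best_val = None, None
        for d in range(-max_shift, max_shift + 1):
            xa, xb = x + d, x - d           # mirrored sample positions
            if 0 <= xa < w and 0 <= xb < w:
                diff = abs(above[xa] - below[xb])
                if best_diff is None or diff < best_diff:
                    best_diff = diff
                    best_val = (above[xa] + below[xb]) // 2
        out.append(best_val)
    return out
```

The pitfall noted above is visible even in this toy: on periodic texture a wrong shift can correlate perfectly, so the best correlation is not necessarily the true edge direction.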
Re: DGDecomb
admin wrote: I have decided to ditch my edge-based approach to deinterlacing. The problem is that it is not possible to produce a reliable enough edge map from a single field. And even if it were reliable enough, it's just not right to restrict proper handling to edges.
So now my idea is to start with an existing fast CUDA motion flow code from nVidia and add half-pixel motion compensation (either via constant phase FIR filtering or superresolution). The flow map will be per-pixel, not block-based. The goal is to give decent quality and be much faster than QTGMC. Fun, fun, fun.
admin wrote: I have resurrected my edge-based approach. I found a way to produce a really good edge map from one field. The directional values are also very good. I just couldn't give it up because I knew that Faroudja's DCDi works so well. It doesn't mean I won't pursue full mo-comp later, but I want to try to bring this idea to fruition first.
My goodness. A really good edge-based approach, as well as the prospect of a CUDA motion-compensated approach. How good is this?
admin wrote: I did find the original Faroudja patent, however, and it is clear that what I am attempting now is quite different. They determined the best correlation angle through a pixel to be interpolated by calculating the difference between the pixels above and below the current line at various angles. But the best correlation is not always the correct angle to interpolate. I am detecting the actual edges and their orientations.
I really do like it here.
Re: DGDecomb
I appreciate the sentiment but let's not count our chickens before they hatch.
Re: DGDecomb
Just curious: How is it going with the bobber?
Re: DGDecomb
Slow but steady. It's not an easy thing, and I am learning lots of things along the way.
Re: DGDecomb
Awesome! I feel like you are going to be the one to free us from the slow-but-high-quality shackles of QTGMC
Re: DGDecomb
Apart from the speed gain, I would see a major benefit in getting ONE plugin with functionality and quality similar to QTGMC's.
Collecting the "right" set of plugins and keeping track of ongoing Avisynth (fork) and QTGMC developments can be quite an exercise today. Let's see .....
Re: DGDecomb
The Horn & Schunck dense optical flow method implemented in the nVidia sample is not going to cut it for high-quality bobbing. I implemented the motion compensation based on the flow map and it truly sucked. Somewhat baffled, I created a frame sequence with just a single low-slope white line moving at 45 degrees and assessed the resulting flow field. It was shocking to see how poor the map was. Not only was the flow diffused away from the line by a large amount, but the line itself showed significant "lumpiness" when it should have been smooth. The Horn & Schunck method employs a smoothness assumption, i.e., that the flows of nearby points are close to each other. But that cannot be the case for a moving edge, and it is the edges that concern us the most.
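To make the smoothness assumption concrete, here is a toy pure-Python version of one Horn & Schunck (1981) Jacobi update step. It is illustrative only (the nVidia sample is CUDA and differs in detail), but it shows where the diffusion comes from: the neighbour average drives the update.

```python
def horn_schunck_step(u, v, Ix, Iy, It, alpha=1.0):
    """One Jacobi iteration of Horn & Schunck. u, v are flow component
    grids; Ix, Iy, It are image derivatives. The neighbour average
    (ub, vb) is the smoothness term that diffuses flow into
    textureless regions."""
    h, w = len(u), len(u[0])

    def avg(f, y, x):
        # 4-neighbour average with replicated borders
        s = 0.0
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            yy = min(max(y + dy, 0), h - 1)
            xx = min(max(x + dx, 0), w - 1)
            s += f[yy][xx]
        return s / 4.0

    un = [[0.0] * w for _ in range(h)]
    vn = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ub, vb = avg(u, y, x), avg(v, y, x)
            # data term scaled by the local gradient, regularised by alpha
            t = (Ix[y][x] * ub + Iy[y][x] * vb + It[y][x]) / \
                (alpha ** 2 + Ix[y][x] ** 2 + Iy[y][x] ** 2)
            un[y][x] = ub - Ix[y][x] * t
            vn[y][x] = vb - Iy[y][x] * t
    return un, vn
```

Note that wherever the image derivatives are zero, the update reduces to a plain neighbour average, so flow concentrated on a thin line gets smeared into its surroundings after a few iterations, which is exactly the diffusion described above.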
As I mentioned this stuff is not easy and I am learning a lot (QTGMC has a 12 year head start). It seems that a block-matching approach will be needed, and I have started looking into efficient CUDA implementations. Don't count the chickens yet!
Re: DGDecomb
Oh yes! So still 11 years and 11 months to go ....
admin wrote: As I mentioned this stuff is not easy and I am learning a lot (QTGMC has a 12 year head start)
Re: DGDecomb
Or never. Uncertainty is what makes life interesting.
I've been looking for new approaches that are algorithmically faster than QTGMC. It's possible that such a viable approach does not exist. Then I can fall back to simply trying to accelerate the existing QTGMC functionality.
Re: DGDecomb
Good show, finding that out.
admin wrote: The Horn & Schunck dense optical flow method implemented in the nVidia sample is not going to cut it for high-quality bobbing.
A bit of a pity, these things do sound like a good idea at the time.
I really do like it here.
Re: DGDecomb
While I continue to study QTGMC and how to speed it up using CUDA, I took a break to implement the YADIF algorithm in CUDA. I implemented it from scratch based on the public algorithm (no code re-use) so there is no problem with licensing. Of course, Michael Niedermayer gets credit as the creator of the YADIF concept. I made one small tweak to the idea for speed.
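For readers unfamiliar with YADIF, its two central ideas can be reduced to a toy per-pixel sketch. This is not the DGDecodeNV or yadifmod2 code, and it simplifies the real algorithm's direction search and clamping; it only illustrates the shape of the published algorithm: an edge-directed spatial prediction plus a temporal prediction that is clamped by the spatial evidence.

```python
def yadif_pixel(above, below, prev_line, next_line, x):
    """Simplified per-pixel YADIF (after Michael Niedermayer's public
    algorithm): spatial edge-directed prediction from the lines above
    and below, then a temporal prediction from the previous and next
    frames, clamped so it cannot stray outside the spatial evidence.
    Assumes x is an interior pixel."""
    w = len(above)

    def score(d):
        # sum of absolute differences over a 3-pixel window along angle d
        s = 0
        for k in (-1, 0, 1):
            xa, xb = x + k + d, x + k - d
            if not (0 <= xa < w and 0 <= xb < w):
                return float('inf')
            s += abs(above[xa] - below[xb])
        return s

    # spatial prediction: best of the checked directions
    best_d = min((-2, -1, 0, 1, 2), key=score)
    spatial = (above[x + best_d] + below[x - best_d]) // 2

    # temporal prediction, clamped by the spatial neighbourhood
    temporal = (prev_line[x] + next_line[x]) // 2
    lo = min(above[x], below[x])
    hi = max(above[x], below[x])
    return min(max(temporal, min(lo, spatial)), max(hi, spatial))
```

The clamp is what lets YADIF use temporal information without ghosting: when the previous/next frames disagree with the current field's spatial neighbourhood, the spatial bounds win.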
For comparison I tested both the CUDA version and yadifmod2. (BTW, yadifmod2 has a little bug in the spatial check which I fixed. I sent a PM about it to Chikuzen. It doesn't affect performance but makes the spatial check a bit wrong.) Following are for UHD 3840x2160, 7700K CPU, 1080Ti GPU. For the multi-threading of yadifmod2, 4 threads was the sweet spot. Avisynth+. All 64-bit.
CUDA: 177 fps, 13% CPU
yadifmod2 SSE3 single-threaded: 79 fps, 13% CPU
yadifmod2 SSE3 multi-threaded: 239 fps, 48% CPU
yadifmod2 AVX2 single-threaded: 127 fps, 14% CPU
yadifmod2 AVX2 multi-threaded: 244 fps, 30% CPU
Remarks:
1. CUDA wins for single-threaded cases. The CPU utilization is comparable, but CUDA delivers a higher frame rate.
2. For multi-threaded, yadifmod2 wins for frame rate but at the expense of significantly higher CPU utilization. For transcoding, CUDA's 177 fps for UHD is nowhere near a bottleneck, so the lower CPU utilization is preferable to give more CPU to the encoding part of the chain.
3. As always, the CUDA performance is limited by the memory copying to/from the GPU. The kernel compute time is almost negligible compared to that. We need higher CPU-GPU bandwidth! Another useful thing would be to be able to pin the memory, but Avisynth would need to be modified (I am playing around with that).
4. Integration with DGDecodeNV is welcome to avoid some extra user work/script complexity.
5. Yes, I thought of it too; this deinterlacing should replace what we have in DGTelecide for postprocessing, etc. Patience!
6. The SIMD optimizations in yadifmod2 are exceedingly well done (the non-SIMD performance is, as expected, abysmal). Kudos to the authors!
I will do a little bit more testing and then slipstream this later today.
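A rough back-of-envelope for remark 3 shows why the copies dominate. The bandwidth figure below is an assumption (~12 GB/s effective for a PCIe 3.0 x16 link; pageable memory is typically slower, which is why pinning helps), and the frame size assumes 8-bit 4:2:0:

```python
# Estimate PCIe copy time per UHD frame versus the measured frame budget.
width, height = 3840, 2160
bytes_per_frame = width * height * 3 // 2      # 8-bit 4:2:0 (e.g. YV12)
pcie_bps = 12e9                                # assumed effective bandwidth
copy_s = 2 * bytes_per_frame / pcie_bps        # upload + download
frame_budget_s = 1 / 177                       # measured 177 fps
print(f"{bytes_per_frame / 1e6:.1f} MB/frame, "
      f"{copy_s * 1e3:.2f} ms copying of {frame_budget_s * 1e3:.2f} ms budget")
```

Under these assumptions the two copies alone eat roughly a third of the 5.6 ms per-frame budget at 177 fps, while the kernel itself is almost free.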
Re: DGDecomb
So, you took a break from coding a software package by coding another software package.
admin wrote: I took a break
Awesome
Re: DGDecomb
C'mon guys, suggest me a new host.
Re: DGDecomb
website hosting is not in my domain of experience but how about this
https://ca.godaddy.com/hosting/windows-hosting
Re: DGDecomb
Sure, that's an option. I'm hoping someone with experience elsewhere may chime in. HostGator also looks good. But will I just be exchanging one set of problems for another? If only Comcast would let me host my own server.
Re: DGDecomb
Host is back up, so I slipstreamed the new code.
Re: DGDecomb
I hate to ask, but was the documentation, DGDecodeNVManual, updated?
I didn't see any mention of your new filter.
Re: DGDecomb
Seems I forgot to put it in the zip file. I'll do it right now. Thanks for the heads up.
EDIT: Done.
Re: DGDecomb
Thanks for the new filter
Re: DGDecomb
admin note: oops I accidentally edited this instead of replying.
You had said you thought DGBob was leaving more residual combing.