DGMVCsource
Re: DGMVCsource
With the functionality completed and working as designed it's indeed time to do some performance testing and tweaking. I will start that today.
Re: DGMVCsource
neuron2 wrote: Suffering Doom9 withdrawal here...
Yeah some of us are lurking in the shadows here, just for the warmth...thanks.
Well OK I guess that's why we have lap cats. I'll go away now.
Re: DGMVCsource
I tested the decoding frame rate with and without HW acceleration using AVSMeter 1.7.5 and a 2D AVC stream 1920x1080.
dgmvcsource("panama.264","",view=0,frames=1608)
assumefps(10000)
SW:
Frames processed: 1608 (0 - 1607)
FPS (min | max | average): 31.58 | 128.7 | 48.91
CPU usage (average): 13%
Thread count: 11
Physical Memory usage (peak): 210 MB
Virtual Memory usage (peak): 209 MB
Time (elapsed): 000:00:32.879
HW:
Frames processed: 1608 (0 - 1607)
FPS (min | max | average): 73.18 | 344.1 | 156.4
CPU usage (average): 3%
Thread count: 11
Physical Memory usage (peak): 260 MB
Virtual Memory usage (peak): 282 MB
Time (elapsed): 000:00:10.280
So HW is three times faster to decode. Everything looks OK here. I assume things would be a little slower when the combiner is used.
DGIndexNV rendered this at 140 fps, so QuickSync appears somewhat faster than CUVID VP5 for this stream.
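For anyone who wants to check the "three times faster" figure, here is a minimal sketch in Python that recomputes the ratio from the two AVSMeter runs above. It is not part of DGMVCsource or AVSMeter; the dictionary and variable names are made up for illustration, and the numbers are copied by hand from the logs.

```python
# Figures copied from the SW and HW AVSMeter runs above
# (same 1608-frame 1920x1080 2D AVC stream in both cases).
sw = {"fps_avg": 48.91, "elapsed_s": 32.879}
hw = {"fps_avg": 156.4, "elapsed_s": 10.280}

# Speedup measured two ways: ratio of average FPS, and ratio of wall-clock time.
fps_speedup = hw["fps_avg"] / sw["fps_avg"]
time_speedup = sw["elapsed_s"] / hw["elapsed_s"]

print(f"HW vs SW by average FPS:  {fps_speedup:.2f}x")
print(f"HW vs SW by elapsed time: {time_speedup:.2f}x")
```

Both ratios come out at roughly 3.2x, so the FPS and elapsed-time figures agree with each other.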
Re: DGMVCsource
Any chance that NVIDIA/CUDA/CUVID will support MVC HW decoding in future?
Re: DGMVCsource
How do you run your test? What do you do to enable or disable HW acceleration? And what is your CPU?
Re: DGMVCsource
I assumed that DGMVCdecoder detects the HW capabilities and makes the best selection automatically, no?
Anyway, my HW is definitely not the latest:
CPU: Intel Core 2 Quad Q9300 / Yorkfield
GPU: NVIDIA Geforce 9600 GS / G94 Rev. A1
OS: Windows 7/64bit
Re: DGMVCsource
Sharc wrote: Any chance that NVIDIA/CUDA/CUVID will support MVC HW decoding in future?
I don't know anything about that. It's up to nVidia.
I test by just opening the script with AVSMeter. My CPU is i7-4770.
HW is automatically used if it's available. For my testing I forced it by modifying the code. I can add a parameter to force SW or HW.
Re: DGMVCsource
Here is a version that allows you to force SW or HW decoding.
http://rationalqm.us/misc/dgmvcsource_b20.zip
Re: DGMVCsource
Source code was released at my web site with a free use license.
Re: DGMVCsource
According to my testing with AVSMeter on 2D streams, dgmvcsource is way faster than frimsource. Does anybody else have any results?
Re: DGMVCsource
Are you referring to HW decoding?
For 3D SW decoding the two seem to be much the same here.
(My PC does not support Intel HW decoding, only 2D CUDA/CUVID)
Edit:
Here are my test results for 3D SW decoding (separate i/p files, interleaved o/p):
DGMVCsource:
[Clip info]
Number of frames: 3300
Length (hhh:mm:ss.ms): 000:01:08.819
Frame width: 1920
Frame height: 1080
Framerate: 47.952 (96000/2002)
Interlaced: No
Colorspace: YV12
[Runtime info]
Frames processed: 3300 (0 - 3299)
FPS (min | max | average): 13.89 | 769.2 | 26.47
CPU usage (average): 26%
Thread count: 8
Physical Memory usage (peak): 318 MB
Virtual Memory usage (peak): 320 MB
Time (elapsed): 000:02:04.671
FRIMsource:
[Clip info]
Number of frames: 3300
Length (hhh:mm:ss.ms): 000:01:08.819
Frame width: 1920
Frame height: 1080
Framerate: 47.952 (96000/2002)
Interlaced: No
Colorspace: YV12
[Runtime info]
Frames processed: 3300 (0 - 3299)
FPS (min | max | average): 14.67 | 303.9 | 27.10
CPU usage (average): 49%
Thread count: 12
Physical Memory usage (peak): 470 MB
Virtual Memory usage (peak): 499 MB
Time (elapsed): 000:02:01.772
FRIM has the edge in my tests; its CPU usage, thread count and memory usage are higher though.
I did also notice that FRIM seems to give more consistent results in repeated tests (i.e. lower spread of the decoding times).
Edit2:
The difference gets more pronounced when outputting to half-SBS:
DGMVCsource: Time (elapsed): 000:02:12.208
FRIMsource: Time (elapsed): 000:02:01.401
Edit3:
Most interesting: The winner changes when I transcode half-SBS with x264 (using the same scripts as before):
DGMVCsource + x264 (half-SBS): Total time = 180 sec.
FRIMsource + x264 (half-SBS): Total time = 195 sec.
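To put the gaps above in proportion, here is a rough Python sketch that converts the AVSMeter "Time (elapsed)" strings to seconds and expresses each difference as a percentage. The helper names `parse_elapsed` and `percent_slower` are hypothetical (they are not from AVSMeter or either plugin), and the timings are copied by hand from the results above.

```python
def parse_elapsed(text):
    """Convert an AVSMeter 'hhh:mm:ss.ms' elapsed-time string to seconds."""
    hours, minutes, seconds = text.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def percent_slower(slow_s, fast_s):
    """How much longer slow_s is than fast_s, as a percentage of fast_s."""
    return (slow_s - fast_s) / fast_s * 100.0


# (DGMVCsource, FRIMsource) elapsed times for the decode-only runs.
decode_runs = {
    "interleaved": ("000:02:04.671", "000:02:01.772"),
    "half-SBS": ("000:02:12.208", "000:02:01.401"),
}

for name, (dg, frim) in decode_runs.items():
    gap = percent_slower(parse_elapsed(dg), parse_elapsed(frim))
    print(f"{name}: DGMVCsource takes {gap:.1f}% longer than FRIMsource")

# Transcode with x264 (half-SBS): total times in seconds, order reversed.
x264_gap = percent_slower(195.0, 180.0)
print(f"x264 half-SBS: FRIMsource takes {x264_gap:.1f}% longer than DGMVCsource")
```

These are single runs, so gaps of a few percent may well be within run-to-run variation, as the remark about the spread of repeated tests suggests.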
Re: DGMVCsource
For me, FRIMSource is also faster, but its CPU and RAM usage are also higher than DGMVCSource's.
Re: DGMVCsource
laserfan wrote: Yeah some of us are lurking in the shadows here, just for the warmth...thanks.
It is my humble privilege and honor to provide refuge for suffering souls, such as yourself, one known for advanced thinking on all planes, and I am always striving to plow the field for advanced thinkers to reap the abundant harvest of their intellects.
laserfan wrote: Well OK I guess that's why we have lap cats. I'll go away now.
Come again, soon! And may your spirit run wild.
Re: DGMVCsource
Really late to the party.
I completely forgot about this project.
I followed a bit on Doom9 and saw how complicated things got.
I believe it was Intel limitations?
Now that I've got an upgrade--- wait, was this project based on Intel QSV? Or is it Nvidia-friendly?
The desktop has Intel HD Graphics (until I plugged in the NVidia).
I know my old HP Pavilion laptop has Intel HD Graphics 3000 (2nd generation Intel Core i3-2330M Processor 2.2 GHz)(which was to also use DGdecQSV-thingy) but I haven't plugged that in for months!
My second laptop is a Samsung with an old Nvidia card in it (310M).
Found an old link that still works. I should actually read through that thread.
Man, the memories. Just not long ago I was devoid of some good CUDA for my desktop (before getting that GT 640 and then upgrading to GTX 750).
I've been delaying this for some time since I didn't have that many 3D-BDs but time has certainly passed.
I'll see about the condition of my laptop and have a look at this.
EDIT: Ah yeah, I remember: it's an Avisynth plugin.
Some reason or other, I was thinking it was an MVC indexer.
Time for some coffee.
Re: DGMVCsource
sparktank wrote: I believe it was Intel limitations?
It's not clear what "it" refers to here.
sparktank wrote: was this project based on Intel QSV? Or is it Nvidia-friendly?
It's QSV only. CUVID doesn't support MVC yet.
It's possible to run both your nVidia stuff and the QSV stuff on the same machine. It's described in the thread, IIRC.
Re: DGMVCsource
Just noticed that newer builds are available. Thanks for your efforts!
Just curious - what has been changed since b20? I assume at least a newer SDK.
Re: DGMVCsource
Newer SDK to get decoding fixes. That is all.