[RESOLVED] Demuxing Matroska sources is slower

Posted: Sun Dec 27, 2020 1:35 pm
by Boulder
I've recently started using DGIndexNV to demux with every source type I run into since it's much more convenient to index and demux in one go. I noticed that demuxing Matroska files is quite slow, and did a quick test to see how it looks.

I have a 36.5 GB file. I first indexed it and demuxed the first audio track + chapters at the same time, which took ~32 minutes.
I then only indexed it, which took ~6 minutes, and demuxed the same tracks with mkvextract, which took ~10 minutes. Based on this, demuxing while indexing seems to take about twice as long as doing the two steps separately.
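
Spelled out as a quick calculation (the numbers are just the approximate timings quoted above, nothing more precise than that):

```python
# Timings as reported above, in minutes (all approximate).
combined = 32          # DGIndexNV: index + demux in one run
index_only = 6         # DGIndexNV: index only
mkvextract_demux = 10  # mkvextract: demux of the same tracks

separate = index_only + mkvextract_demux
ratio = combined / separate

print(f"separate: {separate} min, combined: {combined} min, ratio: {ratio:.1f}x")
```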

Is this something that could be avoided, or is it just some OS-related thing that cannot be helped?

Re: Demuxing Matroska sources is slower

Posted: Sun Dec 27, 2020 2:13 pm
by Rocky
It's probably fixable. I'll look into it.

Re: Demuxing Matroska sources is slower

Posted: Mon Dec 28, 2020 8:05 am
by Rocky
32 minutes seems way too long. I did a test on a 1 GB MKV with the latest DGIndexNV and it took less than 5 seconds.

Are you running on hard disks or SSDs? Are your source and destination disks the same?

Please give me a list of the audio types being demuxed.

Please tell me how this 36GB MKV came to be. I will try making one with MakeMKV and try that.

Re: Demuxing Matroska sources is slower

Posted: Mon Dec 28, 2020 10:55 am
by Rocky
I just did an 85GB MKV in 9:22. The only thing I can think of is that you have a PCM track that is using _write() currently instead of fwrite(). If so, can you try demuxing without the PCM track?
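
For anyone wondering why _write() versus fwrite() could matter at all: _write() is the low-level unbuffered call, while fwrite() goes through the C runtime's stream buffer, so many small packets get coalesced into fewer, larger writes. Here is a rough Python analogy of that effect. It is only an illustration, not DGIndexNV code; `CountingSink`, the 512-byte packet size, and the 64 KiB buffer are all made up for the example.

```python
import io

class CountingSink(io.RawIOBase):
    """Hypothetical stand-in for a file descriptor; counts writes that reach it."""
    def __init__(self):
        super().__init__()
        self.calls = 0
        self.data = bytearray()

    def writable(self):
        return True

    def write(self, b):
        self.calls += 1
        self.data += bytes(b)
        return len(b)

def write_unbuffered(sink, packets):
    # One low-level write per packet, similar in spirit to calling _write() each time.
    for p in packets:
        sink.write(p)

def write_buffered(sink, packets, buffer_size=64 * 1024):
    # Packets accumulate in memory and reach the sink in larger chunks,
    # similar in spirit to fwrite() on a buffered stream.
    buffered = io.BufferedWriter(sink, buffer_size=buffer_size)
    for p in packets:
        buffered.write(p)
    buffered.flush()
    buffered.detach()  # leave the sink open for inspection

packets = [bytes(512)] * 1000  # 1000 small packets; the size is an assumption
raw, buf = CountingSink(), CountingSink()
write_unbuffered(raw, packets)
write_buffered(buf, packets)
print("unbuffered writes:", raw.calls, "buffered writes:", buf.calls)
```

Both paths produce identical output; only the number of calls reaching the sink differs.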

EDIT: I just demuxed a 19GB MKV with two PCM tracks in 41 seconds.

I'll wait for answers to my questions before doing anything else.

Re: Demuxing Matroska sources is slower

Posted: Mon Dec 28, 2020 11:37 pm
by Boulder
I recreated a test file to be sure of the conditions. This time I used the movie Se7en (Remastered edition), imported it into MKVToolNix GUI, and chose only the VC-1 video track, the DTS-HD MA audio track, and chapters. I exported to a regular HDD, which I know to be the fastest HDD I have. File size: 27.3 GB.
Writing went normally, and I then opened the Matroska file in the 64-bit DGIndexNV. I set cropping and, for demuxing, chose everything but the video track. I saved the project on an SSD to make sure writing was not the bottleneck.

The processing speed was very slow. Task Manager showed about 3-4 MB/second, so it was even slower than in my report here. I watched it for a while, then went to the GUI and pressed ESC to cancel. The indexing seemed to halt, but something was still happening in the background: Task Manager now showed a normal disk access speed of ~40 MB/second for DGIndexNV. It seems it was demuxing the tracks, as they then appeared in the save folder and the GUI unfroze (it had been in the Not Responding state the whole time).

I don't know if it's my OS doing things but it seems rather consistent, and using demux on a regular playlist item works perfectly.

Re: Demuxing Matroska sources is slower

Posted: Tue Dec 29, 2020 6:57 am
by Rocky
What are your OS details?

Re: Demuxing Matroska sources is slower

Posted: Tue Dec 29, 2020 9:11 am
by Rocky
Please test this and report back. Thank you.

http://rationalqm.us/misc/DGIndexNV_Boulder.exe

Re: Demuxing Matroska sources is slower

Posted: Tue Dec 29, 2020 1:05 pm
by Boulder
Unfortunately it's just as slow, with no distinguishable difference from the official build.

My OS is Windows 10 Professional 64-bit, on a Ryzen 3900X + X370 motherboard build with the latest AMD drivers installed. Nothing fancy, just a bunch of HDDs + some USB ones in the system.

Re: Demuxing Matroska sources is slower

Posted: Tue Dec 29, 2020 1:45 pm
by Rocky
Then I am totally baffled. I do not see this slowness. Does anyone else see it?

Re: Demuxing Matroska sources is slower

Posted: Tue Dec 29, 2020 3:22 pm
by Guest
Tested two MKVs, HEVC 10 bit HDR, DTS X, Chapter file
OS is Win10 1607
First (25.7 GB)
Index + Demux: 2:21
Index: 2:21
gMKV: 1:50

Second (64.4 GB)
Index + Demux: 7:26
Index: 6:38
gMKV: 5:35

Tests are SSD to SSD

Note
Could it be anti-virus or indexing of the files by Windows?

Re: Demuxing Matroska sources is slower

Posted: Tue Dec 29, 2020 4:21 pm
by Rocky
Good questions in your Note. Boulder can try checking task manager during processing to see what else may be hammering the CPU or disk. Also, I hope he is not using networked or USB-connected drives.

gonca, do you have any HDDs available? That is the one thing I see different with Boulder. He is doing HDD -> SSD.

Re: Demuxing Matroska sources is slower

Posted: Tue Dec 29, 2020 4:43 pm
by Guest
Yes, I have my "large" data drive
I'll test HDD to SSD

EDIT
USB3.0 SSD should not be a problem
I use one of those (internal on a dock) for the ES that I index
USB2 could be an issue

Re: Demuxing Matroska sources is slower

Posted: Tue Dec 29, 2020 5:03 pm
by Guest
HDD >> SSD
25.7 GB file
Index + Demux
5:13

Re: Demuxing Matroska sources is slower

Posted: Tue Dec 29, 2020 11:08 pm
by Boulder
Disk usage stays low; Task Manager shows around 5-6% for the whole system. There are also no problems with excessive CPU usage or Windows Defender acting up: it uses very little CPU or disk when I index and demux.

Does anyone else see demuxing continue to the end after you cancel indexing? The disk speed jumps to normal readings the second I cancel indexing.

Re: Demuxing Matroska sources is slower

Posted: Wed Dec 30, 2020 4:48 am
by Guest
If the ESC key is used, indexing stops but demuxing does not; however, the speed stays (more or less) the same on my system.
I think of this as a feature: how to use DGIndexNV as a demuxer.

I also tend to use DGDemux for demuxing

Re: Demuxing Matroska sources is slower

Posted: Wed Dec 30, 2020 5:24 am
by Rocky
When indexing and demuxing transport streams, there is only one thread and one pass through the source file(s). When indexing and demuxing Matroska, however, two threads are used, each of which independently reads the source file(s). One thread does indexing and one does demuxing. It is a bug that canceling does not stop the demuxing thread. I will correct that.

What is interesting here is that the disk speed (for Boulder) jumps up when the indexing is canceled. That suggests some kind of thrashing or deadlocking is going on between the threads. But Boulder is apparently the only person to see this. I'll think about this some more. There are tantalizing hints. Perhaps the process could be made single-threaded for Matroska. That could be the most sensible resolution.
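
To make the two designs concrete, here is a rough sketch of the difference. It is not the DGIndexNV source; the function names, the callbacks, and the block size are hypothetical, and the "two readers" version runs them back to back rather than in real threads. The point is only that one pass reads each block once and feeds both consumers, while the two-reader design reads the whole source twice.

```python
import io

def index_and_demux_one_pass(src, block_size, on_index, on_demux):
    """Read each block once and hand it to both consumers."""
    reads = 0
    while block := src.read(block_size):
        reads += 1
        on_index(block)
        on_demux(block)
    return reads

def index_and_demux_two_readers(open_src, block_size, on_index, on_demux):
    """Each consumer opens and reads the source on its own.

    Run sequentially here for simplicity; in the two-thread design described
    above, the two readers would be active at the same time.
    """
    reads = 0
    for consumer in (on_index, on_demux):
        with open_src() as src:
            while block := src.read(block_size):
                reads += 1
                consumer(block)
    return reads

data = bytes(range(256)) * 64  # 16 KiB stand-in for the source file

index_a, audio_a = [], bytearray()
one_pass_reads = index_and_demux_one_pass(
    io.BytesIO(data), 4096, lambda b: index_a.append(len(b)), audio_a.extend)

index_b, audio_b = [], bytearray()
two_reader_reads = index_and_demux_two_readers(
    lambda: io.BytesIO(data), 4096, lambda b: index_b.append(len(b)), audio_b.extend)

print("one pass:", one_pass_reads, "reads; two readers:", two_reader_reads, "reads")
```

Both produce the same index and demuxed output; the two-reader version simply issues twice as many reads against the source.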

@gonca

DGDemux is not an option for Matroska.

@Boulder

Can you test from SSD to SSD? Even if it is the same SSD it could be interesting.

Re: Demuxing Matroska sources is slower

Posted: Wed Dec 30, 2020 6:22 am
by Boulder
Cheers, I will test SSD to SSD when I get back from work (i.e. move from the corner of the drawing room sofa to the computer chair :lol: ) If your theory regarding thrashing holds, it could make a noticeable difference. I wouldn't be surprised if this was something specific to the chipset itself that just happens to appear in cases like this.

Re: Demuxing Matroska sources is slower

Posted: Wed Dec 30, 2020 9:00 am
by Rocky
I have implemented and tested one-pass index+demux for PCM. Let me test DTS-MA and then I will give you a test build. It won't be fully complete but at least you'll be able to see if the disk speed goes up to what it should be. If that works OK, I'll complete the re-design.

Re: Demuxing Matroska sources is slower

Posted: Wed Dec 30, 2020 9:17 am
by Rocky
OK, please try this:

http://rationalqm.us/misc/DGIndexNV_onepass.exe

Just use your VC1+DTS-MA file. Don't try anything fancy, no project range setting, etc. Just select the video and DTS for demuxing. Let's see how fast it goes. I ran it on ALIENS (25GB) and it completed in 55 seconds.

Re: Demuxing Matroska sources is slower

Posted: Wed Dec 30, 2020 12:33 pm
by Boulder
That version is much faster, seems like it fixed the issue.

SSD to SSD is also quite fast with the old version, so I suppose it has something to do with mechanical HDDs having to move the heads around? The files are so huge that they are probably fragmented into a million pieces. Still, I wonder why the Windows cache doesn't do its job in this case.
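
A toy model of the head-movement idea, just to show the scale of the effect. It is purely hypothetical: the block counts and the 200-block lead are made up, and it ignores OS read-ahead, caching, and fragmentation entirely. It compares one sequential reader against two readers at different positions in the same file whose requests alternate.

```python
def head_travel(requests):
    """Total distance the head moves, in blocks, for a sequence of block numbers."""
    travel, pos = 0, 0
    for block in requests:
        travel += abs(block - pos)
        pos = block
    return travel

def one_reader(n_blocks):
    """A single reader going through the file front to back."""
    return list(range(n_blocks))

def two_interleaved_readers(n_blocks, lead):
    """Reader A runs `lead` blocks ahead of reader B; requests alternate A, B, A, B."""
    requests = []
    for i in range(n_blocks):
        a = i + lead
        if a < n_blocks:
            requests.append(a)  # reader A, further into the file
        requests.append(i)      # reader B, trailing behind
    return requests

sequential = head_travel(one_reader(1000))
interleaved = head_travel(two_interleaved_readers(1000, lead=200))
print("sequential travel:", sequential, "interleaved travel:", interleaved)
```

In this model the head has to jump back and forth across the gap between the two readers on nearly every request, which an SSD would not care about but a mechanical drive would.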

Re: Demuxing Matroska sources is slower

Posted: Wed Dec 30, 2020 12:51 pm
by Rocky
Great info, thanks. Before going for that I want to look into the HDD angle. gonca had no issue with HDD to SSD but I want to try it too. Also, this re-design is a bit slower for me in SSD->SSD.

Re: Demuxing Matroska sources is slower

Posted: Wed Dec 30, 2020 1:27 pm
by Rocky
@gonca

For proper HDD testing you need to use the flushmem utility and power cycle the HDD before doing anything.

@all

Initial test from HDD to SSD:

stable DGIndexNV: 16 minutes
new DGIndexNV: 3 minutes

So looks like we're going to go with this redesign.

Re: Demuxing Matroska sources is slower

Posted: Wed Dec 30, 2020 2:03 pm
by Rocky
Ha ha, unexpected bonus. When I was gearing up to test the HDD, I discovered my USB drive dock was only USB 2.0. I replaced it with a good 3.0 and connected it to a USB 3 Gen 2 port. Just ran a full system backup in 17 minutes, whereas previously it took 35 minutes (integral SSD to SSD connected via USB 3.0). :lol: The new one also has a built-in 3.0 hub.

OK, gotta clean up loose ends with the re-design.

Re: Demuxing Matroska sources is slower

Posted: Wed Dec 30, 2020 2:47 pm
by Curly
Boulder wrote:
Wed Dec 30, 2020 6:22 am
move from the corner of the drawing room sofa to the computer chair :lol:
Now that there is funny! Coming from a primo comedian like me, take it to the bank.

Re: Demuxing Matroska sources is slower

Posted: Fri Jan 01, 2021 3:27 pm
by DG
Rocky, looking at your checked out files for this feature, I notice you added a setting to specify optimization for HDD. I like that idea! But the implementation has lots of duplicated code that could cause maintenance issues down the road. If you like, I'd be happy to clean that up for you. You know, just to get my coding skills back up to snuff. If you had planned to do it, never mind, I have QM papers to catch up on.

BTW, it's possible to automatically determine if an HDD is involved but it's a nightmare. Much easier to let the user set things up.