[RESOLVED] DGIndexNV 2048 doesn't handle Shift-JIS names

Support forum for DGDecNV
jmt247
Posts: 28
Joined: Sun Jun 15, 2014 5:46 pm

[RESOLVED] DGIndexNV 2048 doesn't handle Shift-JIS names

Post by jmt247 »

Hello.

I see some errors when I open a file that contains "0x5c" in the second byte of a character.

Please see https://en.wikipedia.org/wiki/Shift_JIS and https://sites.google.com/site/fudist/Ho ... i-jp/table

It opens the wrong directory when I try to save the project file.

I'm certain that previous versions didn't have this problem.
2047.png
2048.png
Also, both title bars should show "(テレビ大阪) 2014-07-20 天気予報.ts" at the top. "予" is the problematic character.
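
To illustrate the report above: in Shift-JIS (Windows code page 932), "予" is encoded as the bytes 0x97 0x5C, and 0x5C is also the ASCII backslash. The sketch below is not DGIndexNV's code. It only shows how a byte-oriented search for the last path separator can land inside the character, while a lead-byte-aware scan does not. The directory `C:\rec` and the helper names are hypothetical.

```python
def is_lead_byte(b):
    """True if b is the first byte of a two-byte Shift-JIS (cp932) character."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC

def naive_dirname(path):
    """Byte-oriented split: cuts at the last 0x5C byte, even inside a character."""
    return path[:path.rfind(b"\\")]

def mbcs_dirname(path):
    """Lead-byte-aware split: skips the trail byte of every two-byte character."""
    last = -1
    i = 0
    while i < len(path):
        if is_lead_byte(path[i]) and i + 1 < len(path):
            i += 2          # two-byte character: its trail byte is never a separator
            continue
        if path[i] == 0x5C:
            last = i        # a real backslash
        i += 1
    return path[:last] if last >= 0 else b""

# "C:\rec" is a hypothetical directory; the file name is the one from the post.
path = ("C:\\rec\\" + "(テレビ大阪) 2014-07-20 天気予報.ts").encode("cp932")
```

With this input, `mbcs_dirname(path)` returns the real directory, while `naive_dirname(path)` returns a longer, truncated path that ends in the middle of "予". That would be consistent with the wrong directory being opened when saving the project file.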
jmt247
Posts: 28
Joined: Sun Jun 15, 2014 5:46 pm

Re: DGIndexNV 2048 doesn't handle Shift-JIS names

Post by jmt247 »

The source code of this modified DGIndex would help you to fix this issue.

https://skydrive.live.com/?cid=8658EC27 ... 99D5%21230

edit: I found newer versions.
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: DGIndexNV 2048 doesn't handle Shift-JIS names

Post by admin »

Is it Unicode? I don't support Unicode. Sorry.
Aleron Ives
Posts: 126
Joined: Fri May 31, 2013 8:36 pm

Re: DGIndexNV 2048 doesn't handle Shift-JIS names

Post by Aleron Ives »

Shift JIS is one of the older character sets used for Japanese. It predates Unicode and is still very common in Japan and in programs made by Japanese developers.
jmt247
Posts: 28
Joined: Sun Jun 15, 2014 5:46 pm

Re: DGIndexNV 2048 doesn't handle Shift-JIS names

Post by jmt247 »

I don't know much about programming, but the readme in the source says he used a trick with the TCHAR routines and MBCS instead of Unicode. I think Unicode support would help other people who regularly use multibyte characters, though.
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: DGIndexNV 2048 doesn't handle Shift-JIS names

Post by admin »

Which is the first version up there that includes this fix (including his history directory)? I need to minimize all the other changes he made to make diffing easier for me.
jmt247
Posts: 28
Joined: Sun Jun 15, 2014 5:46 pm

Re: DGIndexNV 2048 doesn't handle Shift-JIS names

Post by jmt247 »

admin wrote: Which is the first version up there that includes this fix (including his history directory)? I need to minimize all the other changes he made to make diffing easier for me.
He fixed that in mod 3 and this is the closest that I have. (mod 4)

https://www.mediafire.com/?9w0y9z5ah14dkbd

Thank you very much for your effort. All Asians salute you!
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: DGIndexNV 2048 doesn't handle Shift-JIS names

Post by admin »

I've started a Unicode aware version. It will take some time. I'm using the approach described here, which appears to be the easiest way for me to implement Unicode support:

http://www.utf8everywhere.org/

It retains the existing simple char arrays for internal storage but they are now UTF-8. For all interactions with Windows APIs, they are converted as needed to/from wchar_t. I already have a lot of it implemented, certainly enough to prove the concept. There are a lot of places I need to change (for example, the Boost narrow() function does not work on a multiselect string from an open dialog), and there is a lot of regression testing to do, but at this point it is just cranking the handle.
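
The approach described above can be sketched roughly as follows. This is a Python stand-in, not the author's C/C++ code: in a Windows program the conversions would normally go through `MultiByteToWideChar` and `WideCharToMultiByte` with `CP_UTF8`. The helper names are hypothetical. The multiselect helper assumes the Explorer-style open-dialog buffer layout (directory, then file names, each NUL-terminated, with an extra NUL at the end); that layout is one plausible reason a whole-string narrowing function fails on it, but the thread does not confirm the cause.

```python
def utf8_to_wide(s):
    """UTF-8 bytes (internal storage) -> UTF-16LE bytes (what wide Windows APIs take)."""
    return s.decode("utf-8").encode("utf-16-le")

def wide_to_utf8(w):
    """UTF-16LE bytes (from a wide Windows API) -> UTF-8 bytes (internal storage)."""
    return w.decode("utf-16-le").encode("utf-8")

def split_multiselect(buf):
    """Split a NUL-separated, double-NUL-terminated UTF-16LE buffer into UTF-8 strings.

    Converting the buffer as one string would stop at, or mishandle, the
    embedded NULs, so each piece is converted separately.
    """
    out = []
    for part in buf.decode("utf-16-le").split("\x00"):
        if not part:        # empty piece marks the terminating double NUL
            break
        out.append(part.encode("utf-8"))
    return out
```

For example, `split_multiselect("C:\\dir\0a.ts\0予.ts\0\0".encode("utf-16-le"))` yields the directory followed by the two file names, each as UTF-8 bytes.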

I accept the author's point that in the modern world a program that does not support Unicode is arguably brain dead.
:agree:
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: DGIndexNV 2048 doesn't handle Shift-JIS names

Post by admin »

I've completed Unicode support in DGIndexNV. Not bad for 3 days, given the extensive functionality that was affected (everything touching a Windows API!). ;)

I have to make some minor changes to DGDecodeNV. It has to handle the UTF-8 paths from the AVS script and the DGI file. Then I will give you a beta version of 2049 including this.
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: DGIndexNV 2048 doesn't handle Shift-JIS names

Post by admin »

Oh crap. I get DGDecodeNV coded for Unicode and then try to test it. Surprise! Stupid Avisynth/Avisynth+ won't open a script file with a Unicode name. And even if I rename the script, it kicks out an error if the script is in UTF-8. Bottom line: Avisynth does not support Unicode, so all my work is wasted. I should have thought of that, but I thought Avisynth could and would just pass my UTF-8 file name param as it is -- a weird looking char string. It doesn't have to interpret it. No, it just sees some UTF-8 in the script and says unh-unh. So stupid. Now, I see threads about Avisynth and Unicode out there but they peter out without anything being concluded or done. And anyway, IanB fell off the face of the earth leaving Avisynth licensing in lala land. What a giant cluster truck.

I will look into hacking Avisynth to at least allow a UTF-8 script name and passing filter params without bothering if they are UTF-8, which would allow my source filter to work, but really, don't hold your breaths.

Let me sleep on this; maybe there is a way out.

:evil: :evil: :evil: :evil: :evil: :evil: :evil: :evil: :evil: :evil: :evil: :evil:
Aleron Ives
Posts: 126
Joined: Fri May 31, 2013 8:36 pm

Re: DGIndexNV 2048 doesn't handle Shift-JIS names

Post by Aleron Ives »

It's a real shame that AviSynth is stuck in limbo, as it really seems like there should be an x86-64 version with native multithreading support by now, so developers could update old plugins and write new ones to take full advantage of modern CPUs, but we're still stuck with getting such features in unofficial branches. Thanks for trying to support more languages, at any rate. Maybe you can at least keep the code around in case anyone makes a viable UTF-aware AviSynth fork at some point.
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: DGIndexNV 2048 doesn't handle Shift-JIS names

Post by admin »

That's good to know. I wonder if OP will comment.

One shouldn't have to depend on the system locale setting, however.
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: DGIndexNV 2048 doesn't handle Shift-JIS names

Post by admin »

Ah, it's good to have access to an expert in encodings.

Would you be willing to test my Unicode version 2049 to see if it fixes the display issue you mention? If so, I will make it available to you privately.

And of course it's great to know that Avisynth can work correctly if the locale is set properly.
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: DGIndexNV 2048 doesn't handle Shift-JIS names

Post by admin »

PM coming.

The display issue at least should be fixed. But this version generates UTF-8 scripts and DGI files. Will that be a problem for Avisynth? If so, what encoding should I use for the scripts? I can use whatever I want for the DGI files, I suppose, as Avisynth doesn't look at it.

After resolving that stuff, we can test DGDecodeNV.
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: DGIndexNV 2048 doesn't handle Shift-JIS names

Post by admin »

Did you get the PM? I don't see it going away from my outbox.
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: DGIndexNV 2048 doesn't handle Shift-JIS names

Post by admin »

No, it's not just the Windows title.

Were you able to test?

I understand that I can't use UTF-8 in the script. What should I use to be able to represent the Unicode names?
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: DGIndexNV 2048 doesn't handle Shift-JIS names

Post by admin »

What else did you find?
Messages in MessageBox's, etc. I've fixed all of them. Remember, I do not have the locale set. I try to make everything work without the locale. So we may not see the same things.
All I could test is the DGI and AVS creation which seems fine. Also, the window title displays correctly now.
Good.
As I mentioned before, define "_MBCS" and "MBCS" in your project, and use string routines from tchar.h.
I can't do that because it conflicts with everything I have already done (I have Unicode enabled just to get the right APIs and compiler checks). But I did discover that it is only the BOM in the script that dazzles Avisynth. So I can just not set ccs=UTF-8 on the open but go ahead and write my UTF-8 strings anyway. The only possible downside is that the script may not look right in an editor?

But I can just use MBCS in the script. That I think I can do with the correct open mode. Will Avisynth be OK with a BOM specifying MBCS?
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: DGIndexNV 2048 doesn't handle Shift-JIS names

Post by admin »

Huh? Why not?
Because I am a purist. I think a string should be able to have multiple languages, if that is what the user wants. So I test with names like this:

שלום привет hello.m2ts

The problem is that I can't do anything about Avisynth not being able to open a script with a Unicode name, so the locale is needed to get around that. But I have already coded everything to be non-locale-aware and I see no reason to throw it out.
Noooo! If you remove the BOM Avisynth will assume the script content is Ansi (Shift-JIS, whatever) and choke.
Even if the non-ANSI is only in a filename parameter to a source filter?
Yes. If you want to use Unicode in your program you should use the "WideCharToMultiByte()" API to write strings to the script/dgi.
OK, that's the way I will go. But instead of doing it explicitly like that I only have to use open mode "w,ccs=UTF-16LE". If Avisynth can't handle that, I will try your way.
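
The trade-off in this exchange can be sketched as follows: converting a Unicode name to a single ANSI code page works only when every character exists in that code page. The Python below is a stand-in for a `WideCharToMultiByte()` call, not the actual DGIndexNV code. The helper names are hypothetical, and `cp932` (Japanese) is assumed here as the system code page.

```python
def to_ansi(name, codepage="cp932"):
    """Return the name encoded in the given code page, or None if it does not fit."""
    try:
        return name.encode(codepage)
    except UnicodeEncodeError:
        return None

def to_ansi_lossy(name, codepage="cp932"):
    """Like to_ansi, but substitutes '?' for unmappable characters instead of failing."""
    return name.encode(codepage, errors="replace")

japanese = "(テレビ大阪) 2014-07-20 天気予報.ts"   # file name from the first post
mixed = "שלום привет hello.m2ts"                  # admin's multi-language test name
```

Here `to_ansi(japanese)` succeeds and decodes back to the same name, while `to_ansi(mixed)` returns `None` because the Hebrew letters have no cp932 encoding, and `to_ansi_lossy(mixed)` contains `?` placeholders. That is why a multi-language name cannot survive a trip through a locale-dependent script.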
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: DGIndexNV 2048 doesn't handle Shift-JIS names

Post by admin »

Yes.
But why?

Actually, I just tested it and it doesn't barf.

Also, I am scared to change my locale because I may not be able to switch back because I cannot read Japanese. :o
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: DGIndexNV 2048 doesn't handle Shift-JIS names

Post by admin »

Not at all.

As I said I deleted the BOM before opening the script in Avisynth, and my filter received the call. You can call that wrong testing, but it happened. There is no reason for Avisynth to parse a filename parameter to a source filter.

It's moot, though, because I am going to have an MBCS BOM.
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: DGIndexNV 2048 doesn't handle Shift-JIS names

Post by admin »

I greatly appreciate your help. ;)
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: DGIndexNV 2048 doesn't handle Shift-JIS names

Post by admin »

Thanks for the correction. I just meant that I will send MBCS in the script.

I'll give you the script when I have converted to MBCS from UTF-8. Right now, I have a UTF-8 script and I can make everything work all the way to serving frames to VirtualDub if I delete the BOM from the script. With file mode "w,ccs=UTF-8" a BOM gets added to the script by the runtime. If I leave it, Avisynth barfs. If I delete it, everything works.

Converting to MBCS in the script should solve everything (I hope). I can leave UTF-8 in the DGI file; that doesn't matter.
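
For reference, the BOM in question is the three-byte UTF-8 signature EF BB BF at the start of the file. The sketch below is Python rather than the C runtime call discussed above: it uses the `utf-8-sig` codec to mimic a writer that prepends the BOM, and a hypothetical helper to remove it. The `DGSource(...)` line is only an illustrative script body, not the attached script.

```python
import codecs

def strip_utf8_bom(data):
    """Remove a leading UTF-8 BOM (EF BB BF) if present; otherwise return data unchanged."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

script = 'DGSource("שלום привет hello.dgi")\n'   # illustrative content only

with_bom = script.encode("utf-8-sig")   # BOM + UTF-8 text, like a BOM-writing runtime
without_bom = script.encode("utf-8")    # same text, no BOM
```

After `strip_utf8_bom(with_bom)` the two byte strings are identical, so the only difference between the script that fails and the one that works is those first three bytes.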
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: DGIndexNV 2048 doesn't handle Shift-JIS names

Post by admin »

OK, remember the script is created by my AVS template system. That's another good reason not to use UTF-8 for the script, because a user making a script by hand will not be using UTF-8. :scratch:

See the attachment.
Attachments
שלום привет hello.avs
(160 Bytes) Downloaded 331 times
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: DGIndexNV 2048 doesn't handle Shift-JIS names

Post by admin »

Can you show me a script that Avisynth can open but which has a multibyte filename for the source filter? And how will users make scripts like that?
admin
Posts: 4551
Joined: Thu Sep 09, 2010 3:08 pm

Re: DGIndexNV 2048 doesn't handle Shift-JIS names

Post by admin »

So it's just a char string that is properly mapped by the locale?

In that case, I should just go back and make a few simple changes like you described, as you say everything just worked with one (or maybe a few) display issues. The idea of "UTF-8 Everywhere" is not viable if Avisynth does not support it.