[RESOLVED] DGIndexNV 2048 doesn't handle Shift-JIS names
Hello.
I see some errors when I open a file whose name contains a character with "0x5c" as its second byte.
Please see https://en.wikipedia.org/wiki/Shift_JIS and https://sites.google.com/site/fudist/Ho ... i-jp/table
It opens the wrong directory when I try to save the project file.
I'm certain that previous versions didn't have this problem.
Also, the name shown at the top should be "(テレビ大阪) 2014-07-20 天気予報.ts". "予" is the problematic character.
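For readers unfamiliar with the 0x5c issue, here is a minimal Python sketch of why "予" is a problem. It is not DGIndexNV code. It assumes the name is encoded in cp932 (the Windows Shift-JIS code page), and `naive_dirname` is a hypothetical helper that mimics byte-wise path handling; the `C:\video` directory is made up for illustration.

```python
# Illustrative sketch, not DGIndexNV code. cp932 is the Windows Shift-JIS code page.

def chars_with_5c_trail(name: str, codepage: str = "cp932") -> list[str]:
    """Return the characters whose two-byte encoding ends in byte 0x5c."""
    hits = []
    for ch in name:
        raw = ch.encode(codepage)
        if len(raw) == 2 and raw[1] == 0x5C:
            hits.append(ch)
    return hits


def naive_dirname(path: bytes) -> bytes:
    """Hypothetical byte-wise split: everything before the last 0x5c byte."""
    return path[: path.rfind(b"\\")]


name = "(テレビ大阪) 2014-07-20 天気予報.ts"
path = "C:\\video\\" + name          # hypothetical directory
raw = path.encode("cp932")

print(chars_with_5c_trail(name))     # characters that look like they contain '\'
print(naive_dirname(raw))            # cut in the middle of a character
print(path.rpartition("\\")[0])      # splitting on characters gives the real directory
```

The second byte of "予" has the same value as the ASCII backslash, so code that scans the raw bytes for a path separator cuts the name inside that character, which would be consistent with the wrong directory being opened.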
Re: DGIndexNV 2048 doesn't handle Shift-JIS names
The source code of this modified DGIndex would help you to fix this issue.
https://skydrive.live.com/?cid=8658EC27 ... 99D5%21230
edit: I found newer versions.
Re: DGIndexNV 2048 doesn't handle Shift-JIS names
Is it Unicode? I don't support Unicode. Sorry.
- Aleron Ives
- Posts: 126
- Joined: Fri May 31, 2013 8:36 pm
Re: DGIndexNV 2048 doesn't handle Shift-JIS names
Shift JIS is one of the older character sets used for Japanese. It predates Unicode and is still very common in Japan and in programs made by Japanese developers.
Re: DGIndexNV 2048 doesn't handle Shift-JIS names
I don't know much about programming, but the readme in the source says he used a trick with the TCHAR routines (MBCS) instead of Unicode. I think Unicode support would help other people who regularly use multibyte characters, though.
Re: DGIndexNV 2048 doesn't handle Shift-JIS names
Which is the first version up there that includes this fix (including his history directory)? I need to minimize all the other changes he made to make diffing easier for me.
Re: DGIndexNV 2048 doesn't handle Shift-JIS names
admin wrote: Which is the first version up there that includes this fix (including his history directory)? I need to minimize all the other changes he made to make diffing easier for me.
He fixed that in mod 3, and this is the closest that I have (mod 4).
https://www.mediafire.com/?9w0y9z5ah14dkbd
Thank you very much for your effort. All Asians salute you!
Re: DGIndexNV 2048 doesn't handle Shift-JIS names
I've started a Unicode aware version. It will take some time. I'm using the approach described here, which appears to be the easiest way for me to implement Unicode support:
http://www.utf8everywhere.org/
It retains the existing simple char arrays for internal storage but they are now UTF-8. For all interactions with Windows APIs, they are converted as needed to/from wchar_t. I already have a lot of it implemented, certainly enough to prove the concept. There are a lot of places I need to change (for example, the Boost narrow() function does not work on a multiselect string from an open dialog), and there is a lot of regression testing to do, but at this point it is just cranking the handle.
I accept the author's point that in the modern world a program that does not support Unicode is arguably brain dead.
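The approach described in this post (UTF-8 internally, wide strings only at the Windows API boundary) can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the actual DGIndexNV code: `widen`, `narrow`, and `narrow_multiselect` are hypothetical helper names, and the buffer layout assumed for a multiselect open dialog is the documented one (entries separated by NUL, terminated by a double NUL), which is why a conversion that stops at the first NUL is not enough.

```python
# Illustrative sketch of converting at the API boundary. Helper names are hypothetical.

def widen(utf8: bytes) -> bytes:
    """UTF-8 (internal storage) -> UTF-16LE (what wide Windows APIs take)."""
    return utf8.decode("utf-8").encode("utf-16-le")


def narrow(utf16le: bytes) -> bytes:
    """UTF-16LE (returned by wide Windows APIs) -> UTF-8 (internal storage)."""
    return utf16le.decode("utf-16-le").encode("utf-8")


def narrow_multiselect(buf: bytes) -> list[bytes]:
    """Split a NUL-separated, double-NUL-terminated UTF-16LE buffer into UTF-8 parts."""
    text = buf.decode("utf-16-le")
    end = text.find("\0\0")
    if end != -1:
        text = text[:end]            # drop the terminator and anything after it
    return [part.encode("utf-8") for part in text.split("\0") if part]


internal = "天気予報.ts".encode("utf-8")
assert narrow(widen(internal)) == internal   # the round trip is lossless

# Hypothetical dialog result: a directory followed by two selected files.
dialog = "C:\\video\0a.ts\0予報.ts\0\0".encode("utf-16-le")
print(narrow_multiselect(dialog))
```

Converting each entry separately keeps the internal representation as plain UTF-8 byte strings while still handling the embedded NULs.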
Re: DGIndexNV 2048 doesn't handle Shift-JIS names
I've completed Unicode support in DGIndexNV. Not bad for 3 days, given the extensive functionality that was affected (everything touching a Windows API!).
I have to make some minor changes to DGDecodeNV. It has to handle the UTF-8 paths from the AVS script and the DGI file. Then I will give you a beta version of 2049 including this.
Re: DGIndexNV 2048 doesn't handle Shift-JIS names
Oh crap. I get DGDecodeNV coded for Unicode and then try to test it. Surprise! Stupid Avisynth/Avisynth+ won't open a script file with a Unicode name. And even if I rename the script, it kicks out an error if the script is in UTF-8. Bottom line: Avisynth does not support Unicode, so all my work is wasted. I should have thought of that, but I thought Avisynth could and would just pass my UTF-8 file name param as it is -- a weird looking char string. It doesn't have to interpret it. No, it just sees some UTF-8 in the script and says unh-unh. So stupid. Now, I see threads about Avisynth and Unicode out there but they peter out without anything being concluded or done. And anyway, IanB fell off the face of the earth leaving Avisynth licensing in lala land. What a giant cluster truck.
I will look into hacking Avisynth to at least allow a UTF-8 script name and passing filter params without bothering if they are UTF-8, which would allow my source filter to work, but really, don't hold your breaths.
Let me sleep on this; maybe there is a way out.
- Aleron Ives
- Posts: 126
- Joined: Fri May 31, 2013 8:36 pm
Re: DGIndexNV 2048 doesn't handle Shift-JIS names
It's a real shame that AviSynth is stuck in limbo, as it really seems like there should be an x86-64 version with native multithreading support by now, so developers could update old plugins and write new ones to take full advantage of modern CPUs, but we're still stuck with getting such features in unofficial branches. Thanks for trying to support more languages, at any rate. Maybe you can at least keep the code around in case anyone makes a viable UTF-aware AviSynth fork at some point.
Re: DGIndexNV 2048 doesn't handle Shift-JIS names
That's good to know. I wonder if OP will comment.
One shouldn't have to depend on the system locale setting, however.
Re: DGIndexNV 2048 doesn't handle Shift-JIS names
Ah, it's good to have access to an expert in encodings.
Would you be willing to test my Unicode version 2049 to see if it fixes the display issue you mention? If so, I will make it available to you privately.
And of course it's great to know that Avisynth can work correctly if the locale is set properly.
Re: DGIndexNV 2048 doesn't handle Shift-JIS names
PM coming.
The display issue at least should be fixed. But this version generates UTF-8 scripts and DGI files. Will that be a problem for Avisynth? If so, what encoding should I use for the scripts? I can use whatever I want for the DGI files, I suppose, as Avisynth doesn't look at it.
After resolving that stuff, we can test DGDecodeNV.
Re: DGIndexNV 2048 doesn't handle Shift-JIS names
Did you get the PM? I don't see it going away from my outbox.
Re: DGIndexNV 2048 doesn't handle Shift-JIS names
No, it's not just the Windows title.
Were you able to test?
I understand that I can't use UTF-8 in the script. What should I use to be able to represent the Unicode names?
Re: DGIndexNV 2048 doesn't handle Shift-JIS names
Quote: What else did you find?
Messages in MessageBox's, etc. I've fixed all of them. Remember, I do not have the locale set. I try to make everything work without the locale. So we may not see the same things.
Quote: All I could test is the DGI and AVS creation which seems fine. Also, the window title displays correctly now.
Good.
Quote: As I mentioned before, define "_MBCS" and "MBCS" in your project, and use string routines from tchar.h.
I can't do that because it conflicts with everything I have already done (I have Unicode enabled just to get the right APIs and compiler checks). But I did discover that it is only the BOM in the script that dazzles Avisynth. So I can just not set ccs=UTF-8 on the open but go ahead and write my UTF-8 strings anyway. The only possible downside is that the script may not look right in an editor?
But I can just use MBCS in the script. That I think I can do with the correct open mode. Will Avisynth be OK with a BOM specifying MBCS?
Re: DGIndexNV 2048 doesn't handle Shift-JIS names
Quote: Huh? Why not?
Because I am a purist. I think a string should be able to have multiple languages, if that is what the user wants. So I test with names like this:
שלום привет hello.m2ts
The problem is that I can't do anything about Avisynth not being able to open a script with a Unicode name, so the locale is needed to get around that. But I have already coded everything to be non-locale-aware and I see no reason to throw it out.
Quote: Noooo! If you remove the BOM Avisynth will assume the script content is ANSI (Shift-JIS, whatever) and choke.
Even if the non-ANSI is only in a filename parameter to a source filter?
Quote: Yes. If you want to use Unicode in your program you should use the "WideCharToMultiByte()" API to write strings to the script/dgi.
OK, that's the way I will go. But instead of doing it explicitly like that I only have to use open mode "w,ccs=UTF-16LE". If Avisynth can't handle that, I will try your way.
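The tension in this post, between converting names to the locale's code page and supporting names that mix several languages, can be illustrated with a small Python sketch. Here `str.encode` with a code page name stands in for what a "WideCharToMultiByte()" conversion would have to do; `fits_code_page` is a hypothetical helper, not part of any DG tool.

```python
# Illustrative sketch: can a file name survive conversion to one ANSI code page?

def fits_code_page(name: str, codepage: str) -> bool:
    """True if every character of `name` exists in the given code page."""
    try:
        name.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False


japanese = "(テレビ大阪) 2014-07-20 天気予報.ts"
mixed = "שלום привет hello.m2ts"

for cp in ("cp932", "cp1255", "utf-8"):
    print(cp, fits_code_page(japanese, cp), fits_code_page(mixed, cp))
```

A single-language name fits the matching code page, but the mixed Hebrew/Russian test name does not fit either the Japanese (cp932) or the Hebrew (cp1255) code page, so any script written in the locale's encoding can only carry names that the locale itself can represent.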
Re: DGIndexNV 2048 doesn't handle Shift-JIS names
Quote: Yes.
But why?
Actually, I just tested it and it doesn't barf.
Also, I am scared to change my locale because I may not be able to switch back because I cannot read Japanese.
Re: DGIndexNV 2048 doesn't handle Shift-JIS names
Not at all.
As I said I deleted the BOM before opening the script in Avisynth, and my filter received the call. You can call that wrong testing, but it happened. There is no reason for Avisynth to parse a filename parameter to a source filter.
It's moot, though, because I am going to have an MBCS BOM.
Re: DGIndexNV 2048 doesn't handle Shift-JIS names
I greatly appreciate your help.
Re: DGIndexNV 2048 doesn't handle Shift-JIS names
Thanks for the correction. I just meant that I will send MBCS in the script.
I'll give you the script when I have converted to MBCS from UTF-8. Right now, I have a UTF-8 script and I can make everything work all the way to serving frames to VirtualDub if I delete the BOM from the script. With file mode "w,ccs=UTF-8" a BOM gets added to the script by the runtime. If I leave it, Avisynth barfs. If I delete it, everything works.
Converting to MBCS in the script should solve everything (I hope). I can leave UTF-8 in the DGI file; that doesn't matter.
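As a rough illustration of the BOM behaviour described in this post, the Python sketch below writes the same one-line script with and without a UTF-8 BOM and strips the BOM from the first. It is an analogy rather than the C runtime itself: Python's "utf-8-sig" codec plays the role of the "ccs=UTF-8" open mode, and `SomeSource` is a hypothetical source filter name used only as a placeholder.

```python
# Illustrative sketch: the UTF-8 BOM is three bytes at the very start of the file.
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return `data` without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical one-line script; SomeSource is a placeholder filter name.
script = 'SomeSource("שלום привет hello.dgi")\n'

with_bom = script.encode("utf-8-sig")   # BOM prepended, as a BOM-writing open mode does
without_bom = script.encode("utf-8")    # same text, no BOM

print(with_bom[:3].hex())
print(strip_utf8_bom(with_bom) == without_bom)
```

Apart from the three leading bytes the two files are identical, which matches the observation that deleting the BOM is enough to change how the script is received.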
Re: DGIndexNV 2048 doesn't handle Shift-JIS names
OK, remember the script is created by my AVS template system. That's another good reason not to use UTF-8 for the script, because a user making a script by hand will not be using UTF-8.
See the attachment.
- Attachments
- שלום привет hello.avs (160 Bytes)
Re: DGIndexNV 2048 doesn't handle Shift-JIS names
Can you show me a script that Avisynth can open but which has a multibyte filename for the source filter? And how will users make scripts like that?
Re: DGIndexNV 2048 doesn't handle Shift-JIS names
So it's just a char string that is properly mapped by the locale?
In that case, I should just go back and make a few simple changes like you described, as you say everything just worked with one (or maybe a few) display issues. The idea of "UTF-8 Everywhere" is not viable if Avisynth does not support it.