Re: ASCII and JIS X 0201 Roman - the backslash problem (msg#00075, internationalization.linux)
Tomohiro KUBOTA writes:

> > 3) For programs that interpret backslash as some kind of escape character
> > and use Unicode internally but should work with text in Shift_JIS
> > encoding, consider the multibyte character 0x5C as being the escape
> > trigger, not [only] the Unicode character U+005C. This is already done
> > in bash and gettext. For example, in GNU gettext, we have the code
>
> I think interpretation of
> U+00A5 as an additional escape character doesn't always work, because
> Unicode texts don't have information on their origin (converted from
> Shift_JIS or not).

These are particular kinds of text files, the ones fed to programs that
do backslash interpretation: shell scripts, awk scripts, gettext PO
files, etc. And yes, if the Yen sign should appear there, it needs to be
doubled.

> If U+00A5 would always be an escape character,
> it would be harmful for many softwares.

Why is it more harmful if U+00A5 is an escape character than if U+005C
is an escape character? In both cases you just double it to get the
original character.

> I am interested in how European people succeeded to migrate from ISO 646
> variants into ISO 8859. Yen Sign Problem is exactly a problem of ISO 646,
> because "0x5c = YEN SIGN" comes from JIS X 0201 Roman, which is Japanese
> variant of ISO 646.

For me, the migration occurred when I switched to a different computer
with a different OS and a different character set (from ISO646-DE to
CP437 at that time). Few files were transported: there are usually a lot
of text files that you can simply drop every three years or so. Among
the remaining ones, the disambiguation was usually easy and depended on
the type of file. In letters I used only umlauts and no brackets,
whereas in programs I mostly used brackets and no umlauts. Only a few
programs contained both brackets and umlauts, and those I had to fix up
by hand, usually the next time I needed the particular program.

So it is a minor annoyance over the course of a few months, but nowhere
near the cost that you are estimating.

Bruno
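
The two ideas discussed in this message, treating only a whole single-byte 0x5C character as the escape trigger in Shift_JIS text, and doubling that character to keep it literal, can be illustrated with a rough sketch. This is not the gettext code referred to in the quoted text, which is not reproduced here. The function names are hypothetical, and the lead-byte ranges (0x81-0x9F and 0xE0-0xFC) are an assumption that covers common Shift_JIS variants.

```python
def is_sjis_lead_byte(b: int) -> bool:
    """True if b starts a double-byte character (assumed Shift_JIS ranges)."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def find_escape_bytes(data: bytes) -> list:
    """Return offsets of 0x5C bytes that are whole single-byte characters.

    A 0x5C that is the trail byte of a double-byte character is skipped,
    so it is never mistaken for an escape trigger.
    """
    offsets = []
    i = 0
    while i < len(data):
        b = data[i]
        if is_sjis_lead_byte(b):
            i += 2  # skip lead byte and trail byte together
            continue
        if b == 0x5C:
            offsets.append(i)
        i += 1
    return offsets


def double_escapes(data: bytes) -> bytes:
    """Double every single-byte 0x5C so it survives backslash interpretation."""
    out = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        if is_sjis_lead_byte(b):
            out += data[i:i + 2]  # copy the double-byte character unchanged
            i += 2
            continue
        out.append(b)
        if b == 0x5C:
            out.append(b)  # emit the escape character a second time
        i += 1
    return bytes(out)


# 0x95 0x5C is one double-byte character whose trail byte happens to be 0x5C;
# the 0x5C that follows it is a real single-byte character.
sample = b"\x95\x5c" + b"\x5c" + b"n"
print(find_escape_bytes(sample))
print(double_escapes(sample))
```

A byte-wise search for 0x5C would report two hits in `sample`; walking the text character by character reports only the second one, which is the point of item 3) in the quoted text.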