Re: Shift-JIS/Unicode mapping in JAVA: msg#00739

text.unicode.general

Subject: Re: Shift-JIS/Unicode mapping in JAVA

Most probably, Sun replaced its conversion tables with ICU's, and the ICU
tables contain this bug, which was not present in Sun's earlier tables for
MS-CP932. So either the source of the data is now different, or there is an
alias problem with the MS-CP932 encoding name.
Submit this bug to Sun (and probably also to IBM's ICU project) so that it can
be corrected...
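
To check the alias theory, a small probe along these lines might help. It is
only a sketch: the class name Cp932Probe and its helpers are made up for
illustration, and the names "Shift_JIS", "windows-31j" and "MS932" are
assumptions about what a given JRE registers, so each lookup is guarded with
Charset.isSupported. It prints the canonical name and aliases that each name
resolves to, plus the code unit that each of the four byte pairs decodes to.

```java
import java.nio.charset.Charset;

class Cp932Probe {
    // The four Shift-JIS double-byte values from the original report.
    static final int[] SAMPLES = {0x815C, 0x8160, 0x8161, 0x817C};

    // Decode one double-byte value and return the first UTF-16 code unit.
    // Uses String(byte[], Charset), which needs Java 6 or later; on older
    // JREs the String(byte[], String) constructor would be used instead.
    static int decode(int doubleByte, Charset cs) {
        byte[] bytes = {(byte) (doubleByte >> 8), (byte) doubleByte};
        return new String(bytes, cs).charAt(0);
    }

    // Describe how a charset name resolves and how it maps the samples.
    static String report(String name) {
        if (!Charset.isSupported(name)) {
            return name + ": not supported by this JRE";
        }
        Charset cs = Charset.forName(name);
        StringBuilder sb = new StringBuilder();
        sb.append(name).append(" resolves to ").append(cs.name())
          .append(", aliases ").append(cs.aliases());
        for (int sample : SAMPLES) {
            sb.append(String.format("%n  0x%04X -> U+%04X", sample, decode(sample, cs)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed names; a JRE may know only some of them.
        for (String name : new String[] {"Shift_JIS", "windows-31j", "MS932"}) {
            System.out.println(report(name));
        }
    }
}
```

If the name the program passes resolves to a different canonical charset on
the two JRE versions, that would point to the alias rather than to the table
itself.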

This is really a regression, unless Microsoft has changed its MS-CP932 to
better support the new JIS standard based on the unification of the Han script
in Windows XP, .NET, and Windows 2003...

In that case, Microsoft has corrected its codepage without registering a new
codepage (and the fault lies with Microsoft).
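
Until the mapping is corrected, one possible stopgap is to fold the four
characters back to the values listed for JRE 1.3.1 after decoding. The sketch
below assumes that this is acceptable for the application; the class and
method names are hypothetical, and the remapping also rewrites any U+2014,
U+301C, U+2016 or U+2212 that legitimately came from another source.

```java
class Cp932Compat {
    // Map the four code points reported for JRE 1.4.1 back to the ones
    // reported for JRE 1.3.1 (the Microsoft-style mapping in the report).
    static String toMicrosoftStyle(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '\u2014': sb.append('\u2015'); break; // EM DASH -> HORIZONTAL BAR
                case '\u301C': sb.append('\uFF5E'); break; // WAVE DASH -> FULLWIDTH TILDE
                case '\u2016': sb.append('\u2225'); break; // DOUBLE VERTICAL LINE -> PARALLEL TO
                case '\u2212': sb.append('\uFF0D'); break; // MINUS SIGN -> FULLWIDTH HYPHEN-MINUS
                default:       sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String decoded = "A\u301CB";               // as if decoded by JRE 1.4.1
        String fixed = toMicrosoftStyle(decoded);  // wave dash folded to fullwidth tilde
        System.out.println(Integer.toHexString(fixed.charAt(1)));
    }
}
```

This only changes what is displayed; text written back out to Shift-JIS would
need the reverse mapping, which is not shown here.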

-- Philippe.
----- Original Message -----
From: "Jane Liu" <xjliu_ca@xxxxxxxxx>
To: <unicode@xxxxxxxxxxx>
Sent: Wednesday, May 28, 2003 9:36 PM
Subject: Shift-JIS/Unicode mapping in JAVA


> Hi,
>
> I am running a Java program on a Japanese Windows 2000 system, looking
> at the Unicode conversion of the following four characters from
> Shift-JIS encoding (MS-CP932) in both JRE 1.3.1 and JRE 1.4.1, and
> noticed some interesting changes:
>
> JRE 1.3.1 converts them the same way Microsoft does:
>
> 0x815C (―) -> U+2015 (―) Horizontal Bar
> 0x8160 (～) -> U+FF5E (～) Full-width Tilde
> 0x8161 (∥) -> U+2225 (∥) Parallel To
> 0x817C (－) -> U+FF0D (－) Full-width Hyphen
>
> JRE 1.4.1 converts them the same way ICU does:
>
> 0x815C (―) -> U+2014 (-) EM Dash
> 0x8160 (～) -> U+301C (〜) Wave Dash
> 0x8161 (∥) -> U+2016 (‖) Double Vertical Line
> 0x817C (－) -> U+2212 (−) Minus Sign
>
> Obviously, this causes some backward-compatibility and forward-migration
> issues here. The program is exactly the same. Those four Japanese
> characters used to work perfectly with the older JRE version
> 1.3.1. However, now that we have moved up to JRE 1.4.1, three of the four
> characters are displayed differently, and the fourth, the "Double
> Vertical Line", becomes a dot in the UI because U+2016 is not defined
> in the Japanese TrueType font "MS Gothic".
>
> Can someone help, please? Why did Sun make such changes? To me, it is hard
> to believe this is just a mistake in the new mapping table. If Sun
> does have good reasons, they may also require some code changes;
> would this be true? Where and how should I change my code?
>
> Thanks.
>
> Jane



This mailing list is just an archive. The instructions to join the true Unicode
List are on http://www.unicode.org/unicode/consortium/distlist.html

