
Unicode Normalization on MS-Windows: msg#00297

text.unicode.devel

Subject: Unicode Normalization on MS-Windows

Dear Unicoders,

I am using IBM ICU V1.8 for some testing on Windows 2000 and XP, and I
found that when I process some CJK characters, ICU normalizes them by
default. For example, U+FA19 (CJK COMPATIBILITY IDEOGRAPH-FA19) is
replaced by U+795E (神). However, if I save those two characters to a
file on Windows 2000 or XP using Notepad, with "Unicode" selected as the
encoding, Notepad does not appear to do any such normalization or
replacement. I can also use both characters in file and folder names on
the Windows file system, and the OS does not seem to normalize them
either ...
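
To make the replacement concrete, here is a minimal sketch of the behaviour I mean. It uses Python's standard unicodedata module rather than ICU (an assumption made only for illustration; ICU's own API is not shown here), and the helper name code_points is hypothetical.

```python
import unicodedata

compat = "\uFA19"   # CJK COMPATIBILITY IDEOGRAPH-FA19
unified = "\u795E"  # CJK UNIFIED IDEOGRAPH-795E


def code_points(s):
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in s)


# Show what each standard normalization form does to U+FA19.
for form in ("NFC", "NFD", "NFKC", "NFKD"):
    normalized = unicodedata.normalize(form, compat)
    print(form, code_points(compat), "->", code_points(normalized))

# The raw strings differ, but they compare equal once normalized.
print(compat == unified)
print(unicodedata.normalize("NFC", compat) == unified)
```

A program that writes the raw string and one that writes the normalized string therefore end up with different code points, which is the mismatch I see between ICU and Notepad or the file system.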

Can anyone please shed some light on:

1. Why doesn't Windows do normalization, and is there any way to ask
Windows to do it?

2. If Windows never does normalization, how should I handle this in my
Windows-based application, given that I am using ICU? I don't think
simply turning off normalization in ICU would be a good idea. However,
if I keep using ICU to normalize everything in my application, I may
run into trouble when dealing with the Windows system ...

Thanks

Jane




