Re: [OT] multilingual support in MS products (was Re: Kurdish ghayn): msg#00337 (text.unicode.devel)
Wait just a second. The IJ digraph was added for compatibility with other standards, not necessarily because it is really needed for Dutch. Unicode does not, in general, encode graphemes, except for compatibility purposes; "ch" for Spanish and Slovak, for example, are not encoded.

Given the mass of data in Dutch that already use "i" + "j" to encode that grapheme, adding the "ij" character will just confuse matters. When editing a mixture of such text, search/replace will not identify the two; users will sometimes have to hit one backspace to delete what appears to be two characters, sometimes hit two backspaces, etc. Bad idea.

The only concrete thing I have heard is that when titlecasing Dutch, "i" + "j" at the start of a word should be titlecased as "I" + "J", not as "I" + "j". For that, one would request a change to SpecialCasing.txt in the Unicode Character Database for the next version of Unicode. Kent Karlson proposed this some time back; it may be time to revisit it, but we would need a proposal for the next UTC.

Märk Davis
________
mark.davis@xxxxxxxxx
IBM, MS 50-2/B11, 5600 Cottle Rd, SJ CA 95193
(408) 256-3148
fax: (408) 256-0799

----- Original Message -----
From: "Thomas Milo" <t.milo@xxxxxxxxx>
To: "John Hudson" <tiro@xxxxxxxx>
Cc: "Chris Pratley" <chrispr@xxxxxxxxxxxxxxxxxxxxxx>; <Bob_Hallissy@xxxxxxx>; <unicore@xxxxxxxxxxx>; <unicode@xxxxxxxxxxx>; "Gerard Unger" <ungerard@xxxxxx>
Sent: Sunday, April 27, 2003 12:18
Subject: Re: [OT] multilingual support in MS products (was Re: Kurdish ghayn)

> Hi John,
>
> At 02:49 AM 4/27/2003, Thomas Milo wrote:
>
> > >Would it be possible to make the IJ/ij available at last as a single
> > >character IJ/ij for Dutch users? MS Office seems to be unaware of this
> > >character (apart from correct shifting between upper and lower case). A
> > >spell check of Ĳstijd (correct Unicode) vs. IJstijd (improvised ASCII)
> > >approves of the - erroneous! - ASCII form and does not even recognize the
> > >horrendous misspelling Ijstijd.
> > >
> > >A web search of the Dutch word IJstijd (Ice Age) indicates that the use
> > >of this essential character is still practically zero.
> >
> > Whenever I've asked Dutch colleagues (type designers and typographers)
> > about the IJ/ij characters they've always expressed amazement that these
> > characters exist and most reject the need for them. 'Just use I and J'
> > seems to be the usual response. Tom is the only Dutch colleague I've ever
> > heard express support for the use of these characters. It is true that
> > there are special rules for how the letters I and J in combination should
> > be typeset in Dutch, but the same is true of lots of digraphs in German and
> > other languages that are not encoded as distinct characters and will not
> > be. I'm far from convinced that the IJ/ij characters are necessary or that
> > their use should be encouraged.
>
> No Dutchman - whether he is involved in type or not - can be amazed by the
> existence of IJ. If his name happened to begin with IJ, he would not be able
> to look up his own name in a telephone directory. With no exception IJ is
> taught in all schools as part of our handwriting as a ligature - just
> checked with my daughter. I called Gerard Unger about it and he pointed out
> that IJ is surrounded by a certain ambivalence: dictionaries list it either
> with I or with Y. The latter is enough to grant it graphemic status. And -
> like anybody would - he agrees that it capitalizes as one letter. As for
> your typographer friends, they mean: just compose it out of I and J (still
> in the streets of the Netherlands one frequently observes Ü with the left
> leg broken: the ligature of I and J). But this is all talk about glyphs.
> Unicode deals with graphemes, and there IJ is already recognized as such.
>
> IJ as a character is part and parcel of Dutch orthography and included in
> the Unicode Standard at the request of the Netherlands Standardisation
> Committee. There is no need to ask approval to use IJ/ij - the only point I
> am making is, that we still don't have a convenient way of entering it.
>
> Graphemically the use of IJ involves no complex rules at all. In Dutch ALL
> combinations of letters I and J - with extremely rare exceptions in foreign
> words like "bijoux", consequently corrupted into byoux by weak spellers -
> are instances of the single grapheme IJ. As a result, the common hack to
> type capital IJ with two upper case characters causes problems with spellers
> and grammar checkers. Moreover it leads to spelling and sorting errors (in
> some dictionaries and all telephone directories IJ mix with Y, but the hack
> moves it to I); automatic capitalization produces a revolting Ij, in rotated
> text IJ come apart, etc. etc.
>
> There is no need to put up with this hack: the Unicode Standard provides the
> correct solution and the industry obliged itself to implement it.
>
> t
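
For anyone who wants to poke at the two technical points raised in this message (search not equating the single character with the letter pair, and Dutch titlecasing of a word-initial "i" + "j"), here is a rough Python sketch. It is only an illustration under stated assumptions: `fold_ij` and `dutch_titlecase_word` are hypothetical helper names, the titlecase rule is applied by hand rather than taken from SpecialCasing.txt, and it relies only on the compatibility decompositions of U+0132 and U+0133 exposed by the standard `unicodedata` module.

```python
import unicodedata

# U+0132 LATIN CAPITAL LIGATURE IJ and U+0133 LATIN SMALL LIGATURE IJ
IJ_CHARS = "\u0132\u0133"


def fold_ij(text: str) -> str:
    """Replace the single IJ/ij characters with their two-letter
    compatibility decompositions (I+J, i+j), leaving all else untouched.
    Hypothetical helper, not part of any library."""
    return "".join(
        unicodedata.normalize("NFKC", ch) if ch in IJ_CHARS else ch
        for ch in text
    )


def dutch_titlecase_word(word: str) -> str:
    """Titlecase one word, treating a word-initial i+j as a unit.
    Assumption: the rule is hand-coded here for illustration only."""
    lower = fold_ij(word).lower()
    if lower.startswith("ij"):
        return "IJ" + lower[2:]
    return lower[:1].upper() + lower[1:]


# A plain comparison does not identify the two encodings...
print("\u0132stijd" == "IJstijd")            # False
# ...but it does once both sides are folded to the letter pair.
print(fold_ij("\u0132stijd") == "IJstijd")   # True

# One code point versus two, which is why backspace behaves differently.
print(len("\u0133"), len("ij"))              # 1 2

# Default titlecasing versus the Dutch convention.
print("ijstijd".title())                     # Ijstijd
print(dutch_titlecase_word("ijstijd"))       # IJstijd
```

Folding to the letter pair rather than to the single character is a design choice made here only because, as noted above, most existing Dutch data already uses "i" + "j"; folding the other way would need the same care with the rare foreign-word exceptions mentioned in the quoted message.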