text.unicode.devel

Subject: Re: [OT] multilingual support in MS products (was Re: Kurdish ghayn)

Wait just a second. The IJ digraph was added for compatibility with other
standards, not necessarily because it is really needed for Dutch. Unicode
does not, in general, encode graphemes, except for compatibility purposes;
"ch" for Spanish and Slovak, for example, is not encoded.

Given the mass of data in Dutch that already uses "i" + "j" to encode that
grapheme, adding the "ij" character will just confuse matters. When editing
a mixture of such text, search/replace will not treat the two as identical;
users will sometimes have to hit one backspace to delete what appears to be
two characters, sometimes two backspaces, etc. Bad idea.
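
To make the search problem concrete, here is a minimal sketch in Python. It
assumes the application is willing to fold text with NFKC before comparing;
the helper name fold_ij is made up for illustration. NFKC also folds many
other compatibility characters, so a real product might prefer a narrower
mapping of U+0132/U+0133 only.

```python
import unicodedata

def fold_ij(text: str) -> str:
    # Hypothetical helper: NFKC maps U+0132 to "IJ" and U+0133 to "ij",
    # along with every other compatibility character in the text.
    return unicodedata.normalize("NFKC", text)

precomposed = "\u0132stijd"   # starts with the single character U+0132
two_letters = "IJstijd"       # starts with plain "I" + "J"

# A naive comparison (or search) treats the two spellings as different.
naive_match = precomposed == two_letters

# After folding, both spellings compare equal.
folded_match = fold_ij(precomposed) == fold_ij(two_letters)

# The lengths differ too, which is why one backspace deletes the
# precomposed letter while the two-letter form needs two.
lengths = (len(precomposed), len(two_letters))
```

Folding both the text and the search pattern this way hides the difference
from the user, but it does not remove the mixed data underneath.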

The only concrete thing I have heard is that when titlecasing Dutch, "i" + "j"
at the start of a word should be titlecased as "I" + "J", not as "I" + "j".
For that, one would request a change to SpecialCasing.txt in the Unicode
Character Database for the next version of Unicode. Kent Karlsson proposed
this some time back; it may be time to revisit it, but we would need a
proposal for the next UTC.
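
As a rough illustration of that titlecasing rule, here is a small Python
sketch. The function name titlecase_dutch_word is hypothetical, and the rule
is hard-coded here rather than read from SpecialCasing.txt; it only handles
a single word that is already known to be Dutch.

```python
def titlecase_dutch_word(word: str) -> str:
    # Assumption: the caller already knows the word is Dutch.
    # A leading "i" + "j" (in any case) is titlecased as "IJ".
    if word[:2].lower() == "ij":
        return "IJ" + word[2:].lower()
    # Otherwise, default behaviour: first letter upper, rest lower.
    return word[:1].upper() + word[1:].lower()

# Language-neutral titlecasing, for comparison; this yields the
# "I" + "j" form that is wrong for Dutch.
default_result = "ijstijd".title()

# Dutch-aware result for the same word.
dutch_result = titlecase_dutch_word("ijstijd")
```

Because the rule depends on the language, not the characters, it has to be
conditioned on Dutch; applying it to all text would break other languages.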

Mark Davis
________
mark.davis@xxxxxxxxx
IBM, MS 50-2/B11, 5600 Cottle Rd, SJ CA 95193
(408) 256-3148
fax: (408) 256-0799

----- Original Message -----
From: "Thomas Milo" <t.milo@xxxxxxxxx>
To: "John Hudson" <tiro@xxxxxxxx>
Cc: "Chris Pratley" <chrispr@xxxxxxxxxxxxxxxxxxxxxx>; <Bob_Hallissy@xxxxxxx>;
<unicore@xxxxxxxxxxx>;
<unicode@xxxxxxxxxxx>; "Gerard Unger" <ungerard@xxxxxx>
Sent: Sunday, April 27, 2003 12:18
Subject: Re: [OT] multilingual support in MS products (was Re: Kurdish ghayn)


> Hi John,
>
> At 02:49 AM 4/27/2003, Thomas Milo wrote:
>
> > >Would it be possible to make the IJ/ij available at last as a single
> > >character IJ/ij for Dutch users? MS Office seems to be unaware of this
> > >character (apart from correct shifting between upper and lower case). A
> > >spell check of Ĳstijd (correct Unicode) vs. IJstijd (improvised ASCII)
> > >approves of the - erroneous! - ASCII form and does not even recognize the
> > >horrendous misspelling Ijstijd.
> > >
> > >A web search of the Dutch word IJstijd (Ice Age) indicates that the use
> > >of this essential character is still practically zero.
> >
> > Whenever I've asked Dutch colleagues (type designers and typographers)
> > about the IJ/ij characters they've always expressed amazement that these
> > characters exist and most reject the need for them. 'Just use I and J'
> > seems to be the usual response. Tom is the only Dutch colleague I've ever
> > heard express support for the use of these characters. It is true that
> > there are special rules for how the letters I and J in combination should
> > be typeset in Dutch, but the same is true of lots of digraphs in German and
> > other languages that are not encoded as distinct characters and will not
> > be. I'm far from convinced that the IJ/ij characters are necessary or that
> > their use should be encouraged.
>
> No Dutchman - whether he is involved in type or not - can be amazed by the
> existence of IJ. If his name happened to begin with IJ, he would not be able
> to look up his own name in a telephone directory. Without exception, IJ is
> taught in all schools as part of our handwriting as a ligature - just
> checked with my daughter. I called Gerard Unger about it and he pointed out
> that IJ is surrounded by a certain ambivalence: dictionaries list it either
> with I or with Y. The latter is enough to grant it graphemic status. And -
> like anybody would - he agrees that it capitalizes as one letter. As for
> your typographer friends, they mean: just compose it out of I and J (still
> in the streets of the Netherlands one frequently observes Ü with the left
> leg broken: the ligature of I and J). But this is all talk about glyphs.
> Unicode deals with graphemes, and there IJ is already recognized as such.
>
> IJ as a character is part and parcel of Dutch orthography and included in
> the Unicode Standard at the request of the Netherlands Standardisation
> Committee. There is no need to ask approval to use IJ/ij - the only point I
> am making is that we still don't have a convenient way of entering it.
>
> Graphemically the use of IJ involves no complex rules at all. In Dutch ALL
> combinations of letters I and J - with extremely rare exceptions in foreign
> words like "bijoux", consequently corrupted into byoux by weak spellers -
> are instances of the single grapheme IJ. As a result, the common hack to
> type capital IJ with two upper case characters causes problems with spellers
> and grammar checkers. Moreover it leads to spelling and sorting errors (in
> some dictionaries and all telephone directories IJ mix with Y, but the hack
> moves it to I); automatic capitalization produces a revolting Ij, in rotated
> text IJ come apart, etc. etc.
>
> There is no need to put up with this hack: the Unicode Standard provides the
> correct solution and the industry obliged itself to implement it.
>
> t
>
>
>