
Grapheme clusters, a.k.a. real characters

On Wednesday, July 19, 2017 at 3:00:21 AM UTC+5:30, Marko Rauhamaa wrote:
> Chris Angelico :
> > Let me give you one concrete example: the letter "ö". In English, it
> > is (very occasionally) used to indicate diaeresis, where a pair of
> > letters is not a double letter - for example, "coöperate". (You can
> > also hyphenate, "co-operate".) In German, it is the letter "o" with a
> > pronunciation mark (umlaut), and is considered the same letter as "o".
> > In Swedish, it is a distinct letter, alphabetized last (following z,
> > å, and ä, in that order). But in all these languages, it's represented
> > the exact same way.
> The German Wikipedia entry on "ä" calls "ä" a letter ("Buchstabe"):
>    Der Buchstabe Ä (kleingeschrieben ä) ist ein Buchstabe des
>    lateinischen Schriftsystems.
> Furthermore, it makes a distinction between "ä" the letter and "ä" the
> "a with a diaeresis:"
>    In guten Druckschriften unterscheiden sich die Umlautpunkte von den
>    zwei Punkten des Tremas: Die Umlautpunkte sind kleiner, stehen näher
>    zusammen und liegen etwas tiefer.
>    In good fonts umlaut dots are different from the two dots of a
>    diaeresis: the umlaut dots are smaller and closer to each other and
>    lie a little lower. [translation mine]

Very interesting!
And may I take it that the two different variants, u-umlaut and u-diaeresis, of ü are not (yet) given a seat in Unicode?
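
One quick way to look at this from Python is the standard unicodedata module. The sketch below only shows how ü is named and decomposed in the Unicode database that ships with the interpreter; the variable names are illustrative, and it says nothing about how a particular font draws the dots.

```python
import unicodedata

# ü as a single precomposed code point (U+00FC)
precomposed = "\u00fc"
# The same character split into base letter + combining mark
decomposed = unicodedata.normalize("NFD", precomposed)

precomposed_names = [unicodedata.name(c) for c in precomposed]
decomposed_names = [unicodedata.name(c) for c in decomposed]

print(precomposed_names)  # ['LATIN SMALL LETTER U WITH DIAERESIS']
print(decomposed_names)   # ['LATIN SMALL LETTER U', 'COMBINING DIAERESIS']

# Recomposing gives back the single code point
roundtrip = unicodedata.normalize("NFC", decomposed)
print(roundtrip == precomposed)  # True
```

As far as the character names go, the same "DIAERESIS" mark serves both the umlaut and the trema reading; any visual difference is left to the font.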

Now compare with:
- hyphen-minus 0x2D
− minus sign 0x2212
‐ hyphen 0x2010
– en dash 0x2013
— em dash 0x2014
― horizontal bar 0x2015
… And perhaps another half-dozen
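
The list above can be checked against the Unicode database bundled with Python. This is a minimal sketch using the standard unicodedata module; the names `dashes`, `names`, `categories` and `others` are made up for illustration, and how many further dash-like characters turn up depends on the Unicode version of the interpreter.

```python
import sys
import unicodedata

# The code points from the list above
dashes = [0x2D, 0x2212, 0x2010, 0x2013, 0x2014, 0x2015]

names = {cp: unicodedata.name(chr(cp)) for cp in dashes}
categories = {cp: unicodedata.category(chr(cp)) for cp in dashes}

for cp in dashes:
    # "Pd" is the dash-punctuation category, "Sm" is math symbol
    print(f"U+{cp:04X} {categories[cp]} {names[cp]}")

# Every other code point in the dash-punctuation category
others = [
    cp for cp in range(sys.maxunicode + 1)
    if unicodedata.category(chr(cp)) == "Pd" and cp not in dashes
]
print(len(others), "further characters in category Pd")
```

Note that the minus sign is classed as a math symbol (Sm) rather than dash punctuation (Pd), so a category filter alone would miss it.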