Grapheme clusters, a.k.a. real characters

Chris Angelico wrote:
> Once you NFC or NFD normalize both strings, identical strings will
> generally have identical codepoints... You should then be able to
> use normal regular expressions to match correctly.

Except that if you want to match a set of characters,
you can't reliably use a character class [...]. You would
have to write them out as alternatives, in case some of
them take up more than one code point even after
normalization.
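
Here is a minimal sketch of the problem, assuming Python's standard
re and unicodedata modules. The sample characters are only
illustrations: "q" plus COMBINING TILDE has no precomposed form, so
it stays two code points under NFC, while "e" plus COMBINING ACUTE
composes to one.

```python
import re
import unicodedata

# NFC composes e + U+0301 into the single code point U+00E9 ...
e_acute = unicodedata.normalize("NFC", "e\u0301")
# ... but q + U+0303 has no precomposed form and stays two code points.
q_tilde = unicodedata.normalize("NFC", "q\u0303")

text = unicodedata.normalize("NFC", "aq\u0303b")

# Inside [...] every code point is a separate member, so this class
# really means "e-acute, or a bare q, or a bare combining tilde".
char_class = re.compile("[" + e_acute + q_tilde + "]")

# Writing the characters out as alternatives keeps each cluster whole.
alternation = re.compile(
    "(?:" + re.escape(e_acute) + "|" + re.escape(q_tilde) + ")"
)

class_matches = char_class.findall(text)   # ['q', '\u0303'], cluster split in two
alt_matches = alternation.findall(text)    # ['q\u0303'], cluster matched as a unit

# The class also gives a false positive on a plain "q".
class_on_plain = char_class.findall("quit")   # ['q']
alt_on_plain = alternation.findall("quit")    # []
```

Note that the alternation still matches code point sequences, not
grapheme clusters as such. It only avoids splitting the particular
sequences that were written out.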