osdir.com



Grapheme clusters, a.k.a. real characters


On Thu, Jul 20, 2017, at 01:15, Steven D'Aprano wrote:
> I haven't really been paying attention to Marko's suggestion in detail, 
> but if we're talking about a whole new data type, how about a list of 
> nodes, where each node's data is a decomposed string object guaranteed to 
> be either:

How about giving each node but the last a fixed "length" (say, 16
characters), so that random access is done by indexing to the right
node and then walking forward within it?

I've thought about this in the past for encoding strings in UTF-8 with
O(1) random code point access.
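
To make that concrete, here is a rough sketch of the UTF-8 variant,
not a worked-out design. It assumes a flat bytes object plus a side
table holding the byte offset of every 16th code point; the names
IndexedUtf8, CHUNK and _width are made up for illustration. A lookup
jumps to the right table entry and then steps over at most 15 code
points, so the cost is bounded by a constant.

```python
CHUNK = 16  # code points per "node" (the "say, 16" above; an arbitrary choice)


class IndexedUtf8:
    """UTF-8 bytes plus a table of byte offsets, one per CHUNK code points.

    Hypothetical illustration only; not an existing library type.
    """

    def __init__(self, text: str):
        self.data = text.encode("utf-8")
        self.length = len(text)  # number of code points
        self.offsets = []
        count = 0
        for pos, byte in enumerate(self.data):
            if byte & 0xC0 != 0x80:  # not a continuation byte: a code point starts here
                if count % CHUNK == 0:
                    self.offsets.append(pos)
                count += 1

    @staticmethod
    def _width(lead: int) -> int:
        """Length in bytes of the UTF-8 sequence that starts with this lead byte."""
        if lead < 0x80:
            return 1
        if lead < 0xE0:
            return 2
        if lead < 0xF0:
            return 3
        return 4

    def __len__(self) -> int:
        return self.length

    def __getitem__(self, index: int) -> str:
        if index < 0:
            index += self.length
        if not 0 <= index < self.length:
            raise IndexError("code point index out of range")
        # Jump to the start of the node, then walk forward at most CHUNK - 1 steps.
        pos = self.offsets[index // CHUNK]
        for _ in range(index % CHUNK):
            pos += self._width(self.data[pos])
        end = pos + self._width(self.data[pos])
        return self.data[pos:end].decode("utf-8")


text = "naïve café ☕ 𝄞 " * 5
s = IndexedUtf8(text)
print(s[2], s[11], s[13], len(s))
```

The table costs one integer per 16 code points, and a smaller CHUNK
trades more memory for a shorter walk. Note that this indexes code
points, not grapheme clusters.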