[Python-Dev] Replacement for array.array('u')?


On Fri, Mar 22, 2019 at 08:31:33PM +1300, Greg Ewing wrote:
> A poster on comp.lang.python is asking about array.array('u').
> He wants an efficient mutable collection of unicode characters
> that can be initialised from a string.
> 
> According to the docs, the 'u' code is deprecated and will be
> removed in 4.0, but no alternative is suggested.
> 
> Why is this being deprecated, instead of keeping it and making
> it always 32 bits? It seems like useful functionality that can't
> be easily obtained another way.

I can't answer any of those questions, but perhaps the poster can do 
this instead:

py> from array import array
py> a = array('L', 'ℍℰâѵÿ Ϻεταł'.encode('utf-32be'))
py> a
array('L', [220266496, 807469056, 3791650816, 1963196416, 4278190080, 
536870912, 4194500608, 3036872704, 3288530944, 2969763840, 1107361792])

Getting the string out again is no harder:

py> bytes(a).decode('utf-32be')
'ℍℰâѵÿ Ϻεταł'
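
Two caveats about the snippet above. The integers shown are not code points: 220266496 is 0x0D210000, which is U+210D with its bytes swapped, as happens when big-endian data is loaded on a little-endian machine. Also, 'L' is 8 bytes wide on many 64-bit Unix builds, where the 44 encoded bytes would not divide into whole items. A minimal sketch of a more portable variant, assuming a 4-byte unsigned typecode exists on the platform and using the native byte order so that each item is a real code point (the helper names are made up for illustration):

```python
import sys
from array import array

# Pick a 4-byte unsigned typecode. 'I' is 4 bytes on common platforms,
# but C only guarantees a minimum size, so check itemsize explicitly.
TYPECODE = next(tc for tc in "IL" if array(tc).itemsize == 4)

# Match the codec to the native byte order so items are real code points.
CODEC = "utf-32-le" if sys.byteorder == "little" else "utf-32-be"


def str_to_array(s):
    """Return a mutable array of code points built from a string."""
    return array(TYPECODE, s.encode(CODEC))


def array_to_str(a):
    """Rebuild the string from an array of code points."""
    return a.tobytes().decode(CODEC)


a = str_to_array("ℍℰâѵÿ Ϻεταł")
a[0] = ord("H")           # in-place mutation, one character at a time
print(hex(a[1]))          # 0x2130
print(array_to_str(a))    # Hℰâѵÿ Ϻεταł
```

With this layout, a[i] and ord(s[i]) agree, so single characters can be read or replaced without decoding the whole array.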

But having said that, it would be nice to have an array code which 
treated the values as single UTF-32 characters:

array('?', ['ℍ', 'ℰ', 'â', 'ѵ', 'ÿ', ' ', 'Ϻ', 'ε', 'τ', 'α', 'ł'])

if for no other reason than it looks nicer than a bunch of 32-bit ints.
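
In the meantime, the nicer display can be approximated in pure Python. A rough sketch, assuming a hypothetical CharArray wrapper (not part of the standard library) around a 4-byte array of native-order code points:

```python
import sys
from array import array

# Assumption: one of 'I' or 'L' is exactly 4 bytes wide on this platform.
_TYPECODE = next(tc for tc in "IL" if array(tc).itemsize == 4)
_CODEC = "utf-32-le" if sys.byteorder == "little" else "utf-32-be"


class CharArray:
    """Hypothetical mutable sequence of characters stored as 32-bit code points."""

    def __init__(self, text=""):
        self._data = array(_TYPECODE, text.encode(_CODEC))

    def __len__(self):
        return len(self._data)

    def __getitem__(self, index):
        # Integer indexes only, to keep the sketch short.
        return chr(self._data[index])

    def __setitem__(self, index, char):
        self._data[index] = ord(char)

    def __str__(self):
        return self._data.tobytes().decode(_CODEC)

    def __repr__(self):
        return f"CharArray({list(str(self))!r})"


chars = CharArray("Ϻεταł")
chars[0] = "M"
print(repr(chars))   # CharArray(['M', 'ε', 'τ', 'α', 'ł'])
print(str(chars))    # Mεταł
```

The storage stays a compact 4 bytes per character; only the repr and the item access convert between ints and one-character strings.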


-- 
Steven