[jira] [Created] (ARROW-3685) Better roundtrip between numpy and arrow binary array


Maarten Breddels created ARROW-3685:
---------------------------------------

             Summary: Better roundtrip between numpy and arrow binary array
                 Key: ARROW-3685
                 URL: https://issues.apache.org/jira/browse/ARROW-3685
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Python
    Affects Versions: 0.11.1
            Reporter: Maarten Breddels


I'm working on adding support for arrow to vaex (an out-of-core dataframe library for Python) in this PR:
[https://github.com/maartenbreddels/vaex/pull/116]
I found that fixed-length binary arrays from numpy (say dtype='S42') are converted to a non-fixed-length (variable-length) binary array. Trying to convert that back to numpy fails, since there is no such conversion.

It would make more sense to convert dtype='S42' to an arrow array of type pyarrow.binary(42), as I do in:
https://github.com/maartenbreddels/vaex/blob/4b4facb64fea9f83593ce0f0b82fc26ddf96b506/packages/vaex-arrow/vaex_arrow/convert.py#L4


