
[jira] [Created] (ARROW-2593) [Python] TypeError: data type "mixed-integer" not understood

Dima Ryazanov created ARROW-2593:

             Summary: [Python] TypeError: data type "mixed-integer" not understood
                 Key: ARROW-2593
                 URL: https://issues.apache.org/jira/browse/ARROW-2593
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 0.9.0
            Reporter: Dima Ryazanov

pyarrow 0.9.0 raises an exception when converting a table whose column names mix strings and integers to a pandas DataFrame. Earlier versions work fine. Repro steps:

{{In [1]: import pandas as pd}}

{{In [2]: import pyarrow as pa}}

{{In [3]: df = pd.DataFrame({'foo': [], 123: []})}}

{{In [4]: table = pa.Table.from_pandas(df)}}

{{In [5]: table.to_pandas()}}
{{KeyError                                  Traceback (most recent call last)}}
{{~/envs/cli3/lib/python3.6/site-packages/pyarrow/pandas_compat.py in _pandas_type_to_numpy_type(pandas_type)}}
{{    666     try:}}
{{--> 667         return _pandas_logical_type_map[pandas_type]}}
{{    668     except KeyError:}}

{{KeyError: 'mixed-integer'}}

(I ended up with a dataframe whose column names mix strings and integers by calling pd.read_excel(..., skiprows=[0]), which skipped the header row and treated the first row of data as column names.)
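
A possible workaround, sketched under the assumption that turning every column label into a string is acceptable for the data at hand: normalize the labels before handing the dataframe to pyarrow, so the column index no longer mixes types. The helper name stringify_labels is hypothetical and is not part of pandas or pyarrow; whether this avoids the error on 0.9.0 has not been verified here.

```python
def stringify_labels(labels):
    """Return the labels converted to str, refusing to create duplicates.

    Hypothetical helper for illustration; not part of pandas or pyarrow.
    """
    result = [str(label) for label in labels]
    # str(1) and '1' would collide, which would silently merge two columns.
    if len(set(result)) != len(result):
        raise ValueError(
            "converting labels to str would create duplicates: %r" % (result,)
        )
    return result


# Mixed labels like the ones in the repro: one string, one integer.
print(stringify_labels(['foo', 123]))  # prints ['foo', '123']
```

With the dataframe from the repro, this would be applied as df.columns = stringify_labels(df.columns) before calling pa.Table.from_pandas(df). The duplicate check is there because an integer label and its string form would otherwise end up with the same column name.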
