osdir.com


Appending to streaming file format


Hey, as far as I can tell, appending to a streaming file format isn't currently supported. Is that right?
RecordBatchStreamWriter always writes the schema up front, and it doesn't look like a schema is expected mid-file. Assuming I'm doing this append test correctly, this is the error I hit when I try to read the file back into Python:

Traceback (most recent call last):
  File "/home/ra7293/rba_arrow_mmap.py", line 9, in <module>
    table = reader.read_all()
  File "ipc.pxi", line 302, in pyarrow.lib._RecordBatchReader.read_all
  File "error.pxi", line 79, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Message not expected type: record batch, was: 1

This reader script works fine if I write once and don't append. I can work around it by not appending and instead creating a new file any time I restart; I just wanted to confirm I'm not missing something.

Also, FYI, I opened a ticket last week reporting that append is broken with the FileOutputStream (unrelated to this email thread):
https://github.com/apache/arrow/issues/2018

Thanks
- Rob




