Re: Reading BigQuery Timestamps via python SDK returns error for dates < 1900

One workaround could be to read from a query instead of reading the table directly: in the query you can convert the column to a Unix timestamp and read it as a Long instead.
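A minimal sketch of that idea, assuming standard SQL is enabled for the read (for example via the BigQuery source's use_standard_sql option). The table name, column name, and helper names below are hypothetical. The query returns the timestamp as an integer number of microseconds, and the conversion back to a datetime uses plain arithmetic, so it never calls strftime:

```python
from datetime import datetime, timedelta

# Hypothetical table and column names. UNIX_MICROS is a BigQuery standard SQL
# function that returns an INT64 count of microseconds since 1970-01-01 UTC.
QUERY = """
SELECT UNIX_MICROS(event_ts) AS event_ts_micros
FROM `my_project.my_dataset.my_table`
"""

EPOCH = datetime(1970, 1, 1)


def micros_to_datetime(micros):
    """Convert microseconds since the Unix epoch to a naive UTC datetime.

    Negative values (dates before 1970, including pre-1900) are fine,
    because this is timedelta arithmetic rather than a platform call.
    """
    return EPOCH + timedelta(microseconds=micros)


def datetime_to_micros(dt):
    """Inverse of micros_to_datetime, written to work on Python 2 and 3."""
    delta = dt - EPOCH
    return (delta.days * 86400 + delta.seconds) * 10**6 + delta.microseconds
```

Inside the pipeline, each row would then carry event_ts_micros as an integer, which can be passed to micros_to_datetime in a Map step if a datetime is needed downstream.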

// Vilhelm von Ehrenheim 

On Wed, 25 Apr 2018, 19:00 Chamikara Jayalath, <chamikara@xxxxxxxxxx> wrote:
I don't see an easy workaround, unfortunately. Basically, until this is fixed, BigQuery columns of type TIMESTAMP will not work for pre-1900 values. So you'll either have to change the values or the type of this column.


On Wed, Apr 25, 2018 at 9:51 AM Yatish Gupta <ygupta@xxxxxxxxxx> wrote:
Thanks! In the meantime, is there a workaround for this?


On Wed, Apr 25, 2018 at 12:20 AM Chamikara Jayalath <chamikara@xxxxxxxxxx> wrote:
Thanks for reporting. This seems to be due to a known Python bug: https://bugs.python.org/issue1777412

It seems the above bug has not been fixed in the Python 2.x line. I created https://issues.apache.org/jira/browse/BEAM-4171 to track the Beam issue and possibly update the Beam SDK to not use 'strftime'.
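For illustration only, here is a rough sketch of what avoiding 'strftime' could look like. This is not the actual Beam change, and the function name is hypothetical. It builds the same 'YYYY-MM-DD HH:MM:SS.ffffff UTC' string from the datetime's fields with ordinary string formatting, which has no year restriction:

```python
from datetime import datetime


def format_utc_timestamp(dt):
    """Format a datetime like strftime('%Y-%m-%d %H:%M:%S.%f UTC') would,
    but using the individual fields so that years before 1900 also work."""
    return '%04d-%02d-%02d %02d:%02d:%02d.%06d UTC' % (
        dt.year, dt.month, dt.day,
        dt.hour, dt.minute, dt.second, dt.microsecond)
```

For example, format_utc_timestamp(datetime(1851, 1, 2, 3, 4, 5, 6)) gives '1851-01-02 03:04:05.000006 UTC'.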


On Tue, Apr 24, 2018 at 2:33 PM Yatish Gupta <ygupta@xxxxxxxxxx> wrote:

I am getting an exception when running Beam on BigQuery rows that contain timestamps earlier than 1900. I get a similar error when using the DirectRunner.

  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/nativebigqueryavroio.py", line 83, in _fix_field_value
    return dt.strftime('%Y-%m-%d %H:%M:%S.%f UTC')
ValueError: year=1851 is before 1900; the datetime strftime() methods require year >= 1900

How do I read these rows without changing my schema?