


Re: Go SDK: Bigquery and nullable field types.



On 2018/06/22 01:20:19, Henning Rohde <herohde@xxxxxxxxxx> wrote: 
> The Go SDK can't actually serialize named types -- we serialize the
> structural information and recreate assignment-compatible isomorphic
> unnamed types at runtime for convenience. This usually works fine, but
> perhaps not if inspected reflectively. Have you tried to Register the
> Record (or bigquery.NullString) type? That bypasses the serialization.

Type registration makes it work. It also takes care of my Legacy SQL problem, since I can now handle timestamps through time.Time.

Thanks.


> 
> Thanks,
>  Henning
> 
> On Thu, Jun 21, 2018 at 5:40 PM eduardo.morales@xxxxxxxxx <
> eduardo.morales@xxxxxxxxx> wrote:
> 
> > I am using the bigqueryio transform with the following struct to
> > collect a data row:
> >
> > type Record struct {
> >   source_service bigquery.NullString
> >   // ... etc ...
> > }
> >
> > This works fine with the direct runner, but when I try it with the
> > dataflow runner, then I get the following exception:
> >
> > java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error
> > received from SDK harness for instruction -41: execute failed: bigquery:
> > schema field source_service of type STRING is not assignable to struct
> > field source_service of type struct { StringVal string; Valid bool }
> >         at
> > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
> >         at
> > java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
> >         at org.apache.beam.sdk.util.MoreFutures.get(MoreFutures.java:55)
> >         at
> > com.google.cloud.dataflow.worker.fn.control.RegisterAndProcessBundleOperation.finish(RegisterAndProcessBundleOperation.java:274)
> >         at
> > com.google.cloud.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:83)
> >         at
> > com.google.cloud.dataflow.worker.fn.control.BeamFnMapTaskExecutor.execute(BeamFnMapTaskExecutor.java:101)
> >         at
> > com.google.cloud.dataflow.worker.BatchDataflowWorker.executeWork(BatchDataflowWorker.java:391)
> >         at
> > com.google.cloud.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:360)
> >         at
> > com.google.cloud.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:288)
> >         at
> > com.google.cloud.dataflow.worker.DataflowRunnerHarness.start(DataflowRunnerHarness.java:179)
> >         at
> > com.google.cloud.dataflow.worker.DataflowRunnerHarness.main(DataflowRunnerHarness.java:107)
> >         Suppressed: java.lang.IllegalStateException: Already closed.
> >                 at
> > org.apache.beam.sdk.fn.data.BeamFnDataBufferingOutboundObserver.close(BeamFnDataBufferingOutboundObserver.java:97)
> >                 at
> > com.google.cloud.dataflow.worker.fn.data.RemoteGrpcPortWriteOperation.abort(RemoteGrpcPortWriteOperation.java:93)
> >                 at
> > com.google.cloud.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:89)
> >                 ... 6 more
> >
> > It looks like the bigquery API is failing to detect the nullable type
> > NullString and is instead attempting a plain assignment. Could it be that
> > some aspect of the type information has been lost, preventing the
> > bigquery API from identifying and handling NullString properly?
> >
> >
>