Re: How do you branch your code with BigQuery?


I do intend to create a PR (when I get the chance) to get this into the
main Airflow repo.

If anybody has any comments about this before I do, please let me know.
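
For readers skimming the thread, the branching rule described in the quoted messages below (every column of the returned row must be true) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the plugin's actual code: the function name `choose_branch`, the task ids, and the treatment of an empty result are all made up for the example, and fetching the row from BigQuery is left out.

```python
def choose_branch(row, task_if_true, task_if_false):
    """Return the task_id to follow, given the first row of a query result.

    Follows the rule described in the thread: every column in the row
    must be truthy for the "true" branch to be taken.
    """
    # Assumption: an empty or missing row counts as a failed condition.
    if not row:
        return task_if_false
    return task_if_true if all(bool(col) for col in row) else task_if_false


# Hypothetical version of the fraud scenario from the thread. The query is
# assumed to be rewritten as something like `SELECT COUNT(*) < 1000 FROM ...`
# so that it returns a single boolean column.
few_frauds = choose_branch((True,), "email_users", "continue_workflow")
many_frauds = choose_branch((False,), "email_users", "continue_workflow")
print(few_frauds)
print(many_frauds)
```

In a DAG, a function like this could supply the return value of a BranchPythonOperator's `python_callable`, which expects the task_id of the branch to follow.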


On Mon, 15 Oct 2018 at 10:33, airflowuser
<airflowuser@xxxxxxxxxxxxxx.invalid> wrote:

> Awesome!
> I think this would be a fine addition to the BigQuery operators if you
> ever think about submitting this as a PR to Airflow master.
>
> cheers
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Monday, October 15, 2018 10:02 AM, Anthony Brown
> <anthony.brown@xxxxxxxxxxxxxxx> wrote:
>
> > Hi
> > I have created a custom plugin that allows you to branch on the results
> > of a BigQuery query. The code for it is at
> > https://github.com/JohnLewisandPartners/custom-airflow-plugins/blob/master/bq_branch/plugins/bq_branch.py
> > The version in master only works on Airflow 1.10, but there is a branch
> > called airflow_1.9 that also contains the latest BigQuery hook and so
> > works on Airflow 1.9.
> >
> > The query you run must return true for all columns, the same as for the
> > BigQuery check operator, so you may need to rewrite your queries to do
> > this.
> >
> > On Sun, 14 Oct 2018 at 14:16, airflowuser
> > <airflowuser@xxxxxxxxxxxxxx.invalid> wrote:
> >
> > > I believe this is quite a common case when working with data:
> > > if something: do A
> > > else: do B
> > > In plain Python code, BranchPythonOperator is the solution.
> > > But when working on Google Cloud there is no way to do this.
> > > All existing operators are designed to continue or fail based on a
> > > comparison with a specific value:
> > > BigQueryValueCheckOperator with pass_value=500 will continue if 500
> > > is returned and fail in any other case. The same goes for all the
> > > other CheckOperators. You must know the value in advance for this to
> > > work, and it's not an actual branch but more of a way to stop the
> > > workflow if an unexpected result has been found.
> > > But how do you handle a scenario where you want to do A or B based on
> > > a condition from a query result? Nothing needs to fail; it's just a
> > > simple branch.
> > > XCom could solve it, but there is no support for XCom yet:
> > > https://stackoverflow.com/questions/52801318/airflow-how-to-push-xcom-value-from-bigqueryoperator
> > > Say, for example, the query returns the number of frauds. If it's
> > > <1000 you want to email specific users (EmailOperator); if it's
> > > >=1000 you want to run another operator and continue the workflow.
> > > Any thoughts on the matter will be appreciated.
> >
> > --
> >
> > Anthony Brown
> > Data Engineer BI Team - John Lewis
> > Tel : 0787 215 7305

-- 

Anthony Brown
Data Engineer BI Team - John Lewis
Tel : 0787 215 7305