Apache Beam Newsletter - November 2018



November 2018 | Newsletter


What’s been done


Beam Community Metrics (by: Mikhail Gryzykhin, Udi Meiri, Huygaa Batsaikhan)

  • To help track project health, a dashboarding platform was added.

  • Initial dashboards track pre- and post-commit test status and engineering load.

  • Leave feedback under BEAM-5862

  • View the dashboards here

Apache Beam 2.8.0 released! (by: many contributors)

  • Major new features and improvements, such as Python on Flink MVP

  • You can download the release here.

  • See the blog post for more details.


New Edit button on beam.apache.org pages (by: Alan Myrvold, Scott Wegner)

  • To make it easier for non-committers to update documentation, an edit button has been added on https://beam.apache.org pages to help create a pull request using the GitHub web UI.

  • See BEAM-4431 for more details.

RabbitMqIO (by: Jean-Baptiste Onofré)

  • An IO to publish messages to, or consume messages from, a RabbitMQ broker


Graphite sink for metrics (by: Etienne Chauchot)

  • Metrics Pusher can now export Beam metrics to Graphite


BeamSQL (by: Rui Wang, Mingming Xu)

  • Added 13 built-in SQL functions.

  • Enabled function overloading for UDFs through a new UDF registration approach.

  • UDFs now support Joda DateTime as an argument type.

What we're working on...


Flink Portable Runner (by: Ankur Goenka, Maximilian Michels, Thomas Weise, Ryan Williams, Robert Bradshaw)

  • Integration of timers in user functions for streaming and batch execution

  • Enabling TFX pipelines to run on Flink

  • Investigating the integration of metrics

  • Bug fixes


Load tests of Core Apache Beam Operations (by: Łukasz Gajowy, Katarzyna Kucharczyk)

New Members

New Committers
  • David Morávek, Pilsen, Czech Republic

    • Using Beam for an internet scale web crawler

    • See BEAM-3900 for more details.

Talks & Meetups


Hadoop User Group @ Warsaw, Poland

  • Apache Beam - what do I gain? by Łukasz Gajowy (link to the meetup)

  • We discussed the basics of the Dataflow model, covered Beam in more detail, and familiarized the audience with the current state of the project


Resources


Blog Post on London Summit (by: Matthias Baetens)

  • “Inaugural edition of the Beam Summit Europe 2018 - aftermath”: a recap of the conference, including the presentation slide decks.

  • See the post here and videos of the sessions on the Apache Beam YouTube channel.

How to transfer BigQuery tables between locations (by: Graham Polley)

  • A Cloud Dataflow solution in Java for transferring BigQuery tables, with source code included.

  • See the Medium article here.


Hands on Apache Beam, building data pipelines in Python (by: Graham Polley)

  • Writing a Beam pipeline in Python to compute the mean of the Open and Close columns for a historical S&P 500 dataset.

  • See the Medium Towards Data Science article here and GitHub tutorial here.
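For readers who have not opened the tutorial, the kind of mean computation it describes can be sketched in plain Python. This is not the article's code and does not import Beam: the class below only mirrors the method names of Beam's `CombineFn` (`create_accumulator`, `add_input`, `merge_accumulators`, `extract_output`), and `MeanCombiner`, `column_means` and the sample rows are hypothetical names and data made up for illustration.

```python
import csv
import io

# Hypothetical sample rows; the article's real dataset is not reproduced here.
SAMPLE_CSV = """Date,Open,Close
2018-11-01,10.0,12.0
2018-11-02,14.0,16.0
2018-11-05,12.0,14.0
"""


class MeanCombiner:
    """Plain-Python stand-in that mirrors the shape of a Beam CombineFn."""

    def create_accumulator(self):
        # The accumulator is a (running sum, element count) pair.
        return (0.0, 0)

    def add_input(self, accumulator, value):
        total, count = accumulator
        return (total + value, count + 1)

    def merge_accumulators(self, accumulators):
        # Partial results from different bundles are summed component-wise.
        accumulators = list(accumulators)
        return (
            sum(total for total, _ in accumulators),
            sum(count for _, count in accumulators),
        )

    def extract_output(self, accumulator):
        total, count = accumulator
        return total / count if count else float("nan")


def column_means(csv_text, columns=("Open", "Close")):
    """Compute the mean of each named column, combining two partial bundles."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    combiner = MeanCombiner()
    middle = len(rows) // 2
    means = {}
    for column in columns:
        partials = []
        # Split the rows into two bundles to imitate work done on separate workers.
        for bundle in (rows[:middle], rows[middle:]):
            accumulator = combiner.create_accumulator()
            for row in bundle:
                accumulator = combiner.add_input(accumulator, float(row[column]))
            partials.append(accumulator)
        merged = combiner.merge_accumulators(partials)
        means[column] = combiner.extract_output(merged)
    return means


if __name__ == "__main__":
    print(column_means(SAMPLE_CSV))
```

Keeping a (sum, count) pair instead of a running average is what allows partial results from separate bundles to be merged without losing accuracy.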


Until Next Time!

This edition was curated by our community of contributors, committers and PMC members. It contains work done in November 2018 and ongoing efforts. We hope to provide visibility into what's going on in the community, so if you have questions, feel free to ask in this thread.
-- 
Rose Thị Nguyễn