Last weekend PyCon DE and PyData Berlin joined in Berlin for a great conference event that I was lucky to attend. The speaker line-up was great and often it was hard to choose which talk or tutorial to attend. I will give you an overview of the talks I liked and the respective material.

Day 1

Apache Airflow for beginners

by Varya Karpenko // material

Apache Airflow is an open source project that allows you programmatically create, schedule and monitor sequences of tasks. Varya went over the basic concepts and the building blocks in Airflow, such as DAGs, Operators, Tasks, Hooks, Variables, and XComs. And she had great slides!

Production-level data pipelines that make everyone happy using Kedro

by Yetunde Dada //

Yetunde walked the audience thought the software principles data engineers and data scientists should consider applying to their code to make it easier to deploy into the production environment. Then she introduced us to an open source Python library, called Kedro, to simplify the deployment workflow.

Day 2

Airflow: your ally for automating machine learning and data pipelines

by Enrica Pasqua and Bahadir Uyarer // material

In this workshop we learned how the basic Airflow workflow works. Then they walked us through the setup of an instance for orchestrating an inference pipeline for a machine learning model. It was a really well designed and approachable tutorial based on airflow in a docker container.

Dash: Interactive Data Visualization Web Apps with no Javascript

by Dom Weldon // material

This talk gave a nice overview of Dash. Dom showed how it works and what it can be used for, before outlining some of the common problems that emerge when data scientists are let loose to produce web applications. In addition to this, the talk also outlined effective working practices to start producing cool interactive statistical web applications.

Day 3

My own Tutorial: Managing the end-to-end machine learning lifecycle with MLFlow

by Tobias Sterbak // material

I also had the opportunity to share some of my experiences with handling the machine learning lifecycle with MLflow. It was a nice experience and I liked the more hands-on way of presenting.

A Bayesian Workflow with PyMC and ArviZ

by Corrie Bartelheimer // material

Corrie walked the audience through a basic Bayesian workflow with PyMC3 and the model inspection with ArviZ in a real world use-case of house price modelling. I found the part about picking and designing the right prior and pitfalls with model convergence most helpful.

Wrap-up

There have been much more amazing talks I attended and too many talks I skipped for talking to great people. All in all, it was an amazing conference!

I learned a ton of new things and meet great people. I especially learned a lot of things about airflow. It’s nice to see the community shifting from fancy use-cases to production-focused machine learning and data management. Also diversity and ethics where big topics. Overall a great event with an amazing community and I’m proud to be part of it. I’m already looking forward to the next PyData Conference.

You might also be interested in: