The Airflow Podcast

Astronomer

A podcast about Apache Airflow, an open source workflow management system that lets you define data pipelines in Python. Produced with love by the team at Astronomer.

Season One Teaser
Trailer 3 min 2 sec

All Episodes

Welcome back to the Airflow Podcast. This week, we met up with Ben Wisegarver, a staff data scientist at Reddit who runs their data warehousing and data engineering functions. Reddit users generate petabytes of data every day that needs to be processed, stored, and analyzed by a wide breadth of backend services. Our conversation with Ben touches on everything from Airflow as a tool for career mobility across the data stack to scaling out a self-service data architecture across many teams. For folks interested, our team at Astronomer is growing rapidly and we're on the hunt for new folks to join in a variety of different roles. If you're passionate about Airflow and interested in building the future of data engineering, please get in touch. You can check our current job postings at careers.astronomer.io, but we're constantly updating our listings to accommodate new hiring needs. Please feel free to email me directly at pete@astronomer.io if you're passionate about what we're doing and think you'd be a good addition to the team.
Mentioned Resources:
- Careers: https://careers.astronomer.io
Guest Profile:
- Ben Wisegarver: https://www.linkedin.com/in/ben-wisegarver-54566576

Feb 4

45 min 48 sec

Welcome back to the Airflow Podcast. This week, we met up with Albert Franzi and Carlos Escura from Typeform. Typeform is a tool that allows you to build beautiful interactive forms that you can use for a wide variety of use cases, including customer surveys, employee engagement, product feedback, and market research, to name a few. In our conversation, we discussed Airflow as a tool for GDPR compliance, the concept of self-service data and how it allows your data operations team to function as a data platform team, and some of the more specialized infrastructure tooling that the Typeform team has built out to support their internal teams. For folks interested, our team at Astronomer is growing rapidly and we're on the hunt for new folks to join in a variety of different roles. If you're passionate about Airflow and interested in building the future of data engineering, please get in touch. You can check our current job postings at careers.astronomer.io, but we're constantly updating our listings to accommodate new hiring needs. Please feel free to email me directly at pete@astronomer.io if you're passionate about what we're doing and think you'd be a good addition to the team.
Mentioned Resources:
- Dag Factory: https://github.com/ajbosco/dag-factory
- Astronomer Careers: https://careers.astronomer.io
Guest Profiles:
- Albert Franzi: https://www.linkedin.com/in/albertfranzi/?originalSubdomain=es
- Carlos Escura: https://www.linkedin.com/in/carlosescura/en-us/

Nov 2020

31 min 26 sec

After a bit of a break, we're back with the third official episode bundle of The Airflow Podcast. In this batch, we'll get a little bit deeper with current Airflow users and maintainers on core fundamental concepts in data engineering, architectures for operating modern data platforms at scale, and the process of maintaining and operating Airflow, specifically as we go through the release process of Airflow 2.0. This week, we met up with Brian de la Motte and Florian Hines at Netlify. Netlify provides an extremely popular toolset for building and deploying JAMstack sites. They provide hosting services, CI, DNS, authentication, and managed backend tools that help users run and operate static sites at scale. The team over there recently adopted Airflow to help decouple orchestration logic from a complex collection of Spark jobs and are currently in the process of expanding their Airflow footprint to accommodate a broader group of interesting use cases. Disclaimer: we get a bit of a surprise about halfway through the episode when Brian tells us that they had recently signed up for Astronomer. We promise that it wasn't a planted ad :). Please contact pete@astronomer.io if you'd like to get in touch regarding future episodes. Hope you enjoy!
Guest Profiles:
- Brian de la Motte: https://www.linkedin.com/in/brian-de-la-motte/
- Florian Hines: https://www.linkedin.com/in/florianhines/

Oct 2020

28 min 35 sec

This week, we linked up with Airflow release manager, core committer, and Astronomer platform engineer Ash Berlin-Taylor to discuss the Airflow 2.0 roadmap [1]. There is some great stuff in the works around performance, autoscaling, and usability that we're excited about. In this episode, Ash lends his thoughts on the design, implementation, and value-add around all of the upcoming features, including:
- The Knative Executor
- A modern and real-time UI
- A production-grade API
- Improved scheduler and webserver performance
- An official production Docker image for Airflow
We hope you enjoy! Please email pete@astronomer.io if you have thoughts on topics you'd like to see covered in future episodes. Separately, some good folks from the Airflow community are running a user survey that will help collect some useful information around the Airflow UX. If you have five minutes to spare, filling out the following form will help the core Airflow committers to shape the project roadmap: https://forms.gle/XAzR1pQBZiftvPQM7
[1] https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+2.0

Nov 2019

32 min 28 sec

This week, we had the pleasure of meeting up with Jarek Potiuk, Principal Software Engineer at Polidea and Apache Airflow committer, to discuss his most recent contribution to the community, Airflow Breeze. Jarek deeply values developer productivity and realized, while building a team of Airflow committers, that running unit tests and waiting for the CI build before opening a PR on the project was a cumbersome process that could take up to a few hours. Breeze seeks to improve that experience for Airflow committers and lower the barrier to entry for folks who are new to the open-source community. You can read more about Airflow Breeze here: https://www.polidea.com/blog/its-a-breeze-to-develop-apache-airflow/#the-apache-airflow-projects-setup

Oct 2019

46 min 57 sec

This episode kicks off season 2 of The Airflow Podcast. In this next season, we'll focus on the future of Airflow and chat with leading members of the community to paint a picture of what's to come. We're pumped to be diving back into this project and look forward to the great conversations we have lined up. This week, we chatted with James Malone, Product Manager of Google's Cloud Composer. James had some interesting things to say about open source at Google and where his team plans on contributing most to the project going forward. As always, thanks for listening and please email pete@astronomer.io if you have any feedback or would like to be considered as a guest.

Aug 2019

39 min 3 sec

This week, we met up with Ash Berlin-Taylor to discuss the recent 1.10 release, what it's like to be a release manager for an open source project, Airflow's bid to graduate from incubating status, and the next phase of Airflow project development. As mentioned in our podcast intro, we at Astronomer are hiring Data Engineers who are passionate about contributing to open source and making Airflow great. Please shoot us an email at humans@astronomer.io if you're interested in hearing more about the fully-remote opportunity. Check us out at www.astronomer.io if you're interested in seeing a demo of our platform.

Dec 2018

29 min 3 sec

This time, we met up with WePay's Joy Gao to talk through her work on the RBAC components in the recent Airflow 1.10 release. We dove deep into what inspired her work and took some time to discuss what it's like to be a woman contributing to a predominately male open-source community. Hope you enjoy! If you'd like to get started using Airflow in your org, check out our recently-launched Spacecamp program here: https://www.astronomer.io/spacecamp Feel free to email me at pete@astronomer.io with any feedback or if you'd like to be considered as a guest!

Aug 2018

37 min 28 sec

In this episode, we dove into the relationship between Airflow and Kubernetes and interviewed Daniel Imberman, Senior Software Engineer at Bloomberg (1:30), and Greg Neiheisel, CTO here at Astronomer (37:31). Daniel has done most of the work on the Kubernetes executor for Airflow and Greg plans to take on a chunk of the development going forward, so it was really interesting to hear both of their perspectives on the project. Enjoy!

Jun 2018

55 min 10 sec

This week, we’ll examine conversations with both old guests and new to paint a comprehensive picture of Airflow’s pain points. While we still undoubtedly believe that Airflow is the future of ETL, it’s important to acknowledge that any incubating project will have issues, and bringing those issues to the forefront of the community’s attention will help shape the future of the project. We’ll talk with Thomas La Piana (1:36), Data Engineer at OrderMyGear, Frank Hsu (14:20), Data Engineer at mines.io, and Alan Cruickshank (27:41), business insights and data manager at tails.com. Check out our open-source library of Airflow plugins at github.com/airflow-plugins, and feel free to contribute anything that you've been working on! If you're interested in being on the podcast or have any feedback on how you think we could make it better, shoot me an email at pete@astronomer.io

May 2018

38 min 57 sec

On this episode, we linked up with Erik Bernhardsson (@erikbern), creator of Luigi and CTO of Better Mortgage. We chatted about everything from the motivations behind Luigi's creation to his current thoughts on Airflow. We hope you enjoy! Check out:
- Erik's blog at erikbern.com
- Our open-source library of Airflow plugins at github.com/airflow-plugins
All podcast feedback is hugely appreciated. Feel free to email me at pete@astronomer.io if you have any thoughts.

Apr 2018

27 min 39 sec

In this episode, we dive into Airflow Best Practices and include longer portions of interviews with Alan Cruickshank (1:30), Business Insights and Data Manager at Tails.com, Chris Riccomini (7:27), Principal Software Engineer at WePay, and Bolke de Bruin (31:45), Head of Advanced Analytics Technology at ING. Hope you enjoy! We're still working to get better at podcasting, so please send over any feedback to pete@astronomer.io. We really appreciate hearing what the community has to say, and your feedback is hugely helpful in making us better. If you're interested in Astronomer Spacecamp, a guided Airflow development course, you can find more info on that here: https://www.astronomer.io/blog/announcing-astronomer-spacecamp/ We also launched our Managed Airflow on Product Hunt last week. You can check that out here: https://www.producthunt.com/posts/apache-airflow-on-astronomer Thanks so much for listening!

Mar 2018

58 min 31 sec

Episode 2 of The Airflow Podcast is here to discuss six specific use cases that we’ve seen for Apache Airflow. Here’s the lineup:
- Patrick Atwater (@patwater), Water Data Projects Manager at ARGO Labs: 2:03-5:35
- Maksime Pecherskiy (@mrmaksimize), CDO of San Diego: 5:35-23:06
- Scott Halgrim (@shalgrim), Data Engineer at Zapier: 23:06-27:27
- Bolke de Bruin (@bolke2028), Head of Advanced Analytics at ING: 27:27-39:46
- Chris Riccomini (@criccomini), Principal Software Engineer at WePay: 39:46-54:20
- Ben Gregory (@benbeingbin), Data Engineer (and noted craft soda enthusiast) at Astronomer: 54:20-1:14:38
Contribute to our open-source library of Airflow plugins at github.com/airflow-plugins. Contact us at www.astronomer.io if you’re interested in Spacecamp: a guided development program to get your team up and running on Airflow.

Feb 2018

1 hr 15 min

For the first episode of the Airflow Podcast, we met up with Maxime Beauchemin, creator of Airflow, to explore the motivations behind its creation and the problems it was designed to solve. We asked Maxime for his definition of Airflow, the design principles behind hook/operator use, and his vision for the project.
Speaker list:
- Pete DeJoy - Product at Astronomer
- Viraj Parekh - Data Engineer at Astronomer
- Maxime Beauchemin - Software Engineer at Lyft, creator of Airflow
Talk mentioned at the end of the podcast: Advanced Data Engineering Patterns with Apache Airflow: http://www.ustream.tv/recorded/109227704
Maxime's Blog: https://medium.com/@maximebeauchemin

Feb 2018

45 min 9 sec

A sneak peek at our upcoming podcast about Apache Airflow. Featured in this clip (in order of appearance):
- Pete DeJoy - Product Specialist at Astronomer
- Patrick Atwater - Water Data Projects Manager at ARGO Labs
- Maksime Pecherskiy - Chief Data Officer of the City of San Diego
- Bolke de Bruin - Head of Advanced Analytics at ING

Jan 2018

3 min 2 sec