There are many open source tools to help with the different steps typically needed to extract insights from your data. As you scale and grow your use of data, keeping on top of these steps can be difficult. Apache Airflow is an open source orchestration tool that lets you programmatically create workflows in Python to run, schedule, monitor and manage data engineering pipelines - no more manually managing those cron jobs! In this session, we will look at the architecture of Apache Airflow, walk you through creating your first workflow, and show how a growing number of provider libraries can help you work with other open source tools and services. This session is intended for beginners and those wanting to learn more about this open source project.
With over 30 years in the technology industry, I am happiest when helping customers solve business problems with innovative, emerging technologies, open source and cloud. Currently I am a Developer Advocate at AWS focusing on open source. I help raise awareness of AWS and our customers' open source projects and technology, and help make AWS a great place to run your open source software.