Program
pedrogk committed Aug 6, 2024
1 parent dd5fbe6 commit 9d4f640
Showing 29 changed files with 172 additions and 177 deletions.
2 changes: 1 addition & 1 deletion content/program-sessionize.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
---
title: "Program"
-url: /program
+url: /program-sessionize

---

6 changes: 3 additions & 3 deletions content/schedule/_index.md
@@ -1,9 +1,8 @@
---
title: "Program"
date: 2023-04-21T15:49:31-05:00
-url: program-2024
-aliases:
-- /schedule
+url: /program

description: Airflow Summit 2024 features more than 90 sessions covering Airflow features, case studies, workshops and community sessions. Check it out!

tracks:
@@ -234,3 +233,4 @@ description: "Welcome to the session program for Airflow Summit."
---

<h4 class="mb-4">Welcome to the session program for Airflow Summit 2024. </h4>
<h5>If you prefer, you can also see this as <a style="color:#c04040; !important" href="/program-sessionize">sessionize layout</a> or <a style="color:#c04040; !important" href="/sessions/2024">list of sessions</a>.</h5>
@@ -0,0 +1,19 @@
---
title: "Exploring DAG Design Patterns in Apache Airflow"
slug: exploring-dag-design-patterns-in-apache-airflow
speakers:
- Sriram Vamsi Ilapakurthy
time_start: 2024-09-12 15:45:00
time_end: 2024-09-12 16:10:00
room: Elizabethan A+B
track: Airflow intro talks
day: 20243
timeslot: 108
gridarea: "13/4/14/5"
images:
-
---

This talk delves into advanced Directed Acyclic Graph (DAG) design patterns that are pivotal for optimizing data pipeline management and boosting efficiency. We'll cover dynamic DAG generation, which allows for flexible, scalable workflow creation based on real-time data and configurations. Learn about task grouping and SubDAGs to enhance readability and maintainability of complex workflows. We'll also explore parameterized DAGs for injecting runtime parameters into tasks, enabling versatile and adaptable pipeline configurations. Additionally, the session will address branching and conditional execution to manage workflow paths dynamically based on data conditions or external triggers. Lastly, understand how to leverage parallelism and concurrency to maximize resource utilization and reduce execution times. This session is designed for intermediate to advanced users who are familiar with the basics of Airflow and looking to deepen their understanding of its more sophisticated capabilities.

This session is crafted to be compelling by focusing on practical, high-impact design patterns that can significantly improve the performance and scalability of Airflow deployments.
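
As a rough illustration of the dynamic DAG generation and parallelism ideas mentioned in the abstract, the sketch below builds a task graph from a configuration mapping and groups tasks into batches that could run concurrently. It is plain Python using the standard-library `graphlib` module, not Airflow's API; the config layout and the names `PIPELINE_CONFIG`, `build_graph`, and `parallel_batches` are assumptions made for this example.

```python
from graphlib import TopologicalSorter

# Hypothetical configuration: each task maps to the tasks it depends on.
PIPELINE_CONFIG = {
    "extract": [],
    "transform_orders": ["extract"],
    "transform_users": ["extract"],
    "load": ["transform_orders", "transform_users"],
}


def build_graph(config):
    """Create a dependency graph dynamically from a config mapping."""
    sorter = TopologicalSorter()
    for task, upstream in config.items():
        sorter.add(task, *upstream)
    return sorter


def parallel_batches(config):
    """Return lists of tasks; tasks in one list have no unfinished dependencies."""
    sorter = build_graph(config)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not acyclic
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)
    return batches


if __name__ == "__main__":
    for batch in parallel_batches(PIPELINE_CONFIG):
        print(batch)
```

Each batch contains tasks whose upstream work is complete, which is the property a scheduler relies on when it runs independent tasks in parallel.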
16 changes: 0 additions & 16 deletions content/sessions/2024/108-to-be-defined.md

This file was deleted.

4 changes: 2 additions & 2 deletions content/sessions/2024/109-to-be-defined.md
@@ -1,6 +1,6 @@
---
-title: "To be confirmed"
-slug: to-be-confirmed4
+title: "To be defined"
+slug: to-be-defined
speakers:
-
time_start: 2024-09-12 16:20:00
@@ -16,6 +16,10 @@ images:

This session reveals an experimental venture integrating OpenAI's AI technologies with Airflow, aimed at advancing error diagnosis.

Through the application of AI, our objective is to deepen the understanding of issues, provide comprehensive insights into task failures, and suggest actionable solutions, thereby augmenting the resolution process. This method seeks to not only enhance diagnostic efficiency but also to equip data engineers with AI-informed recommendations.

Participants will be guided through the integration journey, illustrating how AI can refine error analysis and potentially simplify troubleshooting workflows.

Through the application of AI, our objective is to deepen the understanding of issues, provide comprehensive insights into task failures, and suggest actionable solutions, thereby augmenting the resolution process. This method seeks to not only enhance diagnostic efficiency but also to equip data engineers with AI-informed recommendations.



Participants will be guided through the integration journey, illustrating how AI can refine error analysis and potentially simplify troubleshooting workflows.
@@ -15,7 +15,6 @@ track: Keynote
day: 20241
timeslot: 2
gridarea: "2/2/3/6"
addevent: xp22530341
images:
- /images/sessions/2024/10years.jpg
---

This file was deleted.

@@ -0,0 +1,21 @@
---
title: "Empowering more teams in your organization to self-service their Airflow needs"
slug: empowering-more-teams-in-your-organization-to-self-service-their-airflow-needs
speakers:
- Spencer Tollefson
time_start: 2024-09-10 16:00:00
time_end: 2024-09-10 16:25:00
room: Elizabethan A+B
track: Airflow Intro talks
day: 20241
timeslot: 29
gridarea: "12/4/13/5"
images:
- /images/sessions/2024/ai-reality-check.jpg
---

Does your organization feel like the responsibility to write Airflow DAGs, handle the Airflow infrastructure administration, debug failing tasks, and keep up with new features and best practices is too much for too few people? Perhaps you only have one data team that owns all of that; or you have too many teams that have too many permissions into other teams' DAGs.

The topic of this talk is how Rakuten Kobo enables self-service for various teams within its organization to build their own DAGs in Airflow. The talk will include how we delineate the Airflow responsibilities of various teams, build guard rails for new Airflow developers, how different teams automatically have permissions required for their "own" DAGs (but not others), the unique responsibilities of Operations and Data Engineering teams, and how it is done in a scalable manner.

Maybe you'll be inspired to make changes in your own organization, or have some tips of your own to share! Depending on questions, we could discuss some of the technical details as well.
@@ -17,14 +17,10 @@ images:

Explore the evolutionary journey of orchestration within GoDaddy, tracing its transformation from initial on-premise deployment to a robust cloud-based Apache Airflow orchestration model. This session will detail the pivotal shifts in design, organizational decisions, and governance that have streamlined GoDaddy’s Data Platform and enhanced overall governance.

Attendees will gain insights valuable for optimizing Airflow deployments and simplifying complex orchestration processes.

Attendees will gain insights valuable for optimizing Airflow deployments and simplifying complex orchestration processes.
Recap of the transformation journey and its impact on GoDaddy’s data operations.

Recap of the transformation journey and its impact on GoDaddy’s data operations.
Future directions and ongoing improvements in orchestration at GoDaddy.

Future directions and ongoing improvements in orchestration at GoDaddy.



This session will benefit attendees by providing a comprehensive case study on optimizing orchestration in a complex enterprise environment, emphasizing practical insights and scalable solutions.
This session will benefit attendees by providing a comprehensive case study on optimizing orchestration in a complex enterprise environment, emphasizing practical insights and scalable solutions.
17 changes: 17 additions & 0 deletions content/sessions/2024/35-dag-dependency-management-across-lowes.md
@@ -0,0 +1,17 @@
---
title: "DAG Dependency Management across Lowes"
slug: dag-dependency-management-across-lowes
speakers:
- Arnab Kundu
time_start: 2024-09-10 17:10:00
time_end: 2024-09-10 17:35:00
room: California East
track: Use cases
day: 20241
timeslot: 35
gridarea: "14/2/15/3"
images:
-
---

DAG dependency is already a solved use case for the same Airflow instance. But what happens when you have 50+ Airflow instances across teams and the workflow of one or many depends on others? By leveraging sensors and datasets we have created a custom operator that brings in the capability of cross-cluster dependencies. It works with our OnPrem Kubernetes architecture which is responsible for deployment of the custom operators throughout the entire Organization.
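
The sensor-style waiting that the abstract alludes to can be sketched in plain Python as a polling loop with a timeout. This is not the custom operator described in the talk and does not use Airflow's API; `wait_for_upstream`, `FakeUpstream`, and the `poke_interval` parameter name are hypothetical, and a real implementation would query the remote Airflow instance instead of the in-memory stand-in used here.

```python
import time


def wait_for_upstream(is_ready, timeout=60.0, poke_interval=5.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll `is_ready()` until it returns True or `timeout` seconds elapse."""
    deadline = clock() + timeout
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(poke_interval)


class FakeUpstream:
    """Hypothetical stand-in for 'has the other instance published its dataset?'"""

    def __init__(self, ready_after_checks):
        self.remaining = ready_after_checks

    def dataset_updated(self):
        # Becomes True once it has been checked `ready_after_checks` times.
        self.remaining -= 1
        return self.remaining <= 0


if __name__ == "__main__":
    upstream = FakeUpstream(ready_after_checks=3)
    ok = wait_for_upstream(upstream.dataset_updated, timeout=1.0, poke_interval=0.01)
    print("upstream ready:", ok)
```

Passing `sleep` and `clock` as parameters keeps the loop easy to test without real delays.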
16 changes: 0 additions & 16 deletions content/sessions/2024/35-to-be-defined.md

This file was deleted.

@@ -17,18 +17,10 @@ images:

DAG Authors, while constructing DAGs, generally use native libraries provided by Airflow in conjunction with python libraries available over public PyPI repositories.

But sometimes, DAG authors need to construct DAG using libraries that are either in-house or not available over public PyPI repositories. This poses a serious challenge for users who want to run their custom code with Airflow DAGs, particularly when Airflow is deployed in a cloud-native fashion.

Traditionally, these packages are baked in Airflow Docker images. This won’t work post deployment and is super impractical if your library is under development.

But sometimes, DAG authors need to construct DAG using libraries that are either in-house or not available over public PyPI repositories. This poses a serious challenge for users who want to run their custom code with Airflow DAGs, particularly when Airflow is deployed in a cloud-native fashion.
We propose a solution that creates a dedicated Airflow global python environment that dynamically generates the requirements, establishes a version-compatible pyenv adhering to Airflow’s policies, and manages custom pip repository authentication seamlessly. Importantly, the service executes these steps in a fail-safe manner, not compromising core components.



Traditionally, these packages are baked in Airflow Docker images. This won’t work post deployment and is super impractical if your library is under development.



We propose a solution that creates a dedicated Airflow global python environment that dynamically generates the requirements, establishes a version-compatible pyenv adhering to Airflow’s policies, and manages custom pip repository authentication seamlessly. Importantly, the service executes these steps in a fail-safe manner, not compromising core components.



Join us as we discuss the solution to this common problem, touching upon the design, and seeing the solution in action. We also candidly discuss some challenges, and the shortcomings of the proposed solution.
Join us as we discuss the solution to this common problem, touching upon the design, and seeing the solution in action. We also candidly discuss some challenges, and the shortcomings of the proposed solution.
@@ -0,0 +1,23 @@
---
title: "AI Reality Checkpoint: The Good, the Bad, and the Overhyped"
slug: ai-reality-checkpoint-the-good-the-bad-and-the-overhyped
speakers:
- Maxime Beauchemin
time_start: 2024-09-10 17:10:00
time_end: 2024-09-10 17:35:00
room: Elizabethan A+B
track: Community
day: 20241
timeslot: 37
gridarea: "14/4/15/5"
images:
-
---

In the past 18 months, artificial intelligence has not just entered our workspaces – it has taken over. As we stand at the crossroads of innovation and automation, it's time for a candid reflection on how AI has reshaped our professional lives, and to talk about where it's been a game changer, where it's falling short, and what's about to shift dramatically in the short term.

Since the release of ChatGPT in December 2022, I've developed a "first-reflex" to augment and accelerate nearly every task with AI. As a founder and CEO, this spans a wide array of responsibilities from fundraising, internal communications, legal, operations, product marketing, finance, and beyond. In this keynote, I'll cover diverse use cases across all areas of business, offering a comprehensive view of AI's impact.

Join me as I sort out through this new reality and try and forecast the future of AI in our work. It's time for a radical checkpoint. Everything's changing fast. In some areas, AI has been a slam dunk; in others, it's been frustrating as hell. And once a few key challenges are tackled, we're on the cusp of a tsunami of transformation.

3 major milestones are right around the corner: top-human-level reasoning, solid memory accumulation and recall, and proper executive skills. How is this going to affect all of us?

This file was deleted.

@@ -1,12 +1,12 @@
---
-title: "To be confirmed"
-slug: to-be-defined-2
+title: "Keynote to be confirmed"
+slug: keynote-tbd
speakers:
-
time_start: 2024-09-11 09:00:00
time_end: 2024-09-11 09:25:00
room: Grand Ballroom
-track:
+track: Keynote
day: 20242
timeslot: 40
gridarea: "1/2/2/6"
@@ -7,7 +7,7 @@ speakers:
time_start: 2024-09-11 09:30:00
time_end: 2024-09-11 09:55:00
room: Grand Ballroom
-track: Use cases
+track: Keynote
day: 20242
timeslot: 41
gridarea: "2/2/3/6"
@@ -6,7 +6,7 @@ speakers:
time_start: 2024-09-11 11:00:00
time_end: 2024-09-11 11:25:00
room: Elizabethan A+B
-track:
+track: Sponsored
day: 20242
timeslot: 47
gridarea: "5/4/6/5"
@@ -6,7 +6,7 @@ speakers:
time_start: 2024-09-11 11:30:00
time_end: 2024-09-11 11:55:00
room: Elizabethan A+B
-track: Airflow & ...
+track: Sponsored
day: 20242
timeslot: 51
gridarea: "6/4/7/5"
@@ -6,7 +6,7 @@ speakers:
time_start: 2024-09-11 12:00:00
time_end: 2024-09-11 12:25:00
room: Elizabethan A+B
-track:
+track: Sponsored
day: 20242
timeslot: 53
gridarea: "7/4/8/5"
@@ -6,7 +6,7 @@ speakers:
time_start: 2024-09-11 12:30:00
time_end: 2024-09-11 12:55:00
room: Elizabethan A+B
-track: Use cases
+track: Sponsored
day: 20242
timeslot: 56
gridarea: "8/4/9/5"
@@ -6,7 +6,7 @@ speakers:
time_start: 2024-09-11 13:00:00
time_end: 2024-09-11 13:25:00
room: Elizabethan A+B
-track:
+track: Sponsored
day: 20242
timeslot: 58
gridarea: "9/4/10/5"
@@ -0,0 +1,23 @@
---
title: "Scale and Security : How Autodesk Securely Develops and Tests PII Pipelines with Airflow"
slug: scale-and-security-how-autodesk-securely-develops-and-tests-pii-pipelines-with-airflow
speakers:
- Bhavesh Jaisinghani
time_start: 2024-09-12 10:30:00
time_end: 2024-09-12 10:55:00
room: Elizabethan A+B
track: Use cases
day: 20243
timeslot: 84
gridarea: "4/4/5/5"
images:
-
---

In today's data-driven era, ensuring data reliability and enhancing our testing and development capabilities are paramount. Local unit testing has its merits but falls short when dealing with the volume of big data. One major challenge is running Spark jobs pre-deployment to ensure they produce expected results and handle production-level data volumes.

In this talk, we will discuss how Autodesk leveraged Astronomer to improve pipeline development. We'll explore how it addresses challenges with sensitive and large data sets that cannot be transferred to local machines or non-production environments. Additionally, we'll cover how this approach supports over 10 engineers working simultaneously on different feature branches within the same repo.

We will highlight the benefits, such as conflict-free development and testing, and eliminating concerns about data corruption when running DAGs on production Airflow servers.

Join me to discover how solutions like Astronomer empower developers to work with increased efficiency and reliability. This talk is perfect for those interested in big data, cloud solutions, and innovative development practices.
21 changes: 0 additions & 21 deletions content/sessions/2024/95-data-centric-airflow.md

This file was deleted.
