Commit

XD community talk
pedrogk committed Aug 17, 2024
1 parent 4a428ec commit d4c22af
Showing 11 changed files with 44 additions and 10 deletions.
@@ -4,6 +4,7 @@ slug: optimizing-critical-operations-enhancing-robinhood-s-workflow-journey-with
speakers:
- Kevin Wang
- Palanieppan Muthiah
- Peiqiu Tian
time_start: 2024-09-10 17:10:00
time_end: 2024-09-10 17:35:00
room: Georgian
@@ -12,6 +12,7 @@ timeslot: 6
gridarea: "5/3/6/4"
images:
- /images/sessions/2024/mastering-llm-batch.jpg
draft: true
---

As large language models (LLMs) gain traction, companies encounter challenges in deploying them effectively. This session focuses on using Airflow to manage LLM batch pipelines, addressing rate limits and optimizing asynchronous batch APIs. We will discuss strategies for managing cloud provider rate limits efficiently to ensure uninterrupted, cost-effective LLM operations. This includes queuing and job prioritization techniques to optimize throughput. Additionally, we'll explore asynchronous batch processing for tasks such as Retrieval Augmented Generation (RAG) and vector embedding, which enhance processing efficiency and reduce latency. The session features a hands-on demonstration on AWS's managed Airflow service, providing practical insights into configuring and scaling LLM workflows in the cloud.
@@ -3,22 +3,21 @@ title: "DAGify - Enterprise Scheduler Migration Accelerator for Airflow"
slug: dagify-enterprise-scheduler-migration-accelerator-for-airflow
speakers:
- Konrad Schieban
- Tim Hiatt
time_start: 2024-09-12 11:30:00
time_end: 2024-09-12 12:15:00
time_end: 2024-09-12 11:55:00
room: California West
track: Community
day: 20243
timeslot: 90
gridarea: "6/3/8/4"
gridarea: "6/3/7/4"
images:
- /images/sessions/2024/dagify.jpg
---

DAGify is a highly extensible, template-driven enterprise scheduler migration accelerator that helps organizations speed up their migration to Apache Airflow. While DAGify does not claim to migrate 100% of existing scheduler functionality, it aims to heavily reduce the manual effort it takes for developers to convert their enterprise scheduler formats into Python-native Airflow DAGs.

DAGify is an open-source tool under the Apache 2.0 license and is available on GitHub (https://github.com/GoogleCloudPlatform/dagify).
DAGify is an open-source tool under the Apache 2.0 license and is available on GitHub (https://github.com/GoogleCloudPlatform/dagify).

In this session we will introduce DAGify, its use cases and demo its functionality by converting Control-M XML files to Airflow DAGs.
In this session we will introduce DAGify, its use cases and demo its functionality by converting Control-M XML files to Airflow DAGs.

Additionally we will highlight DAGify's "no-code" extensibility by creating custom conversion templates to map Control-M functionality to Airflow operators.
Additionally we will highlight DAGify's "no-code" extensibility by creating custom conversion templates to map Control-M functionality to Airflow operators.
@@ -10,7 +10,7 @@ time_end: 2024-09-12 12:15:00
room: Georgian
track: Use cases
day: 20243
timeslot: 92
timeslot: 91
gridarea: "6/5/8/6"

images:
19 changes: 19 additions & 0 deletions content/sessions/2024/92-lessons-learned-airflow-open-source.md
@@ -0,0 +1,19 @@
---
title: "Lessons Learned While Using Airflow as Open-Source Software"
slug: lessons-learned-airflow-open-source
speakers:
- Xiaodong Deng
time_start: 2024-09-12 12:00:00
time_end: 2024-09-12 12:25:00
room: California West
track: Community
day: 20243
timeslot: 92
gridarea: "7/3/8/4"
images:
- /images/sessions/2024/lessons-learned-os.jpg

---

Apache Airflow is an essential piece of the data infrastructure for many organizations and has been widely adopted by data engineers across domains for orchestration. Because it is open source, there are varied strategies for operating Airflow, each with its own challenges. In this talk, we will explore common challenges that users face when they operate Airflow as open-source software, and the lessons learned. These lessons should also apply to operating other open-source software.

@@ -18,9 +18,7 @@ images:

At Vibrant Planet, we're on a mission to make the world's communities and ecosystems more resilient in the face of climate change. Our cloud-based platform is designed for collaborative scenario planning to tackle wildfires, climate threats, and ecosystem restoration on a massive scale.



In this talk, we will dive into how we are using Airflow. In particular, we will focus on how we're making Airflow pipelines smarter and more resilient, especially when processing large satellite imagery and other geospatial data.
In this talk, we will dive into how we are using Airflow. In particular, we will focus on how we're making Airflow pipelines smarter and more resilient, especially when processing large satellite imagery and other geospatial data.



1 change: 1 addition & 0 deletions content/speakers/avichay-marciano/_index.md
@@ -9,6 +9,7 @@ linkedin: https://www.linkedin.com/feed/
github:
events:
- 2024
draft: true
---

Senior Solutions Architect at Amazon Web Services, specializing in Analytics and Machine Learning. Previously, I worked for 9 years at Intel Corporation as a Backend Tech Lead developing data-oriented solutions.
14 changes: 14 additions & 0 deletions content/speakers/peiqiu-tian/_index.md
@@ -0,0 +1,14 @@
---
title: "Peiqiu Tian"
date: 2024-08-17T12:42:52-05:00
images:
- /images/speakers/peiqiu-tian.jpg
designation: Software Engineer at Robinhood
twitter:
linkedin:
github:
events:
- 2024
---

Peiqiu Tian is a software engineer working on the Robinhood Workflow Infrastructure team. He is dedicated to building a reliable and efficient workflow infrastructure for Robinhood. Prior to this role, he worked on backend development on Wish's logistics team.
1 change: 1 addition & 0 deletions content/speakers/tim-hiatt/_index.md
@@ -9,6 +9,7 @@ linkedin: https://www.linkedin.com/in/timhiatt
github:
events:
- 2024
draft: true
---


Binary file added static/images/speakers/peiqiu-tian.jpg