Please refer to our report for a detailed explanation of our project, including the following sections:
- Introduction
- Background
- Neural Network Inference
- Traditional OS Scheduling
- Design
- Client
- Scheduler
- Evaluation
- Meeting SLOs Better
- Tradeoffs and Limitations
- Related Work
- Benefits of Overlapping CPUs and GPUs
- Overview of the Triton Server and Client
- Work Distribution

Thanks to the course staff of CS 744 Big Data Systems for their guidance and for the blueprint for this project.