Training Operator

Overview

This repository hosts the Kubernetes Training Operator for Kubeflow training jobs.

Description

The Kubeflow Training Operator provides Kubernetes custom resources for running distributed or non-distributed training jobs, such as TFJobs and PyTorchJobs. The Training Operator in this repository is a Python script that wraps the latest released Kubeflow Training Operator manifests and provides lifecycle management by handling events (install, upgrade, integrate, remove). It is one of the Charmed Kubeflow operators.
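To make the idea of a training-job custom resource more concrete, here is a minimal sketch that builds a PyTorchJob manifest as a plain Python dictionary and prints it as JSON. The helper `build_pytorchjob`, the job name `example-job`, and the image `example.com/train:latest` are hypothetical placeholders and not part of this repository; the field layout assumes the upstream `kubeflow.org/v1` PyTorchJob resource.

```python
import json


def build_pytorchjob(name, image, workers=2, namespace="default"):
    """Return a PyTorchJob manifest as a plain dict (illustrative helper)."""

    def replica(count):
        # Each replica type carries a replica count and a pod template.
        return {
            "replicas": count,
            "restartPolicy": "OnFailure",
            "template": {
                "spec": {
                    # Assumption: the training container is named "pytorch".
                    "containers": [{"name": "pytorch", "image": image}]
                }
            },
        }

    return {
        "apiVersion": "kubeflow.org/v1",
        "kind": "PyTorchJob",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "pytorchReplicaSpecs": {
                "Master": replica(1),
                "Worker": replica(workers),
            }
        },
    }


# Hypothetical job name and image, used only for illustration.
manifest = build_pytorchjob("example-job", "example.com/train:latest")
print(json.dumps(manifest, indent=2))
```

Because JSON is accepted wherever Kubernetes expects a manifest, the printed output could be saved to a file and submitted with `kubectl apply -f`, assuming the Training Operator is already running in the cluster.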

Usage

While it is possible to deploy the Training Operator as a standalone operator, it works best when deployed alongside other components included in the Kubeflow bundle. For installation steps, please refer to the installation guide.