FEAT: add job runner for vertex ai custom training job (#18)
haru-256 authored Jan 5, 2025
1 parent a725861 commit 7ecf9be
Showing 8 changed files with 364 additions and 0 deletions.
1 change: 1 addition & 0 deletions cmd/job-runner/.env
@@ -0,0 +1 @@
GCS_PATH="/gcs/bucket/path"
1 change: 1 addition & 0 deletions cmd/job-runner/.gitignore
@@ -0,0 +1 @@
!.env
1 change: 1 addition & 0 deletions cmd/job-runner/.python-version
@@ -0,0 +1 @@
3.12.8
26 changes: 26 additions & 0 deletions cmd/job-runner/Makefile
@@ -0,0 +1,26 @@
.PHONY: lint fmt help test lock install train
.DEFAULT_GOAL := help

lint: ## Run Linter
	uv run ruff check .
	uv run mypy .

fmt: ## Run formatter
	uv run ruff check --fix .
	uv run ruff format .

test: ## Run tests
	uv run pytest .

lock: ## Lock dependencies
	uv lock

install: ## Setup the project
	uv sync --all-groups

train: ## Train the model
	uv run python train.py

help: ## Show options
	@grep -E '^[a-zA-Z_-]+:.*?## .*$$' $(MAKEFILE_LIST) | \
		awk 'BEGIN {FS = ":.*?## "}; {printf "\033[36m%-20s\033[0m %s\n", $$1, $$2}'
9 changes: 9 additions & 0 deletions cmd/job-runner/README.md
@@ -0,0 +1,9 @@
# Vertex AI Custom Training Job Runner

A command line utility for running machine learning training jobs on Google Cloud Vertex AI.

## Usage

```sh
uv run main.py --args=...
```
31 changes: 31 additions & 0 deletions cmd/job-runner/main.py
@@ -0,0 +1,31 @@
import pathlib
from typing import Literal

from loguru import logger
from pydantic import Field
from pydantic_settings import BaseSettings, CliApp, SettingsConfigDict


class Settings(BaseSettings):
    """Settings for the job-runner"""

    model_config = SettingsConfigDict(
        cli_parse_args=True, cli_prog_name="job-runner", env_file=".env", env_file_encoding="utf-8"
    )

    # from environmental variables
    # FIXME: this also seems to become a required parameter that must be passed via the CLI
    gcs_path: pathlib.Path = Field(description="[ENV] Google Cloud Storage path to the model")

    # from args
    machine_type: Literal["g2-instance=4", "g2-instance=12"] = Field(
        description="Machine type to use for the job"
    )

    def cli_cmd(self) -> None:
        logger.info("Running the job-runner cli command")
        logger.info(self.model_dump())


if __name__ == "__main__":
    s = CliApp.run(Settings)
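
The FIXME in `main.py` is about whether `gcs_path` must be given on the command line even when `GCS_PATH` is set in `.env`. The block below is a minimal sketch of the intended behavior, not the commit's implementation. It swaps in the standard-library `argparse` for pydantic-settings so it runs without extra dependencies. The helper name `parse_settings` and its `env` parameter are hypothetical. The flag names `--gcs_path` and `--machine_type` are assumed to mirror the field names.

```python
import argparse
import pathlib

# Allowed values copied from Settings.machine_type in main.py.
MACHINE_TYPES = ("g2-instance=4", "g2-instance=12")


def parse_settings(argv: list[str], env: dict[str, str]) -> argparse.Namespace:
    """Parse CLI args; GCS_PATH from the environment satisfies --gcs_path.

    Hypothetical helper: an argparse stand-in for the pydantic-settings model.
    """
    parser = argparse.ArgumentParser(prog="job-runner")

    # Treat an unset or empty GCS_PATH as "not provided".
    env_gcs_path = env.get("GCS_PATH") or None
    parser.add_argument(
        "--gcs_path",
        type=pathlib.Path,
        default=pathlib.Path(env_gcs_path) if env_gcs_path else None,
        # Only demand the flag when the environment does not supply a value.
        required=env_gcs_path is None,
        help="[ENV] Google Cloud Storage path to the model",
    )
    parser.add_argument(
        "--machine_type",
        choices=MACHINE_TYPES,
        required=True,
        help="Machine type to use for the job",
    )
    return parser.parse_args(argv)
```

Calling `parse_settings(["--machine_type", "g2-instance=4"], {"GCS_PATH": "/gcs/bucket/path"})` takes `gcs_path` from the environment. An explicit `--gcs_path` overrides the environment value, and omitting both makes `argparse` exit with a usage error.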
35 changes: 35 additions & 0 deletions cmd/job-runner/pyproject.toml
@@ -0,0 +1,35 @@
[project]
name = "job-runner"
version = "0.1.0"
description = "Command line utilities to run ml training job"
readme = "README.md"
requires-python = ">=3.12"
dependencies = ["loguru~=0.7.3", "pydantic~=2.10.4", "pydantic-settings~=2.7.1"]

[dependency-groups]
dev = ["pytest~=8.3.4", "mypy~=1.14.1"]
lint = ["ruff~=0.8.4"]

[tool.ruff]
target-version = "py312"
line-length = 100

[tool.ruff.lint]
extend-select = ["I"]

[tool.mypy]
python_version = "3.12"
exclude = [".venv"]
plugins = ["pydantic.mypy"]

follow_imports = "silent"
warn_redundant_casts = true
warn_unused_ignores = true
disallow_any_generics = true
no_implicit_reexport = true
disallow_untyped_defs = true

[tool.pydantic-mypy]
init_forbid_extra = true
init_typed = true
warn_required_dynamic_aliases = true
260 changes: 260 additions & 0 deletions cmd/job-runner/uv.lock

Large diffs are not rendered by default.
