Skip to content

Latest commit

 

History

History
56 lines (40 loc) · 2.31 KB

developer-guide.md

File metadata and controls

56 lines (40 loc) · 2.31 KB

Developer Guide

Run a full development build

Use the following instructions if you want to run a full development build, compile the code, and build the Docker images locally.

Install:

  • go: a working Go environment is required to build the code

  • glide: the glide package manager for Go (https://glide.sh)

  • npm: the Node.js package manager (https://www.npmjs.com) for building the Web UI

  • Go is very specific about directory layouts. Make sure to set your $GOPATH and clone this repo to a directory $GOPATH/src/github.com/IBM/FfDL before proceeding with the next steps.

If you are developing on Minikube, please run the following commands to configure your Minikube and set up the Docker client to use the Minikube's Docker Daemon. Then start to build your own Docker images.

export VM_TYPE=minikube
make minikube
eval $(minikube docker-env)

Then, fetch the dependencies via:

glide install

Compile the code and build the Docker images via:

make build
make docker-build

Make sure kubectl points to the right target context/namespace, then deploy the services to your Kubernetes environment (using helm):

make deploy

Enable device plugin for GPU workloads with development build

Please modify the resourceGPU under lcm/service/lcm/container_helper.go and lcm/service/lcm/resources.go to "nvidia.com/gpu" and rebuild the lcm image to enable device plugin for all GPU workloads on your development build.

Enable custom learner images with development build

Please uncomment the following section under trainer/trainer/frameworks.go and rebuild the trainer image to enable custom learner images from any users. Alternatively, you can use the pre-built images ffdl/ffdl-trainer:customizable on DockerHub.

if fwName == "custom" {
  return true, ""
}

After you deployed ffdl-trainer with custom image feature, you can use your custom learner images by changing the framework.name to custom and framework.version to your learner image in your training job's manifest.yml file. If you are using any private registry, you need to enable access to the private registry in your Kubernetes default namespace.