Use the following instructions if you want to run a full development build, compile the code, and build the Docker images locally.
Install:
- go: a working Go environment is required to build the code
- glide: the glide package manager for Go (https://glide.sh)
- npm: the Node.js package manager (https://www.npmjs.com) for building the Web UI

Go is very specific about directory layouts. Make sure to set your $GOPATH and clone this repo to the directory $GOPATH/src/github.com/IBM/FfDL before proceeding with the next steps.
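As a quick sanity check before building, the prerequisites and the expected checkout location can be verified with a small script. This is only a sketch: the helper names `missing_tools` and `ffdl_dir` are hypothetical and not part of this repo, and the fallback to `$HOME/go` assumes Go's default GOPATH when the variable is unset.

```shell
#!/bin/sh
# Hypothetical helper: print every tool from the argument list that is
# not found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Hypothetical helper: print the directory where this repo is expected
# to live under the given GOPATH (a trailing slash is tolerated).
ffdl_dir() {
    printf '%s/src/github.com/IBM/FfDL\n' "${1%/}"
}

missing=$(missing_tools go glide npm)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "missing tools:" $missing >&2
fi

# Assumption: fall back to Go's default GOPATH when the variable is unset.
echo "expected checkout: $(ffdl_dir "${GOPATH:-$HOME/go}")"
```

If the script reports no missing tools and your clone sits in the printed directory, the layout matches what the build steps expect.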
If you are developing on Minikube, run the following commands to configure Minikube and point your Docker client at Minikube's Docker daemon, so that the images you build in the next steps are available inside the cluster.
export VM_TYPE=minikube
make minikube
eval $(minikube docker-env)
Then, fetch the dependencies via:
glide install
Compile the code and build the Docker images via:
make build
make docker-build
Make sure kubectl points to the right target context/namespace, then deploy the services to your Kubernetes environment (using helm):
make deploy
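One way to guard against deploying to the wrong cluster is to compare kubectl's active context with the one you intend to use before running the deploy step. The sketch below uses the standard `kubectl config current-context` command; the function name `check_context` and the context name `minikube` are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed only when kubectl's active context
# matches the expected name passed as the first argument.
check_context() {
    expected=$1
    current=$(kubectl config current-context) || return 1
    if [ "$current" != "$expected" ]; then
        echo "kubectl points at '$current', expected '$expected'" >&2
        return 1
    fi
}

# Typical use (the context name "minikube" is an assumption):
#   check_context minikube && make deploy
```

Chaining with `&&` means the deploy target only runs when the check passes.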
To enable the device plugin for all GPU workloads on your development build, change resourceGPU in lcm/service/lcm/container_helper.go and lcm/service/lcm/resources.go to "nvidia.com/gpu" and rebuild the lcm image.
To allow custom learner images from any user, either use the pre-built image ffdl/ffdl-trainer:customizable from DockerHub, or uncomment the following section in trainer/trainer/frameworks.go and rebuild the trainer image:
if fwName == "custom" {
	return true, ""
}
After you have deployed ffdl-trainer with the custom image feature, you can use your custom learner images by setting framework.name to custom and framework.version to your learner image in your training job's manifest.yml file. If you are using a private registry, you need to enable access to it in your Kubernetes default namespace.
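The two steps above can be sketched as shell helpers. Both function names are hypothetical. The first prints a manifest fragment, assuming that framework.name and framework.version are nested keys under a top-level framework entry. The second shows one common Kubernetes approach to private registries, which may differ from what your cluster needs: create an image pull secret with `kubectl create secret docker-registry` and attach it to the default service account of the default namespace. The secret name, registry host, and credentials are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print the framework section of manifest.yml for a
# custom learner image (assumes nested "framework" keys).
custom_framework_fragment() {
    printf 'framework:\n  name: custom\n  version: "%s"\n' "$1"
}

# Hypothetical helper: create an image pull secret and attach it to the
# default service account in the "default" namespace.
# Arguments: secret name, registry server, user name, password.
enable_registry_access() {
    secret=$1
    server=$2
    user=$3
    password=$4
    kubectl create secret docker-registry "$secret" \
        --namespace default \
        --docker-server="$server" \
        --docker-username="$user" \
        --docker-password="$password" || return 1
    kubectl patch serviceaccount default --namespace default \
        -p "{\"imagePullSecrets\": [{\"name\": \"$secret\"}]}"
}

# Example (all values are placeholders):
#   custom_framework_fragment registry.example.com/me/learner:v1
#   enable_registry_access regcred registry.example.com alice "$REGISTRY_PASSWORD"
```

Attaching the secret to the service account avoids having to reference it in every pod specification.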