The codebase in the integration_poc folder is an implementation of a V6 node intended to behave like a regular (Docker-based) V6 node while using a K8S cluster under the hood. It is based on the node_poc.
- v6_k8s_node.py:
  - The application that launches the node. It can be launched from the host for testing purposes, but it is intended to run within a POD.
- container_manager.py:
  - The refactored version of DockerManager (and the other classes it uses), built on the K8S API.
- vantage6:
  - This folder contains the minimum set of original vantage6-node modules needed for the node to work (e.g., data-transfer classes, data exchange utilities, etc.). These belong to version 4.5.5 of vantage6.
- Integrating the minimum number of V6 core dependencies for reimplementing a node
- Authentication against the server, Socket.io connection
- Creating I/O and token files, binding them to the POD, setting the ENV variables required by the algorithm
- Launching a V6-algorithm (tested through the kubernetes dashboard)
- Listening for task finalization (implemented in the PoC; adaptation is required)
- Launching the node and the proxy as a POD
- Reporting the results back to the server properly
- Making the node proxy reachable by the Job PODs (the ones running the algorithms) using a FQDN
- Running both central and partial functions with other K8S and regular nodes
- Isolating Job PODs: applying the networking policies as was done in the PoC
- Encrypted data exchange
- Handling multiple K8S statuses (see reported issues)
- Other features yet to be explored through the architectural proof of concept
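Two of the items above (setting the ENV variables required by the algorithm, and reaching the node proxy through a FQDN) can be illustrated with a minimal sketch. It builds a Kubernetes Job manifest as a plain Python dictionary and does not reflect the actual code in container_manager.py. The function names, the service name `v6-proxy`, the image name, and the environment variable names are hypothetical. The `<service>.<namespace>.svc.<cluster-domain>` pattern is the standard Kubernetes DNS form for a Service, assuming the default `cluster.local` domain.

```python
def proxy_fqdn(service, namespace, cluster_domain="cluster.local"):
    """Return the in-cluster DNS name of a Service (standard K8S naming)."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def build_job_manifest(task_id, image, namespace, env):
    """Build a batch/v1 Job manifest as a dict.

    The Job is named after the task ID, mirroring the PoC behaviour where
    the Job ID matches the Task ID on the V6 server.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": str(task_id), "namespace": namespace},
        "spec": {
            "backoffLimit": 0,  # do not retry a failed algorithm run
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "algorithm",  # hypothetical container name
                            "image": image,
                            "env": [
                                {"name": key, "value": value}
                                for key, value in env.items()
                            ],
                        }
                    ],
                }
            },
        },
    }


# Hypothetical usage: service, image, and variable names are illustrative only.
fqdn = proxy_fqdn("v6-proxy", "v6-jobs")
manifest = build_job_manifest(
    task_id=42,
    image="example.org/some-algorithm:latest",
    namespace="v6-jobs",
    env={"PROXY_HOST": f"http://{fqdn}", "PROXY_PORT": "8080"},
)
```

A dictionary like this can be serialized to YAML or JSON, or passed to a Kubernetes client; which route the PoC takes is not covered by this sketch.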
1. Set up a vantage6 server (tested with versions 4.7.0 and 4.8.1), create an organization for the K8S-V6-node, and create a collaboration that includes it. Copy the JSON Web Token, as it will be used later. Alternatively, you can use an existing v6 server, such as https://cotopaxi.vantage6.ai, if you have access to it.
2. Set up microk8s on Linux. It can be installed on Windows, but this PoC has been tested only on Ubuntu environments.
3. Set up and enable the Kubernetes dashboard following the microk8s guidelines.
4. Clone the repository. Work in the integration_poc folder.
5. Build the Docker image and publish it on a Docker registry. You can edit and use the script build_and_publish.sh to build an image compatible with both ARM and x86 architectures.
6. Edit the v6-node configuration file configs/node_legacy_config.yaml:
   - 6.1 Add the connection settings of your V6 server:

         # API key used to authenticate at the server.
         api_key:
         # URL of the vantage6 server
         server_url:
         # port the server listens to
         port:
         # API path prefix that the server uses. Usually '/api' or an empty string
         api_path:

   - 6.2 Update the path of the CSV file included in the repository (or any other CSV file you want to use) as the default database:

         databases:
           - label: default
             uri: /<path>/v6-Kubernetes-PoC/csv/employees.csv
             type: csv

   - 6.3 Set the task_dir setting (the directory where local task files are stored):

         task_dir: /<output_path>/tasks

Important: Create the task folder on the machine where the v6 node will run before launching the node. Otherwise, Kubernetes will create it with root as the owner, which causes problems because the Job PODs do not have root privileges to create sub-folders in it.
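The settings from steps 6.1 to 6.3 can be sanity-checked before the node is launched. The sketch below assumes the YAML file has already been loaded into a Python dictionary with the same keys shown above. The helper names are hypothetical, and joining `server_url`, `port`, and `api_path` into a single endpoint is an assumption about how the values are combined, not a confirmed detail of the node. `ensure_task_dir` creates the task folder as the current user, so that Kubernetes does not create it later with root as the owner.

```python
import os
from pathlib import Path

REQUIRED_KEYS = ("api_key", "server_url", "port", "api_path", "databases", "task_dir")


def missing_keys(config):
    """List required settings that are absent or empty.

    An empty string is accepted for api_path, since the configuration
    comments state that it is usually '/api' or an empty string.
    """
    missing = []
    for key in REQUIRED_KEYS:
        value = config.get(key)
        if value is None or (value == "" and key != "api_path"):
            missing.append(key)
    return missing


def server_endpoint(config):
    """Combine the connection settings into one URL (assumed layout)."""
    return f"{config['server_url']}:{config['port']}{config['api_path']}"


def ensure_task_dir(task_dir):
    """Create the task folder if needed; report whether the current user owns it."""
    path = Path(task_dir)
    path.mkdir(parents=True, exist_ok=True)
    return path.stat().st_uid == os.getuid()  # Unix-only check
```

For example, `missing_keys({"server_url": "https://server.example"})` lists every other required key, which makes an incomplete configuration visible before the POD is started.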
7. Edit the Kubernetes YAML configuration file used for launching the node as a POD (kubeconfs/node_pod_config.yaml):
   - Add the reference to the v6 node image (e.g., on Docker Hub) created in Step #5:

         containers:
           - name: v6-node-server
             image: DOCKER_IMAGE_GENERATED_IN_PREVIOUS_STEPS
             tty: true
             env:

   - Add the full URI of the default database defined in Step #6. In the mountPath, include the prefix /app/.databases/, and in the hostPath add the URI as-is:

         volumeMounts:
           ...
           - name: v6-node-default-database
             mountPath: /app/.databases/{ABSOLUTE_HOST_PATH_OF_DEFAULT_DATABASE}
         ...
         volumes:
           ...
           - name: v6-node-default-database
             hostPath:
               path: ABSOLUTE_HOST_PATH_OF_DEFAULT_DATABASE

   - Add the absolute path of the task folder, as defined in the v6-node configuration file (Step #6.3):

         volumes:
           - name: task-files-root
             hostPath:
               path: ABSOLUTE_HOST_PATH_OF_THE_TASK_FOLDER

   - Add the absolute path of the Kubernetes configuration file. This integration PoC has been tested with Ubuntu Server 22.04 and MicroK8S, where this configuration file is located by default at /home/<user_name>/.kube/config:

         volumes:
           ...
           - name: kube-config-file
             hostPath:
               path: ABSOLUTE_HOST_PATH_OF_THE_KUBE_CONFIG_FOLDER

   - Add the absolute path of the vantage6 configuration file (the one edited in Step #6):

         volumes:
           ...
           - name: v6-node-config-file
             hostPath:
               path: ABSOLUTE_HOST_PATH_OF_THE_VANTAGE6_NODE_CONFIG_FILE

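The relationship between the mountPath and the hostPath of the default database is easy to get wrong by hand, so a small sketch may help. It derives both entries from one absolute host path, assuming the /app/.databases prefix and the host path are joined with a single slash (the placeholder in the YAML above does not make this explicit). The function name is hypothetical; the volume name is the one used in the configuration above.

```python
from pathlib import PurePosixPath

DATABASE_MOUNT_PREFIX = "/app/.databases"


def database_volume(host_path, name="v6-node-default-database"):
    """Return (volumeMount, volume) dicts for a database file on the host.

    The mountPath is the prefix followed by the absolute host path;
    the hostPath is the host path as-is.
    """
    host = PurePosixPath(host_path)
    if not host.is_absolute():
        raise ValueError(f"expected an absolute path, got {host_path!r}")
    volume_mount = {"name": name, "mountPath": DATABASE_MOUNT_PREFIX + str(host)}
    volume = {"name": name, "hostPath": {"path": str(host)}}
    return volume_mount, volume


# Example with an illustrative path (not taken from the repository):
mount, volume = database_volume("/data/csv/employees.csv")
# mount["mountPath"] is "/app/.databases/data/csv/employees.csv"
# volume["hostPath"]["path"] is "/data/csv/employees.csv"
```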
8. Launch the node with the kubectl command:

       kubectl apply -f kubeconfs/node_pod_config.yaml

9. Open the Kubernetes dashboard, select the 'v6-jobs' namespace, and check that the v6-node-pod POD is now running.
10. Open the v6-node-pod logs on the dashboard and check that the node startup sequence completed without problems.
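The check that the v6-node-pod POD is running can also be scripted instead of being done through the dashboard. The sketch below is one possible approach, assuming kubectl is on the PATH and configured for the cluster. The function names are hypothetical; the kubectl flags (`-n`, `-o jsonpath=...`) and the `Running` POD phase are standard Kubernetes. The `run` parameter exists only so the command runner can be replaced in tests.

```python
import subprocess


def pod_phase_command(pod, namespace):
    """Build the kubectl command that prints a POD's phase (e.g. Running)."""
    return [
        "kubectl", "get", "pod", pod,
        "-n", namespace,
        "-o", "jsonpath={.status.phase}",
    ]


def pod_is_running(pod="v6-node-pod", namespace="v6-jobs", run=subprocess.run):
    """Return True if kubectl succeeds and reports the POD phase as Running."""
    result = run(
        pod_phase_command(pod, namespace),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.returncode == 0 and result.stdout.strip() == "Running"
```

Calling `pod_is_running()` with its defaults checks the POD and namespace names used in this guide.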
11. Request the execution of a V6 algorithm on the vantage6 server. You can do this using the web-based user interface or the Python client. Reload the logs from the previous step to verify that, once this execution request is received, a new Job POD is created.
Check the Jobs list on the K8S dashboard. The ID of the Job should match the ID of the Task on the V6 server. Please note that once the job is finished, the PODs created for it are destroyed, so you may not be able to check the following details if the job has a short runtime. The following screenshots are from a function that 'sleeps' for two minutes before returning the results:
When checking the Job, you will see the related POD that is running the container. The POD has the same name as the Job, plus a random string. From here, you can see the execution logs and even open a shell terminal (Exec) in the container.
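The naming rule described above (Job name plus a random string) can be used to pick out the PODs that belong to a given task. The following is a minimal sketch with a hypothetical helper name. It matches on the name prefix only; selecting PODs by the label that Kubernetes attaches to PODs created by a Job is likely the more robust route, but it is not shown here.

```python
def pods_for_job(job_name, pod_names):
    """Return the POD names of the form '<job_name>-<suffix>'."""
    prefix = f"{job_name}-"
    return [
        name for name in pod_names
        if name.startswith(prefix) and len(name) > len(prefix)
    ]


# Illustrative names only: the Job name matches the task ID on the V6 server.
matching = pods_for_job("42", ["42-x7k2p", "421-abcde", "v6-node-pod"])
```

Including the hyphen in the prefix keeps a Job named 42 from matching the PODs of a Job named 421.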
The POD view provides further details on the container's mounted volumes and environment variables.