Integration proof of concept

The codebase in the integration_poc folder is an implementation of a V6 node intended to work as a regular (Docker-based) V6 node, but using a K8S cluster under the hood. It is based on the node_poc.

  • v6_k8s_node.py:
    • The application that launches the node. It can be launched from the host for testing purposes, but it is intended to run within a POD.
  • container_manager.py:
    • The refactored version of DockerManager (and other classes used by it) using the K8S API.
  • vantage6:
    • This folder contains the minimum set of original vantage6-node modules needed for the node to work (e.g., data-transfer classes, data exchange utilities, etc.). These belong to version 4.5.5 of vantage6.

Status

  • Integrating the minimum number of V6 core dependencies for reimplementing a node
  • Authentication against the server, Socket.io connection
  • Creating I/O and token files, binding them to the POD, setting the ENV variables required by the algorithm
  • Launching a V6-algorithm (tested through the kubernetes dashboard)
  • Listening for task finalization (implemented on the PoC, adaptation is required)
  • Launching the node and the proxy as a POD
  • Reporting the results back to the server properly
  • Making the NODE Proxy reachable by the Job-PODs (the ones running the algorithms) using a FQDN
  • Running both central and partial functions with other K8S and regular nodes.
  • Isolating jobs-pods: applying the networking policies as it was done on the PoC
  • Encrypted data exchange
  • Handling multiple K8S statuses (see reported issues)
  • + Other features yet to be explored through the architectural proof of concept

Launching the node and linking it to a V6 server (using microk8s)

  1. Set up a vantage6 server (tested with versions 4.7.0 and 4.8.1), create an organization for the K8S-V6-node, and create a collaboration that includes it. Copy the JSON Web Token, as it will be used later. Alternatively, you can use an existing v6 server, such as https://cotopaxi.vantage6.ai, if you have access to it.

  2. Set up microk8s on Linux. It can be installed on Windows, but this PoC has been tested only on Ubuntu environments.

  3. Set up and enable the Kubernetes dashboard following the microk8s guidelines.

  4. Clone the repository. Work on the integration_poc folder.

  5. Build the docker image and publish it on a Docker registry. You can edit and use the script build_and_publish.sh to build an image compatible with both ARM and x86 architectures.

  6. Edit the v6-node configuration file configs/node_legacy_config.yaml:

    • 6.1 Add the connection settings of your V6 server:
    # API key used to authenticate at the server.
    api_key:
    
    # URL of the vantage6 server
    server_url:
    
    # port the server listens to
    port:
    
    # API path prefix that the server uses. Usually '/api' or an empty string
    api_path:
    
    
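    • 6.2 Set the URI of the 'default' database (for example, the sample CSV file inside the cloned repository):
    databases: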
      - label: default
        uri: /<path>/v6-Kubernetes-PoC/csv/employees.csv
        type: csv
    
    • 6.3 Set the task_dir setting (directory where local task files are stored).
    task_dir: /<output_path>/tasks
    

Important

Create the task folder on the machine where the v6 node will run before launching the node. Otherwise, Kubernetes will create the folder with root as the owner, which causes problems because the Job PODs do not have root privileges to create sub-folders in it.
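
As a minimal sketch of the precaution above, the task folder can be created up front from a non-root account, followed by a check that it did not end up owned by root. The helper names `ensure_task_dir` and `owned_by_root` are hypothetical and not part of the PoC; the path to pass in is the `task_dir` value from your own node configuration.

```python
import os
from pathlib import Path


def ensure_task_dir(task_dir: str) -> Path:
    """Create the task folder (and any missing parents) if it does not exist yet."""
    path = Path(task_dir)
    # Creating the folder before the node starts keeps Kubernetes from
    # creating the hostPath directory itself with root as the owner.
    path.mkdir(parents=True, exist_ok=True)
    return path


def owned_by_root(path: Path) -> bool:
    """Return True when the folder is owned by uid 0 (root)."""
    return path.stat().st_uid == 0
```

Calling `ensure_task_dir(...)` as the regular user that hosts the node, and then confirming that `owned_by_root(...)` returns `False`, should be enough to catch the ownership problem before any Job POD is launched.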

  7. Edit the Kubernetes YAML configuration file (kubeconfs/node_pod_config.yaml) used for launching the Node as a POD:

    • Add the reference to the v6 node image (e.g., in Dockerhub) created in Step #5.
    containers:
    - name: v6-node-server
      image: DOCKER_IMAGE_GENERATED_IN_PREVIOUS_STEPS
      tty: true
      env:
    • Add the full URI of the default database defined in step #6. In the mountPath, include the prefix /app/.databases/, and in the hostPath, add the URI as-is:
    volumeMounts:
    ...
    - name: v6-node-default-database
      mountPath: /app/.databases/{ABSOLUTE_HOST_PATH_OF_DEFAULT_DATABASE}
    
    ...
    
    volumes:
    ...
    - name: v6-node-default-database
      hostPath:
        path: ABSOLUTE_HOST_PATH_OF_DEFAULT_DATABASE
    
    • Add the absolute path of the task folder, as defined in the v6-node configuration file (step #6.3).
    volumes:
    - name: task-files-root
      hostPath:
        path: ABSOLUTE_HOST_PATH_OF_THE_TASK_FOLDER
    • Add the absolute path of the Kubernetes configuration file. This integration PoC has been tested with Ubuntu Server 22.04 and MicroK8s, where this file is located by default at /home/<user_name>/.kube/config.
    volumes:
    ...
    - name: kube-config-file
      hostPath:
        path: ABSOLUTE_HOST_PATH_OF_THE_KUBE_CONFIG_FOLDER
    • Add the absolute path of the vantage6 configuration file (the one edited on Step #6):
    volumes:
    ...
    - name: v6-node-config-file
      hostPath:
        path: ABSOLUTE_HOST_PATH_OF_THE_VANTAGE6_NODE_CONFIG_FILE
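
Since every hostPath entry above must be an absolute path that already exists on the host, a quick pre-flight check can catch typos before the POD is launched. The following is only a sketch: the function name `check_host_paths` is hypothetical, and the dictionary keys are simply the volume names used in the snippets above.

```python
from pathlib import Path


def check_host_paths(host_paths: dict[str, str]) -> list[str]:
    """Return a list of readable problems; an empty list means every path looks fine."""
    problems = []
    for volume_name, raw_path in host_paths.items():
        path = Path(raw_path)
        if not path.is_absolute():
            problems.append(f"{volume_name}: '{raw_path}' is not an absolute path")
        elif not path.exists():
            problems.append(f"{volume_name}: '{raw_path}' does not exist on this host")
    return problems
```

For example, passing a dictionary with the keys `v6-node-default-database`, `task-files-root`, `kube-config-file` and `v6-node-config-file`, each mapped to the path you wrote in the YAML file, returns one message per path that needs attention.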
    
  8. Launch the Node with the kubectl command:

    bash:~/$ kubectl apply -f kubeconfs/node_pod_config.yaml
    
    
  9. Open the Kubernetes dashboard, select the 'v6-jobs' namespace, and check that the v6-node-pod POD is now running.
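
The same check can be done without the dashboard. The sketch below wraps a standard `kubectl get pod ... -o jsonpath` call; the function names are hypothetical, the POD name and namespace are the ones mentioned in this step, and it assumes that `kubectl` is on the PATH and configured for the microk8s cluster.

```python
import subprocess


def pod_phase_command(pod: str, namespace: str) -> list[str]:
    """Build the kubectl call that prints only the POD phase (e.g. Pending, Running)."""
    return ["kubectl", "get", "pod", pod, "-n", namespace,
            "-o", "jsonpath={.status.phase}"]


def pod_is_running(pod: str = "v6-node-pod", namespace: str = "v6-jobs") -> bool:
    """Return True when kubectl succeeds and reports the phase 'Running'."""
    result = subprocess.run(pod_phase_command(pod, namespace),
                            capture_output=True, text=True, check=False)
    return result.returncode == 0 and result.stdout.strip() == "Running"
```

Note that `pod_is_running()` raises `FileNotFoundError` if the `kubectl` executable is not installed, and returns `False` both when the POD is missing and when it is still starting.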

  10. Open the v6-node-pod logs on the dashboard and check that the node startup sequence completes without problems.

  11. Request the execution of a V6 algorithm on the vantage6 server. You can do this using the web-based user interface or the Python client. Reload the logs from the previous step to verify that, once the execution request is received, a new Job POD is created:

    Check the Jobs list on the K8S dashboard. The ID of the Job should match the ID of the Task on the V6 server. Please note that once the job is finished, the PODs created for it are destroyed, so you may not be able to check the following details if the job has a short runtime. The following screenshots are from a function that 'sleeps' for two minutes before returning the results:

    When checking the Job, you will see the related POD that is running the container. These will have the same name, plus a random string. From here, you can see the execution Logs and even open a shell terminal (Exec) on the container.

    The POD view provides further details on the container's mounted volumes and environment variables.
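
For short-lived jobs, the command line can be quicker than the dashboard. As a sketch, the helpers below build two standard kubectl calls: one lists the PODs of a Job through the `job-name` label that Kubernetes adds to every POD created by a Job, and the other fetches the Job's logs. The helper names are hypothetical, and the default namespace `v6-jobs` is an assumption based on the namespace used for the node POD; the exact Job name has to be taken from the Jobs list.

```python
def job_pods_command(job_name: str, namespace: str = "v6-jobs") -> list[str]:
    """kubectl call that lists the PODs belonging to one Job."""
    # Kubernetes labels every POD created by a Job with job-name=<job name>.
    return ["kubectl", "get", "pods", "-n", namespace,
            "-l", f"job-name={job_name}", "-o", "name"]


def job_logs_command(job_name: str, namespace: str = "v6-jobs") -> list[str]:
    """kubectl call that prints the logs of a POD created by the Job."""
    return ["kubectl", "logs", f"job/{job_name}", "-n", namespace]
```

Both functions only return the argument list, so it can be inspected first and then passed to `subprocess.run(...)`. As noted above, the PODs are destroyed once the job finishes, so these commands only return something while the job is still running.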
