QHub handles the initial setup and management of configurable data science environments, allowing users to deploy seamlessly using GitHub Actions.
QHub can be installed on a bare-metal server using HPC, on a Cloud provider or even locally for testing purposes. Review the options below to discover which option best suits your needs.
The local version is recommended for testing QHub's components due to its simplicity. Choose the local mode if:
- You already have Kubernetes clusters
- You want to test these Kubernetes clusters
- You have a local compute setup available
- You want to try out QHub with a quick-install to see how it works, without setting up environment variables
You should choose another installation option if you are starting from scratch (i.e., no clusters yet) and aiming to have a production environment.
The QHub HPC should be your choice if:
- You have highly optimized code that requires high-performance infrastructure
- You have existing infrastructure already available
- You expect that your infrastructure needs will not exceed the capabilities of the existing resources
NOTE: Although it is possible to deploy QHub HPC on the Cloud, it is not generally recommended due to possible high costs. For more information, check out the base cost section of the docs.
The Kubernetes deployment of QHub is considered to be the default option. If you are not sure which option to choose, try this one. It is suitable for most use cases, especially if:
- You require scalable infrastructure
- You aim to have a production environment with GitOps enabled by default
This version of QHub requires a choice of Cloud provider, authentication (using Auth0, GitHub, a custom OAuth provider, or password-based), domain registration, and CI provider (GitHub Actions or GitLab CI).
These services require global environment variables that, once set up, will trigger QHub's automatic deployment using your CI/CD platform of choice.
To find and set the environment variables, follow the steps described in the subsections below.
The first required step is to choose a Cloud provider to host the project deployment. The cloud installation is based on Kubernetes, but neither knowledge of Kubernetes nor in-depth knowledge of the specific provider is required. QHub supports Amazon AWS, DigitalOcean, GCP, and Azure.
To deploy QHub, all access keys require fairly wide permissions to create all the necessary cloud resources. Hence, once the Cloud provider has been chosen, follow the steps below and set the environment variables as specified with owner/admin level permissions.
For more details on configuration for each Cloud provider, check the How-To Guides section of the documentation.
Click for AWS configuration instructions
Please see these instructions for creating an IAM role with administrator permissions. Upon generation, the IAM role will provide a public access key ID and a secret key which will need to be added to the environment variables.
To define the environment variables, paste the commands below with your respective keys.
export AWS_ACCESS_KEY_ID="HAKUNAMATATA"
export AWS_SECRET_ACCESS_KEY="iNtheJUng1etheMightyJUNgleTHEl10N51eEpsT0n1ghy;"
Click to expand DigitalOcean configuration directions
Please see these instructions for creating a DigitalOcean token. In addition to a token, Spaces key credentials (similar to AWS S3) are also required. Follow the instructions on the official docs for more information.
Note: DigitalOcean's permissions model isn't as fine-grained as the other supported Cloud providers.
Set the required environment variables as specified below:
export DIGITALOCEAN_TOKEN="" # API token required to generate resources
export SPACES_ACCESS_KEY_ID="" # public access key for accessing Spaces
export SPACES_SECRET_ACCESS_KEY="" # the private key for accessing Spaces
export AWS_ACCESS_KEY_ID="" # set this variable with the same value as `SPACES_ACCESS_KEY_ID`
export AWS_SECRET_ACCESS_KEY="" # set this variable identical to `SPACES_SECRET_ACCESS_KEY`
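Because the two AWS-style variables must mirror the Spaces credentials, one way to avoid pasting the same secret twice is to derive them from the Spaces variables. This is a minimal sketch; the values shown are illustrative placeholders, not real credentials.

```shell
# Placeholder values for illustration only; substitute your real Spaces credentials.
export SPACES_ACCESS_KEY_ID="example-spaces-key-id"
export SPACES_SECRET_ACCESS_KEY="example-spaces-secret"

# Reuse the Spaces credentials for the AWS-style variables so the
# two pairs cannot drift apart.
export AWS_ACCESS_KEY_ID="$SPACES_ACCESS_KEY_ID"
export AWS_SECRET_ACCESS_KEY="$SPACES_SECRET_ACCESS_KEY"
```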
Click for GCP configuration specs
Follow these detailed instructions to create a Google Service Account with owner level permissions. Then, follow the steps described on the official GCP docs to create and download a JSON credentials file. Store this credentials file in a well-known location and make sure that only you have permission to access it.
You can change the file permissions by running the command chmod 600 <filename> on your terminal.
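The effect of the permission change can be checked from the same terminal. The sketch below uses a temporary file as a hypothetical stand-in for the downloaded credentials file and assumes a Unix-like system with the standard mktemp, ls, and cut utilities.

```shell
# Hypothetical stand-in for the downloaded JSON credentials file.
cred_file="$(mktemp)"

# Simulate a file that was saved with world-readable permissions.
chmod 644 "$cred_file"

# Give the owner read/write access and remove all access for group and others.
chmod 600 "$cred_file"

# The first ten characters of the long listing show the file mode.
mode="$(ls -l "$cred_file" | cut -c1-10)"
echo "$mode"   # prints: -rw-------

# Remove the temporary stand-in file.
rm -f "$cred_file"
```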
In this case, the environment variables are set as follows:
export GOOGLE_CREDENTIALS="path/to/JSON/file/with/credentials"
export PROJECT_ID="projectIDName"
NOTE: the PROJECT_ID variable can be found at the Google Console homepage, under Project info.
Click for Azure configuration details
Follow these instructions to create a Service Principal in the Azure Portal. After completing the steps described on the link, set the following environment variables:
export ARM_CLIENT_ID="" # application (client) ID
export ARM_CLIENT_SECRET="" # client's secret
export ARM_SUBSCRIPTION_ID="" # value available at the `Subscription` section under the `Overview` tab
export ARM_TENANT_ID="" # field available under `Azure Active Directories` > `Properties` > `Tenant ID`
NOTE 1: Having trouble finding your Subscription ID? Azure's official docs might help.
NOTE 2: Tenant ID values can also be found using PowerShell and CLI.
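Before moving on, it can help to confirm that none of the four Azure variables was left empty. The helper below is a hypothetical convenience function (require_vars is not part of QHub), written as a sketch in POSIX shell; the same function could be reused for the variables of any other provider.

```shell
# Report every variable named in the argument list that is unset or empty.
# Returns 0 only when all of them are set.
require_vars() {
    missing=0
    for name in "$@"; do
        # Indirect lookup via eval keeps this POSIX sh compatible.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

require_vars ARM_CLIENT_ID ARM_CLIENT_SECRET ARM_SUBSCRIPTION_ID ARM_TENANT_ID \
    || echo "Set the variables listed above before deploying."
```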
User identity in QHub is now managed within Keycloak, which is a robust and highly flexible open source identity and access management solution. A Keycloak instance will be deployed inside your QHub. It can be configured to work with many OAuth 2.0 identity providers, it can federate users from existing databases (such as LDAP), or it can be used as a simple database of username/passwords.
The full extent of possible configuration can't be covered here, so we provide three simple options that can be configured automatically by QHub when it sets up your new platform. These options are basic passwords, GitHub single sign-on, or Auth0 single sign-on (which in turn can be configured to allow identity to be provided by social login etc).
You will actually instruct qhub init which method you have chosen when you move on to the Usage section, but at this stage you may need to set environment variables corresponding to your choice:
Click for Auth0 configuration details
Auth0 is a great choice to enable flexible authentication via multiple providers. To create the necessary access tokens you will need to have an Auth0 account and be logged in. Directions for creating an Auth0 application.
- Click on the Applications button on the left
- Select Create Application > Machine to Machine Applications > Auth0 Management API from the dropdown menu
- Next, click All next to Select all and click Authorize
- Set the variable AUTH0_CLIENT_ID equal to the Client ID string, and do the same for the Client secret by running the command below.
With the application created, set the following environment variables:
- AUTH0_CLIENT_ID: client ID of Auth0 machine-to-machine application found at top of the newly created application page
- AUTH0_CLIENT_SECRET: secret ID of Auth0 machine-to-machine application found in the Settings tab of the newly created application
- AUTH0_DOMAIN: The Tenant Name which can be found in the general account settings on the left hand side of the page appended with .auth0.com, for example:
export AUTH0_DOMAIN="qhub-test.auth0.com" # in case the Tenant Name was called 'qhub-test'
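Putting the three Auth0 variables from the list above together, a complete set of exports might look like the following sketch. The client ID and secret are placeholders, and tenant_name is a hypothetical helper variable used only to show how the domain is built from the Tenant Name.

```shell
# Hypothetical helper: the Tenant Name from the Auth0 account settings.
tenant_name="qhub-test"

# Placeholder values; substitute the ones from your Auth0 application.
export AUTH0_CLIENT_ID="your-client-id"
export AUTH0_CLIENT_SECRET="your-client-secret"

# The domain is the Tenant Name with ".auth0.com" appended.
export AUTH0_DOMAIN="${tenant_name}.auth0.com"

echo "$AUTH0_DOMAIN"   # prints: qhub-test.auth0.com
```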
Click for GitHub SSO configuration details
To use GitHub as a single sign-on provider, you will need to create a new OAuth 2.0 app.
No environment variables are needed for this - you will be given the relevant information and prompted for various inputs during the next stage, when you run qhub init if you provide the flag --auth-provider github. This will be covered when you reach that point in this documentation.
If you would like to use a method for authentication that is not automatically covered in QHub, such as Facebook, Microsoft AAD, or a custom provider, you can do so by defining the expected identity provider in the Keycloak admin panel. An identity provider derives from a specific protocol used to authenticate and send authentication and authorization information to users. It can be:
- A social provider such as Facebook, Google, or Twitter.
- A business partner whose users need to access your services.
- A cloud-based identity service you want to integrate.
Typically, Keycloak bases identity providers on the following protocols:
- SAML v2.0
- OpenID Connect v1.0
- OAuth v2.0
If your authentication provider uses one of the protocols mentioned above, you can follow these steps to create a new identity provider or use an existing configuration for a social identity provider.
In the Usage section, you will need to run qhub init (this only ever needs to be run once - it creates your configuration YAML file) and then qhub deploy to set up the cloud infrastructure and deploy QHub for the first time.
For subsequent deployments, it's possible to run qhub deploy again in exactly the same way, providing the configuration YAML file as you would the first time. However, it's also possible to automate future deployments using 'DevOps' - the configuration YAML file stored in git will trigger automatic redeployment whenever it's edited.
This DevOps approach can be provided by GitHub Actions or GitLab Workflows. As for the other choices, you will only need to specify the CI/CD provider when you come to run qhub init, but you may need to set relevant environment variables unless you choose 'none' because you plan to always redeploy manually.
Click for GitHub Actions configuration details
QHub uses GitHub Actions to enable Infrastructure as Code and trigger the CI/CD checks on the configuration file that automatically generates the deployment modules for the infrastructure. To do that, it will be necessary to set the GitHub username and token as environment variables. First, create a GitHub personal access token via these instructions. The token needs permissions to create a repo and create secrets on the repo. At the moment, the required permissions are not narrowly scoped, so to be on the safe side, enable all permissions.
- GITHUB_USERNAME: GitHub username
- GITHUB_TOKEN: GitHub-generated token
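Following the same pattern as the cloud provider sections, the two GitHub variables can be exported as in the sketch below. Both values are placeholders to be replaced with your own username and the personal access token you generated.

```shell
# Placeholder values; substitute your GitHub username and personal access token.
export GITHUB_USERNAME="your-github-username"
export GITHUB_TOKEN="your-personal-access-token"
```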
Click for GitLab Workflow configuration details
If you want to use GitLab CI to automatically deploy changes to your configuration, then no extra environment variables are needed for this.
All git repo and CI setup on GitLab will need to be done manually. At the next stage, when you run qhub init, please provide the flag --ci-provider gitlab-ci.
After the initial deployment, the documentation should tell you when to commit your configuration files into your GitLab repo. There should be your qhub-config.yaml file as well as a generated file called .gitlab-ci.yml. You will need to manually set environment variables for your cloud provider as secrets in your GitLab CI for the repo.
Finally, you will need to have a domain name for hosting QHub. This domain will be where your application will be exposed.
Currently, QHub only supports Cloudflare for automatic DNS registration. If an alternate DNS provider is desired, change the --dns-provider flag from cloudflare to none on the qhub deploy command. The deployment will then pause when it asks for an IP address (or CNAME, if using AWS) and prompt you to register the desired URL. Setting a DNS record depends heavily on the provider, so it's not possible to give detailed instructions for every provider. Searching for "setting <A/CNAME> record on <provider name>" should yield good results for your specific provider.
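The choice between an A record and a CNAME record described above can be captured in a small helper. This is an illustrative sketch only: the function names and the provider label "aws" are assumptions, and check_dns_record relies on the dig utility and network access, so it is defined here but left for you to run once the record exists.

```shell
# Pick the record type from the cloud provider: an AWS deployment
# asks for a CNAME, the other providers ask for an IP address (A record).
dns_record_type() {
    case "$1" in
        aws) echo "CNAME" ;;
        *)   echo "A" ;;
    esac
}

# Look up a record after it has been created.
# Usage: check_dns_record <domain> <record type>
check_dns_record() {
    dig +short "$1" "$2"
}

echo "Record type for an AWS deployment: $(dns_record_type aws)"
```

For example, after registering the record you could run check_dns_record with your own domain and the type returned by dns_record_type to see whether the record resolves yet.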
Click for Cloudflare configuration details
QHub supports Cloudflare as a DNS provider. If you choose to use Cloudflare, first create an account. You then have two options:
- You can register your application domain name on it, using the Cloudflare nameserver (recommended).
- You can outright buy a new domain with Cloudflare (this action isn't particularly recommended).
To generate a token follow these steps:
- Under Profile, select the API Tokens menu and click on Create API Token.
- On Edit zone DNS click on Use Template.
- Configure Permissions such as the image below:
- On Account Resources set the configuration to include your desired account
- On Zone Resources set it to Include | Specific zone and your domain name
- Click continue to summary
- Click on the Create Token button and set the token generated as an environment variable on your machine.
Finally, set the environment variable as follows:
export CLOUDFLARE_TOKEN="cloudflaretokenvalue"
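Once the token is exported, it can be checked against Cloudflare's token verification endpoint before starting a deployment. The function name below is hypothetical and not part of QHub; the sketch assumes curl is installed and that the machine has network access.

```shell
# Ask the Cloudflare API whether the token in CLOUDFLARE_TOKEN is valid.
# Prints the JSON response; returns non-zero if the token is not set
# or the request itself fails.
verify_cloudflare_token() {
    if [ -z "${CLOUDFLARE_TOKEN:-}" ]; then
        echo "CLOUDFLARE_TOKEN is not set" >&2
        return 1
    fi
    curl -sS -X GET "https://api.cloudflare.com/client/v4/user/tokens/verify" \
        -H "Authorization: Bearer ${CLOUDFLARE_TOKEN}" \
        -H "Content-Type: application/json"
}
```

Running verify_cloudflare_token in the same shell session should return a JSON document describing the token status.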
You are now done with the hardest part of the deployment.
In the next section, you will create the main configuration YAML file and then deploy the QHub infrastructure.