This is a simple boilerplate that allows you to quickly set up a Magda instance - the idea is that you can fork this config, commit changes but keep merging in master in order to stay up to date. If you are new to Magda, you might also be interested in our tutorial repo.
We have upgraded our Terraform module to work with Magda v0.0.57 or later (using the Helm 3 Terraform provider).
If you need the Terraform module to deploy an older version of Magda (v0.0.56-RC or earlier), please check out branch v0.0.56-RC6 and use the Terraform module there.
With this repo we're trying to make it as easy to get started with Magda as possible... but we're not there yet. Setting up Magda in a configuration similar to data.gov.au (i.e. an openly-available, pure-open-data search engine) is fairly simple, but using other features (e.g. Add Dataset, the Admin UI) will almost certainly result in getting stuck in some way that requires Kubernetes skills to get out of.
This doesn't mean you shouldn't try, and we're happy to answer any questions you have on our GitHub Discussions forum. Just be aware that, at best, this repo works a bit like a Linux installer - it can get you started easily, but if you want to mess around you'll still have to learn how it works.
How you get started with Magda will depend on where you're starting from:
- I have nothing already set up, and I'm happy to run everything on Google Cloud through Terraform: Please use the instructions below.
- I already have a kubernetes cluster, or want to use a local environment/cloud environment other than Google Cloud, or I just don't like Terraform: Please have a look at our tutorial repo.
NOTE: Since version v0.0.57, Magda requires Helm v3 to deploy. The Terraform helm provider has been upgraded to version 1.1.1 to support Helm v3. If you previously deployed an older version of Magda (e.g. v0.0.56-RC6), please refer to this migration document to upgrade your release before using Terraform to upgrade it to a newer version.
For new users setting up Magda for the first time, we recommend using these instructions. They use Terraform to set you up with a basic instance running on Google Cloud very quickly (about 5 minutes of entering commands and editing config, then 20 minutes of waiting). Another 30-60 minutes of waiting will get you HTTPS working on your own domain.
git clone --single-branch --branch master https://github.com/magda-io/magda-config.git
or download it with the "Clone or download" button in Github.
Go to https://learn.hashicorp.com/terraform/getting-started/install.html for instructions
Go to https://helm.sh/docs/intro/install/ for instructions
Version 3.2.0 or higher is required.
You can test your install by running:
helm version
This should tell you the version of Helm that is installed.
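If you would rather have the shell compare the installed version against the 3.2.0 minimum for you, something like the following may help. It is only a sketch: the `version_ge` helper is a name made up for this example, and it assumes your `sort` supports `-V` (version sort) and that `helm version --short` prints a string such as `v3.2.0+ge11b7ce`.

```shell
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Hypothetical helper for this example; relies on `sort -V` (version sort).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Only query helm when it is actually on the PATH.
if command -v helm >/dev/null 2>&1; then
  # Strip any leading non-digits (e.g. "v") and the "+build" suffix.
  installed="$(helm version --short | head -n 1 | sed -E 's/^[^0-9]*//; s/[+-].*$//')"
  if version_ge "$installed" "3.2.0"; then
    echo "helm $installed is new enough"
  else
    echo "helm $installed is older than 3.2.0, please upgrade" >&2
  fi
fi
```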
4. Install Google Cloud SDK
Go to https://cloud.google.com/sdk/docs/downloads-interactive for instructions.
Once Google Cloud SDK is installed, you also need to install the gcloud beta components with the following command:
gcloud components install beta
Before you start the deployment process, you need to create a Google Cloud project via Google Cloud Console and note down the Project Id. Note that this isn't necessarily exactly the same as the id you specified - if it's already been taken, Google will append some numbers to it. Make sure by checking the "Select a Project" dialog in Google Cloud.
Set the project id you noted down to an environment variable, because you'll need it in a few places - this will work in bash. If you're using another shell, use the equivalent command or just manually replace $PROJECT_ID with your project id.
export PROJECT_ID=[your-project-id]
Then set it as the default in Google Cloud
gcloud config set project $PROJECT_ID
gcloud services enable compute.googleapis.com
gcloud services enable container.googleapis.com
gcloud iam service-accounts create magda-robot
Feel free to use a name other than magda-robot if you like.
You need to find out the service account email of your newly created service account to be used as the identifier in other commands.
To do so, first list all service accounts:
gcloud iam service-accounts list
Find the row of your service account. The service account email should be something similar to magda-robot@[your-project-id].iam.gserviceaccount.com. You'll need this a few times, so it's worth saving it to an environment variable - once again, if you're not using a shell that supports this you can just manually replace $SERVICE_ACCOUNT_EMAIL with the email address itself.
export SERVICE_ACCOUNT_EMAIL=[your-service-account-email]
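As an alternative to copying the email by hand, you could let gcloud look it up. The sketch below assumes the account is named magda-robot and that `PROJECT_ID` is already exported; `service_account_email` is a hypothetical helper that only reproduces the usual email pattern described above, so treat its output as a guess and prefer the value gcloud reports.

```shell
# Hypothetical helper: build the usual email pattern from an account name and a project id.
service_account_email() {
  printf '%s@%s.iam.gserviceaccount.com\n' "$1" "$2"
}

# Only talk to Google Cloud when gcloud is installed and PROJECT_ID is set.
if command -v gcloud >/dev/null 2>&1 && [ -n "${PROJECT_ID:-}" ]; then
  echo "Expected something like: $(service_account_email magda-robot "$PROJECT_ID")"
  # Ask gcloud for the email of the account whose address starts with "magda-robot@".
  SERVICE_ACCOUNT_EMAIL="$(gcloud iam service-accounts list \
    --filter="email ~ ^magda-robot@" \
    --format="value(email)")"
  export SERVICE_ACCOUNT_EMAIL
  echo "Using: $SERVICE_ACCOUNT_EMAIL"
fi
```

If you chose a different account name, change the `magda-robot` prefix in the filter to match.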
First go to the terraform/magda directory inside your cloned version of this repository.
cd magda-config/terraform/magda
gcloud iam service-accounts keys create key.json --iam-account=$SERVICE_ACCOUNT_EMAIL
You will now have a key.json file in terraform/magda, containing a private key. We suggest you put this somewhere safe like a password manager.
DO NOT CHECK IT INTO SOURCE CONTROL.
Grant the editor role to your service account:
gcloud projects add-iam-policy-binding $PROJECT_ID --member serviceAccount:$SERVICE_ACCOUNT_EMAIL --role roles/editor
Grant the k8s admin role to your service account:
gcloud projects add-iam-policy-binding $PROJECT_ID --member serviceAccount:$SERVICE_ACCOUNT_EMAIL --role roles/container.admin
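To confirm that both role bindings took effect, you can list the roles bound to the service account and check for the two granted above. This is a sketch rather than an official verification step: `has_required_roles` is a hypothetical helper, and the gcloud query assumes `PROJECT_ID` and `SERVICE_ACCOUNT_EMAIL` are exported as described earlier.

```shell
# Hypothetical helper: reads role names (one per line) on stdin and succeeds
# only if both roles granted above are present.
has_required_roles() {
  local roles required
  roles="$(cat)"
  for required in roles/editor roles/container.admin; do
    printf '%s\n' "$roles" | grep -qxF "$required" || return 1
  done
}

if command -v gcloud >/dev/null 2>&1 \
  && [ -n "${PROJECT_ID:-}" ] && [ -n "${SERVICE_ACCOUNT_EMAIL:-}" ]; then
  # Flatten the policy so each member/role pair is one row, then keep this account's roles.
  if gcloud projects get-iam-policy "$PROJECT_ID" \
      --flatten="bindings[].members" \
      --filter="bindings.members:serviceAccount:$SERVICE_ACCOUNT_EMAIL" \
      --format="value(bindings.role)" | has_required_roles; then
    echo "both roles are bound to $SERVICE_ACCOUNT_EMAIL"
  else
    echo "at least one role is missing for $SERVICE_ACCOUNT_EMAIL" >&2
  fi
fi
```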
To do so, run:
terraform init
After a bit of waiting you should get this message:
Terraform has been successfully initialized!
You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.
If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
Edit terraform/magda/terraform.tfvars and supply the following parameters:
- Project id: the id of the Google Cloud project that you created (echo $PROJECT_ID)
- Deploy Region: which region you want to deploy Magda to
- credential_file_path: the path of the service account key file (key.json) that we just generated
- namespace: which Kubernetes namespace you want to deploy Magda to (generally this should just be "default")
- external_domain: Optional: what domain you want the Magda server to be accessed from (which requires a bit of extra configuration). Leave blank to just access your instance through a temporary domain. You can set this later if necessary.
Other optional settings and their default values (if not set) are:
- cluster_node_pool_machine_type: The machine type to use, see https://cloud.google.com/compute/vm-instance-pricing for more details. Default: n1-standard-4
- kubernetes_dashboard: Whether to turn on the Kubernetes dashboard. Default: false
You can find the full list of configurable options here.
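Before applying, a quick sanity check of terraform.tfvars can catch a required value that was left empty. The sketch below only looks at credential_file_path and namespace, because those are the required variable names spelled out above; the project id and region settings are left out since their exact variable names are not shown here. `missing_tfvars` is a hypothetical helper, and it assumes values are written as double-quoted strings.

```shell
# Hypothetical helper: prints each checked variable that has no non-empty
# quoted value in the given tfvars file; returns non-zero if any are missing.
missing_tfvars() {
  local file="$1" key missing=0
  for key in credential_file_path namespace; do
    # Matches lines such as: namespace = "default"
    if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*\"[^\"]+\"" "$file"; then
      echo "$key"
      missing=1
    fi
  done
  return "$missing"
}

# Run from the terraform/magda directory.
if [ -f terraform.tfvars ]; then
  missing_tfvars terraform.tfvars || echo "set the variables listed above before applying" >&2
fi
```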
Look at values.yaml. It has reasonable defaults but you might want to edit something - it will give you a new instance with a standard colour scheme/logos and no datasets (yet).
terraform apply -auto-approve
This will take quite a while (like 20 minutes), but it should update you about its progress. Take this opportunity to make a cup of tea or stretch!
Once the deployment is complete, you should get a bunch of output including something like this:
Apply complete! Resources: 12 added, 0 changed, 0 destroyed.
Outputs:
external_access_url = http://34.98.120.7.xip.io/
external_ip = 34.98.120.7
You should be able to go to http://[external_ip] right away and see your Magda homepage come up. If you didn't specify external_domain, then the external_access_url will also work, otherwise see below:
If you specified external_domain, you need to create a DNS A record in your DNS registrar's system. The A record needs to point to the external_ip that was generated when deploying Magda.
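Once the A record is created, you can check from your own machine whether it has propagated. The following is a minimal sketch assuming `dig` is installed; `EXTERNAL_DOMAIN` and `EXTERNAL_IP` are hypothetical environment variables that you set yourself from your domain and the external_ip output, and `resolves_to` is a helper invented for this example.

```shell
# Hypothetical helper: succeeds when the expected IP (first argument) appears
# among the resolved addresses read from stdin, one per line.
resolves_to() {
  grep -qxF "$1"
}

# EXTERNAL_DOMAIN and EXTERNAL_IP are assumed to be set by you, for example:
#   export EXTERNAL_DOMAIN=magda.example.com EXTERNAL_IP=34.98.120.7
if command -v dig >/dev/null 2>&1 \
  && [ -n "${EXTERNAL_DOMAIN:-}" ] && [ -n "${EXTERNAL_IP:-}" ]; then
  if dig +short A "$EXTERNAL_DOMAIN" | resolves_to "$EXTERNAL_IP"; then
    echo "$EXTERNAL_DOMAIN points at $EXTERNAL_IP"
  else
    echo "$EXTERNAL_DOMAIN does not resolve to $EXTERNAL_IP yet" >&2
  fi
fi
```

DNS changes can take a while to propagate, so a negative result shortly after creating the record is not necessarily a problem.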
As long as you specified external_domain in the config file terraform/magda/terraform.tfvars and you've set an A record from that domain to the value that came back from external_ip, the SSL certificate will be automatically generated and set up for you. The process is going to take 30 to 60 minutes as specified by Google:
With a correct configuration the total time for provisioning certificates is likely to take from 30 to 60 minutes.
If you didn't supply a value for the external_domain config field during your initial deployment, you can edit the config file and update your deployment by re-running:
terraform apply -auto-approve
Start playing around!
- If you want to get some datasets into your system, set the connectors tag to true in values.yaml and re-run terraform apply -auto-approve. A connector job will be created and start pulling datasets from data.gov.au... or you can modify connectors: in values.yaml to pull in datasets from somewhere else.
- In the Google Cloud console, go to Kubernetes Engine / Clusters and click the "Connect" button, then use the kubectl command (should be installed along with the Google Cloud command line) to look at your new Magda cluster. Use kubectl get pods to see all of the running containers and kubectl logs -f <container name> to tail the logs of one. You can also use kubectl port-forward combined-db-0 5432 to open a tunnel to the database, and use psql, PgAdmin or equivalent to investigate the database - you can find the password in terraform.tfstate.
- Sign up for an API key on Facebook or Google, and put your client secret in terraform.tfvars and your client id in values.yaml to enable signing in via OAuth.
- Configure an SMTP server in terraform.tfvars and values.yaml and switch the correspondence flag to true in order to be able to send emails from the app.
- Set scssVars in values.yaml to change the colours
- Ask us questions on https://github.com/magda-io/magda/discussions
- Send us an email at [email protected] to tell us about your new Magda server.
You might also be interested in our tutorial repo which will not only help you to get familiar with more advanced configuration but also give you a quick registry API tour.
This is harder than it should be at this point.
1. Use kubectl port-forward combined-db-0 5432 -n <your-namespace> to get a connection to the database
2. Get your db password out of the db-passwords secret - in bash you can use
kubectl get secrets db-passwords -o yaml -n <your namespace> | grep authorization-db: | awk '{print $2}' | base64 -D
(base64 -D is the macOS flag; on Linux use base64 -d), or you can just use kubectl get secrets db-passwords -o yaml -n <your namespace> to get the secret, then base64 decode it to get the password.
3. Use acs-cmd to set / unset a user as an admin
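The password lookup above can also be written without grep and awk, in a way that should work on both macOS and Linux. This is a sketch under a few assumptions: `decode_base64` is a hypothetical helper that tries the decode flags different base64 implementations accept, and `MAGDA_NAMESPACE` is a hypothetical variable holding the namespace you deployed to.

```shell
# Hypothetical helper: decode base64 from stdin, trying the flag spellings
# used by different base64 implementations (--decode, -d, then -D).
decode_base64() {
  local input
  input="$(cat)"
  printf '%s' "$input" | base64 --decode 2>/dev/null \
    || printf '%s' "$input" | base64 -d 2>/dev/null \
    || printf '%s' "$input" | base64 -D
}

# MAGDA_NAMESPACE is assumed to be set by you, e.g. export MAGDA_NAMESPACE=default
if command -v kubectl >/dev/null 2>&1 && [ -n "${MAGDA_NAMESPACE:-}" ]; then
  # Pull just the authorization-db field out of the secret, then decode it.
  kubectl get secret db-passwords -n "$MAGDA_NAMESPACE" \
    -o jsonpath='{.data.authorization-db}' | decode_base64
  echo
fi
```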
After logging in as an admin user, you will see the Admin button on your account details page.
Please refer to the How to create API key doc for more information on accessing APIs with an API key.
After logging in as an admin user, you will see a button for creating a new dataset on the Home Page.
- If something goes wrong, often you can fix it by just running terraform apply again.
- If that fails, and you've got up to the helm release stage, you can try deleting the helm release by running:
terraform taint helm_release.magda_helm_release
terraform taint kubernetes_secret.auth_secrets
terraform taint kubernetes_secret.db_passwords
terraform taint kubernetes_namespace.magda_namespace
terraform taint kubernetes_namespace.magda_openfaas_namespace
terraform taint kubernetes_namespace.magda_openfaas_fn_namespace
And then run terraform apply again. Note that this will probably destroy any data you've entered so far.
- If that fails, you can start the entire process from scratch by running terraform destroy and re-running terraform apply. This will definitely destroy any data you've entered so far.
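If you find yourself repeating the taint step, the six commands listed above can be wrapped in a small function. The sketch below is only a convenience: `taint_magda_release` and `MAGDA_TAINT_TARGETS` are names invented for this example, and the resource addresses are copied from the commands above. By default it only prints the commands so you can review them first.

```shell
# Resource addresses copied from the taint commands listed above.
MAGDA_TAINT_TARGETS="helm_release.magda_helm_release
kubernetes_secret.auth_secrets
kubernetes_secret.db_passwords
kubernetes_namespace.magda_namespace
kubernetes_namespace.magda_openfaas_namespace
kubernetes_namespace.magda_openfaas_fn_namespace"

# Hypothetical helper: prints the taint commands by default;
# pass "run" as the first argument to execute them instead.
taint_magda_release() {
  local mode="${1:-print}" target
  for target in $MAGDA_TAINT_TARGETS; do
    if [ "$mode" = "run" ]; then
      # Keep going if one resource cannot be tainted (e.g. it is not in the state yet).
      terraform taint "$target" || echo "could not taint $target" >&2
    else
      echo "terraform taint $target"
    fi
  done
}
```

Run `taint_magda_release` to preview, then `taint_magda_release run` followed by terraform apply. The same warning applies: this will probably destroy any data you've entered so far.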