Standalone metadata server to simplify the use of vendor cloud images with a standalone kvm/libvirt server
- supports a subset of the EC2 metadata "standard" (as documented by Amazon), compatible with the EC2 data source as implemented by cloud-init
- allows the user to configure cloud-init via user-data
- cloud-init config can be templated, making use of data from the system configuration as well as information about the host itself
- provides IP address and DNS management via dnsmasq, including generating host names from instance names and supporting DNS resolution of the generated names
- integrates with libvirt to automatically update records as new instances come online
See the sample config file for the full set of configuration options.
mdserver as of 0.6.0 only supports Python 3 - if you need to run with Python 2, use version 0.5.1 or earlier.
Package dependencies:
- bottle (>= 0.12.0)
- xmltodict (>= 0.9.0)
Start by creating a libvirt network using the sample network XML
file in the distribution at doc/mds-network.xml:
# virsh net-define --file doc/mds-network.xml
This creates a NATed network using bridge br-mds with address
10.122.0.0/16, and the EC2 "magic" IP address 169.254.169.254. Any
instance that will be managed by mdserver needs to have its first
<interface> defined something like this:
<interface type='network'>
<source network='mds'/>
<model type='virtio'/>
</interface>
The MAC address assigned to this interface is the one that mdserver adds to its database, and uses to create the DHCP configuration used by dnsmasq.
To install requirements using pip run the following:
# pip3 install -r requirements.txt
Since mdserver is a system package it's not usefully installable using the typical Python packaging tools - the core application can be installed, but the additional system integration cannot. To work around this a simple script is included to install these additional components in default locations.
To install the core application from a source distribution:
# python3 setup.py install
Once that has been done, run the system integration script:
# ./system-integration.sh
This will install the main configuration file in
/etc/mdserver/mdserver.conf, the systemd unit files in
/etc/systemd/system/, and the libvirt hook script in
/etc/libvirt/hooks/qemu.
The default mdserver.conf file will need to be modified before the
system can do anything very useful - the file is well documented
and lists the default values for everything, so it should be easy
to adjust to your needs. You will also need to add your ssh public
keys to the config before you can ssh into instances configured via
mdserver.
User data files are sourced by default from /etc/mdserver/userdata.
mdserver assumes that it's running in a systemd context, though it doesn't strictly depend on systemd: what it uses is systemd's support for defining relationships between units, in order to manage dnsmasq. In a non-systemd context this can be emulated within a traditional init script, but this is not an explicitly supported use case.
The supplied systemd unit files should work most of the time, but
will require editing if the location of the config file is changed,
or if the base dir is changed from the default /var/lib/mdserver.
Once setup is complete, the system can be started in the expected way:
# systemctl start mdserver
This will bring up both mdserver and dnsmasq.
The server can also be run manually:
/usr/local/bin/mdserver /etc/mdserver/mdserver.conf
In this case you will need to also start dnsmasq:
/usr/sbin/dnsmasq --conf-file=/var/lib/mdserver/dnsmasq/mds.conf --keep-in-foreground
In all cases you will need to ensure that the libvirt hook script
is installed in the appropriate location - typically this is
/etc/libvirt/hooks/qemu - and it will need to be made executable.
The libvirt hook is hard-coded with the listening address of the mdserver process, since it needs to communicate with the mdserver process - however you configure the mdserver to listen, it needs to match the address in the default file.
Finally, by default logs go to /var/log/mdserver.log.
Cloud-init tries to automatically detect the correct datasource to use, based on the environment it's running in; if it can't recognise its environment then cloud-init will not run. Current cloud-init has no way to detect that mdserver is in use, so to get cloud-init to run we need to configure the environment appropriately.
This can be done in three ways: configure cloud-init to use the Ec2 datasource unconditionally; make your instance look like an AWS instance so cloud-init decides to use the Ec2 datasource itself; or customise the image to force cloud-init to run its network datasource detection, which will detect the Ec2 datasource on the "magic" IP address.
Note: by far the best approach is to explicitly configure cloud-init; the other methods described below should be considered fallback options, particularly for older cloud-init versions.
Cloud-init can be configured with a list of datasources to test for; if that list contains exactly one entry, or one valid entry and None, then cloud-init will use the specified datasource unconditionally.
In our case, we need to specify the Ec2 datasource, which can be done
by adding the following to /etc/cloud/cloud.cfg.d/98_mdserver_ds.cfg
in the image:
# Override datasource detection, use Ec2 unconditionally
datasource_list: [Ec2, None]
It may also be necessary to force cloud-init to configure the mds
interface using DHCP4 - this can be done by adding some networking
configuration. The exact details will depend on the Linux
distribution; the following should work with Ubuntu 20.04 or later
using netplan, where the mds interface is ens2:
network:
  version: 2
  ethernets:
    ens2:
      dhcp4: true
Cloud-init determines that it's running on an AWS instance by looking at the BIOS serial number and uuid values: they must be the same string, and the string must start with 'ec2'. This can be achieved by adding something like the following snippet to your domain XML file:
<os>
...other os data...
<smbios mode='sysinfo'/>
</os>
<sysinfo type='smbios'>
<system>
<entry name='manufacturer'>Plain Old Virtual Machine</entry>
<entry name='product'>Plain old VM</entry>
<entry name='serial'>ec242E85-6EAB-43A9-8B73-AE498ED416A8</entry>
<entry name='uuid'>ec242E85-6EAB-43A9-8B73-AE498ED416A8</entry>
</system>
</sysinfo>
The uuid must be valid, so the easiest way to create this string is to generate a fresh uuid and replace the first three characters with 'ec2'.
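The string can be produced with a few lines of Python. This is a minimal sketch rather than anything shipped with mdserver: the function names make_ec2_style_uuid and looks_like_ec2 are hypothetical, and the check only mirrors the rule as described above, not cloud-init's actual detection code.

```python
import uuid


def make_ec2_style_uuid() -> str:
    """Return a fresh UUID string whose first three characters are 'ec2'."""
    fresh = str(uuid.uuid4())
    # 'e', 'c' and '2' are all hexadecimal digits and the first hyphen sits at
    # index 8, so replacing the first three characters keeps the UUID well formed.
    return "ec2" + fresh[3:]


def looks_like_ec2(serial: str, system_uuid: str) -> bool:
    """Hypothetical check mirroring the rule above: same string, starts with 'ec2'."""
    return serial == system_uuid and serial.startswith("ec2")


value = make_ec2_style_uuid()
print(value)
```

The printed value can be pasted into both the serial and uuid entries of the sysinfo block.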
Older versions of cloud-init (prior to 22.0) can be forced to run
their network datasource detection by editing the systemd
configuration in the instance. This can be achieved by adding the
/etc/systemd/network/default.network file to the image with the
following contents (again, the mds network interface is ens2):
[Match]
Type=en*
Name=ens2
[Network]
DHCP=yes
A symlink pointing at /lib/systemd/system/cloud-init.target should
then be added to /etc/systemd/system/multi-user.target.wants/ so that
systemd will start cloud-init.
mdserver maintains a persistent database, typically stored in
/var/lib/mdserver/, from which it gets the information that it
needs to respond to requests. A clean install of mdserver will have
an empty database, which must be initialised before mdserver can
respond usefully to anything.
Initialising the database is done by uploading the full domain XML for each instance that wants to use it. The domain XML for an instance can be acquired using the following command:
virsh dumpxml instance1 > instance1.xml
The resulting XML file can be uploaded to the mdserver using a simple curl command (from the local host - access is denied from any other IP address):
curl -s -d @instance1.xml http://169.254.169.254/instance-upload
The mdserver will parse the XML file, extract the information it needs, allocate an IP address, and then store that information in its database. It will then update the dnsmasq DHCP and DNS files so that when the instance comes up and attempts to get on the network it will receive a known IP address from dnsmasq, and its host name will resolve to that IP address in a DNS lookup.
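To give a feel for the kind of information involved, here is a rough sketch that pulls the instance name and the MAC address of the first interface out of a domain XML document. It is an illustration only: mdserver itself depends on xmltodict, whereas this sketch uses the standard library's ElementTree, and both the sample XML and the function name extract_instance_info are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed domain XML of the kind `virsh dumpxml` emits.
SAMPLE_DOMAIN_XML = """
<domain type='kvm'>
  <name>instance1</name>
  <devices>
    <interface type='network'>
      <mac address='52:54:00:12:34:56'/>
      <source network='mds'/>
      <model type='virtio'/>
    </interface>
    <interface type='network'>
      <mac address='52:54:00:65:43:21'/>
      <source network='default'/>
    </interface>
  </devices>
</domain>
"""


def extract_instance_info(domain_xml: str) -> dict:
    """Return the instance name and the MAC address of its first interface."""
    root = ET.fromstring(domain_xml.strip())
    name = root.findtext("name")
    # find() returns the first match in document order, i.e. the first interface.
    first_iface = root.find("./devices/interface")
    mac_elem = first_iface.find("mac") if first_iface is not None else None
    mac = mac_elem.get("address") if mac_elem is not None else None
    return {"name": name, "mac": mac}


info = extract_instance_info(SAMPLE_DOMAIN_XML)
print(info)
```

Only the first interface is inspected, which matches the requirement that the mds network be attached to the instance's first interface.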
Thanks to the libvirt hook script any new instances will be uploaded at start up, so this is a one time task (though this process can be used to update the database if so desired).
When cloud-init runs on boot it will attempt to contact an EC2 metadata server on the "magic" IP 169.254.169.254:80. mdserver listens on this address for requests and generates a response based on information from its database, using the source IP address of the request to locate the host data in the database.
Most of the requests are quite simple, responding with a single line generated from the database. However, the user-data request is far more involved.
When mdserver receives a user-data request it starts by resolving
the instance in the database, and then searches for a file in the
userdata directory (typically /etc/mdserver/userdata/) using the
following filenames:
<userdata_dir>/<instance>
<userdata_dir>/<instance>.yaml
<userdata_dir>/<MAC>
<userdata_dir>/<MAC>.yaml
A default template userdata file can also be specified in the
configuration which will be used as a fallback if nothing
more specific is found - this is typically something like
<userdata_dir>/base.yaml. If the default template path is not set
then a minimal hard-coded template will be used instead.
Once the template to use is determined it is processed using Bottle's Simple Template library, with details about the instance made available to the template processor along with the following values from the mdserver configuration:
- all public keys, in the form public_key_<entry name>, i.e. an entry in the [public-keys] section named default will be available in the userdata template as a value named public_key_default
- all public keys in a hash keyed by the key name, under the name public_keys - i.e. the default key would be public_keys['default']
- a default password (mdserver_password) - only if set by the user!
- the host name (hostname)
Additional key-value data to be made available to the template can be
specified in the [template-data] section of the config file, e.g.:
[template-data]
foo=bar
would result in bar being added to the template data under the key
foo.
In addition, the "magic" key _config_items_ may be used to specify
a list of broader config items to be made available to the template -
this is a comma separated list of section.key values, each of which
will have the value copied to a top level key entry in the data
presented to the template, e.g.:
[template-data]
_config_items_ = 'dnsmasq.prefix,dnsmasq.domain'
would result in the prefix and domain settings from the dnsmasq
section being visible in the templates' namespace as prefix and
domain.
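The effect of the [template-data] section can be sketched with the standard library's configparser. This is an approximation for illustration only: build_template_data is a hypothetical name, the prefix and domain values in the sample config are invented, and stripping the surrounding quotes from _config_items_ is an assumption about how the value is handled.

```python
import configparser

# Hypothetical config fragment; the dnsmasq values are made up for the example.
SAMPLE_CONFIG = """
[dnsmasq]
prefix = ip
domain = example.com

[template-data]
foo = bar
_config_items_ = 'dnsmasq.prefix,dnsmasq.domain'
"""


def build_template_data(config: configparser.ConfigParser) -> dict:
    """Collect the extra key-value data described by [template-data]."""
    data = {}
    if not config.has_section("template-data"):
        return data
    # Plain entries are passed through under their own key.
    for key, value in config.items("template-data"):
        if key != "_config_items_":
            data[key] = value
    # Each section.key entry is copied to a top level `key` entry.
    raw = config.get("template-data", "_config_items_", fallback="")
    for item in raw.strip("'\"").split(","):
        item = item.strip()
        if not item:
            continue
        section, _, key = item.partition(".")
        data[key] = config.get(section, key)
    return data


config = configparser.ConfigParser()
config.read_string(SAMPLE_CONFIG)
template_data = build_template_data(config)
print(template_data)
```

With the sample config, the template would see foo, prefix and domain as top level names.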
Any values visible to the template can be interpolated into the file
using the {{<key>}} syntax. More sophisticated template behaviour
can be used, including embedding arbitrary python code - see the
Bottle templating engine documentation for more details.
The output of the template processing is then returned to the client.
By default mdserver will generate a DNS hosts file that dnsmasq will read and track updates to over time. This means that if you attempt to resolve the name of an instance through this dnsmasq instance you will get the correct IP address, and vice versa for reverse (PTR) look-ups.
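For illustration, a hosts file in the usual "address name" format can be rendered as in the sketch below. The exact layout of the file mdserver writes is not shown here, so treat this as an assumption: render_hosts_file is a hypothetical helper, the addresses are examples from the 10.122.0.0/16 range, and the real names may also carry a prefix or domain.

```python
def render_hosts_file(entries: dict) -> str:
    """Render a {hostname: ip} mapping as hosts-file lines, sorted by name."""
    lines = [f"{ip} {name}" for name, ip in sorted(entries.items())]
    return "\n".join(lines) + "\n"


hosts_text = render_hosts_file(
    {"instance2": "10.122.0.11", "instance1": "10.122.0.10"}
)
print(hosts_text, end="")
```

dnsmasq answers forward and reverse queries for names it reads from a hosts file, which is why a single file covers both directions.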
By default the dnsmasq DHCP configuration does not specify any DNS servers, but it can be configured to specify the mdserver-managed dnsmasq instance as a DNS server by adding
[dnsmasq]
use_dns=yes
to the mdserver configuration. Since dnsmasq acts as a forwarding resolver this will generally work without issues; however, reliability on any given network cannot be guaranteed.
Adding the dnsmasq instance to the hypervisor resolv.conf should also work without issues, but again the exact details of performance and reliability will depend on the local circumstances.