\ No newline at end of file
diff --git a/search/search_index.json b/search/search_index.json
index 7c7a40be14b7d..f11cb1c73634b 100644
--- a/search/search_index.json
+++ b/search/search_index.json
@@ -1 +1 @@
-{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Agent Integrations","text":"
Welcome to the wonderful world of developing Agent Integrations for Datadog. Here we document how we do things, the processes for various tasks, coding conventions & best practices, the internals of our testing infrastructure, and so much more.
If you are intrigued, continue reading. If not, continue all the same.
To start an environment run ddev env start <INTEGRATION> <ENVIRONMENT>, for example:
$ ddev env start postgres py3.9-14.0\n\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500 Starting: py3.9-14.0 \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n[+] Running 4/4\n - Network compose_pg-net Created 0.1s\n - Container compose-postgres_replica2-1 Started 0.9s\n - Container compose-postgres_replica-1 Started 0.9s\n - Container compose-postgres-1 Started 0.9s\n\nmaster-py3: Pulling from datadog/agent-dev\nDigest: sha256:72824c9a986b0ef017eabba4e2cc9872333c7e16eec453b02b2276a40518655c\nStatus: Image is up to date for datadog/agent-dev:master-py3\ndocker.io/datadog/agent-dev:master-py3\n\nStop environment -> ddev env stop postgres py3.9-14.0\nExecute tests -> ddev env test postgres py3.9-14.0\nCheck status -> ddev env agent postgres py3.9-14.0 status\nTrigger run -> ddev env agent postgres py3.9-14.0 check\nReload config -> ddev env reload postgres py3.9-14.0\nManage config -> ddev env config\nConfig file -> C:\\Users\\ofek\\AppData\\Local\\ddev\\env\\postgres\\py3.9-14.0\\config\\postgres.yaml\n
This sets up the selected environment and an instance of the Agent running in a Docker container. The default configuration is defined by each environment's test suite and is saved to a file, which is then mounted to the Agent container so you may freely modify it.
Let's see what we have running:
$ docker ps --format \"table {{.Image}}\\t{{.Status}}\\t{{.Ports}}\\t{{.Names}}\"\nIMAGE STATUS PORTS NAMES\ndatadog/agent-dev:master-py3 Up 3 minutes (healthy) dd_postgres_py3.9-14.0\npostgres:14-alpine Up 3 minutes (healthy) 5432/tcp, 0.0.0.0:5434->5434/tcp compose-postgres_replica2-1\npostgres:14-alpine Up 3 minutes (healthy) 0.0.0.0:5432->5432/tcp compose-postgres-1\npostgres:14-alpine Up 3 minutes (healthy) 5432/tcp, 0.0.0.0:5433->5433/tcp compose-postgres_replica-1\n
By default the version of the integration used will be the one shipped with the chosen Agent version. If you wish to modify an integration and test changes in real time, use the --dev flag.
Doing so will mount and install the integration in the Agent container. All modifications to the integration's directory will be propagated to the Agent, whether it be a code change or switching to a different Git branch.
If you modify the base package then you will need to mount that with the --base flag, which implicitly activates --dev.
To run tests against the live Agent, use the ddev env test command. It is similar to the test command except it is capable of running tests marked as E2E, and only runs such tests.
You may start an interactive debugging session using the --breakpoint/-b option.
The option accepts an integer representing the line number at which to break. For convenience, 0 and -1 are shortcuts to the first and last line of the integration's check method, respectively.
$ ddev env agent postgres py3.9-14.0 check -b 0\n> /opt/datadog-agent/embedded/lib/python3.9/site-packages/datadog_checks/postgres/postgres.py(851)check()\n-> tags = copy.copy(self.tags)\n(Pdb) list\n846 }\n847 self._database_instance_emitted[self.resolved_hostname] = event\n848 self.database_monitoring_metadata(json.dumps(event, default=default_json_event_encoding))\n849\n850 def check(self, _):\n851 B-> tags = copy.copy(self.tags)\n852 # Collect metrics\n853 try:\n854 # Check version\n855 self._connect()\n856 self.load_version() # We don't want to cache versions between runs to capture minor updates for metadata\n
Caveat
The line number must be within the integration's check method.
Testing and manual check runs always reflect the current state of code and configuration. However, if you want to see the result of changes in-app, you will need to refresh the environment by running ddev env reload <INTEGRATION> <ENVIRONMENT>.
To work on any integration you must install Python 3.12.
After installation, restart your terminal and ensure that your newly installed Python comes first in your PATH.
macOSWindowsLinux
First update the formulae and Homebrew itself:
brew update\n
then install Python:
brew install python@3.12\n
After it completes, check the output to see if it asked you to run any extra commands and if so, execute them.
Verify successful PATH modification:
which -a python\n
Windows users have it the easiest.
Download the Python 3.12 64-bit executable installer and run it. When prompted, be sure to select the option to add to your PATH. Also, it is recommended that you choose the per-user installation method.
Verify successful PATH modification:
where python\n
Ah, you enjoy difficult things. Are you using Gentoo?
We recommend using either Miniconda or pyenv to install Python 3.12. Whatever you do, never modify the system Python.
"},{"location":"setup/#installers","title":"Installers","text":"macOSWindows GUI installerCommand line installer
In your browser, download the .pkg file: ddev-10.2.0.pkg
Run your downloaded file and follow the on-screen instructions.
Restart your terminal.
To verify that the shell can find and run the ddev command in your PATH, use the following command.
$ ddev --version\n10.2.0\n
Download the file using the curl command. The -o option specifies the file name that the downloaded package is written to. In this example, the file is written to ddev-10.2.0.pkg in the current directory.
Run the standard macOS installer program, specifying the downloaded .pkg file as the source. Use the -pkg parameter to specify the name of the package to install, and the -target / parameter for the drive on which to install the package. The files are installed to /usr/local/ddev, and an entry is created at /etc/paths.d/ddev that instructs shells to add the /usr/local/ddev directory to your PATH. You must include sudo on the command to grant write permissions to those folders.
sudo installer -pkg ./ddev-10.2.0.pkg -target /\n
Restart your terminal.
To verify that the shell can find and run the ddev command in your PATH, use the following command.
$ ddev --version\n10.2.0\n
GUI installerCommand line installer
In your browser, download one of the .msi files:
ddev-10.2.0-x64.msi
ddev-10.2.0-x86.msi
Run your downloaded file and follow the on-screen instructions.
Restart your terminal.
To verify that the shell can find and run the ddev command in your PATH, use the following command.
$ ddev --version\n10.2.0\n
Download and run the installer using the standard Windows msiexec program, specifying one of the .msi files as the source. Use the /passive and /i parameters to request an unattended, normal installation.
After downloading the archive corresponding to your platform and architecture, extract the binary to a directory that is on your PATH and rename it to ddev.
Do not use sudo as it may result in a broken installation!
Run:
pipx install -e /path/to/integrations-core/ddev\n
Warning
Do not use sudo as it may result in a broken installation!
Re-sync dependencies at any time by running:
pipx upgrade ddev\n
Note
Be aware that this method does not keep track of dependencies so you will need to re-run the command if/when the required dependencies are changed.
Note
Also be aware that this method does not get any changes from datadog_checks_dev, so if you have unreleased changes from datadog_checks_dev that may affect ddev, you will need to run the following to get the most recent changes from datadog_checks_dev to your ddev:
You'll notice that all environments for running tests are prefixed with pyX.Y, indicating the Python version to use. If you don't have a particular version installed (for example Python 2.7), such environments will be skipped.
The second part of a test environment's name corresponds to the version of the product. For example, the 14.0 in py3.9-14.0 implies tests will run against version 14.x of PostgreSQL.
If there is no version suffix, it means that either:
the version is pinned, usually set to pull the latest release, or
there is no concept of a product, such as the disk check
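As a rough illustration of the naming convention above, the following Python sketch splits an environment name into its Python version and optional product version. The function name parse_env_name is hypothetical and is not part of ddev.

```python
def parse_env_name(name):
    '''Split an environment name such as py3.9-14.0 into (python_version, product_version).'''
    python_part, _, product_version = name.partition('-')
    if not python_part.startswith('py'):
        raise ValueError(f'unexpected environment name: {name}')
    # An empty suffix means the version is pinned or there is no product at all.
    return python_part[2:], product_version or None


print(parse_env_name('py3.9-14.0'))
print(parse_env_name('py3.12'))
```

An environment without a suffix yields None for the product version, which covers both cases listed above.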
Passing just the integration name will run every test environment. You may select a subset of environments to run by appending a : followed by a comma-separated list of environments.
For example, executing:
ddev test postgres:py3.9-13.0,py3.9-11.0\n
will run tests for the environment py3.9-13.0 followed by the environment py3.9-11.0.
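The selection syntax can be sketched in Python as follows. This is only an illustration of how such an argument could be split, under the assumption that an empty environment list means every environment; parse_target is a hypothetical helper, not ddev's implementation.

```python
def parse_target(target):
    '''Split INTEGRATION[:ENV1,ENV2,...] into the integration name and its environments.'''
    integration, _, env_list = target.partition(':')
    # Environments keep the order in which they were written on the command line.
    environments = [env for env in env_list.split(',') if env]
    return integration, environments


print(parse_target('postgres:py3.9-13.0,py3.9-11.0'))
print(parse_target('postgres'))
```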
If no integrations are specified then only integrations that were changed will be tested, based on a diff between the latest commit to the current and master branches.
Whether an integration is considered changed depends on the file extensions of the paths in the diff. For example, if only Markdown files were modified then nothing will be tested.
The integrations will be tested in lexicographical order.
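A minimal Python sketch of this selection logic is shown below. The set of extensions and the assumption that the first path component names the integration are both illustrative guesses; the real rules live in ddev and may differ.

```python
from pathlib import PurePosixPath

# Assumed set of extensions that count as a testable change.
TESTABLE_EXTENSIONS = {'.py', '.ini', '.toml', '.txt', '.yaml', '.yml'}


def changed_integrations(diff_paths):
    '''Return the integrations touched by testable files, in lexicographical order.'''
    changed = set()
    for path in map(PurePosixPath, diff_paths):
        # The first path component is taken to be the integration directory.
        if len(path.parts) > 1 and path.suffix in TESTABLE_EXTENSIONS:
            changed.add(path.parts[0])
    return sorted(changed)


print(changed_integrations([
    'postgres/README.md',
    'redisdb/datadog_checks/redisdb/redisdb.py',
    'apache/tests/test_apache.py',
]))
```

Here postgres is skipped because only a Markdown file changed, and the remaining integrations come back sorted.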
To measure code coverage, use the --cov/-c flag. Doing so will display a summary of coverage statistics after successful execution of integrations' tests.
To run only the lint checks, use the --lint/-s shortcut flag.
You may also only run the formatter using the --fmt/-fs shortcut flag. The formatter will automatically resolve the most common errors caught by the lint checks.
The IBM i integration uses ODBC to connect to IBM i hosts and query system data through an SQL interface. To do so, it uses the ODBC Driver for IBM i Access Client Solutions, an IBM proprietary ODBC driver that manages connections to IBM i hosts.
Limitations in the IBM i ODBC driver make it necessary to structure the check in a more complex way than would be expected, to prevent the check from hanging or leaking threads.
"},{"location":"architecture/ibm_i/#ibm-i-odbc-driver-limitations","title":"IBM i ODBC driver limitations","text":"
ODBC drivers can optionally support custom configuration through connection attributes, which help configure how a connection works. One fundamental connection attribute is SQL_ATTR_QUERY_TIMEOUT, which sets the timeout for SQL queries done through the driver (related _TIMEOUT attributes set the timeout for other connection steps). If this connection attribute is not set there is no timeout, which means the driver gets stuck waiting for a reply when a network issue happens.
As of the writing of this document, the IBM i ODBC driver behavior when setting the SQL_ATTR_QUERY_TIMEOUT connection attribute is similar to the one described in ODBC Query Timeout Property. For the IBM i DB2 driver: the driver estimates the running time of a query and preemptively aborts the query if the estimate is above the specified threshold, but it does not take into account the actual running time of the query (and thus, it's not useful for avoiding network issues).
"},{"location":"architecture/ibm_i/#ibm-i-check-workaround","title":"IBM i check workaround","text":"
To deal with the ODBC driver limitations, the IBM i check needs an alternative way to abort a query once a given timeout has passed. To do so, the IBM i check runs queries in a subprocess, which it kills and restarts when a timeout expires. This subprocess runs query_script.py using the embedded Python interpreter.
It is essential that the connection is kept across queries. For a given connection, ELAPSED_ columns on IBM i views report statistics since the last time the table was queried on that connection, thus if using different connections these values are always zero.
To communicate with the main Agent process, the subprocess and the IBM i check exchange JSON-encoded messages through pipes until the special ENDOFQUERY message is received. Special care is needed to avoid blocking on reads and writes of the pipes.
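The pattern described above can be sketched in Python as follows. This is a simplified illustration, not the check's actual code: CHILD_SCRIPT is a hypothetical stand-in for query_script.py that needs no IBM i host, the sleep field exists only to simulate a slow query, and a reader thread with a queue is one assumed way to avoid blocking on the pipe.

```python
import json
import queue
import subprocess
import sys
import threading
import time

# Hypothetical stand-in for query_script.py: reads one JSON request per line,
# optionally sleeps to simulate a slow query, then replies and ends the batch.
CHILD_SCRIPT = '''
import json, sys, time
for line in sys.stdin:
    request = json.loads(line)
    time.sleep(request['sleep'])
    print(json.dumps({'row': [request['sql'], 1]}), flush=True)
    print('ENDOFQUERY', flush=True)
'''


class QueryRunner:
    def __init__(self):
        self._start()

    def _start(self):
        # One long-lived subprocess, so state is kept across queries.
        self.proc = subprocess.Popen(
            [sys.executable, '-c', CHILD_SCRIPT],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
        )
        self.lines = queue.Queue()
        # A reader thread keeps the caller from blocking on the pipe.
        threading.Thread(target=self._pump, args=(self.proc, self.lines), daemon=True).start()

    @staticmethod
    def _pump(proc, lines):
        for line in proc.stdout:
            lines.put(line.rstrip())

    def run(self, sql, timeout, sleep=0):
        '''Return the rows of one query, or None if it timed out and the subprocess was restarted.'''
        print(json.dumps({'sql': sql, 'sleep': sleep}), file=self.proc.stdin, flush=True)
        deadline = time.monotonic() + timeout
        rows = []
        while True:
            remaining = deadline - time.monotonic()
            try:
                if remaining <= 0:
                    raise queue.Empty
                line = self.lines.get(timeout=remaining)
            except queue.Empty:
                # Timed out: kill the subprocess and start a fresh one.
                self.proc.kill()
                self.proc.wait()
                self._start()
                return None
            if line == 'ENDOFQUERY':
                return rows
            rows.append(json.loads(line)['row'])


runner = QueryRunner()
print(runner.run('SELECT 1', timeout=10))
print(runner.run('SLOW QUERY', timeout=0.3, sleep=5))
print(runner.run('SELECT 2', timeout=10))
runner.proc.kill()
runner.proc.wait()
```

Because each restart creates a new queue, replies from a killed subprocess can never be mistaken for replies to a later query.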
For adding/modifying the queries, the check uses the standard QueryManager class used for SQL-based checks, except that each query needs to include a timeout value (since, empirically, some queries take much longer to complete on IBM i hosts).
While most integrations are either Python, JMX, or implemented in the Agent in Go, the SNMP integration is a bit more complex.
Here's an overview of what this integration involves:
A Python check, responsible for:
Collecting metrics from a specific device IP. Metrics typically come from profiles, but they can also be specified explicitly.
Auto-discovering devices over a network. (Pending deprecation in favor of Agent auto-discovery.)
An Agent service listener, responsible for auto-discovering devices over a network and forwarding discovered instances to the existing Agent check scheduling pipeline. Also known as \"Agent SNMP auto-discovery\".
The diagram below shows how these components interact for a typical VM-based setup (single Agent on a host). For Datadog Cluster Agent (DCA) deployments, see Cluster Agent support.
The Python check includes a multithreaded implementation of device auto-discovery. It runs on instances that use network_address instead of ip_address:
The main tasks performed by device auto-discovery are:
Find new devices: For each IP in the network_address CIDR range, the check queries the device sysObjectID. If the query succeeds and the sysObjectID matches one of the registered profiles, the device is added as a discovered instance. This logic is run at regular intervals in a separate thread.
Cache devices: To improve performance, discovered instances are cached on disk based on a hash of the instance. Since options from the network_address instance are copied into discovered instances, the cache is invalidated if the network_address changes.
Check devices: On each check run, the check runs a check on all discovered instances. This is done in parallel using a threadpool. The check waits for all sub-checks to finish.
Handle failures: Discovered instances that fail after a configured number of times are dropped. They may be rediscovered later.
Submit discovery-related metrics: the check submits the total number of discovered devices for a given network_address instance.
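The discovery and failure-handling tasks listed above can be sketched in Python as follows. This is a single-threaded illustration only: PROFILES, ALLOWED_FAILURES, and the get_sys_object_id callback are hypothetical, and the sketch skips the network and broadcast addresses of the range, which the real check may not do.

```python
import ipaddress

# Hypothetical profile registry: sysObjectID -> profile name.
PROFILES = {'1.3.6.1.4.1.9.1.1': 'cisco-example'}
ALLOWED_FAILURES = 3  # assumed value for the configurable failure limit


def discover(network_address, get_sys_object_id):
    '''Return {ip: profile} for every address whose sysObjectID matches a registered profile.'''
    found = {}
    for ip in ipaddress.ip_network(network_address).hosts():
        oid = get_sys_object_id(str(ip))  # None when the device does not answer
        if oid in PROFILES:
            found[str(ip)] = PROFILES[oid]
    return found


def record_failure(failures, discovered, ip):
    '''Count a failed check and drop the device once it reaches the failure limit.'''
    failures[ip] = failures.get(ip, 0) + 1
    if failures[ip] >= ALLOWED_FAILURES:
        discovered.pop(ip, None)
        failures.pop(ip)


# Fake network: one device with a known profile, one with an unknown sysObjectID.
fake_devices = {'192.168.1.17': '1.3.6.1.4.1.9.1.1', '192.168.1.18': '1.3.6.1.4.1.0.0'}
discovered = discover('192.168.1.16/29', fake_devices.get)
print(discovered)
```

A dropped device is simply removed from the discovered mapping, so a later call to discover can find it again.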
The approach described above is not ideal for several reasons:
The check code is harder to understand since the two distinct paths (\"single device\" vs \"entire network\") live in a single integration.
Each network instance manages several long-running threads that span well beyond the lifespan of a single check run.
Each network check pseudo-schedules other instances, which is normally the responsibility of the Agent.
For this reason, auto-discovery was eventually implemented in the Agent as a proper service listener (see below), and users should be discouraged from using Python auto-discovery. When the deprecation period expires, we will be able to remove auto-discovery logic from the Python check, making it exclusively focused on checking single devices.
Agent auto-discovery implements the same logic as the Python auto-discovery, but as a service listener in the Agent Go package.
This approach leverages the existing Agent scheduling logic, and makes it possible to scale device auto-discovery using the Datadog Cluster Agent (see Cluster Agent support).
Pending official documentation, here is an example configuration:
For Kubernetes environments, the Cluster Agent can be configured to use the SNMP Agent auto-discovery (via snmp listener) logic as a source of Cluster checks.
The Datadog Cluster Agent (DCA) uses the snmp_listener config (Agent auto-discovery) to listen for IP ranges, then schedules snmp check instances to be run by one or more normal Datadog Agents.
Agent auto-discovery combined with the Cluster Agent is very scalable and can be used to monitor a large number of SNMP devices.
"},{"location":"architecture/snmp/#example-cluster-agent-setup-with-snmp-agent-auto-discovery-using-datadog-helm-chart","title":"Example Cluster Agent setup with SNMP Agent auto-discovery using Datadog helm-chart","text":"
datadog:\n ## @param apiKey - string - required\n ## Set this to your Datadog API key before the Agent runs.\n ## ref: https://app.datadoghq.com/account/settings/agent/latest?platform=kubernetes\n #\n apiKey: <DATADOG_API_KEY>\n\n ## @param clusterName - string - optional\n ## Set a unique cluster name to allow scoping hosts and Cluster Checks easily\n ## The name must be unique and must be dot-separated tokens where a token can be up to 40 characters with the following restrictions:\n ## * Lowercase letters, numbers, and hyphens only.\n ## * Must start with a letter.\n ## * Must end with a number or a letter.\n ## Compared to the rules of GKE, dots are allowed whereas they are not allowed on GKE:\n ## https://cloud.google.com/kubernetes-engine/docs/reference/rest/v1beta1/projects.locations.clusters#Cluster.FIELDS.name\n #\n clusterName: my-snmp-cluster\n\n ## @param clusterChecks - object - required\n ## Enable the Cluster Checks feature on both the cluster-agents and the daemonset\n ## ref: https://docs.datadoghq.com/agent/autodiscovery/clusterchecks/\n ## Autodiscovery via Kube Service annotations is automatically enabled\n #\n clusterChecks:\n enabled: true\n\n ## @param tags - list of key:value elements - optional\n ## List of tags to attach to every metric, event and service check collected by this Agent.\n ##\n ## Learn more about tagging: https://docs.datadoghq.com/tagging/\n #\n tags:\n - 'env:test-snmp-cluster-agent'\n\n## @param clusterAgent - object - required\n## This is the Datadog Cluster Agent implementation that handles cluster-wide\n## metrics more cleanly, separates concerns for better rbac, and implements\n## the external metrics API so you can autoscale HPAs based on datadog metrics\n## ref: https://docs.datadoghq.com/agent/kubernetes/cluster/\n#\nclusterAgent:\n ## @param enabled - boolean - required\n ## Set this to true to enable Datadog Cluster Agent\n #\n enabled: true\n\n ## @param confd - list of objects - optional\n ## Provide 
additional cluster check configurations\n ## Each key will become a file in /conf.d\n ## ref: https://docs.datadoghq.com/agent/autodiscovery/\n #\n confd:\n # Static checks\n http_check.yaml: |-\n cluster_check: true\n instances:\n - name: 'Check Example Site1'\n url: http://example.net\n - name: 'Check Example Site2'\n url: http://example.net\n - name: 'Check Example Site3'\n url: http://example.net\n # Autodiscovery template needed for `snmp_listener` to create instance configs\n snmp.yaml: |-\n cluster_check: true\n\n # AD config below is copied from: https://github.com/DataDog/datadog-agent/blob/master/cmd/agent/dist/conf.d/snmp.d/auto_conf.yaml\n ad_identifiers:\n - snmp\n init_config:\n instances:\n -\n ## @param ip_address - string - optional\n ## The IP address of the device to monitor.\n #\n ip_address: \"%%host%%\"\n\n ## @param port - integer - optional - default: 161\n ## Default SNMP port.\n #\n port: \"%%port%%\"\n\n ## @param snmp_version - integer - optional - default: 2\n ## If you are using SNMP v1 set snmp_version to 1 (required)\n ## If you are using SNMP v3 set snmp_version to 3 (required)\n #\n snmp_version: \"%%extra_version%%\"\n\n ## @param timeout - integer - optional - default: 5\n ## Amount of second before timing out.\n #\n timeout: \"%%extra_timeout%%\"\n\n ## @param retries - integer - optional - default: 5\n ## Amount of retries before failure.\n #\n retries: \"%%extra_retries%%\"\n\n ## @param community_string - string - optional\n ## Only useful for SNMP v1 & v2.\n #\n community_string: \"%%extra_community%%\"\n\n ## @param user - string - optional\n ## USERNAME to connect to your SNMP devices.\n #\n user: \"%%extra_user%%\"\n\n ## @param authKey - string - optional\n ## Authentication key to use with your Authentication type.\n #\n authKey: \"%%extra_auth_key%%\"\n\n ## @param authProtocol - string - optional\n ## Authentication type to use when connecting to your SNMP devices.\n ## It can be one of: MD5, SHA, SHA224, SHA256, 
SHA384, SHA512.\n ## Default to MD5 when `authKey` is specified.\n #\n authProtocol: \"%%extra_auth_protocol%%\"\n\n ## @param privKey - string - optional\n ## Privacy type key to use with your Privacy type.\n #\n privKey: \"%%extra_priv_key%%\"\n\n ## @param privProtocol - string - optional\n ## Privacy type to use when connecting to your SNMP devices.\n ## It can be one of: DES, 3DES, AES, AES192, AES256, AES192C, AES256C.\n ## Default to DES when `privKey` is specified.\n #\n privProtocol: \"%%extra_priv_protocol%%\"\n\n ## @param context_engine_id - string - optional\n ## ID of your context engine; typically unneeded.\n ## (optional SNMP v3-only parameter)\n #\n context_engine_id: \"%%extra_context_engine_id%%\"\n\n ## @param context_name - string - optional\n ## Name of your context (optional SNMP v3-only parameter).\n #\n context_name: \"%%extra_context_name%%\"\n\n ## @param tags - list of key:value element - optional\n ## List of tags to attach to every metric, event and service check emitted by this integration.\n ##\n ## Learn more about tagging: https://docs.datadoghq.com/tagging/\n #\n tags:\n # The autodiscovery subnet the device is part of.\n # Used by Agent autodiscovery to pass subnet name.\n - \"autodiscovery_subnet:%%extra_autodiscovery_subnet%%\"\n\n ## @param extra_tags - string - optional\n ## Comma separated tags to attach to every metric, event and service check emitted by this integration.\n ## Example:\n ## extra_tags: \"tag1:val1,tag2:val2\"\n #\n extra_tags: \"%%extra_tags%%\"\n\n ## @param oid_batch_size - integer - optional - default: 60\n ## The number of OIDs handled by each batch. 
Increasing this number improves performance but\n ## uses more resources.\n #\n oid_batch_size: \"%%extra_oid_batch_size%%\"\n\n ## @param datadog-cluster.yaml - object - optional\n ## Specify custom contents for the datadog cluster agent config (datadog-cluster.yaml).\n #\n datadog_cluster_yaml:\n listeners:\n - name: snmp\n\n # See here for all `snmp_listener` configs: https://github.com/DataDog/datadog-agent/blob/master/pkg/config/config_template.yaml\n snmp_listener:\n workers: 2\n discovery_interval: 10\n configs:\n - network: 192.168.1.16/29\n version: 2\n port: 1161\n community: cisco_icm\n - network: 192.168.1.16/29\n version: 2\n port: 1161\n community: f5\n
TODO: architecture diagram, example setup, affected files and repos, local testing tools, etc.
vSphere is a VMware product dedicated to managing a (usually) on-premise infrastructure. From physical machines running VMware ESXi that are called ESXi Hosts, users can spin up or migrate Virtual Machines from one host to another.
vSphere is an integrated solution that provides a simple management interface over concepts such as data storage and computing resources.
This section details some vSphere-specific elements. It is not intended to be an exhaustive list, but rather a place for those unfamiliar with the product to learn the basics required to understand how the Datadog integration works.
vSphere - The complete suite of tools and technologies detailed in this article.
vCenter server - The main machine which controls ESXi hosts and provides both a web UI and an API to control the vSphere environment.
vCSA (vCenter Server Appliance) - A specific kind of vCenter where the software runs in a dedicated Linux machine (more recent). By contrast, the legacy vCenter is typically installed on an existing Windows machine.
ESXi host - The physical machine controlled by vCenter where the ESXi (bare-metal) virtualizer is installed. The host boots a minimal OS that can run Virtual Machines.
VM - What anyone using vSphere really needs in the end, instances that can run applications and code. Note: Datadog monitors both ESXi hosts and VMs and it calls them both \"host\" (they are in the host map).
Attributes/tags - It is possible to add attributes and tags to any vSphere resource. The two are now very similar, with \"attributes\" being the deprecated option.
Datacenter - A set of resources grouped together. A single vCenter server can handle multiple datacenters.
Datastore - A virtual vSphere concept to represent data storing capabilities. It can be an NFS server that ESXi hosts have read/write access to, it can be a mounted disk on the host and more. Datastores are often shared between multiple hosts. This allows Virtual Machines to be migrated from one host to another.
Cluster - A logical grouping of computational resources. You can add multiple ESXi hosts to a cluster and then create VMs in the cluster rather than on a specific host. vSphere takes care of placing each VM on one of the ESXi hosts and migrating it when needed.
Photon OS - An open-source minimal Linux distribution used by both ESXi and vCSA as a base.
The Datadog vSphere integration runs from a single agent and pulls all the information from a single vCenter endpoint. Because the agent cannot run directly on Photon OS, it is usually required that the agent runs within a dedicated VM inside the vSphere infrastructure.
Once the agent is running, the minimal configuration (as of version 5.x) is as follows:
host is the endpoint used to access the vSphere Client from a web browser. The host is either an FQDN or an IP address, not an HTTP URL.
username and password are the credentials to log in to vCenter.
use_legacy_check_version is a backward compatibility flag. It should always be set to false and this flag will be removed in a future version of the integration. Setting it to true tells the agent to use an older and deprecated version of the vSphere integration.
empty_default_hostname is a field used by the agent directly (and not the integration). By default, the agent does not allow submitting metrics without attaching an explicit host tag unless this flag is set to true. The vSphere integration uses that behavior for some metrics and service checks. For example, the vsphere.vm.count metric which gives a count of the VMs in the infra is not submitted with a host tag. This is particularly important if the agent runs inside a vSphere VM. If the vsphere.vm.count was submitted with a host tag, the Datadog backend would attach all the other host tags to the metric, for example vsphere_type:vm or vsphere_host:<NAME_OF_THE_ESX_HOST> which makes the metric almost impossible to use.
vSphere metrics are documented in their documentation page and each metric has a defined \"collection level\".
That level determines the amount of data gathered by the integration and especially which metrics are available. More details here.
By default, only the level 1 metrics are collected but this can be increased in the integration configuration file.
"},{"location":"architecture/vsphere/#realtime-vs-historical","title":"Realtime vs historical","text":"
Each ESXi host collects and stores data for each metric, for itself and for every VM it hosts, every 20 seconds. Those data points are stored for up to one hour and are called realtime. Note: Each metric always concerns either a VM or an ESXi host. Metrics that concern datastores, for example, are not collected by the ESXi hosts.
Additionally, the vCenter server collects data from all the ESXi hosts and stores the data points, with some aggregation rollup, in its own database. Those data points are called \"historical\".
Finally, the vCenter server also collects metrics for other kinds of resources (like Datastore, ClusterComputeResource, Datacenter...) Those data points are necessarily \"historical\".
The reason for such an important distinction is that historical metrics are much MUCH slower to collect than realtime metrics. The vSphere integration will always collect the \"realtime\" data for metrics that concern ESXi hosts and VMs. But the integration also collects metrics for Datastores, ClusterComputeResources, Datacenters, and maybe others in the future.
That's why, in the context of the Datadog vSphere integration, we usually simplify by considering that:
VMs and ESXi hosts are \"realtime resources\". Metrics for such resources are quick and easy to get by querying vCenter that will in turn query all the ESXi hosts.
Datastores, ClusterComputeResources, and Datacenters are \"historical resources\" and are much slower to collect.
To collect all metrics (realtime and historical), it is advised to use two \"check instances\": one with collection_type: realtime and one with collection_type: historical. This way all metrics are collected, and because the two check instances run on different schedules, the slowness of collecting historical metrics does not affect the rate at which realtime metrics are collected.
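As an illustration, the two instances could be derived from one set of connection settings, as in the Python sketch below. The helper build_instances is hypothetical, the values are placeholders, and the real configuration is written in the integration's YAML file rather than in Python.

```python
def build_instances(host, username, password):
    '''Return one realtime and one historical instance sharing the same connection settings.'''
    base = {
        'host': host,
        'username': username,
        'password': password,
        'use_legacy_check_version': False,
        'empty_default_hostname': True,
    }
    # Each instance is scheduled independently, so slow historical collection
    # does not delay realtime collection.
    return [dict(base, collection_type=kind) for kind in ('realtime', 'historical')]


for instance in build_instances('vcenter.example.local', 'datadog-readonly', '<PASSWORD>'):
    print(instance['collection_type'], instance['host'])
```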
"},{"location":"architecture/vsphere/#vsphere-tags-and-attributes","title":"vSphere tags and attributes","text":"
Similarly to how Datadog allows you to add tags to your different hosts (things like the os or the instance-type of your machines), vSphere has \"tags\" and \"attributes\".
A lot of details can be found here: https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.vcenterhost.doc/GUID-E8E854DD-AA97-4E0C-8419-CE84F93C4058.html#:~:text=Tags%20and%20attributes%20allow%20you,that%20tag%20to%20a%20category.
But the overall idea is that both tags and attributes are additional information that you can attach to your vSphere resources and that \"tags\" are newer and more featureful than \"attributes\".
A very flexible filtering system has been implemented with the vSphere integration.
This allows fine-tuned configuration so that:
You only pay for the host and VMs you really want to monitor.
You reduce the load on your vCenter server by running just the queries that you need.
You improve the check runtime, which otherwise increases linearly with the size of your infrastructure and has been seen to take up to 10 minutes in some large environments.
We provide two types of filtering, one based on metrics, the other based on resources.
The metric filter is fairly simple: for each resource type, you can provide some regexes. If a metric matches any of the filters, it will be fetched and submitted. The configuration looks like this:
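As a rough illustration of the matching logic only (this is Python, not the integration's configuration syntax), the sketch below checks a metric name against the regexes registered for its resource type. METRIC_FILTERS and should_collect are hypothetical names, and rejecting every metric of a resource type that has no filters is an assumption made for the sketch.

```python
import re

# Hypothetical filter mapping: resource type -> list of regexes.
METRIC_FILTERS = {
    'vm': ['^cpu[.]', 'mem[.]usage'],
    'host': ['^disk[.]'],
}


def should_collect(resource_type, metric_name, filters=METRIC_FILTERS):
    '''Return True if the metric matches any regex configured for its resource type.'''
    patterns = filters.get(resource_type, [])
    return any(re.search(pattern, metric_name) for pattern in patterns)


print(should_collect('vm', 'cpu.usage.avg'))
print(should_collect('vm', 'net.received.avg'))
```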
The resource filter, on the other hand, lets you exclude some vSphere resources (VM, ESXi host, etc.) based on an \"attribute\" of that resource. The possible attributes as of today are:
name, literally the name of the resource (as defined in vCenter).
inventory_path, a path-like string that represents the location of the resource in the inventory tree, as each resource only ever has a single parent, recursively up to the root. For example: /my.datacenter.local/vm/staging/myservice/vm_name
tag, see the tags and attributes section. Used to filter resources based on the attached tags.
attribute, see the tags and attributes section. Used to filter resources based on the attached attributes.
hostname (only for VMs), the name of the ESXi host where the VM is running.
guest_hostname (only for VMs), the name of the OS as reported from within the machine. VMware tools have to be installed on the VM, otherwise vCenter is not able to fetch this information.
A possible filtering configuration would look like this:
In vSphere each metric is defined by three \"dimensions\".
The resource to which the metric applies (for example the VM called \"abc1\").
The name of the metric (for example cpu.usage).
An additional dimension that varies between metrics (for example the CPU core id).
This is similar to how Datadog represents metrics, except that the context cardinality is limited to two \"keys\": the name of the resource (usually the \"host\" tag) and one additional tag key.
This available tag key is defined as the \"instance\" property, or \"instance tag\", in vSphere. The Datadog integration does not collect this dimension by default because, in large systems, its performance cost can outweigh its added value from a monitoring perspective.
Also, when fetching metrics with the instance tag, vSphere only provides the value of the instance tag; it doesn't expose a human-readable \"key\" for that tag. For the cpu.usage metric, with the core_id as the instance tag, the integration has to \"know\" the meaning of the instance tag, which is why we rely on a hardcoded list in the integration.
Because this instance tag can provide additional visibility, it is possible to enable it for some metrics from the configuration. For example, if we're really interested in getting the usage of the cpu per core, the setup can look like this:
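The idea of a hardcoded list that gives the instance value a readable key can be sketched as follows. The mapping contents, the fallback key `instance`, and the function name `build_tags` are all hypothetical and chosen only for illustration; they are not copied from the integration.

```python
# Hypothetical mapping: metric name -> human-readable key for the instance value.
INSTANCE_TAG_KEYS = {
    "cpu.usage": "cpu_core",
    "disk.read": "device_path",
}


def build_tags(metric_name, instance_value, base_tags):
    """Append `<key>:<value>` when vSphere returned an instance value."""
    tags = list(base_tags)
    if instance_value:
        # Fall back to a generic key when the metric is not in the list.
        key = INSTANCE_TAG_KEYS.get(metric_name, "instance")
        tags.append("{}:{}".format(key, instance_value))
    return tags
```

When the instance value is empty (the aggregated series), the tags are returned unchanged, so per-instance and aggregated points remain distinguishable.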
Users set a path from which to collect events; the path is the name of a channel such as System or Application.
There are 3 ways to select filter criteria rather than collecting all events:
query - A raw XPath or structured XML query used to filter events. This overrides any selected filters.
filters - A mapping of properties to allowed values. Every filter must match (filters are combined with the and operator), and a filter matches when any of its values matches (values are combined with the or operator). This option is a convenience for a query that is relatively basic.
Rather than collect all events and perform filtering within the check, the filters are converted to an XPath expression. This approach offloads all filtering to the kernel (like query), which increases performance and reduces bandwidth usage when connecting to a remote machine.
included_messages/excluded_messages - These are regular expression patterns used to filter by events' messages specifically (if a message is found), with the exclude list taking precedence. These may be used in place of or with query/filters, as there exists no query construct by which to select a message attribute.
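The and/or semantics of filters and the precedence of the exclude list can be sketched in Python. This is a simplified illustration under stated assumptions: `build_xpath` and `message_allowed` are hypothetical names, the property names are assumed to be child elements of `System` with integer values, and the real check handles more cases than shown here.

```python
import re


def build_xpath(filters):
    """Combine properties with `and` and each property's values with `or`."""
    clauses = []
    for prop, values in sorted(filters.items()):
        alternatives = " or ".join("{}={}".format(prop, v) for v in values)
        clauses.append("({})".format(alternatives))
    if not clauses:
        # No filters: select every event.
        return "*"
    return "*[System[{}]]".format(" and ".join(clauses))


def message_allowed(message, included, excluded):
    """Exclude patterns win; if include patterns exist, one of them must match."""
    if any(re.search(pattern, message) for pattern in excluded):
        return False
    return not included or any(re.search(pattern, message) for pattern in included)


print(build_xpath({"EventID": [1000, 1001], "Level": [2]}))
```

Building the expression up front is what lets the event log service do the filtering, while the message patterns still have to be applied in Python after each event is rendered.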
A pull subscription model is used. At every check run, the cached event log handle waits to be signaled for a configurable number of seconds. If signaled, the check then polls all available events in batches of a configurable size.
At configurable intervals, the most recently encountered event is saved to the filesystem. This is useful for preventing duplicate events being sent as a consequence of Agent restarts, especially when the start option is set to oldest.
Events may alternatively be configured to be submitted as logs. The code for that resides here.
Only a subset of the check's functionality is available. Namely, each log configuration will collect all events of the given channel without filtering, tagging, or remote connection options.
This implementation uses the push subscription model. There is a bit of C in charge of rendering the relevant data and registering the Go tailer callback that ultimately sends the log to the backend.
Setting legacy_mode to true in the check will use WMI to collect events, which is significantly more resource intensive. This mode has entirely different configuration options and will be removed in a future release.
Agent 6 can only use this mode as Python 2 does not support the new implementation.
The Base package provides all the functionality and utilities necessary for writing Agent Integrations. Most importantly, it provides the AgentCheck base class from which every Check must inherit.
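A minimal sketch of such a Check follows. So that it runs without the Agent or the base package installed, it subclasses a small stand-in, `StubAgentCheck`, which is hypothetical and only mimics namespace prefixing by recording submissions in memory; in a real integration the class would inherit from `AgentCheck`, imported from `datadog_checks.base`.

```python
class StubAgentCheck(object):
    """Hypothetical stand-in for AgentCheck that records submissions in memory."""

    __NAMESPACE__ = ''

    def __init__(self, name, init_config, instances):
        self.name = name
        self.init_config = init_config
        self.instance = instances[0] if instances else None
        self.submitted = []

    def gauge(self, name, value, tags=None):
        # Prefix the metric name with the namespace, if one is defined.
        if self.__NAMESPACE__:
            name = '{}.{}'.format(self.__NAMESPACE__, name)
        self.submitted.append((name, float(value), list(tags or [])))


class AwesomeCheck(StubAgentCheck):
    # Every submission's name is prefixed with `awesome.`.
    __NAMESPACE__ = 'awesome'

    def check(self, instance):
        self.gauge('test', 1.23, tags=['foo:bar'])


check = AwesomeCheck('awesome', {}, [{}])
check.check(check.instance)
print(check.submitted)
```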
The check method is what the Datadog Agent will execute.
In this example we created a Check and gave it a namespace of awesome. This means that by default, every submission's name will be prefixed with awesome..
We submitted a gauge metric named awesome.test with a value of 1.23 tagged by foo:bar.
The magic hidden by the usability of the API is that this actually calls a C binding which communicates with the Agent (written in Go).
In general, you don't need to and you should not override anything from the base class except the check method but sometimes it might be useful for a Check to have its own constructor.
When overriding __init__ you have to remember that, depending on the configuration, the Agent might create several different Check instances and the method would be called as many times.
Agent 6,7 signature:
AgentCheck(name, init_config, instances) # instances contain only 1 instance\nAgentCheck.check(instance)\n
Agent 8 signature:
AgentCheck(name, init_config, instance) # one instance\nAgentCheck.check() # no more instance argument for check method\n
Note
When loading a custom check, the Agent inspects the module, searching for a subclass of AgentCheck. If such a class exists but has itself been subclassed, it will be ignored; you should never derive from an existing Check.
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
@traced_class\nclass AgentCheck(object):\n \"\"\"\n The base class for any Agent based integration.\n\n In general, you don't need to and you should not override anything from the base\n class except the `check` method but sometimes it might be useful for a Check to\n have its own constructor.\n\n When overriding `__init__` you have to remember that, depending on the configuration,\n the Agent might create several different Check instances and the method would be\n called as many times.\n\n Agent 6,7 signature:\n\n AgentCheck(name, init_config, instances) # instances contain only 1 instance\n AgentCheck.check(instance)\n\n Agent 8 signature:\n\n AgentCheck(name, init_config, instance) # one instance\n AgentCheck.check() # no more instance argument for check method\n\n !!! note\n when loading a Custom check, the Agent will inspect the module searching\n for a subclass of `AgentCheck`. If such a class exists but has been derived in\n turn, it'll be ignored - **you should never derive from an existing Check**.\n \"\"\"\n\n # If defined, this will be the prefix of every metric/service check and the source type of events\n __NAMESPACE__ = ''\n\n OK, WARNING, CRITICAL, UNKNOWN = ServiceCheck\n\n # Used by `self.http` for an instance of RequestsWrapper\n HTTP_CONFIG_REMAPPER = None\n\n # Used by `create_tls_context` for an instance of RequestsWrapper\n TLS_CONFIG_REMAPPER = None\n\n # Used by `self.set_metadata` for an instance of MetadataManager\n #\n # This is a mapping of metadata names to functions. When you call `self.set_metadata(name, value, **options)`,\n # if `name` is in this mapping then the corresponding function will be called with the `value`, and the\n # return value(s) will be sent instead.\n #\n # Transformer functions must satisfy the following signature:\n #\n # def transform_<NAME>(value: Any, options: dict) -> Union[str, Dict[str, str]]:\n #\n # If the return type is a string, then it will be sent as the value for `name`. 
If the return type is\n # a mapping type, then each key will be considered a `name` and will be sent with its (str) value.\n METADATA_TRANSFORMERS = None\n\n FIRST_CAP_RE = re.compile(br'(.)([A-Z][a-z]+)')\n ALL_CAP_RE = re.compile(br'([a-z0-9])([A-Z])')\n METRIC_REPLACEMENT = re.compile(br'([^a-zA-Z0-9_.]+)|(^[^a-zA-Z]+)')\n TAG_REPLACEMENT = re.compile(br'[,\\+\\*\\-/()\\[\\]{}\\s]')\n MULTIPLE_UNDERSCORE_CLEANUP = re.compile(br'__+')\n DOT_UNDERSCORE_CLEANUP = re.compile(br'_*\\._*')\n\n # allows to set a limit on the number of metric name and tags combination\n # this check can send per run. This is useful for checks that have an unbounded\n # number of tag values that depend on the input payload.\n # The logic counts one set of tags per gauge/rate/monotonic_count call, and de-duplicates\n # sets of tags for other metric types. The first N sets of tags in submission order will\n # be sent to the aggregator, the rest are dropped. The state is reset after each run.\n # See https://github.com/DataDog/integrations-core/pull/2093 for more information.\n DEFAULT_METRIC_LIMIT = 0\n\n # Allow tracing for classic integrations\n def __init_subclass__(cls, *args, **kwargs):\n try:\n # https://github.com/python/mypy/issues/4660\n super().__init_subclass__(*args, **kwargs) # type: ignore\n return traced_class(cls)\n except Exception:\n return cls\n\n def __init__(self, *args, **kwargs):\n # type: (*Any, **Any) -> None\n \"\"\"\n Parameters:\n name (str):\n the name of the check\n init_config (dict):\n the `init_config` section of the configuration.\n instance (list[dict]):\n a one-element list containing the instance options from the\n configuration file (a list is used to keep backward compatibility with\n older versions of the Agent).\n \"\"\"\n # NOTE: these variable assignments exist to ease type checking when eventually assigned as attributes.\n name = kwargs.get('name', '')\n init_config = kwargs.get('init_config', {})\n agentConfig = kwargs.get('agentConfig', {})\n 
instances = kwargs.get('instances', [])\n\n if len(args) > 0:\n name = args[0]\n if len(args) > 1:\n init_config = args[1]\n if len(args) > 2:\n # agent pass instances as tuple but in test we are usually using list, so we are testing for both\n if len(args) > 3 or not isinstance(args[2], (list, tuple)) or 'instances' in kwargs:\n # old-style init: the 3rd argument is `agentConfig`\n agentConfig = args[2]\n if len(args) > 3:\n instances = args[3]\n else:\n # new-style init: the 3rd argument is `instances`\n instances = args[2]\n\n # NOTE: Agent 6+ should pass exactly one instance... But we are not abiding by that rule on our side\n # everywhere just yet. It's complicated... See: https://github.com/DataDog/integrations-core/pull/5573\n instance = instances[0] if instances else None\n\n self.check_id = ''\n self.name = name # type: str\n self.init_config = init_config # type: InitConfigType\n self.agentConfig = agentConfig # type: AgentConfigType\n self.instance = instance # type: InstanceType\n self.instances = instances # type: List[InstanceType]\n self.warnings = [] # type: List[str]\n self.disable_generic_tags = (\n is_affirmative(self.instance.get('disable_generic_tags', False)) if instance else False\n )\n self.debug_metrics = {}\n if self.init_config is not None:\n self.debug_metrics.update(self.init_config.get('debug_metrics', {}))\n if self.instance is not None:\n self.debug_metrics.update(self.instance.get('debug_metrics', {}))\n\n # `self.hostname` is deprecated, use `datadog_agent.get_hostname()` instead\n self.hostname = datadog_agent.get_hostname() # type: str\n\n logger = logging.getLogger('{}.{}'.format(__name__, self.name))\n self.log = CheckLoggingAdapter(logger, self)\n\n metric_patterns = self.instance.get('metric_patterns', {}) if instance else {}\n if not isinstance(metric_patterns, dict):\n raise ConfigurationError('Setting `metric_patterns` must be a mapping')\n\n self.exclude_metrics_pattern = self._create_metrics_pattern(metric_patterns, 
'exclude')\n self.include_metrics_pattern = self._create_metrics_pattern(metric_patterns, 'include')\n\n # TODO: Remove with Agent 5\n # Set proxy settings\n self.proxies = self._get_requests_proxy()\n if not self.init_config:\n self._use_agent_proxy = True\n else:\n self._use_agent_proxy = is_affirmative(self.init_config.get('use_agent_proxy', True))\n\n # TODO: Remove with Agent 5\n self.default_integration_http_timeout = float(self.agentConfig.get('default_integration_http_timeout', 9))\n\n self._deprecations = {\n 'increment': (\n False,\n (\n 'DEPRECATION NOTICE: `AgentCheck.increment`/`AgentCheck.decrement` are deprecated, please '\n 'use `AgentCheck.gauge` or `AgentCheck.count` instead, with a different metric name'\n ),\n ),\n 'device_name': (\n False,\n (\n 'DEPRECATION NOTICE: `device_name` is deprecated, please use a `device:` '\n 'tag in the `tags` list instead'\n ),\n ),\n 'in_developer_mode': (\n False,\n 'DEPRECATION NOTICE: `in_developer_mode` is deprecated, please stop using it.',\n ),\n 'no_proxy': (\n False,\n (\n 'DEPRECATION NOTICE: The `no_proxy` config option has been renamed '\n 'to `skip_proxy` and will be removed in a future release.'\n ),\n ),\n 'service_tag': (\n False,\n (\n 'DEPRECATION NOTICE: The `service` tag is deprecated and has been renamed to `%s`. '\n 'Set `disable_legacy_service_tag` to `true` to disable this warning. 
'\n 'The default will become `true` and cannot be changed in Agent version 8.'\n ),\n ),\n '_config_renamed': (\n False,\n (\n 'DEPRECATION NOTICE: The `%s` config option has been renamed '\n 'to `%s` and will be removed in a future release.'\n ),\n ),\n } # type: Dict[str, Tuple[bool, str]]\n\n # Setup metric limits\n self.metric_limiter = self._get_metric_limiter(self.name, instance=self.instance)\n\n # Lazily load and validate config\n self._config_model_instance = None # type: Any\n self._config_model_shared = None # type: Any\n\n # Functions that will be called exactly once (if successful) before the first check run\n self.check_initializations = deque() # type: Deque[Callable[[], None]]\n\n if not PY2:\n self.check_initializations.append(self.load_configuration_models)\n\n self.__formatted_tags = None\n self.__logs_enabled = None\n\n def _create_metrics_pattern(self, metric_patterns, option_name):\n all_patterns = metric_patterns.get(option_name, [])\n\n if not isinstance(all_patterns, list):\n raise ConfigurationError('Setting `{}` of `metric_patterns` must be an array'.format(option_name))\n\n metrics_patterns = []\n for i, entry in enumerate(all_patterns, 1):\n if not isinstance(entry, str):\n raise ConfigurationError(\n 'Entry #{} of setting `{}` of `metric_patterns` must be a string'.format(i, option_name)\n )\n if not entry:\n self.log.debug(\n 'Entry #%s of setting `%s` of `metric_patterns` must not be empty, ignoring', i, option_name\n )\n continue\n\n metrics_patterns.append(entry)\n\n if metrics_patterns:\n return re.compile('|'.join(metrics_patterns))\n\n return None\n\n def _get_metric_limiter(self, name, instance=None):\n # type: (str, InstanceType) -> Optional[Limiter]\n limit = self._get_metric_limit(instance=instance)\n\n if limit > 0:\n return Limiter(name, 'metrics', limit, self.warning)\n\n return None\n\n def _get_metric_limit(self, instance=None):\n # type: (InstanceType) -> int\n if instance is None:\n # NOTE: Agent 6+ will now always 
pass an instance when calling into a check, but we still need to\n # account for this case due to some tests not always passing an instance on init.\n self.log.debug(\n \"No instance provided (this is deprecated!). Reverting to the default metric limit: %s\",\n self.DEFAULT_METRIC_LIMIT,\n )\n return self.DEFAULT_METRIC_LIMIT\n\n max_returned_metrics = instance.get('max_returned_metrics', self.DEFAULT_METRIC_LIMIT)\n\n try:\n limit = int(max_returned_metrics)\n except (ValueError, TypeError):\n self.warning(\n \"Configured 'max_returned_metrics' cannot be interpreted as an integer: %s. \"\n \"Reverting to the default limit: %s\",\n max_returned_metrics,\n self.DEFAULT_METRIC_LIMIT,\n )\n return self.DEFAULT_METRIC_LIMIT\n\n # Do not allow to disable limiting if the class has set a non-zero default value.\n if limit == 0 and self.DEFAULT_METRIC_LIMIT > 0:\n self.warning(\n \"Setting 'max_returned_metrics' to zero is not allowed. Reverting to the default metric limit: %s\",\n self.DEFAULT_METRIC_LIMIT,\n )\n return self.DEFAULT_METRIC_LIMIT\n\n return limit\n\n @staticmethod\n def load_config(yaml_str):\n # type: (str) -> Any\n \"\"\"\n Convenience wrapper to ease programmatic use of this class from the C API.\n \"\"\"\n return yaml.safe_load(yaml_str)\n\n @property\n def http(self):\n # type: () -> RequestsWrapper\n \"\"\"\n Provides logic to yield consistent network behavior based on user configuration.\n\n Only new checks or checks on Agent 6.13+ can and should use this for HTTP requests.\n \"\"\"\n if not hasattr(self, '_http'):\n self._http = RequestsWrapper(self.instance or {}, self.init_config, self.HTTP_CONFIG_REMAPPER, self.log)\n\n return self._http\n\n @property\n def logs_enabled(self):\n # type: () -> bool\n \"\"\"\n Returns True if logs are enabled, False otherwise.\n \"\"\"\n if self.__logs_enabled is None:\n self.__logs_enabled = bool(datadog_agent.get_config('logs_enabled'))\n\n return self.__logs_enabled\n\n @property\n def formatted_tags(self):\n # 
type: () -> str\n if self.__formatted_tags is None:\n normalized_tags = set()\n for tag in self.instance.get('tags', []):\n key, _, value = tag.partition(':')\n if not value:\n continue\n\n if self.disable_generic_tags and key in GENERIC_TAGS:\n key = '{}_{}'.format(self.name, key)\n\n normalized_tags.add('{}:{}'.format(key, value))\n\n self.__formatted_tags = ','.join(sorted(normalized_tags))\n\n return self.__formatted_tags\n\n @property\n def diagnosis(self):\n # type: () -> Diagnosis\n \"\"\"\n A Diagnosis object to register explicit diagnostics and record diagnoses.\n \"\"\"\n if not hasattr(self, '_diagnosis'):\n self._diagnosis = Diagnosis(sanitize=self.sanitize)\n return self._diagnosis\n\n def get_tls_context(self, refresh=False, overrides=None):\n # type: (bool, Dict[AnyStr, Any]) -> ssl.SSLContext\n \"\"\"\n Creates and cache an SSLContext instance based on user configuration.\n Note that user configuration can be overridden by using `overrides`.\n This should only be applied to older integration that manually set config values.\n\n Since: Agent 7.24\n \"\"\"\n if not hasattr(self, '_tls_context_wrapper'):\n self._tls_context_wrapper = TlsContextWrapper(\n self.instance or {}, self.TLS_CONFIG_REMAPPER, overrides=overrides\n )\n\n if refresh:\n self._tls_context_wrapper.refresh_tls_context()\n\n return self._tls_context_wrapper.tls_context\n\n @property\n def metadata_manager(self):\n # type: () -> MetadataManager\n \"\"\"\n Used for sending metadata via Go bindings.\n \"\"\"\n if not hasattr(self, '_metadata_manager'):\n if not self.check_id and AGENT_RUNNING:\n raise RuntimeError('Attribute `check_id` must be set')\n\n self._metadata_manager = MetadataManager(self.name, self.check_id, self.log, self.METADATA_TRANSFORMERS)\n\n return self._metadata_manager\n\n @property\n def check_version(self):\n # type: () -> str\n \"\"\"\n Return the dynamically detected integration version.\n \"\"\"\n if not hasattr(self, '_check_version'):\n # 
'datadog_checks.<PACKAGE>.<MODULE>...'\n module_parts = self.__module__.split('.')\n package_path = '.'.join(module_parts[:2])\n package = importlib.import_module(package_path)\n\n # Provide a default just in case\n self._check_version = getattr(package, '__version__', '0.0.0')\n\n return self._check_version\n\n @property\n def in_developer_mode(self):\n # type: () -> bool\n self._log_deprecation('in_developer_mode')\n return False\n\n def log_typos_in_options(self, user_config, models_config, level):\n # only import it when running in python 3\n from jellyfish import jaro_winkler_similarity\n\n user_configs = user_config or {} # type: Dict[str, Any]\n models_config = models_config or {}\n typos = set() # type: Set[str]\n\n known_options = {k for k, _ in models_config} # type: Set[str]\n\n if not PY2:\n\n if isinstance(models_config, BaseModel):\n # Also add aliases, if any\n known_options.update(set(models_config.model_dump(by_alias=True)))\n\n unknown_options = [option for option in user_configs.keys() if option not in known_options] # type: List[str]\n\n for unknown_option in unknown_options:\n similar_known_options = [] # type: List[Tuple[str, int]]\n for known_option in known_options:\n ratio = jaro_winkler_similarity(unknown_option, known_option)\n if ratio > TYPO_SIMILARITY_THRESHOLD:\n similar_known_options.append((known_option, ratio))\n typos.add(unknown_option)\n\n if len(similar_known_options) > 0:\n similar_known_options.sort(key=lambda option: option[1], reverse=True)\n similar_known_options_names = [option[0] for option in similar_known_options] # type: List[str]\n message = (\n 'Detected potential typo in configuration option in {}/{} section: `{}`. 
Did you mean {}?'\n ).format(self.name, level, unknown_option, ', or '.join(similar_known_options_names))\n self.log.warning(message)\n return typos\n\n def load_configuration_models(self, package_path=None):\n if package_path is None:\n # 'datadog_checks.<PACKAGE>.<MODULE>...'\n module_parts = self.__module__.split('.')\n package_path = '{}.config_models'.format('.'.join(module_parts[:2]))\n if self._config_model_shared is None:\n shared_config = copy.deepcopy(self.init_config)\n context = self._get_config_model_context(shared_config)\n shared_model = self.load_configuration_model(package_path, 'SharedConfig', shared_config, context)\n try:\n self.log_typos_in_options(shared_config, shared_model, 'init_config')\n except Exception as e:\n self.log.debug(\"Failed to detect typos in `init_config` section: %s\", e)\n if shared_model is not None:\n self._config_model_shared = shared_model\n\n if self._config_model_instance is None:\n instance_config = copy.deepcopy(self.instance)\n context = self._get_config_model_context(instance_config)\n instance_model = self.load_configuration_model(package_path, 'InstanceConfig', instance_config, context)\n try:\n self.log_typos_in_options(instance_config, instance_model, 'instances')\n except Exception as e:\n self.log.debug(\"Failed to detect typos in `instances` section: %s\", e)\n if instance_model is not None:\n self._config_model_instance = instance_model\n\n @staticmethod\n def load_configuration_model(import_path, model_name, config, context):\n try:\n package = importlib.import_module(import_path)\n # TODO: remove the type ignore when we drop Python 2\n except ModuleNotFoundError as e: # type: ignore\n # Don't fail if there are no models\n if str(e).startswith('No module named '):\n return\n\n raise\n\n model = getattr(package, model_name, None)\n if model is not None:\n try:\n config_model = model.model_validate(config, context=context)\n # TODO: remove the type ignore when we drop Python 2\n except ValidationError as e: 
# type: ignore\n errors = e.errors()\n num_errors = len(errors)\n message_lines = [\n 'Detected {} error{} while loading configuration model `{}`:'.format(\n num_errors, 's' if num_errors > 1 else '', model_name\n )\n ]\n\n for error in errors:\n message_lines.append(\n ' -> '.join(\n # Start array indexes at one for user-friendliness\n str(loc + 1) if isinstance(loc, int) else str(loc)\n for loc in error['loc']\n )\n )\n message_lines.append(' {}'.format(error['msg']))\n\n raise_from(ConfigurationError('\\n'.join(message_lines)), None)\n else:\n return config_model\n\n def _get_config_model_context(self, config):\n return {'logger': self.log, 'warning': self.warning, 'configured_fields': frozenset(config)}\n\n def register_secret(self, secret):\n # type: (str) -> None\n \"\"\"\n Register a secret to be scrubbed by `.sanitize()`.\n \"\"\"\n if not hasattr(self, '_sanitizer'):\n # Configure lazily so that checks that don't use sanitization aren't affected.\n self._sanitizer = SecretsSanitizer()\n self.log.setup_sanitization(sanitize=self.sanitize)\n\n self._sanitizer.register(secret)\n\n def sanitize(self, text):\n # type: (str) -> str\n \"\"\"\n Scrub any registered secrets in `text`.\n \"\"\"\n try:\n sanitizer = self._sanitizer\n except AttributeError:\n return text\n else:\n return sanitizer.sanitize(text)\n\n def _context_uid(self, mtype, name, tags=None, hostname=None):\n # type: (int, str, Sequence[str], str) -> str\n return '{}-{}-{}-{}'.format(mtype, name, tags if tags is None else hash(frozenset(tags)), hostname)\n\n def submit_histogram_bucket(\n self, name, value, lower_bound, upper_bound, monotonic, hostname, tags, raw=False, flush_first_value=False\n ):\n # type: (str, float, int, int, bool, str, Sequence[str], bool, bool) -> None\n if value is None:\n # ignore metric sample\n return\n\n # make sure the value (bucket count) is an integer\n try:\n value = int(value)\n except ValueError:\n err_msg = 'Histogram: {} has non integer value: {}. 
Only integer are valid bucket values (count).'.format(\n repr(name), repr(value)\n )\n if not AGENT_RUNNING:\n raise ValueError(err_msg)\n self.warning(err_msg)\n return\n\n tags = self._normalize_tags_type(tags, metric_name=name)\n if hostname is None:\n hostname = ''\n\n aggregator.submit_histogram_bucket(\n self,\n self.check_id,\n self._format_namespace(name, raw),\n value,\n lower_bound,\n upper_bound,\n monotonic,\n hostname,\n tags,\n flush_first_value,\n )\n\n def database_monitoring_query_sample(self, raw_event):\n # type: (str) -> None\n if raw_event is None:\n return\n\n aggregator.submit_event_platform_event(self, self.check_id, to_native_string(raw_event), \"dbm-samples\")\n\n def database_monitoring_query_metrics(self, raw_event):\n # type: (str) -> None\n if raw_event is None:\n return\n\n aggregator.submit_event_platform_event(self, self.check_id, to_native_string(raw_event), \"dbm-metrics\")\n\n def database_monitoring_query_activity(self, raw_event):\n # type: (str) -> None\n if raw_event is None:\n return\n\n aggregator.submit_event_platform_event(self, self.check_id, to_native_string(raw_event), \"dbm-activity\")\n\n def database_monitoring_metadata(self, raw_event):\n # type: (str) -> None\n if raw_event is None:\n return\n\n aggregator.submit_event_platform_event(self, self.check_id, to_native_string(raw_event), \"dbm-metadata\")\n\n def event_platform_event(self, raw_event, event_track_type):\n # type: (str, str) -> None\n \"\"\"Send an event platform event.\n\n Parameters:\n raw_event (str):\n JSON formatted string representing the event to send\n event_track_type (str):\n type of event ingested and processed by the event platform\n \"\"\"\n if raw_event is None:\n return\n aggregator.submit_event_platform_event(self, self.check_id, to_native_string(raw_event), event_track_type)\n\n def should_send_metric(self, metric_name):\n return not self._metric_excluded(metric_name) and self._metric_included(metric_name)\n\n def _metric_included(self, 
metric_name):\n if self.include_metrics_pattern is None:\n return True\n\n return self.include_metrics_pattern.search(metric_name) is not None\n\n def _metric_excluded(self, metric_name):\n if self.exclude_metrics_pattern is None:\n return False\n\n return self.exclude_metrics_pattern.search(metric_name) is not None\n\n def _submit_metric(\n self, mtype, name, value, tags=None, hostname=None, device_name=None, raw=False, flush_first_value=False\n ):\n # type: (int, str, float, Sequence[str], str, str, bool, bool) -> None\n if value is None:\n # ignore metric sample\n return\n\n name = self._format_namespace(name, raw)\n if not self.should_send_metric(name):\n return\n\n tags = self._normalize_tags_type(tags or [], device_name, name)\n if hostname is None:\n hostname = ''\n\n if self.metric_limiter:\n if mtype in ONE_PER_CONTEXT_METRIC_TYPES:\n # Fast path for gauges, rates, monotonic counters, assume one set of tags per call\n if self.metric_limiter.is_reached():\n return\n else:\n # Other metric types have a legit use case for several calls per set of tags, track unique sets of tags\n context = self._context_uid(mtype, name, tags, hostname)\n if self.metric_limiter.is_reached(context):\n return\n\n try:\n value = float(value)\n except ValueError:\n err_msg = 'Metric: {} has non float value: {}. 
Only float values can be submitted as metrics.'.format(\n repr(name), repr(value)\n )\n if not AGENT_RUNNING:\n raise ValueError(err_msg)\n self.warning(err_msg)\n return\n\n aggregator.submit_metric(self, self.check_id, mtype, name, value, tags, hostname, flush_first_value)\n\n def gauge(self, name, value, tags=None, hostname=None, device_name=None, raw=False):\n # type: (str, float, Sequence[str], str, str, bool) -> None\n \"\"\"Sample a gauge metric.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n self._submit_metric(\n aggregator.GAUGE, name, value, tags=tags, hostname=hostname, device_name=device_name, raw=raw\n )\n\n def count(self, name, value, tags=None, hostname=None, device_name=None, raw=False):\n # type: (str, float, Sequence[str], str, str, bool) -> None\n \"\"\"Sample a raw count metric.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. 
Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n self._submit_metric(\n aggregator.COUNT, name, value, tags=tags, hostname=hostname, device_name=device_name, raw=raw\n )\n\n def monotonic_count(\n self, name, value, tags=None, hostname=None, device_name=None, raw=False, flush_first_value=False\n ):\n # type: (str, float, Sequence[str], str, str, bool, bool) -> None\n \"\"\"Sample an increasing counter metric.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n flush_first_value (bool):\n whether to sample the first value\n \"\"\"\n self._submit_metric(\n aggregator.MONOTONIC_COUNT,\n name,\n value,\n tags=tags,\n hostname=hostname,\n device_name=device_name,\n raw=raw,\n flush_first_value=flush_first_value,\n )\n\n def rate(self, name, value, tags=None, hostname=None, device_name=None, raw=False):\n # type: (str, float, Sequence[str], str, str, bool) -> None\n \"\"\"Sample a point, with the rate calculated at the end of the check.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. 
Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n self._submit_metric(\n aggregator.RATE, name, value, tags=tags, hostname=hostname, device_name=device_name, raw=raw\n )\n\n def histogram(self, name, value, tags=None, hostname=None, device_name=None, raw=False):\n # type: (str, float, Sequence[str], str, str, bool) -> None\n \"\"\"Sample a histogram metric.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n self._submit_metric(\n aggregator.HISTOGRAM, name, value, tags=tags, hostname=hostname, device_name=device_name, raw=raw\n )\n\n def historate(self, name, value, tags=None, hostname=None, device_name=None, raw=False):\n # type: (str, float, Sequence[str], str, str, bool) -> None\n \"\"\"Sample a histogram based on rate metrics.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. 
Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n self._submit_metric(\n aggregator.HISTORATE, name, value, tags=tags, hostname=hostname, device_name=device_name, raw=raw\n )\n\n def increment(self, name, value=1, tags=None, hostname=None, device_name=None, raw=False):\n # type: (str, float, Sequence[str], str, str, bool) -> None\n \"\"\"Increment a counter metric.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n self._log_deprecation('increment')\n self._submit_metric(\n aggregator.COUNTER, name, value, tags=tags, hostname=hostname, device_name=device_name, raw=raw\n )\n\n def decrement(self, name, value=-1, tags=None, hostname=None, device_name=None, raw=False):\n # type: (str, float, Sequence[str], str, str, bool) -> None\n \"\"\"Decrement a counter metric.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. 
Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n self._log_deprecation('increment')\n self._submit_metric(\n aggregator.COUNTER, name, value, tags=tags, hostname=hostname, device_name=device_name, raw=raw\n )\n\n def service_check(self, name, status, tags=None, hostname=None, message=None, raw=False):\n # type: (str, ServiceCheckStatus, Sequence[str], str, str, bool) -> None\n \"\"\"Send the status of a service.\n\n Parameters:\n name (str):\n the name of the service check\n status (int):\n a constant describing the service status\n tags (list[str]):\n a list of tags to associate with this service check\n message (str):\n additional information or a description of why this status occurred.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n tags = self._normalize_tags_type(tags or [])\n if hostname is None:\n hostname = ''\n if message is None:\n message = ''\n else:\n message = to_native_string(message)\n\n message = self.sanitize(message)\n\n aggregator.submit_service_check(\n self, self.check_id, self._format_namespace(name, raw), status, tags, hostname, message\n )\n\n def send_log(self, data, cursor=None, stream='default'):\n # type: (dict[str, str], dict[str, Any] | None, str) -> None\n \"\"\"Send a log for submission.\n\n Parameters:\n data (dict[str, str]):\n The log data to send. The following keys are treated specially, if present:\n\n - timestamp: should be an integer or float representing the number of seconds since the Unix epoch\n - ddtags: if not defined, it will automatically be set based on the instance's `tags` option\n cursor (dict[str, Any] or None):\n Metadata associated with the log which will be saved to disk. 
The most recent value may be\n retrieved with the `get_log_cursor` method.\n stream (str):\n The stream associated with this log, used for accurate cursor persistence.\n Has no effect if `cursor` argument is `None`.\n \"\"\"\n attributes = data.copy()\n if 'ddtags' not in attributes and self.formatted_tags:\n attributes['ddtags'] = self.formatted_tags\n\n timestamp = attributes.get('timestamp')\n if timestamp is not None:\n # convert seconds to milliseconds\n attributes['timestamp'] = int(timestamp * 1000)\n\n datadog_agent.send_log(to_json(attributes), self.check_id)\n if cursor is not None:\n self.write_persistent_cache('log_cursor_{}'.format(stream), to_json(cursor))\n\n def get_log_cursor(self, stream='default'):\n # type: (str) -> dict[str, Any] | None\n \"\"\"Returns the most recent log cursor from disk.\"\"\"\n data = self.read_persistent_cache('log_cursor_{}'.format(stream))\n return from_json(data) if data else None\n\n def _log_deprecation(self, deprecation_key, *args):\n # type: (str, *str) -> None\n \"\"\"\n Logs a deprecation notice at most once per AgentCheck instance, for the pre-defined `deprecation_key`\n \"\"\"\n sent, message = self._deprecations[deprecation_key]\n if sent:\n return\n\n self.warning(message, *args)\n self._deprecations[deprecation_key] = (True, message)\n\n # TODO: Remove once our checks stop calling it\n def service_metadata(self, meta_name, value):\n # type: (str, Any) -> None\n pass\n\n def set_metadata(self, name, value, **options):\n # type: (str, Any, **Any) -> None\n \"\"\"Updates the cached metadata `name` with `value`, which is then sent by the Agent at regular intervals.\n\n Parameters:\n name (str):\n the name of the metadata\n value (Any):\n the value for the metadata. 
if ``name`` has no transformer defined then the\n raw ``value`` will be submitted and therefore it must be a ``str``\n options (Any):\n keyword arguments to pass to any defined transformer\n \"\"\"\n self.metadata_manager.submit(name, value, options)\n\n @staticmethod\n def is_metadata_collection_enabled():\n # type: () -> bool\n return is_affirmative(datadog_agent.get_config('enable_metadata_collection'))\n\n @classmethod\n def metadata_entrypoint(cls, method):\n # type: (Callable[..., None]) -> Callable[..., None]\n \"\"\"\n Skip execution of the decorated method if metadata collection is disabled on the Agent.\n\n Usage:\n\n ```python\n class MyCheck(AgentCheck):\n @AgentCheck.metadata_entrypoint\n def collect_metadata(self):\n ...\n ```\n \"\"\"\n\n @functools.wraps(method)\n def entrypoint(self, *args, **kwargs):\n # type: (AgentCheck, *Any, **Any) -> None\n if not self.is_metadata_collection_enabled():\n return\n\n # NOTE: error handling still at the discretion of the wrapped method.\n method(self, *args, **kwargs)\n\n return entrypoint\n\n def _persistent_cache_id(self, key):\n # type: (str) -> str\n return '{}_{}'.format(self.check_id, key)\n\n def read_persistent_cache(self, key):\n # type: (str) -> str\n \"\"\"Returns the value previously stored with `write_persistent_cache` for the same `key`.\n\n Parameters:\n key (str):\n the key to retrieve\n \"\"\"\n return datadog_agent.read_persistent_cache(self._persistent_cache_id(key))\n\n def write_persistent_cache(self, key, value):\n # type: (str, str) -> None\n \"\"\"Stores `value` in a persistent cache for this check instance.\n The cache is located in a path where the agent is guaranteed to have read & write permissions. 
Namely in\n - `%ProgramData%\\\\Datadog\\\\run` on Windows.\n - `/opt/datadog-agent/run` everywhere else.\n The cache is persistent between agent restarts but will be rebuilt if the check instance configuration changes.\n\n Parameters:\n key (str):\n the key to retrieve\n value (str):\n the value to store\n \"\"\"\n datadog_agent.write_persistent_cache(self._persistent_cache_id(key), value)\n\n def set_external_tags(self, external_tags):\n # type: (Sequence[ExternalTagType]) -> None\n # Example of external_tags format\n # [\n # ('hostname', {'src_name': ['test:t1']}),\n # ('hostname2', {'src2_name': ['test2:t3']})\n # ]\n try:\n new_tags = []\n for hostname, source_map in external_tags:\n new_tags.append((to_native_string(hostname), source_map))\n for src_name, tags in iteritems(source_map):\n source_map[src_name] = self._normalize_tags_type(tags)\n datadog_agent.set_external_tags(new_tags)\n except IndexError:\n self.log.exception('Unexpected external tags format: %s', external_tags)\n raise\n\n def convert_to_underscore_separated(self, name):\n # type: (Union[str, bytes]) -> bytes\n \"\"\"\n Convert from CamelCase to camel_case\n And substitute illegal metric characters\n \"\"\"\n name = ensure_bytes(name)\n metric_name = self.FIRST_CAP_RE.sub(br'\\1_\\2', name)\n metric_name = self.ALL_CAP_RE.sub(br'\\1_\\2', metric_name).lower()\n metric_name = self.METRIC_REPLACEMENT.sub(br'_', metric_name)\n return self.DOT_UNDERSCORE_CLEANUP.sub(br'.', metric_name).strip(b'_')\n\n def warning(self, warning_message, *args, **kwargs):\n # type: (str, *Any, **Any) -> None\n \"\"\"Log a warning message, display it in the Agent's status page and in-app.\n\n Using *args is intended to make warning work like log.warn/debug/info/etc\n and make it compliant with flake8 logging format linter.\n\n Parameters:\n warning_message (str):\n the warning message\n args (Any):\n format string args used to format the warning message e.g. 
`warning_message % args`\n kwargs (Any):\n not used for now, but added to match Python logger's `warning` method signature\n \"\"\"\n warning_message = to_native_string(warning_message)\n # Interpolate message only if args is not empty. Same behavior as python logger:\n # https://github.com/python/cpython/blob/1dbe5373851acb85ba91f0be7b83c69563acd68d/Lib/logging/__init__.py#L368-L369\n if args:\n warning_message = warning_message % args\n frame = inspect.currentframe().f_back # type: ignore\n lineno = frame.f_lineno\n # only log the last part of the filename, not the full path\n filename = basename(frame.f_code.co_filename)\n\n self.log.warning(warning_message, extra={'_lineno': lineno, '_filename': filename, '_check_id': self.check_id})\n self.warnings.append(warning_message)\n\n def get_warnings(self):\n # type: () -> List[str]\n \"\"\"\n Return the list of warnings messages to be displayed in the info page\n \"\"\"\n warnings = self.warnings\n self.warnings = []\n return warnings\n\n def get_diagnoses(self):\n # type: () -> str\n \"\"\"\n Return the list of diagnosis as a JSON encoded string.\n\n The agent calls this method to retrieve diagnostics from integrations. 
This method\n runs explicit diagnostics if available.\n \"\"\"\n return to_json([d._asdict() for d in (self.diagnosis.diagnoses + self.diagnosis.run_explicit())])\n\n def _get_requests_proxy(self):\n # type: () -> ProxySettings\n # TODO: Remove with Agent 5\n no_proxy_settings = {'http': None, 'https': None, 'no': []} # type: ProxySettings\n\n # First we read the proxy configuration from datadog.conf\n proxies = self.agentConfig.get('proxy', datadog_agent.get_config('proxy'))\n if proxies:\n proxies = proxies.copy()\n\n # requests compliant dict\n if proxies and 'no_proxy' in proxies:\n proxies['no'] = proxies.pop('no_proxy')\n\n return proxies if proxies else no_proxy_settings\n\n def _format_namespace(self, s, raw=False):\n # type: (str, bool) -> str\n if not raw and self.__NAMESPACE__:\n return '{}.{}'.format(self.__NAMESPACE__, to_native_string(s))\n\n return to_native_string(s)\n\n def normalize(self, metric, prefix=None, fix_case=False):\n # type: (Union[str, bytes], Union[str, bytes], bool) -> str\n \"\"\"\n Turn a metric into a well-formed metric name prefix.b.c\n\n Parameters:\n metric: The metric name to normalize\n prefix: A prefix to to add to the normalized name, default None\n fix_case: A boolean, indicating whether to make sure that the metric name returned is in \"snake_case\"\n \"\"\"\n if isinstance(metric, text_type):\n metric = unicodedata.normalize('NFKD', metric).encode('ascii', 'ignore')\n\n if fix_case:\n name = self.convert_to_underscore_separated(metric)\n if prefix is not None:\n prefix = self.convert_to_underscore_separated(prefix)\n else:\n name = self.METRIC_REPLACEMENT.sub(br'_', metric)\n name = self.DOT_UNDERSCORE_CLEANUP.sub(br'.', name).strip(b'_')\n\n name = self.MULTIPLE_UNDERSCORE_CLEANUP.sub(br'_', name)\n\n if prefix is not None:\n name = ensure_bytes(prefix) + b\".\" + name\n\n return to_native_string(name)\n\n def normalize_tag(self, tag):\n # type: (Union[str, bytes]) -> str\n \"\"\"Normalize tag values.\n\n This happens 
for legacy reasons, when we cleaned up some characters (like '-')\n which are allowed in tags.\n \"\"\"\n if isinstance(tag, text_type):\n tag = tag.encode('utf-8', 'ignore')\n tag = self.TAG_REPLACEMENT.sub(br'_', tag)\n tag = self.MULTIPLE_UNDERSCORE_CLEANUP.sub(br'_', tag)\n tag = self.DOT_UNDERSCORE_CLEANUP.sub(br'.', tag).strip(b'_')\n return to_native_string(tag)\n\n def check(self, instance):\n # type: (InstanceType) -> None\n raise NotImplementedError\n\n def cancel(self):\n # type: () -> None\n \"\"\"\n This method is called when the check in unscheduled by the agent. This\n is SIGNAL that the check is being unscheduled and can be called while\n the check is running. It's up to the python implementation to make sure\n cancel is thread safe and won't block.\n \"\"\"\n pass\n\n def run(self):\n # type: () -> str\n try:\n self.diagnosis.clear()\n # Ignore check initializations if running in a separate process\n if is_affirmative(self.instance.get('process_isolation', self.init_config.get('process_isolation', False))):\n from ..utils.replay.execute import run_with_isolation\n\n run_with_isolation(self, aggregator, datadog_agent)\n else:\n while self.check_initializations:\n initialization = self.check_initializations.popleft()\n try:\n initialization()\n except Exception:\n self.check_initializations.appendleft(initialization)\n raise\n\n instance = copy.deepcopy(self.instances[0])\n\n if 'set_breakpoint' in self.init_config:\n from ..utils.agent.debug import enter_pdb\n\n enter_pdb(self.check, line=self.init_config['set_breakpoint'], args=(instance,))\n elif self.should_profile_memory():\n self.profile_memory(self.check, self.init_config, args=(instance,))\n else:\n self.check(instance)\n\n error_report = ''\n except Exception as e:\n message = self.sanitize(str(e))\n tb = self.sanitize(traceback.format_exc())\n error_report = to_json([{'message': message, 'traceback': tb}])\n finally:\n if self.metric_limiter:\n if 
is_affirmative(self.debug_metrics.get('metric_contexts', False)):\n debug_metrics = self.metric_limiter.get_debug_metrics()\n\n # Reset so we can actually submit the metrics\n self.metric_limiter.reset()\n\n tags = self.get_debug_metric_tags()\n for metric_name, value in debug_metrics:\n self.gauge(metric_name, value, tags=tags, raw=True)\n\n self.metric_limiter.reset()\n\n return error_report\n\n def event(self, event):\n # type: (Event) -> None\n \"\"\"Send an event.\n\n An event is a dictionary with the following keys and data types:\n\n ```python\n {\n \"timestamp\": int, # the epoch timestamp for the event\n \"event_type\": str, # the event name\n \"api_key\": str, # the api key for your account\n \"msg_title\": str, # the title of the event\n \"msg_text\": str, # the text body of the event\n \"aggregation_key\": str, # a key to use for aggregating events\n \"alert_type\": str, # (optional) one of ('error', 'warning', 'success', 'info'), defaults to 'info'\n \"source_type_name\": str, # (optional) the source type name\n \"host\": str, # (optional) the name of the host\n \"tags\": list, # (optional) a list of tags to associate with this event\n \"priority\": str, # (optional) specifies the priority of the event (\"normal\" or \"low\")\n }\n ```\n\n Parameters:\n event (dict[str, Any]):\n the event to be sent\n \"\"\"\n # Enforce types of some fields, considerably facilitates handling in go bindings downstream\n for key, value in iteritems(event):\n if not isinstance(value, (text_type, binary_type)):\n continue\n\n try:\n event[key] = to_native_string(value) # type: ignore\n # ^ Mypy complains about dynamic key assignment -- arguably for good reason.\n # Ideally we should convert this to a dict literal so that submitted events only include known keys.\n except UnicodeError:\n self.log.warning('Encoding error with field `%s`, cannot submit event', key)\n return\n\n if event.get('tags'):\n event['tags'] = self._normalize_tags_type(event['tags'])\n if 
event.get('timestamp'):\n event['timestamp'] = int(event['timestamp'])\n if event.get('aggregation_key'):\n event['aggregation_key'] = to_native_string(event['aggregation_key'])\n\n if self.__NAMESPACE__:\n event.setdefault('source_type_name', self.__NAMESPACE__)\n\n aggregator.submit_event(self, self.check_id, event)\n\n def _normalize_tags_type(self, tags, device_name=None, metric_name=None):\n # type: (Sequence[Union[None, str, bytes]], str, str) -> List[str]\n \"\"\"\n Normalize tags contents and type:\n - append `device_name` as `device:` tag\n - normalize tags type\n - doesn't mutate the passed list, returns a new list\n \"\"\"\n normalized_tags = []\n\n if device_name:\n self._log_deprecation('device_name')\n try:\n normalized_tags.append('device:{}'.format(to_native_string(device_name)))\n except UnicodeError:\n self.log.warning(\n 'Encoding error with device name `%r` for metric `%r`, ignoring tag', device_name, metric_name\n )\n\n for tag in tags:\n if tag is None:\n continue\n try:\n tag = to_native_string(tag)\n except UnicodeError:\n self.log.warning('Encoding error with tag `%s` for metric `%s`, ignoring tag', tag, metric_name)\n continue\n if self.disable_generic_tags:\n normalized_tags.append(self.degeneralise_tag(tag))\n else:\n normalized_tags.append(tag)\n return normalized_tags\n\n def degeneralise_tag(self, tag):\n split_tag = tag.split(':', 1)\n if len(split_tag) > 1:\n tag_name, value = split_tag\n else:\n tag_name = tag\n value = None\n\n if tag_name in GENERIC_TAGS:\n new_name = '{}_{}'.format(self.name, tag_name)\n if value:\n return '{}:{}'.format(new_name, value)\n else:\n return new_name\n else:\n return tag\n\n def get_debug_metric_tags(self):\n tags = ['check_name:{}'.format(self.name), 'check_version:{}'.format(self.check_version)]\n tags.extend(self.instance.get('tags', []))\n return tags\n\n def get_memory_profile_tags(self):\n # type: () -> List[str]\n tags = self.get_debug_metric_tags()\n 
tags.extend(self.instance.get('__memory_profiling_tags', []))\n return tags\n\n def should_profile_memory(self):\n # type: () -> bool\n return 'profile_memory' in self.init_config or (\n datadog_agent.tracemalloc_enabled() and should_profile_memory(datadog_agent, self.name)\n )\n\n def profile_memory(self, func, namespaces=None, args=(), kwargs=None, extra_tags=None):\n # type: (Callable[..., Any], Optional[Sequence[str]], Sequence[Any], Optional[Dict[str, Any]], Optional[List[str]]) -> None # noqa: E501\n from ..utils.agent.memory import profile_memory\n\n if namespaces is None:\n namespaces = self.check_id.split(':', 1)\n\n tags = self.get_memory_profile_tags()\n if extra_tags is not None:\n tags.extend(extra_tags)\n\n metrics = profile_memory(func, self.init_config, namespaces=namespaces, args=args, kwargs=kwargs)\n\n for m in metrics:\n self.gauge(m.name, m.value, tags=tags, raw=True)\n
a hostname to associate with this metric. Defaults to the current host. (default None)
device_name (str, default None): deprecated; add a tag in the form device:<device_name> to the tags list instead.
raw (bool, default False): whether to ignore any defined namespace prefix
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def gauge(self, name, value, tags=None, hostname=None, device_name=None, raw=False):\n # type: (str, float, Sequence[str], str, str, bool) -> None\n \"\"\"Sample a gauge metric.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n self._submit_metric(\n aggregator.GAUGE, name, value, tags=tags, hostname=hostname, device_name=device_name, raw=raw\n )\n
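The `device_name` argument is marked deprecated in favor of a `device:<device_name>` tag. The following is a minimal sketch of that migration. The helper name `with_device_tag` and the sample values are hypothetical and are not part of the base package.

```python
def with_device_tag(tags, device_name):
    """Return a new tag list with a `device:<device_name>` tag appended.

    Hypothetical helper for illustration only; it is not part of datadog_checks.
    """
    # Copy so the caller's list is never mutated.
    new_tags = list(tags or [])
    if device_name:
        new_tags.append('device:{}'.format(device_name))
    return new_tags


tags = with_device_tag(['env:dev'], 'sda1')
# Inside a check, `tags` would then be passed as `tags=` to a call such as `self.gauge(...)`.
```

Building the tag up front keeps the submission call free of the deprecated argument.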
a hostname to associate with this metric. Defaults to the current host. (default None)
device_name (str, default None): deprecated; add a tag in the form device:<device_name> to the tags list instead.
raw (bool, default False): whether to ignore any defined namespace prefix
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def count(self, name, value, tags=None, hostname=None, device_name=None, raw=False):\n # type: (str, float, Sequence[str], str, str, bool) -> None\n \"\"\"Sample a raw count metric.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n self._submit_metric(\n aggregator.COUNT, name, value, tags=tags, hostname=hostname, device_name=device_name, raw=raw\n )\n
a hostname to associate with this metric. Defaults to the current host. (default None)
device_name (str, default None): deprecated; add a tag in the form device:<device_name> to the tags list instead.
raw (bool, default False): whether to ignore any defined namespace prefix
flush_first_value (bool, default False): whether to sample the first value
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def monotonic_count(\n self, name, value, tags=None, hostname=None, device_name=None, raw=False, flush_first_value=False\n):\n # type: (str, float, Sequence[str], str, str, bool, bool) -> None\n \"\"\"Sample an increasing counter metric.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n flush_first_value (bool):\n whether to sample the first value\n \"\"\"\n self._submit_metric(\n aggregator.MONOTONIC_COUNT,\n name,\n value,\n tags=tags,\n hostname=hostname,\n device_name=device_name,\n raw=raw,\n flush_first_value=flush_first_value,\n )\n
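To make the role of `flush_first_value` more concrete, here is a simplified model of how an increasing counter could be turned into reported deltas. It is a sketch under assumptions: the class `MonotonicCounterModel` is hypothetical, and the real aggregation happens inside the Agent, not in this Python code.

```python
class MonotonicCounterModel:
    """Simplified model of monotonic-count sampling (an assumption, not Agent code)."""

    def __init__(self):
        self._previous = None

    def sample(self, value, flush_first_value=False):
        """Return the amount to report for this sample, or None if nothing is reported."""
        previous, self._previous = self._previous, value
        if previous is None:
            # First sample: there is no earlier value to diff against.
            return value if flush_first_value else None
        if value < previous:
            # The counter went backwards (for example after a restart): report nothing.
            return None
        return value - previous


model = MonotonicCounterModel()
reported = [model.sample(v) for v in (10, 15, 3, 7)]
```

In this model the first raw value is dropped unless `flush_first_value` is set, and later samples contribute only their increase over the previous one.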
Sample a point, with the rate calculated at the end of the check.
Parameters:
name (str, required): the name of the metric
value (float, required): the value for the metric
tags (list[str], default None): a list of tags to associate with this metric
hostname (str, default None): a hostname to associate with this metric. Defaults to the current host.
device_name (str, default None): deprecated; add a tag in the form device:<device_name> to the tags list instead.
raw (bool, default False): whether to ignore any defined namespace prefix
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def rate(self, name, value, tags=None, hostname=None, device_name=None, raw=False):\n # type: (str, float, Sequence[str], str, str, bool) -> None\n \"\"\"Sample a point, with the rate calculated at the end of the check.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n self._submit_metric(\n aggregator.RATE, name, value, tags=tags, hostname=hostname, device_name=device_name, raw=raw\n )\n
a hostname to associate with this metric. Defaults to the current host. (default None)
device_name (str, default None): deprecated; add a tag in the form device:<device_name> to the tags list instead.
raw (bool, default False): whether to ignore any defined namespace prefix
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def histogram(self, name, value, tags=None, hostname=None, device_name=None, raw=False):\n # type: (str, float, Sequence[str], str, str, bool) -> None\n \"\"\"Sample a histogram metric.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n self._submit_metric(\n aggregator.HISTOGRAM, name, value, tags=tags, hostname=hostname, device_name=device_name, raw=raw\n )\n
a hostname to associate with this metric. Defaults to the current host. (default None)
device_name (str, default None): deprecated; add a tag in the form device:<device_name> to the tags list instead.
raw (bool, default False): whether to ignore any defined namespace prefix
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def historate(self, name, value, tags=None, hostname=None, device_name=None, raw=False):\n # type: (str, float, Sequence[str], str, str, bool) -> None\n \"\"\"Sample a histogram based on rate metrics.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n self._submit_metric(\n aggregator.HISTORATE, name, value, tags=tags, hostname=hostname, device_name=device_name, raw=raw\n )\n
a list of tags to associate with this service check (default None)
message (str, default None): additional information or a description of why this status occurred.
raw (bool, default False): whether to ignore any defined namespace prefix
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def service_check(self, name, status, tags=None, hostname=None, message=None, raw=False):\n # type: (str, ServiceCheckStatus, Sequence[str], str, str, bool) -> None\n \"\"\"Send the status of a service.\n\n Parameters:\n name (str):\n the name of the service check\n status (int):\n a constant describing the service status\n tags (list[str]):\n a list of tags to associate with this service check\n message (str):\n additional information or a description of why this status occurred.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n tags = self._normalize_tags_type(tags or [])\n if hostname is None:\n hostname = ''\n if message is None:\n message = ''\n else:\n message = to_native_string(message)\n\n message = self.sanitize(message)\n\n aggregator.submit_service_check(\n self, self.check_id, self._format_namespace(name, raw), status, tags, hostname, message\n )\n
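The `raw` flag controls whether the check's namespace is prefixed to the submitted name. The standalone sketch below mirrors the `_format_namespace` logic from the class source. The function name `format_namespace` and the sample namespace `postgres` are illustrative assumptions.

```python
def format_namespace(namespace, name, raw=False):
    """Prefix `name` with `namespace` unless `raw` is set or no namespace is defined.

    Standalone sketch for illustration; in the base class the namespace comes
    from the check's `__NAMESPACE__` attribute.
    """
    if not raw and namespace:
        return '{}.{}'.format(namespace, name)
    return name


prefixed = format_namespace('postgres', 'can_connect')
unprefixed = format_namespace('postgres', 'can_connect', raw=True)
```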
An event is a dictionary with the following keys and data types:
{\n \"timestamp\": int, # the epoch timestamp for the event\n \"event_type\": str, # the event name\n \"api_key\": str, # the api key for your account\n \"msg_title\": str, # the title of the event\n \"msg_text\": str, # the text body of the event\n \"aggregation_key\": str, # a key to use for aggregating events\n \"alert_type\": str, # (optional) one of ('error', 'warning', 'success', 'info'), defaults to 'info'\n \"source_type_name\": str, # (optional) the source type name\n \"host\": str, # (optional) the name of the host\n \"tags\": list, # (optional) a list of tags to associate with this event\n \"priority\": str, # (optional) specifies the priority of the event (\"normal\" or \"low\")\n}\n
Parameters:
event (dict[str, Any], required): the event to be sent
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def event(self, event):\n # type: (Event) -> None\n \"\"\"Send an event.\n\n An event is a dictionary with the following keys and data types:\n\n ```python\n {\n \"timestamp\": int, # the epoch timestamp for the event\n \"event_type\": str, # the event name\n \"api_key\": str, # the api key for your account\n \"msg_title\": str, # the title of the event\n \"msg_text\": str, # the text body of the event\n \"aggregation_key\": str, # a key to use for aggregating events\n \"alert_type\": str, # (optional) one of ('error', 'warning', 'success', 'info'), defaults to 'info'\n \"source_type_name\": str, # (optional) the source type name\n \"host\": str, # (optional) the name of the host\n \"tags\": list, # (optional) a list of tags to associate with this event\n \"priority\": str, # (optional) specifies the priority of the event (\"normal\" or \"low\")\n }\n ```\n\n Parameters:\n event (dict[str, Any]):\n the event to be sent\n \"\"\"\n # Enforce types of some fields, considerably facilitates handling in go bindings downstream\n for key, value in iteritems(event):\n if not isinstance(value, (text_type, binary_type)):\n continue\n\n try:\n event[key] = to_native_string(value) # type: ignore\n # ^ Mypy complains about dynamic key assignment -- arguably for good reason.\n # Ideally we should convert this to a dict literal so that submitted events only include known keys.\n except UnicodeError:\n self.log.warning('Encoding error with field `%s`, cannot submit event', key)\n return\n\n if event.get('tags'):\n event['tags'] = self._normalize_tags_type(event['tags'])\n if event.get('timestamp'):\n event['timestamp'] = int(event['timestamp'])\n if event.get('aggregation_key'):\n event['aggregation_key'] = to_native_string(event['aggregation_key'])\n\n if self.__NAMESPACE__:\n event.setdefault('source_type_name', self.__NAMESPACE__)\n\n aggregator.submit_event(self, self.check_id, event)\n
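As an illustration of the dictionary shape described above, the sketch below assembles an event with a hypothetical helper `build_event`. The helper, the placeholder `event_type` value, and the choice of which keys to fill are assumptions; only the key names come from the docstring.

```python
import time


def build_event(title, text, aggregation_key, tags=None, alert_type='info'):
    """Assemble an event dictionary using keys listed in the docstring.

    Hypothetical helper for illustration; it fills only some of the listed keys.
    """
    return {
        'timestamp': int(time.time()),   # epoch seconds
        'event_type': 'example.event',   # placeholder name, not a real event type
        'msg_title': title,
        'msg_text': text,
        'aggregation_key': aggregation_key,
        'alert_type': alert_type,
        'tags': list(tags or []),
    }


event = build_event('Replica lag', 'Lag exceeded threshold', 'replica-lag', tags=['env:dev'])
# Inside a check, the dictionary would be passed to `self.event(event)`.
```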
Updates the cached metadata name with value, which is then sent by the Agent at regular intervals.
Parameters:
name (str, required): the name of the metadata
value (Any, required): the value for the metadata. If name has no transformer defined, the raw value is submitted and must therefore be a str.
options (Any, default {}): keyword arguments to pass to any defined transformer
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def set_metadata(self, name, value, **options):\n # type: (str, Any, **Any) -> None\n \"\"\"Updates the cached metadata `name` with `value`, which is then sent by the Agent at regular intervals.\n\n Parameters:\n name (str):\n the name of the metadata\n value (Any):\n the value for the metadata. if ``name`` has no transformer defined then the\n raw ``value`` will be submitted and therefore it must be a ``str``\n options (Any):\n keyword arguments to pass to any defined transformer\n \"\"\"\n self.metadata_manager.submit(name, value, options)\n
Skip execution of the decorated method if metadata collection is disabled on the Agent.
Usage:
class MyCheck(AgentCheck):\n @AgentCheck.metadata_entrypoint\n def collect_metadata(self):\n ...\n
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
@classmethod\ndef metadata_entrypoint(cls, method):\n # type: (Callable[..., None]) -> Callable[..., None]\n \"\"\"\n Skip execution of the decorated method if metadata collection is disabled on the Agent.\n\n Usage:\n\n ```python\n class MyCheck(AgentCheck):\n @AgentCheck.metadata_entrypoint\n def collect_metadata(self):\n ...\n ```\n \"\"\"\n\n @functools.wraps(method)\n def entrypoint(self, *args, **kwargs):\n # type: (AgentCheck, *Any, **Any) -> None\n if not self.is_metadata_collection_enabled():\n return\n\n # NOTE: error handling still at the discretion of the wrapped method.\n method(self, *args, **kwargs)\n\n return entrypoint\n
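The skip-when-disabled behavior can be tried without an Agent. The sketch below reimplements the same wrapper pattern as a plain function and applies it to a stand-in class. `FakeCheck` and its `enabled` flag are hypothetical and exist only for this example.

```python
import functools


def metadata_entrypoint(method):
    """Skip the wrapped method when metadata collection is disabled (standalone sketch)."""

    @functools.wraps(method)
    def entrypoint(self, *args, **kwargs):
        if not self.is_metadata_collection_enabled():
            return
        method(self, *args, **kwargs)

    return entrypoint


class FakeCheck:
    """Stand-in for a check; not a real AgentCheck."""

    def __init__(self, enabled):
        self.enabled = enabled
        self.collected = []

    def is_metadata_collection_enabled(self):
        return self.enabled

    @metadata_entrypoint
    def collect_metadata(self):
        self.collected.append('version')


disabled = FakeCheck(enabled=False)
disabled.collect_metadata()   # returns early, nothing is collected
enabled = FakeCheck(enabled=True)
enabled.collect_metadata()    # runs the wrapped method
```

Because the wrapper returns early, a decorated method never needs its own check for whether collection is enabled.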
Returns the value previously stored with write_persistent_cache for the same key.
Parameters:
key (str, required): the key to retrieve
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def read_persistent_cache(self, key):\n # type: (str) -> str\n \"\"\"Returns the value previously stored with `write_persistent_cache` for the same `key`.\n\n Parameters:\n key (str):\n the key to retrieve\n \"\"\"\n return datadog_agent.read_persistent_cache(self._persistent_cache_id(key))\n
Stores value in a persistent cache for this check instance. The cache is located in a path where the Agent is guaranteed to have read and write permissions, namely:
- %ProgramData%\\Datadog\\run on Windows
- /opt/datadog-agent/run everywhere else
The cache persists across Agent restarts but is rebuilt if the check instance configuration changes.
Parameters:
Name Type Description Default
key str the key under which the value is stored required
value str the value to store required
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def write_persistent_cache(self, key, value):\n # type: (str, str) -> None\n \"\"\"Stores `value` in a persistent cache for this check instance.\n The cache is located in a path where the agent is guaranteed to have read & write permissions. Namely in\n - `%ProgramData%\\\\Datadog\\\\run` on Windows.\n - `/opt/datadog-agent/run` everywhere else.\n The cache is persistent between agent restarts but will be rebuilt if the check instance configuration changes.\n\n Parameters:\n key (str):\n the key to retrieve\n value (str):\n the value to store\n \"\"\"\n datadog_agent.write_persistent_cache(self._persistent_cache_id(key), value)\n
The log data to send. The following keys are treated specially, if present:
timestamp: should be an integer or float representing the number of seconds since the Unix epoch
ddtags: if not defined, it will automatically be set based on the instance's tags option
required
cursor dict[str, Any] or None Metadata associated with the log which will be saved to disk. The most recent value may be retrieved with the get_log_cursor method. None
stream str The stream associated with this log, used for accurate cursor persistence. Has no effect if the cursor argument is None. 'default'
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def send_log(self, data, cursor=None, stream='default'):\n # type: (dict[str, str], dict[str, Any] | None, str) -> None\n \"\"\"Send a log for submission.\n\n Parameters:\n data (dict[str, str]):\n The log data to send. The following keys are treated specially, if present:\n\n - timestamp: should be an integer or float representing the number of seconds since the Unix epoch\n - ddtags: if not defined, it will automatically be set based on the instance's `tags` option\n cursor (dict[str, Any] or None):\n Metadata associated with the log which will be saved to disk. The most recent value may be\n retrieved with the `get_log_cursor` method.\n stream (str):\n The stream associated with this log, used for accurate cursor persistence.\n Has no effect if `cursor` argument is `None`.\n \"\"\"\n attributes = data.copy()\n if 'ddtags' not in attributes and self.formatted_tags:\n attributes['ddtags'] = self.formatted_tags\n\n timestamp = attributes.get('timestamp')\n if timestamp is not None:\n # convert seconds to milliseconds\n attributes['timestamp'] = int(timestamp * 1000)\n\n datadog_agent.send_log(to_json(attributes), self.check_id)\n if cursor is not None:\n self.write_persistent_cache('log_cursor_{}'.format(stream), to_json(cursor))\n
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def get_log_cursor(self, stream='default'):\n # type: (str) -> dict[str, Any] | None\n \"\"\"Returns the most recent log cursor from disk.\"\"\"\n data = self.read_persistent_cache('log_cursor_{}'.format(stream))\n return from_json(data) if data else None\n
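The interplay between send_log and get_log_cursor can be sketched without the Agent. In the example below, FakeLogCheck, its sent list and its in-memory _cache dict are hypothetical stand-ins for the real check, the Agent log pipeline and the persistent cache. The timestamp conversion and the per-stream cursor key follow the source shown above, with json from the standard library assumed in place of the to_json and from_json helpers.

```python
import json


class FakeLogCheck:
    # Hypothetical stand-in: a dict replaces the Agent's persistent cache.
    def __init__(self, formatted_tags=''):
        self.formatted_tags = formatted_tags
        self.sent = []
        self._cache = {}

    def send_log(self, data, cursor=None, stream='default'):
        attributes = data.copy()
        if 'ddtags' not in attributes and self.formatted_tags:
            attributes['ddtags'] = self.formatted_tags

        timestamp = attributes.get('timestamp')
        if timestamp is not None:
            # Seconds since the epoch become integer milliseconds.
            attributes['timestamp'] = int(timestamp * 1000)

        self.sent.append(attributes)
        if cursor is not None:
            self._cache['log_cursor_{}'.format(stream)] = json.dumps(cursor)

    def get_log_cursor(self, stream='default'):
        data = self._cache.get('log_cursor_{}'.format(stream))
        return json.loads(data) if data else None


check = FakeLogCheck(formatted_tags='env:dev')
check.send_log(
    {'message': 'hello', 'timestamp': 1700000000.5},
    cursor={'offset': 42},
    stream='audit',
)
audit_cursor = check.get_log_cursor('audit')
default_cursor = check.get_log_cursor()
```

Each stream keeps its own cursor, so a check that tails several sources can resume each one independently. A stream that never stored a cursor yields None.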
Logs a warning message and displays it on the Agent's status page and in-app.
Accepting *args makes warning behave like log.warn/debug/info/etc. and keeps call sites compliant with the flake8 logging format linter.
Parameters:
Name Type Description Default
warning_message str the warning message required
args Any format string args used to format the warning message e.g. warning_message % args ()
kwargs Any not used for now, but added to match Python logger's warning method signature {}
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def warning(self, warning_message, *args, **kwargs):\n # type: (str, *Any, **Any) -> None\n \"\"\"Log a warning message, display it in the Agent's status page and in-app.\n\n Using *args is intended to make warning work like log.warn/debug/info/etc\n and make it compliant with flake8 logging format linter.\n\n Parameters:\n warning_message (str):\n the warning message\n args (Any):\n format string args used to format the warning message e.g. `warning_message % args`\n kwargs (Any):\n not used for now, but added to match Python logger's `warning` method signature\n \"\"\"\n warning_message = to_native_string(warning_message)\n # Interpolate message only if args is not empty. Same behavior as python logger:\n # https://github.com/python/cpython/blob/1dbe5373851acb85ba91f0be7b83c69563acd68d/Lib/logging/__init__.py#L368-L369\n if args:\n warning_message = warning_message % args\n frame = inspect.currentframe().f_back # type: ignore\n lineno = frame.f_lineno\n # only log the last part of the filename, not the full path\n filename = basename(frame.f_code.co_filename)\n\n self.log.warning(warning_message, extra={'_lineno': lineno, '_filename': filename, '_check_id': self.check_id})\n self.warnings.append(warning_message)\n
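The interpolation rule in warning (format only when args is non-empty) is easiest to see in isolation. The helper below is a hypothetical extraction of that single step, not part of the library. It leaves out the logging and the status page bookkeeping.

```python
def format_warning(warning_message, *args):
    # Hypothetical helper: mirrors only the interpolation step of warning.
    if args:
        warning_message = warning_message % args
    return warning_message


with_args = format_warning('disk %s is at %d%%', 'sda', 93)
without_args = format_warning('100% done')
```

Because interpolation is skipped when no args are passed, a message containing a literal percent sign is returned unchanged instead of raising a formatting error.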
This implements the methods defined by the Agent's C bindings, which in turn call the Go backend.
It also provides utility methods for test assertions.
Source code in datadog_checks_base/datadog_checks/base/stubs/aggregator.py
class AggregatorStub(object):\n \"\"\"\n This implements the methods defined by the Agent's\n [C bindings](https://github.com/DataDog/datadog-agent/blob/master/rtloader/common/builtins/aggregator.c)\n which in turn call the\n [Go backend](https://github.com/DataDog/datadog-agent/blob/master/pkg/collector/python/aggregator.go).\n\n It also provides utility methods for test assertions.\n \"\"\"\n\n # Replicate the Enum we have on the Agent\n METRIC_ENUM_MAP = OrderedDict(\n (\n ('gauge', 0),\n ('rate', 1),\n ('count', 2),\n ('monotonic_count', 3),\n ('counter', 4),\n ('histogram', 5),\n ('historate', 6),\n )\n )\n METRIC_ENUM_MAP_REV = {v: k for k, v in iteritems(METRIC_ENUM_MAP)}\n GAUGE, RATE, COUNT, MONOTONIC_COUNT, COUNTER, HISTOGRAM, HISTORATE = list(METRIC_ENUM_MAP.values())\n AGGREGATE_TYPES = {COUNT, COUNTER}\n IGNORED_METRICS = {'datadog.agent.profile.memory.check_run_alloc'}\n METRIC_TYPE_SUBMISSION_TO_BACKEND_MAP = {\n 'gauge': 'gauge',\n 'rate': 'gauge',\n 'count': 'count',\n 'monotonic_count': 'count',\n 'counter': 'rate',\n 'histogram': 'rate', # Checking .count only, the other are gauges\n 'historate': 'rate', # Checking .count only, the other are gauges\n }\n\n def __init__(self):\n self.reset()\n\n @classmethod\n def is_aggregate(cls, mtype):\n return mtype in cls.AGGREGATE_TYPES\n\n @classmethod\n def ignore_metric(cls, name):\n return name in cls.IGNORED_METRICS\n\n def submit_metric(self, check, check_id, mtype, name, value, tags, hostname, flush_first_value):\n check_tag_names(name, tags)\n if not self.ignore_metric(name):\n self._metrics[name].append(MetricStub(name, mtype, value, tags, hostname, None, flush_first_value))\n\n def submit_metric_e2e(\n self, check, check_id, mtype, name, value, tags, hostname, device=None, flush_first_value=False\n ):\n check_tag_names(name, tags)\n # Device is only present in metrics read from the real agent in e2e tests. 
Normally it is submitted as a tag\n if not self.ignore_metric(name):\n self._metrics[name].append(MetricStub(name, mtype, value, tags, hostname, device, flush_first_value))\n\n def submit_service_check(self, check, check_id, name, status, tags, hostname, message):\n if status == ServiceCheck.OK and message:\n raise Exception(\"Expected empty message on OK service check\")\n\n check_tag_names(name, tags)\n self._service_checks[name].append(ServiceCheckStub(check_id, name, status, tags, hostname, message))\n\n def submit_event(self, check, check_id, event):\n self._events.append(event)\n\n def submit_event_platform_event(self, check, check_id, raw_event, event_type):\n self._event_platform_events[event_type].append(raw_event)\n\n def submit_histogram_bucket(\n self,\n check,\n check_id,\n name,\n value,\n lower_bound,\n upper_bound,\n monotonic,\n hostname,\n tags,\n flush_first_value=False,\n ):\n check_tag_names(name, tags)\n self._histogram_buckets[name].append(\n HistogramBucketStub(name, value, lower_bound, upper_bound, monotonic, hostname, tags, flush_first_value)\n )\n\n def metrics(self, name):\n \"\"\"\n Return the metrics received under the given name\n \"\"\"\n return [\n MetricStub(\n ensure_unicode(stub.name),\n stub.type,\n stub.value,\n normalize_tags(stub.tags),\n ensure_unicode(stub.hostname),\n stub.device,\n stub.flush_first_value,\n )\n for stub in self._metrics.get(to_native_string(name), [])\n ]\n\n def service_checks(self, name):\n \"\"\"\n Return the service checks received under the given name\n \"\"\"\n return [\n ServiceCheckStub(\n ensure_unicode(stub.check_id),\n ensure_unicode(stub.name),\n stub.status,\n normalize_tags(stub.tags),\n ensure_unicode(stub.hostname),\n ensure_unicode(stub.message),\n )\n for stub in self._service_checks.get(to_native_string(name), [])\n ]\n\n @property\n def events(self):\n \"\"\"\n Return all events\n \"\"\"\n return self._events\n\n def get_event_platform_events(self, event_type, parse_json=True):\n 
\"\"\"\n Return all event platform events for the event_type\n \"\"\"\n return [json.loads(e) if parse_json else e for e in self._event_platform_events[event_type]]\n\n def histogram_bucket(self, name):\n \"\"\"\n Return the histogram buckets received under the given name\n \"\"\"\n return [\n HistogramBucketStub(\n ensure_unicode(stub.name),\n stub.value,\n stub.lower_bound,\n stub.upper_bound,\n stub.monotonic,\n ensure_unicode(stub.hostname),\n normalize_tags(stub.tags),\n stub.flush_first_value,\n )\n for stub in self._histogram_buckets.get(to_native_string(name), [])\n ]\n\n def assert_metric_has_tags(self, metric_name, tags, count=None, at_least=1):\n for tag in tags:\n self.assert_metric_has_tag(metric_name, tag, count, at_least)\n\n def assert_metric_has_tag(self, metric_name, tag, count=None, at_least=1):\n \"\"\"\n Assert a metric is tagged with tag\n \"\"\"\n self._asserted.add(metric_name)\n\n candidates = []\n candidates_with_tag = []\n for metric in self.metrics(metric_name):\n candidates.append(metric)\n if tag in metric.tags:\n candidates_with_tag.append(metric)\n\n if candidates_with_tag: # The metric was found with the tag but not enough times\n msg = \"The metric '{}' with tag '{}' was only found {}/{} times\".format(metric_name, tag, count, at_least)\n elif candidates:\n msg = (\n \"The metric '{}' was found but not with the tag '{}'.\\n\".format(metric_name, tag)\n + \"Similar submitted:\\n\"\n + \"\\n\".join([\" {}\".format(m) for m in candidates])\n )\n else:\n expected_stub = MetricStub(metric_name, type=None, value=None, tags=[tag], hostname=None, device=None)\n msg = \"Metric '{}' not found\".format(metric_name)\n msg = \"{}\\n{}\".format(msg, build_similar_elements_msg(expected_stub, self._metrics))\n\n if count is not None:\n assert len(candidates_with_tag) == count, msg\n else:\n assert len(candidates_with_tag) >= at_least, msg\n\n # Potential kwargs: aggregation_key, alert_type, event_type,\n # msg_title, source_type_name\n def 
assert_event(self, msg_text, count=None, at_least=1, exact_match=True, tags=None, **kwargs):\n candidates = []\n for e in self.events:\n if exact_match and msg_text != e['msg_text'] or msg_text not in e['msg_text']:\n continue\n if tags and set(tags) != set(e['tags']):\n continue\n for name, value in iteritems(kwargs):\n if e[name] != value:\n break\n else:\n candidates.append(e)\n\n msg = \"Candidates size assertion for `{}`, count: {}, at_least: {}) failed\".format(msg_text, count, at_least)\n if count is not None:\n assert len(candidates) == count, msg\n else:\n assert len(candidates) >= at_least, msg\n\n def assert_histogram_bucket(\n self,\n name,\n value,\n lower_bound,\n upper_bound,\n monotonic,\n hostname,\n tags,\n count=None,\n at_least=1,\n flush_first_value=None,\n ):\n expected_tags = normalize_tags(tags, sort=True)\n\n candidates = []\n for bucket in self.histogram_bucket(name):\n if value is not None and value != bucket.value:\n continue\n\n if expected_tags and expected_tags != sorted(bucket.tags):\n continue\n\n if hostname and hostname != bucket.hostname:\n continue\n\n if monotonic != bucket.monotonic:\n continue\n\n if flush_first_value is not None and flush_first_value != bucket.flush_first_value:\n continue\n\n candidates.append(bucket)\n\n expected_bucket = HistogramBucketStub(\n name, value, lower_bound, upper_bound, monotonic, hostname, tags, flush_first_value\n )\n\n if count is not None:\n msg = \"Needed exactly {} candidates for '{}', got {}\".format(count, name, len(candidates))\n condition = len(candidates) == count\n else:\n msg = \"Needed at least {} candidates for '{}', got {}\".format(at_least, name, len(candidates))\n condition = len(candidates) >= at_least\n self._assert(\n condition=condition, msg=msg, expected_stub=expected_bucket, submitted_elements=self._histogram_buckets\n )\n\n def assert_metric(\n self,\n name,\n value=None,\n tags=None,\n count=None,\n at_least=1,\n hostname=None,\n metric_type=None,\n device=None,\n 
flush_first_value=None,\n ):\n \"\"\"\n Assert a metric was processed by this stub\n \"\"\"\n\n self._asserted.add(name)\n expected_tags = normalize_tags(tags, sort=True)\n\n candidates = []\n for metric in self.metrics(name):\n if value is not None and not self.is_aggregate(metric.type) and value != metric.value:\n continue\n\n if expected_tags and expected_tags != sorted(metric.tags):\n continue\n\n if hostname is not None and hostname != metric.hostname:\n continue\n\n if metric_type is not None and metric_type != metric.type:\n continue\n\n if device is not None and device != metric.device:\n continue\n\n if flush_first_value is not None and flush_first_value != metric.flush_first_value:\n continue\n\n candidates.append(metric)\n\n expected_metric = MetricStub(name, metric_type, value, expected_tags, hostname, device, flush_first_value)\n\n if value is not None and candidates and all(self.is_aggregate(m.type) for m in candidates):\n got = sum(m.value for m in candidates)\n msg = \"Expected count value for '{}': {}, got {}\".format(name, value, got)\n condition = value == got\n elif count is not None:\n msg = \"Needed exactly {} candidates for '{}', got {}\".format(count, name, len(candidates))\n condition = len(candidates) == count\n else:\n msg = \"Needed at least {} candidates for '{}', got {}\".format(at_least, name, len(candidates))\n condition = len(candidates) >= at_least\n self._assert(condition, msg=msg, expected_stub=expected_metric, submitted_elements=self._metrics)\n\n def assert_service_check(self, name, status=None, tags=None, count=None, at_least=1, hostname=None, message=None):\n \"\"\"\n Assert a service check was processed by this stub\n \"\"\"\n tags = normalize_tags(tags, sort=True)\n candidates = []\n for sc in self.service_checks(name):\n if status is not None and status != sc.status:\n continue\n\n if tags and tags != sorted(sc.tags):\n continue\n\n if hostname is not None and hostname != sc.hostname:\n continue\n\n if message is not None 
and message != sc.message:\n continue\n\n candidates.append(sc)\n\n expected_service_check = ServiceCheckStub(\n None, name=name, status=status, tags=tags, hostname=hostname, message=message\n )\n\n if count is not None:\n msg = \"Needed exactly {} candidates for '{}', got {}\".format(count, name, len(candidates))\n condition = len(candidates) == count\n else:\n msg = \"Needed at least {} candidates for '{}', got {}\".format(at_least, name, len(candidates))\n condition = len(candidates) >= at_least\n self._assert(\n condition=condition, msg=msg, expected_stub=expected_service_check, submitted_elements=self._service_checks\n )\n\n @staticmethod\n def _assert(condition, msg, expected_stub, submitted_elements):\n new_msg = msg\n if not condition: # It's costly to build the message with similar metrics, so it's built only on failure.\n new_msg = \"{}\\n{}\".format(msg, build_similar_elements_msg(expected_stub, submitted_elements))\n assert condition, new_msg\n\n def assert_all_metrics_covered(self):\n # use `condition` to avoid building the `msg` if not needed\n condition = self.metrics_asserted_pct >= 100.0\n msg = ''\n if not condition:\n prefix = '\\n\\t- '\n msg = 'Some metrics are collected but not asserted:'\n msg += '\\nAsserted Metrics:{}{}'.format(prefix, prefix.join(sorted(self._asserted)))\n msg += '\\nFound metrics that are not asserted:{}{}'.format(prefix, prefix.join(sorted(self.not_asserted())))\n assert condition, msg\n\n def assert_metrics_using_metadata(\n self, metadata_metrics, check_metric_type=True, check_submission_type=False, exclude=None\n ):\n \"\"\"\n Assert metrics using metadata.csv\n\n Checking type: By default we are asserting the in-app metric type (`check_submission_type=False`),\n asserting this type make sense for e2e (metrics collected from agent).\n For integrations tests, we can check the submission type with `check_submission_type=True`, or\n use `check_metric_type=False` not to check types.\n\n Usage:\n\n from 
datadog_checks.dev.utils import get_metadata_metrics\n aggregator.assert_metrics_using_metadata(get_metadata_metrics())\n\n \"\"\"\n\n exclude = exclude or []\n errors = set()\n for metric_name, metric_stubs in iteritems(self._metrics):\n if metric_name in exclude:\n continue\n for metric_stub in metric_stubs:\n metric_stub_name = backend_normalize_metric_name(metric_stub.name)\n actual_metric_type = AggregatorStub.METRIC_ENUM_MAP_REV[metric_stub.type]\n\n # We only check `*.count` metrics for histogram and historate submissions\n # Note: all Openmetrics histogram and summary metrics are actually separately submitted\n if check_submission_type and actual_metric_type in ['histogram', 'historate']:\n metric_stub_name += '.count'\n\n # Checking the metric is in `metadata.csv`\n if metric_stub_name not in metadata_metrics:\n errors.add(\"Expect `{}` to be in metadata.csv.\".format(metric_stub_name))\n continue\n\n expected_metric_type = metadata_metrics[metric_stub_name]['metric_type']\n if check_submission_type:\n # Integration tests type mapping\n actual_metric_type = AggregatorStub.METRIC_TYPE_SUBMISSION_TO_BACKEND_MAP[actual_metric_type]\n else:\n # E2E tests\n if actual_metric_type == 'monotonic_count' and expected_metric_type == 'count':\n actual_metric_type = 'count'\n\n if check_metric_type:\n if expected_metric_type != actual_metric_type:\n errors.add(\n \"Expect `{}` to have type `{}` but got `{}`.\".format(\n metric_stub_name, expected_metric_type, actual_metric_type\n )\n )\n\n assert not errors, \"Metadata assertion errors using metadata.csv:\" + \"\\n\\t- \".join([''] + sorted(errors))\n\n def assert_service_checks(self, service_checks):\n \"\"\"\n Assert service checks using service_checks.json\n\n Usage:\n\n from datadog_checks.dev.utils import get_service_checks\n aggregator.assert_service_checks(get_service_checks())\n\n \"\"\"\n\n errors = set()\n\n for service_check_name, service_check_stubs in iteritems(self._service_checks):\n for 
service_check_stub in service_check_stubs:\n # Checking the metric is in `service_checks.json`\n if service_check_name not in [sc['check'] for sc in service_checks]:\n errors.add(\"Expect `{}` to be in service_check.json.\".format(service_check_name))\n continue\n\n status_string = {value: key for key, value in iteritems(ServiceCheck._asdict())}[\n service_check_stub.status\n ].lower()\n service_check = [c for c in service_checks if c['check'] == service_check_name][0]\n\n if status_string not in service_check['statuses']:\n errors.add(\n \"Expect `{}` value to be in service_check.json for service check {}.\".format(\n status_string, service_check_stub.name\n )\n )\n\n assert not errors, \"Service checks assertion errors using service_checks.json:\" + \"\\n\\t- \".join(\n [''] + sorted(errors)\n )\n\n def assert_no_duplicate_all(self):\n \"\"\"\n Assert no duplicate metrics and service checks have been submitted.\n \"\"\"\n self.assert_no_duplicate_metrics()\n self.assert_no_duplicate_service_checks()\n\n def assert_no_duplicate_metrics(self):\n \"\"\"\n Assert no duplicate metrics have been submitted.\n\n Metrics are considered duplicate when all following fields match:\n\n - metric name\n - type (gauge, rate, etc)\n - tags\n - hostname\n \"\"\"\n # metric types that intended to be called multiple times are ignored\n ignored_types = [self.COUNT, self.COUNTER]\n metric_stubs = [m for metrics in self._metrics.values() for m in metrics if m.type not in ignored_types]\n\n def stub_to_key_fn(stub):\n return stub.name, stub.type, str(sorted(stub.tags)), stub.hostname\n\n self._assert_no_duplicate_stub('metric', metric_stubs, stub_to_key_fn)\n\n def assert_no_duplicate_service_checks(self):\n \"\"\"\n Assert no duplicate service checks have been submitted.\n\n Service checks are considered duplicate when all following fields match:\n - metric name\n - status\n - tags\n - hostname\n \"\"\"\n service_check_stubs = [m for metrics in self._service_checks.values() for m in 
metrics]\n\n def stub_to_key_fn(stub):\n return stub.name, stub.status, str(sorted(stub.tags)), stub.hostname\n\n self._assert_no_duplicate_stub('service_check', service_check_stubs, stub_to_key_fn)\n\n @staticmethod\n def _assert_no_duplicate_stub(stub_type, all_metrics, stub_to_key_fn):\n all_contexts = defaultdict(list)\n for metric in all_metrics:\n context = stub_to_key_fn(metric)\n all_contexts[context].append(metric)\n\n dup_contexts = defaultdict(list)\n for context, metrics in iteritems(all_contexts):\n if len(metrics) > 1:\n dup_contexts[context] = metrics\n\n err_msg_lines = [\"Duplicate {}s found:\".format(stub_type)]\n for key in sorted(dup_contexts):\n contexts = dup_contexts[key]\n err_msg_lines.append('- {}'.format(contexts[0].name))\n for metric in contexts:\n err_msg_lines.append(' ' + str(metric))\n\n assert len(dup_contexts) == 0, \"\\n\".join(err_msg_lines)\n\n def reset(self):\n \"\"\"\n Set the stub to its initial state\n \"\"\"\n self._metrics = defaultdict(list)\n self._asserted = set()\n self._service_checks = defaultdict(list)\n self._events = []\n # dict[event_type, [events]]\n self._event_platform_events = defaultdict(list)\n self._histogram_buckets = defaultdict(list)\n\n def all_metrics_asserted(self):\n assert self.metrics_asserted_pct >= 100.0\n\n def not_asserted(self):\n present_metrics = {ensure_unicode(m) for m in self._metrics}\n return present_metrics - set(self._asserted)\n\n def assert_metric_has_tag_prefix(self, metric_name, tag_prefix, count=None, at_least=1):\n candidates = []\n self._asserted.add(metric_name)\n\n for metric in self.metrics(metric_name):\n tags = metric.tags\n gtags = [t for t in tags if t.startswith(tag_prefix)]\n if len(gtags) > 0:\n candidates.append(metric)\n\n msg = \"Candidates size assertion for `{}`, count: {}, at_least: {}) failed\".format(metric_name, count, at_least)\n if count is not None:\n assert len(candidates) == count, msg\n else:\n assert len(candidates) >= at_least, msg\n\n @property\n 
def metrics_asserted_pct(self):\n \"\"\"\n Return the metrics assertion coverage\n \"\"\"\n num_metrics = len(self._metrics)\n num_asserted = len(self._asserted)\n\n if num_metrics == 0:\n if num_asserted == 0:\n return 100\n else:\n return 0\n\n # If it there have been assertions with at_least=0 the length of the num_metrics and num_asserted can match\n # even if there are different metrics in each set\n not_asserted = self.not_asserted()\n return (num_metrics - len(not_asserted)) / num_metrics * 100\n\n @property\n def metric_names(self):\n \"\"\"\n Return all the metric names we've seen so far\n \"\"\"\n return [ensure_unicode(name) for name in self._metrics.keys()]\n\n @property\n def service_check_names(self):\n \"\"\"\n Return all the service checks names seen so far\n \"\"\"\n return [ensure_unicode(name) for name in self._service_checks.keys()]\n
Source code in datadog_checks_base/datadog_checks/base/stubs/aggregator.py
def assert_metric(\n self,\n name,\n value=None,\n tags=None,\n count=None,\n at_least=1,\n hostname=None,\n metric_type=None,\n device=None,\n flush_first_value=None,\n):\n \"\"\"\n Assert a metric was processed by this stub\n \"\"\"\n\n self._asserted.add(name)\n expected_tags = normalize_tags(tags, sort=True)\n\n candidates = []\n for metric in self.metrics(name):\n if value is not None and not self.is_aggregate(metric.type) and value != metric.value:\n continue\n\n if expected_tags and expected_tags != sorted(metric.tags):\n continue\n\n if hostname is not None and hostname != metric.hostname:\n continue\n\n if metric_type is not None and metric_type != metric.type:\n continue\n\n if device is not None and device != metric.device:\n continue\n\n if flush_first_value is not None and flush_first_value != metric.flush_first_value:\n continue\n\n candidates.append(metric)\n\n expected_metric = MetricStub(name, metric_type, value, expected_tags, hostname, device, flush_first_value)\n\n if value is not None and candidates and all(self.is_aggregate(m.type) for m in candidates):\n got = sum(m.value for m in candidates)\n msg = \"Expected count value for '{}': {}, got {}\".format(name, value, got)\n condition = value == got\n elif count is not None:\n msg = \"Needed exactly {} candidates for '{}', got {}\".format(count, name, len(candidates))\n condition = len(candidates) == count\n else:\n msg = \"Needed at least {} candidates for '{}', got {}\".format(at_least, name, len(candidates))\n condition = len(candidates) >= at_least\n self._assert(condition, msg=msg, expected_stub=expected_metric, submitted_elements=self._metrics)\n
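One subtlety of assert_metric is that the value argument is treated differently for aggregate types (count and counter): values are summed across all matching submissions rather than compared one by one. The sketch below reproduces only that decision logic under simplifying assumptions. Metric and matches are hypothetical names, and the tag, hostname, device and flush_first_value filters are left out.

```python
from collections import namedtuple

# Hypothetical, reduced stub: the real MetricStub carries more fields.
Metric = namedtuple('Metric', 'name type value')

GAUGE, COUNT, COUNTER = 0, 2, 4  # same numbering as METRIC_ENUM_MAP
AGGREGATE_TYPES = {COUNT, COUNTER}


def matches(submitted, value=None, count=None, at_least=1):
    candidates = [
        m for m in submitted
        if value is None or m.type in AGGREGATE_TYPES or m.value == value
    ]
    if value is not None and candidates and all(m.type in AGGREGATE_TYPES for m in candidates):
        # Aggregate types: compare the expected value with the sum.
        return sum(m.value for m in candidates) == value
    if count is not None:
        return len(candidates) == count
    return len(candidates) >= at_least


counts = [Metric('app.requests', COUNT, 1), Metric('app.requests', COUNT, 2)]
gauges = [Metric('app.load', GAUGE, 5), Metric('app.load', GAUGE, 5)]
```

In this sketch, matches(counts, value=3) holds because 1 + 2 equals 3, while matches(gauges, value=5, count=2) holds because both gauge submissions individually equal 5. Note that when every candidate is an aggregate type and value is given, the count argument is not consulted.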
Source code in datadog_checks_base/datadog_checks/base/stubs/aggregator.py
def assert_metric_has_tag(self, metric_name, tag, count=None, at_least=1):\n \"\"\"\n Assert a metric is tagged with tag\n \"\"\"\n self._asserted.add(metric_name)\n\n candidates = []\n candidates_with_tag = []\n for metric in self.metrics(metric_name):\n candidates.append(metric)\n if tag in metric.tags:\n candidates_with_tag.append(metric)\n\n if candidates_with_tag: # The metric was found with the tag but not enough times\n msg = \"The metric '{}' with tag '{}' was only found {}/{} times\".format(metric_name, tag, count, at_least)\n elif candidates:\n msg = (\n \"The metric '{}' was found but not with the tag '{}'.\\n\".format(metric_name, tag)\n + \"Similar submitted:\\n\"\n + \"\\n\".join([\" {}\".format(m) for m in candidates])\n )\n else:\n expected_stub = MetricStub(metric_name, type=None, value=None, tags=[tag], hostname=None, device=None)\n msg = \"Metric '{}' not found\".format(metric_name)\n msg = \"{}\\n{}\".format(msg, build_similar_elements_msg(expected_stub, self._metrics))\n\n if count is not None:\n assert len(candidates_with_tag) == count, msg\n else:\n assert len(candidates_with_tag) >= at_least, msg\n
"},{"location":"base/api/#datadog_checks.base.stubs.aggregator.AggregatorStub.assert_metric_has_tag_prefix","title":"assert_metric_has_tag_prefix(metric_name, tag_prefix, count=None, at_least=1)","text":"Source code in datadog_checks_base/datadog_checks/base/stubs/aggregator.py
def assert_metric_has_tag_prefix(self, metric_name, tag_prefix, count=None, at_least=1):\n candidates = []\n self._asserted.add(metric_name)\n\n for metric in self.metrics(metric_name):\n tags = metric.tags\n gtags = [t for t in tags if t.startswith(tag_prefix)]\n if len(gtags) > 0:\n candidates.append(metric)\n\n msg = \"Candidates size assertion for `{}`, count: {}, at_least: {}) failed\".format(metric_name, count, at_least)\n if count is not None:\n assert len(candidates) == count, msg\n else:\n assert len(candidates) >= at_least, msg\n
Source code in datadog_checks_base/datadog_checks/base/stubs/aggregator.py
def assert_service_check(self, name, status=None, tags=None, count=None, at_least=1, hostname=None, message=None):\n \"\"\"\n Assert a service check was processed by this stub\n \"\"\"\n tags = normalize_tags(tags, sort=True)\n candidates = []\n for sc in self.service_checks(name):\n if status is not None and status != sc.status:\n continue\n\n if tags and tags != sorted(sc.tags):\n continue\n\n if hostname is not None and hostname != sc.hostname:\n continue\n\n if message is not None and message != sc.message:\n continue\n\n candidates.append(sc)\n\n expected_service_check = ServiceCheckStub(\n None, name=name, status=status, tags=tags, hostname=hostname, message=message\n )\n\n if count is not None:\n msg = \"Needed exactly {} candidates for '{}', got {}\".format(count, name, len(candidates))\n condition = len(candidates) == count\n else:\n msg = \"Needed at least {} candidates for '{}', got {}\".format(at_least, name, len(candidates))\n condition = len(candidates) >= at_least\n self._assert(\n condition=condition, msg=msg, expected_stub=expected_service_check, submitted_elements=self._service_checks\n )\n
"},{"location":"base/api/#datadog_checks.base.stubs.aggregator.AggregatorStub.assert_event","title":"assert_event(msg_text, count=None, at_least=1, exact_match=True, tags=None, **kwargs)","text":"Source code in datadog_checks_base/datadog_checks/base/stubs/aggregator.py
def assert_event(self, msg_text, count=None, at_least=1, exact_match=True, tags=None, **kwargs):\n candidates = []\n for e in self.events:\n if exact_match and msg_text != e['msg_text'] or msg_text not in e['msg_text']:\n continue\n if tags and set(tags) != set(e['tags']):\n continue\n for name, value in iteritems(kwargs):\n if e[name] != value:\n break\n else:\n candidates.append(e)\n\n msg = \"Candidates size assertion for `{}`, count: {}, at_least: {}) failed\".format(msg_text, count, at_least)\n if count is not None:\n assert len(candidates) == count, msg\n else:\n assert len(candidates) >= at_least, msg\n
Checking type: By default the in-app metric type is asserted (check_submission_type=False), which makes sense for e2e tests, where metrics are collected from the Agent. For integration tests, the submission type can be checked with check_submission_type=True, or type checking can be disabled with check_metric_type=False.
Usage:
from datadog_checks.dev.utils import get_metadata_metrics\naggregator.assert_metrics_using_metadata(get_metadata_metrics())\n
Source code in datadog_checks_base/datadog_checks/base/stubs/aggregator.py
def assert_metrics_using_metadata(\n self, metadata_metrics, check_metric_type=True, check_submission_type=False, exclude=None\n):\n \"\"\"\n Assert metrics using metadata.csv\n\n Checking type: By default we are asserting the in-app metric type (`check_submission_type=False`),\n asserting this type make sense for e2e (metrics collected from agent).\n For integrations tests, we can check the submission type with `check_submission_type=True`, or\n use `check_metric_type=False` not to check types.\n\n Usage:\n\n from datadog_checks.dev.utils import get_metadata_metrics\n aggregator.assert_metrics_using_metadata(get_metadata_metrics())\n\n \"\"\"\n\n exclude = exclude or []\n errors = set()\n for metric_name, metric_stubs in iteritems(self._metrics):\n if metric_name in exclude:\n continue\n for metric_stub in metric_stubs:\n metric_stub_name = backend_normalize_metric_name(metric_stub.name)\n actual_metric_type = AggregatorStub.METRIC_ENUM_MAP_REV[metric_stub.type]\n\n # We only check `*.count` metrics for histogram and historate submissions\n # Note: all Openmetrics histogram and summary metrics are actually separately submitted\n if check_submission_type and actual_metric_type in ['histogram', 'historate']:\n metric_stub_name += '.count'\n\n # Checking the metric is in `metadata.csv`\n if metric_stub_name not in metadata_metrics:\n errors.add(\"Expect `{}` to be in metadata.csv.\".format(metric_stub_name))\n continue\n\n expected_metric_type = metadata_metrics[metric_stub_name]['metric_type']\n if check_submission_type:\n # Integration tests type mapping\n actual_metric_type = AggregatorStub.METRIC_TYPE_SUBMISSION_TO_BACKEND_MAP[actual_metric_type]\n else:\n # E2E tests\n if actual_metric_type == 'monotonic_count' and expected_metric_type == 'count':\n actual_metric_type = 'count'\n\n if check_metric_type:\n if expected_metric_type != actual_metric_type:\n errors.add(\n \"Expect `{}` to have type `{}` but got `{}`.\".format(\n metric_stub_name, 
expected_metric_type, actual_metric_type\n )\n )\n\n assert not errors, \"Metadata assertion errors using metadata.csv:\" + \"\\n\\t- \".join([''] + sorted(errors))\n
"},{"location":"base/api/#datadog_checks.base.stubs.aggregator.AggregatorStub.assert_all_metrics_covered","title":"assert_all_metrics_covered()","text":"Source code in datadog_checks_base/datadog_checks/base/stubs/aggregator.py
def assert_all_metrics_covered(self):\n # use `condition` to avoid building the `msg` if not needed\n condition = self.metrics_asserted_pct >= 100.0\n msg = ''\n if not condition:\n prefix = '\\n\\t- '\n msg = 'Some metrics are collected but not asserted:'\n msg += '\\nAsserted Metrics:{}{}'.format(prefix, prefix.join(sorted(self._asserted)))\n msg += '\\nFound metrics that are not asserted:{}{}'.format(prefix, prefix.join(sorted(self.not_asserted())))\n assert condition, msg\n
Metrics are considered duplicates when all of the following fields match:
metric name
type (gauge, rate, etc)
tags
hostname
Source code in datadog_checks_base/datadog_checks/base/stubs/aggregator.py
def assert_no_duplicate_metrics(self):\n \"\"\"\n Assert no duplicate metrics have been submitted.\n\n Metrics are considered duplicate when all following fields match:\n\n - metric name\n - type (gauge, rate, etc)\n - tags\n - hostname\n \"\"\"\n # metric types that are intended to be submitted multiple times are ignored\n ignored_types = [self.COUNT, self.COUNTER]\n metric_stubs = [m for metrics in self._metrics.values() for m in metrics if m.type not in ignored_types]\n\n def stub_to_key_fn(stub):\n return stub.name, stub.type, str(sorted(stub.tags)), stub.hostname\n\n self._assert_no_duplicate_stub('metric', metric_stubs, stub_to_key_fn)\n
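As a rough illustration of the duplicate rule, the sketch below groups stubs by the same key fields the method uses (name, type, sorted tags, hostname). `MetricStub` is a hypothetical namedtuple and `find_duplicates` is a made-up helper for this example; neither is part of `AggregatorStub`, and the type is a plain string here purely for readability.

```python
from collections import Counter, namedtuple

# Hypothetical stand-in for the stub objects stored by the aggregator.
MetricStub = namedtuple('MetricStub', 'name type value tags hostname')


def stub_to_key(stub):
    # Same fields as the key function in the source: name, type, sorted tags, hostname.
    return stub.name, stub.type, str(sorted(stub.tags)), stub.hostname


def find_duplicates(stubs):
    # Return every key that was submitted more than once.
    counts = Counter(stub_to_key(s) for s in stubs)
    return sorted(key for key, n in counts.items() if n > 1)


stubs = [
    MetricStub('db.connections', 'gauge', 5, ['env:dev', 'db:main'], 'host1'),
    # Same key as above: tags are sorted before comparison, and the value is not part of the key.
    MetricStub('db.connections', 'gauge', 7, ['db:main', 'env:dev'], 'host1'),
    # Different tags, so this one is not a duplicate.
    MetricStub('db.connections', 'gauge', 5, ['db:main'], 'host1'),
]
duplicates = find_duplicates(stubs)
```

Because the tags are sorted before the key is built, two submissions that differ only in tag order still count as duplicates.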
Assert no duplicate service checks have been submitted.
Service checks are considered duplicates when all of the following fields match:
metric name
status
tags
hostname
Source code in datadog_checks_base/datadog_checks/base/stubs/aggregator.py
def assert_no_duplicate_service_checks(self):\n \"\"\"\n Assert no duplicate service checks have been submitted.\n\n Service checks are considered duplicate when all following fields match:\n - metric name\n - status\n - tags\n - hostname\n \"\"\"\n service_check_stubs = [m for metrics in self._service_checks.values() for m in metrics]\n\n def stub_to_key_fn(stub):\n return stub.name, stub.status, str(sorted(stub.tags)), stub.hostname\n\n self._assert_no_duplicate_stub('service_check', service_check_stubs, stub_to_key_fn)\n
Assert no duplicate metrics and service checks have been submitted.
Source code in datadog_checks_base/datadog_checks/base/stubs/aggregator.py
def assert_no_duplicate_all(self):\n \"\"\"\n Assert no duplicate metrics and service checks have been submitted.\n \"\"\"\n self.assert_no_duplicate_metrics()\n self.assert_no_duplicate_service_checks()\n
"},{"location":"base/api/#datadog_checks.base.stubs.aggregator.AggregatorStub.all_metrics_asserted","title":"all_metrics_asserted()","text":"Source code in datadog_checks_base/datadog_checks/base/stubs/aggregator.py
This implements the methods defined by the Agent's C bindings which in turn call the Go backend.
It also provides utility methods for test assertions.
Source code in datadog_checks_base/datadog_checks/base/stubs/datadog_agent.py
class DatadogAgentStub(object):\n \"\"\"\n This implements the methods defined by the Agent's\n [C bindings](https://github.com/DataDog/datadog-agent/blob/master/rtloader/common/builtins/datadog_agent.c)\n which in turn call the\n [Go backend](https://github.com/DataDog/datadog-agent/blob/master/pkg/collector/python/datadog_agent.go).\n\n It also provides utility methods for test assertions.\n \"\"\"\n\n def __init__(self):\n self._sent_logs = defaultdict(list)\n self._metadata = {}\n self._cache = {}\n self._config = self.get_default_config()\n self._hostname = 'stubbed.hostname'\n self._process_start_time = 0\n self._external_tags = []\n self._host_tags = \"{}\"\n\n def get_default_config(self):\n return {'enable_metadata_collection': True, 'disable_unsafe_yaml': True}\n\n def reset(self):\n self._sent_logs.clear()\n self._metadata.clear()\n self._cache.clear()\n self._config = self.get_default_config()\n self._process_start_time = 0\n self._external_tags = []\n self._host_tags = \"{}\"\n\n def assert_logs(self, check_id, logs):\n sent_logs = self._sent_logs[check_id]\n assert sent_logs == logs, 'Expected {} logs for check {}, found {}. Submitted logs: {}'.format(\n len(logs), check_id, len(self._sent_logs[check_id]), repr(self._sent_logs)\n )\n\n def assert_metadata(self, check_id, data):\n actual = {}\n for name in data:\n key = (check_id, name)\n if key in self._metadata:\n actual[name] = self._metadata[key]\n assert data == actual\n\n def assert_metadata_count(self, count):\n metadata_items = len(self._metadata)\n assert metadata_items == count, 'Expected {} metadata items, found {}. 
Submitted metadata: {}'.format(\n count, metadata_items, repr(self._metadata)\n )\n\n def assert_external_tags(self, hostname, external_tags, match_tags_order=False):\n for h, tags in self._external_tags:\n if h == hostname:\n if not match_tags_order:\n external_tags = {k: sorted(v) for (k, v) in iteritems(external_tags)}\n tags = {k: sorted(v) for (k, v) in iteritems(tags)}\n\n assert (\n external_tags == tags\n ), 'Expected {} external tags for hostname {}, found {}. Submitted external tags: {}'.format(\n external_tags, hostname, tags, repr(self._external_tags)\n )\n return\n\n raise AssertionError('Hostname {} not found in external tags {}'.format(hostname, repr(self._external_tags)))\n\n def assert_external_tags_count(self, count):\n tags_count = len(self._external_tags)\n assert tags_count == count, 'Expected {} external tags items, found {}. Submitted external tags: {}'.format(\n count, tags_count, repr(self._external_tags)\n )\n\n def get_hostname(self):\n return self._hostname\n\n def set_hostname(self, hostname):\n self._hostname = hostname\n\n def reset_hostname(self):\n self._hostname = 'stubbed.hostname'\n\n def get_host_tags(self):\n return self._host_tags\n\n def _set_host_tags(self, tags_dict):\n self._host_tags = json.dumps(tags_dict)\n\n def _reset_host_tags(self):\n self._host_tags = \"{}\"\n\n def get_config(self, config_option):\n return self._config.get(config_option, '')\n\n def get_version(self):\n return '0.0.0'\n\n def log(self, *args, **kwargs):\n pass\n\n def set_check_metadata(self, check_id, name, value):\n self._metadata[(check_id, name)] = value\n\n def send_log(self, log_line, check_id):\n self._sent_logs[check_id].append(from_json(log_line))\n\n def set_external_tags(self, external_tags):\n self._external_tags = external_tags\n\n def tracemalloc_enabled(self, *args, **kwargs):\n return False\n\n def write_persistent_cache(self, key, value):\n self._cache[key] = value\n\n def read_persistent_cache(self, key):\n return 
self._cache.get(key, '')\n\n def obfuscate_sql(self, query, options=None):\n # Full obfuscation implementation is in go code.\n if options:\n # Options provided is a JSON string because the Go stub requires it, whereas\n # the python stub does not for things such as testing.\n if from_json(options).get('return_json_metadata', False):\n return to_json({'query': re.sub(r'\\s+', ' ', query or '').strip(), 'metadata': {}})\n return re.sub(r'\\s+', ' ', query or '').strip()\n\n def obfuscate_sql_exec_plan(self, plan, normalize=False):\n # Passthrough stub: obfuscation implementation is in Go code.\n return plan\n\n def get_process_start_time(self):\n return self._process_start_time\n\n def set_process_start_time(self, time):\n self._process_start_time = time\n\n def obfuscate_mongodb_string(self, command):\n # Passthrough stub: obfuscation implementation is in Go code.\n return command\n
"},{"location":"base/api/#datadog_checks.base.stubs.datadog_agent.DatadogAgentStub.assert_metadata","title":"assert_metadata(check_id, data)","text":"Source code in datadog_checks_base/datadog_checks/base/stubs/datadog_agent.py
def assert_metadata(self, check_id, data):\n actual = {}\n for name in data:\n key = (check_id, name)\n if key in self._metadata:\n actual[name] = self._metadata[key]\n assert data == actual\n
"},{"location":"base/api/#datadog_checks.base.stubs.datadog_agent.DatadogAgentStub.assert_metadata_count","title":"assert_metadata_count(count)","text":"Source code in datadog_checks_base/datadog_checks/base/stubs/datadog_agent.py
"},{"location":"base/api/#datadog_checks.base.stubs.datadog_agent.DatadogAgentStub.reset","title":"reset()","text":"Source code in datadog_checks_base/datadog_checks/base/stubs/datadog_agent.py
This list enumerates what is collected from your system by each integration. For more information on metrics, see the Metric Types documentation. You can find the metrics for each integration in that integration's metadata.csv file. You can also set up custom metrics, so if the integration doesn\u2019t offer a metric out of the box, you can usually add it.
The gauge metric submission type represents a snapshot of events in one time interval. This representative snapshot value is the last value submitted to the Agent during a time interval. A gauge can be used to take a measure of something reporting continuously\u2014like the available disk space or memory used.
The count metric submission type represents the total number of event occurrences in one time interval. A count can be used to track the total number of connections made to a database or the total number of requests to an endpoint. This number of events can increase or decrease over time\u2014it is not monotonically increasing.
The rate metric submission type represents the total number of event occurrences per second in one time interval. A rate can be used to track how often something is happening\u2014like the frequency of connections made to a database or the flow of requests made to an endpoint.
The histogram metric submission type represents the statistical distribution of a set of values calculated Agent-side in one time interval. Datadog\u2019s histogram metric type is an extension of the StatsD timing metric type: the Agent aggregates the values that are sent in a defined time interval and produces different metrics which represent the set of values.
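To make the gauge, count and rate descriptions concrete, here is a minimal sketch of how the values submitted during one interval could be reduced for each submission type. The `aggregate` function and the 10-second interval are assumptions made for this example only; the real aggregation happens inside the Agent, and histograms are left out because the set of metrics they produce depends on Agent configuration.

```python
def aggregate(submission_type, values, interval_seconds=10):
    """Reduce the values submitted in one interval (illustrative only)."""
    if submission_type == 'gauge':
        # A gauge keeps the last value submitted during the interval.
        return values[-1]
    if submission_type == 'count':
        # A count is the total number of occurrences in the interval.
        return sum(values)
    if submission_type == 'rate':
        # A rate is the number of occurrences per second over the interval.
        return sum(values) / interval_seconds
    raise ValueError('unsupported submission type: {}'.format(submission_type))


samples = [3, 5, 2]
gauge_value = aggregate('gauge', samples)
count_value = aggregate('count', samples)
rate_value = aggregate('rate', samples)
```

With the same three samples, the gauge reports only the latest value, the count reports their sum, and the rate divides that sum by the interval length.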
Within every integration, you can specify the value of __NAMESPACE__:
from datadog_checks.base import AgentCheck\n\n\nclass AwesomeCheck(AgentCheck):\n __NAMESPACE__ = 'awesome'\n\n...\n
This is an optional addition, but it makes submissions easier since it prefixes every metric with the __NAMESPACE__ automatically. In this case it would prepend awesome. to each metric submitted to Datadog.
If you wish to ignore the namespace for any reason, you can append an optional Boolean raw=True to each submission:
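In a real check this would be a call such as `self.gauge('test', 1.23, tags=['foo:bar'], raw=True)`. The sketch below only imitates the prefixing behaviour so that it runs without the Agent packages installed: `MiniCheck` is a hypothetical stand-in, not the real `AgentCheck`, and the metric names are invented.

```python
class MiniCheck:
    """Hypothetical stand-in that mimics only the namespace prefixing."""

    __NAMESPACE__ = 'awesome'

    def __init__(self):
        # Records (name, value) pairs instead of sending them anywhere.
        self.submitted = []

    def gauge(self, name, value, raw=False):
        # Unless raw=True, the namespace is prepended to the metric name.
        if not raw and self.__NAMESPACE__:
            name = '{}.{}'.format(self.__NAMESPACE__, name)
        self.submitted.append((name, value))


check = MiniCheck()
check.gauge('requests.active', 4)                   # stored as awesome.requests.active
check.gauge('legacy.requests.active', 4, raw=True)  # stored unchanged
```

The second call shows the typical use of `raw=True`: keeping an existing metric name as it is while the rest of the integration uses the namespace.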
In the AgentCheck class, there is a useful property called check_initializations, which holds functions that are executed once, before the first check run. You can add functions to check_initializations in the __init__ method of an integration. For example, you could use it to parse configuration information before running a check. Listed below is an example with Airflow:
class AirflowCheck(AgentCheck):\n def __init__(self, name, init_config, instances):\n super(AirflowCheck, self).__init__(name, init_config, instances)\n\n self._url = self.instance.get('url', '')\n self._tags = self.instance.get('tags', [])\n\n # The Agent only makes one attempt to instantiate each AgentCheck so any errors occurring\n # in `__init__` are logged just once, making them difficult to spot. Therefore,\n # potential configuration errors are emitted as part of the check run phase.\n # The configuration is only parsed once if it succeeds, otherwise it's retried.\n self.check_initializations.append(self._parse_config)\n\n...\n
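The comment in the Airflow example says the configuration is parsed once if it succeeds and retried otherwise. The following is a simplified model of that behaviour, not the actual `AgentCheck` implementation: `MiniCheck`, its `run` method and the use of a `deque` are assumptions made for illustration.

```python
from collections import deque


class MiniCheck:
    """Hypothetical model of run-once initializations with retry on failure."""

    def __init__(self, instance):
        self.instance = instance
        self.url = None
        self.parse_calls = 0
        self.check_initializations = deque([self._parse_config])

    def _parse_config(self):
        self.parse_calls += 1
        url = self.instance.get('url')
        if not url:
            raise ValueError('`url` is required')
        self.url = url

    def run(self):
        # Drain pending initializations; a failing one is put back so that
        # the next run retries it and the error is reported on every run.
        while self.check_initializations:
            initialization = self.check_initializations.popleft()
            try:
                initialization()
            except Exception:
                self.check_initializations.appendleft(initialization)
                raise
        return 'ok'  # the actual check logic would go here


check = MiniCheck({})
try:
    check.run()  # fails: `url` is missing, so parsing stays queued
except ValueError:
    pass

check.instance['url'] = 'http://localhost:8080'
first = check.run()   # parsing is retried and now succeeds
second = check.run()  # nothing left to initialize
```

Here `_parse_config` runs twice in total: once for the failed attempt and once for the successful retry. Later runs skip it entirely.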
This class accepts a single dict argument which is necessary to run the query. The representation is based on our custom_queries format originally designed and implemented in #1528.
It is now part of all our database integrations and other products have since adopted this format.
Source code in datadog_checks_base/datadog_checks/base/utils/db/query.py
class Query(object):\n \"\"\"\n This class accepts a single `dict` argument which is necessary to run the query. The representation\n is based on our `custom_queries` format originally designed and implemented in !1528.\n\n It is now part of all our database integrations and\n [other](https://cloud.google.com/solutions/sap/docs/sap-hana-monitoring-agent-planning-guide#defining_custom_queries)\n products have since adopted this format.\n \"\"\"\n\n def __init__(self, query_data):\n '''\n Parameters:\n query_data (Dict[str, Any]): The query data to run the query. It should contain the following fields:\n - name (str): The name of the query.\n - query (str): The query to run.\n - columns (List[Dict[str, Any]]): Each column should contain the following fields:\n - name (str): The name of the column.\n - type (str): The type of the column.\n - (Optional) Any other field that the column transformer for the type requires.\n - (Optional) extras (List[Dict[str, Any]]): Each extra transformer should contain the following fields:\n - name (str): The name of the extra transformer.\n - type (str): The type of the extra transformer.\n - (Optional) Any other field that the extra transformer for the type requires.\n - (Optional) tags (List[str]): The tags to add to the query result.\n - (Optional) collection_interval (int): The collection interval (in seconds) of the query.\n Note:\n If collection_interval is None, the query will be run every check run.\n If the collection interval is less than check collection interval,\n the query will be run every check run.\n If the collection interval is greater than check collection interval,\n the query will NOT BE RUN exactly at the collection interval.\n The query will be run at the next check run after the collection interval has passed.\n - (Optional) metric_prefix (str): The prefix to add to the metric name.\n Note: If the metric prefix is None, the default metric prefix `<INTEGRATION>.` will be used.\n '''\n # Contains the data to 
fill the rest of the attributes\n self.query_data = deepcopy(query_data or {}) # type: Dict[str, Any]\n self.name = None # type: str\n # The actual query\n self.query = None # type: str\n # Contains a mapping of column_name -> column_type, transformer\n self.column_transformers = None # type: Tuple[Tuple[str, Tuple[str, Transformer]]]\n # These transformers are used to collect extra metrics calculated from the query result\n self.extra_transformers = None # type: List[Tuple[str, Transformer]]\n # Contains the tags defined in query_data, more tags can be added later from the query result\n self.base_tags = None # type: List[str]\n # The collection interval (in seconds) of the query. If None, the query will be run every check run.\n self.collection_interval = None # type: int\n # The last time the query was executed. If None, the query has never been executed.\n # This is only used when the collection_interval is not None.\n self.__last_execution_time = None # type: float\n # whether to ignore any defined namespace prefix. True when `metric_prefix` is defined.\n self.metric_name_raw = False # type: bool\n\n def compile(\n self,\n column_transformers, # type: Dict[str, TransformerFactory]\n extra_transformers, # type: Dict[str, TransformerFactory]\n ):\n # type: (...) 
-> None\n\n \"\"\"\n This idempotent method will be called by `QueryManager.compile_queries` so you\n should never need to call it directly.\n \"\"\"\n # Check for previous compilation\n if self.name is not None:\n return\n\n query_name = self.query_data.get('name')\n if not query_name:\n raise ValueError('query field `name` is required')\n elif not isinstance(query_name, str):\n raise ValueError('query field `name` must be a string')\n\n metric_prefix = self.query_data.get('metric_prefix')\n if metric_prefix is not None:\n if not isinstance(metric_prefix, str):\n raise ValueError('field `metric_prefix` for {} must be a string'.format(query_name))\n elif not metric_prefix:\n raise ValueError('field `metric_prefix` for {} must not be empty'.format(query_name))\n\n query = self.query_data.get('query')\n if not query:\n raise ValueError('field `query` for {} is required'.format(query_name))\n elif query_name.startswith('custom query #') and not isinstance(query, str):\n raise ValueError('field `query` for {} must be a string'.format(query_name))\n\n columns = self.query_data.get('columns')\n if not columns:\n raise ValueError('field `columns` for {} is required'.format(query_name))\n elif not isinstance(columns, list):\n raise ValueError('field `columns` for {} must be a list'.format(query_name))\n\n tags = self.query_data.get('tags', [])\n if tags is not None and not isinstance(tags, list):\n raise ValueError('field `tags` for {} must be a list'.format(query_name))\n\n # Keep track of all defined names\n sources = {}\n\n column_data = []\n for i, column in enumerate(columns, 1):\n # Columns can be ignored via configuration.\n if not column:\n column_data.append((None, None))\n continue\n elif not isinstance(column, dict):\n raise ValueError('column #{} of {} is not a mapping'.format(i, query_name))\n\n column_name = column.get('name')\n if not column_name:\n raise ValueError('field `name` for column #{} of {} is required'.format(i, query_name))\n elif not 
isinstance(column_name, str):\n raise ValueError('field `name` for column #{} of {} must be a string'.format(i, query_name))\n elif column_name in sources:\n raise ValueError(\n 'the name {} of {} was already defined in {} #{}'.format(\n column_name, query_name, sources[column_name]['type'], sources[column_name]['index']\n )\n )\n\n sources[column_name] = {'type': 'column', 'index': i}\n\n column_type = column.get('type')\n if not column_type:\n raise ValueError('field `type` for column {} of {} is required'.format(column_name, query_name))\n elif not isinstance(column_type, str):\n raise ValueError('field `type` for column {} of {} must be a string'.format(column_name, query_name))\n elif column_type == 'source':\n column_data.append((column_name, (None, None)))\n continue\n elif column_type not in column_transformers:\n raise ValueError('unknown type `{}` for column {} of {}'.format(column_type, column_name, query_name))\n\n __column_type_is_tag = column_type in ('tag', 'tag_list', 'tag_not_null')\n modifiers = {key: value for key, value in column.items() if key not in ('name', 'type')}\n\n try:\n if not __column_type_is_tag and metric_prefix:\n # if metric_prefix is defined, we prepend it to the column name\n column_name = \"{}.{}\".format(metric_prefix, column_name)\n transformer = column_transformers[column_type](column_transformers, column_name, **modifiers)\n except Exception as e:\n error = 'error compiling type `{}` for column {} of {}: {}'.format(\n column_type, column_name, query_name, e\n )\n\n # Prepend helpful error text.\n #\n # When an exception is raised in the context of another one, both will be printed. To avoid\n # this we set the context to None. https://www.python.org/dev/peps/pep-0409/\n raise_from(type(e)(error), None)\n else:\n if __column_type_is_tag:\n column_data.append((column_name, (column_type, transformer)))\n else:\n # All these would actually submit data. 
As that is the default case, we represent it as\n # a reference to None since if we use e.g. `value` it would never be checked anyway.\n column_data.append((column_name, (None, transformer)))\n\n submission_transformers = column_transformers.copy() # type: Dict[str, Transformer]\n submission_transformers.pop('tag')\n submission_transformers.pop('tag_list')\n submission_transformers.pop('tag_not_null')\n\n extras = self.query_data.get('extras', []) # type: List[Dict[str, Any]]\n if not isinstance(extras, list):\n raise ValueError('field `extras` for {} must be a list'.format(query_name))\n\n extra_data = [] # type: List[Tuple[str, Transformer]]\n for i, extra in enumerate(extras, 1):\n if not isinstance(extra, dict):\n raise ValueError('extra #{} of {} is not a mapping'.format(i, query_name))\n\n extra_type = extra.get('type') # type: str\n extra_name = extra.get('name') # type: str\n if extra_type == 'log':\n # The name is unused\n extra_name = 'log'\n elif not extra_name:\n raise ValueError('field `name` for extra #{} of {} is required'.format(i, query_name))\n elif not isinstance(extra_name, str):\n raise ValueError('field `name` for extra #{} of {} must be a string'.format(i, query_name))\n elif extra_name in sources:\n raise ValueError(\n 'the name {} of {} was already defined in {} #{}'.format(\n extra_name, query_name, sources[extra_name]['type'], sources[extra_name]['index']\n )\n )\n\n sources[extra_name] = {'type': 'extra', 'index': i}\n\n if not extra_type:\n if 'expression' in extra:\n extra_type = 'expression'\n else:\n raise ValueError('field `type` for extra {} of {} is required'.format(extra_name, query_name))\n elif not isinstance(extra_type, str):\n raise ValueError('field `type` for extra {} of {} must be a string'.format(extra_name, query_name))\n elif extra_type not in extra_transformers and extra_type not in submission_transformers:\n raise ValueError('unknown type `{}` for extra {} of {}'.format(extra_type, extra_name, query_name))\n\n 
transformer_factory = extra_transformers.get(\n extra_type, submission_transformers.get(extra_type)\n ) # type: TransformerFactory\n\n extra_source = extra.get('source')\n if extra_type in submission_transformers:\n if not extra_source:\n raise ValueError('field `source` for extra {} of {} is required'.format(extra_name, query_name))\n\n modifiers = {key: value for key, value in extra.items() if key not in ('name', 'type', 'source')}\n else:\n modifiers = {key: value for key, value in extra.items() if key not in ('name', 'type')}\n modifiers['sources'] = sources\n\n try:\n transformer = transformer_factory(submission_transformers, extra_name, **modifiers)\n except Exception as e:\n error = 'error compiling type `{}` for extra {} of {}: {}'.format(extra_type, extra_name, query_name, e)\n\n raise_from(type(e)(error), None)\n else:\n if extra_type in submission_transformers:\n transformer = create_extra_transformer(transformer, extra_source)\n\n extra_data.append((extra_name, transformer))\n\n collection_interval = self.query_data.get('collection_interval')\n if collection_interval is not None:\n if not isinstance(collection_interval, (int, float)):\n raise ValueError('field `collection_interval` for {} must be a number'.format(query_name))\n elif int(collection_interval) <= 0:\n raise ValueError(\n 'field `collection_interval` for {} must be a positive number after rounding'.format(query_name)\n )\n collection_interval = int(collection_interval)\n\n self.name = query_name\n self.query = query\n self.column_transformers = tuple(column_data)\n self.extra_transformers = tuple(extra_data)\n self.base_tags = tags\n self.collection_interval = collection_interval\n self.metric_name_raw = metric_prefix is not None\n del self.query_data\n\n def should_execute(self):\n '''\n Check if the query should be executed based on the collection interval.\n\n :return: True if the query should be executed, False otherwise.\n '''\n if self.collection_interval is None:\n # if the 
collection interval is None, the query should always be executed.\n return True\n\n now = get_timestamp()\n if self.__last_execution_time is None or now - self.__last_execution_time >= self.collection_interval:\n # if the last execution time is None (the query has never been executed),\n # if the time since the last execution is greater than or equal to the collection interval,\n # the query should be executed.\n self.__last_execution_time = now\n return True\n\n return False\n
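The note about `collection_interval` is easier to follow with numbers. This sketch re-implements the `should_execute` logic in a small hypothetical class, `IntervalGate`, that takes the current time as an argument instead of calling `get_timestamp()`. The 40-second query interval and 15-second check interval are example values.

```python
class IntervalGate:
    """Hypothetical re-implementation of the should_execute() timing logic."""

    def __init__(self, collection_interval=None):
        self.collection_interval = collection_interval
        self._last_execution_time = None

    def should_execute(self, now):
        if self.collection_interval is None:
            # No interval configured: run on every check run.
            return True
        if self._last_execution_time is None or now - self._last_execution_time >= self.collection_interval:
            # Never executed, or enough time has passed since the last execution.
            self._last_execution_time = now
            return True
        return False


gate = IntervalGate(collection_interval=40)
# Simulate check runs at t = 0, 15, 30, ..., 90 seconds.
executed_at = [t for t in range(0, 91, 15) if gate.should_execute(t)]
```

The query runs at 0, 45 and 90 seconds rather than at 0, 40 and 80, because it can only run on the first check run after the interval has elapsed.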
Parameters:
Name: query_data. Type: Dict[str, Any]. Default: required.
Description: The query data to run the query. It should contain the following fields:
- name (str): The name of the query.
- query (str): The query to run.
- columns (List[Dict[str, Any]]): Each column should contain the following fields:
  - name (str): The name of the column.
  - type (str): The type of the column.
  - (Optional) Any other field that the column transformer for the type requires.
- (Optional) extras (List[Dict[str, Any]]): Each extra transformer should contain the following fields:
  - name (str): The name of the extra transformer.
  - type (str): The type of the extra transformer.
  - (Optional) Any other field that the extra transformer for the type requires.
- (Optional) tags (List[str]): The tags to add to the query result.
- (Optional) collection_interval (int): The collection interval (in seconds) of the query. Note: If collection_interval is None, the query will be run every check run. If the collection interval is less than the check collection interval, the query will be run every check run. If the collection interval is greater than the check collection interval, the query will NOT BE RUN exactly at the collection interval. The query will be run at the next check run after the collection interval has passed.
- (Optional) metric_prefix (str): The prefix to add to the metric name. Note: If the metric prefix is None, the default metric prefix <INTEGRATION>. will be used.
Source code in datadog_checks_base/datadog_checks/base/utils/db/query.py
def __init__(self, query_data):\n '''\n Parameters:\n query_data (Dict[str, Any]): The query data to run the query. It should contain the following fields:\n - name (str): The name of the query.\n - query (str): The query to run.\n - columns (List[Dict[str, Any]]): Each column should contain the following fields:\n - name (str): The name of the column.\n - type (str): The type of the column.\n - (Optional) Any other field that the column transformer for the type requires.\n - (Optional) extras (List[Dict[str, Any]]): Each extra transformer should contain the following fields:\n - name (str): The name of the extra transformer.\n - type (str): The type of the extra transformer.\n - (Optional) Any other field that the extra transformer for the type requires.\n - (Optional) tags (List[str]): The tags to add to the query result.\n - (Optional) collection_interval (int): The collection interval (in seconds) of the query.\n Note:\n If collection_interval is None, the query will be run every check run.\n If the collection interval is less than check collection interval,\n the query will be run every check run.\n If the collection interval is greater than check collection interval,\n the query will NOT BE RUN exactly at the collection interval.\n The query will be run at the next check run after the collection interval has passed.\n - (Optional) metric_prefix (str): The prefix to add to the metric name.\n Note: If the metric prefix is None, the default metric prefix `<INTEGRATION>.` will be used.\n '''\n # Contains the data to fill the rest of the attributes\n self.query_data = deepcopy(query_data or {}) # type: Dict[str, Any]\n self.name = None # type: str\n # The actual query\n self.query = None # type: str\n # Contains a mapping of column_name -> column_type, transformer\n self.column_transformers = None # type: Tuple[Tuple[str, Tuple[str, Transformer]]]\n # These transformers are used to collect extra metrics calculated from the query result\n self.extra_transformers = 
None # type: List[Tuple[str, Transformer]]\n # Contains the tags defined in query_data, more tags can be added later from the query result\n self.base_tags = None # type: List[str]\n # The collection interval (in seconds) of the query. If None, the query will be run every check run.\n self.collection_interval = None # type: int\n # The last time the query was executed. If None, the query has never been executed.\n # This is only used when the collection_interval is not None.\n self.__last_execution_time = None # type: float\n # whether to ignore any defined namespace prefix. True when `metric_prefix` is defined.\n self.metric_name_raw = False # type: bool\n
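For orientation, a `query_data` dict that follows the fields listed in the docstring might look like the sketch below. The query text, table name, column names and tag values are invented for this example, and `missing_required_fields` is a hypothetical helper that only mirrors the required-field checks; the real validation happens when the query is compiled.

```python
# Example query_data following the documented fields (all values are illustrative).
query_data = {
    'name': 'connection stats',
    'query': 'SELECT db_name, active FROM connection_stats',
    'columns': [
        # One entry per selected column, in the same order as the SELECT list.
        {'name': 'db', 'type': 'tag'},
        {'name': 'connections.active', 'type': 'gauge'},
    ],
    'tags': ['env:dev'],          # optional
    'collection_interval': 60,    # optional, in seconds
}


def missing_required_fields(data):
    # `name`, `query` and `columns` are the fields the docstring lists without "(Optional)".
    return [field for field in ('name', 'query', 'columns') if not data.get(field)]


missing = missing_required_fields(query_data)
missing_in_partial = missing_required_fields({'name': 'incomplete'})
```

With the library installed, such a dict would be passed to `Query(query_data)` and compiled through `QueryManager.compile_queries`, as the docstrings in this section describe.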
This idempotent method will be called by QueryManager.compile_queries so you should never need to call it directly.
Source code in datadog_checks_base/datadog_checks/base/utils/db/query.py
def compile(\n self,\n column_transformers, # type: Dict[str, TransformerFactory]\n extra_transformers, # type: Dict[str, TransformerFactory]\n):\n # type: (...) -> None\n\n \"\"\"\n This idempotent method will be called by `QueryManager.compile_queries` so you\n should never need to call it directly.\n \"\"\"\n # Check for previous compilation\n if self.name is not None:\n return\n\n query_name = self.query_data.get('name')\n if not query_name:\n raise ValueError('query field `name` is required')\n elif not isinstance(query_name, str):\n raise ValueError('query field `name` must be a string')\n\n metric_prefix = self.query_data.get('metric_prefix')\n if metric_prefix is not None:\n if not isinstance(metric_prefix, str):\n raise ValueError('field `metric_prefix` for {} must be a string'.format(query_name))\n elif not metric_prefix:\n raise ValueError('field `metric_prefix` for {} must not be empty'.format(query_name))\n\n query = self.query_data.get('query')\n if not query:\n raise ValueError('field `query` for {} is required'.format(query_name))\n elif query_name.startswith('custom query #') and not isinstance(query, str):\n raise ValueError('field `query` for {} must be a string'.format(query_name))\n\n columns = self.query_data.get('columns')\n if not columns:\n raise ValueError('field `columns` for {} is required'.format(query_name))\n elif not isinstance(columns, list):\n raise ValueError('field `columns` for {} must be a list'.format(query_name))\n\n tags = self.query_data.get('tags', [])\n if tags is not None and not isinstance(tags, list):\n raise ValueError('field `tags` for {} must be a list'.format(query_name))\n\n # Keep track of all defined names\n sources = {}\n\n column_data = []\n for i, column in enumerate(columns, 1):\n # Columns can be ignored via configuration.\n if not column:\n column_data.append((None, None))\n continue\n elif not isinstance(column, dict):\n raise ValueError('column #{} of {} is not a mapping'.format(i, query_name))\n\n 
column_name = column.get('name')\n if not column_name:\n raise ValueError('field `name` for column #{} of {} is required'.format(i, query_name))\n elif not isinstance(column_name, str):\n raise ValueError('field `name` for column #{} of {} must be a string'.format(i, query_name))\n elif column_name in sources:\n raise ValueError(\n 'the name {} of {} was already defined in {} #{}'.format(\n column_name, query_name, sources[column_name]['type'], sources[column_name]['index']\n )\n )\n\n sources[column_name] = {'type': 'column', 'index': i}\n\n column_type = column.get('type')\n if not column_type:\n raise ValueError('field `type` for column {} of {} is required'.format(column_name, query_name))\n elif not isinstance(column_type, str):\n raise ValueError('field `type` for column {} of {} must be a string'.format(column_name, query_name))\n elif column_type == 'source':\n column_data.append((column_name, (None, None)))\n continue\n elif column_type not in column_transformers:\n raise ValueError('unknown type `{}` for column {} of {}'.format(column_type, column_name, query_name))\n\n __column_type_is_tag = column_type in ('tag', 'tag_list', 'tag_not_null')\n modifiers = {key: value for key, value in column.items() if key not in ('name', 'type')}\n\n try:\n if not __column_type_is_tag and metric_prefix:\n # if metric_prefix is defined, we prepend it to the column name\n column_name = \"{}.{}\".format(metric_prefix, column_name)\n transformer = column_transformers[column_type](column_transformers, column_name, **modifiers)\n except Exception as e:\n error = 'error compiling type `{}` for column {} of {}: {}'.format(\n column_type, column_name, query_name, e\n )\n\n # Prepend helpful error text.\n #\n # When an exception is raised in the context of another one, both will be printed. To avoid\n # this we set the context to None. 
https://www.python.org/dev/peps/pep-0409/\n raise_from(type(e)(error), None)\n else:\n if __column_type_is_tag:\n column_data.append((column_name, (column_type, transformer)))\n else:\n # All these would actually submit data. As that is the default case, we represent it as\n # a reference to None since if we use e.g. `value` it would never be checked anyway.\n column_data.append((column_name, (None, transformer)))\n\n submission_transformers = column_transformers.copy() # type: Dict[str, Transformer]\n submission_transformers.pop('tag')\n submission_transformers.pop('tag_list')\n submission_transformers.pop('tag_not_null')\n\n extras = self.query_data.get('extras', []) # type: List[Dict[str, Any]]\n if not isinstance(extras, list):\n raise ValueError('field `extras` for {} must be a list'.format(query_name))\n\n extra_data = [] # type: List[Tuple[str, Transformer]]\n for i, extra in enumerate(extras, 1):\n if not isinstance(extra, dict):\n raise ValueError('extra #{} of {} is not a mapping'.format(i, query_name))\n\n extra_type = extra.get('type') # type: str\n extra_name = extra.get('name') # type: str\n if extra_type == 'log':\n # The name is unused\n extra_name = 'log'\n elif not extra_name:\n raise ValueError('field `name` for extra #{} of {} is required'.format(i, query_name))\n elif not isinstance(extra_name, str):\n raise ValueError('field `name` for extra #{} of {} must be a string'.format(i, query_name))\n elif extra_name in sources:\n raise ValueError(\n 'the name {} of {} was already defined in {} #{}'.format(\n extra_name, query_name, sources[extra_name]['type'], sources[extra_name]['index']\n )\n )\n\n sources[extra_name] = {'type': 'extra', 'index': i}\n\n if not extra_type:\n if 'expression' in extra:\n extra_type = 'expression'\n else:\n raise ValueError('field `type` for extra {} of {} is required'.format(extra_name, query_name))\n elif not isinstance(extra_type, str):\n raise ValueError('field `type` for extra {} of {} must be a 
string'.format(extra_name, query_name))\n elif extra_type not in extra_transformers and extra_type not in submission_transformers:\n raise ValueError('unknown type `{}` for extra {} of {}'.format(extra_type, extra_name, query_name))\n\n transformer_factory = extra_transformers.get(\n extra_type, submission_transformers.get(extra_type)\n ) # type: TransformerFactory\n\n extra_source = extra.get('source')\n if extra_type in submission_transformers:\n if not extra_source:\n raise ValueError('field `source` for extra {} of {} is required'.format(extra_name, query_name))\n\n modifiers = {key: value for key, value in extra.items() if key not in ('name', 'type', 'source')}\n else:\n modifiers = {key: value for key, value in extra.items() if key not in ('name', 'type')}\n modifiers['sources'] = sources\n\n try:\n transformer = transformer_factory(submission_transformers, extra_name, **modifiers)\n except Exception as e:\n error = 'error compiling type `{}` for extra {} of {}: {}'.format(extra_type, extra_name, query_name, e)\n\n raise_from(type(e)(error), None)\n else:\n if extra_type in submission_transformers:\n transformer = create_extra_transformer(transformer, extra_source)\n\n extra_data.append((extra_name, transformer))\n\n collection_interval = self.query_data.get('collection_interval')\n if collection_interval is not None:\n if not isinstance(collection_interval, (int, float)):\n raise ValueError('field `collection_interval` for {} must be a number'.format(query_name))\n elif int(collection_interval) <= 0:\n raise ValueError(\n 'field `collection_interval` for {} must be a positive number after rounding'.format(query_name)\n )\n collection_interval = int(collection_interval)\n\n self.name = query_name\n self.query = query\n self.column_transformers = tuple(column_data)\n self.extra_transformers = tuple(extra_data)\n self.base_tags = tags\n self.collection_interval = collection_interval\n self.metric_name_raw = metric_prefix is not None\n del self.query_data\n
Note: This class is not in charge of opening or closing connections, just running queries.
Source code in datadog_checks_base/datadog_checks/base/utils/db/core.py
class QueryManager(QueryExecutor):\n \"\"\"\n This class is in charge of running any number of `Query` instances for a single Check instance.\n\n You will most often see it created during Check initialization like this:\n\n ```python\n self._query_manager = QueryManager(\n self,\n self.execute_query,\n queries=[\n queries.SomeQuery1,\n queries.SomeQuery2,\n queries.SomeQuery3,\n queries.SomeQuery4,\n queries.SomeQuery5,\n ],\n tags=self.instance.get('tags', []),\n error_handler=self._error_sanitizer,\n )\n self.check_initializations.append(self._query_manager.compile_queries)\n ```\n\n Note: This class is not in charge of opening or closing connections, just running queries.\n \"\"\"\n\n def __init__(\n self,\n check, # type: AgentCheck\n executor, # type: QueriesExecutor\n queries=None, # type: List[Dict[str, Any]]\n tags=None, # type: List[str]\n error_handler=None, # type: Callable[[str], str]\n hostname=None, # type: str\n ): # type: (...) -> QueryManager\n \"\"\"\n - **check** (_AgentCheck_) - an instance of a Check\n - **executor** (_callable_) - a callable accepting a `str` query as its sole argument and returning\n a sequence representing either the full result set or an iterator over the result set\n - **queries** (_List[Dict]_) - a list of queries in dict format\n - **tags** (_List[str]_) - a list of tags to associate with every submission\n - **error_handler** (_callable_) - a callable accepting a `str` error as its sole argument and returning\n a sanitized string, useful for scrubbing potentially sensitive information libraries emit\n \"\"\"\n super(QueryManager, self).__init__(\n executor=executor,\n submitter=check,\n queries=queries,\n tags=tags,\n error_handler=error_handler,\n hostname=hostname,\n logger=check.log,\n )\n self.check = check # type: AgentCheck\n\n only_custom_queries = is_affirmative(self.check.instance.get('only_custom_queries', False)) # type: bool\n custom_queries = list(self.check.instance.get('custom_queries', [])) # type: 
List[str]\n use_global_custom_queries = self.check.instance.get('use_global_custom_queries', True) # type: str\n\n # Handle overrides\n if use_global_custom_queries == 'extend':\n custom_queries.extend(self.check.init_config.get('global_custom_queries', []))\n elif (\n not custom_queries\n and 'global_custom_queries' in self.check.init_config\n and is_affirmative(use_global_custom_queries)\n ):\n custom_queries = self.check.init_config.get('global_custom_queries', [])\n\n # Override statement queries if only running custom queries\n if only_custom_queries:\n self.queries = []\n\n # Deduplicate\n for i, custom_query in enumerate(iter_unique(custom_queries), 1):\n query = Query(custom_query)\n query.query_data.setdefault('name', 'custom query #{}'.format(i))\n self.queries.append(query)\n\n if len(self.queries) == 0:\n self.logger.warning('QueryManager initialized with no query')\n\n def execute(self, extra_tags=None):\n # This needs to stay here b/c when we construct a QueryManager in a check's __init__\n # there is no check ID at that point\n self.logger = self.check.log\n\n return super(QueryManager, self).execute(extra_tags)\n
executor (callable) - a callable accepting a str query as its sole argument and returning a sequence representing either the full result set or an iterator over the result set
queries (List[Dict]) - a list of queries in dict format
tags (List[str]) - a list of tags to associate with every submission
error_handler (callable) - a callable accepting a str error as its sole argument and returning a sanitized string, useful for scrubbing potentially sensitive information libraries emit
Source code in datadog_checks_base/datadog_checks/base/utils/db/core.py
def __init__(\n self,\n check, # type: AgentCheck\n executor, # type: QueriesExecutor\n queries=None, # type: List[Dict[str, Any]]\n tags=None, # type: List[str]\n error_handler=None, # type: Callable[[str], str]\n hostname=None, # type: str\n): # type: (...) -> QueryManager\n \"\"\"\n - **check** (_AgentCheck_) - an instance of a Check\n - **executor** (_callable_) - a callable accepting a `str` query as its sole argument and returning\n a sequence representing either the full result set or an iterator over the result set\n - **queries** (_List[Dict]_) - a list of queries in dict format\n - **tags** (_List[str]_) - a list of tags to associate with every submission\n - **error_handler** (_callable_) - a callable accepting a `str` error as its sole argument and returning\n a sanitized string, useful for scrubbing potentially sensitive information libraries emit\n \"\"\"\n super(QueryManager, self).__init__(\n executor=executor,\n submitter=check,\n queries=queries,\n tags=tags,\n error_handler=error_handler,\n hostname=hostname,\n logger=check.log,\n )\n self.check = check # type: AgentCheck\n\n only_custom_queries = is_affirmative(self.check.instance.get('only_custom_queries', False)) # type: bool\n custom_queries = list(self.check.instance.get('custom_queries', [])) # type: List[str]\n use_global_custom_queries = self.check.instance.get('use_global_custom_queries', True) # type: str\n\n # Handle overrides\n if use_global_custom_queries == 'extend':\n custom_queries.extend(self.check.init_config.get('global_custom_queries', []))\n elif (\n not custom_queries\n and 'global_custom_queries' in self.check.init_config\n and is_affirmative(use_global_custom_queries)\n ):\n custom_queries = self.check.init_config.get('global_custom_queries', [])\n\n # Override statement queries if only running custom queries\n if only_custom_queries:\n self.queries = []\n\n # Deduplicate\n for i, custom_query in enumerate(iter_unique(custom_queries), 1):\n query = Query(custom_query)\n 
query.query_data.setdefault('name', 'custom query #{}'.format(i))\n self.queries.append(query)\n\n if len(self.queries) == 0:\n self.logger.warning('QueryManager initialized with no query')\n
"},{"location":"base/databases/#datadog_checks.base.utils.db.core.QueryManager.execute","title":"execute(extra_tags=None)","text":"Source code in datadog_checks_base/datadog_checks/base/utils/db/core.py
def execute(self, extra_tags=None):\n # This needs to stay here b/c when we construct a QueryManager in a check's __init__\n # there is no check ID at that point\n self.logger = self.check.log\n\n return super(QueryManager, self).execute(extra_tags)\n
For example, say you want to collect the fields named foo and bar. Typically, they would be stored like:
| foo | bar |
| --- | --- |
| 4 | 2 |
and would be queried like:
SELECT foo, bar FROM ...\n
Often, you will instead find data stored in the following format:
| metric | value |
| ------ | ----- |
| foo | 4 |
| bar | 2 |
and would be queried like:
SELECT metric, value FROM ...\n
In this case, the metric column stores the name to match on, and the corresponding value is stored in a separate column.
The required items modifier is a mapping of matched names to column data values. Consider the values to be exactly the same as the entries in the columns top level field. You must also define a source modifier either for this transformer itself or in the values of items (which will take precedence). The source will be treated as the value of the match.
bar - test.bar.total as a gauge and test.bar.count as a monotonic_count, both with a value of 5
baz - nothing since it was not defined as a match item
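The dispatch described above can be sketched in plain Python. This is only an illustration under stated assumptions: `gauge`, `monotonic_gauge`, `compiled_items`, and `submissions` are hypothetical stand-ins written for this example and are not the helpers in `datadog_checks_base`. The sketch replays the example result set and records what would be submitted.

```python
# Standalone illustration of the `match` transformer's dispatch.
# All helper names here are hypothetical, not part of datadog_checks_base.
submissions = []


def gauge(name):
    def submit(value):
        submissions.append((name, 'gauge', value))

    return submit


def monotonic_gauge(name):
    # Submits a gauge (`<name>.total`) and a monotonic_count (`<name>.count`).
    def submit(value):
        submissions.append((name + '.total', 'gauge', value))
        submissions.append((name + '.count', 'monotonic_count', value))

    return submit


# Matched name -> (source to read the value from, submission function).
# `foo` defines its own source; `bar` falls back to the transformer-level `value1`.
compiled_items = {
    'foo': ('value2', gauge('test.foo')),
    'bar': ('value1', monotonic_gauge('test.bar')),
}


def match(sources, value):
    # Names that were not declared under `items` are silently ignored.
    if value in compiled_items:
        source, submit = compiled_items[value]
        submit(sources[source])


rows = [(1, 2, 'foo'), (3, 4, 'baz'), (5, 6, 'bar')]
for source1, source2, metric in rows:
    match({'value1': source1, 'value2': source2}, metric)

for submission in submissions:
    print(submission)
```

Note that the per-item `source` wins over the transformer-level one, which is why `foo` reads from `value2` while `bar` reads from `value1`.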
Source code in datadog_checks_base/datadog_checks/base/utils/db/transform.py
def get_match(transformers, column_name, **modifiers):\n # type: (Dict[str, Transformer], str, Any) -> Transformer\n \"\"\"\n This is used for querying unstructured data.\n\n For example, say you want to collect the fields named `foo` and `bar`. Typically, they would be stored like:\n\n | foo | bar |\n | --- | --- |\n | 4 | 2 |\n\n and would be queried like:\n\n ```sql\n SELECT foo, bar FROM ...\n ```\n\n Often, you will instead find data stored in the following format:\n\n | metric | value |\n | ------ | ----- |\n | foo | 4 |\n | bar | 2 |\n\n and would be queried like:\n\n ```sql\n SELECT metric, value FROM ...\n ```\n\n In this case, the `metric` column stores the name with which to match on and its `value` is\n stored in a separate column.\n\n The required `items` modifier is a mapping of matched names to column data values. Consider the values\n to be exactly the same as the entries in the `columns` top level field. You must also define a `source`\n modifier either for this transformer itself or in the values of `items` (which will take precedence).\n The source will be treated as the value of the match.\n\n Say this is your configuration:\n\n ```yaml\n query: SELECT source1, source2, metric FROM TABLE\n columns:\n - name: value1\n type: source\n - name: value2\n type: source\n - name: metric_name\n type: match\n source: value1\n items:\n foo:\n name: test.foo\n type: gauge\n source: value2\n bar:\n name: test.bar\n type: monotonic_gauge\n ```\n\n and the result set is:\n\n | source1 | source2 | metric |\n | ------- | ------- | ------ |\n | 1 | 2 | foo |\n | 3 | 4 | baz |\n | 5 | 6 | bar |\n\n Here's what would be submitted:\n\n - `foo` - `test.foo` as a `gauge` with a value of `2`\n - `bar` - `test.bar.total` as a `gauge` and `test.bar.count` as a `monotonic_count`, both with a value of `5`\n - `baz` - nothing since it was not defined as a match item\n \"\"\"\n # Do work in a separate function to avoid having to `del` a bunch of variables\n compiled_items = 
_compile_match_items(transformers, modifiers) # type: Dict[str, Tuple[str, Transformer]]\n\n def match(sources, value, **kwargs):\n # type: (Dict[str, Any], str, Dict[str, Any]) -> None\n if value in compiled_items:\n source, transformer = compiled_items[value] # type: str, Transformer\n transformer(sources, sources[source], **kwargs)\n\n return match\n
Send the result as a percentage of time since the last check run as a rate.
For example, say the result is an ever-increasing counter representing the total time spent pausing for garbage collection since start up. That number by itself is quite useless, but as a percentage of time spent pausing since the previous collection interval it becomes a useful metric.
There is one required parameter called scale that indicates the unit of time in which the result is expressed. Valid values are:
second
millisecond
microsecond
nanosecond
You may also define the unit as an integer number of parts per second, e.g. millisecond is equivalent to 1000.
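The arithmetic behind this transformer can be sketched as follows. This is a minimal illustration, not the library's implementation: `TIME_UNITS`, `to_temporal_percent`, and `percent_of_interval` are names invented for this example, and the rate calculation that the Agent normally performs between check runs is done by hand here.

```python
# Parts per second for each named unit (hypothetical mapping for this sketch).
TIME_UNITS = {
    'second': 1,
    'millisecond': 1000,
    'microsecond': 1000000,
    'nanosecond': 1000000000,
}


def to_temporal_percent(total_time, scale):
    # Convert a cumulative time counter into "percent-seconds".
    return total_time / scale * 100


def percent_of_interval(previous, current, elapsed_seconds, scale):
    # Emulates submitting the converted values as a rate:
    # the change in the converted counter divided by the elapsed seconds.
    if isinstance(scale, str):
        scale = TIME_UNITS[scale.lower()]
    delta = to_temporal_percent(current, scale) - to_temporal_percent(previous, scale)
    return delta / elapsed_seconds


# A GC pause counter grew from 1500 ms to 4500 ms across a 15 second interval,
# so 3 of the 15 seconds were spent pausing.
print(percent_of_interval(1500, 4500, 15, 'millisecond'))
```

Passing `1000` instead of `'millisecond'` gives the same result, since both describe the same number of parts per second.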
Source code in datadog_checks_base/datadog_checks/base/utils/db/transform.py
def get_temporal_percent(transformers, column_name, **modifiers):\n # type: (Dict[str, Transformer], str, Any) -> Transformer\n \"\"\"\n Send the result as percentage of time since the last check run as a `rate`.\n\n For example, say the result is a forever increasing counter representing the total time spent pausing for\n garbage collection since start up. That number by itself is quite useless, but as a percentage of time spent\n pausing since the previous collection interval it becomes a useful metric.\n\n There is one required parameter called `scale` that indicates what unit of time the result should be considered.\n Valid values are:\n\n - `second`\n - `millisecond`\n - `microsecond`\n - `nanosecond`\n\n You may also define the unit as an integer number of parts compared to seconds e.g. `millisecond` is\n equivalent to `1000`.\n \"\"\"\n scale = modifiers.pop('scale', None)\n if scale is None:\n raise ValueError('the `scale` parameter is required')\n\n if isinstance(scale, str):\n scale = constants.TIME_UNITS.get(scale.lower())\n if scale is None:\n raise ValueError(\n 'the `scale` parameter must be one of: {}'.format(' | '.join(sorted(constants.TIME_UNITS)))\n )\n elif not isinstance(scale, int):\n raise ValueError(\n 'the `scale` parameter must be an integer representing parts of a second e.g. 1000 for millisecond'\n )\n\n rate = transformers['rate'](transformers, column_name, **modifiers) # type: Callable\n\n def temporal_percent(_, value, **kwargs):\n # type: (List, str, Dict[str, Any]) -> None\n rate(_, total_time_to_temporal_percent(float(value), scale=scale), **kwargs)\n\n return temporal_percent\n
Send the number of seconds elapsed from a time in the past as a gauge.
For example, if the result is an instance of datetime.datetime representing 5 seconds ago, then this would submit with a value of 5.
The optional modifier format indicates what format the result is in. By default it is native, assuming the underlying library provides timestamps as datetime objects.
If the value is a UNIX timestamp you can set the format modifier to unix_time.
If the value is a string representation of a date, you must provide the expected timestamp format using the supported codes.
Example:
columns:\n - name: time_since_x\n type: time_elapsed\n format: native # default value and can be omitted\n - name: time_since_y\n type: time_elapsed\n format: unix_time\n - name: time_since_z\n type: time_elapsed\n format: \"%d/%m/%Y %H:%M:%S\"\n
Note
The code %z (lower case) is not supported on Windows.
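The three format modes can be sketched with the standard library alone. This is an illustration under assumptions: `seconds_elapsed` and its injectable `now` argument are hypothetical and exist only to make the example deterministic, and naive datetimes are assumed to be in UTC.

```python
from datetime import datetime, timezone


def seconds_elapsed(value, time_format='native', now=None):
    # `now` must be a timezone-aware datetime; it defaults to the current time.
    now = now or datetime.now(timezone.utc)

    if time_format == 'unix_time':
        return now.timestamp() - value

    if time_format != 'native':
        # Any other format is treated as a strptime pattern.
        value = datetime.strptime(value, time_format)

    if value.tzinfo is None:
        # Assumption for this sketch: naive datetimes are in UTC.
        value = value.replace(tzinfo=timezone.utc)

    return (now - value).total_seconds()


fixed_now = datetime(2024, 1, 1, 12, 0, 5, tzinfo=timezone.utc)

print(seconds_elapsed(datetime(2024, 1, 1, 12, 0, 0), now=fixed_now))
print(seconds_elapsed(fixed_now.timestamp() - 5, 'unix_time', now=fixed_now))
print(seconds_elapsed('01/01/2024 12:00:00', '%d/%m/%Y %H:%M:%S', now=fixed_now))
```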
Source code in datadog_checks_base/datadog_checks/base/utils/db/transform.py
def get_time_elapsed(transformers, column_name, **modifiers):\n # type: (Dict[str, Transformer], str, Any) -> Transformer\n \"\"\"\n Send the number of seconds elapsed from a time in the past as a `gauge`.\n\n For example, if the result is an instance of\n [datetime.datetime](https://docs.python.org/3/library/datetime.html#datetime.datetime) representing 5 seconds ago,\n then this would submit with a value of `5`.\n\n The optional modifier `format` indicates what format the result is in. By default it is `native`, assuming the\n underlying library provides timestamps as `datetime` objects.\n\n If the value is a UNIX timestamp you can set the `format` modifier to `unix_time`.\n\n If the value is a string representation of a date, you must provide the expected timestamp format using the\n [supported codes](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes).\n\n Example:\n\n ```yaml\n columns:\n - name: time_since_x\n type: time_elapsed\n format: native # default value and can be omitted\n - name: time_since_y\n type: time_elapsed\n format: unix_time\n - name: time_since_z\n type: time_elapsed\n format: \"%d/%m/%Y %H:%M:%S\"\n ```\n !!! 
note\n The code `%z` (lower case) is not supported on Windows.\n \"\"\"\n time_format = modifiers.pop('format', 'native')\n if not isinstance(time_format, str):\n raise ValueError('the `format` parameter must be a string')\n\n gauge = transformers['gauge'](transformers, column_name, **modifiers)\n\n if time_format == 'native':\n\n def time_elapsed(_, value, **kwargs):\n # type: (List, str, Dict[str, Any]) -> None\n value = ensure_aware_datetime(value)\n gauge(_, (datetime.now(value.tzinfo) - value).total_seconds(), **kwargs)\n\n elif time_format == 'unix_time':\n\n def time_elapsed(_, value, **kwargs):\n gauge(_, time.time() - value, **kwargs)\n\n else:\n\n def time_elapsed(_, value, **kwargs):\n # type: (List, str, Dict[str, Any]) -> None\n value = ensure_aware_datetime(datetime.strptime(value, time_format))\n gauge(_, (datetime.now(value.tzinfo) - value).total_seconds(), **kwargs)\n\n return time_elapsed\n
The required modifier status_map is a mapping of values to statuses. Valid statuses include:
OK
WARNING
CRITICAL
UNKNOWN
Any encountered values that are not defined will be sent as UNKNOWN.
In addition, you can pass a message modifier containing placeholders (in Python's str.format syntax) that reference other column names from the same query; the formatted message is attached to the service check. No message is sent when the status is OK.
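A minimal sketch of how status_map and message interact is shown below. It is not the library's code: `compile_service_check` and the column names `state` and `reason` are hypothetical, and statuses are assumed to be written in upper case.

```python
# Numeric service check statuses as used by Datadog.
STATUSES = {'OK': 0, 'WARNING': 1, 'CRITICAL': 2, 'UNKNOWN': 3}


def compile_service_check(status_map, message=None):
    # Resolve status names once, up front.
    compiled = {value: STATUSES[status] for value, status in status_map.items()}

    def service_check(sources, value):
        # Values missing from the map are reported as UNKNOWN.
        status = compiled.get(value, STATUSES['UNKNOWN'])
        if not message or status == STATUSES['OK']:
            text = None
        else:
            # Placeholders are filled from the other columns of the same row.
            text = message.format(**sources)
        return status, text

    return service_check


check = compile_service_check(
    {'up': 'OK', 'degraded': 'WARNING', 'down': 'CRITICAL'},
    message='state={state} reason={reason}',
)

print(check({'state': 'up', 'reason': 'n/a'}, 'up'))
print(check({'state': 'down', 'reason': 'disk full'}, 'down'))
print(check({'state': 'weird', 'reason': '?'}, 'weird'))
```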
Source code in datadog_checks_base/datadog_checks/base/utils/db/transform.py
def get_service_check(transformers, column_name, **modifiers):\n # type: (Dict[str, Transformer], str, Any) -> Transformer\n \"\"\"\n Submit a service check.\n\n The required modifier `status_map` is a mapping of values to statuses. Valid statuses include:\n\n - `OK`\n - `WARNING`\n - `CRITICAL`\n - `UNKNOWN`\n\n Any encountered values that are not defined will be sent as `UNKNOWN`.\n\n In addition, a `message` modifier can be passed which can contain placeholders\n (based on Python's str.format) for other column names from the same query to add a message\n dynamically to the service_check.\n \"\"\"\n # Do work in a separate function to avoid having to `del` a bunch of variables\n status_map = _compile_service_check_statuses(modifiers)\n message_field = modifiers.pop('message', None)\n\n service_check_method = transformers['__service_check'](transformers, column_name, **modifiers) # type: Callable\n\n def service_check(sources, value, **kwargs):\n # type: (List, str, Dict[str, Any]) -> None\n check_status = status_map.get(value, ServiceCheck.UNKNOWN)\n if not message_field or check_status == ServiceCheck.OK:\n message = None\n else:\n message = message_field.format(**sources)\n\n service_check_method(sources, check_status, message=message, **kwargs)\n\n return service_check\n
Convert a column to a tag that will be used in every subsequent submission.
For example, if you named the column env and the column returned the value prod1, all submissions from that row will be tagged by env:prod1.
This also accepts an optional modifier called boolean that when set to true will transform the result to the string true or false. So for example if you named the column alive and the result was the number 0 the tag will be alive:false.
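The behavior can be sketched as follows. This is an illustration only: `make_tag` is a hypothetical name, and `is_truthy` is a simplified stand-in for the library's `is_affirmative` helper whose exact rules are assumed here.

```python
def is_truthy(value):
    # Simplified stand-in for `is_affirmative` (assumed rules for this sketch).
    if isinstance(value, str):
        return value.strip().lower() in ('yes', 'true', '1', 'y', 'on')
    return bool(value)


def make_tag(column_name, boolean=False):
    # Build the `name:{}` template once, then fill it in for each row.
    template = '{}:{{}}'.format(column_name)

    def tag(value):
        if boolean:
            value = str(is_truthy(value)).lower()
        return template.format(value)

    return tag


print(make_tag('env')('prod1'))
print(make_tag('alive', boolean=True)(0))
```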
Source code in datadog_checks_base/datadog_checks/base/utils/db/transform.py
def get_tag(transformers, column_name, **modifiers):\n # type: (Dict[str, Transformer], str, Any) -> Transformer\n \"\"\"\n Convert a column to a tag that will be used in every subsequent submission.\n\n For example, if you named the column `env` and the column returned the value `prod1`, all submissions\n from that row will be tagged by `env:prod1`.\n\n This also accepts an optional modifier called `boolean` that when set to `true` will transform the result\n to the string `true` or `false`. So for example if you named the column `alive` and the result was the\n number `0` the tag will be `alive:false`.\n \"\"\"\n template = '{}:{{}}'.format(column_name)\n boolean = is_affirmative(modifiers.pop('boolean', None))\n\n def tag(_, value, **kwargs):\n # type: (List, str, Dict[str, Any]) -> str\n if boolean:\n value = str(is_affirmative(value)).lower()\n\n return template.format(value)\n\n return tag\n
Convert a column to a list of tags that will be used in every submission.
Tag name is determined by column_name. The column value represents a list of values. It is expected to be either a list of strings, or a comma-separated string.
For example, if the column is named server_tag and the column returned the value us,primary, then all submissions for that row will be tagged by server_tag:us and server_tag:primary.
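As a small sketch of the splitting behavior (the name `make_tag_list` is hypothetical and used only for this example):

```python
def make_tag_list(column_name):
    template = '{}:{{}}'.format(column_name)

    def tag_list(value):
        # A comma-separated string is split and stripped; a list is used as-is.
        if isinstance(value, str):
            value = [v.strip() for v in value.split(',')]
        return [template.format(v) for v in value]

    return tag_list


server_tags = make_tag_list('server_tag')
print(server_tags('us, primary'))
print(server_tags(['eu', 'replica']))
```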
Source code in datadog_checks_base/datadog_checks/base/utils/db/transform.py
def get_tag_list(transformers, column_name, **modifiers):\n # type: (Dict[str, Transformer], str, Any) -> Transformer\n \"\"\"\n Convert a column to a list of tags that will be used in every submission.\n\n Tag name is determined by `column_name`. The column value represents a list of values. It is expected to be either\n a list of strings, or a comma-separated string.\n\n For example, if the column is named `server_tag` and the column returned the value `us,primary`, then all\n submissions for that row will be tagged by `server_tag:us` and `server_tag:primary`.\n \"\"\"\n template = '%s:{}' % column_name\n\n def tag_list(_, value, **kwargs):\n # type: (List, str, Dict[str, Any]) -> List[str]\n if isinstance(value, str):\n value = [v.strip() for v in value.split(',')]\n\n return [template.format(v) for v in value]\n\n return tag_list\n
then the extra metric disk.utilized would be sent as a gauge calculated as disk.used / disk.total * 100.
If the source of total is 0, then the submitted value will always be 0 too.
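The calculation, including the zero guard, can be sketched like this. `percent_of` is a hypothetical helper written for the example rather than the function used by the library.

```python
def percent_of(part, total):
    # Guard against division by zero: a zero total always yields 0.
    if not total:
        return 0.0
    return part / total * 100


# Values as they would appear in the row's sources.
sources = {'disk.total': 200, 'disk.used': 50}
print(percent_of(sources['disk.used'], sources['disk.total']))
print(percent_of(5, 0))
```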
Source code in datadog_checks_base/datadog_checks/base/utils/db/transform.py
def get_percent(transformers, name, **modifiers):\n # type: (Dict[str, Callable], str, Any) -> Transformer\n \"\"\"\n Send a percentage based on 2 sources as a `gauge`.\n\n The required modifiers are `part` and `total`.\n\n For example, if you have this configuration:\n\n ```yaml\n columns:\n - name: disk.total\n type: gauge\n - name: disk.used\n type: gauge\n extras:\n - name: disk.utilized\n type: percent\n part: disk.used\n total: disk.total\n ```\n\n then the extra metric `disk.utilized` would be sent as a `gauge` calculated as `disk.used / disk.total * 100`.\n\n If the source of `total` is `0`, then the submitted value will always be sent as `0` too.\n \"\"\"\n available_sources = modifiers.pop('sources')\n\n part = modifiers.pop('part', None)\n if part is None:\n raise ValueError('the `part` parameter is required')\n elif not isinstance(part, str):\n raise ValueError('the `part` parameter must be a string')\n elif part not in available_sources:\n raise ValueError('the `part` parameter `{}` is not an available source'.format(part))\n\n total = modifiers.pop('total', None)\n if total is None:\n raise ValueError('the `total` parameter is required')\n elif not isinstance(total, str):\n raise ValueError('the `total` parameter must be a string')\n elif total not in available_sources:\n raise ValueError('the `total` parameter `{}` is not an available source'.format(total))\n\n del available_sources\n gauge = transformers['gauge'](transformers, name, **modifiers)\n gauge = create_extra_transformer(gauge)\n\n def percent(sources, **kwargs):\n gauge(sources, compute_percent(sources[part], sources[total]), **kwargs)\n\n return percent\n
For brevity, if the expression attribute exists and type does not then it is assumed the type is expression. The submit_type can be any transformer and any extra options are passed down to it.
The result of every expression is stored, so in lieu of a submit_type the above example could also be written as:
Source code in datadog_checks_base/datadog_checks/base/utils/db/transform.py
def get_expression(transformers, name, **modifiers):\n # type: (Dict[str, Transformer], str, Dict[str, Any]) -> Transformer\n \"\"\"\n This allows the evaluation of a limited subset of Python syntax and built-in functions.\n\n ```yaml\n columns:\n - name: disk.total\n type: gauge\n - name: disk.used\n type: gauge\n extras:\n - name: disk.free\n expression: disk.total - disk.used\n submit_type: gauge\n ```\n\n For brevity, if the `expression` attribute exists and `type` does not then it is assumed the type is\n `expression`. The `submit_type` can be any transformer and any extra options are passed down to it.\n\n The result of every expression is stored, so in lieu of a `submit_type` the above example could also be written as:\n\n ```yaml\n columns:\n - name: disk.total\n type: gauge\n - name: disk.used\n type: gauge\n extras:\n - name: free\n expression: disk.total - disk.used\n - name: disk.free\n type: gauge\n source: free\n ```\n\n The order matters though, so for example the following will fail:\n\n ```yaml\n columns:\n - name: disk.total\n type: gauge\n - name: disk.used\n type: gauge\n extras:\n - name: disk.free\n type: gauge\n source: free\n - name: free\n expression: disk.total - disk.used\n ```\n\n since the source `free` does not yet exist.\n \"\"\"\n available_sources = modifiers.pop('sources')\n\n expression = modifiers.pop('expression', None)\n if expression is None:\n raise ValueError('the `expression` parameter is required')\n elif not isinstance(expression, str):\n raise ValueError('the `expression` parameter must be a string')\n elif not expression:\n raise ValueError('the `expression` parameter must not be empty')\n\n if not modifiers.pop('verbose', False):\n # Sort the sources in reverse order of length to prevent greedy matching\n available_sources = sorted(available_sources, key=lambda s: -len(s))\n\n # Escape special characters, mostly for the possible dots in metric names\n available_sources = list(map(re.escape, available_sources))\n\n # 
Finally, utilize the order by relying on the guarantees provided by the alternation operator\n available_sources = '|'.join(available_sources)\n\n expression = re.sub(\n SOURCE_PATTERN.format(available_sources),\n # Replace by the particular source that matched\n lambda match_obj: 'SOURCES[\"{}\"]'.format(match_obj.group(1)),\n expression,\n )\n\n expression = compile(expression, filename=name, mode='eval')\n\n del available_sources\n\n if 'submit_type' in modifiers:\n if modifiers['submit_type'] not in transformers:\n raise ValueError('unknown submit_type `{}`'.format(modifiers['submit_type']))\n\n submit_method = transformers[modifiers.pop('submit_type')](transformers, name, **modifiers) # type: Transformer\n submit_method = create_extra_transformer(submit_method) # type: Callable\n\n def execute_expression(sources, **kwargs):\n # type: (Dict[str, Any], Dict[str, Any]) -> float\n result = eval(expression, ALLOWED_GLOBALS, {'SOURCES': sources}) # type: float\n submit_method(sources, result, **kwargs)\n return result\n\n else:\n\n def execute_expression(sources, **kwargs):\n # type: (Dict[str, Any], Dict[str, Any]) -> Any\n return eval(expression, ALLOWED_GLOBALS, {'SOURCES': sources})\n\n return execute_expression\n
then a log will be sent with the following attributes:
message: value of the msg column
status: value of the level column
date: value of the time column
foo: value of the bar column
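The mapping from attributes to row values can be sketched as below. `build_log` is a hypothetical helper for this illustration, and the sample row values are invented.

```python
def build_log(attributes, sources, tags=None):
    # Each log attribute takes its value from the named source column.
    data = {attribute: sources[source] for attribute, source in attributes.items()}
    if tags:
        # Tags are joined into a single comma-separated `ddtags` attribute.
        data['ddtags'] = ','.join(tags)
    return data


attributes = {'message': 'msg', 'status': 'level', 'date': 'time', 'foo': 'bar'}
row = {'msg': 'disk almost full', 'level': 'warning', 'time': 1700000000, 'bar': 'baz'}

log = build_log(attributes, row, tags=['env:prod1', 'db:main'])
print(log)
```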
Source code in datadog_checks_base/datadog_checks/base/utils/db/transform.py
def get_log(transformers, name, **modifiers):\n # type: (Dict[str, Callable], str, Any) -> Transformer\n \"\"\"\n Send a log.\n\n The only required modifier is `attributes`.\n\n For example, if you have this configuration:\n\n ```yaml\n columns:\n - name: msg\n type: source\n - name: level\n type: source\n - name: time\n type: source\n - name: bar\n type: source\n extras:\n - type: log\n attributes:\n message: msg\n status: level\n date: time\n foo: bar\n ```\n\n then a log will be sent with the following attributes:\n\n - `message`: value of the `msg` column\n - `status`: value of the `level` column\n - `date`: value of the `time` column\n - `foo`: value of the `bar` column\n \"\"\"\n available_sources = modifiers.pop('sources')\n attributes = _compile_log_attributes(modifiers, available_sources)\n\n del available_sources\n send_log = transformers['__send_log'](transformers, **modifiers)\n send_log = create_extra_transformer(send_log)\n\n def log(sources, **kwargs):\n data = {attribute: sources[source] for attribute, source in attributes.items()}\n if kwargs['tags']:\n data['ddtags'] = ','.join(kwargs['tags'])\n\n send_log(sources, data)\n\n return log\n
Whenever you need to make HTTP requests, the base class provides a convenience member that has the same interface as the popular requests library and ensures consistent behavior across all integrations.
The wrapper automatically parses and uses configuration from the instance, init_config, and Agent config. Also, this is only done once during initialization and cached to reduce the overhead of every call.
For example, to make a GET request you would use:
response = self.http.get(url)\n
and the wrapper will pass the right things to requests. All methods accept optional keyword arguments like stream, etc.
Any method-level option will override configuration. So for example if tls_verify was set to false and you do self.http.get(url, verify=True), then SSL certificates will be verified on that particular request. You can use the keyword argument persist to override persist_connections.
There is also support for non-standard or legacy configurations with the HTTP_CONFIG_REMAPPER class attribute. For example:
Support for Unix sockets is provided via requests-unixsocket and allows making UDS requests on the unix:// scheme (not supported on Windows until Python adds support for AF_UNIX, see ticket):
Some options can be set globally in init_config (with instances taking precedence). For complete documentation of every option, see the associated configuration templates for the instances and init_config sections.
Support for configuring cookies! Since they can be set globally, per-domain, and even per-path, the configuration may be complex if not thought out adequately. We'll discuss options for what that might look like. Only our spark and cisco_aci checks currently set cookies, and that is based on code logic, not configuration.
Some systems expose their logs from HTTP endpoints instead of files that the Logs Agent can tail. In such cases, you can create an Agent integration to crawl the endpoints and submit the logs.
The following diagram illustrates how crawling logs integrates into the Datadog Agent.
"},{"location":"base/logs-crawlers/#interface","title":"Interface","text":""},{"location":"base/logs-crawlers/#datadog_checks.base.checks.logs.crawler.base.LogCrawlerCheck","title":"datadog_checks.base.checks.logs.crawler.base.LogCrawlerCheck","text":"Source code in datadog_checks_base/datadog_checks/base/checks/logs/crawler/base.py
class LogCrawlerCheck(AgentCheck, ABC):\n @abstractmethod\n def get_log_streams(self) -> Iterable[LogStream]:\n \"\"\"\n Yields the log streams associated with this check.\n \"\"\"\n\n def process_streams(self) -> None:\n \"\"\"\n Process the log streams and send the collected logs.\n\n Crawler checks that need more functionality can implement the `check` method and call this directly.\n \"\"\"\n for stream in self.get_log_streams():\n last_cursor = self.get_log_cursor(stream.name)\n for record in stream.records(cursor=last_cursor):\n self.send_log(record.data, cursor=record.cursor, stream=stream.name)\n\n def check(self, _) -> None:\n self.process_streams()\n
Process the log streams and send the collected logs.
Crawler checks that need more functionality can implement the check method and call this directly.
Source code in datadog_checks_base/datadog_checks/base/checks/logs/crawler/base.py
def process_streams(self) -> None:\n \"\"\"\n Process the log streams and send the collected logs.\n\n Crawler checks that need more functionality can implement the `check` method and call this directly.\n \"\"\"\n for stream in self.get_log_streams():\n last_cursor = self.get_log_cursor(stream.name)\n for record in stream.records(cursor=last_cursor):\n self.send_log(record.data, cursor=record.cursor, stream=stream.name)\n
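The cursor handling in process_streams can be tried out without the Agent using the standalone sketch below. Everything in it is a stand-in: Record, ListStream, and the module-level process_streams function are hypothetical simplifications of LogRecord, LogStream, and the method above, and the offset cursor is just one possible cursor shape.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """Simplified stand-in for LogRecord."""
    data: dict
    cursor: Optional[dict]


class ListStream:
    """Simplified stand-in for a LogStream that reads from an in-memory list."""

    def __init__(self, name, entries):
        self.name = name
        self._entries = entries

    def records(self, *, cursor=None):
        # Resume right after the last offset that was sent, or start from the beginning
        start = cursor['offset'] + 1 if cursor else 0
        for offset in range(start, len(self._entries)):
            yield Record(data={'message': self._entries[offset]}, cursor={'offset': offset})


def process_streams(streams, cursors, sent):
    """Mimics the loop above: look up the last cursor, then send every new record."""
    for stream in streams:
        last_cursor = cursors.get(stream.name)
        for record in stream.records(cursor=last_cursor):
            sent.append(record.data)
            cursors[stream.name] = record.cursor


entries = ['a', 'b']
stream = ListStream('demo', entries)
cursors, sent = {}, []

process_streams([stream], cursors, sent)  # first run sends 'a' and 'b'
entries.append('c')
process_streams([stream], cursors, sent)  # second run sends only 'c'
print([item['message'] for item in sent])
```

Because the cursor is stored per stream name, a second run picks up only the records that arrived after the previous run.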
"},{"location":"base/logs-crawlers/#datadog_checks.base.checks.logs.crawler.base.LogCrawlerCheck.check","title":"check(_)","text":"Source code in datadog_checks_base/datadog_checks/base/checks/logs/crawler/base.py
"},{"location":"base/logs-crawlers/#datadog_checks.base.checks.logs.crawler.stream.LogStream","title":"datadog_checks.base.checks.logs.crawler.stream.LogStream","text":"Source code in datadog_checks_base/datadog_checks/base/checks/logs/crawler/stream.py
class LogStream(ABC):\n def __init__(self, *, check: AgentCheck, name: str):\n self.__check = check\n self.__name = name\n\n @property\n def check(self) -> AgentCheck:\n \"\"\"\n The AgentCheck instance associated with this LogStream.\n \"\"\"\n return self.__check\n\n @property\n def name(self) -> str:\n \"\"\"\n The name of this LogStream.\n \"\"\"\n return self.__name\n\n def construct_tags(self, tags: list[str]) -> list[str]:\n \"\"\"\n Returns a formatted string of tags which may be used directly as the `ddtags` field of logs.\n This will include the `tags` from the integration instance config.\n \"\"\"\n formatted_tags = ','.join(tags)\n return f'{self.check.formatted_tags},{formatted_tags}' if self.check.formatted_tags else formatted_tags\n\n @abstractmethod\n def records(self, *, cursor: dict[str, Any] | None = None) -> Iterable[LogRecord]:\n \"\"\"\n Yields log records as they are received.\n \"\"\"\n
"},{"location":"base/logs-crawlers/#datadog_checks.base.checks.logs.crawler.stream.LogRecord","title":"datadog_checks.base.checks.logs.crawler.stream.LogRecord","text":"Source code in datadog_checks_base/datadog_checks/base/checks/logs/crawler/stream.py
Often, you will want to collect mostly unstructured data that doesn't map well to tags, like fine-grained product version information.
The base class provides a method that handles such cases. The collected data is captured by flares, displayed on the Agent's status page, and will eventually be queryable in-app.
Custom transformers may be defined via a class level attribute METADATA_TRANSFORMERS.
This is a mapping of metadata names to functions. When you call self.set_metadata(name, value, **options), if name is in this mapping then the corresponding function will be called with the value, and the return value(s) will be collected instead.
Transformer functions must satisfy the following signature:
If the return type is str, then it will be sent as the value for name. If the return type is a mapping type, then each key will be considered a name and will be sent with its (str) value.
For example, the following would collect an entity named square with a value of '25':
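That behavior can be approximated without the Agent by the standalone sketch below. The submit function and the collected dictionary are hypothetical stand-ins for the metadata manager and the Agent's metadata store, and the endpoint transformer is an invented example of a mapping return value.

```python
def submit(name, value, transformers, collected, **options):
    """Illustrative dispatch: apply a transformer if one is registered for `name`."""
    transformer = transformers.get(name)
    if transformer is None:
        collected[name] = value  # no transformer: the value is sent as-is
        return

    transformed = transformer(value, options)
    if isinstance(transformed, str):
        collected[name] = transformed   # str: sent as the value for `name`
    else:
        collected.update(transformed)   # mapping: each key is sent as its own name


METADATA_TRANSFORMERS = {
    'square': lambda value, options: str(int(value) ** 2),
    # Hypothetical transformer returning a mapping
    'endpoint': lambda value, options: dict(zip(('endpoint.host', 'endpoint.port'), value.split(':'))),
}

collected = {}
submit('square', '5', METADATA_TRANSFORMERS, collected)
submit('endpoint', 'localhost:5432', METADATA_TRANSFORMERS, collected)
submit('flavor', 'community', METADATA_TRANSFORMERS, collected)
print(collected)
```

In a real check the mapping would be assigned to the METADATA_TRANSFORMERS class attribute and the values would be submitted with self.set_metadata.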
There are a few default transformers, which can be overridden by custom transformers.
Source code in datadog_checks_base/datadog_checks/base/utils/metadata/core.py
class MetadataManager(object):\n \"\"\"\n Custom transformers may be defined via a class level attribute `METADATA_TRANSFORMERS`.\n\n This is a mapping of metadata names to functions. When you call\n `#!python self.set_metadata(name, value, **options)`, if `name` is in this mapping then\n the corresponding function will be called with the `value`, and the return\n value(s) will be collected instead.\n\n Transformer functions must satisfy the following signature:\n\n ```python\n def transform_<NAME>(value: Any, options: dict) -> Union[str, Dict[str, str]]:\n ```\n\n If the return type is `str`, then it will be sent as the value for `name`. If the return type is a mapping type,\n then each key will be considered a `name` and will be sent with its (`str`) value.\n\n For example, the following would collect an entity named `square` with a value of `'25'`:\n\n ```python\n from datadog_checks.base import AgentCheck\n\n\n class AwesomeCheck(AgentCheck):\n METADATA_TRANSFORMERS = {\n 'square': lambda value, options: str(int(value) ** 2)\n }\n\n def check(self, instance):\n self.set_metadata('square', '5')\n ```\n\n There are a few default transformers, which can be overridden by custom transformers.\n \"\"\"\n\n __slots__ = ('check_id', 'check_name', 'logger', 'metadata_transformers')\n\n def __init__(self, check_name, check_id, logger=None, metadata_transformers=None):\n self.check_name = check_name\n self.check_id = check_id\n self.logger = logger or LOGGER\n self.metadata_transformers = {'version': self.transform_version}\n\n if metadata_transformers:\n self.metadata_transformers.update(metadata_transformers)\n\n def submit_raw(self, name, value):\n datadog_agent.set_check_metadata(self.check_id, to_native_string(name), to_native_string(value))\n\n def submit(self, name, value, options):\n transformer = self.metadata_transformers.get(name)\n if transformer:\n try:\n transformed = transformer(value, options)\n except Exception as e:\n if is_primitive(value):\n 
self.logger.debug('Unable to transform `%s` metadata value `%s`: %s', name, value, e)\n else:\n self.logger.debug('Unable to transform `%s` metadata: %s', name, e)\n\n return\n\n if isinstance(transformed, str):\n self.submit_raw(name, transformed)\n else:\n for transformed_name, transformed_value in iteritems(transformed):\n self.submit_raw(transformed_name, transformed_value)\n else:\n self.submit_raw(name, value)\n\n def transform_version(self, version, options):\n \"\"\"\n Transforms a version like `1.2.3-rc.4+5` to its constituent parts. In all cases,\n the metadata names `version.raw` and `version.scheme` will be collected.\n\n If a `scheme` is defined then it will be looked up from our known schemes. If no\n scheme is defined then it will default to `semver`. The supported schemes are:\n\n - `regex` - A `pattern` must also be defined. The pattern must be a `str` or a pre-compiled\n `re.Pattern`. Any matching named subgroups will then be sent as `version.<GROUP_NAME>`. In this case,\n the check name will be used as the value of `version.scheme` unless `final_scheme` is also set, which\n will take precedence.\n - `parts` - A `part_map` must also be defined. 
Each key in this mapping will be considered\n a `name` and will be sent with its (`str`) value.\n - `semver` - This is essentially the same as `regex` with the `pattern` set to the standard regular\n expression for semantic versioning.\n\n Taking the example above, calling `#!python self.set_metadata('version', '1.2.3-rc.4+5')` would produce:\n\n | name | value |\n | --- | --- |\n | `version.raw` | `1.2.3-rc.4+5` |\n | `version.scheme` | `semver` |\n | `version.major` | `1` |\n | `version.minor` | `2` |\n | `version.patch` | `3` |\n | `version.release` | `rc.4` |\n | `version.build` | `5` |\n \"\"\"\n scheme, version_parts = parse_version(version, options)\n if scheme == 'regex' or scheme == 'parts':\n scheme = options.get('final_scheme', self.check_name)\n\n data = {'version.{}'.format(part_name): part_value for part_name, part_value in iteritems(version_parts)}\n data['version.raw'] = version\n data['version.scheme'] = scheme\n\n return data\n
Transforms a version like 1.2.3-rc.4+5 to its constituent parts. In all cases, the metadata names version.raw and version.scheme will be collected.
If a scheme is defined then it will be looked up from our known schemes. If no scheme is defined then it will default to semver. The supported schemes are:
regex - A pattern must also be defined. The pattern must be a str or a pre-compiled re.Pattern. Any matching named subgroups will then be sent as version.<GROUP_NAME>. In this case, the check name will be used as the value of version.scheme unless final_scheme is also set, which will take precedence.
parts - A part_map must also be defined. Each key in this mapping will be considered a name and will be sent with its (str) value.
semver - This is essentially the same as regex with the pattern set to the standard regular expression for semantic versioning.
Taking the example above, calling self.set_metadata('version', '1.2.3-rc.4+5') would produce:
| name | value |
| --- | --- |
| version.raw | 1.2.3-rc.4+5 |
| version.scheme | semver |
| version.major | 1 |
| version.minor | 2 |
| version.patch | 3 |
| version.release | rc.4 |
| version.build | 5 |

Source code in datadog_checks_base/datadog_checks/base/utils/metadata/core.py
def transform_version(self, version, options):\n \"\"\"\n Transforms a version like `1.2.3-rc.4+5` to its constituent parts. In all cases,\n the metadata names `version.raw` and `version.scheme` will be collected.\n\n If a `scheme` is defined then it will be looked up from our known schemes. If no\n scheme is defined then it will default to `semver`. The supported schemes are:\n\n - `regex` - A `pattern` must also be defined. The pattern must be a `str` or a pre-compiled\n `re.Pattern`. Any matching named subgroups will then be sent as `version.<GROUP_NAME>`. In this case,\n the check name will be used as the value of `version.scheme` unless `final_scheme` is also set, which\n will take precedence.\n - `parts` - A `part_map` must also be defined. Each key in this mapping will be considered\n a `name` and will be sent with its (`str`) value.\n - `semver` - This is essentially the same as `regex` with the `pattern` set to the standard regular\n expression for semantic versioning.\n\n Taking the example above, calling `#!python self.set_metadata('version', '1.2.3-rc.4+5')` would produce:\n\n | name | value |\n | --- | --- |\n | `version.raw` | `1.2.3-rc.4+5` |\n | `version.scheme` | `semver` |\n | `version.major` | `1` |\n | `version.minor` | `2` |\n | `version.patch` | `3` |\n | `version.release` | `rc.4` |\n | `version.build` | `5` |\n \"\"\"\n scheme, version_parts = parse_version(version, options)\n if scheme == 'regex' or scheme == 'parts':\n scheme = options.get('final_scheme', self.check_name)\n\n data = {'version.{}'.format(part_name): part_value for part_name, part_value in iteritems(version_parts)}\n data['version.raw'] = version\n data['version.scheme'] = scheme\n\n return data\n
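The regex scheme of transform_version can be illustrated with the standalone sketch below. parse_with_regex is a hypothetical helper that stands in for the real parse_version plus the naming logic above; in particular, dropping unmatched groups is an assumption, not confirmed behavior.

```python
import re


def parse_with_regex(version, pattern, check_name, final_scheme=None):
    """Illustrative helper: named groups become `version.<GROUP_NAME>` entries."""
    match = re.search(pattern, version)
    if match is None:
        raise ValueError(f'Version does not match the pattern: {version}')

    # Assumption: groups that did not participate in the match are skipped
    data = {f'version.{name}': value for name, value in match.groupdict().items() if value is not None}
    data['version.raw'] = version
    # The check name is used as the scheme unless `final_scheme` is set
    data['version.scheme'] = final_scheme or check_name
    return data


pattern = r'v(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)'
result = parse_with_regex('v9.6.2', pattern, check_name='mycheck')
print(result)
```

With the real API, the equivalent call would pass scheme='regex' and pattern=... as options to self.set_metadata('version', ...).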
OpenMetrics is used for collecting metrics using the CNCF-backed OpenMetrics format. This version is the default for all new OpenMetrics checks, and it is compatible with Python 3 only.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/base.py
class OpenMetricsBaseCheckV2(AgentCheck):\n \"\"\"\n OpenMetricsBaseCheckV2 is an updated class of OpenMetricsBaseCheck to scrape endpoints that emit Prometheus metrics.\n\n Minimal example configuration:\n\n ```yaml\n instances:\n - openmetrics_endpoint: http://example.com/endpoint\n namespace: \"foobar\"\n metrics:\n - bar\n - foo\n ```\n\n \"\"\"\n\n DEFAULT_METRIC_LIMIT = 2000\n\n # Allow tracing for openmetrics integrations\n def __init_subclass__(cls, **kwargs):\n super().__init_subclass__(**kwargs)\n return traced_class(cls)\n\n def __init__(self, name, init_config, instances):\n \"\"\"\n The base class for any OpenMetrics-based integration.\n\n Subclasses are expected to override this to add their custom scrapers or transformers.\n When overriding, make sure to call this (the parent's) __init__ first!\n \"\"\"\n super(OpenMetricsBaseCheckV2, self).__init__(name, init_config, instances)\n\n # All desired scraper configurations, which subclasses can override as needed\n self.scraper_configs = [self.instance]\n\n # All configured scrapers keyed by the endpoint\n self.scrapers = {}\n\n self.check_initializations.append(self.configure_scrapers)\n\n def check(self, _):\n \"\"\"\n Perform an openmetrics-based check.\n\n Subclasses should typically not need to override this, as most common customization\n needs are covered by the use of custom scrapers.\n Another thing to note is that this check ignores its instance argument completely.\n We take care of instance-level customization at initialization time.\n \"\"\"\n self.refresh_scrapers()\n\n for endpoint, scraper in self.scrapers.items():\n self.log.debug('Scraping OpenMetrics endpoint: %s', endpoint)\n\n with self.adopt_namespace(scraper.namespace):\n try:\n scraper.scrape()\n except (ConnectionError, RequestException) as e:\n self.log.error(\"There was an error scraping endpoint %s: %s\", endpoint, str(e))\n raise_from(type(e)(\"There was an error scraping endpoint {}: {}\".format(endpoint, e)), None)\n\n def 
configure_scrapers(self):\n \"\"\"\n Creates a scraper configuration for each instance.\n \"\"\"\n\n scrapers = {}\n\n for config in self.scraper_configs:\n endpoint = config.get('openmetrics_endpoint', '')\n if not isinstance(endpoint, str):\n raise ConfigurationError('The setting `openmetrics_endpoint` must be a string')\n elif not endpoint:\n raise ConfigurationError('The setting `openmetrics_endpoint` is required')\n\n scrapers[endpoint] = self.create_scraper(config)\n\n self.scrapers.clear()\n self.scrapers.update(scrapers)\n\n def create_scraper(self, config):\n \"\"\"\n Subclasses can override to return a custom scraper based on instance configuration.\n \"\"\"\n return OpenMetricsScraper(self, self.get_config_with_defaults(config))\n\n def set_dynamic_tags(self, *tags):\n for scraper in self.scrapers.values():\n scraper.set_dynamic_tags(*tags)\n\n def get_config_with_defaults(self, config):\n return ChainMap(config, self.get_default_config())\n\n def get_default_config(self):\n return {}\n\n def refresh_scrapers(self):\n pass\n\n @contextmanager\n def adopt_namespace(self, namespace):\n old_namespace = self.__NAMESPACE__\n\n try:\n self.__NAMESPACE__ = namespace or old_namespace\n yield\n finally:\n self.__NAMESPACE__ = old_namespace\n
The base class for any OpenMetrics-based integration.
Subclasses are expected to override this to add their custom scrapers or transformers. When overriding, make sure to call this (the parent's) __init__ first!
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/base.py
def __init__(self, name, init_config, instances):\n \"\"\"\n The base class for any OpenMetrics-based integration.\n\n Subclasses are expected to override this to add their custom scrapers or transformers.\n When overriding, make sure to call this (the parent's) __init__ first!\n \"\"\"\n super(OpenMetricsBaseCheckV2, self).__init__(name, init_config, instances)\n\n # All desired scraper configurations, which subclasses can override as needed\n self.scraper_configs = [self.instance]\n\n # All configured scrapers keyed by the endpoint\n self.scrapers = {}\n\n self.check_initializations.append(self.configure_scrapers)\n
Subclasses should typically not need to override this, as most common customization needs are covered by the use of custom scrapers. Note that this method ignores its instance argument completely; instance-level customization is handled at initialization time.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/base.py
def check(self, _):\n \"\"\"\n Perform an openmetrics-based check.\n\n Subclasses should typically not need to override this, as most common customization\n needs are covered by the use of custom scrapers.\n Another thing to note is that this check ignores its instance argument completely.\n We take care of instance-level customization at initialization time.\n \"\"\"\n self.refresh_scrapers()\n\n for endpoint, scraper in self.scrapers.items():\n self.log.debug('Scraping OpenMetrics endpoint: %s', endpoint)\n\n with self.adopt_namespace(scraper.namespace):\n try:\n scraper.scrape()\n except (ConnectionError, RequestException) as e:\n self.log.error(\"There was an error scraping endpoint %s: %s\", endpoint, str(e))\n raise_from(type(e)(\"There was an error scraping endpoint {}: {}\".format(endpoint, e)), None)\n
Creates a scraper configuration for each instance.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/base.py
def configure_scrapers(self):\n \"\"\"\n Creates a scraper configuration for each instance.\n \"\"\"\n\n scrapers = {}\n\n for config in self.scraper_configs:\n endpoint = config.get('openmetrics_endpoint', '')\n if not isinstance(endpoint, str):\n raise ConfigurationError('The setting `openmetrics_endpoint` must be a string')\n elif not endpoint:\n raise ConfigurationError('The setting `openmetrics_endpoint` is required')\n\n scrapers[endpoint] = self.create_scraper(config)\n\n self.scrapers.clear()\n self.scrapers.update(scrapers)\n
Subclasses can override to return a custom scraper based on instance configuration.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/base.py
def create_scraper(self, config):\n \"\"\"\n Subclasses can override to return a custom scraper based on instance configuration.\n \"\"\"\n return OpenMetricsScraper(self, self.get_config_with_defaults(config))\n
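create_scraper passes the scraper a configuration built by get_config_with_defaults, which layers the instance configuration over get_default_config using a ChainMap. The standalone sketch below shows the resulting precedence; the option values are invented for illustration.

```python
from collections import ChainMap

# What a subclass might return from get_default_config (assumed values)
defaults = {'namespace': 'demo', 'metrics': ['.+']}

# What a user might put in the instance configuration (assumed values)
instance = {'openmetrics_endpoint': 'http://localhost:9090/metrics', 'namespace': 'custom'}

# Same construction as get_config_with_defaults: the first mapping wins on lookup
config = ChainMap(instance, defaults)

print(config['namespace'])  # taken from the instance
print(config['metrics'])    # falls back to the defaults
```

Since lookups consult the instance first, a subclass can supply defaults by overriding get_default_config while still letting users override them.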
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/scraper.py
class OpenMetricsScraper:\n \"\"\"\n OpenMetricsScraper is a class that can be used to override the default scraping behavior for OpenMetricsBaseCheckV2.\n\n Minimal example configuration:\n\n ```yaml\n - openmetrics_endpoint: http://example.com/endpoint\n namespace: \"foobar\"\n metrics:\n - bar\n - foo\n raw_metric_prefix: \"test\"\n telemetry: \"true\"\n hostname_label: node\n ```\n\n \"\"\"\n\n SERVICE_CHECK_HEALTH = 'openmetrics.health'\n\n def __init__(self, check, config):\n \"\"\"\n The base class for any scraper overrides.\n \"\"\"\n\n self.config = config\n\n # Save a reference to the check instance\n self.check = check\n\n # Parse the configuration\n self.endpoint = config['openmetrics_endpoint']\n\n self.metric_transformer = MetricTransformer(self.check, config)\n self.label_aggregator = LabelAggregator(self.check, config)\n\n self.enable_telemetry = is_affirmative(config.get('telemetry', False))\n # Make every telemetry submission method a no-op to avoid many lookups of `self.enable_telemetry`\n if not self.enable_telemetry:\n for name, _ in inspect.getmembers(self, predicate=inspect.ismethod):\n if name.startswith('submit_telemetry_'):\n setattr(self, name, no_op)\n\n # Prevent overriding an integration's defined namespace\n self.namespace = check.__NAMESPACE__ or config.get('namespace', '')\n if not isinstance(self.namespace, str):\n raise ConfigurationError('Setting `namespace` must be a string')\n\n self.raw_metric_prefix = config.get('raw_metric_prefix', '')\n if not isinstance(self.raw_metric_prefix, str):\n raise ConfigurationError('Setting `raw_metric_prefix` must be a string')\n\n self.enable_health_service_check = is_affirmative(config.get('enable_health_service_check', True))\n self.ignore_connection_errors = is_affirmative(config.get('ignore_connection_errors', False))\n\n self.hostname_label = config.get('hostname_label', '')\n if not isinstance(self.hostname_label, str):\n raise ConfigurationError('Setting `hostname_label` must be a 
string')\n\n hostname_format = config.get('hostname_format', '')\n if not isinstance(hostname_format, str):\n raise ConfigurationError('Setting `hostname_format` must be a string')\n\n self.hostname_formatter = None\n if self.hostname_label and hostname_format:\n placeholder = '<HOSTNAME>'\n if placeholder not in hostname_format:\n raise ConfigurationError(f'Setting `hostname_format` does not contain the placeholder `{placeholder}`')\n\n self.hostname_formatter = lambda hostname: hostname_format.replace('<HOSTNAME>', hostname, 1)\n\n exclude_labels = config.get('exclude_labels', [])\n if not isinstance(exclude_labels, list):\n raise ConfigurationError('Setting `exclude_labels` must be an array')\n\n self.exclude_labels = set()\n for i, entry in enumerate(exclude_labels, 1):\n if not isinstance(entry, str):\n raise ConfigurationError(f'Entry #{i} of setting `exclude_labels` must be a string')\n\n self.exclude_labels.add(entry)\n\n include_labels = config.get('include_labels', [])\n if not isinstance(include_labels, list):\n raise ConfigurationError('Setting `include_labels` must be an array')\n self.include_labels = set()\n for i, entry in enumerate(include_labels, 1):\n if not isinstance(entry, str):\n raise ConfigurationError(f'Entry #{i} of setting `include_labels` must be a string')\n if entry in self.exclude_labels:\n self.log.debug(\n 'Label `%s` is set in both `exclude_labels` and `include_labels`. 
Excluding label.', entry\n )\n self.include_labels.add(entry)\n\n self.rename_labels = config.get('rename_labels', {})\n if not isinstance(self.rename_labels, dict):\n raise ConfigurationError('Setting `rename_labels` must be a mapping')\n\n for key, value in self.rename_labels.items():\n if not isinstance(value, str):\n raise ConfigurationError(f'Value for label `{key}` of setting `rename_labels` must be a string')\n\n exclude_metrics = config.get('exclude_metrics', [])\n if not isinstance(exclude_metrics, list):\n raise ConfigurationError('Setting `exclude_metrics` must be an array')\n\n self.exclude_metrics = set()\n self.exclude_metrics_pattern = None\n exclude_metrics_patterns = []\n for i, entry in enumerate(exclude_metrics, 1):\n if not isinstance(entry, str):\n raise ConfigurationError(f'Entry #{i} of setting `exclude_metrics` must be a string')\n\n escaped_entry = re.escape(entry)\n if entry == escaped_entry:\n self.exclude_metrics.add(entry)\n else:\n exclude_metrics_patterns.append(entry)\n\n if exclude_metrics_patterns:\n self.exclude_metrics_pattern = re.compile('|'.join(exclude_metrics_patterns))\n\n self.exclude_metrics_by_labels = {}\n exclude_metrics_by_labels = config.get('exclude_metrics_by_labels', {})\n if not isinstance(exclude_metrics_by_labels, dict):\n raise ConfigurationError('Setting `exclude_metrics_by_labels` must be a mapping')\n elif exclude_metrics_by_labels:\n for label, values in exclude_metrics_by_labels.items():\n if values is True:\n self.exclude_metrics_by_labels[label] = return_true\n elif isinstance(values, list):\n for i, value in enumerate(values, 1):\n if not isinstance(value, str):\n raise ConfigurationError(\n f'Value #{i} for label `{label}` of setting `exclude_metrics_by_labels` '\n f'must be a string'\n )\n\n self.exclude_metrics_by_labels[label] = (\n lambda label_value, pattern=re.compile('|'.join(values)): pattern.search( # noqa: B008\n label_value\n ) # noqa: B008, E501\n is not None\n )\n else:\n raise 
ConfigurationError(\n f'Label `{label}` of setting `exclude_metrics_by_labels` must be an array or set to `true`'\n )\n\n custom_tags = config.get('tags', []) # type: List[str]\n if not isinstance(custom_tags, list):\n raise ConfigurationError('Setting `tags` must be an array')\n\n for i, entry in enumerate(custom_tags, 1):\n if not isinstance(entry, str):\n raise ConfigurationError(f'Entry #{i} of setting `tags` must be a string')\n\n # Some tags can be ignored to reduce the cardinality.\n # This can be useful for cost optimization in containerized environments\n # when the openmetrics check is configured to collect custom metrics.\n # Even when the Agent's Tagger is configured to add low-cardinality tags only,\n # some tags can still generate unwanted metric contexts (e.g pod annotations as tags).\n ignore_tags = config.get('ignore_tags', [])\n if ignore_tags:\n ignored_tags_re = re.compile('|'.join(set(ignore_tags)))\n custom_tags = [tag for tag in custom_tags if not ignored_tags_re.search(tag)]\n\n self.static_tags = copy(custom_tags)\n if is_affirmative(self.config.get('tag_by_endpoint', True)):\n self.static_tags.append(f'endpoint:{self.endpoint}')\n\n # These will be applied only to service checks\n self.static_tags = tuple(self.static_tags)\n # These will be applied to everything except service checks\n self.tags = self.static_tags\n\n self.raw_line_filter = None\n raw_line_filters = config.get('raw_line_filters', [])\n if not isinstance(raw_line_filters, list):\n raise ConfigurationError('Setting `raw_line_filters` must be an array')\n elif raw_line_filters:\n for i, entry in enumerate(raw_line_filters, 1):\n if not isinstance(entry, str):\n raise ConfigurationError(f'Entry #{i} of setting `raw_line_filters` must be a string')\n\n self.raw_line_filter = re.compile('|'.join(raw_line_filters))\n\n self.http = RequestsWrapper(config, self.check.init_config, self.check.HTTP_CONFIG_REMAPPER, self.check.log)\n\n self._content_type = ''\n self._use_latest_spec = 
is_affirmative(config.get('use_latest_spec', False))\n if self._use_latest_spec:\n accept_header = 'application/openmetrics-text;version=1.0.0,application/openmetrics-text;version=0.0.1'\n else:\n accept_header = 'text/plain'\n\n # Request the appropriate exposition format\n if self.http.options['headers'].get('Accept') == '*/*':\n self.http.options['headers']['Accept'] = accept_header\n\n self.use_process_start_time = is_affirmative(config.get('use_process_start_time'))\n\n # Used for monotonic counts\n self.flush_first_value = False\n\n def scrape(self):\n \"\"\"\n Execute a scrape, and for each metric collected, transform the metric.\n \"\"\"\n runtime_data = {'flush_first_value': self.flush_first_value, 'static_tags': self.static_tags}\n\n for metric in self.consume_metrics(runtime_data):\n transformer = self.metric_transformer.get(metric)\n if transformer is None:\n continue\n\n transformer(metric, self.generate_sample_data(metric), runtime_data)\n\n self.flush_first_value = True\n\n def consume_metrics(self, runtime_data):\n \"\"\"\n Yield the processed metrics and filter out excluded metrics.\n \"\"\"\n\n metric_parser = self.parse_metrics()\n if not self.flush_first_value and self.use_process_start_time:\n metric_parser = first_scrape_handler(metric_parser, runtime_data, datadog_agent.get_process_start_time())\n if self.label_aggregator.configured:\n metric_parser = self.label_aggregator(metric_parser)\n\n for metric in metric_parser:\n if metric.name in self.exclude_metrics or (\n self.exclude_metrics_pattern is not None and self.exclude_metrics_pattern.search(metric.name)\n ):\n self.submit_telemetry_number_of_ignored_metric_samples(metric)\n continue\n\n yield metric\n\n def parse_metrics(self):\n \"\"\"\n Get the line streamer and yield processed metrics.\n \"\"\"\n\n line_streamer = self.stream_connection_lines()\n if self.raw_line_filter is not None:\n line_streamer = self.filter_connection_lines(line_streamer)\n\n # Since we determine 
`self.parse_metric_families` dynamically from the response and that's done as a\n # side effect inside the `line_streamer` generator, we need to consume the first line in order to\n # trigger that side effect.\n try:\n line_streamer = chain([next(line_streamer)], line_streamer)\n except StopIteration:\n # If line_streamer is an empty iterator, next(line_streamer) fails.\n return\n\n for metric in self.parse_metric_families(line_streamer):\n self.submit_telemetry_number_of_total_metric_samples(metric)\n\n # It is critical that the prefix is removed immediately so that\n # all other configuration may reference the trimmed metric name\n if self.raw_metric_prefix and metric.name.startswith(self.raw_metric_prefix):\n metric.name = metric.name[len(self.raw_metric_prefix) :]\n\n yield metric\n\n @property\n def parse_metric_families(self):\n media_type = self._content_type.split(';')[0]\n # Setting `use_latest_spec` forces the use of the OpenMetrics format, otherwise\n # the format will be chosen based on the media type specified in the response's content-header.\n # The selection is based on what Prometheus does:\n # https://github.com/prometheus/prometheus/blob/v2.43.0/model/textparse/interface.go#L83-L90\n return (\n parse_openmetrics\n if self._use_latest_spec or media_type == 'application/openmetrics-text'\n else parse_prometheus\n )\n\n def generate_sample_data(self, metric):\n \"\"\"\n Yield a sample of processed data.\n \"\"\"\n\n label_normalizer = get_label_normalizer(metric.type)\n\n for sample in metric.samples:\n value = sample.value\n if isnan(value) or isinf(value):\n self.log.debug('Ignoring sample for metric `%s` as it has an invalid value: %s', metric.name, value)\n continue\n\n tags = []\n skip_sample = False\n labels = sample.labels\n self.label_aggregator.populate(labels)\n label_normalizer(labels)\n\n for label_name, label_value in labels.items():\n sample_excluder = self.exclude_metrics_by_labels.get(label_name)\n if sample_excluder is not None and 
sample_excluder(label_value):\n skip_sample = True\n break\n elif label_name in self.exclude_labels:\n continue\n elif self.include_labels and label_name not in self.include_labels:\n continue\n\n label_name = self.rename_labels.get(label_name, label_name)\n tags.append(f'{label_name}:{label_value}')\n\n if skip_sample:\n continue\n\n tags.extend(self.tags)\n\n hostname = \"\"\n if self.hostname_label and self.hostname_label in labels:\n hostname = labels[self.hostname_label]\n if self.hostname_formatter is not None:\n hostname = self.hostname_formatter(hostname)\n\n self.submit_telemetry_number_of_processed_metric_samples()\n yield sample, tags, hostname\n\n def stream_connection_lines(self):\n \"\"\"\n Yield the connection line.\n \"\"\"\n\n try:\n with self.get_connection() as connection:\n # Media type will be used to select parser dynamically\n self._content_type = connection.headers.get('Content-Type', '')\n for line in connection.iter_lines(decode_unicode=True):\n yield line\n except ConnectionError as e:\n if self.ignore_connection_errors:\n self.log.warning(\"OpenMetrics endpoint %s is not accessible\", self.endpoint)\n else:\n raise e\n\n def filter_connection_lines(self, line_streamer):\n \"\"\"\n Filter connection lines in the line streamer.\n \"\"\"\n\n for line in line_streamer:\n if self.raw_line_filter.search(line):\n self.submit_telemetry_number_of_ignored_lines()\n else:\n yield line\n\n def get_connection(self):\n \"\"\"\n Send a request to scrape metrics. 
Return the response or throw an exception.\n \"\"\"\n\n try:\n response = self.send_request()\n except Exception as e:\n self.submit_health_check(ServiceCheck.CRITICAL, message=str(e))\n raise\n else:\n try:\n response.raise_for_status()\n except Exception as e:\n self.submit_health_check(ServiceCheck.CRITICAL, message=str(e))\n response.close()\n raise\n else:\n self.submit_health_check(ServiceCheck.OK)\n\n # Never derive the encoding from the locale\n if response.encoding is None:\n response.encoding = 'utf-8'\n\n self.submit_telemetry_endpoint_response_size(response)\n\n return response\n\n def send_request(self, **kwargs):\n \"\"\"\n Send an HTTP GET request to the `openmetrics_endpoint` value.\n \"\"\"\n\n kwargs['stream'] = True\n return self.http.get(self.endpoint, **kwargs)\n\n def set_dynamic_tags(self, *tags):\n \"\"\"\n Set dynamic tags.\n \"\"\"\n\n self.tags = tuple(chain(self.static_tags, tags))\n\n def submit_health_check(self, status, **kwargs):\n \"\"\"\n If health service check is enabled, send an `openmetrics.health` service check.\n \"\"\"\n\n if self.enable_health_service_check:\n self.service_check(self.SERVICE_CHECK_HEALTH, status, tags=self.static_tags, **kwargs)\n\n def submit_telemetry_number_of_total_metric_samples(self, metric):\n self.count('telemetry.metrics.input.count', len(metric.samples), tags=self.tags)\n\n def submit_telemetry_number_of_ignored_metric_samples(self, metric):\n self.count('telemetry.metrics.ignored.count', len(metric.samples), tags=self.tags)\n\n def submit_telemetry_number_of_processed_metric_samples(self):\n self.count('telemetry.metrics.processed.count', 1, tags=self.tags)\n\n def submit_telemetry_number_of_ignored_lines(self):\n self.count('telemetry.metrics.blacklist.count', 1, tags=self.tags)\n\n def submit_telemetry_endpoint_response_size(self, response):\n content_length = response.headers.get('Content-Length')\n if content_length is not None:\n content_length = int(content_length)\n else:\n content_length 
= len(response.content)\n\n self.gauge('telemetry.payload.size', content_length, tags=self.tags)\n\n def __getattr__(self, name):\n # Forward all unknown attribute lookups to the check instance for access to submission methods, hostname, etc.\n attribute = getattr(self.check, name)\n setattr(self, name, attribute)\n return attribute\n
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/scraper.py
def __init__(self, check, config):\n \"\"\"\n The base class for any scraper overrides.\n \"\"\"\n\n self.config = config\n\n # Save a reference to the check instance\n self.check = check\n\n # Parse the configuration\n self.endpoint = config['openmetrics_endpoint']\n\n self.metric_transformer = MetricTransformer(self.check, config)\n self.label_aggregator = LabelAggregator(self.check, config)\n\n self.enable_telemetry = is_affirmative(config.get('telemetry', False))\n # Make every telemetry submission method a no-op to avoid many lookups of `self.enable_telemetry`\n if not self.enable_telemetry:\n for name, _ in inspect.getmembers(self, predicate=inspect.ismethod):\n if name.startswith('submit_telemetry_'):\n setattr(self, name, no_op)\n\n # Prevent overriding an integration's defined namespace\n self.namespace = check.__NAMESPACE__ or config.get('namespace', '')\n if not isinstance(self.namespace, str):\n raise ConfigurationError('Setting `namespace` must be a string')\n\n self.raw_metric_prefix = config.get('raw_metric_prefix', '')\n if not isinstance(self.raw_metric_prefix, str):\n raise ConfigurationError('Setting `raw_metric_prefix` must be a string')\n\n self.enable_health_service_check = is_affirmative(config.get('enable_health_service_check', True))\n self.ignore_connection_errors = is_affirmative(config.get('ignore_connection_errors', False))\n\n self.hostname_label = config.get('hostname_label', '')\n if not isinstance(self.hostname_label, str):\n raise ConfigurationError('Setting `hostname_label` must be a string')\n\n hostname_format = config.get('hostname_format', '')\n if not isinstance(hostname_format, str):\n raise ConfigurationError('Setting `hostname_format` must be a string')\n\n self.hostname_formatter = None\n if self.hostname_label and hostname_format:\n placeholder = '<HOSTNAME>'\n if placeholder not in hostname_format:\n raise ConfigurationError(f'Setting `hostname_format` does not contain the placeholder `{placeholder}`')\n\n 
self.hostname_formatter = lambda hostname: hostname_format.replace('<HOSTNAME>', hostname, 1)\n\n exclude_labels = config.get('exclude_labels', [])\n if not isinstance(exclude_labels, list):\n raise ConfigurationError('Setting `exclude_labels` must be an array')\n\n self.exclude_labels = set()\n for i, entry in enumerate(exclude_labels, 1):\n if not isinstance(entry, str):\n raise ConfigurationError(f'Entry #{i} of setting `exclude_labels` must be a string')\n\n self.exclude_labels.add(entry)\n\n include_labels = config.get('include_labels', [])\n if not isinstance(include_labels, list):\n raise ConfigurationError('Setting `include_labels` must be an array')\n self.include_labels = set()\n for i, entry in enumerate(include_labels, 1):\n if not isinstance(entry, str):\n raise ConfigurationError(f'Entry #{i} of setting `include_labels` must be a string')\n if entry in self.exclude_labels:\n self.log.debug(\n 'Label `%s` is set in both `exclude_labels` and `include_labels`. Excluding label.', entry\n )\n self.include_labels.add(entry)\n\n self.rename_labels = config.get('rename_labels', {})\n if not isinstance(self.rename_labels, dict):\n raise ConfigurationError('Setting `rename_labels` must be a mapping')\n\n for key, value in self.rename_labels.items():\n if not isinstance(value, str):\n raise ConfigurationError(f'Value for label `{key}` of setting `rename_labels` must be a string')\n\n exclude_metrics = config.get('exclude_metrics', [])\n if not isinstance(exclude_metrics, list):\n raise ConfigurationError('Setting `exclude_metrics` must be an array')\n\n self.exclude_metrics = set()\n self.exclude_metrics_pattern = None\n exclude_metrics_patterns = []\n for i, entry in enumerate(exclude_metrics, 1):\n if not isinstance(entry, str):\n raise ConfigurationError(f'Entry #{i} of setting `exclude_metrics` must be a string')\n\n escaped_entry = re.escape(entry)\n if entry == escaped_entry:\n self.exclude_metrics.add(entry)\n else:\n 
exclude_metrics_patterns.append(entry)\n\n if exclude_metrics_patterns:\n self.exclude_metrics_pattern = re.compile('|'.join(exclude_metrics_patterns))\n\n self.exclude_metrics_by_labels = {}\n exclude_metrics_by_labels = config.get('exclude_metrics_by_labels', {})\n if not isinstance(exclude_metrics_by_labels, dict):\n raise ConfigurationError('Setting `exclude_metrics_by_labels` must be a mapping')\n elif exclude_metrics_by_labels:\n for label, values in exclude_metrics_by_labels.items():\n if values is True:\n self.exclude_metrics_by_labels[label] = return_true\n elif isinstance(values, list):\n for i, value in enumerate(values, 1):\n if not isinstance(value, str):\n raise ConfigurationError(\n f'Value #{i} for label `{label}` of setting `exclude_metrics_by_labels` '\n f'must be a string'\n )\n\n self.exclude_metrics_by_labels[label] = (\n lambda label_value, pattern=re.compile('|'.join(values)): pattern.search( # noqa: B008\n label_value\n ) # noqa: B008, E501\n is not None\n )\n else:\n raise ConfigurationError(\n f'Label `{label}` of setting `exclude_metrics_by_labels` must be an array or set to `true`'\n )\n\n custom_tags = config.get('tags', []) # type: List[str]\n if not isinstance(custom_tags, list):\n raise ConfigurationError('Setting `tags` must be an array')\n\n for i, entry in enumerate(custom_tags, 1):\n if not isinstance(entry, str):\n raise ConfigurationError(f'Entry #{i} of setting `tags` must be a string')\n\n # Some tags can be ignored to reduce the cardinality.\n # This can be useful for cost optimization in containerized environments\n # when the openmetrics check is configured to collect custom metrics.\n # Even when the Agent's Tagger is configured to add low-cardinality tags only,\n # some tags can still generate unwanted metric contexts (e.g pod annotations as tags).\n ignore_tags = config.get('ignore_tags', [])\n if ignore_tags:\n ignored_tags_re = re.compile('|'.join(set(ignore_tags)))\n custom_tags = [tag for tag in custom_tags if not 
ignored_tags_re.search(tag)]\n\n self.static_tags = copy(custom_tags)\n if is_affirmative(self.config.get('tag_by_endpoint', True)):\n self.static_tags.append(f'endpoint:{self.endpoint}')\n\n # These will be applied only to service checks\n self.static_tags = tuple(self.static_tags)\n # These will be applied to everything except service checks\n self.tags = self.static_tags\n\n self.raw_line_filter = None\n raw_line_filters = config.get('raw_line_filters', [])\n if not isinstance(raw_line_filters, list):\n raise ConfigurationError('Setting `raw_line_filters` must be an array')\n elif raw_line_filters:\n for i, entry in enumerate(raw_line_filters, 1):\n if not isinstance(entry, str):\n raise ConfigurationError(f'Entry #{i} of setting `raw_line_filters` must be a string')\n\n self.raw_line_filter = re.compile('|'.join(raw_line_filters))\n\n self.http = RequestsWrapper(config, self.check.init_config, self.check.HTTP_CONFIG_REMAPPER, self.check.log)\n\n self._content_type = ''\n self._use_latest_spec = is_affirmative(config.get('use_latest_spec', False))\n if self._use_latest_spec:\n accept_header = 'application/openmetrics-text;version=1.0.0,application/openmetrics-text;version=0.0.1'\n else:\n accept_header = 'text/plain'\n\n # Request the appropriate exposition format\n if self.http.options['headers'].get('Accept') == '*/*':\n self.http.options['headers']['Accept'] = accept_header\n\n self.use_process_start_time = is_affirmative(config.get('use_process_start_time'))\n\n # Used for monotonic counts\n self.flush_first_value = False\n
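The `exclude_metrics` handling in the constructor above treats an entry as an exact metric name when `re.escape` leaves it unchanged, and as a regular expression otherwise. The following is a minimal standalone sketch of that split, not the library code itself; the helper names `split_exclusions` and `is_excluded` and the sample metric names are hypothetical.

```python
import re


def split_exclusions(entries):
    # Entries that re.escape leaves unchanged contain no special characters,
    # so they can be matched with a cheap set lookup.
    exact = set()
    patterns = []
    for entry in entries:
        if entry == re.escape(entry):
            exact.add(entry)
        else:
            patterns.append(entry)

    # Remaining entries are joined into a single alternation, as in the constructor.
    compiled = re.compile('|'.join(patterns)) if patterns else None
    return exact, compiled


def is_excluded(name, exact, compiled):
    # Mirrors the check in consume_metrics: exact names first, then the pattern.
    return name in exact or (compiled is not None and compiled.search(name) is not None)


exact, compiled = split_exclusions(['go_goroutines', '^process_.+'])
print(is_excluded('go_goroutines', exact, compiled))
print(is_excluded('process_cpu_seconds_total', exact, compiled))
print(is_excluded('http_requests_total', exact, compiled))
```

Keeping plain names in a set means the regular expression is only consulted for entries that actually need pattern matching.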
Execute a scrape, and for each metric collected, transform the metric.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/scraper.py
def scrape(self):\n \"\"\"\n Execute a scrape, and for each metric collected, transform the metric.\n \"\"\"\n runtime_data = {'flush_first_value': self.flush_first_value, 'static_tags': self.static_tags}\n\n for metric in self.consume_metrics(runtime_data):\n transformer = self.metric_transformer.get(metric)\n if transformer is None:\n continue\n\n transformer(metric, self.generate_sample_data(metric), runtime_data)\n\n self.flush_first_value = True\n
Yield the processed metrics and filter out excluded metrics.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/scraper.py
def consume_metrics(self, runtime_data):\n \"\"\"\n Yield the processed metrics and filter out excluded metrics.\n \"\"\"\n\n metric_parser = self.parse_metrics()\n if not self.flush_first_value and self.use_process_start_time:\n metric_parser = first_scrape_handler(metric_parser, runtime_data, datadog_agent.get_process_start_time())\n if self.label_aggregator.configured:\n metric_parser = self.label_aggregator(metric_parser)\n\n for metric in metric_parser:\n if metric.name in self.exclude_metrics or (\n self.exclude_metrics_pattern is not None and self.exclude_metrics_pattern.search(metric.name)\n ):\n self.submit_telemetry_number_of_ignored_metric_samples(metric)\n continue\n\n yield metric\n
Get the line streamer and yield processed metrics.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/scraper.py
def parse_metrics(self):\n \"\"\"\n Get the line streamer and yield processed metrics.\n \"\"\"\n\n line_streamer = self.stream_connection_lines()\n if self.raw_line_filter is not None:\n line_streamer = self.filter_connection_lines(line_streamer)\n\n # Since we determine `self.parse_metric_families` dynamically from the response and that's done as a\n # side effect inside the `line_streamer` generator, we need to consume the first line in order to\n # trigger that side effect.\n try:\n line_streamer = chain([next(line_streamer)], line_streamer)\n except StopIteration:\n # If line_streamer is an empty iterator, next(line_streamer) fails.\n return\n\n for metric in self.parse_metric_families(line_streamer):\n self.submit_telemetry_number_of_total_metric_samples(metric)\n\n # It is critical that the prefix is removed immediately so that\n # all other configuration may reference the trimmed metric name\n if self.raw_metric_prefix and metric.name.startswith(self.raw_metric_prefix):\n metric.name = metric.name[len(self.raw_metric_prefix) :]\n\n yield metric\n
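The comment in `parse_metrics` describes consuming the first line so that a side effect inside the generator runs before the parser is chosen. A minimal sketch of that pattern with `itertools.chain` follows; the generator `lines_with_side_effect`, the helper `primed`, and the `state` dictionary are hypothetical stand-ins rather than parts of the scraper.

```python
from itertools import chain


def lines_with_side_effect(state):
    # Stand-in for the line streamer: the content type only becomes known
    # once the generator body has started running.
    state['content_type'] = 'text/plain'
    yield 'metric_a 1'
    yield 'metric_b 2'


def primed(iterator):
    # Pull one item to force the generator to start, then put it back in front.
    try:
        first = next(iterator)
    except StopIteration:
        # An empty iterator has nothing to parse.
        return None
    return chain([first], iterator)


state = {}
stream = primed(lines_with_side_effect(state))
print(state)
print(list(stream))
```

No line is lost, because the consumed item is re-attached ahead of the remaining lines.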
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/scraper.py
def generate_sample_data(self, metric):\n \"\"\"\n Yield a sample of processed data.\n \"\"\"\n\n label_normalizer = get_label_normalizer(metric.type)\n\n for sample in metric.samples:\n value = sample.value\n if isnan(value) or isinf(value):\n self.log.debug('Ignoring sample for metric `%s` as it has an invalid value: %s', metric.name, value)\n continue\n\n tags = []\n skip_sample = False\n labels = sample.labels\n self.label_aggregator.populate(labels)\n label_normalizer(labels)\n\n for label_name, label_value in labels.items():\n sample_excluder = self.exclude_metrics_by_labels.get(label_name)\n if sample_excluder is not None and sample_excluder(label_value):\n skip_sample = True\n break\n elif label_name in self.exclude_labels:\n continue\n elif self.include_labels and label_name not in self.include_labels:\n continue\n\n label_name = self.rename_labels.get(label_name, label_name)\n tags.append(f'{label_name}:{label_value}')\n\n if skip_sample:\n continue\n\n tags.extend(self.tags)\n\n hostname = \"\"\n if self.hostname_label and self.hostname_label in labels:\n hostname = labels[self.hostname_label]\n if self.hostname_formatter is not None:\n hostname = self.hostname_formatter(hostname)\n\n self.submit_telemetry_number_of_processed_metric_samples()\n yield sample, tags, hostname\n
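The label handling in `generate_sample_data` can be read as a small filter-and-rename pipeline: excluded labels are dropped, an include list (when set) restricts what remains, and renames are applied last. The sketch below isolates only that part under simplified assumptions; the function `build_tags` and the sample labels are hypothetical, and `exclude_metrics_by_labels` and hostname handling are left out.

```python
def build_tags(labels, exclude_labels=(), include_labels=(), rename_labels=None):
    rename_labels = rename_labels or {}
    tags = []
    for name, value in labels.items():
        # Excluded labels are dropped first.
        if name in exclude_labels:
            continue
        # A non-empty include list restricts the remaining labels.
        if include_labels and name not in include_labels:
            continue
        # Renaming happens last, so filters refer to the original label names.
        name = rename_labels.get(name, name)
        tags.append(f'{name}:{value}')
    return tags


labels = {'pod': 'web-1', 'le': '0.5', 'host': 'node-a'}
print(build_tags(labels, exclude_labels={'le'}, rename_labels={'host': 'node'}))
print(build_tags(labels, include_labels={'pod'}))
```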
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/scraper.py
def stream_connection_lines(self):\n \"\"\"\n Yield the connection line.\n \"\"\"\n\n try:\n with self.get_connection() as connection:\n # Media type will be used to select parser dynamically\n self._content_type = connection.headers.get('Content-Type', '')\n for line in connection.iter_lines(decode_unicode=True):\n yield line\n except ConnectionError as e:\n if self.ignore_connection_errors:\n self.log.warning(\"OpenMetrics endpoint %s is not accessible\", self.endpoint)\n else:\n raise e\n
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/scraper.py
def filter_connection_lines(self, line_streamer):\n \"\"\"\n Filter connection lines in the line streamer.\n \"\"\"\n\n for line in line_streamer:\n if self.raw_line_filter.search(line):\n self.submit_telemetry_number_of_ignored_lines()\n else:\n yield line\n
Send a request to scrape metrics. Return the response or throw an exception.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/scraper.py
def get_connection(self):\n \"\"\"\n Send a request to scrape metrics. Return the response or throw an exception.\n \"\"\"\n\n try:\n response = self.send_request()\n except Exception as e:\n self.submit_health_check(ServiceCheck.CRITICAL, message=str(e))\n raise\n else:\n try:\n response.raise_for_status()\n except Exception as e:\n self.submit_health_check(ServiceCheck.CRITICAL, message=str(e))\n response.close()\n raise\n else:\n self.submit_health_check(ServiceCheck.OK)\n\n # Never derive the encoding from the locale\n if response.encoding is None:\n response.encoding = 'utf-8'\n\n self.submit_telemetry_endpoint_response_size(response)\n\n return response\n
If health service check is enabled, send an openmetrics.health service check.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/scraper.py
def submit_health_check(self, status, **kwargs):\n \"\"\"\n If health service check is enabled, send an `openmetrics.health` service check.\n \"\"\"\n\n if self.enable_health_service_check:\n self.service_check(self.SERVICE_CHECK_HEALTH, status, tags=self.static_tags, **kwargs)\n
"},{"location":"base/openmetrics/#transformers","title":"Transformers","text":""},{"location":"base/openmetrics/#datadog_checks.base.checks.openmetrics.v2.transform.Transformers","title":"datadog_checks.base.checks.openmetrics.v2.transform.Transformers","text":"Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/transform.py
This OpenMetrics implementation is the updated version of the original Prometheus/OpenMetrics implementation. The docs for the deprecated implementation are still available as a reference.
TLS/SSL is widely used to secure communications over a network. Much of the software that Datadog supports has features that allow TLS/SSL. Therefore, the Datadog Agent may need to connect with TLS/SSL to get metrics.
For Agent v7.24+, checks compatible with TLS/SSL should not manually create a raw ssl.SSLContext. Instead, check implementations should use AgentCheck.get_tls_context() to obtain a TLS/SSL context.
get_tls_context() allows a few optional parameters which may be helpful when developing integrations.
Creates and caches an SSLContext instance based on user configuration. Note that user configuration can be overridden by using overrides. This should only be applied to older integrations that manually set config values.
Since: Agent 7.24
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def get_tls_context(self, refresh=False, overrides=None):\n # type: (bool, Dict[AnyStr, Any]) -> ssl.SSLContext\n \"\"\"\n Creates and cache an SSLContext instance based on user configuration.\n Note that user configuration can be overridden by using `overrides`.\n This should only be applied to older integration that manually set config values.\n\n Since: Agent 7.24\n \"\"\"\n if not hasattr(self, '_tls_context_wrapper'):\n self._tls_context_wrapper = TlsContextWrapper(\n self.instance or {}, self.TLS_CONFIG_REMAPPER, overrides=overrides\n )\n\n if refresh:\n self._tls_context_wrapper.refresh_tls_context()\n\n return self._tls_context_wrapper.tls_context\n
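The method above builds the context once, stores it on the instance, and rebuilds it only when `refresh` is set. The sketch below illustrates that caching behaviour using only the standard library `ssl` module. The class `ContextHolder` and its `_build` helper are hypothetical stand-ins, not the `AgentCheck` API: the real method derives the context from the instance configuration through `TlsContextWrapper`, whereas this sketch uses `ssl.create_default_context()` and counts how often a context is built.

```python
import ssl


class ContextHolder:
    # Hypothetical stand-in for a check class, used only to show the caching.
    def __init__(self):
        self.builds = 0

    def _build(self):
        # Assumption: default settings instead of user configuration.
        self.builds += 1
        return ssl.create_default_context()

    def get_tls_context(self, refresh=False):
        if not hasattr(self, '_tls_context'):
            # First call: create and cache the context.
            self._tls_context = self._build()
        elif refresh:
            # Later calls rebuild only when explicitly asked to.
            self._tls_context = self._build()
        return self._tls_context


holder = ContextHolder()
first = holder.get_tls_context()
second = holder.get_tls_context()
third = holder.get_tls_context(refresh=True)
print(first is second, first is third, holder.builds)
```

Reusing the cached context avoids reloading certificates on every check run.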
"},{"location":"ddev/about/","title":"What's in the box?","text":"
The Dev package, often referred to by its CLI entrypoint ddev, is fundamentally split into 2 parts.
The test framework provides everything necessary to test integrations, such as:
Dependencies like pytest, mock, requests, etc.
Utilities for consistently handling complex logic or common operations
An orchestrator for arbitrary E2E environments
Python 2 Alert!
Some integrations still support Python version 2.7 and must be tested with it. As a consequence, so must parts of our test framework, for example the pytest plugin.
The CLI provides the interface through which tests are invoked, E2E environments are managed, and general repository maintenance (such as dependency management) occurs.
As the dependencies of the test framework are a subset of what is required for the CLI, the CLI tooling may import from the test framework, but not vice versa.
The diagram below shows the import hierarchy between each component. Clicking a node will open that component's location in the source code.
graph BT\n A([Plugins])\n click A \"https://github.com/DataDog/integrations-core/tree/master/datadog_checks_dev/datadog_checks/dev/plugin\" \"Test framework plugins location\"\n\n B([Test framework])\n click B \"https://github.com/DataDog/integrations-core/tree/master/datadog_checks_dev/datadog_checks/dev\" \"Test framework location\"\n\n C([CLI])\n click C \"https://github.com/DataDog/integrations-core/tree/master/datadog_checks_dev/datadog_checks/dev/tooling\" \"CLI tooling location\"\n\n A-->B\n C-->B
Name Type Description Default --core, -c boolean Work on integrations-core. False --extras, -e boolean Work on integrations-extras. False --marketplace, -m boolean Work on marketplace. False --agent, -a boolean Work on datadog-agent. False --here, -x boolean Work on the current location. False --org, -o text Override org config field for this invocation. None --color / --no-color boolean Whether or not to display colored output (default is auto-detection) [env vars: FORCE_COLOR/NO_COLOR] None --interactive / --no-interactive boolean Whether or not to allow features like prompts and progress bars (default is auto-detection) [env var: DDEV_INTERACTIVE] None --verbose, -v integer range (0 and above) Increase verbosity (can be used additively) [env var: DDEV_VERBOSE] 0 --quiet, -q integer range (0 and above) Decrease verbosity (can be used additively) [env var: DDEV_QUIET] 0 --config text The path to a custom config file to use [env var: DDEV_CONFIG] None --version boolean Show the version and exit. False --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-ci","title":"ddev ci","text":"
CI related utils. Anything here should be considered experimental.
Usage:
ddev ci [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-ci-setup","title":"ddev ci setup","text":"
Run CI setup scripts
Usage:
ddev ci setup [OPTIONS] [CHECKS]...\n
Options:
Name Type Description Default --changed boolean Only target changed checks False --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-clean","title":"ddev clean","text":"
Remove build and test artifacts for the entire repository.
Usage:
ddev clean [OPTIONS]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-config","title":"ddev config","text":"
Manage the config file
Usage:
ddev config [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-config-edit","title":"ddev config edit","text":"
Edit the config file with your default editor.
Usage:
ddev config edit [OPTIONS]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-config-explore","title":"ddev config explore","text":"
Open the config location in your file manager.
Usage:
ddev config explore [OPTIONS]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-config-find","title":"ddev config find","text":"
Show the location of the config file.
Usage:
ddev config find [OPTIONS]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-config-restore","title":"ddev config restore","text":"
Restore the config file to default settings.
Usage:
ddev config restore [OPTIONS]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-config-set","title":"ddev config set","text":"
Assign values to config file entries. If the value is omitted, you will be prompted, with the input hidden if it is sensitive.
Usage:
ddev config set [OPTIONS] KEY [VALUE]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-config-show","title":"ddev config show","text":"
Show the contents of the config file.
Usage:
ddev config show [OPTIONS]\n
Options:
Name Type Description Default --all, -a boolean Do not scrub secret fields False --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-create","title":"ddev create","text":"
Create scaffolding for a new integration.
NAME: The display name of the integration that will appear in documentation.
Usage:
ddev create [OPTIONS] NAME\n
Options:
Name Type Description Default --type, -t choice (check | jmx | logs | metrics_crawler | snmp_tile | tile) The type of integration to create. See below for more details. check --location, -l text The directory where files will be written None --non-interactive, -ni boolean Disable prompting for fields False --quiet, -q boolean Show less output False --dry-run, -n boolean Only show what would be created False --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-dep","title":"ddev dep","text":"
Manage dependencies
Usage:
ddev dep [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-dep-freeze","title":"ddev dep freeze","text":"
Combine all dependencies for the Agent's static environment.
This reads and merges the dependency specs from individual integrations and writes them to agent_requirements.in
Usage:
ddev dep freeze [OPTIONS]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-dep-pin","title":"ddev dep pin","text":"
Pin a dependency for all checks that require it.
Usage:
ddev dep pin [OPTIONS] DEFINITION\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-dep-sync","title":"ddev dep sync","text":"
Synchronize integration dependency spec with that of the agent as a whole.
Reads dependency spec from agent_requirements.in and propagates it to all integrations. For each integration we propagate only the relevant parts (i.e. its direct dependencies).
Usage:
ddev dep sync [OPTIONS]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-dep-updates","title":"ddev dep updates","text":"
Automatically check for dependency updates
Usage:
ddev dep updates [OPTIONS]\n
Options:
Name Type Description Default --sync, -s boolean Update the dependency definitions False --include-security-deps, -i boolean Attempt to update security dependencies False --batch-size, -b integer The maximum number of dependencies to upgrade if syncing None --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-docs","title":"ddev docs","text":"
Manage documentation.
Usage:
ddev docs [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-docs-build","title":"ddev docs build","text":"
Build documentation.
Usage:
ddev docs build [OPTIONS]\n
Options:
Name Type Description Default --check boolean Ensure links are valid False --pdf boolean Also export the site as PDF False --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-docs-serve","title":"ddev docs serve","text":"
Serve documentation.
Usage:
ddev docs serve [OPTIONS]\n
Options:
Name Type Description Default --dirty boolean Speed up reload time by only rebuilding edited pages (based on modified time). For development only. False --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-env","title":"ddev env","text":"
Manage environments.
Usage:
ddev env [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-env-agent","title":"ddev env agent","text":"
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-env-config","title":"ddev env config","text":"
Manage the config file
Usage:
ddev env config [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-env-config-edit","title":"ddev env config edit","text":"
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-env-config-explore","title":"ddev env config explore","text":"
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-env-config-find","title":"ddev env config find","text":"
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-env-config-show","title":"ddev env config show","text":"
Show the contents of the config file.
Usage:
ddev env config show [OPTIONS] INTEGRATION ENVIRONMENT\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-env-reload","title":"ddev env reload","text":"
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-env-shell","title":"ddev env shell","text":"
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-env-show","title":"ddev env show","text":"
Show active or available environments.
Usage:
ddev env show [OPTIONS] INTEGRATION [ENVIRONMENT]\n
Options:
Name Type Description Default --ascii boolean Whether or not to only use ASCII characters False --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-env-start","title":"ddev env start","text":"
Name Type Description Default --dev boolean Install the local version of the integration False --base boolean Install the local version of the base package, implicitly enabling the --dev option False --agent, -a text The Agent build to use e.g. a Docker image like datadog/agent:latest. You can also use the name of an Agent defined in the agents configuration section. None -e text Environment variables to pass to the Agent e.g. -e DD_URL=app.datadoghq.com -e DD_API_KEY=foobar None --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-env-stop","title":"ddev env stop","text":"
Stop environments. To stop all the running environments, use all as the integration name and the environment.
Usage:
ddev env stop [OPTIONS] INTEGRATION ENVIRONMENT\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-env-test","title":"ddev env test","text":"
Test environments.
This runs the end-to-end tests.
If no ENVIRONMENT is specified, active is selected, which tests all environments that are currently running. You may choose all to test all environments whether or not they are running.
Testing active environments will not stop them after tests complete. Testing environments that are not running will start and stop them automatically.
See these docs for how to pass ENVIRONMENT and PYTEST_ARGS:
https://datadoghq.dev/integrations-core/testing/
Usage:
ddev env test [OPTIONS] INTEGRATION [ENVIRONMENT] [PYTEST_ARGS]...\n
Options:
Name Type Description Default --dev boolean Install the local version of the integration False --base boolean Install the local version of the base package, implicitly enabling the --dev option False --agent, -a text The Agent build to use e.g. a Docker image like datadog/agent:latest. You can also use the name of an Agent defined in the agents configuration section. None -e text Environment variables to pass to the Agent e.g. -e DD_URL=app.datadoghq.com -e DD_API_KEY=foobar None --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta","title":"ddev meta","text":"
Anything here should be considered experimental.
This meta namespace can be used for an arbitrary number of niche or beta features without bloating the root namespace.
Usage:
ddev meta [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-catalog","title":"ddev meta catalog","text":"
Create a catalog with information about integrations
Usage:
ddev meta catalog [OPTIONS] CHECKS...\n
Options:
Name Type Description Default -f, --file text Output to file (it will be overwritten), you can pass \"tmp\" to generate a temporary file None --markdown, -m boolean Output to markdown instead of CSV False --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-changes","title":"ddev meta changes","text":"
Show changes since a specific date.
Usage:
ddev meta changes [OPTIONS] SINCE\n
Options:
Name Type Description Default --out, -o boolean Output to file False --eager boolean Skip validation of commit subjects False --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-create-example-commits","title":"ddev meta create-example-commits","text":"
Create branch commits from example repo
Usage:
ddev meta create-example-commits [OPTIONS] SOURCE_DIR\n
Options:
Name Type Description Default --prefix, -p text Optional text to prefix each commit `` --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-dash","title":"ddev meta dash","text":"
Dashboard utilities
Usage:
ddev meta dash [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-dash-export","title":"ddev meta dash export","text":"
Export a Dashboard as JSON
Usage:
ddev meta dash export [OPTIONS] URL INTEGRATION\n
Options:
Name Type Description Default --author, -a text The owner of this integration's dashboard. Default is 'Datadog' Datadog --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-jmx","title":"ddev meta jmx","text":"
JMX utilities
Usage:
ddev meta jmx [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-jmx-query-endpoint","title":"ddev meta jmx query-endpoint","text":"
Query endpoint for JMX info
Usage:
ddev meta jmx query-endpoint [OPTIONS] HOST PORT [DOMAIN]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-manifest","title":"ddev meta manifest","text":"
Manifest utilities
Usage:
ddev meta manifest [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-manifest-migrate","title":"ddev meta manifest migrate","text":"
Helper tool to ease the migration of a manifest to a newer version, auto-filling fields when possible
Inputs:
integration: The name of the integration folder to perform the migration on
to_version: The schema version to upgrade the manifest to
Usage:
ddev meta manifest migrate [OPTIONS] INTEGRATION TO_VERSION\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-prom","title":"ddev meta prom","text":"
Prometheus utilities
Usage:
ddev meta prom [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-prom-info","title":"ddev meta prom info","text":"
Show metric info from a Prometheus endpoint.
Example: $ ddev meta prom info -e :8080/_status/vars
Usage:
ddev meta prom info [OPTIONS]\n
Options:
Name Type Description Default -e, --endpoint text N/A None -f, --file filename N/A None --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-prom-parse","title":"ddev meta prom parse","text":"
Interactively parse metric info from a Prometheus endpoint and write it to metadata.csv.
Usage:
ddev meta prom parse [OPTIONS] CHECK\n
Options:
Name Type Description Default -e, --endpoint text N/A None -f, --file filename N/A None --here, -x boolean Output to the current location False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-scripts","title":"ddev meta scripts","text":"
Miscellaneous scripts that may be useful.
Usage:
ddev meta scripts [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-scripts-email2ghuser","title":"ddev meta scripts email2ghuser","text":"
Given an email, attempt to find a GitHub username associated with the email.
$ ddev meta scripts email2ghuser example@datadoghq.com
Usage:
ddev meta scripts email2ghuser [OPTIONS] EMAIL\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-scripts-generate-metrics","title":"ddev meta scripts generate-metrics","text":"
Generate metrics with fake values for an integration
You can provide the site and API key as options:
$ ddev meta scripts generate-metrics --site --api-key
It's easier, however, to switch ddev's org setting temporarily:
$ ddev -o meta scripts generate-metrics
Usage:
ddev meta scripts generate-metrics [OPTIONS] INTEGRATION\n
Options:
Name Type Description Default --site text The datadog SITE to use, e.g. \"datadoghq.com\". If not provided we will use ddev config org settings. None --api-key text The API key. If not provided we will use ddev config org settings. None --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-scripts-metrics2md","title":"ddev meta scripts metrics2md","text":"
Convert a check's metadata.csv file to a Markdown table, which will be copied to your clipboard.
By default it will be compact and only contain the most useful fields. If you wish to use arbitrary metric data, you may set the check to cb to target the current contents of your clipboard.
Usage:
ddev meta scripts metrics2md [OPTIONS] CHECK [FIELDS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-scripts-remove-labels","title":"ddev meta scripts remove-labels","text":"
Remove all labels from an issue or pull request. This is useful when there are too many labels and its state cannot be modified (known GitHub issue).
$ ddev meta scripts remove-labels 5626
Usage:
ddev meta scripts remove-labels [OPTIONS] ISSUE_NUMBER\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-scripts-serve-openmetrics-payload","title":"ddev meta scripts serve-openmetrics-payload","text":"
Serve and collect metrics from OpenMetrics files with a real Agent
$ ddev meta scripts serve-openmetrics-payload ray payload1.txt payload2.txt
Usage:
ddev meta scripts serve-openmetrics-payload [OPTIONS] INTEGRATION\n [PAYLOADS]...\n
Options:
Name Type Description Default -c, --config text Path to the config file to use for the integration. The openmetrics_endpoint option will be overridden to use the right URL. If not provided, the openmetrics_endpoint will be the only option configured. None --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-scripts-upgrade-python","title":"ddev meta scripts upgrade-python","text":"
Upgrade the Python version of all test environments.
$ ddev meta scripts upgrade-python 3.11
Usage:
ddev meta scripts upgrade-python [OPTIONS] VERSION\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-snmp","title":"ddev meta snmp","text":"
SNMP utilities
Usage:
ddev meta snmp [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-snmp-generate-profile-from-mibs","title":"ddev meta snmp generate-profile-from-mibs","text":"
Generate an SNMP profile from MIBs. Accepts a directory path containing MIB files to be used as the source to generate the profile, along with a filter if a device or family of devices supports only a subset of OIDs from a MIB.
filters is the path to a YAML file containing a collection of MIBs, with their list of MIB node names to be included. For example:
Note that each MIB:node_name corresponds to exactly one OID. However, some MIBs report legacy nodes that are overwritten.
To resolve, edit the MIB by removing legacy values manually before loading them with this profile generator. If a MIB is fully supported, it can be omitted from the filter as MIBs not found in a filter will be fully loaded. If a MIB is not fully supported, it can be listed with an empty node list, as CISCO-SYSLOG-MIB in the example.
-a, --aliases is an option to provide the path to a YAML file containing a list of aliases to be used as metric tags for tables, in the following format:
Most of the time, MIB tables define a column OID, within the same table or from a different table or even a different MIB, whose value can be used to index entries. This is the INDEX field in row nodes. As an example, entPhysicalContainsTable in ENTITY-MIB
Sometimes indexes are columns from another table, and we might want to use another column because it has more human-readable information; for example, we might prefer to see the interface name rather than its numerical table index. This can be achieved using metric_tag_aliases.
Returns a list of SNMP metrics and copies its YAML dump to the clipboard. Metric tags need to be added manually.
Usage:
ddev meta snmp generate-profile-from-mibs [OPTIONS] [MIB_FILES]...\n
Options:
Name Type Description Default -f, --filters text Path to OIDs filter None -a, --aliases text Path to metric tag aliases None --debug, -d boolean Include debug output False --interactive, -i boolean Prompt to confirm before saving to a file False --source, -s text Source of the MIB files. Can be a URL or a path to a directory https://raw.githubusercontent.com:443/DataDog/mibs.snmplabs.com/master/asn1/@mib@ --compiled_mibs_path, -c text Source of compiled MIB files. Can be a URL or a path to a directory https://raw.githubusercontent.com/DataDog/mibs.snmplabs.com/master/json/@mib@ --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-snmp-generate-traps-db","title":"ddev meta snmp generate-traps-db","text":"
Generate yaml or json formatted documents containing various information about traps. These files can be used by the Datadog Agent to enrich trap data. This command is intended for \"Network Devices Monitoring\" users who need to enrich traps that are not automatically supported by Datadog.
The expected workflow is as follows:
1- Identify a type of device that is sending traps that Datadog does not already recognize.
2- Fetch all the MIBs that Datadog does not support.
3- Run ddev meta snmp generate-traps-db -o ./output_dir/ /path/to/my/mib1 /path/to/my/mib2
You'll need to install pysmi manually beforehand.
Usage:
ddev meta snmp generate-traps-db [OPTIONS] MIB_FILES...\n
Options:
Name Type Description Default --mib-sources, -s text URL or a path to a directory containing the dependencies for [mib_files...]. Traps defined in these files are ignored. None --output-dir, -o directory Path to a directory where to store the created traps database file per MIB. Recommended option, do not use with --output-file None --output-file file Path to a file to store a compacted version of the traps database file. Do not use with --output-dir None --output-format choice (yaml | json) Use json instead of yaml for the output file(s). yaml --no-descr boolean Removes descriptions from the generated file(s) when set (more compact). False --debug, -d boolean Include debug output False --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-snmp-translate-profile","title":"ddev meta snmp translate-profile","text":"
Do OID translation in an SNMP profile. This isn't a plain replacement, as it doesn't preserve comments and indentation, but it should automate most of the work.
You'll need to install pysnmp and pysnmp-mibs manually beforehand.
Usage:
ddev meta snmp translate-profile [OPTIONS] PROFILE_PATH\n
Options:
Name Type Description Default --mib_source_url text Source URL to fetch missing MIBs https://raw.githubusercontent.com:443/DataDog/mibs.snmplabs.com/master/asn1/@mib@ --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-snmp-validate-mib-filenames","title":"ddev meta snmp validate-mib-filenames","text":"
Validate MIB file names. Frameworks used to load MIB files expect each MIB file name to match the MIB name.
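As a rough illustration of the kind of check described above, the sketch below compares a file name against the module name declared in the MIB text. It is not ddev's implementation: the helper names declared_mib_name and filename_matches are hypothetical, the comparison is assumed to ignore the file extension, and the parsing is deliberately naive (it takes the token right before the first DEFINITIONS keyword and does not skip comments).

```python
from pathlib import Path


def declared_mib_name(text):
    # Hypothetical helper: the module name is taken to be the token right
    # before the DEFINITIONS keyword, as in: IF-MIB DEFINITIONS ::= BEGIN
    tokens = text.split()
    if 'DEFINITIONS' not in tokens:
        return None
    index = tokens.index('DEFINITIONS')
    return tokens[index - 1] if index > 0 else None


def filename_matches(path, text):
    # Assumption: the file name is compared without its extension.
    return Path(path).stem == declared_mib_name(text)
```

A file stored as IF-MIB.txt whose text declares IF-MIB would pass this check, while the same text stored as ifmib.txt would not.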
Usage:
ddev meta snmp validate-mib-filenames [OPTIONS] [MIB_FILES]...\n
Options:
Name Type Description Default --interactive, -i boolean Prompt to confirm before renaming all invalid MIB files False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-snmp-validate-profile","title":"ddev meta snmp validate-profile","text":"
Validate SNMP profiles
Usage:
ddev meta snmp validate-profile [OPTIONS]\n
Options:
Name Type Description Default -f, --file text Path to a profile file to validate None -d, --directory text Path to a directory of profiles to validate None -v, --verbose boolean Increase verbosity of error messages False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-windows","title":"ddev meta windows","text":"
Windows utilities
Usage:
ddev meta windows [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-windows-pdh","title":"ddev meta windows pdh","text":"
PDH utilities
Usage:
ddev meta windows pdh [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-windows-pdh-browse","title":"ddev meta windows pdh browse","text":"
Explore performance counters.
You'll need to install pywin32 manually beforehand.
Usage:
ddev meta windows pdh browse [OPTIONS] [COUNTERSET]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release","title":"ddev release","text":"
Manage the release of integrations.
Usage:
ddev release [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-agent","title":"ddev release agent","text":"
A collection of tasks related to the Datadog Agent.
Usage:
ddev release agent [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-agent-changelog","title":"ddev release agent changelog","text":"
Generates a markdown file containing the list of checks that changed for a given Agent release. Agent version numbers are derived by inspecting tags on integrations-core, so running this tool might provide unexpected results if the repo is not up to date with the Agent release process.
If neither --since nor --to are passed (the most common use case), the tool will generate the whole changelog since Agent version 6.3.0 (before that point we don't have enough information to build the log).
Usage:
ddev release agent changelog [OPTIONS]\n
Options:
Name Type Description Default --since text Initial Agent version 6.3.0--to text Final Agent version None --write, -w boolean Write to the changelog file, if omitted contents will be printed to stdout False--force, -f boolean Replace an existing file False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-agent-integrations","title":"ddev release agent integrations","text":"
Generates a markdown file containing the list of integrations shipped in a given Agent release. Agent version numbers are derived by inspecting tags on integrations-core, so running this tool might provide unexpected results if the repo is not up to date with the Agent release process.
If neither --since nor --to are passed (the most common use case), the tool will generate the list for every Agent since version 6.3.0 (before that point we don't have enough information to build the log).
Usage:
ddev release agent integrations [OPTIONS]\n
Options:
Name Type Description Default --since text Initial Agent version 6.3.0--to text Final Agent version None --write, -w boolean Write to file, if omitted contents will be printed to stdout False--force, -f boolean Replace an existing file False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-agent-integrations-changelog","title":"ddev release agent integrations-changelog","text":"
Update integration CHANGELOG.md by adding the Agent version.
Agent version is only added to the integration versions released with a specific Agent release.
Name Type Description Default --since text Initial Agent version 6.3.0--to text Final Agent version None --write, -w boolean Write to the changelog file, if omitted contents will be printed to stdout False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-branch","title":"ddev release branch","text":"
Manage Agent release branches.
Usage:
ddev release branch [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-branch-create","title":"ddev release branch create","text":"
Create a branch for a release of the Agent.
BRANCH_NAME should match this pattern: ^\\d+.\\d+.x$, for example 7.52.x.
This command will also create the backport/<BRANCH_NAME> label in GitHub for this release branch.
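A minimal sketch of how a branch name could be checked against the pattern above before it is used. This is an illustration rather than ddev's own code: the function names are hypothetical, the dots are assumed to be literal separators, and [0-9] is used in place of the digit class.

```python
import re

# Assumed reading of the documented pattern: digits, a literal dot, digits,
# then a literal dot followed by x. [0-9] stands in for the digit class.
BRANCH_PATTERN = re.compile('^[0-9]+[.][0-9]+[.]x$')


def is_release_branch(name):
    # True for names shaped like 7.52.x, False otherwise.
    return BRANCH_PATTERN.match(name) is not None


def backport_label(name):
    # Build the backport/<BRANCH_NAME> label name mentioned above.
    if not is_release_branch(name):
        raise ValueError('not a release branch name: ' + name)
    return 'backport/' + name
```

With these assumptions, 7.52.x is accepted and maps to the label backport/7.52.x, while a name such as 7.52.0 is rejected.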
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-branch-tag","title":"ddev release branch tag","text":"
Tag the release branch either as release candidate or final release.
Usage:
ddev release branch tag [OPTIONS]\n
Options:
Name Type Description Default --final / --rc boolean Whether we're tagging the final release or a release candidate (rc). False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-build","title":"ddev release build","text":"
Build a wheel for a check as it is on the repo HEAD
Usage:
ddev release build [OPTIONS] CHECK\n
Options:
Name Type Description Default --sdist, -s boolean N/A False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-changelog","title":"ddev release changelog","text":"
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-changelog-fix","title":"ddev release changelog fix","text":"
Fix changelog entries.
This command is only needed if you are manually writing to the changelog, for instance for marketplace and extras integrations. Don't use this in integrations-core because the changelogs there are generated automatically.
The first line of every new changelog entry must include the PR number in which the change occurred. This command will apply this suffix to manually added entries if it is missing.
Usage:
ddev release changelog fix [OPTIONS]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-changelog-new","title":"ddev release changelog new","text":"
This creates new changelog entries in Markdown format.
If the ENTRY_TYPE is not specified, you will be prompted.
The --message option can be used to specify the changelog text. If this is not supplied, an editor will be opened for you to manually write the entry. The changelog text that is opened defaults to the PR title, followed by the most recent commit subject. If that is sufficient, then you may close the editor tab immediately.
By default, changelog entries will be created for all integrations that have changed code. To create entries only for specific targets, you may pass them as additional arguments after the entry type.
Usage:
ddev release changelog new [OPTIONS] [ENTRY_TYPE] [TARGETS]...\n
Options:
Name Type Description Default --message, -m text The changelog text None --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-list","title":"ddev release list","text":"
Show all versions of an integration.
Usage:
ddev release list [OPTIONS] INTEGRATION\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-make","title":"ddev release make","text":"
Perform a set of operations needed to release checks:
update the version in __about__.py
update the changelog
update the requirements-agent-release.txt file
update in-toto metadata
commit the above changes
You can release everything at once by setting the check to all.
If you run into issues signing, ensure you ran gpg --import <YOUR_KEY_ID>.gpg.pub
Usage:
ddev release make [OPTIONS] CHECKS...\n
Options:
Name Type Description Default --version text N/A None --end text N/A None --new boolean Ensure versions are at 1.0.0 False--skip-sign boolean Skip the signing of release metadata False--sign-only boolean Only sign release metadata False--exclude text Comma-separated list of checks to skip None --allow-master boolean Allow ddev to commit directly to master. Forbidden for core. False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-show","title":"ddev release show","text":"
To avoid GitHub's public API rate limits, you need to set github.user/github.token in your config file or use the DD_GITHUB_USER/DD_GITHUB_TOKEN environment variables.
Usage:
ddev release show [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-show-changes","title":"ddev release show changes","text":"
Show all the pending PRs for a given check.
Usage:
ddev release show changes [OPTIONS] CHECK\n
Options:
Name Type Description Default --tag-pattern text The regex pattern for the format of the tag. Required if the tag doesn't follow semver None --tag-prefix text Specify the prefix of the tag to use if the tag doesn't follow semver None --dry-run, -n boolean Run the command in dry-run mode False--since text The git ref to use instead of auto-detecting the tag to view changes since None --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-show-ready","title":"ddev release show ready","text":"
Show all the checks that can be released.
Usage:
ddev release show ready [OPTIONS]\n
Options:
Name Type Description Default --quiet, -q boolean N/A False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-stats","title":"ddev release stats","text":"
A collection of tasks to generate reports about releases.
Usage:
ddev release stats [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-stats-merged-prs","title":"ddev release stats merged-prs","text":"
Prints the PRs merged between the first RC and the current RC/final build
Usage:
ddev release stats merged-prs [OPTIONS]\n
Options:
Name Type Description Default --from-ref, -f text Reference to start stats on (first RC tagged) _required --to-ref, -t text Reference to end stats at (current RC/final tag) _required --release-milestone, -r text GitHub release milestone _required --exclude-releases, -e boolean Flag to exclude the release PRs from the list False --export-csv text CSV file where the list will be exported None --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-stats-report","title":"ddev release stats report","text":"
Prints some release stats we want to track
Usage:
ddev release stats report [OPTIONS]\n
Options:
Name Type Description Default --from-ref, -f text Reference to start stats on (first RC tagged) _required --to-ref, -t text Reference to end stats at (current RC/final tag) _required --release-milestone, -r text GitHub release milestone _required --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-tag","title":"ddev release tag","text":"
Tag the HEAD of the git repo with the current release number for a specific check. The tag is pushed to origin by default.
You can tag everything at once by setting the check to all.
Notice: specifying a different version than the one in __about__.py is a maintenance task that should be run under very specific circumstances (e.g. re-align an old release performed on the wrong commit).
Usage:
ddev release tag [OPTIONS] CHECK [VERSION]\n
Options:
Name Type Description Default --push / --no-push boolean N/A True--dry-run, -n boolean N/A False--skip-prerelease boolean N/A False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-upload","title":"ddev release upload","text":"
Release a specific check to PyPI as it is on the repo HEAD.
Usage:
ddev release upload [OPTIONS] CHECK\n
Options:
Name Type Description Default --sdist, -s boolean N/A False--dry-run, -n boolean N/A False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-run","title":"ddev run","text":"
Run commands in the proper repo.
Usage:
ddev run [OPTIONS] [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-status","title":"ddev status","text":"
Show information about the current environment.
Usage:
ddev status [OPTIONS]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-test","title":"ddev test","text":"
Run unit and integration tests.
See these docs to learn how to pass TARGET_SPEC and PYTEST_ARGS:
https://datadoghq.dev/integrations-core/testing/
Usage:
ddev test [OPTIONS] [TARGET_SPEC] [PYTEST_ARGS]...\n
Options:
Name Type Description Default --lint, -s boolean Run only lint & style checks False--fmt, -fs boolean Run only the code formatter False--bench, -b boolean Run only benchmarks False--latest boolean Only verify support of new product versions False--cov, -c boolean Measure code coverage False--compat boolean Check compatibility with the minimum allowed Agent version. Implies --recreate. False--ddtrace boolean Enable tracing during test execution False--memray boolean Measure memory usage during test execution False--recreate, -r boolean Recreate environments from scratch False--list, -l boolean Show available test environments False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate","title":"ddev validate","text":"
Verify certain aspects of the repo.
Usage:
ddev validate [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-agent-reqs","title":"ddev validate agent-reqs","text":"
Verify that the check versions are in sync with the requirements-agent-release.txt file.
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate all checks.
Usage:
ddev validate agent-reqs [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-all","title":"ddev validate all","text":"
Run all CI validations for a repo.
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate all checks.
Usage:
ddev validate all [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-ci","title":"ddev validate ci","text":"
Validate CI infrastructure configuration.
Usage:
ddev validate ci [OPTIONS]\n
Options:
Name Type Description Default --sync boolean Update the CI configuration False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-codeowners","title":"ddev validate codeowners","text":"
Validate that every integration has an entry in the CODEOWNERS file.
Usage:
ddev validate codeowners [OPTIONS]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-config","title":"ddev validate config","text":"
Validate default configuration files.
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate all checks.
Usage:
ddev validate config [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --sync, -s boolean Generate example configuration files based on specifications False--verbose, -v boolean Verbose mode False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-dashboards","title":"ddev validate dashboards","text":"
Validate all Dashboard definition files.
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate all checks.
Usage:
ddev validate dashboards [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --fix boolean Attempt to fix errors False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-dep","title":"ddev validate dep","text":"
This command will:
Verify the uniqueness of dependency versions across all checks, or optionally a single check
Verify all the dependencies are pinned.
Verify the embedded Python environment defined in the base check and requirements listed in every integration are compatible.
Verify each check specifies a CHECKS_BASE_REQ variable for datadog-checks-base requirement
Optionally verify that the datadog-checks-base requirement is lower-bounded
Optionally verify that the datadog-checks-base requirement satisfies a specific version
Usage:
ddev validate dep [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --require-base-check-version boolean Require specific version for datadog-checks-base requirement False--min-base-check-version text Specify minimum version for datadog-checks-base requirement, e.g. 11.0.0 None --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-eula","title":"ddev validate eula","text":"
Validate all EULA definition files.
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate all checks.
Usage:
ddev validate eula [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-http","title":"ddev validate http","text":"
Validate all integrations for usage of HTTP wrapper.
If integrations are specified, only those will be validated. An 'all' value will validate all checks.
Usage:
ddev validate http [OPTIONS] [INTEGRATIONS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-imports","title":"ddev validate imports","text":"
Validate proper imports in checks.
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate all checks.
Usage:
ddev validate imports [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --autofix boolean Apply suggested fix False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-integration-style","title":"ddev validate integration-style","text":"
Validate that a check follows the style guidelines.
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate all checks.
Name Type Description Default --verbose, -v boolean Verbose mode False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-jmx-metrics","title":"ddev validate jmx-metrics","text":"
Validate all default JMX metrics definitions.
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate all checks.
Usage:
ddev validate jmx-metrics [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --verbose, -v boolean Verbose mode False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-labeler","title":"ddev validate labeler","text":"
Validate labeler configuration.
Usage:
ddev validate labeler [OPTIONS]\n
Options:
Name Type Description Default --sync boolean Update the labeler configuration False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-legacy-signature","title":"ddev validate legacy-signature","text":"
Validate that no integration uses the legacy signature.
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate all checks.
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-license-headers","title":"ddev validate license-headers","text":"
Validate license headers in Python code files.
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate all Python files.
Usage:
ddev validate license-headers [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --fix boolean Attempt to fix errors False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-licenses","title":"ddev validate licenses","text":"
Validate third-party license list
Usage:
ddev validate licenses [OPTIONS]\n
Options:
Name Type Description Default --sync, -s boolean Generate the LICENSE-3rdparty.csv file False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-manifest","title":"ddev validate manifest","text":"
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-metadata","title":"ddev validate metadata","text":"
Validate metadata.csv files
If integrations are specified, only those will be validated. An 'all' or empty value will validate all metadata.csv files, and a 'changed' value will validate changed integrations.
Name Type Description Default --check-duplicates boolean Output warnings if there are duplicate short names and descriptions False--show-warnings, -w boolean Show warnings in addition to failures False--sync boolean Update the file False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-models","title":"ddev validate models","text":"
Validate configuration data models.
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate all data models.
Usage:
ddev validate models [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --sync, -s boolean Generate data models based on specifications False--verbose, -v boolean Verbose mode False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-openmetrics","title":"ddev validate openmetrics","text":"
Validate OpenMetrics metric limit.
If check is specified, only the check will be validated, if check value is 'changed' will only apply to changed checks, an 'all' or empty check value will validate nothing.
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-package","title":"ddev validate package","text":"
Validate all files for Python package metadata.
If check is specified, only the check will be validated, if check value is 'changed' will only apply to changed checks, an 'all' or empty check value will validate all files.
Usage:
ddev validate package [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-readmes","title":"ddev validate readmes","text":"
Validates README files.
If check is specified, only the check will be validated, if check value is 'changed' will only apply to changed checks, an 'all' or empty check value will validate all README files.
Usage:
ddev validate readmes [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --format-links, -fl boolean Automatically format links False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-saved-views","title":"ddev validate saved-views","text":"
Validates saved view files
If check is specified, only the check will be validated, if check value is 'changed' will only apply to changed checks, an 'all' or empty check value will validate all saved view files.
Usage:
ddev validate saved-views [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-service-checks","title":"ddev validate service-checks","text":"
Validate all service_checks.json files.
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate all service_checks.json files.
Usage:
ddev validate service-checks [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --sync boolean Generate example configuration files based on specifications False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-typos","title":"ddev validate typos","text":"
Validate spelling in the source code.
If check is specified, only that check's directory is validated. This command uses the codespell command line tool to detect spelling errors.
Usage:
ddev validate typos [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --fix boolean Apply suggested fix False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-version","title":"ddev validate version","text":"
Check that the integration version is defined and makes sense.
It should exist.
In Python packages the CHANGELOG should be automatically generated and match about.py.
In new Python packages CHANGELOG should have no version and about.py should have 0.0.1 as the version.
For now the validation is limited to integrations-core. INTEGRATIONS can be one or more integrations or the special value \"all\"
Usage:
ddev validate version [OPTIONS] [INTEGRATIONS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/configuration/","title":"Configuration","text":"
All configuration can be managed entirely by the ddev config command group. To locate the TOML config file, run:
All CLI commands are aware of the current repository context, defined by the option repo. This option should be a reference to a key in repos which is set to the path of a supported repository. For example, this configuration:
would make it so running e.g. ddev test nginx will look for an integration named nginx in /path/to/integrations-core no matter what directory you are in. If the selected path does not exist, then the current directory will be used.
For running environments with a live Agent, you can select a specific build version to use with the option agent. This option should be a reference to a key in agents which is a mapping of environment types to Agent versions. For example, this configuration:
would make it so environments that define the type as docker will use the Docker image that was built with the latest commit to the datadog-agent repo.
You can switch to using a particular organization with the option org. This option should be a reference to a key in orgs which is a mapping containing data specific to the organization. For example, this configuration:
To avoid GitHub's public API rate limits, you need to set github.user/github.token in your config file or use the DD_GITHUB_USER/DD_GITHUB_TOKEN environment variables.
Run ddev config show to see if your GitHub user and token are set.
If not:
Run ddev config set github.user <YOUR_GITHUB_USERNAME>
Create a personal access token with public_repo and read:org permissions
Run ddev config set github.token then paste the token
Setting dd_check_style to true will enable 2 environments for enforcing our style conventions:
style - This will check the formatting and will error if any issues are found. You may use the -s/--style flag of ddev test to execute only this environment.
format_style - This will format the code for you, resolving the most common issues caught by style environment. You can run the formatter by using the -fs/--format-style flag of ddev test.
Our pytest plugin makes a few fixtures available globally for use during tests. Also, it's responsible for managing the control flow of E2E environments.
Most tests will execute checks via the run method of the AgentCheck interface (if the check is stateful).
A consequence of this is that, unlike with the check method, exceptions are not propagated to the caller. This means that an exception cannot be asserted in a test and that errors are silently ignored.
The dd_run_check fixture takes a check instance and executes it while also propagating any exceptions like normal.
You can use the extract_message option to condense any exception message to just the original message rather than the full traceback.
def test_config(dd_run_check):\n check = AwesomeCheck('awesome', {}, [{'port': 'foo'}])\n\n with pytest.raises(Exception, match='^Option `port` must be an integer$'):\n dd_run_check(check, extract_message=True)\n
The dd_agent_check fixture will run the integration with a given configuration on a live Agent and return a populated aggregator. It accepts a single dict configuration representing either:
a single instance
a full configuration with top level keys instances, init_config, etc.
Internally, this is a wrapper around ddev env check and you can pass through any supported options or flags.
This fixture can only be used from tests marked as e2e. For example:
Occasionally, you will need to persist some data only known at the time of environment creation (like a generated token) through the test and environment tear down phases.
To do so, use the following fixtures:
dd_save_state - When executing the necessary steps to spin up an environment you may use this to save any object that can be serialized to JSON. For example:
dd_save_state('my_data', {'foo': 'bar'})\n
dd_get_state - This may be used to retrieve the data:
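As a minimal sketch of the save and retrieve round trip, the snippet below uses hypothetical stand-ins for the two fixtures so that it runs on its own. In a real test suite pytest injects dd_save_state and dd_get_state. The default keyword and the JSON round trip are assumptions made for illustration.

```python
import json

# Hypothetical in-memory store standing in for the real persistence layer.
_STATE = {}


def dd_save_state(key, value):
    # Serialize to JSON to mirror the requirement that saved objects
    # must be JSON-serializable.
    _STATE[key] = json.dumps(value)


def dd_get_state(key, default=None):
    # Assumed signature: return the saved value, or the default if absent.
    return json.loads(_STATE[key]) if key in _STATE else default


# During environment creation:
dd_save_state('my_data', {'foo': 'bar'})

# Later, during tests or tear down:
data = dd_get_state('my_data', default={})
missing = dd_get_state('other_data', default={})
```

Supplying a default keeps tear down code from failing when the setup phase was skipped and nothing was saved.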
The mock_http_response fixture mocks HTTP requests for the lifetime of a test.
The fixture can be used to mock the response of an endpoint. In the following example, we can mock the Prometheus output.
def test(mock_http_response):\n mock_http_response(\n \"\"\"\n # HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.\n # TYPE go_memstats_alloc_bytes gauge\n go_memstats_alloc_bytes 6.396288e+06\n \"\"\"\n )\n ...\n
The fixture dd_environment_runner manages communication between environments and the ddev env command group. You will never use it directly as it runs automatically.
It acts upon a fixture named dd_environment that every integration's test suite will define if E2E testing on a live Agent is desired. This fixture is responsible for starting and stopping environments and must adhere to the following requirements:
It yields a single dict representing the default configuration the Agent will use. It must be either:
a single instance
a full configuration with top level keys instances, init_config, etc.
Additionally, you can pass a second dict containing metadata.
The setup logic must occur before the yield and the tear down logic must occur after it. Also, both steps must only execute based on the value of environment variables.
Setup - only if DDEV_E2E_UP is not set to false
Tear down - only if DDEV_E2E_DOWN is not set to false
Note
The provided Docker and Terraform environment runner utilities will do this automatically for you.
env_type - This is the type of interface that will be used to interact with the Agent. Currently, we support docker (default) and local.
env_vars - A dict of environment variables and their values that will be present when starting the Agent.
docker_volumes - A list of str representing Docker volume mounts if env_type is docker e.g. /local/path:/agent/container/path:ro.
docker_platform - The container architecture to use if env_type is docker. Currently, we support linux (default) and windows.
logs_config - A list of configs that will be used by the Logs Agent. You will never need to use this directly, but rather via higher level abstractions.
Most integrations monitor services like databases or web servers, rather than system properties like CPU usage. For such cases, you'll want to spin up an environment and gracefully tear it down when tests finish.
We define all environment actions in a fixture called dd_environment that looks semantically like this:
This is not only used for regular tests, but is also the basis of our E2E testing. The start command executes everything before the yield and the stop command executes everything after it.
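The lifecycle described above can be sketched as a plain generator. This is an illustration only: start_service, stop_service, the events list, and the yielded configuration are hypothetical, and in a real suite dd_environment would be a session-scoped pytest fixture that takes no such argument. The environment variable handling follows the rule that each phase runs unless its variable is set to false.

```python
import os


def phase_enabled(name):
    # A phase runs unless its environment variable is set to 'false'.
    return os.environ.get(name, 'true') != 'false'


def start_service(events):
    # Hypothetical setup step, e.g. starting containers.
    events.append('up')


def stop_service(events):
    # Hypothetical tear down step, e.g. removing containers.
    events.append('down')


def dd_environment(events):
    # Everything before the yield is setup.
    if phase_enabled('DDEV_E2E_UP'):
        start_service(events)
    try:
        # The yielded dict is the default configuration for the Agent.
        yield {'instances': [{'port': 8080}]}
    finally:
        # Everything after the yield is tear down.
        if phase_enabled('DDEV_E2E_DOWN'):
            stop_service(events)


events = []
env = dd_environment(events)
config = next(env)        # runs the setup and receives the configuration
started = list(events)
env.close()               # triggers the tear down
```

Wrapping the yield in try/finally means the tear down still runs if the consumer of the generator raises.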
We provide a few utilities for common environment types.
The terraform_run utility makes it easy to create services from a directory of Terraform files.
from datadog_checks.dev.terraform import terraform_run\n\n@pytest.fixture(scope='session')\ndef dd_environment():\n with terraform_run(os.path.join(HERE, 'terraform')):\n yield ...\n
Currently, we only use this for services that would be too complex to set up with Docker (like OpenStack) or things that cannot be provided by Docker (like vSphere). We provide some ready-to-use cloud templates that are available for referencing by default. We prefer using GCP when possible.
Terraform E2E tests are not run in our public CI as that would needlessly slow down builds.
The mocker fixture is provided by the pytest-mock plugin. This fixture automatically restores anything that was mocked at the end of each test and is more ergonomic to use than stacking decorators or nesting context managers.
The benchmark fixture is provided by the pytest-benchmark plugin. It enables the profiling of functions with the low-overhead cProfile module.
It is quite useful for seeing the approximate time a given check takes to run, as well as gaining insight into any potential performance bottlenecks. You would use it like this:
def test_large_payload(benchmark, dd_run_check):\n check = AwesomeCheck('awesome', {}, [instance])\n\n # Run once to get any initialization out of the way.\n dd_run_check(check)\n\n benchmark(dd_run_check, check)\n
To add benchmarks, define a bench environment in hatch.toml:
[envs.bench]\n
By default, the test command skips all benchmark environments. To run only benchmark environments use the --bench/-b flag. The results are sorted by tottime, which is the total time spent in the given function (and excluding time made in calls to sub-functions).
We provide an easy way to utilize log collection with E2E Docker environments.
Pass mount_logs=True to docker_run. This will use the logs example in the integration's config spec. For example, the following defines 2 example log files:
If mount_logs is a sequence of int, only the selected indices (starting at 1) will be used. So, using the Apache example above, to only monitor the error log you would set it to [2].
In lieu of a config spec, for whatever reason, you may set mount_logs to a dict containing the standard logs key.
All requested log files are available to reference as environment variables for any Docker calls as DD_LOG_<LOG_CONFIG_INDEX> where the indices start at 1.
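To make the 1-based indexing concrete, here is a small sketch of how a selection such as [2] maps onto log configs and environment variable names. The select_logs helper, the log paths, and the choice to map each name to the configured path are hypothetical and only illustrate the numbering. This is not the implementation used by docker_run.

```python
# Hypothetical example log configs, in the order they appear in the spec.
example_logs = [
    {'type': 'file', 'path': '/var/log/apache2/access.log', 'source': 'apache'},
    {'type': 'file', 'path': '/var/log/apache2/error.log', 'source': 'apache'},
]


def select_logs(log_configs, mount_logs=True):
    # Indices start at 1, matching DD_LOG_<LOG_CONFIG_INDEX>.
    selected = {}
    for index, config in enumerate(log_configs, 1):
        if mount_logs is not True and index not in mount_logs:
            continue
        selected['DD_LOG_{}'.format(index)] = config['path']
    return selected


all_logs = select_logs(example_logs)          # equivalent to mount_logs=True
error_only = select_logs(example_logs, [2])   # only the second config
```

Note that a selected log keeps its original index, so choosing [2] yields DD_LOG_2 rather than DD_LOG_1.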
A convenient context manager for safely setting up and tearing down Docker environments.
Parameters:
compose_file (str):\n A path to a Docker compose file. A custom tear\n down is not required when using this.\nbuild (bool):\n Whether or not to build images for when `compose_file` is provided\nservice_name (str):\n Optional name for when ``compose_file`` is provided\nup (callable):\n A custom setup callable\ndown (callable):\n A custom tear down callable. This is required when using a custom setup.\non_error (callable):\n A callable called in case of an unhandled exception\nsleep (float):\n Number of seconds to wait before yielding. This occurs after all conditions are successful.\nendpoints (list[str]):\n Endpoints to verify access for before yielding. Shorthand for adding\n `CheckEndpoints(endpoints)` to the `conditions` argument.\nlog_patterns (list[str | re.Pattern]):\n Regular expression patterns to find in Docker logs before yielding.\n This is only available when `compose_file` is provided. Shorthand for adding\n `CheckDockerLogs(compose_file, log_patterns, 'all')` to the `conditions` argument.\nmount_logs (bool):\n Whether or not to mount log files in Agent containers based on example logs configuration\nconditions (callable):\n A list of callable objects that will be executed before yielding to check for errors\nenv_vars (dict[str, str]):\n A dictionary to update `os.environ` with during execution\nwrappers (list[callable]):\n A list of context managers to use during execution\nattempts (int):\n Number of attempts to run `up` and the `conditions` successfully. Defaults to 2 in CI\nattempts_wait (int):\n Time to wait between attempts\n
Source code in datadog_checks_dev/datadog_checks/dev/docker.py
@contextmanager\ndef docker_run(\n compose_file=None,\n build=False,\n service_name=None,\n up=None,\n down=None,\n on_error=None,\n sleep=None,\n endpoints=None,\n log_patterns=None,\n mount_logs=False,\n conditions=None,\n env_vars=None,\n wrappers=None,\n attempts=None,\n attempts_wait=1,\n):\n \"\"\"\n A convenient context manager for safely setting up and tearing down Docker environments.\n\n Parameters:\n\n compose_file (str):\n A path to a Docker compose file. A custom tear\n down is not required when using this.\n build (bool):\n Whether or not to build images for when `compose_file` is provided\n service_name (str):\n Optional name for when ``compose_file`` is provided\n up (callable):\n A custom setup callable\n down (callable):\n A custom tear down callable. This is required when using a custom setup.\n on_error (callable):\n A callable called in case of an unhandled exception\n sleep (float):\n Number of seconds to wait before yielding. This occurs after all conditions are successful.\n endpoints (list[str]):\n Endpoints to verify access for before yielding. Shorthand for adding\n `CheckEndpoints(endpoints)` to the `conditions` argument.\n log_patterns (list[str | re.Pattern]):\n Regular expression patterns to find in Docker logs before yielding.\n This is only available when `compose_file` is provided. Shorthand for adding\n `CheckDockerLogs(compose_file, log_patterns, 'all')` to the `conditions` argument.\n mount_logs (bool):\n Whether or not to mount log files in Agent containers based on example logs configuration\n conditions (callable):\n A list of callable objects that will be executed before yielding to check for errors\n env_vars (dict[str, str]):\n A dictionary to update `os.environ` with during execution\n wrappers (list[callable]):\n A list of context managers to use during execution\n attempts (int):\n Number of attempts to run `up` and the `conditions` successfully. 
Defaults to 2 in CI\n attempts_wait (int):\n Time to wait between attempts\n \"\"\"\n if compose_file and up:\n raise TypeError('You must select either a compose file or a custom setup callable, not both.')\n\n if compose_file is not None:\n if not isinstance(compose_file, str):\n raise TypeError('The path to the compose file is not a string: {}'.format(repr(compose_file)))\n\n set_up = ComposeFileUp(compose_file, build=build, service_name=service_name)\n if down is not None:\n tear_down = down\n else:\n tear_down = ComposeFileDown(compose_file)\n if on_error is None:\n on_error = ComposeFileLogs(compose_file)\n else:\n set_up = up\n tear_down = down\n\n docker_conditions = []\n\n if log_patterns is not None:\n if compose_file is None:\n raise ValueError(\n 'The `log_patterns` convenience is unavailable when using '\n 'a custom setup. Please use a custom condition instead.'\n )\n docker_conditions.append(CheckDockerLogs(compose_file, log_patterns, 'all'))\n\n if conditions is not None:\n docker_conditions.extend(conditions)\n\n wrappers = list(wrappers) if wrappers is not None else []\n\n if mount_logs:\n if isinstance(mount_logs, dict):\n wrappers.append(shared_logs(mount_logs['logs']))\n # Easy mode, read example config\n else:\n # An extra level deep because of the context manager\n check_root = find_check_root(depth=2)\n\n example_log_configs = _read_example_logs_config(check_root)\n if mount_logs is True:\n wrappers.append(shared_logs(example_log_configs))\n elif isinstance(mount_logs, (list, set)):\n wrappers.append(shared_logs(example_log_configs, mount_whitelist=mount_logs))\n else:\n raise TypeError(\n 'mount_logs: expected True, a list or a set, but got {}'.format(type(mount_logs).__name__)\n )\n\n with environment_run(\n up=set_up,\n down=tear_down,\n on_error=on_error,\n sleep=sleep,\n endpoints=endpoints,\n conditions=docker_conditions,\n env_vars=env_vars,\n wrappers=wrappers,\n attempts=attempts,\n attempts_wait=attempts_wait,\n ) as result:\n yield 
result\n
Determine the hostname Docker uses based on the environment, defaulting to localhost.
Source code in datadog_checks_dev/datadog_checks/dev/docker.py
def get_docker_hostname():\n \"\"\"\n Determine the hostname Docker uses based on the environment, defaulting to `localhost`.\n \"\"\"\n return urlparse(os.getenv('DOCKER_HOST', '')).hostname or 'localhost'\n
Get a Docker container's IP address from its ID or name.
Source code in datadog_checks_dev/datadog_checks/dev/docker.py
def get_container_ip(container_id_or_name):\n \"\"\"\n Get a Docker container's IP address from its ID or name.\n \"\"\"\n command = [\n 'docker',\n 'inspect',\n '-f',\n '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}',\n container_id_or_name,\n ]\n\n return run_command(command, capture='out', check=True).stdout.strip()\n
Returns a bool indicating whether or not a compose file has any active services.
Source code in datadog_checks_dev/datadog_checks/dev/docker.py
def compose_file_active(compose_file):\n \"\"\"\n Returns a `bool` indicating whether or not a compose file has any active services.\n \"\"\"\n command = ['docker', 'compose', '-f', compose_file, 'ps']\n lines = run_command(command, capture='out', check=True).stdout.strip().splitlines()\n\n return len(lines) > 1\n
A convenient context manager for safely setting up and tearing down Terraform environments.
Parameters:
directory (str):\n A path containing Terraform files\nsleep (float):\n Number of seconds to wait before yielding. This occurs after all conditions are successful.\nendpoints (list[str]):\n Endpoints to verify access for before yielding. Shorthand for adding\n `CheckEndpoints(endpoints)` to the `conditions` argument.\nconditions (list[callable]):\n A list of callable objects that will be executed before yielding to check for errors\nenv_vars (dict[str, str]):\n A dictionary to update `os.environ` with during execution\nwrappers (list[callable]):\n A list of context managers to use during execution\n
Source code in datadog_checks_dev/datadog_checks/dev/terraform.py
@contextmanager\ndef terraform_run(directory, sleep=None, endpoints=None, conditions=None, env_vars=None, wrappers=None):\n \"\"\"\n A convenient context manager for safely setting up and tearing down Terraform environments.\n\n Parameters:\n\n directory (str):\n A path containing Terraform files\n sleep (float):\n Number of seconds to wait before yielding. This occurs after all conditions are successful.\n endpoints (list[str]):\n Endpoints to verify access for before yielding. Shorthand for adding\n `CheckEndpoints(endpoints)` to the `conditions` argument.\n conditions (list[callable]):\n A list of callable objects that will be executed before yielding to check for errors\n env_vars (dict[str, str]):\n A dictionary to update `os.environ` with during execution\n wrappers (list[callable]):\n A list of context managers to use during execution\n \"\"\"\n if not shutil.which('terraform'):\n pytest.skip('Terraform not available')\n\n set_up = TerraformUp(directory)\n tear_down = TerraformDown(directory)\n\n with environment_run(\n up=set_up,\n down=tear_down,\n sleep=sleep,\n endpoints=endpoints,\n conditions=conditions,\n env_vars=env_vars,\n wrappers=wrappers,\n ) as result:\n yield result\n
This is not meant to be an exhaustive list of all the things we use, but rather a token of appreciation for the services and open source software we publicly benefit from.
The Python programming language, the default language of Agent Integrations, enables us and contributors to think about problems abstractly and express intent as clearly and concisely as possible.
A huge thanks to everyone involved in maintaining PyPI. We rely on it for providing all dependencies for not only tests, but also all Datadog Agent deployments.
Azure Pipelines is used for testing all Agent Integrations. A special shout-out to Microsoft for being extremely generous with our allowance of parallel runners; only they were able to meet the requirements of our unique monorepo.
GitHub Actions is used for all repository automation, like documentation deployment and pull request labeling.
"},{"location":"faq/faq/","title":"FAQ","text":""},{"location":"faq/faq/#integration-vs-check","title":"Integration vs Check","text":"
A Check is any integration whose execution is triggered directly in code by the Datadog Agent. Therefore, all Agent-based integrations written in Python or Go are considered Checks.
"},{"location":"faq/faq/#why-test-tests","title":"Why test tests","text":"
We track the coverage of test code in all cases, because a drop in coverage for test code means that a test function, or part of one, is never called. For an example see this test bug fixed thanks to test coverage. See pyca/pynacl#290 and #4280 for more details.
Often, libraries that interact with a product will name their packages after the product. So if you name a file <PRODUCT_NAME>.py, and inside try to import the library of the same name, you will get import errors that will be difficult to diagnose.
Never name a Python file the same as the integration's name.
The base classes may freely add new attributes for new features. Therefore to avoid collisions it is recommended that attribute names be prefixed with underscores, especially for names that are generic. For an example, see below.
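A minimal sketch of the collision this guideline avoids follows. The AgentCheck class here is a hypothetical stand-in for the real base class, and the tags attribute it sets is an invented example of something a future release might add.

```python
class AgentCheck:
    # Hypothetical stand-in for the real base class.
    def __init__(self, name, init_config, instances):
        self.name = name
        self.instance = instances[0]
        # Imagine a later release of the base class adding this attribute.
        self.tags = ['from:base']


class AwesomeCheck(AgentCheck):
    def __init__(self, name, init_config, instances):
        super().__init__(name, init_config, instances)
        # The underscore prefix keeps this generic name from overwriting
        # whatever the base class stores under the unprefixed name.
        self._tags = self.instance.get('tags', [])


check = AwesomeCheck('awesome', {}, [{'tags': ['env:dev']}])
```

Had the subclass assigned to self.tags instead, it would have silently replaced the value set by the base class.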
Since Agent v6, every instance of AgentCheck corresponds to a single YAML instance of an integration defined in the instances array of user configuration. As such, the instance argument the check method accepts is redundant and wasteful since you are parsing the same configuration at every run.
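One way to act on this, sketched under assumptions, is to parse the configuration once in the constructor and ignore the argument passed to check. The AgentCheck class below is a hypothetical stand-in so the snippet runs on its own, and the port option with its default is invented for illustration.

```python
class AgentCheck:
    # Hypothetical stand-in for the real base class.
    def __init__(self, name, init_config, instances):
        self.name = name
        self.init_config = init_config
        self.instance = instances[0]


class AwesomeCheck(AgentCheck):
    def __init__(self, name, init_config, instances):
        super().__init__(name, init_config, instances)
        # Parse and validate the configuration exactly once.
        self._port = int(self.instance.get('port', 8080))

    def check(self, _):
        # The instance argument is ignored; the parsed value is reused
        # on every run.
        return self._port


check = AwesomeCheck('awesome', {}, [{'port': '9000'}])
first_run = check.check(None)
second_run = check.check(None)
```

Naming the unused parameter with an underscore signals to readers that it is intentionally ignored.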
If you would like to create a default dashboard for an integration, follow the guidelines in the Best Practices section.
"},{"location":"guidelines/dashboards/#exporting-a-dashboard-payload","title":"Exporting a dashboard payload","text":"
When you've created a dashboard in the Datadog UI, you can export the dashboard payload to be included in its integration's assets directory.
Ensure that you have set an api_key and app_key for the org that contains the new dashboard in the ddev configuration.
Run the following command to export the dashboard:
ddev meta dash export <URL_OF_DASHBOARD> <INTEGRATION>\n
Tip
If the dashboard is for a contributor-maintained integration in the integrations-extras repo, run ddev --extras meta ... instead of ddev meta ....
The command will add the dashboard definition to the manifest.json file of the integration. The dashboard JSON payload will be available in /assets/dashboards/<DASHBOARD_TITLE>.json.
Tip
The dashboard is available at the following address /dash/integration/<DASHBOARD_KEY> in each region, where <DASHBOARD_KEY> is the one you have in the manifest.json file of the integration for this dashboard. This can be useful when you want to add a link to another dashboard inside your dashboard.
Commit the changes and create a pull request.
"},{"location":"guidelines/dashboards/#verify-the-preset-dashboard","title":"Verify the Preset Dashboard","text":"
Once your PR is merged and synced on production, you can find your dashboard in the Dashboard List page.
Tip
Make sure the integration tile is Installed in order to see the preset dashboard in the list.
Ensure logos render correctly on the Dashboard List page and within the preset dashboard.
"},{"location":"guidelines/dashboards/#best-practices","title":"Best Practices","text":""},{"location":"guidelines/dashboards/#why-are-dashboard-best-practices-useful","title":"Why are dashboard best practices useful?","text":"
A dashboard that follows best practices helps users consume data quickly. Best practices reduce friction when figuring out where to search for specific information or how to interpret data and find meaning. Additionally, guidelines give dashboard makers a starting point when creating a new dashboard.
Attention-grabbing \"about\" section with a banner image, concise copy, useful links, and a good typography hierarchy
A brief, annotated \"overview\" section with the most important data, right at the top
Simple graph titles and title-case group names
Nearly symmetrical in high density mode
Well formatted, concise notes explaining the value or purpose of data in each group. Try the presets \"caption\", \"annotation\", or \"header\", or pick your own combination of styles. Avoid using the smallest font size for notes that are long or include complex formatting, like bulleted lists or code blocks.
All widgets are placed within a group based on thematic organization, rather than directly on the background of the dashboard
Query value widgets have a timeseries background (e.g. \"Bars\") instead of being blank
Visualizations with obvious thresholds or zones use semantic formatting for graphs or custom red/green/yellow text formatting for query values.
Color coordination between group headers, notes within groups, and graphs within groups (e.g. all group headers or note widgets the same color). If you've applied a vivid green to all group headers, try making its notes light green.
Legends for each graph. Legends make it easy to read a graph without having to hover over each series or maximize the widget. Make sure you use aliases so the legend is easy to read. Automatic mode for legends is a great option that hides legends when space is tight and shows them when there's room.
Adjacent graphs have aligned x-axes. If one graph is showing a legend and the other isn't, the x-axes won't align\u2014make sure they either both show a legend or both do not.
For timeseries, base the display type on the type of metric.
Types of metric | Display type
Volume (e.g. number of connections) | area
Counts (e.g. number of errors) | bars
Multiple groups or default | lines
"},{"location":"guidelines/dashboards/#creating-a-new-dashboard","title":"Creating a New Dashboard","text":"
After selecting New Dashboard, you will have the option to choose from: Dashboard, Screenboard, and Timeboard. Dashboard is recommended.
Add a logo to the dashboard header. The integration logo will automatically appear in the header if the icon exists here and the integration_id matches the icon name. That means it will only appear when the dashboard you're working on is made into the official integration board.
Include the integration name in the dashboard title. (e.g. \"Elasticsearch Overview Dashboard\").
Warning
Avoid using - (hyphen) in the dashboard title as the dashboard URL is generated from the title.
"},{"location":"guidelines/dashboards/#standard-groups-to-include","title":"Standard Groups to Include","text":"
Always include an About group for the integration containing a brief description and helpful links. Edit the About group and select the \"banner\" display option (with the \"Show Title\" option unchecked), then link to a banner image like this: /static/images/integration_dashboard/your-image.png. For instructions on how to create and upload a banner image, go to the DRUIDS logo gallery, click the relevant logo, and click the Dashboard Banner tab. The About section should contain content, not data; avoid making the About section full-width. Consider copying the content in the About section into the hovercard that appears when hovering over the dashboard title.
Also include an Overview group containing service checks (e.g. liveness or readiness checks), a few of the most important metrics, and a monitor summary if you have pre-existing monitors for this integration, and place it at the top of the dashboard. The Overview section should contain data.
If log collection is enabled, make a Logs group. Insert a timeseries widget showing a bar graph of logs by status over time. Also include a log stream of logs with the \"Error\" or \"Critical\" status.
Tip
Consider turning groups into powerpacks if they appear repeatedly in dashboards irrespective of the integration type, so that you can insert the entire group with the correct formatting with a few clicks rather than adding the same widgets from scratch each time.
Research the metrics supported by the integration and consider grouping them in relevant categories. Groups containing prioritized metrics that are key to the performance and overview of the integration should be closer to the top. Some considerations when deciding which widgets should be grouped together:
Go from macro to micro levels within the system (e.g. for a database integration's dashboard, you could group node metrics in one group, index metrics in the next group, shard metrics in the third group)
Go from upstream to downstream sections within the system (e.g. for a data streams integration's dashboard, you could group producer metrics in one group, broker metrics in the next group, and consumer metrics in the third group)
Group together metrics that lead to the same actionable insights (e.g. all indexing metrics that reveal which indexes/shards should be optimized could all go in one group, while resource utilization metrics like disk space or memory usage that inform allocation and redistribution decisions should be grouped together in a separate group).
Template variables allow you to dynamically filter one or more widgets in a dashboard. Template variables must be universal and accessible by any user or account using the monitored service. Make sure all relevant graphs are listening to the relevant template variable filters. Template variables should be customized based on the type of technology.
Type of integration technology | Typical template variable
Database | Shards
Data Streaming | Consumer
ML Model Serving | Model
Tip
Adding *=scope as a template variable is useful since users can access all their own tags.
Prioritize concise graph titles that start with the most important information. Avoid common phrases such as \"number of\", and don't include the integration title e.g. \"Memcached Load\".
Concise title (good) | Verbose title (bad)
Events per node | Number of Kubernetes events per node
Pending tasks: [$node_name] | Total number of pending tasks in [$node_name]
Read/write operations | Number of read/write operations
Connections to server - rate | Rate of connections to server
Load | Memcached Load
Avoid repeating the group title or integration name in every widget in a group, especially if the widgets are query values with a custom unit of the same name. Note the word \"shards\" in each widget title in the group named \"shards\".
Always alias formulas
Group titles should be title case. Widget titles should be sentence case.
If you're showing a legend, make sure the aliases are easy to understand.
Graph titles should summarize the queried metric. Do not indicate the unit in the graph title because unit types are displayed automatically from metadata. An exception to this is if the calculation of the query represents a different type of unit.
Which widgets best represent your data? Try using a mix of widget types and sizes. Explore visualizations and formatting options until you're confident your dashboard is as clear as it can be. Sometimes a whole dashboard of timeseries is ok, but other times variety can improve things. The most commonly used metric widgets are timeseries, query values, and tables. For more information on the available widget types, see the list of supported dashboard widgets.
Try to make the left and right halves of your dashboard symmetrical in high density mode. Users with large monitors will see your dashboard in high density mode by default, so it's important to make sure the group relationships make sense, and the dashboard looks good. You can adjust group heights to achieve this, and move groups between the left and right halves.
a. (perfectly symmetrical)
b. (close enough)
Timeseries widgets should be at least 4 columns wide in order not to appear squashed on smaller displays.
Stream widgets should be at least 6 columns wide (half the dashboard width) for readability. You should place them at the end of a dashboard so they don't \"trap\" scrolling. It's useful to put stream widgets in a group by themselves so they can be collapsed. Add an event stream only if the service monitored by the dashboard is reporting events. Use sources:service_name.
Always check a dashboard at 1280px wide and 2560px wide to see how it looks on a smaller laptop and a larger monitor. The most common screen widths for dashboards are 1920, 1680, 1440, 2560, and 1280px, making up more than half of all dashboard page views combined.
Tip
If your monitor isn't large enough for high density mode, use the browser zoom controls to zoom out.
"},{"location":"guidelines/pr/","title":"Pull requests","text":""},{"location":"guidelines/pr/#separation-of-concerns","title":"Separation of concerns","text":"
Every pull request should do one thing only for easier Git management. For example, if you are editing documentation and notice an error in the shipped example configuration, fix the error in a separate pull request. Doing so enables a clean cherry-pick or revert of the bug fix should the need arise.
Different guidelines apply depending on which repo you are contributing to.
integrations-extras and marketplace | integrations-core
Every PR must add a changelog entry to each integration that has had its shipped code modified.
Each integration that can be installed on the Agent has its own CHANGELOG.md file at the root of its directory. Entries accumulate under the Unreleased section and at release time get put under their own section. For example:
# CHANGELOG - Foo\n\n## Unreleased\n\n***Changed***:\n\n* Made a breaking change ([#9000](https://github.com/DataDog/repo/pull/9000))\n\n Here's some extra context [...]\n\n***Added***:\n\n* Add a cool feature ([#42](https://github.com/DataDog/repo/pull/42))\n\n## 1.2.3 / 2081-04-01\n\n***Fixed***:\n\n...\n
For changelog types, we adhere to those defined by Keep a Changelog:
Added for new features or any non-trivial refactors.
Changed for changes in existing functionality.
Deprecated for soon-to-be removed features.
Removed for now removed features.
Fixed for any bug fixes.
Security in case of vulnerabilities.
The first line of every new changelog entry must end with a link to the PR in which the change occurred. To automatically apply this suffix to manually added entries, you may run the release changelog fix command. To create new entries, you may use the release changelog new command.
Tip
You may apply the changelog/no-changelog label to remove the CI check for changelog entries.
Formatting rules
If you are contributing to integrations-core, all you need to do is run the release changelog new command. It adds files to the changelog.d folder inside each integration that you have modified. Commit these files and push them to your PR.
If you decide that you do not need a changelog because the change you made won't be shipped with the Agent, add the changelog/no-changelog label to the PR.
The header for an integration version should be in the following format: version number / YYYY-MM-DD / Agent Version Number. The Agent version number is not necessary, but a valid version number and date are required. The first header after the file's title can be Unreleased. The content under this section is the same as any other.
Version is formatted incorrectly on line {line number}: The version you inputted is not a valid version, or there is no / separator between the version and date in your header.
Date is formatted incorrectly on line {line number}: The date must be formatted as YYYY-MM-DD, with no spaces in between.
The changelog header must be capitalized and written in this format: ***HEADER***:. Note that it should be bold and italicized.
Changelog type is incorrect on line {line count}: The changelog header on that line is not one of the six valid changelog types.
Changelog header order is incorrect on line {line count}: The changelog header on that line is in the wrong order. Double check the ordering of the changelogs and ensure that the headers for the changelog types are correctly ordered by priority.
Changelogs should start with asterisks, on line {line count}: All changelog details below each header should be bullet points, using asterisks.
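The header rules above can be approximated with a few string checks. The following is a minimal sketch, not the validator that the tooling actually runs: the function name check_changelog_lines is hypothetical, the messages only echo the error texts listed above, and the ordering and bullet rules are deliberately left out.

```python
from datetime import datetime

# The six changelog types listed above (from Keep a Changelog).
VALID_TYPES = {'Added', 'Changed', 'Deprecated', 'Removed', 'Fixed', 'Security'}


def check_changelog_lines(lines):
    # Hypothetical helper: return (line_number, message) tuples for header lines
    # that break the version, date, or changelog type rules.
    problems = []
    for number, line in enumerate(lines, start=1):
        if line.startswith('## '):
            header = line[3:].strip()
            if header == 'Unreleased':
                continue
            # Expected shape: version / YYYY-MM-DD, optionally followed by / Agent version
            parts = header.split(' / ')
            if len(parts) < 2 or not all(p.isdigit() for p in parts[0].split('.')):
                problems.append((number, 'Version is formatted incorrectly'))
                continue
            try:
                if len(parts[1]) != 10:
                    raise ValueError(parts[1])
                datetime.strptime(parts[1], '%Y-%m-%d')
            except ValueError:
                problems.append((number, 'Date is formatted incorrectly'))
        elif line.startswith('***'):
            # Expected shape: ***Type***: with Type being one of the six valid types
            if not line.endswith('***:') or line[3:-4] not in VALID_TYPES:
                problems.append((number, 'Changelog type is incorrect'))
    return problems


sample = ['## Unreleased', '', '***Added***:', '', '* Add a cool feature', '', '## 1.2.3 / 2081-04-01']
print(check_changelog_lines(sample))
```

Bullet lines start with a single asterisk followed by a space, so they never reach the type-header branch, which only triggers on three leading asterisks.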
A tool to sort imports lexicographically, by section, and by type. We use the 5 standard sections: __future__, stdlib, third party, first party, and local.
datadog_checks is configured as a first party namespace.
An easy-to-use wrapper around pycodestyle and pyflakes. We select everything it provides and only ignore a few things to give precedence to other tools.
A flake8 plugin for finding likely bugs and design problems in programs. We enable:
B001: Do not use bare except:, it also catches unexpected events like memory errors, interrupts, system exit, and so on. Prefer except Exception:.
B003: Assigning to os.environ doesn't clear the environment. Subprocesses are going to see outdated variables, in disagreement with the current process. Use os.environ.clear() or the env= argument to Popen.
B006: Do not use mutable data structures for argument defaults. All calls reuse one instance of that data structure, persisting changes between them.
B007: Loop control variable not used within the loop body. If this is intended, start the name with an underscore.
B301: Python 3 does not include .iter* methods on dictionaries. The default behavior is to return iterables. Simply remove the iter prefix from the method. For Python 2 compatibility, also prefer the Python 3 equivalent if you expect the size of the dict to be small and bounded. The performance regression on Python 2 will be negligible and the code will be clearest. Alternatively, use six.iter*.
B305: .next() is not a thing on Python 3. Use the next() builtin. For Python 2 compatibility, use six.next().
B306: BaseException.message has been deprecated as of Python 2.6 and is removed in Python 3. Use str(e) to access the user-readable message. Use e.args to access arguments passed to the exception.
B902: Invalid first argument used for method. Use self for instance methods, and cls for class methods.
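To make one of the rules above concrete, here is a small illustration of what B006 guards against. The function names append_bad and append_good are made up for this example.

```python
def append_bad(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # so every call without an explicit bucket mutates the same object.
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    # Use None as a sentinel and build a fresh list on each call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_bad(1)
second = append_bad(2)
# Both names refer to the single shared default list.
print(first is second, second)

# Each call gets its own list.
print(append_good(1), append_good(2))
```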
A comment-based type checker allowing a mix of dynamic and static typing. This is optional for now. In order to enable mypy for a specific integration, open its hatch.toml file and add the lines in the correct section:
The mypy-args setting defines the mypy command-line options for this specific integration. The --py2 flag ensures that the integration is Python 2.7 compatible. Here are some useful flags you can add:
--check-untyped-defs: Type-checks the interior of functions without type annotations.
--disallow-untyped-defs: Disallows defining functions without type annotations or with incomplete type annotations.
The datadog_checks/ tests/ arguments represent the list of files that mypy should type check. Feel free to edit them as desired, including removing tests/ (if you'd prefer to not type-check the test suite), or targeting specific files (when doing partial type checking).
Note that there is a default configuration in the mypy.ini file.
Prometheus is an open source monitoring system for timeseries metric data. Many Datadog integrations collect metrics based on Prometheus exported data sets.
Prometheus-based integrations use the OpenMetrics exposition format to collect metrics.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/base_check.py
class OpenMetricsBaseCheck(OpenMetricsScraperMixin, AgentCheck):\n \"\"\"\n OpenMetricsBaseCheck is a class that helps scrape endpoints that emit Prometheus metrics only\n with YAML configurations.\n\n Minimal example configuration:\n\n instances:\n - prometheus_url: http://example.com/endpoint\n namespace: \"foobar\"\n metrics:\n - bar\n - foo\n\n Agent 6 signature:\n\n OpenMetricsBaseCheck(name, init_config, instances, default_instances=None, default_namespace=None)\n\n \"\"\"\n\n DEFAULT_METRIC_LIMIT = 2000\n\n HTTP_CONFIG_REMAPPER = {\n 'ssl_verify': {'name': 'tls_verify'},\n 'ssl_cert': {'name': 'tls_cert'},\n 'ssl_private_key': {'name': 'tls_private_key'},\n 'ssl_ca_cert': {'name': 'tls_ca_cert'},\n 'prometheus_timeout': {'name': 'timeout'},\n 'request_size': {'name': 'request_size', 'default': 10},\n }\n\n # Allow tracing for openmetrics integrations\n def __init_subclass__(cls, **kwargs):\n super().__init_subclass__(**kwargs)\n return traced_class(cls)\n\n def __init__(self, *args, **kwargs):\n \"\"\"\n The base class for any Prometheus-based integration.\n \"\"\"\n args = list(args)\n default_instances = kwargs.pop('default_instances', None) or {}\n default_namespace = kwargs.pop('default_namespace', None)\n\n legacy_kwargs_in_args = args[4:]\n del args[4:]\n\n if len(legacy_kwargs_in_args) > 0:\n default_instances = legacy_kwargs_in_args[0] or {}\n if len(legacy_kwargs_in_args) > 1:\n default_namespace = legacy_kwargs_in_args[1]\n\n super(OpenMetricsBaseCheck, self).__init__(*args, **kwargs)\n self.config_map = {}\n self._http_handlers = {}\n self.default_instances = default_instances\n self.default_namespace = default_namespace\n\n # pre-generate the scraper configurations\n\n if 'instances' in kwargs:\n instances = kwargs['instances']\n elif len(args) == 4:\n # instances from agent 5 signature\n instances = args[3]\n elif isinstance(args[2], (tuple, list)):\n # instances from agent 6 signature\n instances = args[2]\n else:\n instances = None\n\n if 
instances is not None:\n for instance in instances:\n possible_urls = instance.get('possible_prometheus_urls')\n if possible_urls is not None:\n for url in possible_urls:\n try:\n new_instance = deepcopy(instance)\n new_instance.update({'prometheus_url': url})\n scraper_config = self.get_scraper_config(new_instance)\n response = self.send_request(url, scraper_config)\n response.raise_for_status()\n instance['prometheus_url'] = url\n self.get_scraper_config(instance)\n break\n except (IOError, requests.HTTPError, requests.exceptions.SSLError) as e:\n self.log.info(\"Couldn't connect to %s: %s, trying next possible URL.\", url, str(e))\n else:\n raise CheckException(\n \"The agent could not connect to any of the following URLs: %s.\" % possible_urls\n )\n else:\n self.get_scraper_config(instance)\n\n def check(self, instance):\n # Get the configuration for this specific instance\n scraper_config = self.get_scraper_config(instance)\n\n # We should be specifying metrics for checks that are vanilla OpenMetricsBaseCheck-based\n if not scraper_config['metrics_mapper']:\n raise CheckException(\n \"You have to collect at least one metric from the endpoint: {}\".format(scraper_config['prometheus_url'])\n )\n\n self.process(scraper_config)\n\n def get_scraper_config(self, instance):\n \"\"\"\n Validates the instance configuration and creates a scraper configuration for a new instance.\n If the endpoint already has a corresponding configuration, return the cached configuration.\n \"\"\"\n endpoint = instance.get('prometheus_url')\n\n if endpoint is None:\n raise CheckException(\"Unable to find prometheus URL in config file.\")\n\n # If we've already created the corresponding scraper configuration, return it\n if endpoint in self.config_map:\n return self.config_map[endpoint]\n\n # Otherwise, we create the scraper configuration\n config = self.create_scraper_configuration(instance)\n\n # Add this configuration to the config_map\n self.config_map[endpoint] = config\n\n return 
config\n\n def _finalize_tags_to_submit(self, _tags, metric_name, val, metric, custom_tags=None, hostname=None):\n \"\"\"\n Format the finalized tags\n This is generally a noop, but it can be used to change the tags before sending metrics\n \"\"\"\n return _tags\n\n def _filter_metric(self, metric, scraper_config):\n \"\"\"\n Used to filter metrics at the beginning of the processing, by default no metric is filtered\n \"\"\"\n return False\n
The base class for any Prometheus-based integration.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/base_check.py
def __init__(self, *args, **kwargs):\n \"\"\"\n The base class for any Prometheus-based integration.\n \"\"\"\n args = list(args)\n default_instances = kwargs.pop('default_instances', None) or {}\n default_namespace = kwargs.pop('default_namespace', None)\n\n legacy_kwargs_in_args = args[4:]\n del args[4:]\n\n if len(legacy_kwargs_in_args) > 0:\n default_instances = legacy_kwargs_in_args[0] or {}\n if len(legacy_kwargs_in_args) > 1:\n default_namespace = legacy_kwargs_in_args[1]\n\n super(OpenMetricsBaseCheck, self).__init__(*args, **kwargs)\n self.config_map = {}\n self._http_handlers = {}\n self.default_instances = default_instances\n self.default_namespace = default_namespace\n\n # pre-generate the scraper configurations\n\n if 'instances' in kwargs:\n instances = kwargs['instances']\n elif len(args) == 4:\n # instances from agent 5 signature\n instances = args[3]\n elif isinstance(args[2], (tuple, list)):\n # instances from agent 6 signature\n instances = args[2]\n else:\n instances = None\n\n if instances is not None:\n for instance in instances:\n possible_urls = instance.get('possible_prometheus_urls')\n if possible_urls is not None:\n for url in possible_urls:\n try:\n new_instance = deepcopy(instance)\n new_instance.update({'prometheus_url': url})\n scraper_config = self.get_scraper_config(new_instance)\n response = self.send_request(url, scraper_config)\n response.raise_for_status()\n instance['prometheus_url'] = url\n self.get_scraper_config(instance)\n break\n except (IOError, requests.HTTPError, requests.exceptions.SSLError) as e:\n self.log.info(\"Couldn't connect to %s: %s, trying next possible URL.\", url, str(e))\n else:\n raise CheckException(\n \"The agent could not connect to any of the following URLs: %s.\" % possible_urls\n )\n else:\n self.get_scraper_config(instance)\n
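The constructor above probes each entry of possible_prometheus_urls in turn and relies on the for/else construct: the else branch runs only when the loop finishes without hitting break. The following is a minimal sketch of that pattern in isolation. The names first_reachable and fake_probe are hypothetical and not part of the base check, and the probe stands in for the real HTTP request.

```python
def first_reachable(urls, probe):
    # Return the first URL for which probe(url) does not raise IOError.
    for url in urls:
        try:
            probe(url)
            chosen = url
            break
        except IOError as e:
            print('could not connect to %s: %s, trying next possible URL' % (url, e))
    else:
        # Reached only if the loop never hit break, i.e. every URL failed.
        raise RuntimeError('could not connect to any of: %s' % urls)
    return chosen


def fake_probe(url):
    # Stand-in for an HTTP request; only the second URL is treated as reachable.
    if url != 'http://b.example/metrics':
        raise IOError('connection refused')


print(first_reachable(['http://a.example/metrics', 'http://b.example/metrics'], fake_probe))
```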
"},{"location":"legacy/prometheus/#datadog_checks.base.checks.openmetrics.base_check.OpenMetricsBaseCheck.check","title":"check(instance)","text":"Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/base_check.py
def check(self, instance):\n # Get the configuration for this specific instance\n scraper_config = self.get_scraper_config(instance)\n\n # We should be specifying metrics for checks that are vanilla OpenMetricsBaseCheck-based\n if not scraper_config['metrics_mapper']:\n raise CheckException(\n \"You have to collect at least one metric from the endpoint: {}\".format(scraper_config['prometheus_url'])\n )\n\n self.process(scraper_config)\n
Validates the instance configuration and creates a scraper configuration for a new instance. If the endpoint already has a corresponding configuration, return the cached configuration.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/base_check.py
def get_scraper_config(self, instance):\n \"\"\"\n Validates the instance configuration and creates a scraper configuration for a new instance.\n If the endpoint already has a corresponding configuration, return the cached configuration.\n \"\"\"\n endpoint = instance.get('prometheus_url')\n\n if endpoint is None:\n raise CheckException(\"Unable to find prometheus URL in config file.\")\n\n # If we've already created the corresponding scraper configuration, return it\n if endpoint in self.config_map:\n return self.config_map[endpoint]\n\n # Otherwise, we create the scraper configuration\n config = self.create_scraper_configuration(instance)\n\n # Add this configuration to the config_map\n self.config_map[endpoint] = config\n\n return config\n
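The caching behavior described above can be illustrated without the Agent installed. This is a simplified sketch that assumes a hypothetical ConfigCache class and a trivial factory in place of create_scraper_configuration. It mirrors only the lookup by prometheus_url.

```python
class ConfigCache(object):
    # Hypothetical stand-in for the config_map logic shown above.
    def __init__(self, factory):
        self._factory = factory
        self._by_endpoint = {}

    def get(self, instance):
        endpoint = instance.get('prometheus_url')
        if endpoint is None:
            raise ValueError('Unable to find prometheus URL in config file.')
        # Build the configuration only the first time an endpoint is seen.
        if endpoint not in self._by_endpoint:
            self._by_endpoint[endpoint] = self._factory(instance)
        return self._by_endpoint[endpoint]


cache = ConfigCache(lambda instance: dict(instance))
a = cache.get({'prometheus_url': 'http://example.com/endpoint', 'namespace': 'foobar'})
b = cache.get({'prometheus_url': 'http://example.com/endpoint', 'namespace': 'changed'})
print(a is b, b['namespace'])
```

Because the cache key is the endpoint alone, a later instance that reuses the same URL with different options receives the configuration built for the first one.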
"},{"location":"legacy/prometheus/#datadog_checks.base.checks.openmetrics.mixins.OpenMetricsScraperMixin","title":"datadog_checks.base.checks.openmetrics.mixins.OpenMetricsScraperMixin","text":"Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/mixins.py
class OpenMetricsScraperMixin(object):\n # pylint: disable=E1101\n # This class is not supposed to be used by itself, it provides scraping behavior but\n # need to be within a check in the end\n\n # indexes in the sample tuple of core.Metric\n SAMPLE_NAME = 0\n SAMPLE_LABELS = 1\n SAMPLE_VALUE = 2\n\n MICROS_IN_S = 1000000\n\n MINUS_INF = float(\"-inf\")\n\n TELEMETRY_GAUGE_MESSAGE_SIZE = \"payload.size\"\n TELEMETRY_COUNTER_METRICS_BLACKLIST_COUNT = \"metrics.blacklist.count\"\n TELEMETRY_COUNTER_METRICS_INPUT_COUNT = \"metrics.input.count\"\n TELEMETRY_COUNTER_METRICS_IGNORE_COUNT = \"metrics.ignored.count\"\n TELEMETRY_COUNTER_METRICS_PROCESS_COUNT = \"metrics.processed.count\"\n\n METRIC_TYPES = ['counter', 'gauge', 'summary', 'histogram']\n\n KUBERNETES_TOKEN_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'\n METRICS_WITH_COUNTERS = {\"counter\", \"histogram\", \"summary\"}\n\n def __init__(self, *args, **kwargs):\n # Initialize AgentCheck's base class\n super(OpenMetricsScraperMixin, self).__init__(*args, **kwargs)\n\n def create_scraper_configuration(self, instance=None):\n \"\"\"\n Creates a scraper configuration.\n\n If instance does not specify a value for a configuration option, the value will default to the `init_config`.\n Otherwise, the `default_instance` value will be used.\n\n A default mixin configuration will be returned if there is no instance.\n \"\"\"\n if 'openmetrics_endpoint' in instance:\n raise CheckException('The setting `openmetrics_endpoint` is only available for Agent version 7 or later')\n\n # We can choose to create a default mixin configuration for an empty instance\n if instance is None:\n instance = {}\n\n # Supports new configuration options\n config = copy.deepcopy(instance)\n\n # Set the endpoint\n endpoint = instance.get('prometheus_url')\n if instance and endpoint is None:\n raise CheckException(\"You have to define a prometheus_url for each prometheus instance\")\n\n # Set the bearer token authorization to 
customer value, then get the bearer token\n self.update_prometheus_url(instance, config, endpoint)\n\n # `NAMESPACE` is the prefix metrics will have. Need to be hardcoded in the\n # child check class.\n namespace = instance.get('namespace')\n # Check if we have a namespace\n if instance and namespace is None:\n if self.default_namespace is None:\n raise CheckException(\"You have to define a namespace for each prometheus check\")\n namespace = self.default_namespace\n\n config['namespace'] = namespace\n\n # Retrieve potential default instance settings for the namespace\n default_instance = self.default_instances.get(namespace, {})\n\n def _get_setting(name, default):\n return instance.get(name, default_instance.get(name, default))\n\n # `metrics_mapper` is a dictionary where the keys are the metrics to capture\n # and the values are the corresponding metrics names to have in datadog.\n # Note: it is empty in the parent class but will need to be\n # overloaded/hardcoded in the final check not to be counted as custom metric.\n\n # Metrics are preprocessed if no mapping\n metrics_mapper = {}\n # We merge list and dictionaries from optional defaults & instance settings\n metrics = default_instance.get('metrics', []) + instance.get('metrics', [])\n for metric in metrics:\n if isinstance(metric, string_types):\n metrics_mapper[metric] = metric\n else:\n metrics_mapper.update(metric)\n\n config['metrics_mapper'] = metrics_mapper\n\n # `_wildcards_re` is a Pattern object used to match metric wildcards\n config['_wildcards_re'] = None\n\n wildcards = set()\n for metric in config['metrics_mapper']:\n if \"*\" in metric:\n wildcards.add(translate(metric))\n\n if wildcards:\n config['_wildcards_re'] = compile('|'.join(wildcards))\n\n # `prometheus_metrics_prefix` allows to specify a prefix that all\n # prometheus metrics should have. 
This can be used when the prometheus\n # endpoint we are scrapping allows to add a custom prefix to it's\n # metrics.\n config['prometheus_metrics_prefix'] = instance.get(\n 'prometheus_metrics_prefix', default_instance.get('prometheus_metrics_prefix', '')\n )\n\n # `label_joins` holds the configuration for extracting 1:1 labels from\n # a target metric to all metric matching the label, example:\n # self.label_joins = {\n # 'kube_pod_info': {\n # 'labels_to_match': ['pod'],\n # 'labels_to_get': ['node', 'host_ip']\n # }\n # }\n config['label_joins'] = default_instance.get('label_joins', {})\n config['label_joins'].update(instance.get('label_joins', {}))\n\n # `_label_mapping` holds the additionals label info to add for a specific\n # label value, example:\n # self._label_mapping = {\n # 'pod': {\n # 'dd-agent-9s1l1': {\n # \"node\": \"yolo\",\n # \"host_ip\": \"yey\"\n # }\n # }\n # }\n config['_label_mapping'] = {}\n\n # `_active_label_mapping` holds a dictionary of label values found during the run\n # to cleanup the label_mapping of unused values, example:\n # self._active_label_mapping = {\n # 'pod': {\n # 'dd-agent-9s1l1': True\n # }\n # }\n config['_active_label_mapping'] = {}\n\n # `_watched_labels` holds the sets of labels to watch for enrichment\n config['_watched_labels'] = {}\n\n config['_dry_run'] = True\n\n # Some metrics are ignored because they are duplicates or introduce a\n # very high cardinality. 
Metrics included in this list will be silently\n # skipped without a 'Unable to handle metric' debug line in the logs\n config['ignore_metrics'] = instance.get('ignore_metrics', default_instance.get('ignore_metrics', []))\n config['_ignored_metrics'] = set()\n\n # `_ignored_re` is a Pattern object used to match ignored metric patterns\n config['_ignored_re'] = None\n ignored_patterns = set()\n\n # Separate ignored metric names and ignored patterns in different sets for faster lookup later\n for metric in config['ignore_metrics']:\n if '*' in metric:\n ignored_patterns.add(translate(metric))\n else:\n config['_ignored_metrics'].add(metric)\n\n if ignored_patterns:\n config['_ignored_re'] = compile('|'.join(ignored_patterns))\n\n # Ignore metrics based on label keys or specific label values\n config['ignore_metrics_by_labels'] = instance.get(\n 'ignore_metrics_by_labels', default_instance.get('ignore_metrics_by_labels', {})\n )\n\n # If you want to send the buckets as tagged values when dealing with histograms,\n # set send_histograms_buckets to True, set to False otherwise.\n config['send_histograms_buckets'] = is_affirmative(\n instance.get('send_histograms_buckets', default_instance.get('send_histograms_buckets', True))\n )\n\n # If you want the bucket to be non cumulative and to come with upper/lower bound tags\n # set non_cumulative_buckets to True, enabled when distribution metrics are enabled.\n config['non_cumulative_buckets'] = is_affirmative(\n instance.get('non_cumulative_buckets', default_instance.get('non_cumulative_buckets', False))\n )\n\n # Send histograms as datadog distribution metrics\n config['send_distribution_buckets'] = is_affirmative(\n instance.get('send_distribution_buckets', default_instance.get('send_distribution_buckets', False))\n )\n\n # Non cumulative buckets are mandatory for distribution metrics\n if config['send_distribution_buckets'] is True:\n config['non_cumulative_buckets'] = True\n\n # If you want to send `counter` metrics as 
monotonic counts, set this value to True.\n # Set to False if you want to instead send those metrics as `gauge`.\n config['send_monotonic_counter'] = is_affirmative(\n instance.get('send_monotonic_counter', default_instance.get('send_monotonic_counter', True))\n )\n\n # If you want `counter` metrics to be submitted as both gauges and monotonic counts. Set this value to True.\n config['send_monotonic_with_gauge'] = is_affirmative(\n instance.get('send_monotonic_with_gauge', default_instance.get('send_monotonic_with_gauge', False))\n )\n\n config['send_distribution_counts_as_monotonic'] = is_affirmative(\n instance.get(\n 'send_distribution_counts_as_monotonic',\n default_instance.get('send_distribution_counts_as_monotonic', False),\n )\n )\n\n config['send_distribution_sums_as_monotonic'] = is_affirmative(\n instance.get(\n 'send_distribution_sums_as_monotonic',\n default_instance.get('send_distribution_sums_as_monotonic', False),\n )\n )\n\n # If the `labels_mapper` dictionary is provided, the metrics labels names\n # in the `labels_mapper` will use the corresponding value as tag name\n # when sending the gauges.\n config['labels_mapper'] = default_instance.get('labels_mapper', {})\n config['labels_mapper'].update(instance.get('labels_mapper', {}))\n # Rename bucket \"le\" label to \"upper_bound\"\n config['labels_mapper']['le'] = 'upper_bound'\n\n # `exclude_labels` is an array of label names to exclude. Those labels\n # will just not be added as tags when submitting the metric.\n config['exclude_labels'] = default_instance.get('exclude_labels', []) + instance.get('exclude_labels', [])\n\n # `include_labels` is an array of label names to include. 
If these labels are not in\n # the `exclude_labels` list, then they are added as tags when submitting the metric.\n config['include_labels'] = default_instance.get('include_labels', []) + instance.get('include_labels', [])\n\n # `type_overrides` is a dictionary where the keys are prometheus metric names\n # and the values are a metric type (name as string) to use instead of the one\n # listed in the payload. It can be used to force a type on untyped metrics.\n # Note: it is empty in the parent class but will need to be\n # overloaded/hardcoded in the final check not to be counted as custom metric.\n config['type_overrides'] = default_instance.get('type_overrides', {})\n config['type_overrides'].update(instance.get('type_overrides', {}))\n\n # `_type_override_patterns` is a dictionary where we store Pattern objects\n # that match metric names as keys, and their corresponding metric type overrides as values.\n config['_type_override_patterns'] = {}\n\n with_wildcards = set()\n for metric, type in iteritems(config['type_overrides']):\n if '*' in metric:\n config['_type_override_patterns'][compile(translate(metric))] = type\n with_wildcards.add(metric)\n\n # cleanup metric names with wildcards from the 'type_overrides' dict\n for metric in with_wildcards:\n del config['type_overrides'][metric]\n\n # Some metrics are retrieved from different hosts and often\n # a label can hold this information, this transfers it to the hostname\n config['label_to_hostname'] = instance.get('label_to_hostname', default_instance.get('label_to_hostname', None))\n\n # In combination to label_as_hostname, allows to add a common suffix to the hostnames\n # submitted. 
This can be used for instance to discriminate hosts between clusters.\n config['label_to_hostname_suffix'] = instance.get(\n 'label_to_hostname_suffix', default_instance.get('label_to_hostname_suffix', None)\n )\n\n # Add a 'health' service check for the prometheus endpoint\n config['health_service_check'] = is_affirmative(\n instance.get('health_service_check', default_instance.get('health_service_check', True))\n )\n\n # Can either be only the path to the certificate and thus you should specify the private key\n # or it can be the path to a file containing both the certificate & the private key\n config['ssl_cert'] = instance.get('ssl_cert', default_instance.get('ssl_cert', None))\n\n # Needed if the certificate does not include the private key\n #\n # /!\\ The private key to your local certificate must be unencrypted.\n # Currently, Requests does not support using encrypted keys.\n config['ssl_private_key'] = instance.get('ssl_private_key', default_instance.get('ssl_private_key', None))\n\n # The path to the trusted CA used for generating custom certificates\n config['ssl_ca_cert'] = instance.get('ssl_ca_cert', default_instance.get('ssl_ca_cert', None))\n\n # Whether or not to validate SSL certificates\n config['ssl_verify'] = is_affirmative(instance.get('ssl_verify', default_instance.get('ssl_verify', True)))\n\n # Extra http headers to be sent when polling endpoint\n config['extra_headers'] = default_instance.get('extra_headers', {})\n config['extra_headers'].update(instance.get('extra_headers', {}))\n\n # Timeout used during the network request\n config['prometheus_timeout'] = instance.get(\n 'prometheus_timeout', default_instance.get('prometheus_timeout', 10)\n )\n\n # Authentication used when polling endpoint\n config['username'] = instance.get('username', default_instance.get('username', None))\n config['password'] = instance.get('password', default_instance.get('password', None))\n\n # Custom tags that will be sent with each metric\n config['custom_tags'] 
= instance.get('tags', [])\n\n # Some tags can be ignored to reduce the cardinality.\n # This can be useful for cost optimization in containerized environments\n # when the openmetrics check is configured to collect custom metrics.\n # Even when the Agent's Tagger is configured to add low-cardinality tags only,\n # some tags can still generate unwanted metric contexts (e.g pod annotations as tags).\n ignore_tags = instance.get('ignore_tags', default_instance.get('ignore_tags', []))\n if ignore_tags:\n ignored_tags_re = compile('|'.join(set(ignore_tags)))\n config['custom_tags'] = [tag for tag in config['custom_tags'] if not ignored_tags_re.search(tag)]\n\n # Additional tags to be sent with each metric\n config['_metric_tags'] = []\n\n # List of strings to filter the input text payload on. If any line contains\n # one of these strings, it will be filtered out before being parsed.\n # INTERNAL FEATURE, might be removed in future versions\n config['_text_filter_blacklist'] = []\n\n # Refresh the bearer token every 60 seconds by default.\n # Ref https://github.com/DataDog/datadog-agent/pull/11686\n config['bearer_token_refresh_interval'] = instance.get(\n 'bearer_token_refresh_interval', default_instance.get('bearer_token_refresh_interval', 60)\n )\n\n config['telemetry'] = is_affirmative(instance.get('telemetry', default_instance.get('telemetry', False)))\n\n # The metric name services use to indicate build information\n config['metadata_metric_name'] = instance.get(\n 'metadata_metric_name', default_instance.get('metadata_metric_name')\n )\n\n # Map of metadata key names to label names\n config['metadata_label_map'] = instance.get(\n 'metadata_label_map', default_instance.get('metadata_label_map', {})\n )\n\n config['_default_metric_transformers'] = {}\n if config['metadata_metric_name'] and config['metadata_label_map']:\n config['_default_metric_transformers'][config['metadata_metric_name']] = self.transform_metadata\n\n # Whether or not to enable flushing of the 
first value of monotonic counts\n config['_flush_first_value'] = False\n\n # Whether to use process_start_time_seconds to decide if counter-like values should be flushed\n # on first scrape.\n config['use_process_start_time'] = is_affirmative(_get_setting('use_process_start_time', False))\n\n return config\n\n def get_http_handler(self, scraper_config):\n \"\"\"\n Get http handler for a specific scraper config.\n The http handler is cached using `prometheus_url` as key.\n The http handler doesn't use the cache if a bearer token is used to allow refreshing it.\n \"\"\"\n prometheus_url = scraper_config['prometheus_url']\n bearer_token = scraper_config['_bearer_token']\n if prometheus_url in self._http_handlers and bearer_token is None:\n return self._http_handlers[prometheus_url]\n\n # TODO: Deprecate this behavior in Agent 8\n if scraper_config['ssl_ca_cert'] is False:\n scraper_config['ssl_verify'] = False\n\n # TODO: Deprecate this behavior in Agent 8\n if scraper_config['ssl_verify'] is False:\n scraper_config.setdefault('tls_ignore_warning', True)\n\n http_handler = self._http_handlers[prometheus_url] = RequestsWrapper(\n scraper_config, self.init_config, self.HTTP_CONFIG_REMAPPER, self.log\n )\n\n headers = http_handler.options['headers']\n\n bearer_token = scraper_config['_bearer_token']\n if bearer_token is not None:\n headers['Authorization'] = 'Bearer {}'.format(bearer_token)\n\n # TODO: Determine if we really need this\n headers.setdefault('accept-encoding', 'gzip')\n\n # Explicitly set the content type we accept\n headers.setdefault('accept', 'text/plain')\n\n return http_handler\n\n def reset_http_config(self):\n \"\"\"\n You may need to use this when configuration is determined dynamically during every\n check run, such as when polling an external resource like the Kubelet.\n \"\"\"\n self._http_handlers.clear()\n\n def update_prometheus_url(self, instance, config, endpoint):\n if not endpoint:\n return\n\n config['prometheus_url'] = endpoint\n # 
Whether or not to use the service account bearer token for authentication.\n # Can be explicitly set to true or false to send or not the bearer token.\n # If set to the `tls_only` value, the bearer token will be sent only to https endpoints.\n # If 'bearer_token_path' is not set, we use /var/run/secrets/kubernetes.io/serviceaccount/token\n # as a default path to get the token.\n namespace = instance.get('namespace')\n default_instance = self.default_instances.get(namespace, {})\n bearer_token_auth = instance.get('bearer_token_auth', default_instance.get('bearer_token_auth', False))\n if bearer_token_auth == 'tls_only':\n config['bearer_token_auth'] = config['prometheus_url'].startswith(\"https://\")\n else:\n config['bearer_token_auth'] = is_affirmative(bearer_token_auth)\n\n # Can be used to get a service account bearer token from files\n # other than /var/run/secrets/kubernetes.io/serviceaccount/token\n # 'bearer_token_auth' should be enabled.\n config['bearer_token_path'] = instance.get('bearer_token_path', default_instance.get('bearer_token_path', None))\n\n # The service account bearer token to be used for authentication\n config['_bearer_token'] = self._get_bearer_token(config['bearer_token_auth'], config['bearer_token_path'])\n config['_bearer_token_last_refresh'] = time.time()\n\n def parse_metric_family(self, response, scraper_config):\n \"\"\"\n Parse the MetricFamily from a valid `requests.Response` object to provide a MetricFamily object.\n The text format uses iter_lines() generator.\n \"\"\"\n if response.encoding is None:\n response.encoding = 'utf-8'\n input_gen = response.iter_lines(decode_unicode=True)\n if scraper_config['_text_filter_blacklist']:\n input_gen = self._text_filter_input(input_gen, scraper_config)\n\n for metric in text_fd_to_metric_families(input_gen):\n self._send_telemetry_counter(\n self.TELEMETRY_COUNTER_METRICS_INPUT_COUNT, len(metric.samples), scraper_config\n )\n type_override = 
scraper_config['type_overrides'].get(metric.name)\n if type_override:\n metric.type = type_override\n elif scraper_config['_type_override_patterns']:\n for pattern, new_type in iteritems(scraper_config['_type_override_patterns']):\n if pattern.search(metric.name):\n metric.type = new_type\n break\n if metric.type not in self.METRIC_TYPES:\n continue\n metric.name = self._remove_metric_prefix(metric.name, scraper_config)\n yield metric\n\n def _text_filter_input(self, input_gen, scraper_config):\n \"\"\"\n Filters out the text input line by line to avoid parsing and processing\n metrics we know we don't want to process. This only works on `text/plain`\n payloads, and is an INTERNAL FEATURE implemented for the kubelet check.\n :param input_gen: line generator\n :output: generator of filtered lines\n \"\"\"\n for line in input_gen:\n for item in scraper_config['_text_filter_blacklist']:\n if item in line:\n self._send_telemetry_counter(self.TELEMETRY_COUNTER_METRICS_BLACKLIST_COUNT, 1, scraper_config)\n break\n else:\n # No blacklist matches, passing the line through\n yield line\n\n def _remove_metric_prefix(self, metric, scraper_config):\n prometheus_metrics_prefix = scraper_config['prometheus_metrics_prefix']\n return metric[len(prometheus_metrics_prefix) :] if metric.startswith(prometheus_metrics_prefix) else metric\n\n def scrape_metrics(self, scraper_config):\n \"\"\"\n Poll the data from Prometheus and return the metrics as a generator.\n \"\"\"\n response = self.poll(scraper_config)\n if scraper_config['telemetry']:\n if 'content-length' in response.headers:\n content_len = int(response.headers['content-length'])\n else:\n content_len = len(response.content)\n self._send_telemetry_gauge(self.TELEMETRY_GAUGE_MESSAGE_SIZE, content_len, scraper_config)\n try:\n # no dry run if no label joins\n if not scraper_config['label_joins']:\n scraper_config['_dry_run'] = False\n elif not scraper_config['_watched_labels']:\n watched = scraper_config['_watched_labels']\n 
watched['sets'] = {}\n watched['keys'] = {}\n watched['singles'] = set()\n for key, val in iteritems(scraper_config['label_joins']):\n labels = []\n if 'labels_to_match' in val:\n labels = val['labels_to_match']\n elif 'label_to_match' in val:\n self.log.warning(\"`label_to_match` is being deprecated, please use `labels_to_match`\")\n if isinstance(val['label_to_match'], list):\n labels = val['label_to_match']\n else:\n labels = [val['label_to_match']]\n\n if labels:\n s = frozenset(labels)\n watched['sets'][key] = s\n watched['keys'][key] = ','.join(s)\n if len(labels) == 1:\n watched['singles'].add(labels[0])\n\n for metric in self.parse_metric_family(response, scraper_config):\n yield metric\n\n # Set dry run off\n scraper_config['_dry_run'] = False\n # Garbage collect unused mapping and reset active labels\n for metric, mapping in list(iteritems(scraper_config['_label_mapping'])):\n for key in list(mapping):\n if (\n metric in scraper_config['_active_label_mapping']\n and key not in scraper_config['_active_label_mapping'][metric]\n ):\n del scraper_config['_label_mapping'][metric][key]\n scraper_config['_active_label_mapping'] = {}\n finally:\n response.close()\n\n def process(self, scraper_config, metric_transformers=None):\n \"\"\"\n Polls the data from Prometheus and submits them as Datadog metrics.\n `endpoint` is the metrics endpoint to use to poll metrics from Prometheus\n\n Note that if the instance has a `tags` attribute, it will be pushed\n automatically as additional custom tags and added to the metrics\n \"\"\"\n\n transformers = scraper_config['_default_metric_transformers'].copy()\n if metric_transformers:\n transformers.update(metric_transformers)\n\n counter_buffer = []\n agent_start_time = None\n process_start_time = None\n if not scraper_config['_flush_first_value'] and scraper_config['use_process_start_time']:\n agent_start_time = datadog_agent.get_process_start_time()\n\n if scraper_config['bearer_token_auth']:\n 
self._refresh_bearer_token(scraper_config)\n\n for metric in self.scrape_metrics(scraper_config):\n if agent_start_time is not None:\n if metric.name == 'process_start_time_seconds' and metric.samples:\n min_metric_value = min(s[self.SAMPLE_VALUE] for s in metric.samples)\n if process_start_time is None or min_metric_value < process_start_time:\n process_start_time = min_metric_value\n if metric.type in self.METRICS_WITH_COUNTERS:\n counter_buffer.append(metric)\n continue\n\n self.process_metric(metric, scraper_config, metric_transformers=transformers)\n\n if agent_start_time and process_start_time and agent_start_time < process_start_time:\n # If agent was started before the process, we assume counters were started recently from zero,\n # and thus we can compute the rates.\n scraper_config['_flush_first_value'] = True\n\n for metric in counter_buffer:\n self.process_metric(metric, scraper_config, metric_transformers=transformers)\n\n scraper_config['_flush_first_value'] = True\n\n def transform_metadata(self, metric, scraper_config):\n labels = metric.samples[0][self.SAMPLE_LABELS]\n for metadata_name, label_name in iteritems(scraper_config['metadata_label_map']):\n if label_name in labels:\n self.set_metadata(metadata_name, labels[label_name])\n\n def _metric_name_with_namespace(self, metric_name, scraper_config):\n namespace = scraper_config['namespace']\n if not namespace:\n return metric_name\n return '{}.{}'.format(namespace, metric_name)\n\n def _telemetry_metric_name_with_namespace(self, metric_name, scraper_config):\n namespace = scraper_config['namespace']\n if not namespace:\n return '{}.{}'.format('telemetry', metric_name)\n return '{}.{}.{}'.format(namespace, 'telemetry', metric_name)\n\n def _send_telemetry_gauge(self, metric_name, val, scraper_config):\n if scraper_config['telemetry']:\n metric_name_with_namespace = self._telemetry_metric_name_with_namespace(metric_name, scraper_config)\n # Determine the tags to send\n custom_tags = 
scraper_config['custom_tags']\n tags = list(custom_tags)\n tags.extend(scraper_config['_metric_tags'])\n self.gauge(metric_name_with_namespace, val, tags=tags)\n\n def _send_telemetry_counter(self, metric_name, val, scraper_config, extra_tags=None):\n if scraper_config['telemetry']:\n metric_name_with_namespace = self._telemetry_metric_name_with_namespace(metric_name, scraper_config)\n # Determine the tags to send\n custom_tags = scraper_config['custom_tags']\n tags = list(custom_tags)\n tags.extend(scraper_config['_metric_tags'])\n if extra_tags:\n tags.extend(extra_tags)\n self.count(metric_name_with_namespace, val, tags=tags)\n\n def _store_labels(self, metric, scraper_config):\n # If targeted metric, store labels\n if metric.name not in scraper_config['label_joins']:\n return\n\n watched = scraper_config['_watched_labels']\n matching_labels = watched['sets'][metric.name]\n mapping_key = watched['keys'][metric.name]\n\n labels_to_get = scraper_config['label_joins'][metric.name]['labels_to_get']\n get_all = '*' in labels_to_get\n match_all = mapping_key == '*'\n for sample in metric.samples:\n # metadata-only metrics that are used for label joins are always equal to 1\n # this is required for metrics where all combinations of a state are sent\n # but only the active one is set to 1 (others are set to 0)\n # example: kube_pod_status_phase in kube-state-metrics\n if sample[self.SAMPLE_VALUE] != 1:\n continue\n\n sample_labels = sample[self.SAMPLE_LABELS]\n sample_labels_keys = sample_labels.keys()\n\n if match_all or matching_labels.issubset(sample_labels_keys):\n label_dict = {}\n\n if get_all:\n for label_name, label_value in iteritems(sample_labels):\n if label_name in matching_labels:\n continue\n label_dict[label_name] = label_value\n else:\n for label_name in labels_to_get:\n if label_name in sample_labels:\n label_dict[label_name] = sample_labels[label_name]\n\n if match_all:\n mapping_value = '*'\n else:\n mapping_value = ','.join([sample_labels[l] for l in 
matching_labels])\n\n scraper_config['_label_mapping'].setdefault(mapping_key, {}).setdefault(mapping_value, {}).update(\n label_dict\n )\n\n def _join_labels(self, metric, scraper_config):\n # Filter metric to see if we can enrich with joined labels\n if not scraper_config['label_joins']:\n return\n\n label_mapping = scraper_config['_label_mapping']\n active_label_mapping = scraper_config['_active_label_mapping']\n\n watched = scraper_config['_watched_labels']\n sets = watched['sets']\n keys = watched['keys']\n singles = watched['singles']\n\n for sample in metric.samples:\n sample_labels = sample[self.SAMPLE_LABELS]\n sample_labels_keys = sample_labels.keys()\n\n # Match with wildcard label\n # Label names are [a-zA-Z0-9_]*, so no risk of collision\n if '*' in singles:\n active_label_mapping.setdefault('*', {})['*'] = True\n\n if '*' in label_mapping and '*' in label_mapping['*']:\n sample_labels.update(label_mapping['*']['*'])\n\n # Match with single labels\n matching_single_labels = singles.intersection(sample_labels_keys)\n for label in matching_single_labels:\n mapping_key = label\n mapping_value = sample_labels[label]\n\n active_label_mapping.setdefault(mapping_key, {})[mapping_value] = True\n\n if mapping_key in label_mapping and mapping_value in label_mapping[mapping_key]:\n sample_labels.update(label_mapping[mapping_key][mapping_value])\n\n # Match with tuples of labels\n for key, mapping_key in iteritems(keys):\n if mapping_key in matching_single_labels:\n continue\n\n matching_labels = sets[key]\n\n if matching_labels.issubset(sample_labels_keys):\n matching_values = [sample_labels[l] for l in matching_labels]\n mapping_value = ','.join(matching_values)\n\n active_label_mapping.setdefault(mapping_key, {})[mapping_value] = True\n\n if mapping_key in label_mapping and mapping_value in label_mapping[mapping_key]:\n sample_labels.update(label_mapping[mapping_key][mapping_value])\n\n def _ignore_metrics_by_label(self, scraper_config, metric_name, sample):\n 
ignore_metrics_by_label = scraper_config['ignore_metrics_by_labels']\n sample_labels = sample[self.SAMPLE_LABELS]\n for label_key, label_values in ignore_metrics_by_label.items():\n if not label_values:\n self.log.debug(\n \"Skipping filter label `%s` with an empty values list, did you mean to use '*' wildcard?\", label_key\n )\n elif '*' in label_values:\n # Wildcard '*' means all metrics with label_key will be ignored\n self.log.debug(\"Detected wildcard for label `%s`\", label_key)\n if label_key in sample_labels.keys():\n self.log.debug(\"Skipping metric `%s` due to label key matching: %s\", metric_name, label_key)\n return True\n else:\n for val in label_values:\n if label_key in sample_labels and sample_labels[label_key] == val:\n self.log.debug(\n \"Skipping metric `%s` due to label `%s` value matching: %s\", metric_name, label_key, val\n )\n return True\n return False\n\n def process_metric(self, metric, scraper_config, metric_transformers=None):\n \"\"\"\n Handle a Prometheus metric according to the following flow:\n - search `scraper_config['metrics_mapper']` for a prometheus.metric to datadog.metric mapping\n - call check method with the same name as the metric\n - log info if none of the above worked\n\n `metric_transformers` is a dict of `<metric name>:<function to run when the metric name is encountered>`\n \"\"\"\n # If targeted metric, store labels\n self._store_labels(metric, scraper_config)\n\n if scraper_config['ignore_metrics']:\n if metric.name in scraper_config['_ignored_metrics']:\n self._send_telemetry_counter(\n self.TELEMETRY_COUNTER_METRICS_IGNORE_COUNT, len(metric.samples), scraper_config\n )\n return # Ignore the metric\n\n if scraper_config['_ignored_re'] and scraper_config['_ignored_re'].search(metric.name):\n # Metric must be ignored\n scraper_config['_ignored_metrics'].add(metric.name)\n self._send_telemetry_counter(\n self.TELEMETRY_COUNTER_METRICS_IGNORE_COUNT, len(metric.samples), scraper_config\n )\n return # Ignore the 
metric\n\n self._send_telemetry_counter(self.TELEMETRY_COUNTER_METRICS_PROCESS_COUNT, len(metric.samples), scraper_config)\n\n if self._filter_metric(metric, scraper_config):\n return # Ignore the metric\n\n # Filter metric to see if we can enrich with joined labels\n self._join_labels(metric, scraper_config)\n\n if scraper_config['_dry_run']:\n return\n\n try:\n self.submit_openmetric(scraper_config['metrics_mapper'][metric.name], metric, scraper_config)\n except KeyError:\n if metric_transformers is not None and metric.name in metric_transformers:\n try:\n # Get the transformer function for this specific metric\n transformer = metric_transformers[metric.name]\n transformer(metric, scraper_config)\n except Exception as err:\n self.log.warning('Error handling metric: %s - error: %s', metric.name, err)\n\n return\n # check for wildcards in transformers\n for transformer_name, transformer in iteritems(metric_transformers):\n if transformer_name.endswith('*') and metric.name.startswith(transformer_name[:-1]):\n transformer(metric, scraper_config, transformer_name)\n\n # try matching wildcards\n if scraper_config['_wildcards_re'] and scraper_config['_wildcards_re'].search(metric.name):\n self.submit_openmetric(metric.name, metric, scraper_config)\n return\n\n self.log.debug(\n 'Skipping metric `%s` as it is not defined in the metrics mapper, '\n 'has no transformer function, nor does it match any wildcards.',\n metric.name,\n )\n\n def poll(self, scraper_config, headers=None):\n \"\"\"\n Returns a valid `requests.Response`, otherwise raise requests.HTTPError if the status code of the\n response isn't valid - see `response.raise_for_status()`\n\n The caller needs to close the requests.Response.\n\n Custom headers can be added to the default headers.\n \"\"\"\n endpoint = scraper_config.get('prometheus_url')\n\n # Should we send a service check for when we make a request\n health_service_check = scraper_config['health_service_check']\n service_check_name = 
self._metric_name_with_namespace('prometheus.health', scraper_config)\n service_check_tags = ['endpoint:{}'.format(endpoint)]\n service_check_tags.extend(scraper_config['custom_tags'])\n\n try:\n response = self.send_request(endpoint, scraper_config, headers)\n except requests.exceptions.SSLError:\n self.log.error(\"Invalid SSL settings for requesting %s endpoint\", endpoint)\n raise\n except IOError:\n if health_service_check:\n self.service_check(service_check_name, AgentCheck.CRITICAL, tags=service_check_tags)\n raise\n try:\n response.raise_for_status()\n if health_service_check:\n self.service_check(service_check_name, AgentCheck.OK, tags=service_check_tags)\n return response\n except requests.HTTPError:\n response.close()\n if health_service_check:\n self.service_check(service_check_name, AgentCheck.CRITICAL, tags=service_check_tags)\n raise\n\n def send_request(self, endpoint, scraper_config, headers=None):\n kwargs = {}\n if headers:\n kwargs['headers'] = headers\n\n http_handler = self.get_http_handler(scraper_config)\n\n return http_handler.get(endpoint, stream=True, **kwargs)\n\n def get_hostname_for_sample(self, sample, scraper_config):\n \"\"\"\n Expose the label_to_hostname mapping logic to custom handler methods\n \"\"\"\n return self._get_hostname(None, sample, scraper_config)\n\n def submit_openmetric(self, metric_name, metric, scraper_config, hostname=None):\n \"\"\"\n For each sample in the metric, report it as a gauge with all labels as tags\n except if a labels `dict` is passed, in which case keys are label names we'll extract\n and corresponding values are tag names we'll use (eg: {'node': 'node'}).\n\n Histograms generate a set of values instead of a unique metric.\n `send_histograms_buckets` is used to specify if you want to\n send the buckets as tagged values when dealing with histograms.\n\n `custom_tags` is an array of `tag:value` that will be added to the\n metric when sending the gauge to Datadog.\n \"\"\"\n if metric.type in 
[\"gauge\", \"counter\", \"rate\"]:\n metric_name_with_namespace = self._metric_name_with_namespace(metric_name, scraper_config)\n for sample in metric.samples:\n if self._ignore_metrics_by_label(scraper_config, metric_name, sample):\n continue\n\n val = sample[self.SAMPLE_VALUE]\n if not self._is_value_valid(val):\n self.log.debug(\"Metric value is not supported for metric %s\", sample[self.SAMPLE_NAME])\n continue\n custom_hostname = self._get_hostname(hostname, sample, scraper_config)\n # Determine the tags to send\n tags = self._metric_tags(metric_name, val, sample, scraper_config, hostname=custom_hostname)\n if metric.type == \"counter\" and scraper_config['send_monotonic_counter']:\n self.monotonic_count(\n metric_name_with_namespace,\n val,\n tags=tags,\n hostname=custom_hostname,\n flush_first_value=scraper_config['_flush_first_value'],\n )\n elif metric.type == \"rate\":\n self.rate(metric_name_with_namespace, val, tags=tags, hostname=custom_hostname)\n else:\n self.gauge(metric_name_with_namespace, val, tags=tags, hostname=custom_hostname)\n\n # Metric is a \"counter\" but legacy behavior has \"send_as_monotonic\" defaulted to False\n # Submit metric as monotonic_count with appended name\n if metric.type == \"counter\" and scraper_config['send_monotonic_with_gauge']:\n self.monotonic_count(\n metric_name_with_namespace + '.total',\n val,\n tags=tags,\n hostname=custom_hostname,\n flush_first_value=scraper_config['_flush_first_value'],\n )\n elif metric.type == \"histogram\":\n self._submit_gauges_from_histogram(metric_name, metric, scraper_config)\n elif metric.type == \"summary\":\n self._submit_gauges_from_summary(metric_name, metric, scraper_config)\n else:\n self.log.error(\"Metric type %s unsupported for metric %s.\", metric.type, metric_name)\n\n def _get_hostname(self, hostname, sample, scraper_config):\n \"\"\"\n If hostname is None, look at label_to_hostname setting\n \"\"\"\n if (\n hostname is None\n and scraper_config['label_to_hostname'] is 
not None\n and sample[self.SAMPLE_LABELS].get(scraper_config['label_to_hostname'])\n ):\n hostname = sample[self.SAMPLE_LABELS][scraper_config['label_to_hostname']]\n suffix = scraper_config['label_to_hostname_suffix']\n if suffix is not None:\n hostname += suffix\n\n return hostname\n\n def _submit_gauges_from_summary(self, metric_name, metric, scraper_config, hostname=None):\n \"\"\"\n Extracts metrics from a prometheus summary metric and sends them as gauges\n \"\"\"\n for sample in metric.samples:\n val = sample[self.SAMPLE_VALUE]\n if not self._is_value_valid(val):\n self.log.debug(\"Metric value is not supported for metric %s\", sample[self.SAMPLE_NAME])\n continue\n if self._ignore_metrics_by_label(scraper_config, metric_name, sample):\n continue\n custom_hostname = self._get_hostname(hostname, sample, scraper_config)\n if sample[self.SAMPLE_NAME].endswith(\"_sum\"):\n tags = self._metric_tags(metric_name, val, sample, scraper_config, hostname=custom_hostname)\n self._submit_distribution_count(\n scraper_config['send_distribution_sums_as_monotonic'],\n scraper_config['send_monotonic_with_gauge'],\n \"{}.sum\".format(self._metric_name_with_namespace(metric_name, scraper_config)),\n val,\n tags=tags,\n hostname=custom_hostname,\n flush_first_value=scraper_config['_flush_first_value'],\n )\n elif sample[self.SAMPLE_NAME].endswith(\"_count\"):\n tags = self._metric_tags(metric_name, val, sample, scraper_config, hostname=custom_hostname)\n self._submit_distribution_count(\n scraper_config['send_distribution_counts_as_monotonic'],\n scraper_config['send_monotonic_with_gauge'],\n \"{}.count\".format(self._metric_name_with_namespace(metric_name, scraper_config)),\n val,\n tags=tags,\n hostname=custom_hostname,\n flush_first_value=scraper_config['_flush_first_value'],\n )\n else:\n try:\n quantile = sample[self.SAMPLE_LABELS][\"quantile\"]\n except KeyError:\n # TODO: In the Prometheus spec the 'quantile' label is optional, but it's not clear yet\n # what we should 
do in this case. Let's skip for now and submit the rest of metrics.\n message = (\n '\"quantile\" label not present in metric %r. '\n 'Quantile-less summary metrics are not currently supported. Skipping...'\n )\n self.log.debug(message, metric_name)\n continue\n\n sample[self.SAMPLE_LABELS][\"quantile\"] = str(float(quantile))\n tags = self._metric_tags(metric_name, val, sample, scraper_config, hostname=custom_hostname)\n self.gauge(\n \"{}.quantile\".format(self._metric_name_with_namespace(metric_name, scraper_config)),\n val,\n tags=tags,\n hostname=custom_hostname,\n )\n\n def _submit_gauges_from_histogram(self, metric_name, metric, scraper_config, hostname=None):\n \"\"\"\n Extracts metrics from a prometheus histogram and sends them as gauges\n \"\"\"\n if scraper_config['non_cumulative_buckets']:\n self._decumulate_histogram_buckets(metric)\n for sample in metric.samples:\n val = sample[self.SAMPLE_VALUE]\n if not self._is_value_valid(val):\n self.log.debug(\"Metric value is not supported for metric %s\", sample[self.SAMPLE_NAME])\n continue\n if self._ignore_metrics_by_label(scraper_config, metric_name, sample):\n continue\n custom_hostname = self._get_hostname(hostname, sample, scraper_config)\n if sample[self.SAMPLE_NAME].endswith(\"_sum\") and not scraper_config['send_distribution_buckets']:\n tags = self._metric_tags(metric_name, val, sample, scraper_config, hostname)\n self._submit_distribution_count(\n scraper_config['send_distribution_sums_as_monotonic'],\n scraper_config['send_monotonic_with_gauge'],\n \"{}.sum\".format(self._metric_name_with_namespace(metric_name, scraper_config)),\n val,\n tags=tags,\n hostname=custom_hostname,\n flush_first_value=scraper_config['_flush_first_value'],\n )\n elif sample[self.SAMPLE_NAME].endswith(\"_count\") and not scraper_config['send_distribution_buckets']:\n tags = self._metric_tags(metric_name, val, sample, scraper_config, hostname)\n if scraper_config['send_histograms_buckets']:\n 
tags.append(\"upper_bound:none\")\n self._submit_distribution_count(\n scraper_config['send_distribution_counts_as_monotonic'],\n scraper_config['send_monotonic_with_gauge'],\n \"{}.count\".format(self._metric_name_with_namespace(metric_name, scraper_config)),\n val,\n tags=tags,\n hostname=custom_hostname,\n flush_first_value=scraper_config['_flush_first_value'],\n )\n elif scraper_config['send_histograms_buckets'] and sample[self.SAMPLE_NAME].endswith(\"_bucket\"):\n if scraper_config['send_distribution_buckets']:\n self._submit_sample_histogram_buckets(metric_name, sample, scraper_config, hostname)\n elif \"Inf\" not in sample[self.SAMPLE_LABELS][\"le\"] or scraper_config['non_cumulative_buckets']:\n sample[self.SAMPLE_LABELS][\"le\"] = str(float(sample[self.SAMPLE_LABELS][\"le\"]))\n tags = self._metric_tags(metric_name, val, sample, scraper_config, hostname)\n self._submit_distribution_count(\n scraper_config['send_distribution_counts_as_monotonic'],\n scraper_config['send_monotonic_with_gauge'],\n \"{}.count\".format(self._metric_name_with_namespace(metric_name, scraper_config)),\n val,\n tags=tags,\n hostname=custom_hostname,\n flush_first_value=scraper_config['_flush_first_value'],\n )\n\n def _compute_bucket_hash(self, tags):\n # we need the unique context for all the buckets\n # hence we remove the \"le\" tag\n return hash(frozenset(sorted((k, v) for k, v in iteritems(tags) if k != 'le')))\n\n def _decumulate_histogram_buckets(self, metric):\n \"\"\"\n Decumulate buckets in a given histogram metric and adds the lower_bound label (le being upper_bound)\n \"\"\"\n bucket_values_by_context_upper_bound = {}\n for sample in metric.samples:\n if sample[self.SAMPLE_NAME].endswith(\"_bucket\"):\n context_key = self._compute_bucket_hash(sample[self.SAMPLE_LABELS])\n if context_key not in bucket_values_by_context_upper_bound:\n bucket_values_by_context_upper_bound[context_key] = {}\n 
bucket_values_by_context_upper_bound[context_key][float(sample[self.SAMPLE_LABELS][\"le\"])] = sample[\n self.SAMPLE_VALUE\n ]\n\n sorted_buckets_by_context = {}\n for context in bucket_values_by_context_upper_bound:\n sorted_buckets_by_context[context] = sorted(bucket_values_by_context_upper_bound[context])\n\n # Tuples (lower_bound, upper_bound, value)\n bucket_tuples_by_context_upper_bound = {}\n for context in sorted_buckets_by_context:\n for i, upper_b in enumerate(sorted_buckets_by_context[context]):\n if i == 0:\n if context not in bucket_tuples_by_context_upper_bound:\n bucket_tuples_by_context_upper_bound[context] = {}\n if upper_b > 0:\n # positive buckets start at zero\n bucket_tuples_by_context_upper_bound[context][upper_b] = (\n 0,\n upper_b,\n bucket_values_by_context_upper_bound[context][upper_b],\n )\n else:\n # negative buckets start at -inf\n bucket_tuples_by_context_upper_bound[context][upper_b] = (\n self.MINUS_INF,\n upper_b,\n bucket_values_by_context_upper_bound[context][upper_b],\n )\n continue\n tmp = (\n bucket_values_by_context_upper_bound[context][upper_b]\n - bucket_values_by_context_upper_bound[context][sorted_buckets_by_context[context][i - 1]]\n )\n bucket_tuples_by_context_upper_bound[context][upper_b] = (\n sorted_buckets_by_context[context][i - 1],\n upper_b,\n tmp,\n )\n\n # modify original metric to inject lower_bound & modified value\n for i, sample in enumerate(metric.samples):\n if not sample[self.SAMPLE_NAME].endswith(\"_bucket\"):\n continue\n\n context_key = self._compute_bucket_hash(sample[self.SAMPLE_LABELS])\n matching_bucket_tuple = bucket_tuples_by_context_upper_bound[context_key][\n float(sample[self.SAMPLE_LABELS][\"le\"])\n ]\n # Replacing the sample tuple\n sample[self.SAMPLE_LABELS][\"lower_bound\"] = str(matching_bucket_tuple[0])\n metric.samples[i] = Sample(sample[self.SAMPLE_NAME], sample[self.SAMPLE_LABELS], matching_bucket_tuple[2])\n\n def _submit_sample_histogram_buckets(self, metric_name, sample, 
scraper_config, hostname=None):\n if \"lower_bound\" not in sample[self.SAMPLE_LABELS] or \"le\" not in sample[self.SAMPLE_LABELS]:\n self.log.warning(\n \"Metric: %s was not containing required bucket boundaries labels: %s\",\n metric_name,\n sample[self.SAMPLE_LABELS],\n )\n return\n sample[self.SAMPLE_LABELS][\"le\"] = str(float(sample[self.SAMPLE_LABELS][\"le\"]))\n sample[self.SAMPLE_LABELS][\"lower_bound\"] = str(float(sample[self.SAMPLE_LABELS][\"lower_bound\"]))\n if sample[self.SAMPLE_LABELS][\"le\"] == sample[self.SAMPLE_LABELS][\"lower_bound\"]:\n # this can happen for -inf/-inf bucket that we don't want to send (always 0)\n self.log.warning(\n \"Metric: %s has bucket boundaries equal, skipping: %s\", metric_name, sample[self.SAMPLE_LABELS]\n )\n return\n tags = self._metric_tags(metric_name, sample[self.SAMPLE_VALUE], sample, scraper_config, hostname)\n self.submit_histogram_bucket(\n self._metric_name_with_namespace(metric_name, scraper_config),\n sample[self.SAMPLE_VALUE],\n float(sample[self.SAMPLE_LABELS][\"lower_bound\"]),\n float(sample[self.SAMPLE_LABELS][\"le\"]),\n True,\n hostname,\n tags,\n flush_first_value=scraper_config['_flush_first_value'],\n )\n\n def _submit_distribution_count(\n self,\n monotonic,\n send_monotonic_with_gauge,\n metric_name,\n value,\n tags=None,\n hostname=None,\n flush_first_value=False,\n ):\n if monotonic:\n self.monotonic_count(metric_name, value, tags=tags, hostname=hostname, flush_first_value=flush_first_value)\n else:\n self.gauge(metric_name, value, tags=tags, hostname=hostname)\n if send_monotonic_with_gauge:\n self.monotonic_count(\n metric_name + \".total\", value, tags=tags, hostname=hostname, flush_first_value=flush_first_value\n )\n\n def _metric_tags(self, metric_name, val, sample, scraper_config, hostname=None):\n custom_tags = scraper_config['custom_tags']\n _tags = list(custom_tags)\n _tags.extend(scraper_config['_metric_tags'])\n for label_name, label_value in 
iteritems(sample[self.SAMPLE_LABELS]):\n if label_name not in scraper_config['exclude_labels']:\n if label_name in scraper_config['include_labels'] or len(scraper_config['include_labels']) == 0:\n tag_name = scraper_config['labels_mapper'].get(label_name, label_name)\n _tags.append('{}:{}'.format(to_native_string(tag_name), to_native_string(label_value)))\n return self._finalize_tags_to_submit(\n _tags, metric_name, val, sample, custom_tags=custom_tags, hostname=hostname\n )\n\n def _is_value_valid(self, val):\n return not (isnan(val) or isinf(val))\n\n def _get_bearer_token(self, bearer_token_auth, bearer_token_path):\n if bearer_token_auth is False:\n return None\n\n path = None\n if bearer_token_path is not None:\n if isfile(bearer_token_path):\n path = bearer_token_path\n else:\n self.log.error(\"File not found: %s\", bearer_token_path)\n elif isfile(self.KUBERNETES_TOKEN_PATH):\n path = self.KUBERNETES_TOKEN_PATH\n\n if path is None:\n self.log.error(\"Cannot get bearer token from bearer_token_path or auto discovery\")\n raise IOError(\"Cannot get bearer token from bearer_token_path or auto discovery\")\n\n try:\n with open(path, 'r') as f:\n return f.read().rstrip()\n except Exception as err:\n self.log.error(\"Cannot get bearer token from path: %s - error: %s\", path, err)\n raise\n\n def _refresh_bearer_token(self, scraper_config):\n \"\"\"\n Refreshes the bearer token if the refresh interval is elapsed.\n \"\"\"\n now = time.time()\n if now - scraper_config['_bearer_token_last_refresh'] > scraper_config['bearer_token_refresh_interval']:\n scraper_config['_bearer_token'] = self._get_bearer_token(\n scraper_config['bearer_token_auth'], scraper_config['bearer_token_path']\n )\n scraper_config['_bearer_token_last_refresh'] = now\n\n def _histogram_convert_values(self, metric_name, converter):\n def _convert(metric, scraper_config=None):\n for index, sample in enumerate(metric.samples):\n val = sample[self.SAMPLE_VALUE]\n if not self._is_value_valid(val):\n 
self.log.debug(\"Metric value is not supported for metric %s\", sample[self.SAMPLE_NAME])\n continue\n if sample[self.SAMPLE_NAME].endswith(\"_sum\"):\n lst = list(sample)\n lst[self.SAMPLE_VALUE] = converter(val)\n metric.samples[index] = tuple(lst)\n elif sample[self.SAMPLE_NAME].endswith(\"_bucket\") and \"Inf\" not in sample[self.SAMPLE_LABELS][\"le\"]:\n sample[self.SAMPLE_LABELS][\"le\"] = str(converter(float(sample[self.SAMPLE_LABELS][\"le\"])))\n self.submit_openmetric(metric_name, metric, scraper_config)\n\n return _convert\n\n def _histogram_from_microseconds_to_seconds(self, metric_name):\n return self._histogram_convert_values(metric_name, lambda v: v / self.MICROS_IN_S)\n\n def _histogram_from_seconds_to_microseconds(self, metric_name):\n return self._histogram_convert_values(metric_name, lambda v: v * self.MICROS_IN_S)\n\n def _summary_convert_values(self, metric_name, converter):\n def _convert(metric, scraper_config=None):\n for index, sample in enumerate(metric.samples):\n val = sample[self.SAMPLE_VALUE]\n if not self._is_value_valid(val):\n self.log.debug(\"Metric value is not supported for metric %s\", sample[self.SAMPLE_NAME])\n continue\n if sample[self.SAMPLE_NAME].endswith(\"_count\"):\n continue\n else:\n lst = list(sample)\n lst[self.SAMPLE_VALUE] = converter(val)\n metric.samples[index] = tuple(lst)\n self.submit_openmetric(metric_name, metric, scraper_config)\n\n return _convert\n\n def _summary_from_microseconds_to_seconds(self, metric_name):\n return self._summary_convert_values(metric_name, lambda v: v / self.MICROS_IN_S)\n\n def _summary_from_seconds_to_microseconds(self, metric_name):\n return self._summary_convert_values(metric_name, lambda v: v * self.MICROS_IN_S)\n
Parse the MetricFamily from a valid requests.Response object to provide a MetricFamily object. The text format uses iter_lines() generator.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/mixins.py
def parse_metric_family(self, response, scraper_config):\n \"\"\"\n Parse the MetricFamily from a valid `requests.Response` object to provide a MetricFamily object.\n The text format uses iter_lines() generator.\n \"\"\"\n if response.encoding is None:\n response.encoding = 'utf-8'\n input_gen = response.iter_lines(decode_unicode=True)\n if scraper_config['_text_filter_blacklist']:\n input_gen = self._text_filter_input(input_gen, scraper_config)\n\n for metric in text_fd_to_metric_families(input_gen):\n self._send_telemetry_counter(\n self.TELEMETRY_COUNTER_METRICS_INPUT_COUNT, len(metric.samples), scraper_config\n )\n type_override = scraper_config['type_overrides'].get(metric.name)\n if type_override:\n metric.type = type_override\n elif scraper_config['_type_override_patterns']:\n for pattern, new_type in iteritems(scraper_config['_type_override_patterns']):\n if pattern.search(metric.name):\n metric.type = new_type\n break\n if metric.type not in self.METRIC_TYPES:\n continue\n metric.name = self._remove_metric_prefix(metric.name, scraper_config)\n yield metric\n
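The type override step in the listing above can be sketched in isolation. This is a minimal illustration, not the library's implementation: it assumes that `translate` and `compile` in the source refer to `fnmatch.translate` and `re.compile`, and the helper names `build_type_overrides` and `resolve_type` are hypothetical.

```python
import re
from fnmatch import translate


def build_type_overrides(type_overrides):
    """Split overrides into exact names and compiled wildcard patterns."""
    exact = {}
    patterns = {}
    for name, metric_type in type_overrides.items():
        if '*' in name:
            # Assumption: wildcard names are glob patterns turned into regexes.
            patterns[re.compile(translate(name))] = metric_type
        else:
            exact[name] = metric_type
    return exact, patterns


def resolve_type(metric_name, declared_type, exact, patterns):
    """Exact overrides win, then the first matching pattern, else the declared type."""
    if metric_name in exact:
        return exact[metric_name]
    for pattern, metric_type in patterns.items():
        if pattern.search(metric_name):
            return metric_type
    return declared_type


# Hypothetical override configuration, for illustration only.
exact, patterns = build_type_overrides({'build_info': 'gauge', 'go_*': 'gauge'})
go_type = resolve_type('go_goroutines', 'untyped', exact, patterns)
http_type = resolve_type('http_requests', 'counter', exact, patterns)
```

Keeping exact names in a dict and wildcards in a separate pattern map mirrors the split that the scraper configuration makes between `type_overrides` and `_type_override_patterns`.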
Poll the data from Prometheus and return the metrics as a generator.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/mixins.py
def scrape_metrics(self, scraper_config):\n \"\"\"\n Poll the data from Prometheus and return the metrics as a generator.\n \"\"\"\n response = self.poll(scraper_config)\n if scraper_config['telemetry']:\n if 'content-length' in response.headers:\n content_len = int(response.headers['content-length'])\n else:\n content_len = len(response.content)\n self._send_telemetry_gauge(self.TELEMETRY_GAUGE_MESSAGE_SIZE, content_len, scraper_config)\n try:\n # no dry run if no label joins\n if not scraper_config['label_joins']:\n scraper_config['_dry_run'] = False\n elif not scraper_config['_watched_labels']:\n watched = scraper_config['_watched_labels']\n watched['sets'] = {}\n watched['keys'] = {}\n watched['singles'] = set()\n for key, val in iteritems(scraper_config['label_joins']):\n labels = []\n if 'labels_to_match' in val:\n labels = val['labels_to_match']\n elif 'label_to_match' in val:\n self.log.warning(\"`label_to_match` is being deprecated, please use `labels_to_match`\")\n if isinstance(val['label_to_match'], list):\n labels = val['label_to_match']\n else:\n labels = [val['label_to_match']]\n\n if labels:\n s = frozenset(labels)\n watched['sets'][key] = s\n watched['keys'][key] = ','.join(s)\n if len(labels) == 1:\n watched['singles'].add(labels[0])\n\n for metric in self.parse_metric_family(response, scraper_config):\n yield metric\n\n # Set dry run off\n scraper_config['_dry_run'] = False\n # Garbage collect unused mapping and reset active labels\n for metric, mapping in list(iteritems(scraper_config['_label_mapping'])):\n for key in list(mapping):\n if (\n metric in scraper_config['_active_label_mapping']\n and key not in scraper_config['_active_label_mapping'][metric]\n ):\n del scraper_config['_label_mapping'][metric][key]\n scraper_config['_active_label_mapping'] = {}\n finally:\n response.close()\n
Polls the data from Prometheus and submits it as Datadog metrics. endpoint is the metrics endpoint to use to poll metrics from Prometheus.
Note that if the instance has a tags attribute, it will be pushed automatically as additional custom tags and added to the metrics.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/mixins.py
def process(self, scraper_config, metric_transformers=None):\n \"\"\"\n Polls the data from Prometheus and submits them as Datadog metrics.\n `endpoint` is the metrics endpoint to use to poll metrics from Prometheus\n\n Note that if the instance has a `tags` attribute, it will be pushed\n automatically as additional custom tags and added to the metrics\n \"\"\"\n\n transformers = scraper_config['_default_metric_transformers'].copy()\n if metric_transformers:\n transformers.update(metric_transformers)\n\n counter_buffer = []\n agent_start_time = None\n process_start_time = None\n if not scraper_config['_flush_first_value'] and scraper_config['use_process_start_time']:\n agent_start_time = datadog_agent.get_process_start_time()\n\n if scraper_config['bearer_token_auth']:\n self._refresh_bearer_token(scraper_config)\n\n for metric in self.scrape_metrics(scraper_config):\n if agent_start_time is not None:\n if metric.name == 'process_start_time_seconds' and metric.samples:\n min_metric_value = min(s[self.SAMPLE_VALUE] for s in metric.samples)\n if process_start_time is None or min_metric_value < process_start_time:\n process_start_time = min_metric_value\n if metric.type in self.METRICS_WITH_COUNTERS:\n counter_buffer.append(metric)\n continue\n\n self.process_metric(metric, scraper_config, metric_transformers=transformers)\n\n if agent_start_time and process_start_time and agent_start_time < process_start_time:\n # If agent was started before the process, we assume counters were started recently from zero,\n # and thus we can compute the rates.\n scraper_config['_flush_first_value'] = True\n\n for metric in counter_buffer:\n self.process_metric(metric, scraper_config, metric_transformers=transformers)\n\n scraper_config['_flush_first_value'] = True\n
Returns a valid requests.Response; otherwise raises requests.HTTPError if the status code of the response isn't valid (see response.raise_for_status()).
The caller needs to close the requests.Response.
Custom headers can be added to the default headers.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/mixins.py
def poll(self, scraper_config, headers=None):\n \"\"\"\n Returns a valid `requests.Response`, otherwise raise requests.HTTPError if the status code of the\n response isn't valid - see `response.raise_for_status()`\n\n The caller needs to close the requests.Response.\n\n Custom headers can be added to the default headers.\n \"\"\"\n endpoint = scraper_config.get('prometheus_url')\n\n # Should we send a service check for when we make a request\n health_service_check = scraper_config['health_service_check']\n service_check_name = self._metric_name_with_namespace('prometheus.health', scraper_config)\n service_check_tags = ['endpoint:{}'.format(endpoint)]\n service_check_tags.extend(scraper_config['custom_tags'])\n\n try:\n response = self.send_request(endpoint, scraper_config, headers)\n except requests.exceptions.SSLError:\n self.log.error(\"Invalid SSL settings for requesting %s endpoint\", endpoint)\n raise\n except IOError:\n if health_service_check:\n self.service_check(service_check_name, AgentCheck.CRITICAL, tags=service_check_tags)\n raise\n try:\n response.raise_for_status()\n if health_service_check:\n self.service_check(service_check_name, AgentCheck.OK, tags=service_check_tags)\n return response\n except requests.HTTPError:\n response.close()\n if health_service_check:\n self.service_check(service_check_name, AgentCheck.CRITICAL, tags=service_check_tags)\n raise\n
For each sample in the metric, report it as a gauge with all labels as tags except if a labels dict is passed, in which case keys are label names we'll extract and corresponding values are tag names we'll use (eg: {'node': 'node'}).
Histograms generate a set of values instead of a unique metric. send_histograms_buckets is used to specify if you want to send the buckets as tagged values when dealing with histograms.
custom_tags is an array of tag:value that will be added to the metric when sending the gauge to Datadog.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/mixins.py
def submit_openmetric(self, metric_name, metric, scraper_config, hostname=None):\n \"\"\"\n For each sample in the metric, report it as a gauge with all labels as tags\n except if a labels `dict` is passed, in which case keys are label names we'll extract\n and corresponding values are tag names we'll use (eg: {'node': 'node'}).\n\n Histograms generate a set of values instead of a unique metric.\n `send_histograms_buckets` is used to specify if you want to\n send the buckets as tagged values when dealing with histograms.\n\n `custom_tags` is an array of `tag:value` that will be added to the\n metric when sending the gauge to Datadog.\n \"\"\"\n if metric.type in [\"gauge\", \"counter\", \"rate\"]:\n metric_name_with_namespace = self._metric_name_with_namespace(metric_name, scraper_config)\n for sample in metric.samples:\n if self._ignore_metrics_by_label(scraper_config, metric_name, sample):\n continue\n\n val = sample[self.SAMPLE_VALUE]\n if not self._is_value_valid(val):\n self.log.debug(\"Metric value is not supported for metric %s\", sample[self.SAMPLE_NAME])\n continue\n custom_hostname = self._get_hostname(hostname, sample, scraper_config)\n # Determine the tags to send\n tags = self._metric_tags(metric_name, val, sample, scraper_config, hostname=custom_hostname)\n if metric.type == \"counter\" and scraper_config['send_monotonic_counter']:\n self.monotonic_count(\n metric_name_with_namespace,\n val,\n tags=tags,\n hostname=custom_hostname,\n flush_first_value=scraper_config['_flush_first_value'],\n )\n elif metric.type == \"rate\":\n self.rate(metric_name_with_namespace, val, tags=tags, hostname=custom_hostname)\n else:\n self.gauge(metric_name_with_namespace, val, tags=tags, hostname=custom_hostname)\n\n # Metric is a \"counter\" but legacy behavior has \"send_as_monotonic\" defaulted to False\n # Submit metric as monotonic_count with appended name\n if metric.type == \"counter\" and scraper_config['send_monotonic_with_gauge']:\n self.monotonic_count(\n 
metric_name_with_namespace + '.total',\n val,\n tags=tags,\n hostname=custom_hostname,\n flush_first_value=scraper_config['_flush_first_value'],\n )\n elif metric.type == \"histogram\":\n self._submit_gauges_from_histogram(metric_name, metric, scraper_config)\n elif metric.type == \"summary\":\n self._submit_gauges_from_summary(metric_name, metric, scraper_config)\n else:\n self.log.error(\"Metric type %s unsupported for metric %s.\", metric.type, metric_name)\n
Handle a Prometheus metric according to the following flow:
- search scraper_config['metrics_mapper'] for a prometheus.metric to datadog.metric mapping
- call check method with the same name as the metric
- log info if none of the above worked
metric_transformers is a dict of <metric name>:<function to run when the metric name is encountered>
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/mixins.py
def process_metric(self, metric, scraper_config, metric_transformers=None):\n \"\"\"\n Handle a Prometheus metric according to the following flow:\n - search `scraper_config['metrics_mapper']` for a prometheus.metric to datadog.metric mapping\n - call check method with the same name as the metric\n - log info if none of the above worked\n\n `metric_transformers` is a dict of `<metric name>:<function to run when the metric name is encountered>`\n \"\"\"\n # If targeted metric, store labels\n self._store_labels(metric, scraper_config)\n\n if scraper_config['ignore_metrics']:\n if metric.name in scraper_config['_ignored_metrics']:\n self._send_telemetry_counter(\n self.TELEMETRY_COUNTER_METRICS_IGNORE_COUNT, len(metric.samples), scraper_config\n )\n return # Ignore the metric\n\n if scraper_config['_ignored_re'] and scraper_config['_ignored_re'].search(metric.name):\n # Metric must be ignored\n scraper_config['_ignored_metrics'].add(metric.name)\n self._send_telemetry_counter(\n self.TELEMETRY_COUNTER_METRICS_IGNORE_COUNT, len(metric.samples), scraper_config\n )\n return # Ignore the metric\n\n self._send_telemetry_counter(self.TELEMETRY_COUNTER_METRICS_PROCESS_COUNT, len(metric.samples), scraper_config)\n\n if self._filter_metric(metric, scraper_config):\n return # Ignore the metric\n\n # Filter metric to see if we can enrich with joined labels\n self._join_labels(metric, scraper_config)\n\n if scraper_config['_dry_run']:\n return\n\n try:\n self.submit_openmetric(scraper_config['metrics_mapper'][metric.name], metric, scraper_config)\n except KeyError:\n if metric_transformers is not None and metric.name in metric_transformers:\n try:\n # Get the transformer function for this specific metric\n transformer = metric_transformers[metric.name]\n transformer(metric, scraper_config)\n except Exception as err:\n self.log.warning('Error handling metric: %s - error: %s', metric.name, err)\n\n return\n # check for wildcards in transformers\n for transformer_name, transformer 
in iteritems(metric_transformers):\n if transformer_name.endswith('*') and metric.name.startswith(transformer_name[:-1]):\n transformer(metric, scraper_config, transformer_name)\n\n # try matching wildcards\n if scraper_config['_wildcards_re'] and scraper_config['_wildcards_re'].search(metric.name):\n self.submit_openmetric(metric.name, metric, scraper_config)\n return\n\n self.log.debug(\n 'Skipping metric `%s` as it is not defined in the metrics mapper, '\n 'has no transformer function, nor does it match any wildcards.',\n metric.name,\n )\n
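The transformer lookup in `process_metric` above tries an exact name first and then falls back to keys ending in `*`. The sketch below isolates that matching rule. It is a simplified illustration: `find_transformers` is a hypothetical helper, and the transformers here take a bare value rather than the `(metric, scraper_config)` arguments used by the real check.

```python
def find_transformers(metric_name, metric_transformers):
    """Return the transformer callables that apply to metric_name."""
    # An exact name match wins and skips the wildcard scan.
    if metric_name in metric_transformers:
        return [metric_transformers[metric_name]]
    # Otherwise keep every transformer whose key ends in '*' and whose
    # prefix (the key without the trailing '*') starts the metric name.
    return [
        func
        for pattern, func in metric_transformers.items()
        if pattern.endswith('*') and metric_name.startswith(pattern[:-1])
    ]


def to_seconds(value):
    """Hypothetical transformer: milliseconds to seconds."""
    return value / 1000.0


def identity(value):
    """Hypothetical transformer: pass the value through."""
    return value


transformers = {'request_latency_ms': to_seconds, 'go_*': identity}
exact_match = find_transformers('request_latency_ms', transformers)
wildcard_match = find_transformers('go_goroutines', transformers)
no_match = find_transformers('unknown_metric', transformers)
```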
If instance does not specify a value for a configuration option, the value will default to the init_config. Otherwise, the default_instance value will be used.
A default mixin configuration will be returned if there is no instance.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/mixins.py
def create_scraper_configuration(self, instance=None):\n \"\"\"\n Creates a scraper configuration.\n\n If instance does not specify a value for a configuration option, the value will default to the `init_config`.\n Otherwise, the `default_instance` value will be used.\n\n A default mixin configuration will be returned if there is no instance.\n \"\"\"\n if 'openmetrics_endpoint' in instance:\n raise CheckException('The setting `openmetrics_endpoint` is only available for Agent version 7 or later')\n\n # We can choose to create a default mixin configuration for an empty instance\n if instance is None:\n instance = {}\n\n # Supports new configuration options\n config = copy.deepcopy(instance)\n\n # Set the endpoint\n endpoint = instance.get('prometheus_url')\n if instance and endpoint is None:\n raise CheckException(\"You have to define a prometheus_url for each prometheus instance\")\n\n # Set the bearer token authorization to customer value, then get the bearer token\n self.update_prometheus_url(instance, config, endpoint)\n\n # `NAMESPACE` is the prefix metrics will have. 
Need to be hardcoded in the\n # child check class.\n namespace = instance.get('namespace')\n # Check if we have a namespace\n if instance and namespace is None:\n if self.default_namespace is None:\n raise CheckException(\"You have to define a namespace for each prometheus check\")\n namespace = self.default_namespace\n\n config['namespace'] = namespace\n\n # Retrieve potential default instance settings for the namespace\n default_instance = self.default_instances.get(namespace, {})\n\n def _get_setting(name, default):\n return instance.get(name, default_instance.get(name, default))\n\n # `metrics_mapper` is a dictionary where the keys are the metrics to capture\n # and the values are the corresponding metrics names to have in datadog.\n # Note: it is empty in the parent class but will need to be\n # overloaded/hardcoded in the final check not to be counted as custom metric.\n\n # Metrics are preprocessed if no mapping\n metrics_mapper = {}\n # We merge list and dictionaries from optional defaults & instance settings\n metrics = default_instance.get('metrics', []) + instance.get('metrics', [])\n for metric in metrics:\n if isinstance(metric, string_types):\n metrics_mapper[metric] = metric\n else:\n metrics_mapper.update(metric)\n\n config['metrics_mapper'] = metrics_mapper\n\n # `_wildcards_re` is a Pattern object used to match metric wildcards\n config['_wildcards_re'] = None\n\n wildcards = set()\n for metric in config['metrics_mapper']:\n if \"*\" in metric:\n wildcards.add(translate(metric))\n\n if wildcards:\n config['_wildcards_re'] = compile('|'.join(wildcards))\n\n # `prometheus_metrics_prefix` allows to specify a prefix that all\n # prometheus metrics should have. 
This can be used when the prometheus\n # endpoint we are scrapping allows to add a custom prefix to it's\n # metrics.\n config['prometheus_metrics_prefix'] = instance.get(\n 'prometheus_metrics_prefix', default_instance.get('prometheus_metrics_prefix', '')\n )\n\n # `label_joins` holds the configuration for extracting 1:1 labels from\n # a target metric to all metric matching the label, example:\n # self.label_joins = {\n # 'kube_pod_info': {\n # 'labels_to_match': ['pod'],\n # 'labels_to_get': ['node', 'host_ip']\n # }\n # }\n config['label_joins'] = default_instance.get('label_joins', {})\n config['label_joins'].update(instance.get('label_joins', {}))\n\n # `_label_mapping` holds the additionals label info to add for a specific\n # label value, example:\n # self._label_mapping = {\n # 'pod': {\n # 'dd-agent-9s1l1': {\n # \"node\": \"yolo\",\n # \"host_ip\": \"yey\"\n # }\n # }\n # }\n config['_label_mapping'] = {}\n\n # `_active_label_mapping` holds a dictionary of label values found during the run\n # to cleanup the label_mapping of unused values, example:\n # self._active_label_mapping = {\n # 'pod': {\n # 'dd-agent-9s1l1': True\n # }\n # }\n config['_active_label_mapping'] = {}\n\n # `_watched_labels` holds the sets of labels to watch for enrichment\n config['_watched_labels'] = {}\n\n config['_dry_run'] = True\n\n # Some metrics are ignored because they are duplicates or introduce a\n # very high cardinality. 
Metrics included in this list will be silently\n # skipped without a 'Unable to handle metric' debug line in the logs\n config['ignore_metrics'] = instance.get('ignore_metrics', default_instance.get('ignore_metrics', []))\n config['_ignored_metrics'] = set()\n\n # `_ignored_re` is a Pattern object used to match ignored metric patterns\n config['_ignored_re'] = None\n ignored_patterns = set()\n\n # Separate ignored metric names and ignored patterns in different sets for faster lookup later\n for metric in config['ignore_metrics']:\n if '*' in metric:\n ignored_patterns.add(translate(metric))\n else:\n config['_ignored_metrics'].add(metric)\n\n if ignored_patterns:\n config['_ignored_re'] = compile('|'.join(ignored_patterns))\n\n # Ignore metrics based on label keys or specific label values\n config['ignore_metrics_by_labels'] = instance.get(\n 'ignore_metrics_by_labels', default_instance.get('ignore_metrics_by_labels', {})\n )\n\n # If you want to send the buckets as tagged values when dealing with histograms,\n # set send_histograms_buckets to True, set to False otherwise.\n config['send_histograms_buckets'] = is_affirmative(\n instance.get('send_histograms_buckets', default_instance.get('send_histograms_buckets', True))\n )\n\n # If you want the bucket to be non cumulative and to come with upper/lower bound tags\n # set non_cumulative_buckets to True, enabled when distribution metrics are enabled.\n config['non_cumulative_buckets'] = is_affirmative(\n instance.get('non_cumulative_buckets', default_instance.get('non_cumulative_buckets', False))\n )\n\n # Send histograms as datadog distribution metrics\n config['send_distribution_buckets'] = is_affirmative(\n instance.get('send_distribution_buckets', default_instance.get('send_distribution_buckets', False))\n )\n\n # Non cumulative buckets are mandatory for distribution metrics\n if config['send_distribution_buckets'] is True:\n config['non_cumulative_buckets'] = True\n\n # If you want to send `counter` metrics as 
monotonic counts, set this value to True.\n # Set to False if you want to instead send those metrics as `gauge`.\n config['send_monotonic_counter'] = is_affirmative(\n instance.get('send_monotonic_counter', default_instance.get('send_monotonic_counter', True))\n )\n\n # If you want `counter` metrics to be submitted as both gauges and monotonic counts. Set this value to True.\n config['send_monotonic_with_gauge'] = is_affirmative(\n instance.get('send_monotonic_with_gauge', default_instance.get('send_monotonic_with_gauge', False))\n )\n\n config['send_distribution_counts_as_monotonic'] = is_affirmative(\n instance.get(\n 'send_distribution_counts_as_monotonic',\n default_instance.get('send_distribution_counts_as_monotonic', False),\n )\n )\n\n config['send_distribution_sums_as_monotonic'] = is_affirmative(\n instance.get(\n 'send_distribution_sums_as_monotonic',\n default_instance.get('send_distribution_sums_as_monotonic', False),\n )\n )\n\n # If the `labels_mapper` dictionary is provided, the metrics labels names\n # in the `labels_mapper` will use the corresponding value as tag name\n # when sending the gauges.\n config['labels_mapper'] = default_instance.get('labels_mapper', {})\n config['labels_mapper'].update(instance.get('labels_mapper', {}))\n # Rename bucket \"le\" label to \"upper_bound\"\n config['labels_mapper']['le'] = 'upper_bound'\n\n # `exclude_labels` is an array of label names to exclude. Those labels\n # will just not be added as tags when submitting the metric.\n config['exclude_labels'] = default_instance.get('exclude_labels', []) + instance.get('exclude_labels', [])\n\n # `include_labels` is an array of label names to include. 
If these labels are not in\n # the `exclude_labels` list, then they are added as tags when submitting the metric.\n config['include_labels'] = default_instance.get('include_labels', []) + instance.get('include_labels', [])\n\n # `type_overrides` is a dictionary where the keys are prometheus metric names\n # and the values are a metric type (name as string) to use instead of the one\n # listed in the payload. It can be used to force a type on untyped metrics.\n # Note: it is empty in the parent class but will need to be\n # overloaded/hardcoded in the final check not to be counted as custom metric.\n config['type_overrides'] = default_instance.get('type_overrides', {})\n config['type_overrides'].update(instance.get('type_overrides', {}))\n\n # `_type_override_patterns` is a dictionary where we store Pattern objects\n # that match metric names as keys, and their corresponding metric type overrides as values.\n config['_type_override_patterns'] = {}\n\n with_wildcards = set()\n for metric, type in iteritems(config['type_overrides']):\n if '*' in metric:\n config['_type_override_patterns'][compile(translate(metric))] = type\n with_wildcards.add(metric)\n\n # cleanup metric names with wildcards from the 'type_overrides' dict\n for metric in with_wildcards:\n del config['type_overrides'][metric]\n\n # Some metrics are retrieved from different hosts and often\n # a label can hold this information, this transfers it to the hostname\n config['label_to_hostname'] = instance.get('label_to_hostname', default_instance.get('label_to_hostname', None))\n\n # In combination to label_as_hostname, allows to add a common suffix to the hostnames\n # submitted. 
This can be used for instance to discriminate hosts between clusters.\n config['label_to_hostname_suffix'] = instance.get(\n 'label_to_hostname_suffix', default_instance.get('label_to_hostname_suffix', None)\n )\n\n # Add a 'health' service check for the prometheus endpoint\n config['health_service_check'] = is_affirmative(\n instance.get('health_service_check', default_instance.get('health_service_check', True))\n )\n\n # Can either be only the path to the certificate and thus you should specify the private key\n # or it can be the path to a file containing both the certificate & the private key\n config['ssl_cert'] = instance.get('ssl_cert', default_instance.get('ssl_cert', None))\n\n # Needed if the certificate does not include the private key\n #\n # /!\\ The private key to your local certificate must be unencrypted.\n # Currently, Requests does not support using encrypted keys.\n config['ssl_private_key'] = instance.get('ssl_private_key', default_instance.get('ssl_private_key', None))\n\n # The path to the trusted CA used for generating custom certificates\n config['ssl_ca_cert'] = instance.get('ssl_ca_cert', default_instance.get('ssl_ca_cert', None))\n\n # Whether or not to validate SSL certificates\n config['ssl_verify'] = is_affirmative(instance.get('ssl_verify', default_instance.get('ssl_verify', True)))\n\n # Extra http headers to be sent when polling endpoint\n config['extra_headers'] = default_instance.get('extra_headers', {})\n config['extra_headers'].update(instance.get('extra_headers', {}))\n\n # Timeout used during the network request\n config['prometheus_timeout'] = instance.get(\n 'prometheus_timeout', default_instance.get('prometheus_timeout', 10)\n )\n\n # Authentication used when polling endpoint\n config['username'] = instance.get('username', default_instance.get('username', None))\n config['password'] = instance.get('password', default_instance.get('password', None))\n\n # Custom tags that will be sent with each metric\n config['custom_tags'] 
= instance.get('tags', [])\n\n # Some tags can be ignored to reduce the cardinality.\n # This can be useful for cost optimization in containerized environments\n # when the openmetrics check is configured to collect custom metrics.\n # Even when the Agent's Tagger is configured to add low-cardinality tags only,\n # some tags can still generate unwanted metric contexts (e.g pod annotations as tags).\n ignore_tags = instance.get('ignore_tags', default_instance.get('ignore_tags', []))\n if ignore_tags:\n ignored_tags_re = compile('|'.join(set(ignore_tags)))\n config['custom_tags'] = [tag for tag in config['custom_tags'] if not ignored_tags_re.search(tag)]\n\n # Additional tags to be sent with each metric\n config['_metric_tags'] = []\n\n # List of strings to filter the input text payload on. If any line contains\n # one of these strings, it will be filtered out before being parsed.\n # INTERNAL FEATURE, might be removed in future versions\n config['_text_filter_blacklist'] = []\n\n # Refresh the bearer token every 60 seconds by default.\n # Ref https://github.com/DataDog/datadog-agent/pull/11686\n config['bearer_token_refresh_interval'] = instance.get(\n 'bearer_token_refresh_interval', default_instance.get('bearer_token_refresh_interval', 60)\n )\n\n config['telemetry'] = is_affirmative(instance.get('telemetry', default_instance.get('telemetry', False)))\n\n # The metric name services use to indicate build information\n config['metadata_metric_name'] = instance.get(\n 'metadata_metric_name', default_instance.get('metadata_metric_name')\n )\n\n # Map of metadata key names to label names\n config['metadata_label_map'] = instance.get(\n 'metadata_label_map', default_instance.get('metadata_label_map', {})\n )\n\n config['_default_metric_transformers'] = {}\n if config['metadata_metric_name'] and config['metadata_label_map']:\n config['_default_metric_transformers'][config['metadata_metric_name']] = self.transform_metadata\n\n # Whether or not to enable flushing of the 
first value of monotonic counts\n config['_flush_first_value'] = False\n\n # Whether to use process_start_time_seconds to decide if counter-like values should be flushed\n # on first scrape.\n config['use_process_start_time'] = is_affirmative(_get_setting('use_process_start_time', False))\n\n return config\n
Some options can be set globally in init_config (with instances taking precedence). For complete documentation of every option, see the associated configuration templates for the instances and init_config sections.
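The precedence described above (instance first, then init_config, then a built-in default) can be sketched as a small lookup. This is an illustration under stated assumptions: `resolve_option` is a hypothetical helper, and the option values are made up for the example.

```python
def resolve_option(name, instance, init_config, default=None):
    """Look up an option, letting the instance override init_config."""
    if name in instance:
        return instance[name]
    if name in init_config:
        return init_config[name]
    return default


# Hypothetical configuration sections, for illustration only.
init_config = {'timeout': 10, 'telemetry': False}
instance = {'openmetrics_endpoint': 'http://localhost:9090/metrics', 'timeout': 3}

timeout = resolve_option('timeout', instance, init_config)
telemetry = resolve_option('telemetry', instance, init_config)
namespace = resolve_option('namespace', instance, init_config, default='example')
```

Here `timeout` comes from the instance, `telemetry` falls back to init_config, and `namespace` falls back to the supplied default because neither section sets it.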
"},{"location":"legacy/prometheus/#config-changes-between-versions","title":"Config changes between versions","text":"
There are config option changes between OpenMetrics V1 and V2, so check if any updated OpenMetrics instances use deprecated options and update accordingly.
Note: The type_overrides option is incorporated in the metrics option. The metrics option defines the list of metrics to collect from the openmetrics_endpoint. It can also remap the names and types of exposed metrics and use regular expressions to match exposed metrics.
share_labels are used to join labels with a 1:1 mapping and can take other parameters for sharing. More information can be found in conf.yaml.example.
All HTTP options are also supported.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/base_check.py
class StandardFields(object):\n pass\n
"},{"location":"legacy/prometheus/#prometheus-to-datadog-metric-types","title":"Prometheus to Datadog metric types","text":"
The OpenMetrics Base Check supports various configurations for submitting Prometheus metrics to Datadog. The supported Prometheus metric types are gauge, counter, histogram, and summary.
A Prometheus counter is a cumulative metric that represents a single monotonically increasing counter whose value can only increase or be reset to zero on restart.
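How a counter ends up in Datadog depends on the `send_monotonic_counter` and `send_monotonic_with_gauge` options used by `submit_openmetric`. The sketch below only models that decision. `counter_submissions` is a hypothetical helper that returns the submission method names instead of calling the Agent.

```python
def counter_submissions(name, send_monotonic_counter=True, send_monotonic_with_gauge=False):
    """Return (submission method, metric name) pairs for one counter sample."""
    submissions = []
    if send_monotonic_counter:
        submissions.append(('monotonic_count', name))
    else:
        submissions.append(('gauge', name))
    # Optionally submit a monotonic count as well, under a '.total' suffix.
    if send_monotonic_with_gauge:
        submissions.append(('monotonic_count', name + '.total'))
    return submissions


default_behavior = counter_submissions('requests')
as_gauge = counter_submissions('requests', send_monotonic_counter=False)
both = counter_submissions('requests', send_monotonic_counter=False, send_monotonic_with_gauge=True)
```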
Config Option | Value | Datadog Metric Submitted
send_monotonic_counter | true (default) | monotonic_count
send_monotonic_counter | false | gauge"},{"location":"legacy/prometheus/#histogram","title":"Histogram","text":"
A Prometheus histogram samples observations and counts them in configurable buckets along with a sum of all observed values.
Histogram metrics ending in:
_sum represent the total sum of all observed values. Sums generally behave like counters, but a negative observation is possible, in which case the sum does not behave like a typical always-increasing counter.
_count represent the total number of events that have been observed.
_bucket represent the cumulative counters for the observation buckets. Note that buckets are only submitted if send_histograms_buckets is enabled.
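Because `_bucket` samples are cumulative, each bucket includes the counts of every smaller bucket. The sketch below shows one way to turn cumulative buckets into per-bucket counts with explicit lower and upper bounds, which is the idea behind `non_cumulative_buckets`. `decumulate` is a hypothetical helper, and starting the first bucket at negative infinity is an assumption of this example, not a statement about the library.

```python
def decumulate(buckets):
    """Convert (upper_bound, cumulative_count) pairs into
    (lower_bound, upper_bound, count) triples.

    The input must be sorted by upper bound.
    """
    result = []
    lower_bound = float('-inf')  # assumption: the first bucket is open below
    previous = 0
    for upper_bound, cumulative in buckets:
        # The count for this bucket alone is the increase over the previous one.
        result.append((lower_bound, upper_bound, cumulative - previous))
        lower_bound, previous = upper_bound, cumulative
    return result


# Hypothetical histogram: 2 observations <= 0.1, 5 <= 0.5, 6 in total.
cumulative = [(0.1, 2), (0.5, 5), (float('inf'), 6)]
per_bucket = decumulate(cumulative)
```

The per-bucket counts sum to the last cumulative value, which is also what the `_count` sample reports.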
Subtype | Config Option | Value | Datadog Metric Submitted
- | send_distribution_buckets | true | The entire histogram can be submitted as a single distribution metric. If the option is enabled, none of the subtype metrics will be submitted.
_sum | send_distribution_sums_as_monotonic | false (default) | gauge
_sum | send_distribution_sums_as_monotonic | true | monotonic_count
_count | send_distribution_counts_as_monotonic | false (default) | gauge
_count | send_distribution_counts_as_monotonic | true | monotonic_count
_bucket | non_cumulative_buckets | false (default) | gauge
_bucket | non_cumulative_buckets | true | monotonic_count under .count metric name if send_distribution_counts_as_monotonic is enabled. Otherwise, gauge."},{"location":"legacy/prometheus/#summary","title":"Summary","text":"
Prometheus summary metrics are similar to histograms but allow configurable quantiles.
Summary metrics ending in:
_sum represent the total sum of all observed values. Sums generally behave like counters, but a negative observation is also possible, in which case the sum would not behave like a typical always-increasing counter.
_count represent the total number of events that have been observed.
metrics with labels like {quantile=\"<\u03c6>\"} represent the streaming quantiles of observed events.
The default values for optional settings are populated in defaults.py and are derived from the value property of config spec options. The precedence is the default key followed by the example key (if it appears to represent a real value rather than an illustrative example and the type is a primitive). In all other cases, the default is None, which means there is no default getter function.
If such a validator exists in validators.py, then it is called once with the raw config that was supplied by the user. The returned mapping is used as the input config for the subsequent stages.
The value of each field goes through the following steps.
"},{"location":"meta/config-models/#default-value-population","title":"Default value population","text":"
If a field was not supplied by the user nor during the initialization stage, then its default value is taken from defaults.py. This stage is skipped for required fields.
"},{"location":"meta/config-models/#custom-field-validators","title":"Custom field validators","text":"
The contents of validators.py are entirely custom and contain functions to perform extra validation if necessary.
Such validators are called for the appropriate field of the proper model. The returned value is used as the new value of the option for the subsequent stages.
Note
This only occurs if the option was supplied by the user.
"},{"location":"meta/config-models/#pre-defined-field-validators","title":"Pre-defined field validators","text":"
A validators key under the value property of config spec options is considered. Every entry refers to a relative import path to a field validator under datadog_checks.base.utils.models.validation and is executed in the defined order.
Note
This only occurs if the option was supplied by the user.
"},{"location":"meta/config-models/#conversion-to-immutable-types","title":"Conversion to immutable types","text":"
Every list is converted to tuple and every dict is converted to types.MappingProxyType.
Note
A field or nested field would only be a dict when it is defined as a mapping with arbitrary keys. Otherwise, it would be a model with its own properties as usual.
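The conversion can be pictured with a short Python sketch. The helper name freeze and the sample values are hypothetical and only illustrate the list-to-tuple and dict-to-MappingProxyType rule described above; this is not the code the base package runs.

```python
from types import MappingProxyType


def freeze(value):
    # Hypothetical helper: recursively replace lists with tuples and
    # dicts with read-only mapping proxies; leave other values untouched.
    if isinstance(value, list):
        return tuple(freeze(item) for item in value)
    if isinstance(value, dict):
        return MappingProxyType({key: freeze(item) for key, item in value.items()})
    return value


# Sample instance data, invented for illustration.
frozen = freeze({'tags': ['env:dev', 'team:core'], 'headers': {'X-Token': 'abc'}})
```

Assigning to a key of the result raises a TypeError, which is why later stages can only read the values or raise errors.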
If such a validator exists in validators.py, then it is called with the final constructed model. At this point, it cannot be mutated, so you can only raise errors.
Every integration has a specification detailing all the options that influence behavior. These YAML files are located at <INTEGRATION>/assets/configuration/spec.yaml.
name - This is the name of the file the Agent will look for (REQUIRED)
example_name - This is the name of the example file the Agent will ship. If none is provided, the default will be conf.yaml.example. The exceptions are as follows:
Auto-discovery files, which are named auto_conf.yaml
Python-based core check default files, which are named conf.yaml.default
description - Information about the option. This can be a multi-line string, but each line must contain fewer than 120 characters (REQUIRED).
required - Whether or not the option is required for basic functionality. It defaults to false.
hidden - Whether or not the option should not be publicly exposed. It defaults to false.
display_priority - An integer representing the relative visual rank the option should take on compared to other options when publicly exposed. It defaults to 0, meaning that every option will be displayed in the order defined in the spec.
deprecation - If the option is deprecated, a mapping of relevant information. For example:
deprecation:\n Agent version: 8.0.0\n Migration: |\n do this\n and that\n
multiple - Whether or not options may be selected multiple times like instances or just once like init_config
multiple_instances_defined - Whether or not we separate the definition into multiple instances or just one
metadata_tags - A list of tags (like docs:foo) that can be used for unexpected use cases
options - Nested options, indicating that this is a section like instances or logs
value - The expected type data
There are 2 types of options: those with and without a value. Those with a value attribute are the actual user-controlled settings that influence behavior like username. Those without are expected to be sections and therefore must have an options attribute. An option cannot have both attributes.
Options with a value (non-section) also support:
secret - Whether or not consumers should treat the option as sensitive information like password. It defaults to false.
Info
The option vs section logic was chosen instead of going fully typed to avoid deeply nested values.
The type system is based on a loose subset of OpenAPI 3 data types.
The differences are:
Only the minimum and maximum numeric modifiers are supported
Only the pattern string modifier is supported
The properties object modifier is not a map, but rather a list of maps with a required name attribute. This is so consumers will load objects consistently regardless of language guarantees regarding map key order.
Values also support 1 field of our own:
example - An example value, only required if the type is boolean. The default is <OPTION_NAME>.
Every option may reference pre-defined templates using a key called template. The template format looks like path/to/template_file where path/to must point to an existing directory relative to a template directory and template_file must have the file extension .yaml or .yml.
You can use custom templates that will take precedence over the pre-defined templates by using the template_paths parameter of the ConfigSpec class.
The example consumer uses each spec to render the example configuration files that are shipped with every Agent and individual Integration release.
It respects a few extra option-level attributes:
example - A complete example of an option in lieu of a strictly typed value attribute
enabled - Whether or not to un-comment the option, overriding the behavior of required
display_priority - This is an integer affecting the order in which options are displayed, with higher values indicating higher priority. The default is 0.
It also respects a few extra fields under the value attribute of each option:
display_default - This is the default value that will be shown in the header of each option, useful if it differs from the example. You may set it to null explicitly to disable showing this part of the header.
compact_example - Whether or not to display complex types like arrays in their most compact representation. It defaults to false.
Use the --sync flag of the config validation command to render the example configuration files.
"},{"location":"meta/config-specs/#data-model-consumer","title":"Data model consumer","text":"
The model consumer uses each spec to render the pydantic models that checks use to validate and interface with configuration. The models are shipped with every Agent and individual Integration release.
It respects a few extra fields under the value attribute of each option:
default - This is the default value that options will be set to, taking precedence over the example.
validators - This refers to an array of pre-defined field validators to use. Every entry will refer to a relative import path to a field validator under datadog_checks.base.utils.models.validation and will be executed in the defined order.
Use the --sync flag of the model validation command to render the data model files.
"},{"location":"meta/config-specs/#api","title":"API","text":""},{"location":"meta/config-specs/#datadog_checks.dev.tooling.configuration.ConfigSpec","title":"datadog_checks.dev.tooling.configuration.ConfigSpec","text":"Source code in datadog_checks_dev/datadog_checks/dev/tooling/configuration/core.py
class ConfigSpec(object):\n def __init__(self, contents: str, template_paths: List[str] = None, source: str = None, version: str = None):\n \"\"\"\n Parameters:\n\n contents:\n the raw text contents of a spec\n template_paths:\n a sequence of directories that will take precedence when looking for templates\n source:\n a textual representation of what the spec refers to, usually an integration name\n version:\n the version of the spec to default to if the spec does not define one\n \"\"\"\n self.contents = contents\n self.source = source\n self.version = version\n self.templates = ConfigTemplates(template_paths)\n self.data: Union[dict, None] = None\n self.errors = []\n\n def load(self) -> None:\n \"\"\"\n This function de-serializes the specification and:\n 1. fills in default values\n 2. populates any selected templates\n 3. accumulates all error/warning messages\n If the `errors` attribute is empty after this is called, the `data` attribute\n will be the fully resolved spec object.\n \"\"\"\n if self.data is not None and not self.errors:\n return\n\n try:\n self.data = yaml.safe_load(self.contents)\n except Exception as e:\n self.errors.append(f'{self.source}: Unable to parse the configuration specification: {e}')\n return\n\n spec_validator(self.data, self)\n
contents:\n the raw text contents of a spec\ntemplate_paths:\n a sequence of directories that will take precedence when looking for templates\nsource:\n a textual representation of what the spec refers to, usually an integration name\nversion:\n the version of the spec to default to if the spec does not define one\n
Source code in datadog_checks_dev/datadog_checks/dev/tooling/configuration/core.py
def __init__(self, contents: str, template_paths: List[str] = None, source: str = None, version: str = None):\n \"\"\"\n Parameters:\n\n contents:\n the raw text contents of a spec\n template_paths:\n a sequence of directories that will take precedence when looking for templates\n source:\n a textual representation of what the spec refers to, usually an integration name\n version:\n the version of the spec to default to if the spec does not define one\n \"\"\"\n self.contents = contents\n self.source = source\n self.version = version\n self.templates = ConfigTemplates(template_paths)\n self.data: Union[dict, None] = None\n self.errors = []\n
This function de-serializes the specification and: 1. fills in default values 2. populates any selected templates 3. accumulates all error/warning messages If the errors attribute is empty after this is called, the data attribute will be the fully resolved spec object.
Source code in datadog_checks_dev/datadog_checks/dev/tooling/configuration/core.py
def load(self) -> None:\n \"\"\"\n This function de-serializes the specification and:\n 1. fills in default values\n 2. populates any selected templates\n 3. accumulates all error/warning messages\n If the `errors` attribute is empty after this is called, the `data` attribute\n will be the fully resolved spec object.\n \"\"\"\n if self.data is not None and not self.errors:\n return\n\n try:\n self.data = yaml.safe_load(self.contents)\n except Exception as e:\n self.errors.append(f'{self.source}: Unable to parse the configuration specification: {e}')\n return\n\n spec_validator(self.data, self)\n
Our CI deploys the documentation to GitHub Pages if any changes occur on commits to the master branch.
Danger
Never make documentation non-deterministic as it will trigger deploys for every single commit.
For example, say you want to display the valid values of a CLI option and the enumeration is represented as a set. Formatting the sequence directly will produce inconsistent results because sets do not guarantee order like dictionaries do, so you must sort it first.
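A minimal sketch of that fix, assuming a hypothetical helper format_choices and an invented set of values:

```python
def format_choices(choices):
    # Sort first so the rendered text is identical on every build,
    # regardless of the iteration order of the set.
    return ', '.join(sorted(choices))


# Invented enumeration of valid values for a CLI option.
valid_values = {'yaml', 'json', 'toml'}
rendered = format_choices(valid_values)
```

Because the output no longer depends on set iteration order, rebuilding the documentation without source changes produces no diff and triggers no deploy.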
We use the official labeler action to automatically add labels to pull requests.
The labeler is configured to add the following:
Label | Condition
integration/<NAME> | any directory at the root that actually contains an integration
documentation | any Markdown, config specs, manifest.json, or anything in /docs/
dev/testing | GitHub Actions or Codecov config
dev/tooling | GitLab or GitHub Actions config, or ddev
dependencies | any change in shipped dependencies
release | any base package, dev package, or integration release
changelog/no-changelog | any release, or if all files don't modify code that is shipped"},{"location":"meta/ci/testing/","title":"Testing","text":""},{"location":"meta/ci/testing/#workflows","title":"Workflows","text":"
Master - Runs tests on Python 3 for every target on merges to the master branch
PR - Runs tests on Python 2 & 3 for any modified target in a pull request as long as the base or developer packages were not modified
PR All - Runs tests on Python 2 & 3 for every target in a pull request if the base or developer packages were modified
Nightly minimum base package test - Runs tests for every target once nightly using the minimum declared required version of the base package
Nightly Python 2 tests - Runs tests on Python 2 for every target once nightly
Test Agent release - Runs tests for every target when manually scheduled using specific versions of the Agent for E2E tests
This workflow is meant to be used on pull requests.
First it computes the job matrix based on what was changed. Since this is time sensitive, rather than fetching the entire history we use GitHub's API to find out the precise depth to fetch in order to reach the merge base. Then it runs the test workflow for every job in the matrix.
Note
Changes that match any of the following patterns inside a directory will trigger the testing of that target:
assets/configuration/**/*
tests/**/*
*.py
hatch.toml
metadata.csv
pyproject.toml
Warning
A matrix is limited to 256 jobs. Rather than allowing a workflow error, the matrix generator will enforce the cap and emit a warning.
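The capping behavior can be sketched as follows. The function name cap_matrix and the choice to keep the first 256 entries are assumptions made for illustration; the actual generator may select jobs differently.

```python
import warnings

# Job limit for a single matrix, as stated above.
MAX_MATRIX_JOBS = 256


def cap_matrix(jobs, limit=MAX_MATRIX_JOBS):
    # Hypothetical helper: truncate instead of letting the workflow fail,
    # and emit a warning so the truncation is visible.
    jobs = list(jobs)
    if len(jobs) > limit:
        warnings.warn(f'Job matrix has {len(jobs)} entries, keeping the first {limit}')
        return jobs[:limit]
    return jobs
```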
This workflow runs a single job that is the foundation of how all tests are executed. Depending on the input parameters, the order of operations is as follows:
Checkout code (on pull requests this is a merge commit)
Set up Python 2.7
Set up the Python version the Agent currently ships
Some targets require additional set up such as the installation of system dependencies. Therefore, all such logic is put into scripts that live under /.ddev/ci/scripts.
As targets may need different set up on different platforms, all scripts live under a directory named after the platform ID. All scripts in the directory are executed in lexicographical order. Files in the scripts directory whose names begin with an underscore are not executed.
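The selection rule reads roughly like this Python sketch. The function name select_scripts and the file names are hypothetical; only the ordering and the underscore rule come from the description above.

```python
def select_scripts(file_names):
    # Skip helper files whose names begin with an underscore,
    # then run the rest in lexicographical order.
    return sorted(name for name in file_names if not name.startswith('_'))


# Invented contents of a platform directory.
order = select_scripts(['20_start.sh', '_common.sh', '10_install.sh'])
```

Numeric prefixes such as 10_ and 20_ are one way to make the execution order explicit.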
The step that executes these scripts is the only step that has access to secrets.
Since environment variables defined in a workflow do not propagate to reusable workflows, secrets must be passed as a JSON string representing a map.
Both the PR test and Test target reusable workflows for testing accept a setup-env-vars input parameter that defines the environment variables for the setup step. For example:
If environment variables need to be available for testing, you can add a script that writes to the file defined by the GITHUB_ENV environment variable:
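A minimal Python sketch of such a script is shown here. The helper name export_env_var and the variable ACME_API_URL are hypothetical; the only behavior relied upon is that GitHub Actions reads NAME=value lines appended to the file named by GITHUB_ENV.

```python
import os


def export_env_var(name, value):
    # Append a NAME=value line to the file GitHub Actions exposes through
    # GITHUB_ENV so the variable is visible to later steps of the job.
    with open(os.environ['GITHUB_ENV'], 'a', encoding='utf-8') as env_file:
        print(f'{name}={value}', file=env_file)
```

A setup script could then call export_env_var('ACME_API_URL', 'http://localhost:8080') once the service it starts is reachable.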
Configuration for targets lives under the overrides.ci key inside a /.ddev/config.toml file.
Note
Targets are referenced by the name of their directory.
"},{"location":"meta/ci/testing/#platforms","title":"Platforms","text":"Name ID Default runner Linux linux Ubuntu 22.04 Windows windows Windows Server 2022 macOS macos macOS 12
If an integration's manifest.json indicates that the only supported platform is Windows then that will be used to run tests, otherwise they will run on Linux.
To override the platform(s) used, one can set the overrides.ci.<TARGET>.platforms array. For example:
During testing we use ddtrace to submit APM data to the Datadog Agent. To avoid every job pulling the Agent, these HTTP trace requests are captured and saved to a newline-delimited JSON file.
A workflow then runs after all jobs are finished and replays the requests to the Agent. At the end the artifact is deleted to avoid needless storage persistence and so that, if individual jobs are rerun, only the new traces are submitted.
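The capture file format can be illustrated with a small sketch. The function names and the shape of each request record are assumptions; the only property taken from the text above is that every captured request is stored as one JSON document per line.

```python
import json


def append_request(path, request):
    # Store one captured request per line (newline-delimited JSON).
    with open(path, 'a', encoding='utf-8') as capture_file:
        print(json.dumps(request), file=capture_file)


def read_requests(path):
    # Load every captured request back, ignoring blank lines.
    with open(path, encoding='utf-8') as capture_file:
        return [json.loads(line) for line in capture_file if line.strip()]
```

Appending line by line lets each job add to its own file without parsing what is already there, and the replay step only needs to iterate over the lines.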
We maintain a public dashboard for monitoring our CI.
A workflow runs on merges to the master branch that, if the files defining the dependencies have not changed, saves the dependencies shared by all targets for the current Python version for each platform.
During testing the cache is restored, with a fallback to an older compatible version of the cache.
The first command invocation is extraordinarily slow (see actions/runner-images#6561). Bash appears to be the least affected so we set that as the default shell for all workflows that run commands.
Note
The official checkout action is affected by a similar issue (see actions/checkout#1246) that has been narrowed down to disk I/O.
Various validations are run to check for correctness. There is a reusable workflow that repositories may call with input parameters defining which validations to use, with each input parameter corresponding to a subcommand under the ddev validate command group.
This validates that each integration version is in sync with the requirements-agent-release.txt file. It is uncommon for this to fail because the release process is automated.
This validates that all CI entries for integrations are valid. This includes checking if the integration has the correct Codecov config, and has a valid CI entry if it is testable.
Tip
Run ddev validate ci --sync to resolve most errors.
This validates that every integration has a codeowner entry. If this validation fails, add an entry in the codeowners file corresponding to any newly added integration.
Note
This validation is only enabled for integrations-extras.
This verifies that the config specs for all integrations are valid by enforcing our configuration spec schema. The most common failure is some version of File <INTEGRATION_SPEC> needs to be synced. To resolve this issue, run ddev validate config --sync.
If you see failures regarding formatting or missing parameters, see our config spec documentation for more details on how to construct configuration specs.
This validates that the manifest files contain required fields, are formatted correctly, and don't contain common errors. See the Datadog docs for more detailed constraints.
This ensures that every integration's README.md file is formatted correctly. The main purpose of this validation is to ensure that any image linked in the readme exists and that all images are located in an integration's /image directory.
"},{"location":"tutorials/jmx/integration/#step-1-create-a-jmx-integration-scaffolding","title":"Step 1: Create a JMX integration scaffolding","text":"
ddev create --type jmx MyJMXIntegration\n
A JMX integration contains specific init configs and instance configs:
init_config:\n is_jmx: true # tells the Agent that the integration is a JMX type of integration\n collect_default_metrics: true # if true, metrics declared in `metrics.yaml` are collected\n\ninstances:\n - host: <HOST> # JMX hostname\n port: <PORT> # JMX port\n ...\n
Other init and instance configs can be found on the JMX integration page.
"},{"location":"tutorials/jmx/integration/#step-2-define-metrics-you-want-to-collect","title":"Step 2: Define metrics you want to collect","text":"
Select which metrics you want to collect from JMX. Available metrics can usually be found in the official documentation of the service you want to monitor.
You can also use tools like VisualVM, JConsole or jmxterm to explore the available JMX beans and their descriptions.
"},{"location":"tutorials/logs/http-crawler/#define-an-agent-check","title":"Define an Agent Check","text":"
We start by registering an implementation for our integration. At first it is empty; we will expand on it step by step.
Open datadog_checks/acme/check.py in our editor and put the following there:
from datadog_checks.base.checks.logs.crawler.base import LogCrawlerCheck\n\n\nclass AcmeCheck(LogCrawlerCheck):\n __NAMESPACE__ = 'acme'\n
Now we'll run something we will refer to as the check command:
ddev env agent acme py3.11 check\n
We'll see the following error:
Can't instantiate abstract class AcmeCheck with abstract method get_log_streams\n
We need to define the get_log_streams method. As stated in the docs, it must return an iterator over LogStream subclasses. The next section describes this further.
"},{"location":"tutorials/logs/http-crawler/#define-a-stream-of-logs","title":"Define a Stream of Logs","text":"
In the same file, add a LogStream subclass and return it (wrapped in a list) from AcmeCheck.get_log_streams:
from datadog_checks.base.checks.logs.crawler.base import LogCrawlerCheck\nfrom datadog_checks.base.checks.logs.crawler.stream import LogStream\n\nclass AcmeCheck(LogCrawlerCheck):\n __NAMESPACE__ = 'acme'\n\n def get_log_streams(self):\n return [AcmeLogStream(check=self, name='ACME log stream')]\n\nclass AcmeLogStream(LogStream):\n \"\"\"Stream of Logs from ACME\"\"\"\n
Now running the check command will show a new error:
TypeError: Can't instantiate abstract class AcmeLogStream with abstract method records\n
Once again we need to define a method, this time LogStream.records. This method accepts a cursor argument. We ignore this argument for now and explain it later.
from datadog_checks.base.checks.logs.crawler.stream import LogRecord, LogStream\nfrom datadog_checks.base.utils.time import get_timestamp\n\n... # Skip AcmeCheck to focus on LogStream.\n\n\nclass AcmeLogStream(LogStream):\n \"\"\"Stream of Logs from ACME\"\"\"\n\n def records(self, cursor=None):\n return [\n LogRecord(\n data={'message': 'This is a log from ACME.', 'level': 'info'},\n cursor={'timestamp': get_timestamp()},\n )\n ]\n
There are several things going on here. AcmeLogStream.records returns an iterator over LogRecord objects. For simplicity here we return a list with just one record. After we understand what each LogRecord looks like we can discuss how to generate multiple records.
"},{"location":"tutorials/logs/http-crawler/#what-is-a-log-record","title":"What is a Log Record?","text":"
The LogRecord class has 2 fields. In data we put anything that we want to submit as a log to Datadog. In cursor we store a unique identifier for this specific LogRecord.
We use the cursor field to checkpoint our progress as we scrape the external API. In other words, every time our integration completes its run we save the last cursor we submitted. We can then resume scraping from this cursor. That's what the cursor argument to the records method is for. The very first time the integration runs this cursor is None because we have no checkpoints. For every subsequent integration run, the cursor will be set to the LogRecord.cursor of the last LogRecord yielded or returned from records.
Some things to consider when defining cursors:
Use UTC time stamps!
Only using the timestamp as a unique identifier may not be enough. We can have different records with the same timestamp.
One popular identifier is the order of the log record in the stream. Whether this works or not depends on the API we are crawling.
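The considerations above can be sketched in plain Python, independent of the crawler classes. The event shape (a UTC epoch timestamp plus a numeric id) and the helper names are assumptions made for illustration; a real API may offer a different tie-breaker.

```python
def make_cursor(event):
    # Combine the UTC timestamp with a tie-breaker so that two records
    # sharing a timestamp still get distinct cursors.
    return {'timestamp': event['timestamp'], 'id': event['id']}


def is_after_cursor(event, cursor):
    # On the very first run there is no checkpoint, so everything is new.
    if cursor is None:
        return True
    return (event['timestamp'], event['id']) > (cursor['timestamp'], cursor['id'])


def new_events(events, cursor=None):
    # Keep only the events that come after the last checkpoint.
    return [event for event in events if is_after_cursor(event, cursor)]
```

Comparing the (timestamp, id) pair rather than the timestamp alone means a record that shares its timestamp with the last checkpoint is neither skipped nor submitted twice.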
"},{"location":"tutorials/logs/http-crawler/#scraping-for-log-records","title":"Scraping for Log Records","text":"
In our toy example we returned a list with just one record. In practice we will need to create a list or lazy iterator over LogRecords. We will construct them from data that we collect from the external API, in this case the one from ACME.
Below are some tips and considerations when scraping external APIs:
Use the cursor argument to checkpoint your progress.
The Agent schedules an integration run approximately every 10-15 seconds.
The intake won't accept logs that are older than 18 hours. For better performance skip such logs as you generate LogRecord items.
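The age check from the last tip might look like the following sketch. The helper name is_fresh and the use of epoch seconds are assumptions; the 18 hour limit comes from the tip above.

```python
import time

# Logs older than this are rejected by the intake, as noted above.
MAX_LOG_AGE_SECONDS = 18 * 60 * 60


def is_fresh(timestamp, now=None):
    # Compare a record's UTC epoch timestamp (in seconds) against the limit.
    now = time.time() if now is None else now
    return now - timestamp <= MAX_LOG_AGE_SECONDS
```

Applying this filter before building each LogRecord avoids spending time on records the intake would drop anyway.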
SNMP is a protocol for gathering metrics from network devices, but automated testing of the integration would not be practical nor reliable if we used actual devices.
Our approach is to use a simulated SNMP device that responds to SNMP queries using simulation data.
This simulated device is brought up as a Docker container when starting the SNMP test environment using:
The community_string must match the corresponding device .snmprec file name. For example, myprofile.snmprec gives community_string: myprofile. This also applies to walk files: myprofile.snmpwalk gives community_string: myprofile.
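Put differently, the community string is the file name without its extension. A tiny sketch, with a hypothetical helper name:

```python
from pathlib import Path


def community_string_for(file_name):
    # Drop the .snmprec or .snmpwalk extension to get the community string.
    return Path(file_name).stem
```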
To find the IP address of the SNMP container, run:
Make sure you have the Net-SNMP tools installed on your machine. These should come pre-installed by default on Linux and macOS. If necessary, you can download them on the Net-SNMP website.
To query a specific OID from a device, we can use the snmpget command.
For example, the following command will query sysDescr OID of an SNMP device, which returns its human-readable description:
$ snmpget -v 2c -c public -IR 127.0.0.1:1161 system.sysDescr.0\nSNMPv2-MIB::sysDescr.0 = STRING: Linux 41ba948911b9 4.9.87-linuxkit-aufs #1 SMP Wed Mar 14 15:12:16 UTC 2018 x86_64\nSNMPv2-MIB::sysORUpTime.1 = Timeticks: (9) 0:00:00.09\n
Let's break this command down:
snmpget: this command sends an SNMP GET request, and can be used to query the value of an OID. Here, we are requesting the system.sysDescr.0 OID.
-v 2c: instructs your SNMP client to send the request using SNMP version 2c. See SNMP Versions.
-c public: instructs the SNMP client to send the community string public along with our request. (This is a form of authentication provided by SNMP v2. See SNMP Versions.)
127.0.0.1:1161: this is the host and port where the simulated SNMP agent is available. (Confirm the port used by the ddev environment by inspecting the Docker port mapping via $ docker ps.)
system.sysDescr.0: this is the OID that the client should request. In practice this can refer to either a fully-resolved OID (e.g. 1.3.6.1.4.1[...]), or a label (e.g. sysDescr.0).
-IR: this option allows us to use labels for OIDs that aren't in the generic 1.3.6.1.2.1.* sub-tree (see: The OID tree). TL;DR: always use this option when working with OIDs coming from vendor-specific MIBs.
Tip
If the above command fails, try using the explicit OID like so:
$ snmpget -v 2c -c public -IR 127.0.0.1:1161 iso.3.6.1.2.1.1.1.0\n
To generate simulation data for tables automatically, use the mib2dev.py tool shipped with snmpsim. This tool will be renamed as snmpsim-record-mibs in the upcoming 1.0 release of the library.
First, install snmpsim:
pip install snmpsim\n
Then run the tool, specifying the MIB with the start and stop OIDs (which can correspond to e.g. the first and last columns in the table respectively).
mib2dev has a known issue with IF-MIB::ifPhysAddress, which is expected to contain a hexadecimal string, but mib2dev fills it with an arbitrary string. To fix this, provide a valid hex string when prompted on the command line:
# Synthesizing row #1 of table 1.3.6.1.2.1.2.2.1\n*** Inconsistent value: Display format eval failure: b'driving kept zombies quaintly forward zombies': invalid literal for int() with base 16: 'driving kept zombies quaintly forward zombies'caused by <class 'ValueError'>: invalid literal for int() with base 16: 'driving kept zombies quaintly forward zombies'\n*** See constraints and suggest a better one for:\n# Table IF-MIB::ifTable\n# Row IF-MIB::ifEntry\n# Index IF-MIB::ifIndex (type InterfaceIndex)\n# Column IF-MIB::ifPhysAddress (type PhysAddress)\n# Value ['driving kept zombies quaintly forward zombies'] ? 001122334455\n
"},{"location":"tutorials/snmp/how-to/#generate-simulation-data-from-a-walk","title":"Generate simulation data from a walk","text":"
As an alternative to .snmprec files, it is possible to use a walk as simulation data. This is especially useful when debugging live devices, since you can export the device walk and use this real data locally.
To do so, paste the output of a walk query into a .snmpwalk file, and add this file to the test data directory. Then, pass the name of the walk file as the community_string. For more information, see Test SNMP profiles locally.
"},{"location":"tutorials/snmp/how-to/#find-where-mibs-are-installed-on-your-machine","title":"Find where MIBs are installed on your machine","text":"
Since community resources that list MIBs and OIDs are best effort, the MIB you are investigating may not be present or may not be available in its latest version.
In that case, you can use the snmptranslate CLI tool to output similar information for MIBs installed on your system. This tool is part of Net-SNMP - see SNMP queries prerequisites.
Steps
Run $ snmptranslate -m <MIBNAME> -Tz -On to get a complete list of OIDs in the <MIBNAME> MIB along with their labels.
Redirect to a file for nicer formatting as needed.
Use the -M <DIR> option to specify the directory where snmptranslate should look for MIBs. Useful if you want to inspect a MIB you've just downloaded but not moved to the default MIB directory.
Tip
Use -Tp for an alternative tree-like formatting.
"},{"location":"tutorials/snmp/introduction/","title":"Introduction to SNMP","text":"
In this introduction, we'll cover general information about the SNMP protocol, including key concepts such as OIDs and MIBs.
If you're already familiar with the SNMP protocol, feel free to skip to the next page.
"},{"location":"tutorials/snmp/introduction/#what-is-snmp","title":"What is SNMP?","text":""},{"location":"tutorials/snmp/introduction/#overview","title":"Overview","text":"
SNMP (Simple Network Management Protocol) is a protocol for monitoring network devices. It uses UDP and supports both a request/response model (commands and queries) and a notification model (traps, informs).
In the request/response model, the SNMP manager (e.g. the Datadog Agent) issues an SNMP command (GET, GETNEXT, BULK) to an SNMP agent (e.g. a network device).
SNMP was born in the 1980s, so it has been around for a long time. While more modern alternatives like NETCONF and OpenConfig have been gaining attention, a large number of network devices still use SNMP as their primary monitoring interface.
The SNMP protocol exists in 3 versions: v1 (legacy), v2c, and v3.
The main differences between v1/v2c and v3 are the authentication mechanism and transport layer, as summarized below.
Version | Authentication | Transport layer
v1/v2c | Password (the community string) | Plain text only
v3 | Username/password | Support for packet signing and encryption"},{"location":"tutorials/snmp/introduction/#oids","title":"OIDs","text":""},{"location":"tutorials/snmp/introduction/#what-is-an-oid","title":"What is an OID?","text":"
Identifiers for queryable quantities
An OID, also known as an Object Identifier, is an identifier for a quantity (\"object\") that can be retrieved from an SNMP device. Such quantities may include uptime, temperature, network traffic, etc (quantities available will vary across devices).
To make them processable by machines, OIDs are represented as dot-separated sequences of numbers, e.g. 1.3.6.1.2.1.1.1.
Global definition
OIDs are globally defined, which means they have the same meaning regardless of the device that processes the SNMP query. For example, querying the 1.3.6.1.2.1.1.1 OID (also known as sysDescr) on any SNMP agent will make it return the system description. (More on the OID/label mapping can be found in the MIBs section below.)
Not all OIDs contain metrics data
OIDs can refer to various types of objects, such as strings, numbers, tables, etc.
In particular, this means that only a fraction of OIDs refer to numerical quantities that can actually be sent as metrics to Datadog. However, non-numerical OIDs can also be useful, especially for tagging.
"},{"location":"tutorials/snmp/introduction/#the-oid-tree","title":"The OID tree","text":"
OIDs are structured in a tree-like fashion. Each number in the OID represents a node in the tree.
The wildcard notation is often used to refer to a sub-tree of OIDs, e.g. 1.3.6.1.2.*.
It so happens that there are two main OID sub-trees: a sub-tree for general-purpose OIDs, and a sub-tree for vendor-specific OIDs.
Located under the sub-tree: 1.3.6.1.4.1.* (a.k.a. enterprises).
These OIDs are defined and managed by network device vendors themselves.
Each vendor is assigned its own enterprise sub-tree in the form of 1.3.6.1.4.1.<N>.*.
For example:
1.3.6.1.4.1.2.* is the sub-tree for IBM-specific OIDs.
1.3.6.1.4.1.9.* is the sub-tree for Cisco-specific OIDs.
The full list of vendor sub-trees can be found here: SNMP OID 1.3.6.1.4.1.
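As a rough illustration of the wildcard notation, the sketch below checks whether an OID falls under a sub-tree pattern. The helper names parse_oid and in_subtree are hypothetical and are not part of the SNMP integration.

```python
def parse_oid(oid):
    # Turn a dotted OID string such as '1.3.6.1' into a tuple of integers.
    return tuple(int(part) for part in oid.strip('.').split('.'))


def in_subtree(oid, pattern):
    # A pattern ending in '.*' matches every OID strictly below that node;
    # any other pattern must match the OID exactly.
    if pattern.endswith('.*'):
        prefix = parse_oid(pattern[:-2])
        nodes = parse_oid(oid)
        return len(nodes) > len(prefix) and nodes[:len(prefix)] == prefix
    return parse_oid(oid) == parse_oid(pattern)


print(in_subtree('1.3.6.1.4.1.9.1.2', '1.3.6.1.4.1.9.*'))  # True
print(in_subtree('1.3.6.1.4.1.2.6', '1.3.6.1.4.1.9.*'))  # False
```

Comparing integer tuples rather than raw strings avoids false matches such as 1.3.6.1.4.1.90 being treated as part of the 1.3.6.1.4.1.9.* sub-tree.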
"},{"location":"tutorials/snmp/introduction/#notable-oids","title":"Notable OIDs","text":"OID Label Description 1.3.6.1.2.1.2sysObjectId An OID whose value is an OID that represents the device make and model (yes, it's a bit meta). 1.3.6.1.2.1.1.1sysDescr A human-readable, free-form description of the device. 1.3.6.1.2.1.1.3sysUpTimeInstance The device uptime."},{"location":"tutorials/snmp/introduction/#mibs","title":"MIBs","text":""},{"location":"tutorials/snmp/introduction/#what-is-an-mib","title":"What is an MIB?","text":"
OIDs are grouped in modules called MIBs (Management Information Base). An MIB describes the hierarchy of a given set of OIDs. (This is somewhat analogous to a dictionary that contains the definitions for each word in a spoken language.)
For example, the IF-MIB describes the hierarchy of OIDs within the sub-tree 1.3.6.1.2.1.2.*. These OIDs contain metrics about the network interfaces available on the device. (Note how its location under the 1.3.6.1.2.* sub-tree indicates that it is a generic MIB, available on most network devices.)
As part of the description of OIDs, an MIB defines a human-readable label for each OID. For example, SNMPv2-MIB describes the OID 1.3.6.1.2.1.1.1 and assigns it the label sysDescr. The operation that consists of finding the OID from a label is called OID resolution.
"},{"location":"tutorials/snmp/introduction/#tools-and-resources","title":"Tools and resources","text":"
The following resources can be useful when working with MIBs:
MIB Discovery: a search engine for OIDs. Use it to find what an OID corresponds to, which MIB it comes from, what label it is known as, etc.
Circitor MIB files repository: a repository and search engine where one can download actual .mib files.
SNMP Labs MIB repository: alternate repo of many common MIBs. Note: this site hosts the underlying MIBs which the pysnmp-mibs library (used by the SNMP Python check) actually validates against. Double check any MIB you get from an alternate source with what is in this repo.
Tutorials: Internet Management and SNMP (YouTube) (In-depth videos about SNMP architecture, MIBs, protocol data structures, security models, monitoring code examples, etc.)
"},{"location":"tutorials/snmp/profile-format/","title":"Profile Format Reference","text":""},{"location":"tutorials/snmp/profile-format/#overview","title":"Overview","text":"
SNMP profiles are our way of providing out-of-the-box monitoring for certain makes and models of network devices.
An SNMP profile is materialised as a YAML file with the following structure:
sysobjectid: <x.y.z...>\n\n# extends:\n# <Optional list of base profiles to extend from...>\n\nmetrics:\n # <List of metrics to collect...>\n\n# metric_tags:\n# <List of tags to apply to collected metrics. Required for table metrics, optional otherwise>\n
This field can be used to include metrics and metric tags from other so-called base profiles. Base profiles can derive from other base profiles to build a hierarchy of reusable profile mixins.
Important
All device profiles should extend from the _base.yaml profile, which defines items that should be collected for all devices.
Example:
extends:\n - _base.yaml\n - _generic-if.yaml # Include basic metrics from IF-MIB.\n
Entries in the metrics field define which metrics will be collected by the profile. They can reference either a single OID (a.k.a. symbol), or an SNMP table.
An SNMP symbol is an object with a scalar type (e.g. Counter32, Integer32, OctetString, etc.).
In a MIB file, a symbol can be recognized as an OBJECT-TYPE node with a scalar SYNTAX, placed under an OBJECT IDENTIFIER node (which is often the root OID of the MIB):
In profiles, tables can be specified as entries containing the MIB, table and symbols fields. The syntax for the value contained in each row is typically <TABLE_OID>.1.<COLUMN_ID>.<INDEX>:
metrics:\n # Example for the dummy table above:\n - MIB: EXAMPLE-MIB\n table:\n # Identification of the table which metrics come from.\n OID: 1.3.6.1.4.1.10\n name: exampleTable\n symbols:\n # List of symbols ('columns') to retrieve.\n # Same format as for a single OID.\n # The value from each row (index) in the table will be collected `<TABLE_OID>.1.<COLUMN_ID>.<INDEX>`\n - OID: 1.3.6.1.4.1.10.1.1\n name: exampleColumn1\n - OID: 1.3.6.1.4.1.10.1.2\n name: exampleColumn2\n # ...\n\n # More realistic example:\n - MIB: CISCO-PROCESS-MIB\n table:\n # Each row in this table contains information about a CPU unit of the device.\n OID: 1.3.6.1.4.1.9.9.109.1.1.1\n name: cpmCPUTotalTable\n symbols:\n - OID: 1.3.6.1.4.1.9.9.109.1.1.1.1.12\n name: cpmCPUMemoryUsed\n # ...\n
Table metrics require metric_tags to identify each row's metric. It is possible to add tags to metrics retrieved from a table in three ways:
"},{"location":"tutorials/snmp/profile-format/#using-a-column-within-the-same-table","title":"Using a column within the same table","text":"
metrics:\n - MIB: IF-MIB\n table:\n OID: 1.3.6.1.2.1.2.2\n name: ifTable\n symbols:\n - OID: 1.3.6.1.2.1.2.2.1.14\n name: ifInErrors\n # ...\n metric_tags:\n # Add an 'interface' tag to each metric of each row,\n # whose value is obtained from the 'ifDescr' column of the row.\n # This allows querying metrics by interface, e.g. 'interface:eth0'.\n - tag: interface\n symbol:\n OID: 1.3.6.1.2.1.2.2.1.2\n name: ifDescr\n
"},{"location":"tutorials/snmp/profile-format/#using-a-column-from-a-different-table-with-identical-indexes","title":"Using a column from a different table with identical indexes","text":"
"},{"location":"tutorials/snmp/profile-format/#using-a-column-from-a-different-table-with-different-indexes","title":"Using a column from a different table with different indexes","text":"
If the external table has different indexes, use index_transform to select a subset of the full index. index_transform is a list of start/end ranges to extract from the current table index to match the external table index. start and end are inclusive.
External table indexes must be a subset of the indexes of the current table, or the same indexes in a different order.
Example
In the example above, the index of cpiPduBranchTable looks like 1.6.0.36.155.53.3.246: the first digit is the cpiPduBranchId index and the rest is the cpiPduBranchMac index. The index of cpiPduTable looks like 6.0.36.155.53.3.246 and represents cpiPduMac (equivalent to cpiPduBranchMac).
By using the index_transform with start 1 and end 7, we extract 6.0.36.155.53.3.246 from 1.6.0.36.155.53.3.246 (cpiPduBranchTable full index), and then use it to match 6.0.36.155.53.3.246 (cpiPduTable full index).
index_transform can be more complex. The following definition will extract 2.3.5.6.7 from 1.2.3.4.5.6.7.
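The range extraction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name apply_index_transform is hypothetical, positions are taken to be zero-based (as the start 1 / end 7 example implies), and the ranges used for the second call are one combination that yields the stated result.

```python
def apply_index_transform(index, ranges):
    # index: dotted index string, e.g. '1.6.0.36.155.53.3.246'.
    # ranges: list of (start, end) pairs, zero-based and inclusive on both ends.
    parts = index.split('.')
    selected = []
    for start, end in ranges:
        selected.extend(parts[start:end + 1])
    return '.'.join(selected)


# Drop the leading cpiPduBranchId digit to obtain the cpiPduTable index.
print(apply_index_transform('1.6.0.36.155.53.3.246', [(1, 7)]))  # 6.0.36.155.53.3.246

# Two ranges: keep positions 1-2 and 4-6.
print(apply_index_transform('1.2.3.4.5.6.7', [(1, 2), (4, 6)]))  # 2.3.5.6.7
```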
"},{"location":"tutorials/snmp/profile-format/#mapping-column-to-tag-string-value","title":"Mapping column to tag string value","text":"
You can use the following syntax to map OID values to tag string values. In the example below, the submitted metrics will be snmp.ifInOctets with tags like if_type:regular1822. Available in Agent 7.45+.
"},{"location":"tutorials/snmp/profile-format/#using-an-index","title":"Using an index","text":"
Important: \"index\" refers to one digit of the index part of the row OID. For example, if the column OID is 1.2.3.1.2 and the row OID is 1.2.3.1.2.7.8.9, the full index is 7.8.9. In this example, index: 1 refers to 7 and index: 2 refers to 8, and so on.
Here is a specific example of an OID with multiple positions in the index (OID ref):
cfwConnectionStatEntry OBJECT-TYPE\n SYNTAX CfwConnectionStatEntry\n ACCESS not-accessible\n STATUS mandatory\n DESCRIPTION\n \"An entry in the table, containing information about a\n firewall statistic.\"\n INDEX { cfwConnectionStatService, cfwConnectionStatType }\n ::= { cfwConnectionStatTable 1 }\n
The index in this case is a combination of cfwConnectionStatService and cfwConnectionStatType. Inspecting the OBJECT-TYPE of cfwConnectionStatService reveals the SYNTAX as Services (OID ref):
cfwConnectionStatService OBJECT-TYPE\n SYNTAX Services\n MAX-ACCESS not-accessible\n STATUS current\n DESCRIPTION\n \"The identification of the type of connection providing\n statistics.\"\n ::= { cfwConnectionStatEntry 1 }\n
For example, when we fetch the value of cfwConnectionStatValue, the OID with the index looks like 1.3.6.1.4.1.9.9.147.1.2.2.2.1.5.20.2 = 4087850099. Here the indexes are 20.2 (1.3.6.1.4.1.9.9.147.1.2.2.2.1.5.<service type>.<stat type>). Here is how we would specify this configuration in the YAML (as seen in the corresponding profile packaged with the Agent):
metrics:\n - MIB: CISCO-FIREWALL-MIB\n table:\n OID: 1.3.6.1.4.1.9.9.147.1.2.2.2\n name: cfwConnectionStatTable\n symbols:\n - OID: 1.3.6.1.4.1.9.9.147.1.2.2.2.1.5\n name: cfwConnectionStatValue\n metric_tags:\n - index: 1 # capture first index digit\n tag: service_type\n - index: 2 # capture second index digit\n tag: stat_type\n
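To make the position-to-tag mapping concrete, here is a minimal Python sketch of how index positions could be turned into tags. The helper index_tags is hypothetical and only mirrors the behaviour described above; it is not the integration's actual code.

```python
def index_tags(column_oid, row_oid, tag_by_position):
    # The index is whatever follows the column OID in the row OID.
    prefix = column_oid + '.'
    if not row_oid.startswith(prefix):
        raise ValueError('row OID is not under the column OID')
    index_parts = row_oid[len(prefix):].split('.')
    # Positions are 1-based, matching index: 1, index: 2, and so on.
    return [
        tag + ':' + index_parts[position - 1]
        for position, tag in sorted(tag_by_position.items())
    ]


tags = index_tags(
    '1.3.6.1.4.1.9.9.147.1.2.2.2.1.5',
    '1.3.6.1.4.1.9.9.147.1.2.2.2.1.5.20.2',
    {1: 'service_type', 2: 'stat_type'},
)
print(tags)  # ['service_type:20', 'stat_type:2']
```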
"},{"location":"tutorials/snmp/profile-format/#mapping-index-to-tag-string-value","title":"Mapping index to tag string value","text":"
You can use the following syntax to map indexes to tag string values. In the example below, the submitted metrics will be snmp.ipSystemStatsHCInReceives with tags like ipversion:ipv6.
General guidelines on Datadog tagging also apply to table metric tags.
In particular, be mindful of the kind of value contained in the columns used as tag sources. E.g. avoid using a DisplayString (an arbitrarily long human-readable text description) or unbounded sources (timestamps, IDs...) as tag values.
Good candidates for tag values include short strings, enums, or integer indexes.
"},{"location":"tutorials/snmp/profile-format/#metric-type-inference","title":"Metric type inference","text":"
By default, the Datadog metric type of a symbol will be inferred from the SNMP type (i.e. the MIB SYNTAX):
SNMP type | Inferred metric type
Counter32 | rate
Counter64 | rate
Gauge32 | gauge
Integer | gauge
Integer32 | gauge
CounterBasedGauge64 | gauge
Opaque | gauge
SNMP types not listed in this table are submitted as gauge by default.
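The inference rule amounts to a lookup with a gauge fallback. The sketch below restates the table in Python; the names INFERRED_TYPES and infer_metric_type are hypothetical and chosen for illustration only.

```python
# SNMP type -> inferred Datadog metric type, as listed in the table above.
INFERRED_TYPES = {
    'Counter32': 'rate',
    'Counter64': 'rate',
    'Gauge32': 'gauge',
    'Integer': 'gauge',
    'Integer32': 'gauge',
    'CounterBasedGauge64': 'gauge',
    'Opaque': 'gauge',
}


def infer_metric_type(snmp_type):
    # Types missing from the table fall back to gauge.
    return INFERRED_TYPES.get(snmp_type, 'gauge')


print(infer_metric_type('Counter64'))  # rate
print(infer_metric_type('Unsigned32'))  # gauge
```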
Sometimes the inferred type may not be what you want. Typically, OIDs that represent \"total number of X\" are defined as Counter32 in MIBs, but you probably want to submit them as monotonic_count instead of rate.
For such cases, you can define a metric_type. Possible values and their effect are listed below.
Forced type | Description
gauge | Submit as a gauge.
rate | Submit as a rate.
percent | Multiply by 100 and submit as a rate.
monotonic_count | Submit as a monotonic count.
monotonic_count_and_rate | Submit 2 copies of the metric: one as a monotonic count, and one as a rate (suffixed with .rate).
flag_stream | Submit each flag of a flag stream as an individual metric with value 0 or 1. See Flag Stream section.
This works on both symbol and table metrics:
metrics:\n # On a symbol:\n - MIB: TCP-MIB\n symbol:\n OID: 1.3.6.1.2.1.6.5\n name: tcpActiveOpens\n metric_type: monotonic_count\n # On a table, apply same metric_type to all metrics:\n - MIB: IP-MIB\n table:\n OID: 1.3.6.1.2.1.4.31.1\n name: ipSystemStatsTable\n metric_type: monotonic_count\n symbols:\n - OID: 1.3.6.1.2.1.4.31.1.1.4\n name: ipSystemStatsHCInReceives\n - OID: 1.3.6.1.2.1.4.31.1.1.6\n name: ipSystemStatsHCInOctets\n # On a table, apply different metric_type per metric:\n - MIB: IP-MIB\n table:\n OID: 1.3.6.1.2.1.4.31.1\n name: ipSystemStatsTable\n symbols:\n - OID: 1.3.6.1.2.1.4.31.1.1.4\n name: ipSystemStatsHCInReceives\n metric_type: monotonic_count\n - OID: 1.3.6.1.2.1.4.31.1.1.6\n name: ipSystemStatsHCInOctets\n metric_type: gauge\n
When the value is a flag stream like 010101, you can use metric_type: flag_stream to submit each flag as an individual metric with value 0 or 1. Two options are required when using flag_stream:
options.placement: position of the flag in the flag stream (1-based indexing, first element is placement 1).
options.metric_suffix: suffix appended to the metric name for a specific flag, usually matching the name of the flag.
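The two options can be pictured with the following Python sketch. It is a simplified model, not the integration's implementation: the function name, the metric name snmp.myFlags, the suffixes flagA and flagB, and the dot used to join the suffix to the metric name are all assumptions made for the example.

```python
def flag_stream_metrics(metric_name, raw_value, flags):
    # flags: list of (placement, metric_suffix) pairs; placement is 1-based.
    metrics = {}
    for placement, suffix in flags:
        # Assumption: the suffix is joined to the metric name with a dot.
        metrics[metric_name + '.' + suffix] = int(raw_value[placement - 1])
    return metrics


result = flag_stream_metrics('snmp.myFlags', '010101', [(2, 'flagA'), (3, 'flagB')])
print(result)  # {'snmp.myFlags.flagA': 1, 'snmp.myFlags.flagB': 0}
```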
An snmp.myDevice metric is sent, with a value of 1 and tagged by statuses. This allows you to monitor status changes, number of devices per state, etc., in Datadog.
This field is used to apply tags to all metrics collected by the profile. It has the same meaning as the instance-level config option (see conf.yaml.example).
Several collection methods are supported, as illustrated below:
"},{"location":"tutorials/snmp/profile-format/#value-from-multiple-oids-symbols","title":"Value from multiple OIDs (symbols)","text":"
When the value might come from multiple symbols, we try to get the value from the first symbol. If the value can't be fetched (e.g. the OID is not available from the device), we try the second symbol, and so on.
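The fallback order can be summarised with a short Python sketch. The function first_available and the plain dictionary standing in for the device responses are hypothetical, and the OIDs and value are made up for the example.

```python
def first_available(device_values, oids):
    # Try each OID in order and return the first value the device provided.
    for oid in oids:
        value = device_values.get(oid)
        if value is not None:
            return value
    return None


# Made-up device data: only the second OID is available.
device_values = {'1.3.6.1.4.1.99999.2.0': 'edge-router'}
candidates = ['1.3.6.1.4.1.99999.1.0', '1.3.6.1.4.1.99999.2.0']
print(first_available(device_values, candidates))  # edge-router
```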
In the examples above, the OID value is an SNMP OctetString with value 22C, and we want 22 to be submitted as the value for snmp.temperature.
"},{"location":"tutorials/snmp/profile-format/#extract_value-can-be-used-to-trim-surrounding-non-printable-characters","title":"extract_value can be used to trim surrounding non-printable characters","text":"
If the raw SNMP OctetString value contains leading or trailing non-printable characters, you can use an extract_value regex like ([a-zA-Z0-9_]+) to ignore them.
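The effect of such a pattern can be checked with Python's re module. This is a sketch only: it assumes the first capture group of the first match is used, and the pattern ([0-9]+)C for the 22C case is an illustrative choice rather than the one from the profile.

```python
import re


def extract_value(raw, pattern):
    # Return the first capture group, or None when the pattern does not match.
    match = re.search(pattern, raw)
    return match.group(1) if match else None


# Numeric part of a temperature reading.
print(extract_value('22C', '([0-9]+)C'))  # 22

# A value padded with NUL characters on both sides.
padded = chr(0) + 'router_1' + chr(0)
print(extract_value(padded, '([a-zA-Z0-9_]+)'))  # router_1
```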
If a MAC address appears in tags as 0x000000000000 instead of 00:00:00:00:00:00, use format: mac_address to render it in the 00:00:00:00:00:00 format.
If an IP address appears in tags as 0x0a430007 instead of 10.67.0.7, use format: ip_address to render it in the 10.67.0.7 format.
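Both conversions are simple byte-wise re-encodings of the hexadecimal value, as the following Python sketch shows. The function names are hypothetical and the IP helper only covers 4-byte (IPv4) values.

```python
def _hex_to_bytes(hex_string):
    # Accept values with or without the leading '0x'.
    if hex_string.startswith('0x'):
        hex_string = hex_string[2:]
    return bytes.fromhex(hex_string)


def format_mac_address(hex_string):
    # Each byte becomes two lowercase hex digits, joined by colons.
    return ':'.join(format(byte, '02x') for byte in _hex_to_bytes(hex_string))


def format_ip_address(hex_string):
    # Each byte becomes a decimal number, joined by dots (IPv4 only).
    return '.'.join(str(byte) for byte in _hex_to_bytes(hex_string))


print(format_mac_address('0x000000000000'))  # 00:00:00:00:00:00
print(format_ip_address('0x0a430007'))  # 10.67.0.7
```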
Generally, you'll want to search the web and find out about the following:
Device name, manufacturer, and device sysobjectid.
Understand what the device does, and what it is used for. (Which metrics are relevant varies between routers, switches, bridges, etc. See Networking hardware.)
E.g. from the HP iLO Wikipedia page, we can see that iLO4 devices are used by system administrators for remote management of embedded servers.
Available versions of the device, and which ones we target.
E.g. HP iLO devices exist in multiple versions (version 3, version 4...). Here, we are specifically targeting HP iLO4.
Supported MIBs and OIDs (often available in official documentation), and associated MIB files.
E.g. we can see that HP provides a MIB package for iLO devices here.
Now that we have gathered some basic information about the device and its SNMP interfaces, we should decide which metrics we want to collect. (Devices often expose thousands of metrics through SNMP. We certainly don't want to collect them all.)
Devices typically expose thousands of OIDs that can span dozens of MIBs, so this can feel daunting at first. Remember, never give up!
Some guidelines to help you in this process:
10-40 metrics is a good amount already.
Explore base profiles to see which ones could be applicable to the device.
Explore manufacturer-specific MIB files looking for metrics such as:
General health: status gauges...
Network traffic: bytes in/out, errors in/out, ...
CPU and memory usage.
Temperature: temperature sensors, thermal condition, ...
sysobjectid can also be a wildcard pattern to match a sub-tree of devices, e.g. 1.3.6.1.131.12.4.*.
"},{"location":"tutorials/snmp/profiles/#generate-a-profile-file-from-a-collection-of-mibs","title":"Generate a profile file from a collection of MIBs","text":"
You can use ddev to create a profile from a list of MIBs.
$ ddev meta snmp generate-profile-from-mibs --help\n
This script requires a list of ASN.1 MIB files as input arguments, and copies to the clipboard a list of metrics that can be used to create a profile.
Will include system, interfaces and ip nodes from RFC1213-MIB, no node from CISCO-SYSLOG-MIB, and node snmpEngine from SNMP-FRAMEWORK-MIB.
Note that each MIB:node_name corresponds to exactly one OID. However, some MIBs report legacy nodes that are overwritten.
To resolve, edit the MIB by removing legacy values manually before loading them with this profile generator. If a MIB is fully supported, it can be omitted from the filter as MIBs not found in a filter will be fully loaded. If a MIB is not fully supported, it can be listed with an empty node list, as CISCO-SYSLOG-MIB in the example.
-a, --aliases is an option to provide the path to a YAML file containing a list of aliases to be used as metric tags for tables, in the following format:
MIB tables usually define one or more indexes, as columns within the same table, or as columns from a different table or even a different MIB. The index value can be used to tag the table's metrics. Indexes are defined in the INDEX field of row nodes.
As an example, entPhysicalContainsTable in ENTITY-MIB is as follows:
entPhysicalContainsEntry OBJECT-TYPE\nSYNTAX EntPhysicalContainsEntry\nMAX-ACCESS not-accessible\nSTATUS current\nDESCRIPTION\n \"A single container/'containee' relationship.\"\nINDEX { entPhysicalIndex, entPhysicalChildIndex } <== this is the index definition\n::= { entPhysicalContainsTable 1 }\n
or its JSON dump, where INDEX is replaced by indices:
Indexes can be replaced by another MIB symbol that is more human friendly. You might prefer to see the interface name versus its numerical table index. This can be achieved using metric_tag_aliases.
"},{"location":"tutorials/snmp/profiles/#add-unit-tests","title":"Add unit tests","text":"
Add a unit test in test_profiles.py to verify that the metric is successfully collected by the integration when the profile is enabled. (These unit tests are mostly used to prevent regressions and will help with maintenance.)
"},{"location":"tutorials/snmp/profiles/#rinse-and-repeat","title":"Rinse and repeat","text":"
We have now covered the basic workflow \u2014 add metrics, expand tests, add simulation data. You can now go ahead and add more metrics to the profile!
Congratulations! You should now be able to write a basic SNMP profile.
We kept this tutorial as simple as possible, but profiles offer many more options to collect metrics from SNMP devices.
To learn more about what can be done in profiles, read the Profile format reference.
To learn more about .snmprec files, see the Simulation data format reference.
"},{"location":"tutorials/snmp/sim-format/","title":"Simulation Data Format Reference","text":""},{"location":"tutorials/snmp/sim-format/#conventions","title":"Conventions","text":"
Simulation data for profiles is contained in .snmprec files located in the tests directory.
Simulation files must be named after the SNMP community string used in the profile unit tests. For example: cisco-nexus.snmprec.
Adding simulation data for tables can be particularly tedious. This section documents the manual process, but automatic generation is possible \u2014 see How to generate table simulation data.
For table metrics, add one copy of the metric per row, appending the index to the OID.
For example, to simulate 3 rows in the table 1.3.6.1.4.1.6.13 that has OIDs 1.3.6.1.4.1.6.13.1.6 and 1.3.6.1.4.1.6.13.1.8, you could write:
If the table uses table metric tags, you may need to add additional OID simulation data for those tags.
"},{"location":"tutorials/snmp/tools/","title":"Tools","text":""},{"location":"tutorials/snmp/tools/#using-tcpdump-with-snmp","title":"Using tcpdump with SNMP","text":"
The tcpdump command shows the exact request and response content of SNMP GET, GETNEXT and other SNMP calls.
In a shell run tcpdump:
tcpdump -vv -nni lo0 -T snmp host localhost and port 1161\n
-nn: turn off host and protocol name resolution (to avoid generating DNS packets)
-i INTERFACE: listen on INTERFACE (default: lowest numbered interface)
-T snmp: type/protocol, snmp in our case
In a separate shell, run snmpwalk or snmpget:
snmpwalk -O n -v2c -c <COMMUNITY_STRING> localhost:1161 1.3.6\n
After you've run snmpwalk, you'll see results like this from tcpdump:
tcpdump -vv -nni lo0 -T snmp host localhost and port 1161\ntcpdump: listening on lo0, link-type NULL (BSD loopback), capture size 262144 bytes\n17:25:43.639639 IP (tos 0x0, ttl 64, id 29570, offset 0, flags [none], proto UDP (17), length 76, bad cksum 0 (->91d)!)\n 127.0.0.1.59540 > 127.0.0.1.1161: { SNMPv2c C=\"cisco-nexus\" { GetRequest(28) R=1921760388 .1.3.6.1.2.1.1.2.0 } }\n17:25:43.645088 IP (tos 0x0, ttl 64, id 26543, offset 0, flags [none], proto UDP (17), length 88, bad cksum 0 (->14e4)!)\n 127.0.0.1.1161 > 127.0.0.1.59540: { SNMPv2c C=\"cisco-nexus\" { GetResponse(40) R=1921760388 .1.3.6.1.2.1.1.2.0=.1.3.6.1.4.1.9.12.3.1.3.1.2 } }\n
"},{"location":"tutorials/snmp/tools/#from-the-docker-agent-container","title":"From the Docker Agent container","text":"
If you want to run snmpget, snmpwalk, and tcpdump from the Docker Agent container you can install them by running the following commands (in the container):
apt update\napt install -y snmp tcpdump\n
"}]}
\ No newline at end of file
+{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Agent Integrations","text":"
Welcome to the wonderful world of developing Agent Integrations for Datadog. Here we document how we do things, the processes for various tasks, coding conventions & best practices, the internals of our testing infrastructure, and so much more.
If you are intrigued, continue reading. If not, continue all the same.
To start an environment run ddev env start <INTEGRATION> <ENVIRONMENT>, for example:
$ ddev env start postgres py3.9-14.0\n\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500 Starting: py3.9-14.0 \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\n[+] Running 4/4\n - Network compose_pg-net Created 0.1s\n - Container compose-postgres_replica2-1 Started 0.9s\n - Container compose-postgres_replica-1 Started 0.9s\n - Container compose-postgres-1 Started 0.9s\n\nmaster-py3: Pulling from datadog/agent-dev\nDigest: sha256:72824c9a986b0ef017eabba4e2cc9872333c7e16eec453b02b2276a40518655c\nStatus: Image is up to date for datadog/agent-dev:master-py3\ndocker.io/datadog/agent-dev:master-py3\n\nStop environment -> ddev env stop postgres py3.9-14.0\nExecute tests -> ddev env test postgres py3.9-14.0\nCheck status -> ddev env agent postgres py3.9-14.0 status\nTrigger run -> ddev env agent postgres py3.9-14.0 check\nReload config -> ddev env reload postgres py3.9-14.0\nManage config -> ddev env config\nConfig file -> C:\\Users\\ofek\\AppData\\Local\\ddev\\env\\postgres\\py3.9-14.0\\config\\postgres.yaml\n
This sets up the selected environment and an instance of the Agent running in a Docker container. The default configuration is defined by each environment's test suite and is saved to a file, which is then mounted to the Agent container so you may freely modify it.
Let's see what we have running:
$ docker ps --format \"table {{.Image}}\\t{{.Status}}\\t{{.Ports}}\\t{{.Names}}\"\nIMAGE STATUS PORTS NAMES\ndatadog/agent-dev:master-py3 Up 3 minutes (healthy) dd_postgres_py3.9-14.0\npostgres:14-alpine Up 3 minutes (healthy) 5432/tcp, 0.0.0.0:5434->5434/tcp compose-postgres_replica2-1\npostgres:14-alpine Up 3 minutes (healthy) 0.0.0.0:5432->5432/tcp compose-postgres-1\npostgres:14-alpine Up 3 minutes (healthy) 5432/tcp, 0.0.0.0:5433->5433/tcp compose-postgres_replica-1\n
By default the version of the integration used will be the one shipped with the chosen Agent version. If you wish to modify an integration and test changes in real time, use the --dev flag.
Doing so will mount and install the integration in the Agent container. All modifications to the integration's directory will be propagated to the Agent, whether it be a code change or switching to a different Git branch.
If you modify the base package then you will need to mount that with the --base flag, which implicitly activates --dev.
To run tests against the live Agent, use the ddev env test command. It is similar to the test command except it is capable of running tests marked as E2E, and only runs such tests.
You may start an interactive debugging session using the --breakpoint/-b option.
The option accepts an integer representing the line number at which to break. For convenience, 0 and -1 are shortcuts to the first and last line of the integration's check method, respectively.
$ ddev env agent postgres py3.9-14.0 check -b 0\n> /opt/datadog-agent/embedded/lib/python3.9/site-packages/datadog_checks/postgres/postgres.py(851)check()\n-> tags = copy.copy(self.tags)\n(Pdb) list\n846 }\n847 self._database_instance_emitted[self.resolved_hostname] = event\n848 self.database_monitoring_metadata(json.dumps(event, default=default_json_event_encoding))\n849\n850 def check(self, _):\n851 B-> tags = copy.copy(self.tags)\n852 # Collect metrics\n853 try:\n854 # Check version\n855 self._connect()\n856 self.load_version() # We don't want to cache versions between runs to capture minor updates for metadata\n
Caveat
The line number must be within the integration's check method.
Testing and manual check runs always reflect the current state of code and configuration. However, if you want to see the result of changes in-app, you will need to refresh the environment by running ddev env reload <INTEGRATION> <ENVIRONMENT>.
To work on any integration you must install Python 3.12.
After installation, restart your terminal and ensure that your newly installed Python comes first in your PATH.
macOSWindowsLinux
First update the formulae and Homebrew itself:
brew update\n
then install Python:
brew install python@3.12\n
After it completes, check the output to see if it asked you to run any extra commands and if so, execute them.
Verify successful PATH modification:
which -a python\n
Windows users have it the easiest.
Download the Python 3.12 64-bit executable installer and run it. When prompted, be sure to select the option to add to your PATH. Also, it is recommended that you choose the per-user installation method.
Verify successful PATH modification:
where python\n
Ah, you enjoy difficult things. Are you using Gentoo?
We recommend using either Miniconda or pyenv to install Python 3.12. Whatever you do, never modify the system Python.
"},{"location":"setup/#installers","title":"Installers","text":"macOSWindows GUI installerCommand line installer
In your browser, download the .pkg file: ddev-10.2.0.pkg
Run your downloaded file and follow the on-screen instructions.
Restart your terminal.
To verify that the shell can find and run the ddev command in your PATH, use the following command.
$ ddev --version\n10.2.0\n
Download the file using the curl command. The -o option specifies the file name that the downloaded package is written to. In this example, the file is written to ddev-10.2.0.pkg in the current directory.
Run the standard macOS installer program, specifying the downloaded .pkg file as the source. Use the -pkg parameter to specify the name of the package to install, and the -target / parameter for the drive in which to install the package. The files are installed to /usr/local/ddev, and an entry is created at /etc/paths.d/ddev that instructs shells to add the /usr/local/ddev directory to PATH. You must include sudo on the command to grant write permissions to those folders.
sudo installer -pkg ./ddev-10.2.0.pkg -target /\n
Restart your terminal.
To verify that the shell can find and run the ddev command in your PATH, use the following command.
$ ddev --version\n10.2.0\n
GUI installerCommand line installer
In your browser, download one of the .msi files:
ddev-10.2.0-x64.msi
ddev-10.2.0-x86.msi
Run your downloaded file and follow the on-screen instructions.
Restart your terminal.
To verify that the shell can find and run the ddev command in your PATH, use the following command.
$ ddev --version\n10.2.0\n
Download and run the installer using the standard Windows msiexec program, specifying one of the .msi files as the source. Use the /passive and /i parameters to request an unattended, normal installation.
After downloading the archive corresponding to your platform and architecture, extract the binary to a directory that is on your PATH and rename to ddev.
Do not use sudo as it may result in a broken installation!
Run:
pipx install -e /path/to/integrations-core/ddev\n
Run:
pipx install -e /path/to/integrations-core/ddev\n
Warning
Do not use sudo as it may result in a broken installation!
Re-sync dependencies at any time by running:
pipx upgrade ddev\n
Note
Be aware that this method does not keep track of dependencies so you will need to re-run the command if/when the required dependencies are changed.
Note
Also be aware that this method does not get any changes from datadog_checks_dev, so if you have unreleased changes from datadog_checks_dev that may affect ddev, you will need to run the following to get the most recent changes from datadog_checks_dev to your ddev:
You'll notice that all environments for running tests are prefixed with pyX.Y, indicating the Python version to use. If you don't have a particular version installed (for example Python 2.7), such environments will be skipped.
The second part of a test environment's name corresponds to the version of the product. For example, the 14.0 in py3.9-14.0 implies tests will run against version 14.x of PostgreSQL.
If there is no version suffix, it means that either:
the version is pinned, usually set to pull the latest release, or
there is no concept of a product, such as the disk check
Passing just the integration name will run every test environment. You may select a subset of environments to run by appending a : followed by a comma-separated list of environments.
For example, executing:
ddev test postgres:py3.9-13.0,py3.9-11.0\n
will run tests for the environment py3.9-13.0 followed by the environment py3.9-11.0.
If no integrations are specified then only integrations that were changed will be tested, based on a diff between the latest commit to the current and master branches.
Whether an integration is considered changed is determined by the file extensions of the paths in the diff. For example, if only Markdown files were modified then nothing will be tested.
The integrations will be tested in lexicographical order.
To measure code coverage, use the --cov/-c flag. Doing so will display a summary of coverage statistics after successful execution of integrations' tests.
To run only the lint checks, use the --lint/-s shortcut flag.
You may also only run the formatter using the --fmt/-fs shortcut flag. The formatter will automatically resolve the most common errors caught by the lint checks.
The IBM i integration uses ODBC to connect to IBM i hosts and query system data through an SQL interface. To do so, it uses the ODBC Driver for IBM i Access Client Solutions, an IBM proprietary ODBC driver that manages connections to IBM i hosts.
Limitations in the IBM i ODBC driver make it necessary to structure the check in a more complex way than would be expected, to prevent the check from hanging or leaking threads.
"},{"location":"architecture/ibm_i/#ibm-i-odbc-driver-limitations","title":"IBM i ODBC driver limitations","text":"
ODBC drivers can optionally support custom configuration through connection attributes, which help configure how a connection works. One fundamental connection attribute is SQL_ATTR_QUERY_TIMEOUT (and related _TIMEOUT attributes), which sets the timeout for SQL queries done through the driver (or the timeout for other connection steps for other _TIMEOUT attributes). If this connection attribute is not set there is no timeout, which means the driver gets stuck waiting for a reply when a network issue happens.
As of the writing of this document, the IBM i ODBC driver behavior when setting the SQL_ATTR_QUERY_TIMEOUT connection attribute is similar to the one described in ODBC Query Timeout Property. For the IBM i DB2 driver: the driver estimates the running time of a query and preemptively aborts the query if the estimate is above the specified threshold, but it does not take into account the actual running time of the query (and thus, it's not useful for avoiding network issues).
"},{"location":"architecture/ibm_i/#ibm-i-check-workaround","title":"IBM i check workaround","text":"
To deal with the ODBC driver limitations, the IBM i check needs an alternative way to abort a query once a given timeout has passed. To do so, the IBM i check runs queries in a subprocess, which it kills and restarts when a timeout expires. This subprocess runs query_script.py using the embedded Python interpreter.
It is essential that the connection is kept across queries. For a given connection, ELAPSED_ columns on IBM i views report statistics since the last time the table was queried on that connection, so if a different connection were used for each query, these values would always be zero.
To communicate with the main Agent process, the subprocess and the IBM i check exchange JSON-encoded messages through pipes until the special ENDOFQUERY message is received. Special care is needed to avoid blocking on reads and writes of the pipes.
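The following is a minimal sketch of this pattern, not the check's actual code. The QueryWorker class, the CHILD_SOURCE stand-in for query_script.py, and the message shapes are all assumptions made for illustration; only the ENDOFQUERY marker and the kill-and-restart behavior come from the description above. A reader thread feeds a queue so that the main thread never blocks on the pipe.

```python
import json
import queue
import subprocess
import sys
import threading
import time

# Hypothetical stand-in for query_script.py: answers each query line with one
# JSON row followed by the ENDOFQUERY marker. The query 'hang' simulates a
# driver call that never returns.
CHILD_SOURCE = '''
import json, sys, time
for line in sys.stdin:
    query = json.loads(line)['query']
    if query == 'hang':
        time.sleep(60)
    print(json.dumps({'row': [query.upper()]}), flush=True)
    print('ENDOFQUERY', flush=True)
'''


class QueryWorker:
    def __init__(self):
        self.process = None
        self.lines = None

    def _start(self):
        self.process = subprocess.Popen(
            [sys.executable, '-c', CHILD_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        self.lines = queue.Queue()
        # The reader thread drains the pipe so the caller can wait with a timeout.
        reader = threading.Thread(target=self._pump, args=(self.process.stdout, self.lines), daemon=True)
        reader.start()

    @staticmethod
    def _pump(stream, lines):
        for line in stream:
            lines.put(line.strip())

    def run(self, query, timeout):
        # Reuse the running subprocess so that state is kept across queries.
        if self.process is None or self.process.poll() is not None:
            self._start()
        print(json.dumps({'query': query}), file=self.process.stdin, flush=True)
        deadline = time.monotonic() + timeout
        rows = []
        while True:
            remaining = deadline - time.monotonic()
            try:
                if remaining <= 0:
                    raise queue.Empty
                line = self.lines.get(timeout=remaining)
            except queue.Empty:
                # Timeout: kill the subprocess; the next call starts a new one.
                self.process.kill()
                self.process.wait()
                self.process = None
                raise TimeoutError(query)
            if line == 'ENDOFQUERY':
                return rows
            rows.append(json.loads(line)['row'])
```

Calling run twice in a row reuses the same subprocess, which mirrors the requirement that the connection be kept across queries; only a timeout forces a restart.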
For adding/modifying the queries, the check uses the standard QueryManager class used for SQL-based checks, except that each query needs to include a timeout value (since, empirically, some queries take much longer to complete on IBM i hosts).
While most integrations are either Python, JMX, or implemented in the Agent in Go, the SNMP integration is a bit more complex.
Here's an overview of what this integration involves:
A Python check, responsible for:
Collecting metrics from a specific device IP. Metrics typically come from profiles, but they can also be specified explicitly.
Auto-discovering devices over a network. (Pending deprecation in favor of Agent auto-discovery.)
An Agent service listener, responsible for auto-discovering devices over a network and forwarding discovered instances to the existing Agent check scheduling pipeline. Also known as \"Agent SNMP auto-discovery\".
The diagram below shows how these components interact for a typical VM-based setup (single Agent on a host). For Datadog Cluster Agent (DCA) deployments, see Cluster Agent support.
The Python check includes a multithreaded implementation of device auto-discovery. It runs on instances that use network_address instead of ip_address:
The main tasks performed by device auto-discovery are:
Find new devices: For each IP in the network_address CIDR range, the check queries the device sysObjectID. If the query succeeds and the sysObjectID matches one of the registered profiles, the device is added as a discovered instance. This logic is run at regular intervals in a separate thread.
Cache devices: To improve performance, discovered instances are cached on disk based on a hash of the instance. Since options from the network_address instance are copied into discovered instances, the cache is invalidated if the network_address changes.
Check devices: On each check run, the check runs a check on all discovered instances. This is done in parallel using a threadpool. The check waits for all sub-checks to finish.
Handle failures: Discovered instances that fail after a configured number of times are dropped. They may be rediscovered later.
Submit discovery-related metrics: the check submits the total number of discovered devices for a given network_address instance.
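The device-finding step can be sketched as below. This is an illustration under assumptions, not the check's implementation: discover and fetch_sysobjectid are hypothetical names, profiles are matched by exact sysObjectID (real profiles may use patterns), and the probes run in a thread pool here for brevity rather than in a long-running discovery thread.

```python
import ipaddress
from concurrent.futures import ThreadPoolExecutor


def discover(network_address, profiles, fetch_sysobjectid, workers=4):
    '''Return {ip: profile_name} for every host whose sysObjectID matches a profile.

    fetch_sysobjectid is a hypothetical callable that returns the sysObjectID
    of a device as a string, or raises OSError when the device does not answer.
    '''
    def probe(ip):
        try:
            oid = fetch_sysobjectid(ip)
        except OSError:
            return ip, None
        # Assumption: exact match between the sysObjectID and a profile key.
        return ip, profiles.get(oid)

    # hosts() skips the network and broadcast addresses of the CIDR range.
    hosts = [str(ip) for ip in ipaddress.ip_network(network_address).hosts()]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(probe, hosts))
    return {ip: profile for ip, profile in results if profile is not None}
```

Devices that do not answer, or whose sysObjectID matches no registered profile, are simply left out of the result.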
The approach described above is not ideal for several reasons:
The check code is harder to understand since the two distinct paths (\"single device\" vs \"entire network\") live in a single integration.
Each network instance manages several long-running threads that span well beyond the lifespan of a single check run.
Each network check pseudo-schedules other instances, which is normally the responsibility of the Agent.
For these reasons, auto-discovery was eventually implemented in the Agent as a proper service listener (see below), and users should be discouraged from using Python auto-discovery. When the deprecation period expires, we will be able to remove auto-discovery logic from the Python check, making it exclusively focused on checking single devices.
Agent auto-discovery implements the same logic as the Python auto-discovery, but as a service listener in the Agent Go package.
This approach leverages the existing Agent scheduling logic, and makes it possible to scale device auto-discovery using the Datadog Cluster Agent (see Cluster Agent support).
Pending official documentation, here is an example configuration:
For Kubernetes environments, the Cluster Agent can be configured to use the SNMP Agent auto-discovery (via snmp listener) logic as a source of Cluster checks.
The Datadog Cluster Agent (DCA) uses the snmp_listener config (Agent auto-discovery) to listen for IP ranges, then schedules snmp check instances to be run by one or more normal Datadog Agents.
Agent auto-discovery combined with the Cluster Agent is very scalable and can be used to monitor a large number of SNMP devices.
"},{"location":"architecture/snmp/#example-cluster-agent-setup-with-snmp-agent-auto-discovery-using-datadog-helm-chart","title":"Example Cluster Agent setup with SNMP Agent auto-discovery using Datadog helm-chart","text":"
datadog:\n ## @param apiKey - string - required\n ## Set this to your Datadog API key before the Agent runs.\n ## ref: https://app.datadoghq.com/account/settings/agent/latest?platform=kubernetes\n #\n apiKey: <DATADOG_API_KEY>\n\n ## @param clusterName - string - optional\n ## Set a unique cluster name to allow scoping hosts and Cluster Checks easily\n ## The name must be unique and must be dot-separated tokens where a token can be up to 40 characters with the following restrictions:\n ## * Lowercase letters, numbers, and hyphens only.\n ## * Must start with a letter.\n ## * Must end with a number or a letter.\n ## Compared to the rules of GKE, dots are allowed whereas they are not allowed on GKE:\n ## https://cloud.google.com/kubernetes-engine/docs/reference/rest/v1beta1/projects.locations.clusters#Cluster.FIELDS.name\n #\n clusterName: my-snmp-cluster\n\n ## @param clusterChecks - object - required\n ## Enable the Cluster Checks feature on both the cluster-agents and the daemonset\n ## ref: https://docs.datadoghq.com/agent/autodiscovery/clusterchecks/\n ## Autodiscovery via Kube Service annotations is automatically enabled\n #\n clusterChecks:\n enabled: true\n\n ## @param tags - list of key:value elements - optional\n ## List of tags to attach to every metric, event and service check collected by this Agent.\n ##\n ## Learn more about tagging: https://docs.datadoghq.com/tagging/\n #\n tags:\n - 'env:test-snmp-cluster-agent'\n\n## @param clusterAgent - object - required\n## This is the Datadog Cluster Agent implementation that handles cluster-wide\n## metrics more cleanly, separates concerns for better rbac, and implements\n## the external metrics API so you can autoscale HPAs based on datadog metrics\n## ref: https://docs.datadoghq.com/agent/kubernetes/cluster/\n#\nclusterAgent:\n ## @param enabled - boolean - required\n ## Set this to true to enable Datadog Cluster Agent\n #\n enabled: true\n\n ## @param confd - list of objects - optional\n ## Provide 
additional cluster check configurations\n ## Each key will become a file in /conf.d\n ## ref: https://docs.datadoghq.com/agent/autodiscovery/\n #\n confd:\n # Static checks\n http_check.yaml: |-\n cluster_check: true\n instances:\n - name: 'Check Example Site1'\n url: http://example.net\n - name: 'Check Example Site2'\n url: http://example.net\n - name: 'Check Example Site3'\n url: http://example.net\n # Autodiscovery template needed for `snmp_listener` to create instance configs\n snmp.yaml: |-\n cluster_check: true\n\n # AD config below is copied from: https://github.com/DataDog/datadog-agent/blob/master/cmd/agent/dist/conf.d/snmp.d/auto_conf.yaml\n ad_identifiers:\n - snmp\n init_config:\n instances:\n -\n ## @param ip_address - string - optional\n ## The IP address of the device to monitor.\n #\n ip_address: \"%%host%%\"\n\n ## @param port - integer - optional - default: 161\n ## Default SNMP port.\n #\n port: \"%%port%%\"\n\n ## @param snmp_version - integer - optional - default: 2\n ## If you are using SNMP v1 set snmp_version to 1 (required)\n ## If you are using SNMP v3 set snmp_version to 3 (required)\n #\n snmp_version: \"%%extra_version%%\"\n\n ## @param timeout - integer - optional - default: 5\n ## Amount of second before timing out.\n #\n timeout: \"%%extra_timeout%%\"\n\n ## @param retries - integer - optional - default: 5\n ## Amount of retries before failure.\n #\n retries: \"%%extra_retries%%\"\n\n ## @param community_string - string - optional\n ## Only useful for SNMP v1 & v2.\n #\n community_string: \"%%extra_community%%\"\n\n ## @param user - string - optional\n ## USERNAME to connect to your SNMP devices.\n #\n user: \"%%extra_user%%\"\n\n ## @param authKey - string - optional\n ## Authentication key to use with your Authentication type.\n #\n authKey: \"%%extra_auth_key%%\"\n\n ## @param authProtocol - string - optional\n ## Authentication type to use when connecting to your SNMP devices.\n ## It can be one of: MD5, SHA, SHA224, SHA256, 
SHA384, SHA512.\n ## Default to MD5 when `authKey` is specified.\n #\n authProtocol: \"%%extra_auth_protocol%%\"\n\n ## @param privKey - string - optional\n ## Privacy type key to use with your Privacy type.\n #\n privKey: \"%%extra_priv_key%%\"\n\n ## @param privProtocol - string - optional\n ## Privacy type to use when connecting to your SNMP devices.\n ## It can be one of: DES, 3DES, AES, AES192, AES256, AES192C, AES256C.\n ## Default to DES when `privKey` is specified.\n #\n privProtocol: \"%%extra_priv_protocol%%\"\n\n ## @param context_engine_id - string - optional\n ## ID of your context engine; typically unneeded.\n ## (optional SNMP v3-only parameter)\n #\n context_engine_id: \"%%extra_context_engine_id%%\"\n\n ## @param context_name - string - optional\n ## Name of your context (optional SNMP v3-only parameter).\n #\n context_name: \"%%extra_context_name%%\"\n\n ## @param tags - list of key:value element - optional\n ## List of tags to attach to every metric, event and service check emitted by this integration.\n ##\n ## Learn more about tagging: https://docs.datadoghq.com/tagging/\n #\n tags:\n # The autodiscovery subnet the device is part of.\n # Used by Agent autodiscovery to pass subnet name.\n - \"autodiscovery_subnet:%%extra_autodiscovery_subnet%%\"\n\n ## @param extra_tags - string - optional\n ## Comma separated tags to attach to every metric, event and service check emitted by this integration.\n ## Example:\n ## extra_tags: \"tag1:val1,tag2:val2\"\n #\n extra_tags: \"%%extra_tags%%\"\n\n ## @param oid_batch_size - integer - optional - default: 60\n ## The number of OIDs handled by each batch. 
Increasing this number improves performance but\n ## uses more resources.\n #\n oid_batch_size: \"%%extra_oid_batch_size%%\"\n\n ## @param datadog-cluster.yaml - object - optional\n ## Specify custom contents for the datadog cluster agent config (datadog-cluster.yaml).\n #\n datadog_cluster_yaml:\n listeners:\n - name: snmp\n\n # See here for all `snmp_listener` configs: https://github.com/DataDog/datadog-agent/blob/master/pkg/config/config_template.yaml\n snmp_listener:\n workers: 2\n discovery_interval: 10\n configs:\n - network: 192.168.1.16/29\n version: 2\n port: 1161\n community: cisco_icm\n - network: 192.168.1.16/29\n version: 2\n port: 1161\n community: f5\n
TODO: architecture diagram, example setup, affected files and repos, local testing tools, etc.
vSphere is a VMware product dedicated to managing a (usually) on-premise infrastructure. From physical machines running VMware ESXi that are called ESXi Hosts, users can spin up or migrate Virtual Machines from one host to another.
vSphere is an integrated solution that provides a simple management interface over concepts such as data storage and computing resources.
This section details some vSphere-specific elements. It is not intended to be an exhaustive list, but rather a place for those unfamiliar with the product to learn the basics required to understand how the Datadog integration works.
vSphere - The complete suite of tools and technologies detailed in this article.
vCenter server - The main machine which controls ESXi hosts and provides both a web UI and an API to control the vSphere environment.
vCSA (vCenter Server Appliance) - A specific, more recent kind of vCenter where the software runs in a dedicated Linux machine. By contrast, the legacy vCenter is typically installed on an existing Windows machine.
ESXi host - The physical machine controlled by vCenter where the ESXi (bare-metal) virtualizer is installed. The host boots a minimal OS that can run Virtual Machines.
VM - What anyone using vSphere really needs in the end, instances that can run applications and code. Note: Datadog monitors both ESXi hosts and VMs and it calls them both \"host\" (they are in the host map).
Attributes/tags - It is possible to add attributes and tags to any vSphere resource. The two are now very similar, with \"attributes\" being the deprecated option.
Datacenter - A set of resources grouped together. A single vCenter server can handle multiple datacenters.
Datastore - A virtual vSphere concept to represent data storing capabilities. It can be an NFS server that ESXi hosts have read/write access to, it can be a mounted disk on the host and more. Datastores are often shared between multiple hosts. This allows Virtual Machines to be migrated from one host to another.
Cluster - A logical grouping of computational resources. You can add multiple ESXi hosts to a cluster and then create VMs in the cluster rather than on a specific host; vSphere takes care of placing each VM on one of the ESXi hosts and migrating it when needed.
Photon OS - An open-source minimal Linux distribution used by both ESXi and vCSA as a base.
The Datadog vSphere integration runs from a single agent and pulls all the information from a single vCenter endpoint. Because the agent cannot run directly on Photon OS, it is usually required that the agent runs within a dedicated VM inside the vSphere infrastructure.
Once the agent is running, the minimal configuration (as of version 5.x) is as follows:
host is the endpoint used to access the vSphere Client from a web browser. The host is either a FQDN or an IP, not an http url.
username and password are the credentials to log in to vCenter.
use_legacy_check_version is a backward compatibility flag. It should always be set to false and this flag will be removed in a future version of the integration. Setting it to true tells the agent to use an older and deprecated version of the vSphere integration.
empty_default_hostname is a field used by the agent directly (and not the integration). By default, the agent does not allow submitting metrics without attaching an explicit host tag unless this flag is set to true. The vSphere integration uses that behavior for some metrics and service checks. For example, the vsphere.vm.count metric which gives a count of the VMs in the infra is not submitted with a host tag. This is particularly important if the agent runs inside a vSphere VM. If the vsphere.vm.count was submitted with a host tag, the Datadog backend would attach all the other host tags to the metric, for example vsphere_type:vm or vsphere_host:<NAME_OF_THE_ESX_HOST> which makes the metric almost impossible to use.
vSphere metrics are documented in their documentation page and each metric has a defined \"collection level\".
That level determines the amount of data gathered by the integration and especially which metrics are available. More details here.
By default, only the level 1 metrics are collected but this can be increased in the integration configuration file.
"},{"location":"architecture/vsphere/#realtime-vs-historical","title":"Realtime vs historical","text":"
Each ESXi host collects and stores data for each metric, for itself and for every VM it hosts, every 20 seconds. Those data points are stored for up to one hour and are called realtime. Note: Each metric always concerns either a VM or an ESXi host. Metrics that concern datastores, for example, are not collected by the ESXi hosts.
Additionally, the vCenter server collects data from all the ESXi hosts and stores the datapoint with some aggregation rollup into its own database. Those data points are called \"historical\".
Finally, the vCenter server also collects metrics for other kinds of resources (like Datastore, ClusterComputeResource, Datacenter...) Those data points are necessarily \"historical\".
The reason this distinction matters is that historical metrics are much slower to collect than realtime metrics. The vSphere integration will always collect the \"realtime\" data for metrics that concern ESXi hosts and VMs. But the integration also collects metrics for Datastores, ClusterComputeResources, Datacenters, and maybe others in the future.
That's why, in the context of the Datadog vSphere integration, we usually simplify by considering that:
VMs and ESXi hosts are \"realtime resources\". Metrics for such resources are quick and easy to get by querying vCenter that will in turn query all the ESXi hosts.
Datastores, ClusterComputeResources, and Datacenters are \"historical resources\" and are much slower to collect.
To collect all metrics (realtime and historical), it is advised to use two \"check instances\": one with collection_type: realtime and one with collection_type: historical. This way all metrics are collected, and because the two check instances run on different schedules, the slowness of collecting historical metrics does not affect the rate at which realtime metrics are collected.
"},{"location":"architecture/vsphere/#vsphere-tags-and-attributes","title":"vSphere tags and attributes","text":"
Similarly to how Datadog allows you to add tags to your different hosts (things like the os or the instance-type of your machines), vSphere has \"tags\" and \"attributes\".
A lot of details can be found here: https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.vcenterhost.doc/GUID-E8E854DD-AA97-4E0C-8419-CE84F93C4058.html#:~:text=Tags%20and%20attributes%20allow%20you,that%20tag%20to%20a%20category.
But the overall idea is that both tags and attributes are additional information that you can attach to your vSphere resources and that \"tags\" are newer and more featureful than \"attributes\".
A very flexible filtering system has been implemented with the vSphere integration.
This allows fine-tuned configuration so that:
You only pay for the host and VMs you really want to monitor.
You reduce the load on your vCenter server by running just the queries that you need.
You improve the check runtime, which otherwise increases linearly with the size of your infrastructure and has been seen to take up to 10 minutes in some large environments.
We provide two types of filtering, one based on metrics, the other based on resources.
The metric filter is fairly simple: for each resource type, you can provide some regexes. If a metric matches any of the filters, it will be fetched and submitted. The configuration looks like this:
The resource filter, on the other hand, makes it possible to exclude some vSphere resources (VM, ESXi host, etc.) based on an \"attribute\" of that resource. The possible attributes as of today are:
name, literally the name of the resource (as defined in vCenter).
inventory_path, a path-like string that represents the location of the resource in the inventory tree, as each resource only ever has a single parent, recursively up to the root. For example: /my.datacenter.local/vm/staging/myservice/vm_name
tag, see the tags and attributes section. Used to filter resources based on the attached tags.
attribute, see the tags and attributes section. Used to filter resources based on the attached attributes.
hostname (only for VMs), the name of the ESXi host where the VM is running.
guest_hostname (only for VMs), the name of the OS as reported from within the machine. VMware tools have to be installed on the VM, otherwise vCenter is not able to fetch this information.
A possible filtering configuration would look like this:
In vSphere each metric is defined by three \"dimensions\".
The resource on which the metric applies (for example the VM called \"abc1\")
The name of the metric (for example cpu.usage).
An additional available dimension that varies between metrics. (for example the cpu core id)
This is similar to how Datadog represents metrics, except that the context cardinality is limited to two \"keys\": the name of the resource (usually the \"host\" tag) and one additional tag key.
This available tag key is defined as the \"instance\" property, or \"instance tag\" in vSphere. This dimension is not collected by default by the Datadog integration, as it can have a significant performance impact in large systems compared to its added value from a monitoring perspective.
Also, when fetching metrics with the instance tag, vSphere only provides the value of the instance tag; it doesn't expose a human-readable \"key\" for that tag. For the cpu.usage metric, where the instance tag is the core id, the integration has to \"know\" the meaning of the instance tag, which is why we rely on a hardcoded list in the integration.
Because this instance tag can provide additional visibility, it is possible to enable it for some metrics from the configuration. For example, if we're really interested in getting the usage of the cpu per core, the setup can look like this:
Users set a path from which to collect events; the path is the name of a channel such as System, Application, etc.
There are 3 ways to select filter criteria rather than collecting all events:
query - A raw XPath or structured XML query used to filter events. This overrides any selected filters.
filters - A mapping of properties to allowed values. Every filter (equivalent to the and operator) must match any value (equivalent to the or operator). This option is a convenience for a query that is relatively basic.
Rather than collect all events and perform filtering within the check, the filters are converted to an XPath expression. This approach offloads all filtering to the kernel (like query), which increases performance and reduces bandwidth usage when connecting to a remote machine.
included_messages/excluded_messages - These are regular expression patterns used to filter by events' messages specifically (if a message is found), with the exclude list taking precedence. These may be used in place of or with query/filters, as there exists no query construct by which to select a message attribute.
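As a rough illustration of how a filters mapping could be turned into an XPath expression, consider the sketch below. The filters_to_xpath helper is hypothetical, and it assumes that each key is already the name of an element under System (such as EventID or Level) with numeric values; the check's real option names and conversion rules may differ.

```python
def filters_to_xpath(filters):
    '''Build an XPath expression from a mapping of property names to allowed values.'''
    clauses = []
    for name, values in filters.items():
        if not values:
            continue
        # Values of one filter are alternatives, so they are joined with or.
        alternatives = ' or '.join('{}={}'.format(name, value) for value in values)
        clauses.append('({})'.format(alternatives))
    if not clauses:
        # No filter criteria: select every event.
        return '*'
    # Every filter must match, so the clauses are joined with and.
    return '*[System[{}]]'.format(' and '.join(clauses))
```

For example, a mapping with EventID set to 4624 and 4625 and Level set to 4 produces a single expression in which the two event IDs are alternatives and the level is an additional required condition.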
A pull subscription model is used. At every check run, the cached event log handle waits to be signaled for a configurable number of seconds. If signaled, the check then polls all available events in batches of a configurable size.
At configurable intervals, the most recently encountered event is saved to the filesystem. This is useful for preventing duplicate events being sent as a consequence of Agent restarts, especially when the start option is set to oldest.
Events may alternatively be configured to be submitted as logs. The code for that resides here.
Only a subset of the check's functionality is available. Namely, each log configuration will collect all events of the given channel without filtering, tagging, or remote connection options.
This implementation uses the push subscription model. There is a bit of C in charge of rendering the relevant data and registering the Go tailer callback that ultimately sends the log to the backend.
Setting legacy_mode to true in the check will use WMI to collect events, which is significantly more resource intensive. This mode has entirely different configuration options and will be removed in a future release.
Agent 6 can only use this mode as Python 2 does not support the new implementation.
The Base package provides all the functionality and utilities necessary for writing Agent Integrations. Most importantly it provides the AgentCheck base class from which every Check must be inherited.
The check method is what the Datadog Agent will execute.
In this example we created a Check and gave it a namespace of awesome. This means that by default, every submission's name will be prefixed with awesome..
We submitted a gauge metric named awesome.test with a value of 1.23 tagged by foo:bar.
The magic hidden by the usability of the API is that this actually calls a C binding which communicates with the Agent (written in Go).
In general, you don't need to and you should not override anything from the base class except the check method but sometimes it might be useful for a Check to have its own constructor.
When overriding __init__ you have to remember that, depending on the configuration, the Agent might create several different Check instances and the method would be called as many times.
Agent 6,7 signature:
AgentCheck(name, init_config, instances) # instances contain only 1 instance\nAgentCheck.check(instance)\n
Agent 8 signature:
AgentCheck(name, init_config, instance) # one instance\nAgentCheck.check() # no more instance argument for check method\n
Note
When loading a Custom check, the Agent will inspect the module searching for a subclass of AgentCheck. If such a class exists but has been derived in turn, it'll be ignored - you should never derive from an existing Check.
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
@traced_class\nclass AgentCheck(object):\n \"\"\"\n The base class for any Agent based integration.\n\n In general, you don't need to and you should not override anything from the base\n class except the `check` method but sometimes it might be useful for a Check to\n have its own constructor.\n\n When overriding `__init__` you have to remember that, depending on the configuration,\n the Agent might create several different Check instances and the method would be\n called as many times.\n\n Agent 6,7 signature:\n\n AgentCheck(name, init_config, instances) # instances contain only 1 instance\n AgentCheck.check(instance)\n\n Agent 8 signature:\n\n AgentCheck(name, init_config, instance) # one instance\n AgentCheck.check() # no more instance argument for check method\n\n !!! note\n when loading a Custom check, the Agent will inspect the module searching\n for a subclass of `AgentCheck`. If such a class exists but has been derived in\n turn, it'll be ignored - **you should never derive from an existing Check**.\n \"\"\"\n\n # If defined, this will be the prefix of every metric/service check and the source type of events\n __NAMESPACE__ = ''\n\n OK, WARNING, CRITICAL, UNKNOWN = ServiceCheck\n\n # Used by `self.http` for an instance of RequestsWrapper\n HTTP_CONFIG_REMAPPER = None\n\n # Used by `create_tls_context` for an instance of RequestsWrapper\n TLS_CONFIG_REMAPPER = None\n\n # Used by `self.set_metadata` for an instance of MetadataManager\n #\n # This is a mapping of metadata names to functions. When you call `self.set_metadata(name, value, **options)`,\n # if `name` is in this mapping then the corresponding function will be called with the `value`, and the\n # return value(s) will be sent instead.\n #\n # Transformer functions must satisfy the following signature:\n #\n # def transform_<NAME>(value: Any, options: dict) -> Union[str, Dict[str, str]]:\n #\n # If the return type is a string, then it will be sent as the value for `name`. 
If the return type is\n # a mapping type, then each key will be considered a `name` and will be sent with its (str) value.\n METADATA_TRANSFORMERS = None\n\n FIRST_CAP_RE = re.compile(br'(.)([A-Z][a-z]+)')\n ALL_CAP_RE = re.compile(br'([a-z0-9])([A-Z])')\n METRIC_REPLACEMENT = re.compile(br'([^a-zA-Z0-9_.]+)|(^[^a-zA-Z]+)')\n TAG_REPLACEMENT = re.compile(br'[,\\+\\*\\-/()\\[\\]{}\\s]')\n MULTIPLE_UNDERSCORE_CLEANUP = re.compile(br'__+')\n DOT_UNDERSCORE_CLEANUP = re.compile(br'_*\\._*')\n\n # allows to set a limit on the number of metric name and tags combination\n # this check can send per run. This is useful for checks that have an unbounded\n # number of tag values that depend on the input payload.\n # The logic counts one set of tags per gauge/rate/monotonic_count call, and de-duplicates\n # sets of tags for other metric types. The first N sets of tags in submission order will\n # be sent to the aggregator, the rest are dropped. The state is reset after each run.\n # See https://github.com/DataDog/integrations-core/pull/2093 for more information.\n DEFAULT_METRIC_LIMIT = 0\n\n # Allow tracing for classic integrations\n def __init_subclass__(cls, *args, **kwargs):\n try:\n # https://github.com/python/mypy/issues/4660\n super().__init_subclass__(*args, **kwargs) # type: ignore\n return traced_class(cls)\n except Exception:\n return cls\n\n def __init__(self, *args, **kwargs):\n # type: (*Any, **Any) -> None\n \"\"\"\n Parameters:\n name (str):\n the name of the check\n init_config (dict):\n the `init_config` section of the configuration.\n instance (list[dict]):\n a one-element list containing the instance options from the\n configuration file (a list is used to keep backward compatibility with\n older versions of the Agent).\n \"\"\"\n # NOTE: these variable assignments exist to ease type checking when eventually assigned as attributes.\n name = kwargs.get('name', '')\n init_config = kwargs.get('init_config', {})\n agentConfig = kwargs.get('agentConfig', {})\n 
instances = kwargs.get('instances', [])\n\n if len(args) > 0:\n name = args[0]\n if len(args) > 1:\n init_config = args[1]\n if len(args) > 2:\n # agent pass instances as tuple but in test we are usually using list, so we are testing for both\n if len(args) > 3 or not isinstance(args[2], (list, tuple)) or 'instances' in kwargs:\n # old-style init: the 3rd argument is `agentConfig`\n agentConfig = args[2]\n if len(args) > 3:\n instances = args[3]\n else:\n # new-style init: the 3rd argument is `instances`\n instances = args[2]\n\n # NOTE: Agent 6+ should pass exactly one instance... But we are not abiding by that rule on our side\n # everywhere just yet. It's complicated... See: https://github.com/DataDog/integrations-core/pull/5573\n instance = instances[0] if instances else None\n\n self.check_id = ''\n self.name = name # type: str\n self.init_config = init_config # type: InitConfigType\n self.agentConfig = agentConfig # type: AgentConfigType\n self.instance = instance # type: InstanceType\n self.instances = instances # type: List[InstanceType]\n self.warnings = [] # type: List[str]\n self.disable_generic_tags = (\n is_affirmative(self.instance.get('disable_generic_tags', False)) if instance else False\n )\n self.debug_metrics = {}\n if self.init_config is not None:\n self.debug_metrics.update(self.init_config.get('debug_metrics', {}))\n if self.instance is not None:\n self.debug_metrics.update(self.instance.get('debug_metrics', {}))\n\n # `self.hostname` is deprecated, use `datadog_agent.get_hostname()` instead\n self.hostname = datadog_agent.get_hostname() # type: str\n\n logger = logging.getLogger('{}.{}'.format(__name__, self.name))\n self.log = CheckLoggingAdapter(logger, self)\n\n metric_patterns = self.instance.get('metric_patterns', {}) if instance else {}\n if not isinstance(metric_patterns, dict):\n raise ConfigurationError('Setting `metric_patterns` must be a mapping')\n\n self.exclude_metrics_pattern = self._create_metrics_pattern(metric_patterns, 
'exclude')\n self.include_metrics_pattern = self._create_metrics_pattern(metric_patterns, 'include')\n\n # TODO: Remove with Agent 5\n # Set proxy settings\n self.proxies = self._get_requests_proxy()\n if not self.init_config:\n self._use_agent_proxy = True\n else:\n self._use_agent_proxy = is_affirmative(self.init_config.get('use_agent_proxy', True))\n\n # TODO: Remove with Agent 5\n self.default_integration_http_timeout = float(self.agentConfig.get('default_integration_http_timeout', 9))\n\n self._deprecations = {\n 'increment': (\n False,\n (\n 'DEPRECATION NOTICE: `AgentCheck.increment`/`AgentCheck.decrement` are deprecated, please '\n 'use `AgentCheck.gauge` or `AgentCheck.count` instead, with a different metric name'\n ),\n ),\n 'device_name': (\n False,\n (\n 'DEPRECATION NOTICE: `device_name` is deprecated, please use a `device:` '\n 'tag in the `tags` list instead'\n ),\n ),\n 'in_developer_mode': (\n False,\n 'DEPRECATION NOTICE: `in_developer_mode` is deprecated, please stop using it.',\n ),\n 'no_proxy': (\n False,\n (\n 'DEPRECATION NOTICE: The `no_proxy` config option has been renamed '\n 'to `skip_proxy` and will be removed in a future release.'\n ),\n ),\n 'service_tag': (\n False,\n (\n 'DEPRECATION NOTICE: The `service` tag is deprecated and has been renamed to `%s`. '\n 'Set `disable_legacy_service_tag` to `true` to disable this warning. 
'\n 'The default will become `true` and cannot be changed in Agent version 8.'\n ),\n ),\n '_config_renamed': (\n False,\n (\n 'DEPRECATION NOTICE: The `%s` config option has been renamed '\n 'to `%s` and will be removed in a future release.'\n ),\n ),\n } # type: Dict[str, Tuple[bool, str]]\n\n # Setup metric limits\n self.metric_limiter = self._get_metric_limiter(self.name, instance=self.instance)\n\n # Lazily load and validate config\n self._config_model_instance = None # type: Any\n self._config_model_shared = None # type: Any\n\n # Functions that will be called exactly once (if successful) before the first check run\n self.check_initializations = deque() # type: Deque[Callable[[], None]]\n\n if not PY2:\n self.check_initializations.append(self.load_configuration_models)\n\n self.__formatted_tags = None\n self.__logs_enabled = None\n\n def _create_metrics_pattern(self, metric_patterns, option_name):\n all_patterns = metric_patterns.get(option_name, [])\n\n if not isinstance(all_patterns, list):\n raise ConfigurationError('Setting `{}` of `metric_patterns` must be an array'.format(option_name))\n\n metrics_patterns = []\n for i, entry in enumerate(all_patterns, 1):\n if not isinstance(entry, str):\n raise ConfigurationError(\n 'Entry #{} of setting `{}` of `metric_patterns` must be a string'.format(i, option_name)\n )\n if not entry:\n self.log.debug(\n 'Entry #%s of setting `%s` of `metric_patterns` must not be empty, ignoring', i, option_name\n )\n continue\n\n metrics_patterns.append(entry)\n\n if metrics_patterns:\n return re.compile('|'.join(metrics_patterns))\n\n return None\n\n def _get_metric_limiter(self, name, instance=None):\n # type: (str, InstanceType) -> Optional[Limiter]\n limit = self._get_metric_limit(instance=instance)\n\n if limit > 0:\n return Limiter(name, 'metrics', limit, self.warning)\n\n return None\n\n def _get_metric_limit(self, instance=None):\n # type: (InstanceType) -> int\n if instance is None:\n # NOTE: Agent 6+ will now always 
pass an instance when calling into a check, but we still need to\n # account for this case due to some tests not always passing an instance on init.\n self.log.debug(\n \"No instance provided (this is deprecated!). Reverting to the default metric limit: %s\",\n self.DEFAULT_METRIC_LIMIT,\n )\n return self.DEFAULT_METRIC_LIMIT\n\n max_returned_metrics = instance.get('max_returned_metrics', self.DEFAULT_METRIC_LIMIT)\n\n try:\n limit = int(max_returned_metrics)\n except (ValueError, TypeError):\n self.warning(\n \"Configured 'max_returned_metrics' cannot be interpreted as an integer: %s. \"\n \"Reverting to the default limit: %s\",\n max_returned_metrics,\n self.DEFAULT_METRIC_LIMIT,\n )\n return self.DEFAULT_METRIC_LIMIT\n\n # Do not allow to disable limiting if the class has set a non-zero default value.\n if limit == 0 and self.DEFAULT_METRIC_LIMIT > 0:\n self.warning(\n \"Setting 'max_returned_metrics' to zero is not allowed. Reverting to the default metric limit: %s\",\n self.DEFAULT_METRIC_LIMIT,\n )\n return self.DEFAULT_METRIC_LIMIT\n\n return limit\n\n @staticmethod\n def load_config(yaml_str):\n # type: (str) -> Any\n \"\"\"\n Convenience wrapper to ease programmatic use of this class from the C API.\n \"\"\"\n return yaml.safe_load(yaml_str)\n\n @property\n def http(self):\n # type: () -> RequestsWrapper\n \"\"\"\n Provides logic to yield consistent network behavior based on user configuration.\n\n Only new checks or checks on Agent 6.13+ can and should use this for HTTP requests.\n \"\"\"\n if not hasattr(self, '_http'):\n self._http = RequestsWrapper(self.instance or {}, self.init_config, self.HTTP_CONFIG_REMAPPER, self.log)\n\n return self._http\n\n @property\n def logs_enabled(self):\n # type: () -> bool\n \"\"\"\n Returns True if logs are enabled, False otherwise.\n \"\"\"\n if self.__logs_enabled is None:\n self.__logs_enabled = bool(datadog_agent.get_config('logs_enabled'))\n\n return self.__logs_enabled\n\n @property\n def formatted_tags(self):\n # 
type: () -> str\n if self.__formatted_tags is None:\n normalized_tags = set()\n for tag in self.instance.get('tags', []):\n key, _, value = tag.partition(':')\n if not value:\n continue\n\n if self.disable_generic_tags and key in GENERIC_TAGS:\n key = '{}_{}'.format(self.name, key)\n\n normalized_tags.add('{}:{}'.format(key, value))\n\n self.__formatted_tags = ','.join(sorted(normalized_tags))\n\n return self.__formatted_tags\n\n @property\n def diagnosis(self):\n # type: () -> Diagnosis\n \"\"\"\n A Diagnosis object to register explicit diagnostics and record diagnoses.\n \"\"\"\n if not hasattr(self, '_diagnosis'):\n self._diagnosis = Diagnosis(sanitize=self.sanitize)\n return self._diagnosis\n\n def get_tls_context(self, refresh=False, overrides=None):\n # type: (bool, Dict[AnyStr, Any]) -> ssl.SSLContext\n \"\"\"\n Creates and caches an SSLContext instance based on user configuration.\n Note that user configuration can be overridden by using `overrides`.\n This should only be applied to older integrations that manually set config values.\n\n Since: Agent 7.24\n \"\"\"\n if not hasattr(self, '_tls_context_wrapper'):\n self._tls_context_wrapper = TlsContextWrapper(\n self.instance or {}, self.TLS_CONFIG_REMAPPER, overrides=overrides\n )\n\n if refresh:\n self._tls_context_wrapper.refresh_tls_context()\n\n return self._tls_context_wrapper.tls_context\n\n @property\n def metadata_manager(self):\n # type: () -> MetadataManager\n \"\"\"\n Used for sending metadata via Go bindings.\n \"\"\"\n if not hasattr(self, '_metadata_manager'):\n if not self.check_id and AGENT_RUNNING:\n raise RuntimeError('Attribute `check_id` must be set')\n\n self._metadata_manager = MetadataManager(self.name, self.check_id, self.log, self.METADATA_TRANSFORMERS)\n\n return self._metadata_manager\n\n @property\n def check_version(self):\n # type: () -> str\n \"\"\"\n Return the dynamically detected integration version.\n \"\"\"\n if not hasattr(self, '_check_version'):\n # 
'datadog_checks.<PACKAGE>.<MODULE>...'\n module_parts = self.__module__.split('.')\n package_path = '.'.join(module_parts[:2])\n package = importlib.import_module(package_path)\n\n # Provide a default just in case\n self._check_version = getattr(package, '__version__', '0.0.0')\n\n return self._check_version\n\n @property\n def in_developer_mode(self):\n # type: () -> bool\n self._log_deprecation('in_developer_mode')\n return False\n\n def log_typos_in_options(self, user_config, models_config, level):\n # only import it when running in python 3\n from jellyfish import jaro_winkler_similarity\n\n user_configs = user_config or {} # type: Dict[str, Any]\n models_config = models_config or {}\n typos = set() # type: Set[str]\n\n known_options = {k for k, _ in models_config} # type: Set[str]\n\n if not PY2:\n\n if isinstance(models_config, BaseModel):\n # Also add aliases, if any\n known_options.update(set(models_config.model_dump(by_alias=True)))\n\n unknown_options = [option for option in user_configs.keys() if option not in known_options] # type: List[str]\n\n for unknown_option in unknown_options:\n similar_known_options = [] # type: List[Tuple[str, int]]\n for known_option in known_options:\n ratio = jaro_winkler_similarity(unknown_option, known_option)\n if ratio > TYPO_SIMILARITY_THRESHOLD:\n similar_known_options.append((known_option, ratio))\n typos.add(unknown_option)\n\n if len(similar_known_options) > 0:\n similar_known_options.sort(key=lambda option: option[1], reverse=True)\n similar_known_options_names = [option[0] for option in similar_known_options] # type: List[str]\n message = (\n 'Detected potential typo in configuration option in {}/{} section: `{}`. 
Did you mean {}?'\n ).format(self.name, level, unknown_option, ', or '.join(similar_known_options_names))\n self.log.warning(message)\n return typos\n\n def load_configuration_models(self, package_path=None):\n if package_path is None:\n # 'datadog_checks.<PACKAGE>.<MODULE>...'\n module_parts = self.__module__.split('.')\n package_path = '{}.config_models'.format('.'.join(module_parts[:2]))\n if self._config_model_shared is None:\n shared_config = copy.deepcopy(self.init_config)\n context = self._get_config_model_context(shared_config)\n shared_model = self.load_configuration_model(package_path, 'SharedConfig', shared_config, context)\n try:\n self.log_typos_in_options(shared_config, shared_model, 'init_config')\n except Exception as e:\n self.log.debug(\"Failed to detect typos in `init_config` section: %s\", e)\n if shared_model is not None:\n self._config_model_shared = shared_model\n\n if self._config_model_instance is None:\n instance_config = copy.deepcopy(self.instance)\n context = self._get_config_model_context(instance_config)\n instance_model = self.load_configuration_model(package_path, 'InstanceConfig', instance_config, context)\n try:\n self.log_typos_in_options(instance_config, instance_model, 'instances')\n except Exception as e:\n self.log.debug(\"Failed to detect typos in `instances` section: %s\", e)\n if instance_model is not None:\n self._config_model_instance = instance_model\n\n @staticmethod\n def load_configuration_model(import_path, model_name, config, context):\n try:\n package = importlib.import_module(import_path)\n # TODO: remove the type ignore when we drop Python 2\n except ModuleNotFoundError as e: # type: ignore\n # Don't fail if there are no models\n if str(e).startswith('No module named '):\n return\n\n raise\n\n model = getattr(package, model_name, None)\n if model is not None:\n try:\n config_model = model.model_validate(config, context=context)\n # TODO: remove the type ignore when we drop Python 2\n except ValidationError as e: 
# type: ignore\n errors = e.errors()\n num_errors = len(errors)\n message_lines = [\n 'Detected {} error{} while loading configuration model `{}`:'.format(\n num_errors, 's' if num_errors > 1 else '', model_name\n )\n ]\n\n for error in errors:\n message_lines.append(\n ' -> '.join(\n # Start array indexes at one for user-friendliness\n str(loc + 1) if isinstance(loc, int) else str(loc)\n for loc in error['loc']\n )\n )\n message_lines.append(' {}'.format(error['msg']))\n\n raise_from(ConfigurationError('\\n'.join(message_lines)), None)\n else:\n return config_model\n\n def _get_config_model_context(self, config):\n return {'logger': self.log, 'warning': self.warning, 'configured_fields': frozenset(config)}\n\n def register_secret(self, secret):\n # type: (str) -> None\n \"\"\"\n Register a secret to be scrubbed by `.sanitize()`.\n \"\"\"\n if not hasattr(self, '_sanitizer'):\n # Configure lazily so that checks that don't use sanitization aren't affected.\n self._sanitizer = SecretsSanitizer()\n self.log.setup_sanitization(sanitize=self.sanitize)\n\n self._sanitizer.register(secret)\n\n def sanitize(self, text):\n # type: (str) -> str\n \"\"\"\n Scrub any registered secrets in `text`.\n \"\"\"\n try:\n sanitizer = self._sanitizer\n except AttributeError:\n return text\n else:\n return sanitizer.sanitize(text)\n\n def _context_uid(self, mtype, name, tags=None, hostname=None):\n # type: (int, str, Sequence[str], str) -> str\n return '{}-{}-{}-{}'.format(mtype, name, tags if tags is None else hash(frozenset(tags)), hostname)\n\n def submit_histogram_bucket(\n self, name, value, lower_bound, upper_bound, monotonic, hostname, tags, raw=False, flush_first_value=False\n ):\n # type: (str, float, int, int, bool, str, Sequence[str], bool, bool) -> None\n if value is None:\n # ignore metric sample\n return\n\n # make sure the value (bucket count) is an integer\n try:\n value = int(value)\n except ValueError:\n err_msg = 'Histogram: {} has non integer value: {}. 
Only integers are valid bucket values (count).'.format(\n repr(name), repr(value)\n )\n if not AGENT_RUNNING:\n raise ValueError(err_msg)\n self.warning(err_msg)\n return\n\n tags = self._normalize_tags_type(tags, metric_name=name)\n if hostname is None:\n hostname = ''\n\n aggregator.submit_histogram_bucket(\n self,\n self.check_id,\n self._format_namespace(name, raw),\n value,\n lower_bound,\n upper_bound,\n monotonic,\n hostname,\n tags,\n flush_first_value,\n )\n\n def database_monitoring_query_sample(self, raw_event):\n # type: (str) -> None\n if raw_event is None:\n return\n\n aggregator.submit_event_platform_event(self, self.check_id, to_native_string(raw_event), \"dbm-samples\")\n\n def database_monitoring_query_metrics(self, raw_event):\n # type: (str) -> None\n if raw_event is None:\n return\n\n aggregator.submit_event_platform_event(self, self.check_id, to_native_string(raw_event), \"dbm-metrics\")\n\n def database_monitoring_query_activity(self, raw_event):\n # type: (str) -> None\n if raw_event is None:\n return\n\n aggregator.submit_event_platform_event(self, self.check_id, to_native_string(raw_event), \"dbm-activity\")\n\n def database_monitoring_metadata(self, raw_event):\n # type: (str) -> None\n if raw_event is None:\n return\n\n aggregator.submit_event_platform_event(self, self.check_id, to_native_string(raw_event), \"dbm-metadata\")\n\n def event_platform_event(self, raw_event, event_track_type):\n # type: (str, str) -> None\n \"\"\"Send an event platform event.\n\n Parameters:\n raw_event (str):\n JSON formatted string representing the event to send\n event_track_type (str):\n type of event ingested and processed by the event platform\n \"\"\"\n if raw_event is None:\n return\n aggregator.submit_event_platform_event(self, self.check_id, to_native_string(raw_event), event_track_type)\n\n def should_send_metric(self, metric_name):\n return not self._metric_excluded(metric_name) and self._metric_included(metric_name)\n\n def _metric_included(self, 
metric_name):\n if self.include_metrics_pattern is None:\n return True\n\n return self.include_metrics_pattern.search(metric_name) is not None\n\n def _metric_excluded(self, metric_name):\n if self.exclude_metrics_pattern is None:\n return False\n\n return self.exclude_metrics_pattern.search(metric_name) is not None\n\n def _submit_metric(\n self, mtype, name, value, tags=None, hostname=None, device_name=None, raw=False, flush_first_value=False\n ):\n # type: (int, str, float, Sequence[str], str, str, bool, bool) -> None\n if value is None:\n # ignore metric sample\n return\n\n name = self._format_namespace(name, raw)\n if not self.should_send_metric(name):\n return\n\n tags = self._normalize_tags_type(tags or [], device_name, name)\n if hostname is None:\n hostname = ''\n\n if self.metric_limiter:\n if mtype in ONE_PER_CONTEXT_METRIC_TYPES:\n # Fast path for gauges, rates, monotonic counters, assume one set of tags per call\n if self.metric_limiter.is_reached():\n return\n else:\n # Other metric types have a legit use case for several calls per set of tags, track unique sets of tags\n context = self._context_uid(mtype, name, tags, hostname)\n if self.metric_limiter.is_reached(context):\n return\n\n try:\n value = float(value)\n except ValueError:\n err_msg = 'Metric: {} has non float value: {}. 
Only float values can be submitted as metrics.'.format(\n repr(name), repr(value)\n )\n if not AGENT_RUNNING:\n raise ValueError(err_msg)\n self.warning(err_msg)\n return\n\n aggregator.submit_metric(self, self.check_id, mtype, name, value, tags, hostname, flush_first_value)\n\n def gauge(self, name, value, tags=None, hostname=None, device_name=None, raw=False):\n # type: (str, float, Sequence[str], str, str, bool) -> None\n \"\"\"Sample a gauge metric.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n self._submit_metric(\n aggregator.GAUGE, name, value, tags=tags, hostname=hostname, device_name=device_name, raw=raw\n )\n\n def count(self, name, value, tags=None, hostname=None, device_name=None, raw=False):\n # type: (str, float, Sequence[str], str, str, bool) -> None\n \"\"\"Sample a raw count metric.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. 
Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n self._submit_metric(\n aggregator.COUNT, name, value, tags=tags, hostname=hostname, device_name=device_name, raw=raw\n )\n\n def monotonic_count(\n self, name, value, tags=None, hostname=None, device_name=None, raw=False, flush_first_value=False\n ):\n # type: (str, float, Sequence[str], str, str, bool, bool) -> None\n \"\"\"Sample an increasing counter metric.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n flush_first_value (bool):\n whether to sample the first value\n \"\"\"\n self._submit_metric(\n aggregator.MONOTONIC_COUNT,\n name,\n value,\n tags=tags,\n hostname=hostname,\n device_name=device_name,\n raw=raw,\n flush_first_value=flush_first_value,\n )\n\n def rate(self, name, value, tags=None, hostname=None, device_name=None, raw=False):\n # type: (str, float, Sequence[str], str, str, bool) -> None\n \"\"\"Sample a point, with the rate calculated at the end of the check.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. 
Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n self._submit_metric(\n aggregator.RATE, name, value, tags=tags, hostname=hostname, device_name=device_name, raw=raw\n )\n\n def histogram(self, name, value, tags=None, hostname=None, device_name=None, raw=False):\n # type: (str, float, Sequence[str], str, str, bool) -> None\n \"\"\"Sample a histogram metric.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n self._submit_metric(\n aggregator.HISTOGRAM, name, value, tags=tags, hostname=hostname, device_name=device_name, raw=raw\n )\n\n def historate(self, name, value, tags=None, hostname=None, device_name=None, raw=False):\n # type: (str, float, Sequence[str], str, str, bool) -> None\n \"\"\"Sample a histogram based on rate metrics.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. 
Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n self._submit_metric(\n aggregator.HISTORATE, name, value, tags=tags, hostname=hostname, device_name=device_name, raw=raw\n )\n\n def increment(self, name, value=1, tags=None, hostname=None, device_name=None, raw=False):\n # type: (str, float, Sequence[str], str, str, bool) -> None\n \"\"\"Increment a counter metric.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n self._log_deprecation('increment')\n self._submit_metric(\n aggregator.COUNTER, name, value, tags=tags, hostname=hostname, device_name=device_name, raw=raw\n )\n\n def decrement(self, name, value=-1, tags=None, hostname=None, device_name=None, raw=False):\n # type: (str, float, Sequence[str], str, str, bool) -> None\n \"\"\"Decrement a counter metric.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. 
Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n self._log_deprecation('increment')\n self._submit_metric(\n aggregator.COUNTER, name, value, tags=tags, hostname=hostname, device_name=device_name, raw=raw\n )\n\n def service_check(self, name, status, tags=None, hostname=None, message=None, raw=False):\n # type: (str, ServiceCheckStatus, Sequence[str], str, str, bool) -> None\n \"\"\"Send the status of a service.\n\n Parameters:\n name (str):\n the name of the service check\n status (int):\n a constant describing the service status\n tags (list[str]):\n a list of tags to associate with this service check\n message (str):\n additional information or a description of why this status occurred.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n tags = self._normalize_tags_type(tags or [])\n if hostname is None:\n hostname = ''\n if message is None:\n message = ''\n else:\n message = to_native_string(message)\n\n message = self.sanitize(message)\n\n aggregator.submit_service_check(\n self, self.check_id, self._format_namespace(name, raw), status, tags, hostname, message\n )\n\n def send_log(self, data, cursor=None, stream='default'):\n # type: (dict[str, str], dict[str, Any] | None, str) -> None\n \"\"\"Send a log for submission.\n\n Parameters:\n data (dict[str, str]):\n The log data to send. The following keys are treated specially, if present:\n\n - timestamp: should be an integer or float representing the number of seconds since the Unix epoch\n - ddtags: if not defined, it will automatically be set based on the instance's `tags` option\n cursor (dict[str, Any] or None):\n Metadata associated with the log which will be saved to disk. 
The most recent value may be\n retrieved with the `get_log_cursor` method.\n stream (str):\n The stream associated with this log, used for accurate cursor persistence.\n Has no effect if `cursor` argument is `None`.\n \"\"\"\n attributes = data.copy()\n if 'ddtags' not in attributes and self.formatted_tags:\n attributes['ddtags'] = self.formatted_tags\n\n timestamp = attributes.get('timestamp')\n if timestamp is not None:\n # convert seconds to milliseconds\n attributes['timestamp'] = int(timestamp * 1000)\n\n datadog_agent.send_log(to_json(attributes), self.check_id)\n if cursor is not None:\n self.write_persistent_cache('log_cursor_{}'.format(stream), to_json(cursor))\n\n def get_log_cursor(self, stream='default'):\n # type: (str) -> dict[str, Any] | None\n \"\"\"Returns the most recent log cursor from disk.\"\"\"\n data = self.read_persistent_cache('log_cursor_{}'.format(stream))\n return from_json(data) if data else None\n\n def _log_deprecation(self, deprecation_key, *args):\n # type: (str, *str) -> None\n \"\"\"\n Logs a deprecation notice at most once per AgentCheck instance, for the pre-defined `deprecation_key`\n \"\"\"\n sent, message = self._deprecations[deprecation_key]\n if sent:\n return\n\n self.warning(message, *args)\n self._deprecations[deprecation_key] = (True, message)\n\n # TODO: Remove once our checks stop calling it\n def service_metadata(self, meta_name, value):\n # type: (str, Any) -> None\n pass\n\n def set_metadata(self, name, value, **options):\n # type: (str, Any, **Any) -> None\n \"\"\"Updates the cached metadata `name` with `value`, which is then sent by the Agent at regular intervals.\n\n Parameters:\n name (str):\n the name of the metadata\n value (Any):\n the value for the metadata. 
if ``name`` has no transformer defined then the\n raw ``value`` will be submitted and therefore it must be a ``str``\n options (Any):\n keyword arguments to pass to any defined transformer\n \"\"\"\n self.metadata_manager.submit(name, value, options)\n\n @staticmethod\n def is_metadata_collection_enabled():\n # type: () -> bool\n return is_affirmative(datadog_agent.get_config('enable_metadata_collection'))\n\n @classmethod\n def metadata_entrypoint(cls, method):\n # type: (Callable[..., None]) -> Callable[..., None]\n \"\"\"\n Skip execution of the decorated method if metadata collection is disabled on the Agent.\n\n Usage:\n\n ```python\n class MyCheck(AgentCheck):\n @AgentCheck.metadata_entrypoint\n def collect_metadata(self):\n ...\n ```\n \"\"\"\n\n @functools.wraps(method)\n def entrypoint(self, *args, **kwargs):\n # type: (AgentCheck, *Any, **Any) -> None\n if not self.is_metadata_collection_enabled():\n return\n\n # NOTE: error handling still at the discretion of the wrapped method.\n method(self, *args, **kwargs)\n\n return entrypoint\n\n def _persistent_cache_id(self, key):\n # type: (str) -> str\n return '{}_{}'.format(self.check_id, key)\n\n def read_persistent_cache(self, key):\n # type: (str) -> str\n \"\"\"Returns the value previously stored with `write_persistent_cache` for the same `key`.\n\n Parameters:\n key (str):\n the key to retrieve\n \"\"\"\n return datadog_agent.read_persistent_cache(self._persistent_cache_id(key))\n\n def write_persistent_cache(self, key, value):\n # type: (str, str) -> None\n \"\"\"Stores `value` in a persistent cache for this check instance.\n The cache is located in a path where the agent is guaranteed to have read & write permissions. 
Namely in\n - `%ProgramData%\\\\Datadog\\\\run` on Windows.\n - `/opt/datadog-agent/run` everywhere else.\n The cache is persistent between agent restarts but will be rebuilt if the check instance configuration changes.\n\n Parameters:\n key (str):\n the key to store the value under\n value (str):\n the value to store\n \"\"\"\n datadog_agent.write_persistent_cache(self._persistent_cache_id(key), value)\n\n def set_external_tags(self, external_tags):\n # type: (Sequence[ExternalTagType]) -> None\n # Example of external_tags format\n # [\n # ('hostname', {'src_name': ['test:t1']}),\n # ('hostname2', {'src2_name': ['test2:t3']})\n # ]\n try:\n new_tags = []\n for hostname, source_map in external_tags:\n new_tags.append((to_native_string(hostname), source_map))\n for src_name, tags in iteritems(source_map):\n source_map[src_name] = self._normalize_tags_type(tags)\n datadog_agent.set_external_tags(new_tags)\n except IndexError:\n self.log.exception('Unexpected external tags format: %s', external_tags)\n raise\n\n def convert_to_underscore_separated(self, name):\n # type: (Union[str, bytes]) -> bytes\n \"\"\"\n Convert from CamelCase to camel_case\n And substitute illegal metric characters\n \"\"\"\n name = ensure_bytes(name)\n metric_name = self.FIRST_CAP_RE.sub(br'\\1_\\2', name)\n metric_name = self.ALL_CAP_RE.sub(br'\\1_\\2', metric_name).lower()\n metric_name = self.METRIC_REPLACEMENT.sub(br'_', metric_name)\n return self.DOT_UNDERSCORE_CLEANUP.sub(br'.', metric_name).strip(b'_')\n\n def warning(self, warning_message, *args, **kwargs):\n # type: (str, *Any, **Any) -> None\n \"\"\"Log a warning message, display it in the Agent's status page and in-app.\n\n Using *args is intended to make warning work like log.warn/debug/info/etc\n and make it compliant with flake8 logging format linter.\n\n Parameters:\n warning_message (str):\n the warning message\n args (Any):\n format string args used to format the warning message e.g. 
`warning_message % args`\n kwargs (Any):\n not used for now, but added to match Python logger's `warning` method signature\n \"\"\"\n warning_message = to_native_string(warning_message)\n # Interpolate message only if args is not empty. Same behavior as python logger:\n # https://github.com/python/cpython/blob/1dbe5373851acb85ba91f0be7b83c69563acd68d/Lib/logging/__init__.py#L368-L369\n if args:\n warning_message = warning_message % args\n frame = inspect.currentframe().f_back # type: ignore\n lineno = frame.f_lineno\n # only log the last part of the filename, not the full path\n filename = basename(frame.f_code.co_filename)\n\n self.log.warning(warning_message, extra={'_lineno': lineno, '_filename': filename, '_check_id': self.check_id})\n self.warnings.append(warning_message)\n\n def get_warnings(self):\n # type: () -> List[str]\n \"\"\"\n Return the list of warnings messages to be displayed in the info page\n \"\"\"\n warnings = self.warnings\n self.warnings = []\n return warnings\n\n def get_diagnoses(self):\n # type: () -> str\n \"\"\"\n Return the list of diagnosis as a JSON encoded string.\n\n The agent calls this method to retrieve diagnostics from integrations. 
This method\n runs explicit diagnostics if available.\n \"\"\"\n return to_json([d._asdict() for d in (self.diagnosis.diagnoses + self.diagnosis.run_explicit())])\n\n def _get_requests_proxy(self):\n # type: () -> ProxySettings\n # TODO: Remove with Agent 5\n no_proxy_settings = {'http': None, 'https': None, 'no': []} # type: ProxySettings\n\n # First we read the proxy configuration from datadog.conf\n proxies = self.agentConfig.get('proxy', datadog_agent.get_config('proxy'))\n if proxies:\n proxies = proxies.copy()\n\n # requests compliant dict\n if proxies and 'no_proxy' in proxies:\n proxies['no'] = proxies.pop('no_proxy')\n\n return proxies if proxies else no_proxy_settings\n\n def _format_namespace(self, s, raw=False):\n # type: (str, bool) -> str\n if not raw and self.__NAMESPACE__:\n return '{}.{}'.format(self.__NAMESPACE__, to_native_string(s))\n\n return to_native_string(s)\n\n def normalize(self, metric, prefix=None, fix_case=False):\n # type: (Union[str, bytes], Union[str, bytes], bool) -> str\n \"\"\"\n Turn a metric into a well-formed metric name prefix.b.c\n\n Parameters:\n metric: The metric name to normalize\n prefix: A prefix to add to the normalized name, default None\n fix_case: A boolean, indicating whether to make sure that the metric name returned is in \"snake_case\"\n \"\"\"\n if isinstance(metric, text_type):\n metric = unicodedata.normalize('NFKD', metric).encode('ascii', 'ignore')\n\n if fix_case:\n name = self.convert_to_underscore_separated(metric)\n if prefix is not None:\n prefix = self.convert_to_underscore_separated(prefix)\n else:\n name = self.METRIC_REPLACEMENT.sub(br'_', metric)\n name = self.DOT_UNDERSCORE_CLEANUP.sub(br'.', name).strip(b'_')\n\n name = self.MULTIPLE_UNDERSCORE_CLEANUP.sub(br'_', name)\n\n if prefix is not None:\n name = ensure_bytes(prefix) + b\".\" + name\n\n return to_native_string(name)\n\n def normalize_tag(self, tag):\n # type: (Union[str, bytes]) -> str\n \"\"\"Normalize tag values.\n\n This happens 
for legacy reasons, when we cleaned up some characters (like '-')\n which are allowed in tags.\n \"\"\"\n if isinstance(tag, text_type):\n tag = tag.encode('utf-8', 'ignore')\n tag = self.TAG_REPLACEMENT.sub(br'_', tag)\n tag = self.MULTIPLE_UNDERSCORE_CLEANUP.sub(br'_', tag)\n tag = self.DOT_UNDERSCORE_CLEANUP.sub(br'.', tag).strip(b'_')\n return to_native_string(tag)\n\n def check(self, instance):\n # type: (InstanceType) -> None\n raise NotImplementedError\n\n def cancel(self):\n # type: () -> None\n \"\"\"\n This method is called when the check is unscheduled by the agent. This\n is a signal that the check is being unscheduled, and it can be called while\n the check is running. It's up to the Python implementation to make sure\n cancel is thread safe and won't block.\n \"\"\"\n pass\n\n def run(self):\n # type: () -> str\n try:\n self.diagnosis.clear()\n # Ignore check initializations if running in a separate process\n if is_affirmative(self.instance.get('process_isolation', self.init_config.get('process_isolation', False))):\n from ..utils.replay.execute import run_with_isolation\n\n run_with_isolation(self, aggregator, datadog_agent)\n else:\n while self.check_initializations:\n initialization = self.check_initializations.popleft()\n try:\n initialization()\n except Exception:\n self.check_initializations.appendleft(initialization)\n raise\n\n instance = copy.deepcopy(self.instances[0])\n\n if 'set_breakpoint' in self.init_config:\n from ..utils.agent.debug import enter_pdb\n\n enter_pdb(self.check, line=self.init_config['set_breakpoint'], args=(instance,))\n elif self.should_profile_memory():\n self.profile_memory(self.check, self.init_config, args=(instance,))\n else:\n self.check(instance)\n\n error_report = ''\n except Exception as e:\n message = self.sanitize(str(e))\n tb = self.sanitize(traceback.format_exc())\n error_report = to_json([{'message': message, 'traceback': tb}])\n finally:\n if self.metric_limiter:\n if 
is_affirmative(self.debug_metrics.get('metric_contexts', False)):\n debug_metrics = self.metric_limiter.get_debug_metrics()\n\n # Reset so we can actually submit the metrics\n self.metric_limiter.reset()\n\n tags = self.get_debug_metric_tags()\n for metric_name, value in debug_metrics:\n self.gauge(metric_name, value, tags=tags, raw=True)\n\n self.metric_limiter.reset()\n\n return error_report\n\n def event(self, event):\n # type: (Event) -> None\n \"\"\"Send an event.\n\n An event is a dictionary with the following keys and data types:\n\n ```python\n {\n \"timestamp\": int, # the epoch timestamp for the event\n \"event_type\": str, # the event name\n \"api_key\": str, # the api key for your account\n \"msg_title\": str, # the title of the event\n \"msg_text\": str, # the text body of the event\n \"aggregation_key\": str, # a key to use for aggregating events\n \"alert_type\": str, # (optional) one of ('error', 'warning', 'success', 'info'), defaults to 'info'\n \"source_type_name\": str, # (optional) the source type name\n \"host\": str, # (optional) the name of the host\n \"tags\": list, # (optional) a list of tags to associate with this event\n \"priority\": str, # (optional) specifies the priority of the event (\"normal\" or \"low\")\n }\n ```\n\n Parameters:\n event (dict[str, Any]):\n the event to be sent\n \"\"\"\n # Enforce types of some fields, considerably facilitates handling in go bindings downstream\n for key, value in iteritems(event):\n if not isinstance(value, (text_type, binary_type)):\n continue\n\n try:\n event[key] = to_native_string(value) # type: ignore\n # ^ Mypy complains about dynamic key assignment -- arguably for good reason.\n # Ideally we should convert this to a dict literal so that submitted events only include known keys.\n except UnicodeError:\n self.log.warning('Encoding error with field `%s`, cannot submit event', key)\n return\n\n if event.get('tags'):\n event['tags'] = self._normalize_tags_type(event['tags'])\n if 
event.get('timestamp'):\n event['timestamp'] = int(event['timestamp'])\n if event.get('aggregation_key'):\n event['aggregation_key'] = to_native_string(event['aggregation_key'])\n\n if self.__NAMESPACE__:\n event.setdefault('source_type_name', self.__NAMESPACE__)\n\n aggregator.submit_event(self, self.check_id, event)\n\n def _normalize_tags_type(self, tags, device_name=None, metric_name=None):\n # type: (Sequence[Union[None, str, bytes]], str, str) -> List[str]\n \"\"\"\n Normalize tags contents and type:\n - append `device_name` as `device:` tag\n - normalize tags type\n - doesn't mutate the passed list, returns a new list\n \"\"\"\n normalized_tags = []\n\n if device_name:\n self._log_deprecation('device_name')\n try:\n normalized_tags.append('device:{}'.format(to_native_string(device_name)))\n except UnicodeError:\n self.log.warning(\n 'Encoding error with device name `%r` for metric `%r`, ignoring tag', device_name, metric_name\n )\n\n for tag in tags:\n if tag is None:\n continue\n try:\n tag = to_native_string(tag)\n except UnicodeError:\n self.log.warning('Encoding error with tag `%s` for metric `%s`, ignoring tag', tag, metric_name)\n continue\n if self.disable_generic_tags:\n normalized_tags.append(self.degeneralise_tag(tag))\n else:\n normalized_tags.append(tag)\n return normalized_tags\n\n def degeneralise_tag(self, tag):\n split_tag = tag.split(':', 1)\n if len(split_tag) > 1:\n tag_name, value = split_tag\n else:\n tag_name = tag\n value = None\n\n if tag_name in GENERIC_TAGS:\n new_name = '{}_{}'.format(self.name, tag_name)\n if value:\n return '{}:{}'.format(new_name, value)\n else:\n return new_name\n else:\n return tag\n\n def get_debug_metric_tags(self):\n tags = ['check_name:{}'.format(self.name), 'check_version:{}'.format(self.check_version)]\n tags.extend(self.instance.get('tags', []))\n return tags\n\n def get_memory_profile_tags(self):\n # type: () -> List[str]\n tags = self.get_debug_metric_tags()\n 
tags.extend(self.instance.get('__memory_profiling_tags', []))\n return tags\n\n def should_profile_memory(self):\n # type: () -> bool\n return 'profile_memory' in self.init_config or (\n datadog_agent.tracemalloc_enabled() and should_profile_memory(datadog_agent, self.name)\n )\n\n def profile_memory(self, func, namespaces=None, args=(), kwargs=None, extra_tags=None):\n # type: (Callable[..., Any], Optional[Sequence[str]], Sequence[Any], Optional[Dict[str, Any]], Optional[List[str]]) -> None # noqa: E501\n from ..utils.agent.memory import profile_memory\n\n if namespaces is None:\n namespaces = self.check_id.split(':', 1)\n\n tags = self.get_memory_profile_tags()\n if extra_tags is not None:\n tags.extend(extra_tags)\n\n metrics = profile_memory(func, self.init_config, namespaces=namespaces, args=args, kwargs=kwargs)\n\n for m in metrics:\n self.gauge(m.name, m.value, tags=tags, raw=True)\n
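The generic-tag renaming done by degeneralise_tag in the listing above can be tried in isolation. The following is a minimal sketch, not the library code: the GENERIC_TAGS set below is a hypothetical stand-in (the real constant's contents are not shown here), and the check name is passed as an argument instead of being read from self.name.

```python
# Hypothetical stand-in for the real GENERIC_TAGS constant (contents assumed).
GENERIC_TAGS = {'host', 'cluster'}


def degeneralise_tag(check_name, tag):
    # Split on the first colon only, so values that contain colons stay intact.
    tag_name, _, value = tag.partition(':')
    if tag_name not in GENERIC_TAGS:
        # Non-generic tags are returned unchanged.
        return tag
    new_name = '{}_{}'.format(check_name, tag_name)
    # A generic tag without a value is renamed without a trailing colon.
    return '{}:{}'.format(new_name, value) if value else new_name


print(degeneralise_tag('postgres', 'host:db-1'))
print(degeneralise_tag('postgres', 'env:prod'))
```

Prefixing the generic tag name with the check name keeps the value while avoiding a clash with tags of the same name set elsewhere.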
hostname (str, default None): a hostname to associate with this metric. Defaults to the current host.
device_name (str, default None): deprecated; add a tag in the form device:<device_name> to the tags list instead.
raw (bool, default False): whether to ignore any defined namespace prefix
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def gauge(self, name, value, tags=None, hostname=None, device_name=None, raw=False):\n # type: (str, float, Sequence[str], str, str, bool) -> None\n \"\"\"Sample a gauge metric.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n self._submit_metric(\n aggregator.GAUGE, name, value, tags=tags, hostname=hostname, device_name=device_name, raw=raw\n )\n
hostname (str, default None): a hostname to associate with this metric. Defaults to the current host.
device_name (str, default None): deprecated; add a tag in the form device:<device_name> to the tags list instead.
raw (bool, default False): whether to ignore any defined namespace prefix
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def count(self, name, value, tags=None, hostname=None, device_name=None, raw=False):\n # type: (str, float, Sequence[str], str, str, bool) -> None\n \"\"\"Sample a raw count metric.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n self._submit_metric(\n aggregator.COUNT, name, value, tags=tags, hostname=hostname, device_name=device_name, raw=raw\n )\n
hostname (str, default None): a hostname to associate with this metric. Defaults to the current host.
device_name (str, default None): deprecated; add a tag in the form device:<device_name> to the tags list instead.
raw (bool, default False): whether to ignore any defined namespace prefix
flush_first_value (bool, default False): whether to sample the first value
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def monotonic_count(\n self, name, value, tags=None, hostname=None, device_name=None, raw=False, flush_first_value=False\n):\n # type: (str, float, Sequence[str], str, str, bool, bool) -> None\n \"\"\"Sample an increasing counter metric.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n flush_first_value (bool):\n whether to sample the first value\n \"\"\"\n self._submit_metric(\n aggregator.MONOTONIC_COUNT,\n name,\n value,\n tags=tags,\n hostname=hostname,\n device_name=device_name,\n raw=raw,\n flush_first_value=flush_first_value,\n )\n
Sample a point, with the rate calculated at the end of the check.
Parameters:
name (str, required): the name of the metric
value (float, required): the value for the metric
tags (list[str], default None): a list of tags to associate with this metric
hostname (str, default None): a hostname to associate with this metric. Defaults to the current host.
device_name (str, default None): deprecated; add a tag in the form device:<device_name> to the tags list instead.
raw (bool, default False): whether to ignore any defined namespace prefix
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def rate(self, name, value, tags=None, hostname=None, device_name=None, raw=False):\n # type: (str, float, Sequence[str], str, str, bool) -> None\n \"\"\"Sample a point, with the rate calculated at the end of the check.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n self._submit_metric(\n aggregator.RATE, name, value, tags=tags, hostname=hostname, device_name=device_name, raw=raw\n )\n
hostname (str, default None): a hostname to associate with this metric. Defaults to the current host.
device_name (str, default None): deprecated; add a tag in the form device:<device_name> to the tags list instead.
raw (bool, default False): whether to ignore any defined namespace prefix
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def histogram(self, name, value, tags=None, hostname=None, device_name=None, raw=False):\n # type: (str, float, Sequence[str], str, str, bool) -> None\n \"\"\"Sample a histogram metric.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n self._submit_metric(\n aggregator.HISTOGRAM, name, value, tags=tags, hostname=hostname, device_name=device_name, raw=raw\n )\n
hostname (str, default None): a hostname to associate with this metric. Defaults to the current host.
device_name (str, default None): deprecated; add a tag in the form device:<device_name> to the tags list instead.
raw (bool, default False): whether to ignore any defined namespace prefix
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def historate(self, name, value, tags=None, hostname=None, device_name=None, raw=False):\n # type: (str, float, Sequence[str], str, str, bool) -> None\n \"\"\"Sample a histogram based on rate metrics.\n\n Parameters:\n name (str):\n the name of the metric\n value (float):\n the value for the metric\n tags (list[str]):\n a list of tags to associate with this metric\n hostname (str):\n a hostname to associate with this metric. Defaults to the current host.\n device_name (str):\n **deprecated** add a tag in the form `device:<device_name>` to the `tags` list instead.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n self._submit_metric(\n aggregator.HISTORATE, name, value, tags=tags, hostname=hostname, device_name=device_name, raw=raw\n )\n
tags (list[str], default None): a list of tags to associate with this service check
message (str, default None): additional information or a description of why this status occurred.
raw (bool, default False): whether to ignore any defined namespace prefix
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def service_check(self, name, status, tags=None, hostname=None, message=None, raw=False):\n # type: (str, ServiceCheckStatus, Sequence[str], str, str, bool) -> None\n \"\"\"Send the status of a service.\n\n Parameters:\n name (str):\n the name of the service check\n status (int):\n a constant describing the service status\n tags (list[str]):\n a list of tags to associate with this service check\n message (str):\n additional information or a description of why this status occurred.\n raw (bool):\n whether to ignore any defined namespace prefix\n \"\"\"\n tags = self._normalize_tags_type(tags or [])\n if hostname is None:\n hostname = ''\n if message is None:\n message = ''\n else:\n message = to_native_string(message)\n\n message = self.sanitize(message)\n\n aggregator.submit_service_check(\n self, self.check_id, self._format_namespace(name, raw), status, tags, hostname, message\n )\n
An event is a dictionary with the following keys and data types:
{\n \"timestamp\": int, # the epoch timestamp for the event\n \"event_type\": str, # the event name\n \"api_key\": str, # the api key for your account\n \"msg_title\": str, # the title of the event\n \"msg_text\": str, # the text body of the event\n \"aggregation_key\": str, # a key to use for aggregating events\n \"alert_type\": str, # (optional) one of ('error', 'warning', 'success', 'info'), defaults to 'info'\n \"source_type_name\": str, # (optional) the source type name\n \"host\": str, # (optional) the name of the host\n \"tags\": list, # (optional) a list of tags to associate with this event\n \"priority\": str, # (optional) specifies the priority of the event (\"normal\" or \"low\")\n}\n
Parameters:
event (dict[str, Any], required): the event to be sent
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def event(self, event):\n # type: (Event) -> None\n \"\"\"Send an event.\n\n An event is a dictionary with the following keys and data types:\n\n ```python\n {\n \"timestamp\": int, # the epoch timestamp for the event\n \"event_type\": str, # the event name\n \"api_key\": str, # the api key for your account\n \"msg_title\": str, # the title of the event\n \"msg_text\": str, # the text body of the event\n \"aggregation_key\": str, # a key to use for aggregating events\n \"alert_type\": str, # (optional) one of ('error', 'warning', 'success', 'info'), defaults to 'info'\n \"source_type_name\": str, # (optional) the source type name\n \"host\": str, # (optional) the name of the host\n \"tags\": list, # (optional) a list of tags to associate with this event\n \"priority\": str, # (optional) specifies the priority of the event (\"normal\" or \"low\")\n }\n ```\n\n Parameters:\n event (dict[str, Any]):\n the event to be sent\n \"\"\"\n # Enforce types of some fields, considerably facilitates handling in go bindings downstream\n for key, value in iteritems(event):\n if not isinstance(value, (text_type, binary_type)):\n continue\n\n try:\n event[key] = to_native_string(value) # type: ignore\n # ^ Mypy complains about dynamic key assignment -- arguably for good reason.\n # Ideally we should convert this to a dict literal so that submitted events only include known keys.\n except UnicodeError:\n self.log.warning('Encoding error with field `%s`, cannot submit event', key)\n return\n\n if event.get('tags'):\n event['tags'] = self._normalize_tags_type(event['tags'])\n if event.get('timestamp'):\n event['timestamp'] = int(event['timestamp'])\n if event.get('aggregation_key'):\n event['aggregation_key'] = to_native_string(event['aggregation_key'])\n\n if self.__NAMESPACE__:\n event.setdefault('source_type_name', self.__NAMESPACE__)\n\n aggregator.submit_event(self, self.check_id, event)\n
Updates the cached metadata name with value, which is then sent by the Agent at regular intervals.
Parameters:
name (str, required): the name of the metadata
value (Any, required): the value for the metadata. If name has no transformer defined, the raw value is submitted and must therefore be a str
options (Any, default {}): keyword arguments to pass to any defined transformer
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def set_metadata(self, name, value, **options):\n # type: (str, Any, **Any) -> None\n \"\"\"Updates the cached metadata `name` with `value`, which is then sent by the Agent at regular intervals.\n\n Parameters:\n name (str):\n the name of the metadata\n value (Any):\n the value for the metadata. if ``name`` has no transformer defined then the\n raw ``value`` will be submitted and therefore it must be a ``str``\n options (Any):\n keyword arguments to pass to any defined transformer\n \"\"\"\n self.metadata_manager.submit(name, value, options)\n
Skip execution of the decorated method if metadata collection is disabled on the Agent.
Usage:
class MyCheck(AgentCheck):\n @AgentCheck.metadata_entrypoint\n def collect_metadata(self):\n ...\n
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
@classmethod\ndef metadata_entrypoint(cls, method):\n # type: (Callable[..., None]) -> Callable[..., None]\n \"\"\"\n Skip execution of the decorated method if metadata collection is disabled on the Agent.\n\n Usage:\n\n ```python\n class MyCheck(AgentCheck):\n @AgentCheck.metadata_entrypoint\n def collect_metadata(self):\n ...\n ```\n \"\"\"\n\n @functools.wraps(method)\n def entrypoint(self, *args, **kwargs):\n # type: (AgentCheck, *Any, **Any) -> None\n if not self.is_metadata_collection_enabled():\n return\n\n # NOTE: error handling still at the discretion of the wrapped method.\n method(self, *args, **kwargs)\n\n return entrypoint\n
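To see the guard behave both ways, here is a minimal sketch of the same decorator pattern outside the Agent. FakeCheck and its metadata_enabled flag are hypothetical stand-ins for AgentCheck and the Agent setting, and the decorator is written as a plain function instead of a classmethod to keep the example self-contained.

```python
import functools


class FakeCheck(object):
    # Hypothetical stand-in for AgentCheck, just enough to exercise the decorator.
    def __init__(self, metadata_enabled):
        self.metadata_enabled = metadata_enabled
        self.collected = []

    def is_metadata_collection_enabled(self):
        return self.metadata_enabled


def metadata_entrypoint(method):
    @functools.wraps(method)
    def entrypoint(self, *args, **kwargs):
        # Skip the wrapped method entirely when collection is disabled.
        if not self.is_metadata_collection_enabled():
            return
        method(self, *args, **kwargs)

    return entrypoint


class MyCheck(FakeCheck):
    @metadata_entrypoint
    def collect_metadata(self, version):
        self.collected.append(version)


enabled = MyCheck(metadata_enabled=True)
enabled.collect_metadata('14.0')
disabled = MyCheck(metadata_enabled=False)
disabled.collect_metadata('14.0')
print(enabled.collected, disabled.collected)
```

Because the wrapper returns early without raising, callers need no special handling when metadata collection is turned off.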
Returns the value previously stored with write_persistent_cache for the same key.
Parameters:
key (str, required): the key to retrieve
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def read_persistent_cache(self, key):\n # type: (str) -> str\n \"\"\"Returns the value previously stored with `write_persistent_cache` for the same `key`.\n\n Parameters:\n key (str):\n the key to retrieve\n \"\"\"\n return datadog_agent.read_persistent_cache(self._persistent_cache_id(key))\n
Stores value in a persistent cache for this check instance. The cache is located in a path where the Agent is guaranteed to have read and write permissions, namely %ProgramData%\\Datadog\\run on Windows and /opt/datadog-agent/run everywhere else. The cache persists between Agent restarts but is rebuilt if the check instance configuration changes.
Parameters:
key (str, required): the key under which to store the value
value (str, required): the value to store
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def write_persistent_cache(self, key, value):\n # type: (str, str) -> None\n \"\"\"Stores `value` in a persistent cache for this check instance.\n The cache is located in a path where the agent is guaranteed to have read & write permissions. Namely in\n - `%ProgramData%\\\\Datadog\\\\run` on Windows.\n - `/opt/datadog-agent/run` everywhere else.\n The cache is persistent between agent restarts but will be rebuilt if the check instance configuration changes.\n\n Parameters:\n key (str):\n the key under which to store the value\n value (str):\n the value to store\n \"\"\"\n datadog_agent.write_persistent_cache(self._persistent_cache_id(key), value)\n
data (dict[str, str], required): The log data to send. The following keys are treated specially, if present:
- timestamp: should be an integer or float representing the number of seconds since the Unix epoch
- ddtags: if not defined, it will automatically be set based on the instance's tags option
cursor (dict[str, Any] or None, default None): Metadata associated with the log which will be saved to disk. The most recent value may be retrieved with the get_log_cursor method.
stream (str, default 'default'): The stream associated with this log, used for accurate cursor persistence. Has no effect if cursor argument is None.
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def send_log(self, data, cursor=None, stream='default'):\n # type: (dict[str, str], dict[str, Any] | None, str) -> None\n \"\"\"Send a log for submission.\n\n Parameters:\n data (dict[str, str]):\n The log data to send. The following keys are treated specially, if present:\n\n - timestamp: should be an integer or float representing the number of seconds since the Unix epoch\n - ddtags: if not defined, it will automatically be set based on the instance's `tags` option\n cursor (dict[str, Any] or None):\n Metadata associated with the log which will be saved to disk. The most recent value may be\n retrieved with the `get_log_cursor` method.\n stream (str):\n The stream associated with this log, used for accurate cursor persistence.\n Has no effect if `cursor` argument is `None`.\n \"\"\"\n attributes = data.copy()\n if 'ddtags' not in attributes and self.formatted_tags:\n attributes['ddtags'] = self.formatted_tags\n\n timestamp = attributes.get('timestamp')\n if timestamp is not None:\n # convert seconds to milliseconds\n attributes['timestamp'] = int(timestamp * 1000)\n\n datadog_agent.send_log(to_json(attributes), self.check_id)\n if cursor is not None:\n self.write_persistent_cache('log_cursor_{}'.format(stream), to_json(cursor))\n
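The attribute preprocessing in send_log can be checked without an Agent. The sketch below lifts those steps into a standalone function; prepare_log_attributes is a hypothetical name, and the comma-separated format used for formatted_tags is an assumption made for illustration.

```python
import json


def prepare_log_attributes(data, formatted_tags=None):
    # Hypothetical helper mirroring the preprocessing steps of send_log.
    # Work on a copy so the caller's dictionary is not mutated.
    attributes = data.copy()
    # Fall back to the instance tags only when the log carries no ddtags.
    if 'ddtags' not in attributes and formatted_tags:
        attributes['ddtags'] = formatted_tags
    timestamp = attributes.get('timestamp')
    if timestamp is not None:
        # Convert seconds since the Unix epoch to integer milliseconds.
        attributes['timestamp'] = int(timestamp * 1000)
    return attributes


log = {'message': 'connection refused', 'timestamp': 1700000000.5}
prepared = prepare_log_attributes(log, formatted_tags='env:dev,team:db')
print(json.dumps(prepared, sort_keys=True))
```

Note that the conversion truncates, so any precision finer than a millisecond in the submitted timestamp is dropped.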
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def get_log_cursor(self, stream='default'):\n # type: (str) -> dict[str, Any] | None\n \"\"\"Returns the most recent log cursor from disk.\"\"\"\n data = self.read_persistent_cache('log_cursor_{}'.format(stream))\n return from_json(data) if data else None\n
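The cursor round trip between send_log and get_log_cursor amounts to one JSON document per stream, keyed by the stream name. Here is a minimal sketch that swaps the persistent cache for an in-memory dict; save_cursor, load_cursor and the _cache dict are hypothetical names, not part of the library.

```python
import json

# Hypothetical in-memory replacement for the Agent persistent cache.
_cache = {}


def save_cursor(cursor, stream='default'):
    # One cache entry per stream, following the log_cursor_<stream> key scheme.
    _cache['log_cursor_{}'.format(stream)] = json.dumps(cursor)


def load_cursor(stream='default'):
    data = _cache.get('log_cursor_{}'.format(stream))
    # A missing or empty entry means no cursor has been saved yet.
    return json.loads(data) if data else None


save_cursor({'offset': 42}, stream='audit')
print(load_cursor('audit'))
print(load_cursor())
```

Keeping a separate key per stream lets a check that reads several log sources resume each one independently.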
Log a warning message, display it in the Agent's status page and in-app.
Accepting *args makes warning work like log.warn/debug/info/etc. and keeps it compliant with the flake8 logging format linter.
Parameters:
warning_message (str, required): the warning message
args (Any, default ()): format string args used to format the warning message, e.g. warning_message % args
kwargs (Any, default {}): not used for now, but added to match Python logger's warning method signature
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def warning(self, warning_message, *args, **kwargs):\n # type: (str, *Any, **Any) -> None\n \"\"\"Log a warning message, display it in the Agent's status page and in-app.\n\n Using *args is intended to make warning work like log.warn/debug/info/etc\n and make it compliant with flake8 logging format linter.\n\n Parameters:\n warning_message (str):\n the warning message\n args (Any):\n format string args used to format the warning message e.g. `warning_message % args`\n kwargs (Any):\n not used for now, but added to match Python logger's `warning` method signature\n \"\"\"\n warning_message = to_native_string(warning_message)\n # Interpolate message only if args is not empty. Same behavior as python logger:\n # https://github.com/python/cpython/blob/1dbe5373851acb85ba91f0be7b83c69563acd68d/Lib/logging/__init__.py#L368-L369\n if args:\n warning_message = warning_message % args\n frame = inspect.currentframe().f_back # type: ignore\n lineno = frame.f_lineno\n # only log the last part of the filename, not the full path\n filename = basename(frame.f_code.co_filename)\n\n self.log.warning(warning_message, extra={'_lineno': lineno, '_filename': filename, '_check_id': self.check_id})\n self.warnings.append(warning_message)\n
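The conditional interpolation matters for messages that contain a literal percent sign. This minimal sketch isolates that step; format_warning is a hypothetical helper, not part of the library.

```python
def format_warning(warning_message, *args):
    # Interpolate only when args were supplied, so a literal percent sign
    # in a message without args is left untouched.
    if args:
        warning_message = warning_message % args
    return warning_message


print(format_warning('disk usage at 95%'))
print(format_warning('retrying %s in %d seconds', 'query', 5))
```

Without the guard, the first call would attempt to interpolate an empty tuple into a message ending in a bare percent sign and raise an error.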
This implements the methods defined by the Agent's C bindings which in turn call the Go backend.
It also provides utility methods for test assertions.
Source code in datadog_checks_base/datadog_checks/base/stubs/aggregator.py
class AggregatorStub(object):\n \"\"\"\n This implements the methods defined by the Agent's\n [C bindings](https://github.com/DataDog/datadog-agent/blob/master/rtloader/common/builtins/aggregator.c)\n which in turn call the\n [Go backend](https://github.com/DataDog/datadog-agent/blob/master/pkg/collector/python/aggregator.go).\n\n It also provides utility methods for test assertions.\n \"\"\"\n\n # Replicate the Enum we have on the Agent\n METRIC_ENUM_MAP = OrderedDict(\n (\n ('gauge', 0),\n ('rate', 1),\n ('count', 2),\n ('monotonic_count', 3),\n ('counter', 4),\n ('histogram', 5),\n ('historate', 6),\n )\n )\n METRIC_ENUM_MAP_REV = {v: k for k, v in iteritems(METRIC_ENUM_MAP)}\n GAUGE, RATE, COUNT, MONOTONIC_COUNT, COUNTER, HISTOGRAM, HISTORATE = list(METRIC_ENUM_MAP.values())\n AGGREGATE_TYPES = {COUNT, COUNTER}\n IGNORED_METRICS = {'datadog.agent.profile.memory.check_run_alloc'}\n METRIC_TYPE_SUBMISSION_TO_BACKEND_MAP = {\n 'gauge': 'gauge',\n 'rate': 'gauge',\n 'count': 'count',\n 'monotonic_count': 'count',\n 'counter': 'rate',\n 'histogram': 'rate', # Checking .count only, the other are gauges\n 'historate': 'rate', # Checking .count only, the other are gauges\n }\n\n def __init__(self):\n self.reset()\n\n @classmethod\n def is_aggregate(cls, mtype):\n return mtype in cls.AGGREGATE_TYPES\n\n @classmethod\n def ignore_metric(cls, name):\n return name in cls.IGNORED_METRICS\n\n def submit_metric(self, check, check_id, mtype, name, value, tags, hostname, flush_first_value):\n check_tag_names(name, tags)\n if not self.ignore_metric(name):\n self._metrics[name].append(MetricStub(name, mtype, value, tags, hostname, None, flush_first_value))\n\n def submit_metric_e2e(\n self, check, check_id, mtype, name, value, tags, hostname, device=None, flush_first_value=False\n ):\n check_tag_names(name, tags)\n # Device is only present in metrics read from the real agent in e2e tests. 
Normally it is submitted as a tag\n if not self.ignore_metric(name):\n self._metrics[name].append(MetricStub(name, mtype, value, tags, hostname, device, flush_first_value))\n\n def submit_service_check(self, check, check_id, name, status, tags, hostname, message):\n if status == ServiceCheck.OK and message:\n raise Exception(\"Expected empty message on OK service check\")\n\n check_tag_names(name, tags)\n self._service_checks[name].append(ServiceCheckStub(check_id, name, status, tags, hostname, message))\n\n def submit_event(self, check, check_id, event):\n self._events.append(event)\n\n def submit_event_platform_event(self, check, check_id, raw_event, event_type):\n self._event_platform_events[event_type].append(raw_event)\n\n def submit_histogram_bucket(\n self,\n check,\n check_id,\n name,\n value,\n lower_bound,\n upper_bound,\n monotonic,\n hostname,\n tags,\n flush_first_value=False,\n ):\n check_tag_names(name, tags)\n self._histogram_buckets[name].append(\n HistogramBucketStub(name, value, lower_bound, upper_bound, monotonic, hostname, tags, flush_first_value)\n )\n\n def metrics(self, name):\n \"\"\"\n Return the metrics received under the given name\n \"\"\"\n return [\n MetricStub(\n ensure_unicode(stub.name),\n stub.type,\n stub.value,\n normalize_tags(stub.tags),\n ensure_unicode(stub.hostname),\n stub.device,\n stub.flush_first_value,\n )\n for stub in self._metrics.get(to_native_string(name), [])\n ]\n\n def service_checks(self, name):\n \"\"\"\n Return the service checks received under the given name\n \"\"\"\n return [\n ServiceCheckStub(\n ensure_unicode(stub.check_id),\n ensure_unicode(stub.name),\n stub.status,\n normalize_tags(stub.tags),\n ensure_unicode(stub.hostname),\n ensure_unicode(stub.message),\n )\n for stub in self._service_checks.get(to_native_string(name), [])\n ]\n\n @property\n def events(self):\n \"\"\"\n Return all events\n \"\"\"\n return self._events\n\n def get_event_platform_events(self, event_type, parse_json=True):\n 
\"\"\"\n Return all event platform events for the event_type\n \"\"\"\n return [json.loads(e) if parse_json else e for e in self._event_platform_events[event_type]]\n\n def histogram_bucket(self, name):\n \"\"\"\n Return the histogram buckets received under the given name\n \"\"\"\n return [\n HistogramBucketStub(\n ensure_unicode(stub.name),\n stub.value,\n stub.lower_bound,\n stub.upper_bound,\n stub.monotonic,\n ensure_unicode(stub.hostname),\n normalize_tags(stub.tags),\n stub.flush_first_value,\n )\n for stub in self._histogram_buckets.get(to_native_string(name), [])\n ]\n\n def assert_metric_has_tags(self, metric_name, tags, count=None, at_least=1):\n for tag in tags:\n self.assert_metric_has_tag(metric_name, tag, count, at_least)\n\n def assert_metric_has_tag(self, metric_name, tag, count=None, at_least=1):\n \"\"\"\n Assert a metric is tagged with tag\n \"\"\"\n self._asserted.add(metric_name)\n\n candidates = []\n candidates_with_tag = []\n for metric in self.metrics(metric_name):\n candidates.append(metric)\n if tag in metric.tags:\n candidates_with_tag.append(metric)\n\n if candidates_with_tag: # The metric was found with the tag but not enough times\n msg = \"The metric '{}' with tag '{}' was only found {}/{} times\".format(metric_name, tag, count, at_least)\n elif candidates:\n msg = (\n \"The metric '{}' was found but not with the tag '{}'.\\n\".format(metric_name, tag)\n + \"Similar submitted:\\n\"\n + \"\\n\".join([\" {}\".format(m) for m in candidates])\n )\n else:\n expected_stub = MetricStub(metric_name, type=None, value=None, tags=[tag], hostname=None, device=None)\n msg = \"Metric '{}' not found\".format(metric_name)\n msg = \"{}\\n{}\".format(msg, build_similar_elements_msg(expected_stub, self._metrics))\n\n if count is not None:\n assert len(candidates_with_tag) == count, msg\n else:\n assert len(candidates_with_tag) >= at_least, msg\n\n # Potential kwargs: aggregation_key, alert_type, event_type,\n # msg_title, source_type_name\n def 
assert_event(self, msg_text, count=None, at_least=1, exact_match=True, tags=None, **kwargs):\n candidates = []\n for e in self.events:\n if exact_match and msg_text != e['msg_text'] or msg_text not in e['msg_text']:\n continue\n if tags and set(tags) != set(e['tags']):\n continue\n for name, value in iteritems(kwargs):\n if e[name] != value:\n break\n else:\n candidates.append(e)\n\n msg = \"Candidates size assertion for `{}`, count: {}, at_least: {}) failed\".format(msg_text, count, at_least)\n if count is not None:\n assert len(candidates) == count, msg\n else:\n assert len(candidates) >= at_least, msg\n\n def assert_histogram_bucket(\n self,\n name,\n value,\n lower_bound,\n upper_bound,\n monotonic,\n hostname,\n tags,\n count=None,\n at_least=1,\n flush_first_value=None,\n ):\n expected_tags = normalize_tags(tags, sort=True)\n\n candidates = []\n for bucket in self.histogram_bucket(name):\n if value is not None and value != bucket.value:\n continue\n\n if expected_tags and expected_tags != sorted(bucket.tags):\n continue\n\n if hostname and hostname != bucket.hostname:\n continue\n\n if monotonic != bucket.monotonic:\n continue\n\n if flush_first_value is not None and flush_first_value != bucket.flush_first_value:\n continue\n\n candidates.append(bucket)\n\n expected_bucket = HistogramBucketStub(\n name, value, lower_bound, upper_bound, monotonic, hostname, tags, flush_first_value\n )\n\n if count is not None:\n msg = \"Needed exactly {} candidates for '{}', got {}\".format(count, name, len(candidates))\n condition = len(candidates) == count\n else:\n msg = \"Needed at least {} candidates for '{}', got {}\".format(at_least, name, len(candidates))\n condition = len(candidates) >= at_least\n self._assert(\n condition=condition, msg=msg, expected_stub=expected_bucket, submitted_elements=self._histogram_buckets\n )\n\n def assert_metric(\n self,\n name,\n value=None,\n tags=None,\n count=None,\n at_least=1,\n hostname=None,\n metric_type=None,\n device=None,\n 
flush_first_value=None,\n ):\n \"\"\"\n Assert a metric was processed by this stub\n \"\"\"\n\n self._asserted.add(name)\n expected_tags = normalize_tags(tags, sort=True)\n\n candidates = []\n for metric in self.metrics(name):\n if value is not None and not self.is_aggregate(metric.type) and value != metric.value:\n continue\n\n if expected_tags and expected_tags != sorted(metric.tags):\n continue\n\n if hostname is not None and hostname != metric.hostname:\n continue\n\n if metric_type is not None and metric_type != metric.type:\n continue\n\n if device is not None and device != metric.device:\n continue\n\n if flush_first_value is not None and flush_first_value != metric.flush_first_value:\n continue\n\n candidates.append(metric)\n\n expected_metric = MetricStub(name, metric_type, value, expected_tags, hostname, device, flush_first_value)\n\n if value is not None and candidates and all(self.is_aggregate(m.type) for m in candidates):\n got = sum(m.value for m in candidates)\n msg = \"Expected count value for '{}': {}, got {}\".format(name, value, got)\n condition = value == got\n elif count is not None:\n msg = \"Needed exactly {} candidates for '{}', got {}\".format(count, name, len(candidates))\n condition = len(candidates) == count\n else:\n msg = \"Needed at least {} candidates for '{}', got {}\".format(at_least, name, len(candidates))\n condition = len(candidates) >= at_least\n self._assert(condition, msg=msg, expected_stub=expected_metric, submitted_elements=self._metrics)\n\n def assert_service_check(self, name, status=None, tags=None, count=None, at_least=1, hostname=None, message=None):\n \"\"\"\n Assert a service check was processed by this stub\n \"\"\"\n tags = normalize_tags(tags, sort=True)\n candidates = []\n for sc in self.service_checks(name):\n if status is not None and status != sc.status:\n continue\n\n if tags and tags != sorted(sc.tags):\n continue\n\n if hostname is not None and hostname != sc.hostname:\n continue\n\n if message is not None 
and message != sc.message:\n continue\n\n candidates.append(sc)\n\n expected_service_check = ServiceCheckStub(\n None, name=name, status=status, tags=tags, hostname=hostname, message=message\n )\n\n if count is not None:\n msg = \"Needed exactly {} candidates for '{}', got {}\".format(count, name, len(candidates))\n condition = len(candidates) == count\n else:\n msg = \"Needed at least {} candidates for '{}', got {}\".format(at_least, name, len(candidates))\n condition = len(candidates) >= at_least\n self._assert(\n condition=condition, msg=msg, expected_stub=expected_service_check, submitted_elements=self._service_checks\n )\n\n @staticmethod\n def _assert(condition, msg, expected_stub, submitted_elements):\n new_msg = msg\n if not condition: # It's costly to build the message with similar metrics, so it's built only on failure.\n new_msg = \"{}\\n{}\".format(msg, build_similar_elements_msg(expected_stub, submitted_elements))\n assert condition, new_msg\n\n def assert_all_metrics_covered(self):\n # use `condition` to avoid building the `msg` if not needed\n condition = self.metrics_asserted_pct >= 100.0\n msg = ''\n if not condition:\n prefix = '\\n\\t- '\n msg = 'Some metrics are collected but not asserted:'\n msg += '\\nAsserted Metrics:{}{}'.format(prefix, prefix.join(sorted(self._asserted)))\n msg += '\\nFound metrics that are not asserted:{}{}'.format(prefix, prefix.join(sorted(self.not_asserted())))\n assert condition, msg\n\n def assert_metrics_using_metadata(\n self, metadata_metrics, check_metric_type=True, check_submission_type=False, exclude=None\n ):\n \"\"\"\n Assert metrics using metadata.csv\n\n Checking type: By default we are asserting the in-app metric type (`check_submission_type=False`),\n asserting this type make sense for e2e (metrics collected from agent).\n For integrations tests, we can check the submission type with `check_submission_type=True`, or\n use `check_metric_type=False` not to check types.\n\n Usage:\n\n from 
datadog_checks.dev.utils import get_metadata_metrics\n aggregator.assert_metrics_using_metadata(get_metadata_metrics())\n\n \"\"\"\n\n exclude = exclude or []\n errors = set()\n for metric_name, metric_stubs in iteritems(self._metrics):\n if metric_name in exclude:\n continue\n for metric_stub in metric_stubs:\n metric_stub_name = backend_normalize_metric_name(metric_stub.name)\n actual_metric_type = AggregatorStub.METRIC_ENUM_MAP_REV[metric_stub.type]\n\n # We only check `*.count` metrics for histogram and historate submissions\n # Note: all Openmetrics histogram and summary metrics are actually separately submitted\n if check_submission_type and actual_metric_type in ['histogram', 'historate']:\n metric_stub_name += '.count'\n\n # Checking the metric is in `metadata.csv`\n if metric_stub_name not in metadata_metrics:\n errors.add(\"Expect `{}` to be in metadata.csv.\".format(metric_stub_name))\n continue\n\n expected_metric_type = metadata_metrics[metric_stub_name]['metric_type']\n if check_submission_type:\n # Integration tests type mapping\n actual_metric_type = AggregatorStub.METRIC_TYPE_SUBMISSION_TO_BACKEND_MAP[actual_metric_type]\n else:\n # E2E tests\n if actual_metric_type == 'monotonic_count' and expected_metric_type == 'count':\n actual_metric_type = 'count'\n\n if check_metric_type:\n if expected_metric_type != actual_metric_type:\n errors.add(\n \"Expect `{}` to have type `{}` but got `{}`.\".format(\n metric_stub_name, expected_metric_type, actual_metric_type\n )\n )\n\n assert not errors, \"Metadata assertion errors using metadata.csv:\" + \"\\n\\t- \".join([''] + sorted(errors))\n\n def assert_service_checks(self, service_checks):\n \"\"\"\n Assert service checks using service_checks.json\n\n Usage:\n\n from datadog_checks.dev.utils import get_service_checks\n aggregator.assert_service_checks(get_service_checks())\n\n \"\"\"\n\n errors = set()\n\n for service_check_name, service_check_stubs in iteritems(self._service_checks):\n for 
service_check_stub in service_check_stubs:\n # Checking the metric is in `service_checks.json`\n if service_check_name not in [sc['check'] for sc in service_checks]:\n errors.add(\"Expect `{}` to be in service_check.json.\".format(service_check_name))\n continue\n\n status_string = {value: key for key, value in iteritems(ServiceCheck._asdict())}[\n service_check_stub.status\n ].lower()\n service_check = [c for c in service_checks if c['check'] == service_check_name][0]\n\n if status_string not in service_check['statuses']:\n errors.add(\n \"Expect `{}` value to be in service_check.json for service check {}.\".format(\n status_string, service_check_stub.name\n )\n )\n\n assert not errors, \"Service checks assertion errors using service_checks.json:\" + \"\\n\\t- \".join(\n [''] + sorted(errors)\n )\n\n def assert_no_duplicate_all(self):\n \"\"\"\n Assert no duplicate metrics and service checks have been submitted.\n \"\"\"\n self.assert_no_duplicate_metrics()\n self.assert_no_duplicate_service_checks()\n\n def assert_no_duplicate_metrics(self):\n \"\"\"\n Assert no duplicate metrics have been submitted.\n\n Metrics are considered duplicate when all following fields match:\n\n - metric name\n - type (gauge, rate, etc)\n - tags\n - hostname\n \"\"\"\n # metric types that intended to be called multiple times are ignored\n ignored_types = [self.COUNT, self.COUNTER]\n metric_stubs = [m for metrics in self._metrics.values() for m in metrics if m.type not in ignored_types]\n\n def stub_to_key_fn(stub):\n return stub.name, stub.type, str(sorted(stub.tags)), stub.hostname\n\n self._assert_no_duplicate_stub('metric', metric_stubs, stub_to_key_fn)\n\n def assert_no_duplicate_service_checks(self):\n \"\"\"\n Assert no duplicate service checks have been submitted.\n\n Service checks are considered duplicate when all following fields match:\n - metric name\n - status\n - tags\n - hostname\n \"\"\"\n service_check_stubs = [m for metrics in self._service_checks.values() for m in 
metrics]\n\n def stub_to_key_fn(stub):\n return stub.name, stub.status, str(sorted(stub.tags)), stub.hostname\n\n self._assert_no_duplicate_stub('service_check', service_check_stubs, stub_to_key_fn)\n\n @staticmethod\n def _assert_no_duplicate_stub(stub_type, all_metrics, stub_to_key_fn):\n all_contexts = defaultdict(list)\n for metric in all_metrics:\n context = stub_to_key_fn(metric)\n all_contexts[context].append(metric)\n\n dup_contexts = defaultdict(list)\n for context, metrics in iteritems(all_contexts):\n if len(metrics) > 1:\n dup_contexts[context] = metrics\n\n err_msg_lines = [\"Duplicate {}s found:\".format(stub_type)]\n for key in sorted(dup_contexts):\n contexts = dup_contexts[key]\n err_msg_lines.append('- {}'.format(contexts[0].name))\n for metric in contexts:\n err_msg_lines.append(' ' + str(metric))\n\n assert len(dup_contexts) == 0, \"\\n\".join(err_msg_lines)\n\n def reset(self):\n \"\"\"\n Set the stub to its initial state\n \"\"\"\n self._metrics = defaultdict(list)\n self._asserted = set()\n self._service_checks = defaultdict(list)\n self._events = []\n # dict[event_type, [events]]\n self._event_platform_events = defaultdict(list)\n self._histogram_buckets = defaultdict(list)\n\n def all_metrics_asserted(self):\n assert self.metrics_asserted_pct >= 100.0\n\n def not_asserted(self):\n present_metrics = {ensure_unicode(m) for m in self._metrics}\n return present_metrics - set(self._asserted)\n\n def assert_metric_has_tag_prefix(self, metric_name, tag_prefix, count=None, at_least=1):\n candidates = []\n self._asserted.add(metric_name)\n\n for metric in self.metrics(metric_name):\n tags = metric.tags\n gtags = [t for t in tags if t.startswith(tag_prefix)]\n if len(gtags) > 0:\n candidates.append(metric)\n\n msg = \"Candidates size assertion for `{}`, count: {}, at_least: {}) failed\".format(metric_name, count, at_least)\n if count is not None:\n assert len(candidates) == count, msg\n else:\n assert len(candidates) >= at_least, msg\n\n @property\n 
def metrics_asserted_pct(self):\n \"\"\"\n Return the metrics assertion coverage\n \"\"\"\n num_metrics = len(self._metrics)\n num_asserted = len(self._asserted)\n\n if num_metrics == 0:\n if num_asserted == 0:\n return 100\n else:\n return 0\n\n # If it there have been assertions with at_least=0 the length of the num_metrics and num_asserted can match\n # even if there are different metrics in each set\n not_asserted = self.not_asserted()\n return (num_metrics - len(not_asserted)) / num_metrics * 100\n\n @property\n def metric_names(self):\n \"\"\"\n Return all the metric names we've seen so far\n \"\"\"\n return [ensure_unicode(name) for name in self._metrics.keys()]\n\n @property\n def service_check_names(self):\n \"\"\"\n Return all the service checks names seen so far\n \"\"\"\n return [ensure_unicode(name) for name in self._service_checks.keys()]\n
Source code in datadog_checks_base/datadog_checks/base/stubs/aggregator.py
def assert_metric(\n self,\n name,\n value=None,\n tags=None,\n count=None,\n at_least=1,\n hostname=None,\n metric_type=None,\n device=None,\n flush_first_value=None,\n):\n \"\"\"\n Assert a metric was processed by this stub\n \"\"\"\n\n self._asserted.add(name)\n expected_tags = normalize_tags(tags, sort=True)\n\n candidates = []\n for metric in self.metrics(name):\n if value is not None and not self.is_aggregate(metric.type) and value != metric.value:\n continue\n\n if expected_tags and expected_tags != sorted(metric.tags):\n continue\n\n if hostname is not None and hostname != metric.hostname:\n continue\n\n if metric_type is not None and metric_type != metric.type:\n continue\n\n if device is not None and device != metric.device:\n continue\n\n if flush_first_value is not None and flush_first_value != metric.flush_first_value:\n continue\n\n candidates.append(metric)\n\n expected_metric = MetricStub(name, metric_type, value, expected_tags, hostname, device, flush_first_value)\n\n if value is not None and candidates and all(self.is_aggregate(m.type) for m in candidates):\n got = sum(m.value for m in candidates)\n msg = \"Expected count value for '{}': {}, got {}\".format(name, value, got)\n condition = value == got\n elif count is not None:\n msg = \"Needed exactly {} candidates for '{}', got {}\".format(count, name, len(candidates))\n condition = len(candidates) == count\n else:\n msg = \"Needed at least {} candidates for '{}', got {}\".format(at_least, name, len(candidates))\n condition = len(candidates) >= at_least\n self._assert(condition, msg=msg, expected_stub=expected_metric, submitted_elements=self._metrics)\n
Source code in datadog_checks_base/datadog_checks/base/stubs/aggregator.py
def assert_metric_has_tag(self, metric_name, tag, count=None, at_least=1):\n \"\"\"\n Assert a metric is tagged with tag\n \"\"\"\n self._asserted.add(metric_name)\n\n candidates = []\n candidates_with_tag = []\n for metric in self.metrics(metric_name):\n candidates.append(metric)\n if tag in metric.tags:\n candidates_with_tag.append(metric)\n\n if candidates_with_tag: # The metric was found with the tag but not enough times\n msg = \"The metric '{}' with tag '{}' was only found {}/{} times\".format(metric_name, tag, count, at_least)\n elif candidates:\n msg = (\n \"The metric '{}' was found but not with the tag '{}'.\\n\".format(metric_name, tag)\n + \"Similar submitted:\\n\"\n + \"\\n\".join([\" {}\".format(m) for m in candidates])\n )\n else:\n expected_stub = MetricStub(metric_name, type=None, value=None, tags=[tag], hostname=None, device=None)\n msg = \"Metric '{}' not found\".format(metric_name)\n msg = \"{}\\n{}\".format(msg, build_similar_elements_msg(expected_stub, self._metrics))\n\n if count is not None:\n assert len(candidates_with_tag) == count, msg\n else:\n assert len(candidates_with_tag) >= at_least, msg\n
"},{"location":"base/api/#datadog_checks.base.stubs.aggregator.AggregatorStub.assert_metric_has_tag_prefix","title":"assert_metric_has_tag_prefix(metric_name, tag_prefix, count=None, at_least=1)","text":"Source code in datadog_checks_base/datadog_checks/base/stubs/aggregator.py
def assert_metric_has_tag_prefix(self, metric_name, tag_prefix, count=None, at_least=1):\n candidates = []\n self._asserted.add(metric_name)\n\n for metric in self.metrics(metric_name):\n tags = metric.tags\n gtags = [t for t in tags if t.startswith(tag_prefix)]\n if len(gtags) > 0:\n candidates.append(metric)\n\n msg = \"Candidates size assertion for `{}`, count: {}, at_least: {}) failed\".format(metric_name, count, at_least)\n if count is not None:\n assert len(candidates) == count, msg\n else:\n assert len(candidates) >= at_least, msg\n
Source code in datadog_checks_base/datadog_checks/base/stubs/aggregator.py
def assert_service_check(self, name, status=None, tags=None, count=None, at_least=1, hostname=None, message=None):\n \"\"\"\n Assert a service check was processed by this stub\n \"\"\"\n tags = normalize_tags(tags, sort=True)\n candidates = []\n for sc in self.service_checks(name):\n if status is not None and status != sc.status:\n continue\n\n if tags and tags != sorted(sc.tags):\n continue\n\n if hostname is not None and hostname != sc.hostname:\n continue\n\n if message is not None and message != sc.message:\n continue\n\n candidates.append(sc)\n\n expected_service_check = ServiceCheckStub(\n None, name=name, status=status, tags=tags, hostname=hostname, message=message\n )\n\n if count is not None:\n msg = \"Needed exactly {} candidates for '{}', got {}\".format(count, name, len(candidates))\n condition = len(candidates) == count\n else:\n msg = \"Needed at least {} candidates for '{}', got {}\".format(at_least, name, len(candidates))\n condition = len(candidates) >= at_least\n self._assert(\n condition=condition, msg=msg, expected_stub=expected_service_check, submitted_elements=self._service_checks\n )\n
"},{"location":"base/api/#datadog_checks.base.stubs.aggregator.AggregatorStub.assert_event","title":"assert_event(msg_text, count=None, at_least=1, exact_match=True, tags=None, **kwargs)","text":"Source code in datadog_checks_base/datadog_checks/base/stubs/aggregator.py
def assert_event(self, msg_text, count=None, at_least=1, exact_match=True, tags=None, **kwargs):\n candidates = []\n for e in self.events:\n if exact_match and msg_text != e['msg_text'] or msg_text not in e['msg_text']:\n continue\n if tags and set(tags) != set(e['tags']):\n continue\n for name, value in iteritems(kwargs):\n if e[name] != value:\n break\n else:\n candidates.append(e)\n\n msg = \"Candidates size assertion for `{}`, count: {}, at_least: {}) failed\".format(msg_text, count, at_least)\n if count is not None:\n assert len(candidates) == count, msg\n else:\n assert len(candidates) >= at_least, msg\n
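The matching condition in the listing relies on operator precedence (`and` binds tighter than `or`), which is easy to misread. The sketch below isolates that condition in a hypothetical helper, `event_matches`, written only for this illustration; it is not part of the stub.

```python
def event_matches(msg_text, event_text, exact_match=True):
    # Same boolean expression as in the listing: exact comparison by default,
    # substring containment when exact_match is False.
    if exact_match and msg_text != event_text or msg_text not in event_text:
        return False
    return True


exact_hit = event_matches('disk full', 'disk full')
exact_miss = event_matches('disk full', 'disk full on /dev/sda')
substring_hit = event_matches('disk full', 'disk full on /dev/sda', exact_match=False)
print(exact_hit, exact_miss, substring_hit)
```

With the default `exact_match=True`, a text that is only contained in the event message does not count as a match; passing `exact_match=False` relaxes the comparison to containment.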
Checking type: By default, the in-app metric type is asserted (check_submission_type=False), which makes sense for E2E tests (metrics collected from the Agent). For integration tests, you can check the submission type with check_submission_type=True, or use check_metric_type=False to skip type checks.
Usage:
from datadog_checks.dev.utils import get_metadata_metrics\naggregator.assert_metrics_using_metadata(get_metadata_metrics())\n
Source code in datadog_checks_base/datadog_checks/base/stubs/aggregator.py
def assert_metrics_using_metadata(\n self, metadata_metrics, check_metric_type=True, check_submission_type=False, exclude=None\n):\n \"\"\"\n Assert metrics using metadata.csv\n\n Checking type: By default we are asserting the in-app metric type (`check_submission_type=False`),\n asserting this type make sense for e2e (metrics collected from agent).\n For integrations tests, we can check the submission type with `check_submission_type=True`, or\n use `check_metric_type=False` not to check types.\n\n Usage:\n\n from datadog_checks.dev.utils import get_metadata_metrics\n aggregator.assert_metrics_using_metadata(get_metadata_metrics())\n\n \"\"\"\n\n exclude = exclude or []\n errors = set()\n for metric_name, metric_stubs in iteritems(self._metrics):\n if metric_name in exclude:\n continue\n for metric_stub in metric_stubs:\n metric_stub_name = backend_normalize_metric_name(metric_stub.name)\n actual_metric_type = AggregatorStub.METRIC_ENUM_MAP_REV[metric_stub.type]\n\n # We only check `*.count` metrics for histogram and historate submissions\n # Note: all Openmetrics histogram and summary metrics are actually separately submitted\n if check_submission_type and actual_metric_type in ['histogram', 'historate']:\n metric_stub_name += '.count'\n\n # Checking the metric is in `metadata.csv`\n if metric_stub_name not in metadata_metrics:\n errors.add(\"Expect `{}` to be in metadata.csv.\".format(metric_stub_name))\n continue\n\n expected_metric_type = metadata_metrics[metric_stub_name]['metric_type']\n if check_submission_type:\n # Integration tests type mapping\n actual_metric_type = AggregatorStub.METRIC_TYPE_SUBMISSION_TO_BACKEND_MAP[actual_metric_type]\n else:\n # E2E tests\n if actual_metric_type == 'monotonic_count' and expected_metric_type == 'count':\n actual_metric_type = 'count'\n\n if check_metric_type:\n if expected_metric_type != actual_metric_type:\n errors.add(\n \"Expect `{}` to have type `{}` but got `{}`.\".format(\n metric_stub_name, 
expected_metric_type, actual_metric_type\n )\n )\n\n assert not errors, \"Metadata assertion errors using metadata.csv:\" + \"\\n\\t- \".join([''] + sorted(errors))\n
"},{"location":"base/api/#datadog_checks.base.stubs.aggregator.AggregatorStub.assert_all_metrics_covered","title":"assert_all_metrics_covered()","text":"Source code in datadog_checks_base/datadog_checks/base/stubs/aggregator.py
def assert_all_metrics_covered(self):\n # use `condition` to avoid building the `msg` if not needed\n condition = self.metrics_asserted_pct >= 100.0\n msg = ''\n if not condition:\n prefix = '\\n\\t- '\n msg = 'Some metrics are collected but not asserted:'\n msg += '\\nAsserted Metrics:{}{}'.format(prefix, prefix.join(sorted(self._asserted)))\n msg += '\\nFound metrics that are not asserted:{}{}'.format(prefix, prefix.join(sorted(self.not_asserted())))\n assert condition, msg\n
Metrics are considered duplicates when all of the following fields match:
metric name
type (gauge, rate, etc)
tags
hostname
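The grouping rule above can be sketched as follows. `Metric` and `find_duplicates` are hypothetical names used only for this illustration; the actual method additionally skips count-like metric types, which this sketch leaves out.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for the records kept by the stub.
Metric = namedtuple('Metric', 'name type tags hostname')


def find_duplicates(metrics):
    # Group submissions by context; a context seen more than once is a duplicate.
    contexts = defaultdict(list)
    for m in metrics:
        key = (m.name, m.type, str(sorted(m.tags)), m.hostname)
        contexts[key].append(m)
    return {key: group for key, group in contexts.items() if len(group) > 1}


submitted = [
    Metric('app.latency', 'gauge', ['env:a', 'role:db'], 'host1'),
    Metric('app.latency', 'gauge', ['role:db', 'env:a'], 'host1'),  # same context, tags reordered
    Metric('app.latency', 'gauge', ['env:b'], 'host1'),
]
duplicates = find_duplicates(submitted)
print(len(duplicates))
```

Because tags are sorted before comparison, two submissions that differ only in tag order fall into the same context.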
Source code in datadog_checks_base/datadog_checks/base/stubs/aggregator.py
def assert_no_duplicate_metrics(self):\n \"\"\"\n Assert no duplicate metrics have been submitted.\n\n Metrics are considered duplicate when all following fields match:\n\n - metric name\n - type (gauge, rate, etc)\n - tags\n - hostname\n \"\"\"\n # metric types that intended to be called multiple times are ignored\n ignored_types = [self.COUNT, self.COUNTER]\n metric_stubs = [m for metrics in self._metrics.values() for m in metrics if m.type not in ignored_types]\n\n def stub_to_key_fn(stub):\n return stub.name, stub.type, str(sorted(stub.tags)), stub.hostname\n\n self._assert_no_duplicate_stub('metric', metric_stubs, stub_to_key_fn)\n
Assert no duplicate service checks have been submitted.
Service checks are considered duplicates when all of the following fields match:
service check name
status
tags
hostname
Source code in datadog_checks_base/datadog_checks/base/stubs/aggregator.py
def assert_no_duplicate_service_checks(self):\n \"\"\"\n Assert no duplicate service checks have been submitted.\n\n Service checks are considered duplicate when all following fields match:\n - metric name\n - status\n - tags\n - hostname\n \"\"\"\n service_check_stubs = [m for metrics in self._service_checks.values() for m in metrics]\n\n def stub_to_key_fn(stub):\n return stub.name, stub.status, str(sorted(stub.tags)), stub.hostname\n\n self._assert_no_duplicate_stub('service_check', service_check_stubs, stub_to_key_fn)\n
Assert no duplicate metrics and service checks have been submitted.
Source code in datadog_checks_base/datadog_checks/base/stubs/aggregator.py
def assert_no_duplicate_all(self):\n \"\"\"\n Assert no duplicate metrics and service checks have been submitted.\n \"\"\"\n self.assert_no_duplicate_metrics()\n self.assert_no_duplicate_service_checks()\n
"},{"location":"base/api/#datadog_checks.base.stubs.aggregator.AggregatorStub.all_metrics_asserted","title":"all_metrics_asserted()","text":"Source code in datadog_checks_base/datadog_checks/base/stubs/aggregator.py
This implements the methods defined by the Agent's C bindings which in turn call the Go backend.
It also provides utility methods for test assertions.
Source code in datadog_checks_base/datadog_checks/base/stubs/datadog_agent.py
class DatadogAgentStub(object):\n \"\"\"\n This implements the methods defined by the Agent's\n [C bindings](https://github.com/DataDog/datadog-agent/blob/master/rtloader/common/builtins/datadog_agent.c)\n which in turn call the\n [Go backend](https://github.com/DataDog/datadog-agent/blob/master/pkg/collector/python/datadog_agent.go).\n\n It also provides utility methods for test assertions.\n \"\"\"\n\n def __init__(self):\n self._sent_logs = defaultdict(list)\n self._metadata = {}\n self._cache = {}\n self._config = self.get_default_config()\n self._hostname = 'stubbed.hostname'\n self._process_start_time = 0\n self._external_tags = []\n self._host_tags = \"{}\"\n\n def get_default_config(self):\n return {'enable_metadata_collection': True, 'disable_unsafe_yaml': True}\n\n def reset(self):\n self._sent_logs.clear()\n self._metadata.clear()\n self._cache.clear()\n self._config = self.get_default_config()\n self._process_start_time = 0\n self._external_tags = []\n self._host_tags = \"{}\"\n\n def assert_logs(self, check_id, logs):\n sent_logs = self._sent_logs[check_id]\n assert sent_logs == logs, 'Expected {} logs for check {}, found {}. Submitted logs: {}'.format(\n len(logs), check_id, len(self._sent_logs[check_id]), repr(self._sent_logs)\n )\n\n def assert_metadata(self, check_id, data):\n actual = {}\n for name in data:\n key = (check_id, name)\n if key in self._metadata:\n actual[name] = self._metadata[key]\n assert data == actual\n\n def assert_metadata_count(self, count):\n metadata_items = len(self._metadata)\n assert metadata_items == count, 'Expected {} metadata items, found {}. 
Submitted metadata: {}'.format(\n count, metadata_items, repr(self._metadata)\n )\n\n def assert_external_tags(self, hostname, external_tags, match_tags_order=False):\n for h, tags in self._external_tags:\n if h == hostname:\n if not match_tags_order:\n external_tags = {k: sorted(v) for (k, v) in iteritems(external_tags)}\n tags = {k: sorted(v) for (k, v) in iteritems(tags)}\n\n assert (\n external_tags == tags\n ), 'Expected {} external tags for hostname {}, found {}. Submitted external tags: {}'.format(\n external_tags, hostname, tags, repr(self._external_tags)\n )\n return\n\n raise AssertionError('Hostname {} not found in external tags {}'.format(hostname, repr(self._external_tags)))\n\n def assert_external_tags_count(self, count):\n tags_count = len(self._external_tags)\n assert tags_count == count, 'Expected {} external tags items, found {}. Submitted external tags: {}'.format(\n count, tags_count, repr(self._external_tags)\n )\n\n def get_hostname(self):\n return self._hostname\n\n def set_hostname(self, hostname):\n self._hostname = hostname\n\n def reset_hostname(self):\n self._hostname = 'stubbed.hostname'\n\n def get_host_tags(self):\n return self._host_tags\n\n def _set_host_tags(self, tags_dict):\n self._host_tags = json.dumps(tags_dict)\n\n def _reset_host_tags(self):\n self._host_tags = \"{}\"\n\n def get_config(self, config_option):\n return self._config.get(config_option, '')\n\n def get_version(self):\n return '0.0.0'\n\n def log(self, *args, **kwargs):\n pass\n\n def set_check_metadata(self, check_id, name, value):\n self._metadata[(check_id, name)] = value\n\n def send_log(self, log_line, check_id):\n self._sent_logs[check_id].append(from_json(log_line))\n\n def set_external_tags(self, external_tags):\n self._external_tags = external_tags\n\n def tracemalloc_enabled(self, *args, **kwargs):\n return False\n\n def write_persistent_cache(self, key, value):\n self._cache[key] = value\n\n def read_persistent_cache(self, key):\n return 
self._cache.get(key, '')\n\n def obfuscate_sql(self, query, options=None):\n # Full obfuscation implementation is in go code.\n if options:\n # Options provided is a JSON string because the Go stub requires it, whereas\n # the python stub does not for things such as testing.\n if from_json(options).get('return_json_metadata', False):\n return to_json({'query': re.sub(r'\\s+', ' ', query or '').strip(), 'metadata': {}})\n return re.sub(r'\\s+', ' ', query or '').strip()\n\n def obfuscate_sql_exec_plan(self, plan, normalize=False):\n # Passthrough stub: obfuscation implementation is in Go code.\n return plan\n\n def get_process_start_time(self):\n return self._process_start_time\n\n def set_process_start_time(self, time):\n self._process_start_time = time\n\n def obfuscate_mongodb_string(self, command):\n # Passthrough stub: obfuscation implementation is in Go code.\n return command\n
"},{"location":"base/api/#datadog_checks.base.stubs.datadog_agent.DatadogAgentStub.assert_metadata","title":"assert_metadata(check_id, data)","text":"Source code in datadog_checks_base/datadog_checks/base/stubs/datadog_agent.py
def assert_metadata(self, check_id, data):\n actual = {}\n for name in data:\n key = (check_id, name)\n if key in self._metadata:\n actual[name] = self._metadata[key]\n assert data == actual\n
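As the listing shows, only the names present in `data` are looked up, so metadata submitted under other names does not affect the result. A minimal sketch of that comparison, using a hypothetical `metadata_matches` function and made-up check IDs and values:

```python
def metadata_matches(stored, check_id, data):
    # stored maps (check_id, name) -> value, like the stub's internal dict.
    actual = {}
    for name in data:
        key = (check_id, name)
        if key in stored:
            actual[name] = stored[key]
    return data == actual


stored = {
    ('check:1', 'version.major'): '3',
    ('check:1', 'version.minor'): '9',
}

subset_ok = metadata_matches(stored, 'check:1', {'version.major': '3'})
wrong_value = metadata_matches(stored, 'check:1', {'version.major': '4'})
missing_name = metadata_matches(stored, 'check:1', {'version.patch': '0'})
print(subset_ok, wrong_value, missing_name)
```

To verify that nothing extra was submitted, the count-based assertion on the same stub is the natural complement.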
"},{"location":"base/api/#datadog_checks.base.stubs.datadog_agent.DatadogAgentStub.assert_metadata_count","title":"assert_metadata_count(count)","text":"Source code in datadog_checks_base/datadog_checks/base/stubs/datadog_agent.py
"},{"location":"base/api/#datadog_checks.base.stubs.datadog_agent.DatadogAgentStub.reset","title":"reset()","text":"Source code in datadog_checks_base/datadog_checks/base/stubs/datadog_agent.py
This list enumerates what is collected from your system by each integration. For more information on metrics, see the Metric Types documentation. You can find the metrics for each integration in that integration's metadata.csv file. You can also set up custom metrics, so if the integration doesn\u2019t offer a metric out of the box, you can usually add it.
The gauge metric submission type represents a snapshot of events in one time interval. This representative snapshot value is the last value submitted to the Agent during a time interval. A gauge can be used to take a measure of something reporting continuously\u2014like the available disk space or memory used.
The count metric submission type represents the total number of event occurrences in one time interval. A count can be used to track the total number of connections made to a database or the total number of requests to an endpoint. This number of events can increase or decrease over time\u2014it is not monotonically increasing.
The rate metric submission type represents the total number of event occurrences per second in one time interval. A rate can be used to track how often something is happening\u2014like the frequency of connections made to a database or the flow of requests made to an endpoint.
The histogram metric submission type represents the statistical distribution of a set of values calculated Agent-side in one time interval. Datadog\u2019s histogram metric type is an extension of the StatsD timing metric type: the Agent aggregates the values that are sent in a defined time interval and produces different metrics which represent the set of values.
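The submission types differ mainly in how the values received during one interval are reduced. The sketch below is a simplified illustration, not the Agent's implementation: the function name `flush_interval`, the 10-second interval, and the choice of histogram aggregates (max, avg, count) are assumptions made for the example.

```python
# Simplified illustration only: the real aggregation happens inside the Agent.
def flush_interval(samples, interval_seconds):
    # samples: values submitted for one metric during one interval
    return {
        'gauge': samples[-1],                     # last value submitted
        'count': sum(samples),                    # total over the interval
        'rate': sum(samples) / interval_seconds,  # total per second
        'histogram': {                            # several derived values
            'max': max(samples),
            'avg': sum(samples) / len(samples),
            'count': len(samples),
        },
    }


result = flush_interval([2, 4, 6], interval_seconds=10)
print(result['gauge'], result['count'], result['rate'])
print(result['histogram'])
```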
Within every integration, you can specify the value of __NAMESPACE__:
from datadog_checks.base import AgentCheck\n\n\nclass AwesomeCheck(AgentCheck):\n __NAMESPACE__ = 'awesome'\n\n...\n
This is an optional addition, but it makes submissions easier since it prefixes every metric with the __NAMESPACE__ automatically. In this case it would prepend awesome. to each metric submitted to Datadog.
If you wish to ignore the namespace for any reason, you can pass the optional Boolean argument raw=True to an individual submission.
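A minimal sketch of how the namespace prefix and `raw=True` interact. `FakeCheck` is a hypothetical stand-in written for this example; it is not the real `AgentCheck`, and the metric names are made up.

```python
class FakeCheck:
    # Hypothetical stand-in for AgentCheck, for illustration only.
    __NAMESPACE__ = 'awesome'

    def __init__(self):
        self.submitted = []

    def gauge(self, name, value, raw=False):
        # Prefix the metric name unless the caller opts out with raw=True.
        if not raw and self.__NAMESPACE__:
            name = '{}.{}'.format(self.__NAMESPACE__, name)
        self.submitted.append((name, value))


check = FakeCheck()
check.gauge('requests', 5)                   # recorded as awesome.requests
check.gauge('system.uptime', 120, raw=True)  # recorded without the prefix
print(check.submitted)
```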
In the AgentCheck class, there is a useful property called check_initializations, which holds functions that are executed once before the first check run. You can add functions to check_initializations in the __init__ method of an integration. For example, you could use it to parse configuration information before running a check. Listed below is an example with Airflow:
class AirflowCheck(AgentCheck):\n def __init__(self, name, init_config, instances):\n super(AirflowCheck, self).__init__(name, init_config, instances)\n\n self._url = self.instance.get('url', '')\n self._tags = self.instance.get('tags', [])\n\n # The Agent only makes one attempt to instantiate each AgentCheck so any errors occurring\n # in `__init__` are logged just once, making it difficult to spot. Therefore,\n # potential configuration errors are emitted as part of the check run phase.\n # The configuration is only parsed once if it succeed, otherwise it's retried.\n self.check_initializations.append(self._parse_config)\n\n...\n
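The comment in the example says the configuration is parsed once if parsing succeeds and retried otherwise. The sketch below illustrates that behavior with a hypothetical stand-in class; `FakeCheck`, its `run` method, and `parse_config` are assumptions made for this example and do not reproduce the real `AgentCheck` internals.

```python
from collections import deque


class FakeCheck:
    # Hypothetical stand-in for AgentCheck, for illustration only.
    def __init__(self):
        self.check_initializations = deque()
        self.runs = 0

    def run(self):
        # Run pending initializations; one that raises stays queued for the next run.
        while self.check_initializations:
            initialization = self.check_initializations[0]
            initialization()
            self.check_initializations.popleft()
        self.runs += 1


attempts = []


def parse_config():
    attempts.append(1)
    if len(attempts) == 1:
        raise ValueError('bad config')


check = FakeCheck()
check.check_initializations.append(parse_config)

try:
    check.run()
except ValueError:
    pass  # first run fails, so the initialization is kept

check.run()  # retried, succeeds, and is removed from the queue
check.run()  # not called again
print(len(attempts), check.runs)
```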
This class accepts a single dict argument which is necessary to run the query. The representation is based on our custom_queries format originally designed and implemented in #1528.
It is now part of all our database integrations and other products have since adopted this format.
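For illustration, a `query_data` dict might look like the following. The field names come from the constructor docstring; the query text, column names, and the `missing_required_fields` helper are hypothetical and only mirror a few of the required-field checks that `compile` performs.

```python
# Hypothetical query definition; field names follow the documented format.
query_data = {
    'name': 'connection_stats',
    'query': 'SELECT db_name, active_connections FROM stats',
    'columns': [
        {'name': 'db', 'type': 'tag'},
        {'name': 'connections.active', 'type': 'gauge'},
    ],
    'tags': ['env:test'],
    'collection_interval': 60,
}


def missing_required_fields(data):
    # Illustrative helper, not part of datadog_checks: lists absent required fields.
    return [field for field in ('name', 'query', 'columns') if not data.get(field)]


complete = missing_required_fields(query_data)
incomplete = missing_required_fields({'name': 'incomplete'})
print(complete, incomplete)
```

Each entry in `columns` corresponds positionally to a column of the query result, which is why column order in the list matters.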
Source code in datadog_checks_base/datadog_checks/base/utils/db/query.py
class Query(object):\n \"\"\"\n This class accepts a single `dict` argument which is necessary to run the query. The representation\n is based on our `custom_queries` format originally designed and implemented in !1528.\n\n It is now part of all our database integrations and\n [other](https://cloud.google.com/solutions/sap/docs/sap-hana-monitoring-agent-planning-guide#defining_custom_queries)\n products have since adopted this format.\n \"\"\"\n\n def __init__(self, query_data):\n '''\n Parameters:\n query_data (Dict[str, Any]): The query data to run the query. It should contain the following fields:\n - name (str): The name of the query.\n - query (str): The query to run.\n - columns (List[Dict[str, Any]]): Each column should contain the following fields:\n - name (str): The name of the column.\n - type (str): The type of the column.\n - (Optional) Any other field that the column transformer for the type requires.\n - (Optional) extras (List[Dict[str, Any]]): Each extra transformer should contain the following fields:\n - name (str): The name of the extra transformer.\n - type (str): The type of the extra transformer.\n - (Optional) Any other field that the extra transformer for the type requires.\n - (Optional) tags (List[str]): The tags to add to the query result.\n - (Optional) collection_interval (int): The collection interval (in seconds) of the query.\n Note:\n If collection_interval is None, the query will be run every check run.\n If the collection interval is less than check collection interval,\n the query will be run every check run.\n If the collection interval is greater than check collection interval,\n the query will NOT BE RUN exactly at the collection interval.\n The query will be run at the next check run after the collection interval has passed.\n - (Optional) metric_prefix (str): The prefix to add to the metric name.\n Note: If the metric prefix is None, the default metric prefix `<INTEGRATION>.` will be used.\n '''\n # Contains the data to 
fill the rest of the attributes\n self.query_data = deepcopy(query_data or {}) # type: Dict[str, Any]\n self.name = None # type: str\n # The actual query\n self.query = None # type: str\n # Contains a mapping of column_name -> column_type, transformer\n self.column_transformers = None # type: Tuple[Tuple[str, Tuple[str, Transformer]]]\n # These transformers are used to collect extra metrics calculated from the query result\n self.extra_transformers = None # type: List[Tuple[str, Transformer]]\n # Contains the tags defined in query_data, more tags can be added later from the query result\n self.base_tags = None # type: List[str]\n # The collecton interval (in seconds) of the query. If None, the query will be run every check run.\n self.collection_interval = None # type: int\n # The last time the query was executed. If None, the query has never been executed.\n # This is only used when the collection_interval is not None.\n self.__last_execution_time = None # type: float\n # whether to ignore any defined namespace prefix. True when `metric_prefix` is defined.\n self.metric_name_raw = False # type: bool\n\n def compile(\n self,\n column_transformers, # type: Dict[str, TransformerFactory]\n extra_transformers, # type: Dict[str, TransformerFactory]\n ):\n # type: (...) 
-> None\n\n \"\"\"\n This idempotent method will be called by `QueryManager.compile_queries` so you\n should never need to call it directly.\n \"\"\"\n # Check for previous compilation\n if self.name is not None:\n return\n\n query_name = self.query_data.get('name')\n if not query_name:\n raise ValueError('query field `name` is required')\n elif not isinstance(query_name, str):\n raise ValueError('query field `name` must be a string')\n\n metric_prefix = self.query_data.get('metric_prefix')\n if metric_prefix is not None:\n if not isinstance(metric_prefix, str):\n raise ValueError('field `metric_prefix` for {} must be a string'.format(query_name))\n elif not metric_prefix:\n raise ValueError('field `metric_prefix` for {} must not be empty'.format(query_name))\n\n query = self.query_data.get('query')\n if not query:\n raise ValueError('field `query` for {} is required'.format(query_name))\n elif query_name.startswith('custom query #') and not isinstance(query, str):\n raise ValueError('field `query` for {} must be a string'.format(query_name))\n\n columns = self.query_data.get('columns')\n if not columns:\n raise ValueError('field `columns` for {} is required'.format(query_name))\n elif not isinstance(columns, list):\n raise ValueError('field `columns` for {} must be a list'.format(query_name))\n\n tags = self.query_data.get('tags', [])\n if tags is not None and not isinstance(tags, list):\n raise ValueError('field `tags` for {} must be a list'.format(query_name))\n\n # Keep track of all defined names\n sources = {}\n\n column_data = []\n for i, column in enumerate(columns, 1):\n # Columns can be ignored via configuration.\n if not column:\n column_data.append((None, None))\n continue\n elif not isinstance(column, dict):\n raise ValueError('column #{} of {} is not a mapping'.format(i, query_name))\n\n column_name = column.get('name')\n if not column_name:\n raise ValueError('field `name` for column #{} of {} is required'.format(i, query_name))\n elif not 
isinstance(column_name, str):\n raise ValueError('field `name` for column #{} of {} must be a string'.format(i, query_name))\n elif column_name in sources:\n raise ValueError(\n 'the name {} of {} was already defined in {} #{}'.format(\n column_name, query_name, sources[column_name]['type'], sources[column_name]['index']\n )\n )\n\n sources[column_name] = {'type': 'column', 'index': i}\n\n column_type = column.get('type')\n if not column_type:\n raise ValueError('field `type` for column {} of {} is required'.format(column_name, query_name))\n elif not isinstance(column_type, str):\n raise ValueError('field `type` for column {} of {} must be a string'.format(column_name, query_name))\n elif column_type == 'source':\n column_data.append((column_name, (None, None)))\n continue\n elif column_type not in column_transformers:\n raise ValueError('unknown type `{}` for column {} of {}'.format(column_type, column_name, query_name))\n\n __column_type_is_tag = column_type in ('tag', 'tag_list', 'tag_not_null')\n modifiers = {key: value for key, value in column.items() if key not in ('name', 'type')}\n\n try:\n if not __column_type_is_tag and metric_prefix:\n # if metric_prefix is defined, we prepend it to the column name\n column_name = \"{}.{}\".format(metric_prefix, column_name)\n transformer = column_transformers[column_type](column_transformers, column_name, **modifiers)\n except Exception as e:\n error = 'error compiling type `{}` for column {} of {}: {}'.format(\n column_type, column_name, query_name, e\n )\n\n # Prepend helpful error text.\n #\n # When an exception is raised in the context of another one, both will be printed. To avoid\n # this we set the context to None. https://www.python.org/dev/peps/pep-0409/\n raise_from(type(e)(error), None)\n else:\n if __column_type_is_tag:\n column_data.append((column_name, (column_type, transformer)))\n else:\n # All these would actually submit data. 
As that is the default case, we represent it as\n # a reference to None since if we use e.g. `value` it would never be checked anyway.\n column_data.append((column_name, (None, transformer)))\n\n submission_transformers = column_transformers.copy() # type: Dict[str, Transformer]\n submission_transformers.pop('tag')\n submission_transformers.pop('tag_list')\n submission_transformers.pop('tag_not_null')\n\n extras = self.query_data.get('extras', []) # type: List[Dict[str, Any]]\n if not isinstance(extras, list):\n raise ValueError('field `extras` for {} must be a list'.format(query_name))\n\n extra_data = [] # type: List[Tuple[str, Transformer]]\n for i, extra in enumerate(extras, 1):\n if not isinstance(extra, dict):\n raise ValueError('extra #{} of {} is not a mapping'.format(i, query_name))\n\n extra_type = extra.get('type') # type: str\n extra_name = extra.get('name') # type: str\n if extra_type == 'log':\n # The name is unused\n extra_name = 'log'\n elif not extra_name:\n raise ValueError('field `name` for extra #{} of {} is required'.format(i, query_name))\n elif not isinstance(extra_name, str):\n raise ValueError('field `name` for extra #{} of {} must be a string'.format(i, query_name))\n elif extra_name in sources:\n raise ValueError(\n 'the name {} of {} was already defined in {} #{}'.format(\n extra_name, query_name, sources[extra_name]['type'], sources[extra_name]['index']\n )\n )\n\n sources[extra_name] = {'type': 'extra', 'index': i}\n\n if not extra_type:\n if 'expression' in extra:\n extra_type = 'expression'\n else:\n raise ValueError('field `type` for extra {} of {} is required'.format(extra_name, query_name))\n elif not isinstance(extra_type, str):\n raise ValueError('field `type` for extra {} of {} must be a string'.format(extra_name, query_name))\n elif extra_type not in extra_transformers and extra_type not in submission_transformers:\n raise ValueError('unknown type `{}` for extra {} of {}'.format(extra_type, extra_name, query_name))\n\n 
transformer_factory = extra_transformers.get(\n extra_type, submission_transformers.get(extra_type)\n ) # type: TransformerFactory\n\n extra_source = extra.get('source')\n if extra_type in submission_transformers:\n if not extra_source:\n raise ValueError('field `source` for extra {} of {} is required'.format(extra_name, query_name))\n\n modifiers = {key: value for key, value in extra.items() if key not in ('name', 'type', 'source')}\n else:\n modifiers = {key: value for key, value in extra.items() if key not in ('name', 'type')}\n modifiers['sources'] = sources\n\n try:\n transformer = transformer_factory(submission_transformers, extra_name, **modifiers)\n except Exception as e:\n error = 'error compiling type `{}` for extra {} of {}: {}'.format(extra_type, extra_name, query_name, e)\n\n raise_from(type(e)(error), None)\n else:\n if extra_type in submission_transformers:\n transformer = create_extra_transformer(transformer, extra_source)\n\n extra_data.append((extra_name, transformer))\n\n collection_interval = self.query_data.get('collection_interval')\n if collection_interval is not None:\n if not isinstance(collection_interval, (int, float)):\n raise ValueError('field `collection_interval` for {} must be a number'.format(query_name))\n elif int(collection_interval) <= 0:\n raise ValueError(\n 'field `collection_interval` for {} must be a positive number after rounding'.format(query_name)\n )\n collection_interval = int(collection_interval)\n\n self.name = query_name\n self.query = query\n self.column_transformers = tuple(column_data)\n self.extra_transformers = tuple(extra_data)\n self.base_tags = tags\n self.collection_interval = collection_interval\n self.metric_name_raw = metric_prefix is not None\n del self.query_data\n\n def should_execute(self):\n '''\n Check if the query should be executed based on the collection interval.\n\n :return: True if the query should be executed, False otherwise.\n '''\n if self.collection_interval is None:\n # if the 
collection interval is None, the query should always be executed.\n return True\n\n now = get_timestamp()\n if self.__last_execution_time is None or now - self.__last_execution_time >= self.collection_interval:\n # if the last execution time is None (the query has never been executed),\n # if the time since the last execution is greater than or equal to the collection interval,\n # the query should be executed.\n self.__last_execution_time = now\n return True\n\n return False\n
Parameters:
query_data (Dict[str, Any], required): The query data to run the query. It should contain the following fields:
- name (str): The name of the query.
- query (str): The query to run.
- columns (List[Dict[str, Any]]): Each column should contain the following fields:
  - name (str): The name of the column.
  - type (str): The type of the column.
  - (Optional) Any other field that the column transformer for the type requires.
- (Optional) extras (List[Dict[str, Any]]): Each extra transformer should contain the following fields:
  - name (str): The name of the extra transformer.
  - type (str): The type of the extra transformer.
  - (Optional) Any other field that the extra transformer for the type requires.
- (Optional) tags (List[str]): The tags to add to the query result.
- (Optional) collection_interval (int): The collection interval (in seconds) of the query.
  Note: If collection_interval is None, the query will be run every check run. If the collection interval is less than check collection interval, the query will be run every check run. If the collection interval is greater than check collection interval, the query will NOT BE RUN exactly at the collection interval. The query will be run at the next check run after the collection interval has passed.
- (Optional) metric_prefix (str): The prefix to add to the metric name.
  Note: If the metric prefix is None, the default metric prefix <INTEGRATION>. will be used.
Source code in datadog_checks_base/datadog_checks/base/utils/db/query.py
def __init__(self, query_data):\n '''\n Parameters:\n query_data (Dict[str, Any]): The query data to run the query. It should contain the following fields:\n - name (str): The name of the query.\n - query (str): The query to run.\n - columns (List[Dict[str, Any]]): Each column should contain the following fields:\n - name (str): The name of the column.\n - type (str): The type of the column.\n - (Optional) Any other field that the column transformer for the type requires.\n - (Optional) extras (List[Dict[str, Any]]): Each extra transformer should contain the following fields:\n - name (str): The name of the extra transformer.\n - type (str): The type of the extra transformer.\n - (Optional) Any other field that the extra transformer for the type requires.\n - (Optional) tags (List[str]): The tags to add to the query result.\n - (Optional) collection_interval (int): The collection interval (in seconds) of the query.\n Note:\n If collection_interval is None, the query will be run every check run.\n If the collection interval is less than check collection interval,\n the query will be run every check run.\n If the collection interval is greater than check collection interval,\n the query will NOT BE RUN exactly at the collection interval.\n The query will be run at the next check run after the collection interval has passed.\n - (Optional) metric_prefix (str): The prefix to add to the metric name.\n Note: If the metric prefix is None, the default metric prefix `<INTEGRATION>.` will be used.\n '''\n # Contains the data to fill the rest of the attributes\n self.query_data = deepcopy(query_data or {}) # type: Dict[str, Any]\n self.name = None # type: str\n # The actual query\n self.query = None # type: str\n # Contains a mapping of column_name -> column_type, transformer\n self.column_transformers = None # type: Tuple[Tuple[str, Tuple[str, Transformer]]]\n # These transformers are used to collect extra metrics calculated from the query result\n self.extra_transformers = 
None # type: List[Tuple[str, Transformer]]\n # Contains the tags defined in query_data, more tags can be added later from the query result\n self.base_tags = None # type: List[str]\n # The collection interval (in seconds) of the query. If None, the query will be run every check run.\n self.collection_interval = None # type: int\n # The last time the query was executed. If None, the query has never been executed.\n # This is only used when the collection_interval is not None.\n self.__last_execution_time = None # type: float\n # whether to ignore any defined namespace prefix. True when `metric_prefix` is defined.\n self.metric_name_raw = False # type: bool\n
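As a rough illustration of the query_data fields and the collection_interval note above, here is a minimal sketch. The query name, SQL text, column names, and the is_due helper are all hypothetical and are not part of the library; the helper only mimics the documented behavior that a query runs at the first check run after its interval has passed.

```python
# Hypothetical query definition; the names and SQL are illustrative only.
query_data = {
    'name': 'example_stats',
    'query': 'SELECT db_name, connections FROM example_stats_table',
    'columns': [
        {'name': 'db', 'type': 'tag'},
        {'name': 'example.connections', 'type': 'gauge'},
    ],
    'tags': ['env:dev'],
    'collection_interval': 40,
}


def is_due(last_run, now, interval):
    # Hypothetical helper: a query with no interval, or one that has never
    # run, is always due. Otherwise it is due once the interval has elapsed.
    if interval is None or last_run is None:
        return True
    return now - last_run >= interval


# Simulate check runs every 15 seconds against the 40 second interval.
runs = []
last_run = None
for now in range(0, 120, 15):
    if is_due(last_run, now, query_data['collection_interval']):
        runs.append(now)
        last_run = now
```

In this simulation the query runs at the first check run at or after each 40 second mark, not exactly every 40 seconds, which is the behavior the note describes.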
This idempotent method will be called by QueryManager.compile_queries so you should never need to call it directly.
Source code in datadog_checks_base/datadog_checks/base/utils/db/query.py
def compile(\n self,\n column_transformers, # type: Dict[str, TransformerFactory]\n extra_transformers, # type: Dict[str, TransformerFactory]\n):\n # type: (...) -> None\n\n \"\"\"\n This idempotent method will be called by `QueryManager.compile_queries` so you\n should never need to call it directly.\n \"\"\"\n # Check for previous compilation\n if self.name is not None:\n return\n\n query_name = self.query_data.get('name')\n if not query_name:\n raise ValueError('query field `name` is required')\n elif not isinstance(query_name, str):\n raise ValueError('query field `name` must be a string')\n\n metric_prefix = self.query_data.get('metric_prefix')\n if metric_prefix is not None:\n if not isinstance(metric_prefix, str):\n raise ValueError('field `metric_prefix` for {} must be a string'.format(query_name))\n elif not metric_prefix:\n raise ValueError('field `metric_prefix` for {} must not be empty'.format(query_name))\n\n query = self.query_data.get('query')\n if not query:\n raise ValueError('field `query` for {} is required'.format(query_name))\n elif query_name.startswith('custom query #') and not isinstance(query, str):\n raise ValueError('field `query` for {} must be a string'.format(query_name))\n\n columns = self.query_data.get('columns')\n if not columns:\n raise ValueError('field `columns` for {} is required'.format(query_name))\n elif not isinstance(columns, list):\n raise ValueError('field `columns` for {} must be a list'.format(query_name))\n\n tags = self.query_data.get('tags', [])\n if tags is not None and not isinstance(tags, list):\n raise ValueError('field `tags` for {} must be a list'.format(query_name))\n\n # Keep track of all defined names\n sources = {}\n\n column_data = []\n for i, column in enumerate(columns, 1):\n # Columns can be ignored via configuration.\n if not column:\n column_data.append((None, None))\n continue\n elif not isinstance(column, dict):\n raise ValueError('column #{} of {} is not a mapping'.format(i, query_name))\n\n 
column_name = column.get('name')\n if not column_name:\n raise ValueError('field `name` for column #{} of {} is required'.format(i, query_name))\n elif not isinstance(column_name, str):\n raise ValueError('field `name` for column #{} of {} must be a string'.format(i, query_name))\n elif column_name in sources:\n raise ValueError(\n 'the name {} of {} was already defined in {} #{}'.format(\n column_name, query_name, sources[column_name]['type'], sources[column_name]['index']\n )\n )\n\n sources[column_name] = {'type': 'column', 'index': i}\n\n column_type = column.get('type')\n if not column_type:\n raise ValueError('field `type` for column {} of {} is required'.format(column_name, query_name))\n elif not isinstance(column_type, str):\n raise ValueError('field `type` for column {} of {} must be a string'.format(column_name, query_name))\n elif column_type == 'source':\n column_data.append((column_name, (None, None)))\n continue\n elif column_type not in column_transformers:\n raise ValueError('unknown type `{}` for column {} of {}'.format(column_type, column_name, query_name))\n\n __column_type_is_tag = column_type in ('tag', 'tag_list', 'tag_not_null')\n modifiers = {key: value for key, value in column.items() if key not in ('name', 'type')}\n\n try:\n if not __column_type_is_tag and metric_prefix:\n # if metric_prefix is defined, we prepend it to the column name\n column_name = \"{}.{}\".format(metric_prefix, column_name)\n transformer = column_transformers[column_type](column_transformers, column_name, **modifiers)\n except Exception as e:\n error = 'error compiling type `{}` for column {} of {}: {}'.format(\n column_type, column_name, query_name, e\n )\n\n # Prepend helpful error text.\n #\n # When an exception is raised in the context of another one, both will be printed. To avoid\n # this we set the context to None. 
https://www.python.org/dev/peps/pep-0409/\n raise_from(type(e)(error), None)\n else:\n if __column_type_is_tag:\n column_data.append((column_name, (column_type, transformer)))\n else:\n # All these would actually submit data. As that is the default case, we represent it as\n # a reference to None since if we use e.g. `value` it would never be checked anyway.\n column_data.append((column_name, (None, transformer)))\n\n submission_transformers = column_transformers.copy() # type: Dict[str, Transformer]\n submission_transformers.pop('tag')\n submission_transformers.pop('tag_list')\n submission_transformers.pop('tag_not_null')\n\n extras = self.query_data.get('extras', []) # type: List[Dict[str, Any]]\n if not isinstance(extras, list):\n raise ValueError('field `extras` for {} must be a list'.format(query_name))\n\n extra_data = [] # type: List[Tuple[str, Transformer]]\n for i, extra in enumerate(extras, 1):\n if not isinstance(extra, dict):\n raise ValueError('extra #{} of {} is not a mapping'.format(i, query_name))\n\n extra_type = extra.get('type') # type: str\n extra_name = extra.get('name') # type: str\n if extra_type == 'log':\n # The name is unused\n extra_name = 'log'\n elif not extra_name:\n raise ValueError('field `name` for extra #{} of {} is required'.format(i, query_name))\n elif not isinstance(extra_name, str):\n raise ValueError('field `name` for extra #{} of {} must be a string'.format(i, query_name))\n elif extra_name in sources:\n raise ValueError(\n 'the name {} of {} was already defined in {} #{}'.format(\n extra_name, query_name, sources[extra_name]['type'], sources[extra_name]['index']\n )\n )\n\n sources[extra_name] = {'type': 'extra', 'index': i}\n\n if not extra_type:\n if 'expression' in extra:\n extra_type = 'expression'\n else:\n raise ValueError('field `type` for extra {} of {} is required'.format(extra_name, query_name))\n elif not isinstance(extra_type, str):\n raise ValueError('field `type` for extra {} of {} must be a 
string'.format(extra_name, query_name))\n elif extra_type not in extra_transformers and extra_type not in submission_transformers:\n raise ValueError('unknown type `{}` for extra {} of {}'.format(extra_type, extra_name, query_name))\n\n transformer_factory = extra_transformers.get(\n extra_type, submission_transformers.get(extra_type)\n ) # type: TransformerFactory\n\n extra_source = extra.get('source')\n if extra_type in submission_transformers:\n if not extra_source:\n raise ValueError('field `source` for extra {} of {} is required'.format(extra_name, query_name))\n\n modifiers = {key: value for key, value in extra.items() if key not in ('name', 'type', 'source')}\n else:\n modifiers = {key: value for key, value in extra.items() if key not in ('name', 'type')}\n modifiers['sources'] = sources\n\n try:\n transformer = transformer_factory(submission_transformers, extra_name, **modifiers)\n except Exception as e:\n error = 'error compiling type `{}` for extra {} of {}: {}'.format(extra_type, extra_name, query_name, e)\n\n raise_from(type(e)(error), None)\n else:\n if extra_type in submission_transformers:\n transformer = create_extra_transformer(transformer, extra_source)\n\n extra_data.append((extra_name, transformer))\n\n collection_interval = self.query_data.get('collection_interval')\n if collection_interval is not None:\n if not isinstance(collection_interval, (int, float)):\n raise ValueError('field `collection_interval` for {} must be a number'.format(query_name))\n elif int(collection_interval) <= 0:\n raise ValueError(\n 'field `collection_interval` for {} must be a positive number after rounding'.format(query_name)\n )\n collection_interval = int(collection_interval)\n\n self.name = query_name\n self.query = query\n self.column_transformers = tuple(column_data)\n self.extra_transformers = tuple(extra_data)\n self.base_tags = tags\n self.collection_interval = collection_interval\n self.metric_name_raw = metric_prefix is not None\n del self.query_data\n
Note: This class is not in charge of opening or closing connections, just running queries.
Source code in datadog_checks_base/datadog_checks/base/utils/db/core.py
class QueryManager(QueryExecutor):\n \"\"\"\n This class is in charge of running any number of `Query` instances for a single Check instance.\n\n You will most often see it created during Check initialization like this:\n\n ```python\n self._query_manager = QueryManager(\n self,\n self.execute_query,\n queries=[\n queries.SomeQuery1,\n queries.SomeQuery2,\n queries.SomeQuery3,\n queries.SomeQuery4,\n queries.SomeQuery5,\n ],\n tags=self.instance.get('tags', []),\n error_handler=self._error_sanitizer,\n )\n self.check_initializations.append(self._query_manager.compile_queries)\n ```\n\n Note: This class is not in charge of opening or closing connections, just running queries.\n \"\"\"\n\n def __init__(\n self,\n check, # type: AgentCheck\n executor, # type: QueriesExecutor\n queries=None, # type: List[Dict[str, Any]]\n tags=None, # type: List[str]\n error_handler=None, # type: Callable[[str], str]\n hostname=None, # type: str\n ): # type: (...) -> QueryManager\n \"\"\"\n - **check** (_AgentCheck_) - an instance of a Check\n - **executor** (_callable_) - a callable accepting a `str` query as its sole argument and returning\n a sequence representing either the full result set or an iterator over the result set\n - **queries** (_List[Dict]_) - a list of queries in dict format\n - **tags** (_List[str]_) - a list of tags to associate with every submission\n - **error_handler** (_callable_) - a callable accepting a `str` error as its sole argument and returning\n a sanitized string, useful for scrubbing potentially sensitive information libraries emit\n \"\"\"\n super(QueryManager, self).__init__(\n executor=executor,\n submitter=check,\n queries=queries,\n tags=tags,\n error_handler=error_handler,\n hostname=hostname,\n logger=check.log,\n )\n self.check = check # type: AgentCheck\n\n only_custom_queries = is_affirmative(self.check.instance.get('only_custom_queries', False)) # type: bool\n custom_queries = list(self.check.instance.get('custom_queries', [])) # type: 
List[str]\n use_global_custom_queries = self.check.instance.get('use_global_custom_queries', True) # type: str\n\n # Handle overrides\n if use_global_custom_queries == 'extend':\n custom_queries.extend(self.check.init_config.get('global_custom_queries', []))\n elif (\n not custom_queries\n and 'global_custom_queries' in self.check.init_config\n and is_affirmative(use_global_custom_queries)\n ):\n custom_queries = self.check.init_config.get('global_custom_queries', [])\n\n # Override statement queries if only running custom queries\n if only_custom_queries:\n self.queries = []\n\n # Deduplicate\n for i, custom_query in enumerate(iter_unique(custom_queries), 1):\n query = Query(custom_query)\n query.query_data.setdefault('name', 'custom query #{}'.format(i))\n self.queries.append(query)\n\n if len(self.queries) == 0:\n self.logger.warning('QueryManager initialized with no query')\n\n def execute(self, extra_tags=None):\n # This needs to stay here b/c when we construct a QueryManager in a check's __init__\n # there is no check ID at that point\n self.logger = self.check.log\n\n return super(QueryManager, self).execute(extra_tags)\n
executor (callable) - a callable accepting a str query as its sole argument and returning a sequence representing either the full result set or an iterator over the result set
queries (List[Dict]) - a list of queries in dict format
tags (List[str]) - a list of tags to associate with every submission
error_handler (callable) - a callable accepting a str error as its sole argument and returning a sanitized string, useful for scrubbing potentially sensitive information libraries emit
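The executor and error_handler contracts described above can be sketched as follows. This is only an illustration using the standard library sqlite3 module in place of a real integration's database client; the table, the data, and the secret being scrubbed are made up for the example.

```python
import sqlite3

# In-memory database standing in for whatever connection a check manages.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE stats (metric TEXT, value INTEGER)')
conn.executemany('INSERT INTO stats VALUES (?, ?)', [('foo', 4), ('bar', 2)])


def execute_query(query):
    # Executor: accepts a str query as its sole argument and returns
    # the full result set as a sequence of rows.
    cursor = conn.cursor()
    try:
        cursor.execute(query)
        return cursor.fetchall()
    finally:
        cursor.close()


def sanitize_error(error):
    # Error handler: accepts a str error and returns a sanitized string.
    # The secret below is a placeholder for the example.
    return error.replace('hunter2', '********')


rows = execute_query('SELECT metric, value FROM stats ORDER BY metric')
message = sanitize_error('login failed for password hunter2')
```

Callables shaped like these are what would be passed as the executor and error_handler arguments. Opening and closing the connection stays the responsibility of the check.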
Source code in datadog_checks_base/datadog_checks/base/utils/db/core.py
def __init__(\n self,\n check, # type: AgentCheck\n executor, # type: QueriesExecutor\n queries=None, # type: List[Dict[str, Any]]\n tags=None, # type: List[str]\n error_handler=None, # type: Callable[[str], str]\n hostname=None, # type: str\n): # type: (...) -> QueryManager\n \"\"\"\n - **check** (_AgentCheck_) - an instance of a Check\n - **executor** (_callable_) - a callable accepting a `str` query as its sole argument and returning\n a sequence representing either the full result set or an iterator over the result set\n - **queries** (_List[Dict]_) - a list of queries in dict format\n - **tags** (_List[str]_) - a list of tags to associate with every submission\n - **error_handler** (_callable_) - a callable accepting a `str` error as its sole argument and returning\n a sanitized string, useful for scrubbing potentially sensitive information libraries emit\n \"\"\"\n super(QueryManager, self).__init__(\n executor=executor,\n submitter=check,\n queries=queries,\n tags=tags,\n error_handler=error_handler,\n hostname=hostname,\n logger=check.log,\n )\n self.check = check # type: AgentCheck\n\n only_custom_queries = is_affirmative(self.check.instance.get('only_custom_queries', False)) # type: bool\n custom_queries = list(self.check.instance.get('custom_queries', [])) # type: List[str]\n use_global_custom_queries = self.check.instance.get('use_global_custom_queries', True) # type: str\n\n # Handle overrides\n if use_global_custom_queries == 'extend':\n custom_queries.extend(self.check.init_config.get('global_custom_queries', []))\n elif (\n not custom_queries\n and 'global_custom_queries' in self.check.init_config\n and is_affirmative(use_global_custom_queries)\n ):\n custom_queries = self.check.init_config.get('global_custom_queries', [])\n\n # Override statement queries if only running custom queries\n if only_custom_queries:\n self.queries = []\n\n # Deduplicate\n for i, custom_query in enumerate(iter_unique(custom_queries), 1):\n query = Query(custom_query)\n 
query.query_data.setdefault('name', 'custom query #{}'.format(i))\n self.queries.append(query)\n\n if len(self.queries) == 0:\n self.logger.warning('QueryManager initialized with no query')\n
"},{"location":"base/databases/#datadog_checks.base.utils.db.core.QueryManager.execute","title":"execute(extra_tags=None)","text":"Source code in datadog_checks_base/datadog_checks/base/utils/db/core.py
def execute(self, extra_tags=None):\n # This needs to stay here b/c when we construct a QueryManager in a check's __init__\n # there is no check ID at that point\n self.logger = self.check.log\n\n return super(QueryManager, self).execute(extra_tags)\n
For example, say you want to collect the fields named foo and bar. Typically, they would be stored like:
| foo | bar |
| --- | --- |
| 4 | 2 |
and would be queried like:
SELECT foo, bar FROM ...\n
Often, you will instead find data stored in the following format:
| metric | value |
| ------ | ----- |
| foo | 4 |
| bar | 2 |
and would be queried like:
SELECT metric, value FROM ...\n
In this case, the metric column stores the name to match on and its value is stored in a separate column.
The required items modifier is a mapping of matched names to column data values. Treat each value exactly like an entry in the top-level columns field. You must also define a source modifier, either on this transformer itself or in the values of items (which take precedence). The source will be treated as the value of the match.
bar - test.bar.total as a gauge and test.bar.count as a monotonic_count, both with a value of 5
baz - nothing since it was not defined as a match item
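The dispatch behind the match transformer can be sketched in plain Python. This is a simplified stand-in and not the library code: make_submitter is a hypothetical helper that records submissions in a list, and both items are treated as plain gauges, so the total and count split of monotonic_gauge is not reproduced.

```python
submitted = []


def make_submitter(metric_name, metric_type):
    # Hypothetical transformer factory: records what would be submitted.
    def submit(sources, value):
        submitted.append((metric_type, metric_name, value))

    return submit


# Each matched name maps to (source column, transformer). The item-level
# source for foo takes precedence over the transformer-level one (value1).
compiled_items = {
    'foo': ('value2', make_submitter('test.foo', 'gauge')),
    'bar': ('value1', make_submitter('test.bar', 'gauge')),
}


def match(sources, value):
    # Only names defined as match items produce a submission.
    if value in compiled_items:
        source, transformer = compiled_items[value]
        transformer(sources, sources[source])


rows = [(1, 2, 'foo'), (3, 4, 'baz'), (5, 6, 'bar')]
for source1, source2, metric in rows:
    match({'value1': source1, 'value2': source2}, metric)
```

The baz row is skipped silently because it has no entry in the items mapping.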
Source code in datadog_checks_base/datadog_checks/base/utils/db/transform.py
def get_match(transformers, column_name, **modifiers):\n # type: (Dict[str, Transformer], str, Any) -> Transformer\n \"\"\"\n This is used for querying unstructured data.\n\n For example, say you want to collect the fields named `foo` and `bar`. Typically, they would be stored like:\n\n | foo | bar |\n | --- | --- |\n | 4 | 2 |\n\n and would be queried like:\n\n ```sql\n SELECT foo, bar FROM ...\n ```\n\n Often, you will instead find data stored in the following format:\n\n | metric | value |\n | ------ | ----- |\n | foo | 4 |\n | bar | 2 |\n\n and would be queried like:\n\n ```sql\n SELECT metric, value FROM ...\n ```\n\n In this case, the `metric` column stores the name with which to match on and its `value` is\n stored in a separate column.\n\n The required `items` modifier is a mapping of matched names to column data values. Consider the values\n to be exactly the same as the entries in the `columns` top level field. You must also define a `source`\n modifier either for this transformer itself or in the values of `items` (which will take precedence).\n The source will be treated as the value of the match.\n\n Say this is your configuration:\n\n ```yaml\n query: SELECT source1, source2, metric FROM TABLE\n columns:\n - name: value1\n type: source\n - name: value2\n type: source\n - name: metric_name\n type: match\n source: value1\n items:\n foo:\n name: test.foo\n type: gauge\n source: value2\n bar:\n name: test.bar\n type: monotonic_gauge\n ```\n\n and the result set is:\n\n | source1 | source2 | metric |\n | ------- | ------- | ------ |\n | 1 | 2 | foo |\n | 3 | 4 | baz |\n | 5 | 6 | bar |\n\n Here's what would be submitted:\n\n - `foo` - `test.foo` as a `gauge` with a value of `2`\n - `bar` - `test.bar.total` as a `gauge` and `test.bar.count` as a `monotonic_count`, both with a value of `5`\n - `baz` - nothing since it was not defined as a match item\n \"\"\"\n # Do work in a separate function to avoid having to `del` a bunch of variables\n compiled_items = 
_compile_match_items(transformers, modifiers) # type: Dict[str, Tuple[str, Transformer]]\n\n def match(sources, value, **kwargs):\n # type: (Dict[str, Any], str, Dict[str, Any]) -> None\n if value in compiled_items:\n source, transformer = compiled_items[value] # type: str, Transformer\n transformer(sources, sources[source], **kwargs)\n\n return match\n
Send the result as a percentage of time since the last check run, submitted as a rate.
For example, say the result is an ever-increasing counter representing the total time spent pausing for garbage collection since start up. That number by itself is not very useful, but as a percentage of time spent pausing since the previous collection interval it becomes a useful metric.
There is one required parameter called scale that indicates the unit of time in which the result is expressed. Valid values are:
second
millisecond
microsecond
nanosecond
You may also define the unit as an integer number of parts per second, e.g. millisecond is equivalent to 1000.
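The arithmetic can be sketched as follows. The helper names and the TIME_UNITS mapping here are written for this example (the mapping mirrors the valid values listed above), and the rate is computed by hand between two samples, whereas in the real check the Agent computes the rate across check runs.

```python
# Parts per second for each named unit, mirroring the values listed above.
TIME_UNITS = {
    'second': 1,
    'millisecond': 1000,
    'microsecond': 1000000,
    'nanosecond': 1000000000,
}


def to_temporal_percent(total_time, scale):
    # Convert a cumulative time counter into percent-seconds.
    if isinstance(scale, str):
        scale = TIME_UNITS[scale.lower()]
    return total_time / scale * 100


def percent_between(previous_total, current_total, elapsed_seconds, scale):
    # Rate of the converted counter between two samples, i.e. the
    # percentage of wall-clock time spent during the interval.
    delta = to_temporal_percent(current_total, scale) - to_temporal_percent(previous_total, scale)
    return delta / elapsed_seconds


# A garbage collection pause counter in milliseconds went from 1200 to 1500
# over a 15 second interval: 300 ms out of 15 s is roughly 2 percent.
gc_percent = percent_between(1200, 1500, 15, 'millisecond')
```

Passing the integer 1000 instead of the string millisecond gives the same result.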
Source code in datadog_checks_base/datadog_checks/base/utils/db/transform.py
def get_temporal_percent(transformers, column_name, **modifiers):\n # type: (Dict[str, Transformer], str, Any) -> Transformer\n \"\"\"\n Send the result as percentage of time since the last check run as a `rate`.\n\n For example, say the result is a forever increasing counter representing the total time spent pausing for\n garbage collection since start up. That number by itself is quite useless, but as a percentage of time spent\n pausing since the previous collection interval it becomes a useful metric.\n\n There is one required parameter called `scale` that indicates what unit of time the result should be considered.\n Valid values are:\n\n - `second`\n - `millisecond`\n - `microsecond`\n - `nanosecond`\n\n You may also define the unit as an integer number of parts compared to seconds e.g. `millisecond` is\n equivalent to `1000`.\n \"\"\"\n scale = modifiers.pop('scale', None)\n if scale is None:\n raise ValueError('the `scale` parameter is required')\n\n if isinstance(scale, str):\n scale = constants.TIME_UNITS.get(scale.lower())\n if scale is None:\n raise ValueError(\n 'the `scale` parameter must be one of: {}'.format(' | '.join(sorted(constants.TIME_UNITS)))\n )\n elif not isinstance(scale, int):\n raise ValueError(\n 'the `scale` parameter must be an integer representing parts of a second e.g. 1000 for millisecond'\n )\n\n rate = transformers['rate'](transformers, column_name, **modifiers) # type: Callable\n\n def temporal_percent(_, value, **kwargs):\n # type: (List, str, Dict[str, Any]) -> None\n rate(_, total_time_to_temporal_percent(float(value), scale=scale), **kwargs)\n\n return temporal_percent\n
Send the number of seconds elapsed from a time in the past as a gauge.
For example, if the result is an instance of datetime.datetime representing 5 seconds ago, then this would submit with a value of 5.
The optional modifier format indicates what format the result is in. By default it is native, assuming the underlying library provides timestamps as datetime objects.
If the value is a UNIX timestamp you can set the format modifier to unix_time.
If the value is a string representation of a date, you must provide the expected timestamp format using the supported codes.
Example:
columns:\n - name: time_since_x\n type: time_elapsed\n format: native # default value and can be omitted\n - name: time_since_y\n type: time_elapsed\n format: unix_time\n - name: time_since_z\n type: time_elapsed\n format: \"%d/%m/%Y %H:%M:%S\"\n
Note
The code %z (lower case) is not supported on Windows.
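The three format modes can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library implementation: `seconds_elapsed` is a hypothetical helper, and treating naive datetimes as UTC is an assumption made here to keep the example self-contained.

```python
import time
from datetime import datetime, timezone


def seconds_elapsed(value, time_format='native'):
    # Hypothetical helper mirroring the three `format` modes described above.
    if time_format == 'unix_time':
        return time.time() - value
    if time_format != 'native':
        # Any other format is treated as a strptime pattern.
        value = datetime.strptime(value, time_format)
    if value.tzinfo is None:
        # Assumption: naive datetimes are interpreted as UTC.
        value = value.replace(tzinfo=timezone.utc)
    return (datetime.now(value.tzinfo) - value).total_seconds()


print(seconds_elapsed(time.time() - 5, 'unix_time'))
print(seconds_elapsed('01/01/2020 00:00:00', '%d/%m/%Y %H:%M:%S'))
```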
Source code in datadog_checks_base/datadog_checks/base/utils/db/transform.py
def get_time_elapsed(transformers, column_name, **modifiers):\n # type: (Dict[str, Transformer], str, Any) -> Transformer\n \"\"\"\n Send the number of seconds elapsed from a time in the past as a `gauge`.\n\n For example, if the result is an instance of\n [datetime.datetime](https://docs.python.org/3/library/datetime.html#datetime.datetime) representing 5 seconds ago,\n then this would submit with a value of `5`.\n\n The optional modifier `format` indicates what format the result is in. By default it is `native`, assuming the\n underlying library provides timestamps as `datetime` objects.\n\n If the value is a UNIX timestamp you can set the `format` modifier to `unix_time`.\n\n If the value is a string representation of a date, you must provide the expected timestamp format using the\n [supported codes](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes).\n\n Example:\n\n ```yaml\n columns:\n - name: time_since_x\n type: time_elapsed\n format: native # default value and can be omitted\n - name: time_since_y\n type: time_elapsed\n format: unix_time\n - name: time_since_z\n type: time_elapsed\n format: \"%d/%m/%Y %H:%M:%S\"\n ```\n !!! 
note\n The code `%z` (lower case) is not supported on Windows.\n \"\"\"\n time_format = modifiers.pop('format', 'native')\n if not isinstance(time_format, str):\n raise ValueError('the `format` parameter must be a string')\n\n gauge = transformers['gauge'](transformers, column_name, **modifiers)\n\n if time_format == 'native':\n\n def time_elapsed(_, value, **kwargs):\n # type: (List, str, Dict[str, Any]) -> None\n value = ensure_aware_datetime(value)\n gauge(_, (datetime.now(value.tzinfo) - value).total_seconds(), **kwargs)\n\n elif time_format == 'unix_time':\n\n def time_elapsed(_, value, **kwargs):\n gauge(_, time.time() - value, **kwargs)\n\n else:\n\n def time_elapsed(_, value, **kwargs):\n # type: (List, str, Dict[str, Any]) -> None\n value = ensure_aware_datetime(datetime.strptime(value, time_format))\n gauge(_, (datetime.now(value.tzinfo) - value).total_seconds(), **kwargs)\n\n return time_elapsed\n
The required modifier status_map is a mapping of values to statuses. Valid statuses include:
OK
WARNING
CRITICAL
UNKNOWN
Any encountered values that are not defined will be sent as UNKNOWN.
In addition, a message modifier can be passed which can contain placeholders (based on Python's str.format) for other column names from the same query to add a message dynamically to the service_check.
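The lookup and message rules can be sketched without the Agent. In this illustration `resolve_service_check` is a hypothetical helper and the status map already holds numeric statuses, whereas the real transformer compiles the configured status names first.

```python
# Status values used by Datadog service checks.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def resolve_service_check(status_map, message_template, row, value):
    # Values missing from the map fall back to UNKNOWN.
    status = status_map.get(value, UNKNOWN)
    # No message is attached when the check is OK or no template was given.
    if not message_template or status == OK:
        return status, None
    # Placeholders are filled from the other columns of the same row.
    return status, message_template.format(**row)


status_map = {'up': OK, 'degraded': WARNING, 'down': CRITICAL}
row = {'state': 'down', 'host': 'db-1', 'reason': 'disk full'}
print(resolve_service_check(status_map, '{host} is down: {reason}', row, row['state']))
```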
Source code in datadog_checks_base/datadog_checks/base/utils/db/transform.py
def get_service_check(transformers, column_name, **modifiers):\n # type: (Dict[str, Transformer], str, Any) -> Transformer\n \"\"\"\n Submit a service check.\n\n The required modifier `status_map` is a mapping of values to statuses. Valid statuses include:\n\n - `OK`\n - `WARNING`\n - `CRITICAL`\n - `UNKNOWN`\n\n Any encountered values that are not defined will be sent as `UNKNOWN`.\n\n In addition, a `message` modifier can be passed which can contain placeholders\n (based on Python's str.format) for other column names from the same query to add a message\n dynamically to the service_check.\n \"\"\"\n # Do work in a separate function to avoid having to `del` a bunch of variables\n status_map = _compile_service_check_statuses(modifiers)\n message_field = modifiers.pop('message', None)\n\n service_check_method = transformers['__service_check'](transformers, column_name, **modifiers) # type: Callable\n\n def service_check(sources, value, **kwargs):\n # type: (List, str, Dict[str, Any]) -> None\n check_status = status_map.get(value, ServiceCheck.UNKNOWN)\n if not message_field or check_status == ServiceCheck.OK:\n message = None\n else:\n message = message_field.format(**sources)\n\n service_check_method(sources, check_status, message=message, **kwargs)\n\n return service_check\n
Convert a column to a tag that will be used in every subsequent submission.
For example, if you named the column env and the column returned the value prod1, all submissions from that row will be tagged by env:prod1.
This also accepts an optional modifier called boolean that, when set to true, transforms the result to the string true or false. For example, if you named the column alive and the result was the number 0, the tag will be alive:false.
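A small standalone sketch of this behavior is shown below. Both function names are hypothetical, and `to_bool` is a simplified stand-in for the `is_affirmative` helper that the real transformer uses.

```python
def to_bool(value):
    # Simplified stand-in for `is_affirmative`; the accepted strings are an assumption.
    if isinstance(value, str):
        return value.strip().lower() in ('yes', 'true', '1', 'y', 'on')
    return bool(value)


def make_tag(column_name, value, boolean=False):
    # With `boolean` enabled the value is normalized to the string true or false.
    if boolean:
        value = str(to_bool(value)).lower()
    return '{}:{}'.format(column_name, value)


print(make_tag('env', 'prod1'))
print(make_tag('alive', 0, boolean=True))
```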
Source code in datadog_checks_base/datadog_checks/base/utils/db/transform.py
def get_tag(transformers, column_name, **modifiers):\n # type: (Dict[str, Transformer], str, Any) -> Transformer\n \"\"\"\n Convert a column to a tag that will be used in every subsequent submission.\n\n For example, if you named the column `env` and the column returned the value `prod1`, all submissions\n from that row will be tagged by `env:prod1`.\n\n This also accepts an optional modifier called `boolean` that when set to `true` will transform the result\n to the string `true` or `false`. So for example if you named the column `alive` and the result was the\n number `0` the tag will be `alive:false`.\n \"\"\"\n template = '{}:{{}}'.format(column_name)\n boolean = is_affirmative(modifiers.pop('boolean', None))\n\n def tag(_, value, **kwargs):\n # type: (List, str, Dict[str, Any]) -> str\n if boolean:\n value = str(is_affirmative(value)).lower()\n\n return template.format(value)\n\n return tag\n
Convert a column to a list of tags that will be used in every submission.
Tag name is determined by column_name. The column value represents a list of values. It is expected to be either a list of strings, or a comma-separated string.
For example, if the column is named server_tag and the column returned the value us,primary, then all submissions for that row will be tagged by server_tag:us and server_tag:primary.
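The splitting rule is short enough to show in isolation. The sketch below follows the logic of the transformer, with `make_tag_list` as a hypothetical standalone name.

```python
def make_tag_list(column_name, value):
    # A comma-separated string is split and stripped; a list is used as is.
    if isinstance(value, str):
        value = [v.strip() for v in value.split(',')]
    return ['{}:{}'.format(column_name, v) for v in value]


print(make_tag_list('server_tag', 'us,primary'))
print(make_tag_list('server_tag', ['us', 'primary']))
```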
Source code in datadog_checks_base/datadog_checks/base/utils/db/transform.py
def get_tag_list(transformers, column_name, **modifiers):\n # type: (Dict[str, Transformer], str, Any) -> Transformer\n \"\"\"\n Convert a column to a list of tags that will be used in every submission.\n\n Tag name is determined by `column_name`. The column value represents a list of values. It is expected to be either\n a list of strings, or a comma-separated string.\n\n For example, if the column is named `server_tag` and the column returned the value `us,primary`, then all\n submissions for that row will be tagged by `server_tag:us` and `server_tag:primary`.\n \"\"\"\n template = '%s:{}' % column_name\n\n def tag_list(_, value, **kwargs):\n # type: (List, str, Dict[str, Any]) -> List[str]\n if isinstance(value, str):\n value = [v.strip() for v in value.split(',')]\n\n return [template.format(v) for v in value]\n\n return tag_list\n
then the extra metric disk.utilized would be sent as a gauge calculated as disk.used / disk.total * 100.
If the source of total is 0, then the submitted value will always be sent as 0 too.
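The calculation, including the zero guard, can be sketched as below. The function here is a local stand-in written for illustration; the body of the library's own `compute_percent` helper is not shown in this document.

```python
def percent_of(part, total):
    # A zero total yields 0 instead of raising a division error.
    if total:
        return part / total * 100
    return 0


print(percent_of(25, 100))
print(percent_of(5, 0))
```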
Source code in datadog_checks_base/datadog_checks/base/utils/db/transform.py
def get_percent(transformers, name, **modifiers):\n # type: (Dict[str, Callable], str, Any) -> Transformer\n \"\"\"\n Send a percentage based on 2 sources as a `gauge`.\n\n The required modifiers are `part` and `total`.\n\n For example, if you have this configuration:\n\n ```yaml\n columns:\n - name: disk.total\n type: gauge\n - name: disk.used\n type: gauge\n extras:\n - name: disk.utilized\n type: percent\n part: disk.used\n total: disk.total\n ```\n\n then the extra metric `disk.utilized` would be sent as a `gauge` calculated as `disk.used / disk.total * 100`.\n\n If the source of `total` is `0`, then the submitted value will always be sent as `0` too.\n \"\"\"\n available_sources = modifiers.pop('sources')\n\n part = modifiers.pop('part', None)\n if part is None:\n raise ValueError('the `part` parameter is required')\n elif not isinstance(part, str):\n raise ValueError('the `part` parameter must be a string')\n elif part not in available_sources:\n raise ValueError('the `part` parameter `{}` is not an available source'.format(part))\n\n total = modifiers.pop('total', None)\n if total is None:\n raise ValueError('the `total` parameter is required')\n elif not isinstance(total, str):\n raise ValueError('the `total` parameter must be a string')\n elif total not in available_sources:\n raise ValueError('the `total` parameter `{}` is not an available source'.format(total))\n\n del available_sources\n gauge = transformers['gauge'](transformers, name, **modifiers)\n gauge = create_extra_transformer(gauge)\n\n def percent(sources, **kwargs):\n gauge(sources, compute_percent(sources[part], sources[total]), **kwargs)\n\n return percent\n
For brevity, if the expression attribute exists and type does not then it is assumed the type is expression. The submit_type can be any transformer and any extra options are passed down to it.
The result of every expression is stored, so in lieu of a submit_type the above example could also be written as:
Source code in datadog_checks_base/datadog_checks/base/utils/db/transform.py
def get_expression(transformers, name, **modifiers):\n # type: (Dict[str, Transformer], str, Dict[str, Any]) -> Transformer\n \"\"\"\n This allows the evaluation of a limited subset of Python syntax and built-in functions.\n\n ```yaml\n columns:\n - name: disk.total\n type: gauge\n - name: disk.used\n type: gauge\n extras:\n - name: disk.free\n expression: disk.total - disk.used\n submit_type: gauge\n ```\n\n For brevity, if the `expression` attribute exists and `type` does not then it is assumed the type is\n `expression`. The `submit_type` can be any transformer and any extra options are passed down to it.\n\n The result of every expression is stored, so in lieu of a `submit_type` the above example could also be written as:\n\n ```yaml\n columns:\n - name: disk.total\n type: gauge\n - name: disk.used\n type: gauge\n extras:\n - name: free\n expression: disk.total - disk.used\n - name: disk.free\n type: gauge\n source: free\n ```\n\n The order matters though, so for example the following will fail:\n\n ```yaml\n columns:\n - name: disk.total\n type: gauge\n - name: disk.used\n type: gauge\n extras:\n - name: disk.free\n type: gauge\n source: free\n - name: free\n expression: disk.total - disk.used\n ```\n\n since the source `free` does not yet exist.\n \"\"\"\n available_sources = modifiers.pop('sources')\n\n expression = modifiers.pop('expression', None)\n if expression is None:\n raise ValueError('the `expression` parameter is required')\n elif not isinstance(expression, str):\n raise ValueError('the `expression` parameter must be a string')\n elif not expression:\n raise ValueError('the `expression` parameter must not be empty')\n\n if not modifiers.pop('verbose', False):\n # Sort the sources in reverse order of length to prevent greedy matching\n available_sources = sorted(available_sources, key=lambda s: -len(s))\n\n # Escape special characters, mostly for the possible dots in metric names\n available_sources = list(map(re.escape, available_sources))\n\n # 
Finally, utilize the order by relying on the guarantees provided by the alternation operator\n available_sources = '|'.join(available_sources)\n\n expression = re.sub(\n SOURCE_PATTERN.format(available_sources),\n # Replace by the particular source that matched\n lambda match_obj: 'SOURCES[\"{}\"]'.format(match_obj.group(1)),\n expression,\n )\n\n expression = compile(expression, filename=name, mode='eval')\n\n del available_sources\n\n if 'submit_type' in modifiers:\n if modifiers['submit_type'] not in transformers:\n raise ValueError('unknown submit_type `{}`'.format(modifiers['submit_type']))\n\n submit_method = transformers[modifiers.pop('submit_type')](transformers, name, **modifiers) # type: Transformer\n submit_method = create_extra_transformer(submit_method) # type: Callable\n\n def execute_expression(sources, **kwargs):\n # type: (Dict[str, Any], Dict[str, Any]) -> float\n result = eval(expression, ALLOWED_GLOBALS, {'SOURCES': sources}) # type: float\n submit_method(sources, result, **kwargs)\n return result\n\n else:\n\n def execute_expression(sources, **kwargs):\n # type: (Dict[str, Any], Dict[str, Any]) -> Any\n return eval(expression, ALLOWED_GLOBALS, {'SOURCES': sources})\n\n return execute_expression\n
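The source substitution performed by get_expression can be sketched on its own. This is an approximation under stated assumptions: a bare capture group stands in for `SOURCE_PATTERN`, and an empty builtins table stands in for `ALLOWED_GLOBALS`, since neither definition appears here.

```python
import re


def compile_expression(expression, source_names):
    # Longest names first, so `disk.total` is tried before a shorter source like `disk`.
    ordered = sorted(source_names, key=lambda s: -len(s))
    # Escape regex metacharacters such as the dots in metric names.
    alternation = '|'.join(map(re.escape, ordered))
    # Assumption: a bare capture group stands in for the library's SOURCE_PATTERN.
    rewritten = re.sub(
        '({})'.format(alternation),
        lambda match: 'SOURCES[{!r}]'.format(match.group(1)),
        expression,
    )
    return compile(rewritten, filename='<expression>', mode='eval')


def evaluate(code, sources):
    # Assumption: an empty builtins table stands in for the library's ALLOWED_GLOBALS.
    return eval(code, {'__builtins__': {}}, {'SOURCES': sources})


code = compile_expression('disk.total - disk.used', ['disk.total', 'disk.used'])
print(evaluate(code, {'disk.total': 100, 'disk.used': 40}))
```

Sorting by length matters because regex alternation takes the first alternative that matches, so a shorter name listed first would otherwise consume the prefix of a longer one.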
then a log will be sent with the following attributes:
message: value of the msg column
status: value of the level column
date: value of the time column
foo: value of the bar column
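The attribute mapping listed above can be sketched as a plain function. `build_log` is a hypothetical name; the logic follows the inner `log` function of the transformer, including the `ddtags` field built from the row's tags.

```python
def build_log(attributes, sources, tags):
    # Map each log attribute to the value of its source column.
    data = {attribute: sources[source] for attribute, source in attributes.items()}
    # Tags, when present, are joined into the `ddtags` field.
    if tags:
        data['ddtags'] = ','.join(tags)
    return data


attributes = {'message': 'msg', 'status': 'level', 'date': 'time', 'foo': 'bar'}
sources = {'msg': 'disk full', 'level': 'error', 'time': 1700000000, 'bar': 'baz'}
print(build_log(attributes, sources, ['env:prod1', 'db:main']))
```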
Source code in datadog_checks_base/datadog_checks/base/utils/db/transform.py
def get_log(transformers, name, **modifiers):\n # type: (Dict[str, Callable], str, Any) -> Transformer\n \"\"\"\n Send a log.\n\n The only required modifier is `attributes`.\n\n For example, if you have this configuration:\n\n ```yaml\n columns:\n - name: msg\n type: source\n - name: level\n type: source\n - name: time\n type: source\n - name: bar\n type: source\n extras:\n - type: log\n attributes:\n message: msg\n status: level\n date: time\n foo: bar\n ```\n\n then a log will be sent with the following attributes:\n\n - `message`: value of the `msg` column\n - `status`: value of the `level` column\n - `date`: value of the `time` column\n - `foo`: value of the `bar` column\n \"\"\"\n available_sources = modifiers.pop('sources')\n attributes = _compile_log_attributes(modifiers, available_sources)\n\n del available_sources\n send_log = transformers['__send_log'](transformers, **modifiers)\n send_log = create_extra_transformer(send_log)\n\n def log(sources, **kwargs):\n data = {attribute: sources[source] for attribute, source in attributes.items()}\n if kwargs['tags']:\n data['ddtags'] = ','.join(kwargs['tags'])\n\n send_log(sources, data)\n\n return log\n
Whenever you need to make HTTP requests, the base class provides a convenience member that has the same interface as the popular requests library and ensures consistent behavior across all integrations.
The wrapper automatically parses and uses configuration from the instance, init_config, and Agent config. This parsing is done only once, during initialization, and the result is cached to reduce the overhead of every call.
For example, to make a GET request you would use:
response = self.http.get(url)\n
and the wrapper will pass the configured options to requests. All methods accept optional keyword arguments such as stream.
Any method-level option will override configuration. So for example if tls_verify was set to false and you do self.http.get(url, verify=True), then SSL certificates will be verified on that particular request. You can use the keyword argument persist to override persist_connections.
There is also support for non-standard or legacy configurations with the HTTP_CONFIG_REMAPPER class attribute. For example:
Support for Unix sockets is provided via requests-unixsocket and allows making UDS requests on the unix:// scheme (not supported on Windows until Python adds support for AF_UNIX, see ticket):
Some options can be set globally in init_config (with instances taking precedence). For complete documentation of every option, see the associated configuration templates for the instances and init_config sections.
Support for configuring cookies! Since they can be set globally, per-domain, and even per-path, the configuration may be complex if not thought out adequately. We'll discuss options for what that might look like. Only our spark and cisco_aci checks currently set cookies, and that is based on code logic, not configuration.
Some systems expose their logs from HTTP endpoints instead of files that the Logs Agent can tail. In such cases, you can create an Agent integration to crawl the endpoints and submit the logs.
The following diagram illustrates how crawling logs integrates into the Datadog Agent.
"},{"location":"base/logs-crawlers/#interface","title":"Interface","text":""},{"location":"base/logs-crawlers/#datadog_checks.base.checks.logs.crawler.base.LogCrawlerCheck","title":"datadog_checks.base.checks.logs.crawler.base.LogCrawlerCheck","text":"Source code in datadog_checks_base/datadog_checks/base/checks/logs/crawler/base.py
class LogCrawlerCheck(AgentCheck, ABC):\n @abstractmethod\n def get_log_streams(self) -> Iterable[LogStream]:\n \"\"\"\n Yields the log streams associated with this check.\n \"\"\"\n\n def process_streams(self) -> None:\n \"\"\"\n Process the log streams and send the collected logs.\n\n Crawler checks that need more functionality can implement the `check` method and call this directly.\n \"\"\"\n for stream in self.get_log_streams():\n last_cursor = self.get_log_cursor(stream.name)\n for record in stream.records(cursor=last_cursor):\n self.send_log(record.data, cursor=record.cursor, stream=stream.name)\n\n def check(self, _) -> None:\n self.process_streams()\n
Process the log streams and send the collected logs.
Crawler checks that need more functionality can implement the check method and call this directly.
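The cursor handling in this loop can be simulated without the Agent. Everything in the sketch below is a stand-in written for illustration: `Record`, `ListStream`, and `Crawler` are hypothetical classes that imitate `LogRecord`, a `LogStream` subclass, and the check, and the `offset` cursor key is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Record:
    # Stand-in for LogRecord: the log payload plus the cursor that follows it.
    data: dict
    cursor: dict


class ListStream:
    # Stand-in for a LogStream subclass backed by an in-memory list.
    def __init__(self, name, lines):
        self.name = name
        self.lines = lines

    def records(self, cursor: Optional[dict] = None) -> Iterable[Record]:
        # Resume after the last saved offset, or start from the beginning.
        start = cursor['offset'] if cursor else 0
        for offset in range(start, len(self.lines)):
            yield Record(data={'message': self.lines[offset]}, cursor={'offset': offset + 1})


class Crawler:
    # Stand-in for the check: remembers the last cursor per stream.
    def __init__(self, streams):
        self.streams = streams
        self.cursors = {}
        self.sent = []

    def process_streams(self):
        for stream in self.streams:
            last_cursor = self.cursors.get(stream.name)
            for record in stream.records(cursor=last_cursor):
                self.sent.append(record.data['message'])
                self.cursors[stream.name] = record.cursor


stream = ListStream('audit', ['a', 'b'])
crawler = Crawler([stream])
crawler.process_streams()
stream.lines.append('c')
crawler.process_streams()
print(crawler.sent)
```

Because the cursor is saved with every record, the second run resumes after the last delivered line instead of resending the whole stream.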
Source code in datadog_checks_base/datadog_checks/base/checks/logs/crawler/base.py
def process_streams(self) -> None:\n \"\"\"\n Process the log streams and send the collected logs.\n\n Crawler checks that need more functionality can implement the `check` method and call this directly.\n \"\"\"\n for stream in self.get_log_streams():\n last_cursor = self.get_log_cursor(stream.name)\n for record in stream.records(cursor=last_cursor):\n self.send_log(record.data, cursor=record.cursor, stream=stream.name)\n
"},{"location":"base/logs-crawlers/#datadog_checks.base.checks.logs.crawler.base.LogCrawlerCheck.check","title":"check(_)","text":"Source code in datadog_checks_base/datadog_checks/base/checks/logs/crawler/base.py
"},{"location":"base/logs-crawlers/#datadog_checks.base.checks.logs.crawler.stream.LogStream","title":"datadog_checks.base.checks.logs.crawler.stream.LogStream","text":"Source code in datadog_checks_base/datadog_checks/base/checks/logs/crawler/stream.py
class LogStream(ABC):\n def __init__(self, *, check: AgentCheck, name: str):\n self.__check = check\n self.__name = name\n\n @property\n def check(self) -> AgentCheck:\n \"\"\"\n The AgentCheck instance associated with this LogStream.\n \"\"\"\n return self.__check\n\n @property\n def name(self) -> str:\n \"\"\"\n The name of this LogStream.\n \"\"\"\n return self.__name\n\n def construct_tags(self, tags: list[str]) -> list[str]:\n \"\"\"\n Returns a formatted string of tags which may be used directly as the `ddtags` field of logs.\n This will include the `tags` from the integration instance config.\n \"\"\"\n formatted_tags = ','.join(tags)\n return f'{self.check.formatted_tags},{formatted_tags}' if self.check.formatted_tags else formatted_tags\n\n @abstractmethod\n def records(self, *, cursor: dict[str, Any] | None = None) -> Iterable[LogRecord]:\n \"\"\"\n Yields log records as they are received.\n \"\"\"\n
"},{"location":"base/logs-crawlers/#datadog_checks.base.checks.logs.crawler.stream.LogRecord","title":"datadog_checks.base.checks.logs.crawler.stream.LogRecord","text":"Source code in datadog_checks_base/datadog_checks/base/checks/logs/crawler/stream.py
Often, you will want to collect mostly unstructured data that doesn't map well to tags, like fine-grained product version information.
The base class provides a method that handles such cases. The collected data is captured by flares, displayed on the Agent's status page, and will eventually be queryable in-app.
Custom transformers may be defined via a class level attribute METADATA_TRANSFORMERS.
This is a mapping of metadata names to functions. When you call self.set_metadata(name, value, **options), if name is in this mapping then the corresponding function will be called with the value, and the return value(s) will be collected instead.
Transformer functions must satisfy the following signature:
If the return type is str, then it will be sent as the value for name. If the return type is a mapping type, then each key will be considered a name and will be sent with its (str) value.
For example, the following would collect an entity named square with a value of '25':
There are a few default transformers, which can be overridden by custom transformers.
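How a transformer's return type decides what gets collected can be sketched outside the Agent. `submit_metadata` is a hypothetical helper that returns the collected pairs instead of sending them, and the `endpoint` transformer is an invented example of a mapping-returning function.

```python
def submit_metadata(name, value, transformers, options=None):
    # Returns the (name, value) pairs that would be collected.
    transformer = transformers.get(name)
    if transformer is None:
        return [(name, value)]
    transformed = transformer(value, options or {})
    if isinstance(transformed, str):
        # A string result is collected under the original name.
        return [(name, transformed)]
    # A mapping fans out into one entry per key.
    return list(transformed.items())


transformers = {
    'square': lambda value, options: str(int(value) ** 2),
    'endpoint': lambda value, options: dict(zip(('endpoint.host', 'endpoint.port'), value.split(':'))),
}

print(submit_metadata('square', '5', transformers))
print(submit_metadata('endpoint', 'db:5432', transformers))
```

Unlike this sketch, the real manager catches exceptions raised by a transformer, logs them at debug level, and skips the submission.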
Source code in datadog_checks_base/datadog_checks/base/utils/metadata/core.py
class MetadataManager(object):\n \"\"\"\n Custom transformers may be defined via a class level attribute `METADATA_TRANSFORMERS`.\n\n This is a mapping of metadata names to functions. When you call\n `#!python self.set_metadata(name, value, **options)`, if `name` is in this mapping then\n the corresponding function will be called with the `value`, and the return\n value(s) will be collected instead.\n\n Transformer functions must satisfy the following signature:\n\n ```python\n def transform_<NAME>(value: Any, options: dict) -> Union[str, Dict[str, str]]:\n ```\n\n If the return type is `str`, then it will be sent as the value for `name`. If the return type is a mapping type,\n then each key will be considered a `name` and will be sent with its (`str`) value.\n\n For example, the following would collect an entity named `square` with a value of `'25'`:\n\n ```python\n from datadog_checks.base import AgentCheck\n\n\n class AwesomeCheck(AgentCheck):\n METADATA_TRANSFORMERS = {\n 'square': lambda value, options: str(int(value) ** 2)\n }\n\n def check(self, instance):\n self.set_metadata('square', '5')\n ```\n\n There are a few default transformers, which can be overridden by custom transformers.\n \"\"\"\n\n __slots__ = ('check_id', 'check_name', 'logger', 'metadata_transformers')\n\n def __init__(self, check_name, check_id, logger=None, metadata_transformers=None):\n self.check_name = check_name\n self.check_id = check_id\n self.logger = logger or LOGGER\n self.metadata_transformers = {'version': self.transform_version}\n\n if metadata_transformers:\n self.metadata_transformers.update(metadata_transformers)\n\n def submit_raw(self, name, value):\n datadog_agent.set_check_metadata(self.check_id, to_native_string(name), to_native_string(value))\n\n def submit(self, name, value, options):\n transformer = self.metadata_transformers.get(name)\n if transformer:\n try:\n transformed = transformer(value, options)\n except Exception as e:\n if is_primitive(value):\n 
self.logger.debug('Unable to transform `%s` metadata value `%s`: %s', name, value, e)\n else:\n self.logger.debug('Unable to transform `%s` metadata: %s', name, e)\n\n return\n\n if isinstance(transformed, str):\n self.submit_raw(name, transformed)\n else:\n for transformed_name, transformed_value in iteritems(transformed):\n self.submit_raw(transformed_name, transformed_value)\n else:\n self.submit_raw(name, value)\n\n def transform_version(self, version, options):\n \"\"\"\n Transforms a version like `1.2.3-rc.4+5` to its constituent parts. In all cases,\n the metadata names `version.raw` and `version.scheme` will be collected.\n\n If a `scheme` is defined then it will be looked up from our known schemes. If no\n scheme is defined then it will default to `semver`. The supported schemes are:\n\n - `regex` - A `pattern` must also be defined. The pattern must be a `str` or a pre-compiled\n `re.Pattern`. Any matching named subgroups will then be sent as `version.<GROUP_NAME>`. In this case,\n the check name will be used as the value of `version.scheme` unless `final_scheme` is also set, which\n will take precedence.\n - `parts` - A `part_map` must also be defined. 
Each key in this mapping will be considered\n a `name` and will be sent with its (`str`) value.\n - `semver` - This is essentially the same as `regex` with the `pattern` set to the standard regular\n expression for semantic versioning.\n\n Taking the example above, calling `#!python self.set_metadata('version', '1.2.3-rc.4+5')` would produce:\n\n | name | value |\n | --- | --- |\n | `version.raw` | `1.2.3-rc.4+5` |\n | `version.scheme` | `semver` |\n | `version.major` | `1` |\n | `version.minor` | `2` |\n | `version.patch` | `3` |\n | `version.release` | `rc.4` |\n | `version.build` | `5` |\n \"\"\"\n scheme, version_parts = parse_version(version, options)\n if scheme == 'regex' or scheme == 'parts':\n scheme = options.get('final_scheme', self.check_name)\n\n data = {'version.{}'.format(part_name): part_value for part_name, part_value in iteritems(version_parts)}\n data['version.raw'] = version\n data['version.scheme'] = scheme\n\n return data\n
Transforms a version like 1.2.3-rc.4+5 to its constituent parts. In all cases, the metadata names version.raw and version.scheme will be collected.
If a scheme is defined then it will be looked up from our known schemes. If no scheme is defined then it will default to semver. The supported schemes are:
regex - A pattern must also be defined. The pattern must be a str or a pre-compiled re.Pattern. Any matching named subgroups will then be sent as version.<GROUP_NAME>. In this case, the check name will be used as the value of version.scheme unless final_scheme is also set, which will take precedence.
parts - A part_map must also be defined. Each key in this mapping will be considered a name and will be sent with its (str) value.
semver - This is essentially the same as regex with the pattern set to the standard regular expression for semantic versioning.
Taking the example above, calling self.set_metadata('version', '1.2.3-rc.4+5') would produce:
| name | value |
| --- | --- |
| version.raw | 1.2.3-rc.4+5 |
| version.scheme | semver |
| version.major | 1 |
| version.minor | 2 |
| version.patch | 3 |
| version.release | rc.4 |
| version.build | 5 |
Source code in datadog_checks_base/datadog_checks/base/utils/metadata/core.py
def transform_version(self, version, options):\n \"\"\"\n Transforms a version like `1.2.3-rc.4+5` to its constituent parts. In all cases,\n the metadata names `version.raw` and `version.scheme` will be collected.\n\n If a `scheme` is defined then it will be looked up from our known schemes. If no\n scheme is defined then it will default to `semver`. The supported schemes are:\n\n - `regex` - A `pattern` must also be defined. The pattern must be a `str` or a pre-compiled\n `re.Pattern`. Any matching named subgroups will then be sent as `version.<GROUP_NAME>`. In this case,\n the check name will be used as the value of `version.scheme` unless `final_scheme` is also set, which\n will take precedence.\n - `parts` - A `part_map` must also be defined. Each key in this mapping will be considered\n a `name` and will be sent with its (`str`) value.\n - `semver` - This is essentially the same as `regex` with the `pattern` set to the standard regular\n expression for semantic versioning.\n\n Taking the example above, calling `#!python self.set_metadata('version', '1.2.3-rc.4+5')` would produce:\n\n | name | value |\n | --- | --- |\n | `version.raw` | `1.2.3-rc.4+5` |\n | `version.scheme` | `semver` |\n | `version.major` | `1` |\n | `version.minor` | `2` |\n | `version.patch` | `3` |\n | `version.release` | `rc.4` |\n | `version.build` | `5` |\n \"\"\"\n scheme, version_parts = parse_version(version, options)\n if scheme == 'regex' or scheme == 'parts':\n scheme = options.get('final_scheme', self.check_name)\n\n data = {'version.{}'.format(part_name): part_value for part_name, part_value in iteritems(version_parts)}\n data['version.raw'] = version\n data['version.scheme'] = scheme\n\n return data\n
OpenMetrics is used for collecting metrics using the CNCF-backed OpenMetrics format. This version is the default for all new OpenMetrics checks, and it is compatible with Python 3 only.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/base.py
class OpenMetricsBaseCheckV2(AgentCheck):\n \"\"\"\n OpenMetricsBaseCheckV2 is an updated class of OpenMetricsBaseCheck to scrape endpoints that emit Prometheus metrics.\n\n Minimal example configuration:\n\n ```yaml\n instances:\n - openmetrics_endpoint: http://example.com/endpoint\n namespace: \"foobar\"\n metrics:\n - bar\n - foo\n ```\n\n \"\"\"\n\n DEFAULT_METRIC_LIMIT = 2000\n\n # Allow tracing for openmetrics integrations\n def __init_subclass__(cls, **kwargs):\n super().__init_subclass__(**kwargs)\n return traced_class(cls)\n\n def __init__(self, name, init_config, instances):\n \"\"\"\n The base class for any OpenMetrics-based integration.\n\n Subclasses are expected to override this to add their custom scrapers or transformers.\n When overriding, make sure to call this (the parent's) __init__ first!\n \"\"\"\n super(OpenMetricsBaseCheckV2, self).__init__(name, init_config, instances)\n\n # All desired scraper configurations, which subclasses can override as needed\n self.scraper_configs = [self.instance]\n\n # All configured scrapers keyed by the endpoint\n self.scrapers = {}\n\n self.check_initializations.append(self.configure_scrapers)\n\n def check(self, _):\n \"\"\"\n Perform an openmetrics-based check.\n\n Subclasses should typically not need to override this, as most common customization\n needs are covered by the use of custom scrapers.\n Another thing to note is that this check ignores its instance argument completely.\n We take care of instance-level customization at initialization time.\n \"\"\"\n self.refresh_scrapers()\n\n for endpoint, scraper in self.scrapers.items():\n self.log.debug('Scraping OpenMetrics endpoint: %s', endpoint)\n\n with self.adopt_namespace(scraper.namespace):\n try:\n scraper.scrape()\n except (ConnectionError, RequestException) as e:\n self.log.error(\"There was an error scraping endpoint %s: %s\", endpoint, str(e))\n raise_from(type(e)(\"There was an error scraping endpoint {}: {}\".format(endpoint, e)), None)\n\n def 
configure_scrapers(self):\n \"\"\"\n Creates a scraper configuration for each instance.\n \"\"\"\n\n scrapers = {}\n\n for config in self.scraper_configs:\n endpoint = config.get('openmetrics_endpoint', '')\n if not isinstance(endpoint, str):\n raise ConfigurationError('The setting `openmetrics_endpoint` must be a string')\n elif not endpoint:\n raise ConfigurationError('The setting `openmetrics_endpoint` is required')\n\n scrapers[endpoint] = self.create_scraper(config)\n\n self.scrapers.clear()\n self.scrapers.update(scrapers)\n\n def create_scraper(self, config):\n \"\"\"\n Subclasses can override to return a custom scraper based on instance configuration.\n \"\"\"\n return OpenMetricsScraper(self, self.get_config_with_defaults(config))\n\n def set_dynamic_tags(self, *tags):\n for scraper in self.scrapers.values():\n scraper.set_dynamic_tags(*tags)\n\n def get_config_with_defaults(self, config):\n return ChainMap(config, self.get_default_config())\n\n def get_default_config(self):\n return {}\n\n def refresh_scrapers(self):\n pass\n\n @contextmanager\n def adopt_namespace(self, namespace):\n old_namespace = self.__NAMESPACE__\n\n try:\n self.__NAMESPACE__ = namespace or old_namespace\n yield\n finally:\n self.__NAMESPACE__ = old_namespace\n
The base class for any OpenMetrics-based integration.
Subclasses are expected to override this to add their custom scrapers or transformers. When overriding, make sure to call this (the parent's) __init__ first!
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/base.py
def __init__(self, name, init_config, instances):\n \"\"\"\n The base class for any OpenMetrics-based integration.\n\n Subclasses are expected to override this to add their custom scrapers or transformers.\n When overriding, make sure to call this (the parent's) __init__ first!\n \"\"\"\n super(OpenMetricsBaseCheckV2, self).__init__(name, init_config, instances)\n\n # All desired scraper configurations, which subclasses can override as needed\n self.scraper_configs = [self.instance]\n\n # All configured scrapers keyed by the endpoint\n self.scrapers = {}\n\n self.check_initializations.append(self.configure_scrapers)\n
Subclasses should typically not need to override this, as most common customization needs are covered by the use of custom scrapers. Another thing to note is that this check ignores its instance argument completely. We take care of instance-level customization at initialization time.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/base.py
def check(self, _):\n \"\"\"\n Perform an openmetrics-based check.\n\n Subclasses should typically not need to override this, as most common customization\n needs are covered by the use of custom scrapers.\n Another thing to note is that this check ignores its instance argument completely.\n We take care of instance-level customization at initialization time.\n \"\"\"\n self.refresh_scrapers()\n\n for endpoint, scraper in self.scrapers.items():\n self.log.debug('Scraping OpenMetrics endpoint: %s', endpoint)\n\n with self.adopt_namespace(scraper.namespace):\n try:\n scraper.scrape()\n except (ConnectionError, RequestException) as e:\n self.log.error(\"There was an error scraping endpoint %s: %s\", endpoint, str(e))\n raise_from(type(e)(\"There was an error scraping endpoint {}: {}\".format(endpoint, e)), None)\n
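The check method above wraps each scrape in adopt_namespace so that metrics are submitted under the scraper's namespace and the previous namespace is restored afterwards. A minimal sketch of that pattern, assuming a hypothetical stand-in class (NamespacedCheck is not part of the library, only the body of adopt_namespace mirrors the source):

```python
from contextlib import contextmanager


class NamespacedCheck:
    # Hypothetical stand-in for a check class, used only to show the pattern
    __NAMESPACE__ = ''

    @contextmanager
    def adopt_namespace(self, namespace):
        old_namespace = self.__NAMESPACE__

        try:
            # Keep the existing namespace when the scraper does not define one
            self.__NAMESPACE__ = namespace or old_namespace
            yield
        finally:
            # Restore the previous value even if the scrape raised
            self.__NAMESPACE__ = old_namespace


check = NamespacedCheck()
with check.adopt_namespace('foobar'):
    namespace_during_scrape = check.__NAMESPACE__
namespace_after_scrape = check.__NAMESPACE__
```

The try/finally is what keeps one failing endpoint from leaking its namespace into the next scraper in the loop.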
Creates a scraper configuration for each instance.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/base.py
def configure_scrapers(self):\n \"\"\"\n Creates a scraper configuration for each instance.\n \"\"\"\n\n scrapers = {}\n\n for config in self.scraper_configs:\n endpoint = config.get('openmetrics_endpoint', '')\n if not isinstance(endpoint, str):\n raise ConfigurationError('The setting `openmetrics_endpoint` must be a string')\n elif not endpoint:\n raise ConfigurationError('The setting `openmetrics_endpoint` is required')\n\n scrapers[endpoint] = self.create_scraper(config)\n\n self.scrapers.clear()\n self.scrapers.update(scrapers)\n
Subclasses can override to return a custom scraper based on instance configuration.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/base.py
def create_scraper(self, config):\n \"\"\"\n Subclasses can override to return a custom scraper based on instance configuration.\n \"\"\"\n return OpenMetricsScraper(self, self.get_config_with_defaults(config))\n
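create_scraper passes the scraper the result of get_config_with_defaults, which the class source builds as ChainMap(config, self.get_default_config()). The sketch below shows the resulting lookup order with plain dictionaries. The endpoint and the default values are made up for illustration and the free function is a simplified version of the method:

```python
from collections import ChainMap


def get_config_with_defaults(config, defaults):
    # Lookups hit the instance config first and fall through to the defaults
    return ChainMap(config, defaults)


# Hypothetical instance configuration and integration defaults
instance = {'openmetrics_endpoint': 'http://localhost:9090/metrics', 'namespace': 'custom'}
defaults = {'namespace': 'default_ns', 'telemetry': False}

merged = get_config_with_defaults(instance, defaults)
namespace = merged['namespace']  # taken from the instance
telemetry = merged['telemetry']  # taken from the defaults
```

Because ChainMap only layers the two mappings, neither the instance configuration nor the defaults are copied or mutated.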
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/scraper.py
class OpenMetricsScraper:\n \"\"\"\n OpenMetricsScraper is a class that can be used to override the default scraping behavior for OpenMetricsBaseCheckV2.\n\n Minimal example configuration:\n\n ```yaml\n - openmetrics_endpoint: http://example.com/endpoint\n namespace: \"foobar\"\n metrics:\n - bar\n - foo\n raw_metric_prefix: \"test\"\n telemetry: \"true\"\n hostname_label: node\n ```\n\n \"\"\"\n\n SERVICE_CHECK_HEALTH = 'openmetrics.health'\n\n def __init__(self, check, config):\n \"\"\"\n The base class for any scraper overrides.\n \"\"\"\n\n self.config = config\n\n # Save a reference to the check instance\n self.check = check\n\n # Parse the configuration\n self.endpoint = config['openmetrics_endpoint']\n\n self.metric_transformer = MetricTransformer(self.check, config)\n self.label_aggregator = LabelAggregator(self.check, config)\n\n self.enable_telemetry = is_affirmative(config.get('telemetry', False))\n # Make every telemetry submission method a no-op to avoid many lookups of `self.enable_telemetry`\n if not self.enable_telemetry:\n for name, _ in inspect.getmembers(self, predicate=inspect.ismethod):\n if name.startswith('submit_telemetry_'):\n setattr(self, name, no_op)\n\n # Prevent overriding an integration's defined namespace\n self.namespace = check.__NAMESPACE__ or config.get('namespace', '')\n if not isinstance(self.namespace, str):\n raise ConfigurationError('Setting `namespace` must be a string')\n\n self.raw_metric_prefix = config.get('raw_metric_prefix', '')\n if not isinstance(self.raw_metric_prefix, str):\n raise ConfigurationError('Setting `raw_metric_prefix` must be a string')\n\n self.enable_health_service_check = is_affirmative(config.get('enable_health_service_check', True))\n self.ignore_connection_errors = is_affirmative(config.get('ignore_connection_errors', False))\n\n self.hostname_label = config.get('hostname_label', '')\n if not isinstance(self.hostname_label, str):\n raise ConfigurationError('Setting `hostname_label` must be a 
string')\n\n hostname_format = config.get('hostname_format', '')\n if not isinstance(hostname_format, str):\n raise ConfigurationError('Setting `hostname_format` must be a string')\n\n self.hostname_formatter = None\n if self.hostname_label and hostname_format:\n placeholder = '<HOSTNAME>'\n if placeholder not in hostname_format:\n raise ConfigurationError(f'Setting `hostname_format` does not contain the placeholder `{placeholder}`')\n\n self.hostname_formatter = lambda hostname: hostname_format.replace('<HOSTNAME>', hostname, 1)\n\n exclude_labels = config.get('exclude_labels', [])\n if not isinstance(exclude_labels, list):\n raise ConfigurationError('Setting `exclude_labels` must be an array')\n\n self.exclude_labels = set()\n for i, entry in enumerate(exclude_labels, 1):\n if not isinstance(entry, str):\n raise ConfigurationError(f'Entry #{i} of setting `exclude_labels` must be a string')\n\n self.exclude_labels.add(entry)\n\n include_labels = config.get('include_labels', [])\n if not isinstance(include_labels, list):\n raise ConfigurationError('Setting `include_labels` must be an array')\n self.include_labels = set()\n for i, entry in enumerate(include_labels, 1):\n if not isinstance(entry, str):\n raise ConfigurationError(f'Entry #{i} of setting `include_labels` must be a string')\n if entry in self.exclude_labels:\n self.log.debug(\n 'Label `%s` is set in both `exclude_labels` and `include_labels`. 
Excluding label.', entry\n )\n self.include_labels.add(entry)\n\n self.rename_labels = config.get('rename_labels', {})\n if not isinstance(self.rename_labels, dict):\n raise ConfigurationError('Setting `rename_labels` must be a mapping')\n\n for key, value in self.rename_labels.items():\n if not isinstance(value, str):\n raise ConfigurationError(f'Value for label `{key}` of setting `rename_labels` must be a string')\n\n exclude_metrics = config.get('exclude_metrics', [])\n if not isinstance(exclude_metrics, list):\n raise ConfigurationError('Setting `exclude_metrics` must be an array')\n\n self.exclude_metrics = set()\n self.exclude_metrics_pattern = None\n exclude_metrics_patterns = []\n for i, entry in enumerate(exclude_metrics, 1):\n if not isinstance(entry, str):\n raise ConfigurationError(f'Entry #{i} of setting `exclude_metrics` must be a string')\n\n escaped_entry = re.escape(entry)\n if entry == escaped_entry:\n self.exclude_metrics.add(entry)\n else:\n exclude_metrics_patterns.append(entry)\n\n if exclude_metrics_patterns:\n self.exclude_metrics_pattern = re.compile('|'.join(exclude_metrics_patterns))\n\n self.exclude_metrics_by_labels = {}\n exclude_metrics_by_labels = config.get('exclude_metrics_by_labels', {})\n if not isinstance(exclude_metrics_by_labels, dict):\n raise ConfigurationError('Setting `exclude_metrics_by_labels` must be a mapping')\n elif exclude_metrics_by_labels:\n for label, values in exclude_metrics_by_labels.items():\n if values is True:\n self.exclude_metrics_by_labels[label] = return_true\n elif isinstance(values, list):\n for i, value in enumerate(values, 1):\n if not isinstance(value, str):\n raise ConfigurationError(\n f'Value #{i} for label `{label}` of setting `exclude_metrics_by_labels` '\n f'must be a string'\n )\n\n self.exclude_metrics_by_labels[label] = (\n lambda label_value, pattern=re.compile('|'.join(values)): pattern.search( # noqa: B008\n label_value\n ) # noqa: B008, E501\n is not None\n )\n else:\n raise 
ConfigurationError(\n f'Label `{label}` of setting `exclude_metrics_by_labels` must be an array or set to `true`'\n )\n\n custom_tags = config.get('tags', []) # type: List[str]\n if not isinstance(custom_tags, list):\n raise ConfigurationError('Setting `tags` must be an array')\n\n for i, entry in enumerate(custom_tags, 1):\n if not isinstance(entry, str):\n raise ConfigurationError(f'Entry #{i} of setting `tags` must be a string')\n\n # Some tags can be ignored to reduce the cardinality.\n # This can be useful for cost optimization in containerized environments\n # when the openmetrics check is configured to collect custom metrics.\n # Even when the Agent's Tagger is configured to add low-cardinality tags only,\n # some tags can still generate unwanted metric contexts (e.g pod annotations as tags).\n ignore_tags = config.get('ignore_tags', [])\n if ignore_tags:\n ignored_tags_re = re.compile('|'.join(set(ignore_tags)))\n custom_tags = [tag for tag in custom_tags if not ignored_tags_re.search(tag)]\n\n self.static_tags = copy(custom_tags)\n if is_affirmative(self.config.get('tag_by_endpoint', True)):\n self.static_tags.append(f'endpoint:{self.endpoint}')\n\n # These will be applied only to service checks\n self.static_tags = tuple(self.static_tags)\n # These will be applied to everything except service checks\n self.tags = self.static_tags\n\n self.raw_line_filter = None\n raw_line_filters = config.get('raw_line_filters', [])\n if not isinstance(raw_line_filters, list):\n raise ConfigurationError('Setting `raw_line_filters` must be an array')\n elif raw_line_filters:\n for i, entry in enumerate(raw_line_filters, 1):\n if not isinstance(entry, str):\n raise ConfigurationError(f'Entry #{i} of setting `raw_line_filters` must be a string')\n\n self.raw_line_filter = re.compile('|'.join(raw_line_filters))\n\n self.http = RequestsWrapper(config, self.check.init_config, self.check.HTTP_CONFIG_REMAPPER, self.check.log)\n\n self._content_type = ''\n self._use_latest_spec = 
is_affirmative(config.get('use_latest_spec', False))\n if self._use_latest_spec:\n accept_header = 'application/openmetrics-text;version=1.0.0,application/openmetrics-text;version=0.0.1'\n else:\n accept_header = 'text/plain'\n\n # Request the appropriate exposition format\n if self.http.options['headers'].get('Accept') == '*/*':\n self.http.options['headers']['Accept'] = accept_header\n\n self.use_process_start_time = is_affirmative(config.get('use_process_start_time'))\n\n # Used for monotonic counts\n self.flush_first_value = False\n\n def scrape(self):\n \"\"\"\n Execute a scrape, and for each metric collected, transform the metric.\n \"\"\"\n runtime_data = {'flush_first_value': self.flush_first_value, 'static_tags': self.static_tags}\n\n for metric in self.consume_metrics(runtime_data):\n transformer = self.metric_transformer.get(metric)\n if transformer is None:\n continue\n\n transformer(metric, self.generate_sample_data(metric), runtime_data)\n\n self.flush_first_value = True\n\n def consume_metrics(self, runtime_data):\n \"\"\"\n Yield the processed metrics and filter out excluded metrics.\n \"\"\"\n\n metric_parser = self.parse_metrics()\n if not self.flush_first_value and self.use_process_start_time:\n metric_parser = first_scrape_handler(metric_parser, runtime_data, datadog_agent.get_process_start_time())\n if self.label_aggregator.configured:\n metric_parser = self.label_aggregator(metric_parser)\n\n for metric in metric_parser:\n if metric.name in self.exclude_metrics or (\n self.exclude_metrics_pattern is not None and self.exclude_metrics_pattern.search(metric.name)\n ):\n self.submit_telemetry_number_of_ignored_metric_samples(metric)\n continue\n\n yield metric\n\n def parse_metrics(self):\n \"\"\"\n Get the line streamer and yield processed metrics.\n \"\"\"\n\n line_streamer = self.stream_connection_lines()\n if self.raw_line_filter is not None:\n line_streamer = self.filter_connection_lines(line_streamer)\n\n # Since we determine 
`self.parse_metric_families` dynamically from the response and that's done as a\n # side effect inside the `line_streamer` generator, we need to consume the first line in order to\n # trigger that side effect.\n try:\n line_streamer = chain([next(line_streamer)], line_streamer)\n except StopIteration:\n # If line_streamer is an empty iterator, next(line_streamer) fails.\n return\n\n for metric in self.parse_metric_families(line_streamer):\n self.submit_telemetry_number_of_total_metric_samples(metric)\n\n # It is critical that the prefix is removed immediately so that\n # all other configuration may reference the trimmed metric name\n if self.raw_metric_prefix and metric.name.startswith(self.raw_metric_prefix):\n metric.name = metric.name[len(self.raw_metric_prefix) :]\n\n yield metric\n\n @property\n def parse_metric_families(self):\n media_type = self._content_type.split(';')[0]\n # Setting `use_latest_spec` forces the use of the OpenMetrics format, otherwise\n # the format will be chosen based on the media type specified in the response's content-header.\n # The selection is based on what Prometheus does:\n # https://github.com/prometheus/prometheus/blob/v2.43.0/model/textparse/interface.go#L83-L90\n return (\n parse_openmetrics\n if self._use_latest_spec or media_type == 'application/openmetrics-text'\n else parse_prometheus\n )\n\n def generate_sample_data(self, metric):\n \"\"\"\n Yield a sample of processed data.\n \"\"\"\n\n label_normalizer = get_label_normalizer(metric.type)\n\n for sample in metric.samples:\n value = sample.value\n if isnan(value) or isinf(value):\n self.log.debug('Ignoring sample for metric `%s` as it has an invalid value: %s', metric.name, value)\n continue\n\n tags = []\n skip_sample = False\n labels = sample.labels\n self.label_aggregator.populate(labels)\n label_normalizer(labels)\n\n for label_name, label_value in labels.items():\n sample_excluder = self.exclude_metrics_by_labels.get(label_name)\n if sample_excluder is not None and 
sample_excluder(label_value):\n skip_sample = True\n break\n elif label_name in self.exclude_labels:\n continue\n elif self.include_labels and label_name not in self.include_labels:\n continue\n\n label_name = self.rename_labels.get(label_name, label_name)\n tags.append(f'{label_name}:{label_value}')\n\n if skip_sample:\n continue\n\n tags.extend(self.tags)\n\n hostname = \"\"\n if self.hostname_label and self.hostname_label in labels:\n hostname = labels[self.hostname_label]\n if self.hostname_formatter is not None:\n hostname = self.hostname_formatter(hostname)\n\n self.submit_telemetry_number_of_processed_metric_samples()\n yield sample, tags, hostname\n\n def stream_connection_lines(self):\n \"\"\"\n Yield the connection line.\n \"\"\"\n\n try:\n with self.get_connection() as connection:\n # Media type will be used to select parser dynamically\n self._content_type = connection.headers.get('Content-Type', '')\n for line in connection.iter_lines(decode_unicode=True):\n yield line\n except ConnectionError as e:\n if self.ignore_connection_errors:\n self.log.warning(\"OpenMetrics endpoint %s is not accessible\", self.endpoint)\n else:\n raise e\n\n def filter_connection_lines(self, line_streamer):\n \"\"\"\n Filter connection lines in the line streamer.\n \"\"\"\n\n for line in line_streamer:\n if self.raw_line_filter.search(line):\n self.submit_telemetry_number_of_ignored_lines()\n else:\n yield line\n\n def get_connection(self):\n \"\"\"\n Send a request to scrape metrics. 
Return the response or throw an exception.\n \"\"\"\n\n try:\n response = self.send_request()\n except Exception as e:\n self.submit_health_check(ServiceCheck.CRITICAL, message=str(e))\n raise\n else:\n try:\n response.raise_for_status()\n except Exception as e:\n self.submit_health_check(ServiceCheck.CRITICAL, message=str(e))\n response.close()\n raise\n else:\n self.submit_health_check(ServiceCheck.OK)\n\n # Never derive the encoding from the locale\n if response.encoding is None:\n response.encoding = 'utf-8'\n\n self.submit_telemetry_endpoint_response_size(response)\n\n return response\n\n def send_request(self, **kwargs):\n \"\"\"\n Send an HTTP GET request to the `openmetrics_endpoint` value.\n \"\"\"\n\n kwargs['stream'] = True\n return self.http.get(self.endpoint, **kwargs)\n\n def set_dynamic_tags(self, *tags):\n \"\"\"\n Set dynamic tags.\n \"\"\"\n\n self.tags = tuple(chain(self.static_tags, tags))\n\n def submit_health_check(self, status, **kwargs):\n \"\"\"\n If health service check is enabled, send an `openmetrics.health` service check.\n \"\"\"\n\n if self.enable_health_service_check:\n self.service_check(self.SERVICE_CHECK_HEALTH, status, tags=self.static_tags, **kwargs)\n\n def submit_telemetry_number_of_total_metric_samples(self, metric):\n self.count('telemetry.metrics.input.count', len(metric.samples), tags=self.tags)\n\n def submit_telemetry_number_of_ignored_metric_samples(self, metric):\n self.count('telemetry.metrics.ignored.count', len(metric.samples), tags=self.tags)\n\n def submit_telemetry_number_of_processed_metric_samples(self):\n self.count('telemetry.metrics.processed.count', 1, tags=self.tags)\n\n def submit_telemetry_number_of_ignored_lines(self):\n self.count('telemetry.metrics.blacklist.count', 1, tags=self.tags)\n\n def submit_telemetry_endpoint_response_size(self, response):\n content_length = response.headers.get('Content-Length')\n if content_length is not None:\n content_length = int(content_length)\n else:\n content_length 
= len(response.content)\n\n self.gauge('telemetry.payload.size', content_length, tags=self.tags)\n\n def __getattr__(self, name):\n # Forward all unknown attribute lookups to the check instance for access to submission methods, hostname, etc.\n attribute = getattr(self.check, name)\n setattr(self, name, attribute)\n return attribute\n
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/scraper.py
def __init__(self, check, config):\n \"\"\"\n The base class for any scraper overrides.\n \"\"\"\n\n self.config = config\n\n # Save a reference to the check instance\n self.check = check\n\n # Parse the configuration\n self.endpoint = config['openmetrics_endpoint']\n\n self.metric_transformer = MetricTransformer(self.check, config)\n self.label_aggregator = LabelAggregator(self.check, config)\n\n self.enable_telemetry = is_affirmative(config.get('telemetry', False))\n # Make every telemetry submission method a no-op to avoid many lookups of `self.enable_telemetry`\n if not self.enable_telemetry:\n for name, _ in inspect.getmembers(self, predicate=inspect.ismethod):\n if name.startswith('submit_telemetry_'):\n setattr(self, name, no_op)\n\n # Prevent overriding an integration's defined namespace\n self.namespace = check.__NAMESPACE__ or config.get('namespace', '')\n if not isinstance(self.namespace, str):\n raise ConfigurationError('Setting `namespace` must be a string')\n\n self.raw_metric_prefix = config.get('raw_metric_prefix', '')\n if not isinstance(self.raw_metric_prefix, str):\n raise ConfigurationError('Setting `raw_metric_prefix` must be a string')\n\n self.enable_health_service_check = is_affirmative(config.get('enable_health_service_check', True))\n self.ignore_connection_errors = is_affirmative(config.get('ignore_connection_errors', False))\n\n self.hostname_label = config.get('hostname_label', '')\n if not isinstance(self.hostname_label, str):\n raise ConfigurationError('Setting `hostname_label` must be a string')\n\n hostname_format = config.get('hostname_format', '')\n if not isinstance(hostname_format, str):\n raise ConfigurationError('Setting `hostname_format` must be a string')\n\n self.hostname_formatter = None\n if self.hostname_label and hostname_format:\n placeholder = '<HOSTNAME>'\n if placeholder not in hostname_format:\n raise ConfigurationError(f'Setting `hostname_format` does not contain the placeholder `{placeholder}`')\n\n 
self.hostname_formatter = lambda hostname: hostname_format.replace('<HOSTNAME>', hostname, 1)\n\n exclude_labels = config.get('exclude_labels', [])\n if not isinstance(exclude_labels, list):\n raise ConfigurationError('Setting `exclude_labels` must be an array')\n\n self.exclude_labels = set()\n for i, entry in enumerate(exclude_labels, 1):\n if not isinstance(entry, str):\n raise ConfigurationError(f'Entry #{i} of setting `exclude_labels` must be a string')\n\n self.exclude_labels.add(entry)\n\n include_labels = config.get('include_labels', [])\n if not isinstance(include_labels, list):\n raise ConfigurationError('Setting `include_labels` must be an array')\n self.include_labels = set()\n for i, entry in enumerate(include_labels, 1):\n if not isinstance(entry, str):\n raise ConfigurationError(f'Entry #{i} of setting `include_labels` must be a string')\n if entry in self.exclude_labels:\n self.log.debug(\n 'Label `%s` is set in both `exclude_labels` and `include_labels`. Excluding label.', entry\n )\n self.include_labels.add(entry)\n\n self.rename_labels = config.get('rename_labels', {})\n if not isinstance(self.rename_labels, dict):\n raise ConfigurationError('Setting `rename_labels` must be a mapping')\n\n for key, value in self.rename_labels.items():\n if not isinstance(value, str):\n raise ConfigurationError(f'Value for label `{key}` of setting `rename_labels` must be a string')\n\n exclude_metrics = config.get('exclude_metrics', [])\n if not isinstance(exclude_metrics, list):\n raise ConfigurationError('Setting `exclude_metrics` must be an array')\n\n self.exclude_metrics = set()\n self.exclude_metrics_pattern = None\n exclude_metrics_patterns = []\n for i, entry in enumerate(exclude_metrics, 1):\n if not isinstance(entry, str):\n raise ConfigurationError(f'Entry #{i} of setting `exclude_metrics` must be a string')\n\n escaped_entry = re.escape(entry)\n if entry == escaped_entry:\n self.exclude_metrics.add(entry)\n else:\n 
exclude_metrics_patterns.append(entry)\n\n if exclude_metrics_patterns:\n self.exclude_metrics_pattern = re.compile('|'.join(exclude_metrics_patterns))\n\n self.exclude_metrics_by_labels = {}\n exclude_metrics_by_labels = config.get('exclude_metrics_by_labels', {})\n if not isinstance(exclude_metrics_by_labels, dict):\n raise ConfigurationError('Setting `exclude_metrics_by_labels` must be a mapping')\n elif exclude_metrics_by_labels:\n for label, values in exclude_metrics_by_labels.items():\n if values is True:\n self.exclude_metrics_by_labels[label] = return_true\n elif isinstance(values, list):\n for i, value in enumerate(values, 1):\n if not isinstance(value, str):\n raise ConfigurationError(\n f'Value #{i} for label `{label}` of setting `exclude_metrics_by_labels` '\n f'must be a string'\n )\n\n self.exclude_metrics_by_labels[label] = (\n lambda label_value, pattern=re.compile('|'.join(values)): pattern.search( # noqa: B008\n label_value\n ) # noqa: B008, E501\n is not None\n )\n else:\n raise ConfigurationError(\n f'Label `{label}` of setting `exclude_metrics_by_labels` must be an array or set to `true`'\n )\n\n custom_tags = config.get('tags', []) # type: List[str]\n if not isinstance(custom_tags, list):\n raise ConfigurationError('Setting `tags` must be an array')\n\n for i, entry in enumerate(custom_tags, 1):\n if not isinstance(entry, str):\n raise ConfigurationError(f'Entry #{i} of setting `tags` must be a string')\n\n # Some tags can be ignored to reduce the cardinality.\n # This can be useful for cost optimization in containerized environments\n # when the openmetrics check is configured to collect custom metrics.\n # Even when the Agent's Tagger is configured to add low-cardinality tags only,\n # some tags can still generate unwanted metric contexts (e.g pod annotations as tags).\n ignore_tags = config.get('ignore_tags', [])\n if ignore_tags:\n ignored_tags_re = re.compile('|'.join(set(ignore_tags)))\n custom_tags = [tag for tag in custom_tags if not 
ignored_tags_re.search(tag)]\n\n self.static_tags = copy(custom_tags)\n if is_affirmative(self.config.get('tag_by_endpoint', True)):\n self.static_tags.append(f'endpoint:{self.endpoint}')\n\n # These will be applied only to service checks\n self.static_tags = tuple(self.static_tags)\n # These will be applied to everything except service checks\n self.tags = self.static_tags\n\n self.raw_line_filter = None\n raw_line_filters = config.get('raw_line_filters', [])\n if not isinstance(raw_line_filters, list):\n raise ConfigurationError('Setting `raw_line_filters` must be an array')\n elif raw_line_filters:\n for i, entry in enumerate(raw_line_filters, 1):\n if not isinstance(entry, str):\n raise ConfigurationError(f'Entry #{i} of setting `raw_line_filters` must be a string')\n\n self.raw_line_filter = re.compile('|'.join(raw_line_filters))\n\n self.http = RequestsWrapper(config, self.check.init_config, self.check.HTTP_CONFIG_REMAPPER, self.check.log)\n\n self._content_type = ''\n self._use_latest_spec = is_affirmative(config.get('use_latest_spec', False))\n if self._use_latest_spec:\n accept_header = 'application/openmetrics-text;version=1.0.0,application/openmetrics-text;version=0.0.1'\n else:\n accept_header = 'text/plain'\n\n # Request the appropriate exposition format\n if self.http.options['headers'].get('Accept') == '*/*':\n self.http.options['headers']['Accept'] = accept_header\n\n self.use_process_start_time = is_affirmative(config.get('use_process_start_time'))\n\n # Used for monotonic counts\n self.flush_first_value = False\n
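The exclude_metrics handling in the initializer above treats an entry as an exact name when re.escape leaves it unchanged and as a regular expression otherwise. The following is a simplified sketch of that split. The helper names split_exclude_metrics and is_excluded are hypothetical, and the metric names are examples only:

```python
import re


def split_exclude_metrics(entries):
    # Entries without regex metacharacters go into a set for fast exact lookups
    exact = set()
    patterns = []
    for entry in entries:
        if entry == re.escape(entry):
            exact.add(entry)
        else:
            patterns.append(entry)

    # All remaining entries are combined into a single alternation
    pattern = re.compile('|'.join(patterns)) if patterns else None
    return exact, pattern


def is_excluded(name, exact, pattern):
    return name in exact or (pattern is not None and pattern.search(name) is not None)


exact, pattern = split_exclude_metrics(['go_gc_duration_seconds', '^process_.+'])
```

Note that the pattern is applied with search rather than match, so an unanchored pattern matches anywhere in the metric name.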
Execute a scrape, and for each metric collected, transform the metric.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/scraper.py
def scrape(self):\n \"\"\"\n Execute a scrape, and for each metric collected, transform the metric.\n \"\"\"\n runtime_data = {'flush_first_value': self.flush_first_value, 'static_tags': self.static_tags}\n\n for metric in self.consume_metrics(runtime_data):\n transformer = self.metric_transformer.get(metric)\n if transformer is None:\n continue\n\n transformer(metric, self.generate_sample_data(metric), runtime_data)\n\n self.flush_first_value = True\n
Yield the processed metrics and filter out excluded metrics.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/scraper.py
def consume_metrics(self, runtime_data):\n \"\"\"\n Yield the processed metrics and filter out excluded metrics.\n \"\"\"\n\n metric_parser = self.parse_metrics()\n if not self.flush_first_value and self.use_process_start_time:\n metric_parser = first_scrape_handler(metric_parser, runtime_data, datadog_agent.get_process_start_time())\n if self.label_aggregator.configured:\n metric_parser = self.label_aggregator(metric_parser)\n\n for metric in metric_parser:\n if metric.name in self.exclude_metrics or (\n self.exclude_metrics_pattern is not None and self.exclude_metrics_pattern.search(metric.name)\n ):\n self.submit_telemetry_number_of_ignored_metric_samples(metric)\n continue\n\n yield metric\n
Get the line streamer and yield processed metrics.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/scraper.py
def parse_metrics(self):\n \"\"\"\n Get the line streamer and yield processed metrics.\n \"\"\"\n\n line_streamer = self.stream_connection_lines()\n if self.raw_line_filter is not None:\n line_streamer = self.filter_connection_lines(line_streamer)\n\n # Since we determine `self.parse_metric_families` dynamically from the response and that's done as a\n # side effect inside the `line_streamer` generator, we need to consume the first line in order to\n # trigger that side effect.\n try:\n line_streamer = chain([next(line_streamer)], line_streamer)\n except StopIteration:\n # If line_streamer is an empty iterator, next(line_streamer) fails.\n return\n\n for metric in self.parse_metric_families(line_streamer):\n self.submit_telemetry_number_of_total_metric_samples(metric)\n\n # It is critical that the prefix is removed immediately so that\n # all other configuration may reference the trimmed metric name\n if self.raw_metric_prefix and metric.name.startswith(self.raw_metric_prefix):\n metric.name = metric.name[len(self.raw_metric_prefix) :]\n\n yield metric\n
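parse_metrics consumes the first line of the stream so that the generator runs far enough to record the response content type, then puts that line back with itertools.chain. A minimal sketch of the same technique with a toy generator (prime, stream_lines, and state are hypothetical names, not library code):

```python
from itertools import chain

state = {'content_type': ''}


def stream_lines():
    # The side effect only happens once the generator starts running
    state['content_type'] = 'text/plain'
    yield 'metric_a 1'
    yield 'metric_b 2'


def prime(iterator):
    # Pull the first item to trigger side effects, then put it back in front
    try:
        first = next(iterator)
    except StopIteration:
        # Empty stream, nothing to parse
        return None
    return chain([first], iterator)


lines = prime(stream_lines())
content_type_after_priming = state['content_type']
```

Without the priming step the content type would still be empty at the moment the parser is chosen, since generator bodies do not execute until the first item is requested.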
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/scraper.py
def generate_sample_data(self, metric):\n \"\"\"\n Yield a sample of processed data.\n \"\"\"\n\n label_normalizer = get_label_normalizer(metric.type)\n\n for sample in metric.samples:\n value = sample.value\n if isnan(value) or isinf(value):\n self.log.debug('Ignoring sample for metric `%s` as it has an invalid value: %s', metric.name, value)\n continue\n\n tags = []\n skip_sample = False\n labels = sample.labels\n self.label_aggregator.populate(labels)\n label_normalizer(labels)\n\n for label_name, label_value in labels.items():\n sample_excluder = self.exclude_metrics_by_labels.get(label_name)\n if sample_excluder is not None and sample_excluder(label_value):\n skip_sample = True\n break\n elif label_name in self.exclude_labels:\n continue\n elif self.include_labels and label_name not in self.include_labels:\n continue\n\n label_name = self.rename_labels.get(label_name, label_name)\n tags.append(f'{label_name}:{label_value}')\n\n if skip_sample:\n continue\n\n tags.extend(self.tags)\n\n hostname = \"\"\n if self.hostname_label and self.hostname_label in labels:\n hostname = labels[self.hostname_label]\n if self.hostname_formatter is not None:\n hostname = self.hostname_formatter(hostname)\n\n self.submit_telemetry_number_of_processed_metric_samples()\n yield sample, tags, hostname\n
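The label loop in generate_sample_data applies exclude_labels, then include_labels, then rename_labels before building each tag. Here is a simplified sketch of that order as a standalone function. labels_to_tags is a hypothetical helper and it leaves out the exclude_metrics_by_labels check that can skip an entire sample:

```python
def labels_to_tags(labels, exclude_labels=(), include_labels=(), rename_labels=None):
    rename_labels = rename_labels or {}
    tags = []

    for label_name, label_value in labels.items():
        if label_name in exclude_labels:
            continue
        # An empty include list means every non-excluded label is kept
        if include_labels and label_name not in include_labels:
            continue

        # Renaming happens last, so filters always refer to the original label name
        label_name = rename_labels.get(label_name, label_name)
        tags.append(f'{label_name}:{label_value}')

    return tags


tags = labels_to_tags(
    {'pod': 'web-1', 'le': '0.5', 'host': 'n1'},
    exclude_labels={'le'},
    rename_labels={'host': 'node'},
)
```

Since exclusion is checked first, a label listed in both exclude_labels and include_labels is dropped, which matches the debug message logged by the initializer.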
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/scraper.py
def stream_connection_lines(self):\n \"\"\"\n Yield the connection line.\n \"\"\"\n\n try:\n with self.get_connection() as connection:\n # Media type will be used to select parser dynamically\n self._content_type = connection.headers.get('Content-Type', '')\n for line in connection.iter_lines(decode_unicode=True):\n yield line\n except ConnectionError as e:\n if self.ignore_connection_errors:\n self.log.warning(\"OpenMetrics endpoint %s is not accessible\", self.endpoint)\n else:\n raise e\n
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/scraper.py
def filter_connection_lines(self, line_streamer):\n \"\"\"\n Filter connection lines in the line streamer.\n \"\"\"\n\n for line in line_streamer:\n if self.raw_line_filter.search(line):\n self.submit_telemetry_number_of_ignored_lines()\n else:\n yield line\n
Send a request to scrape metrics. Return the response or throw an exception.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/scraper.py
def get_connection(self):\n \"\"\"\n Send a request to scrape metrics. Return the response or throw an exception.\n \"\"\"\n\n try:\n response = self.send_request()\n except Exception as e:\n self.submit_health_check(ServiceCheck.CRITICAL, message=str(e))\n raise\n else:\n try:\n response.raise_for_status()\n except Exception as e:\n self.submit_health_check(ServiceCheck.CRITICAL, message=str(e))\n response.close()\n raise\n else:\n self.submit_health_check(ServiceCheck.OK)\n\n # Never derive the encoding from the locale\n if response.encoding is None:\n response.encoding = 'utf-8'\n\n self.submit_telemetry_endpoint_response_size(response)\n\n return response\n
If health service check is enabled, send an openmetrics.health service check.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/scraper.py
def submit_health_check(self, status, **kwargs):\n \"\"\"\n If health service check is enabled, send an `openmetrics.health` service check.\n \"\"\"\n\n if self.enable_health_service_check:\n self.service_check(self.SERVICE_CHECK_HEALTH, status, tags=self.static_tags, **kwargs)\n
"},{"location":"base/openmetrics/#transformers","title":"Transformers","text":""},{"location":"base/openmetrics/#datadog_checks.base.checks.openmetrics.v2.transform.Transformers","title":"datadog_checks.base.checks.openmetrics.v2.transform.Transformers","text":"Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/v2/transform.py
This OpenMetrics implementation is the updated version of the original Prometheus/OpenMetrics implementation. The docs for the deprecated implementation are still available as a reference.
TLS/SSL is widely used to secure communications over a network. Much of the software that Datadog supports has features to allow TLS/SSL. Therefore, the Datadog Agent may need to connect with TLS/SSL to get metrics.
For Agent v7.24+, checks compatible with TLS/SSL should not manually create a raw ssl.SSLContext. Instead, check implementations should use AgentCheck.get_tls_context() to obtain a TLS/SSL context.
get_tls_context() accepts a few optional parameters which may be helpful when developing integrations.
Creates and caches an SSLContext instance based on user configuration. Note that user configuration can be overridden by using overrides. This should only be applied to older integrations that manually set config values.
Since: Agent 7.24
Source code in datadog_checks_base/datadog_checks/base/checks/base.py
def get_tls_context(self, refresh=False, overrides=None):\n # type: (bool, Dict[AnyStr, Any]) -> ssl.SSLContext\n \"\"\"\n Creates and cache an SSLContext instance based on user configuration.\n Note that user configuration can be overridden by using `overrides`.\n This should only be applied to older integration that manually set config values.\n\n Since: Agent 7.24\n \"\"\"\n if not hasattr(self, '_tls_context_wrapper'):\n self._tls_context_wrapper = TlsContextWrapper(\n self.instance or {}, self.TLS_CONFIG_REMAPPER, overrides=overrides\n )\n\n if refresh:\n self._tls_context_wrapper.refresh_tls_context()\n\n return self._tls_context_wrapper.tls_context\n
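The create-once, reuse, and refresh-on-request behaviour described above can be sketched with the standard library alone. TlsContextCache is a hypothetical, simplified stand-in and not the wrapper used by the base package: it ignores user configuration and overrides and always builds a default context.

```python
import ssl


class TlsContextCache:
    # Hypothetical, simplified stand-in for the wrapper behind
    # get_tls_context(); it ignores user configuration entirely.
    def __init__(self):
        self._context = None

    def get_tls_context(self, refresh=False):
        # Build the context on first use, or rebuild it on request
        if self._context is None or refresh:
            self._context = ssl.create_default_context()
        return self._context


cache = TlsContextCache()
first = cache.get_tls_context()
second = cache.get_tls_context()
rebuilt = cache.get_tls_context(refresh=True)
```

Repeated calls return the same object, so a check can call the method on every run without paying the setup cost each time; passing refresh=True is the only way this sketch produces a new context.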
"},{"location":"ddev/about/","title":"What's in the box?","text":"
The Dev package, often referred to by the name of its CLI entrypoint ddev, is fundamentally split into 2 parts.
The test framework provides everything necessary to test integrations, such as:
Dependencies like pytest, mock, requests, etc.
Utilities for consistently handling complex logic or common operations
An orchestrator for arbitrary E2E environments
Python 2 Alert!
Some integrations still support Python version 2.7 and must be tested with it. As a consequence, so must parts of our test framework, for example the pytest plugin.
The CLI provides the interface through which tests are invoked, E2E environments are managed, and general repository maintenance (such as dependency management) occurs.
As the dependencies of the test framework are a subset of what is required for the CLI, the CLI tooling may import from the test framework, but not vice versa.
The diagram below shows the import hierarchy between each component. Clicking a node will open that component's location in the source code.
graph BT\n A([Plugins])\n click A \"https://github.com/DataDog/integrations-core/tree/master/datadog_checks_dev/datadog_checks/dev/plugin\" \"Test framework plugins location\"\n\n B([Test framework])\n click B \"https://github.com/DataDog/integrations-core/tree/master/datadog_checks_dev/datadog_checks/dev\" \"Test framework location\"\n\n C([CLI])\n click C \"https://github.com/DataDog/integrations-core/tree/master/datadog_checks_dev/datadog_checks/dev/tooling\" \"CLI tooling location\"\n\n A-->B\n C-->B
Name Type Description Default --core, -c boolean Work on integrations-core. False--extras, -e boolean Work on integrations-extras. False--marketplace, -m boolean Work on marketplace. False--agent, -a boolean Work on datadog-agent. False--here, -x boolean Work on the current location. False--org, -o text Override org config field for this invocation. None --color / --no-color boolean Whether or not to display colored output (default is auto-detection) [env vars: FORCE_COLOR/NO_COLOR] None --interactive / --no-interactive boolean Whether or not to allow features like prompts and progress bars (default is auto-detection) [env var: DDEV_INTERACTIVE] None --verbose, -v integer range (0 and above) Increase verbosity (can be used additively) [env var: DDEV_VERBOSE] 0--quiet, -q integer range (0 and above) Decrease verbosity (can be used additively) [env var: DDEV_QUIET] 0--config text The path to a custom config file to use [env var: DDEV_CONFIG] None --version boolean Show the version and exit. False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-ci","title":"ddev ci","text":"
CI related utils. Anything here should be considered experimental.
Usage:
ddev ci [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-ci-setup","title":"ddev ci setup","text":"
Run CI setup scripts
Usage:
ddev ci setup [OPTIONS] [CHECKS]...\n
Options:
Name Type Description Default --changed boolean Only target changed checks False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-clean","title":"ddev clean","text":"
Remove build and test artifacts for the entire repository.
Usage:
ddev clean [OPTIONS]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-config","title":"ddev config","text":"
Manage the config file
Usage:
ddev config [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-config-edit","title":"ddev config edit","text":"
Edit the config file with your default editor.
Usage:
ddev config edit [OPTIONS]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-config-explore","title":"ddev config explore","text":"
Open the config location in your file manager.
Usage:
ddev config explore [OPTIONS]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-config-find","title":"ddev config find","text":"
Show the location of the config file.
Usage:
ddev config find [OPTIONS]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-config-restore","title":"ddev config restore","text":"
Restore the config file to default settings.
Usage:
ddev config restore [OPTIONS]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-config-set","title":"ddev config set","text":"
Assign values to config file entries. If the value is omitted, you will be prompted, with the input hidden if it is sensitive.
Usage:
ddev config set [OPTIONS] KEY [VALUE]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-config-show","title":"ddev config show","text":"
Show the contents of the config file.
Usage:
ddev config show [OPTIONS]\n
Options:
Name Type Description Default --all, -a boolean Do not scrub secret fields False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-create","title":"ddev create","text":"
Create scaffolding for a new integration.
NAME: The display name of the integration that will appear in documentation.
Usage:
ddev create [OPTIONS] NAME\n
Options:
Name Type Description Default --type, -t choice (check | jmx | logs | metrics_crawler | snmp_tile | tile) The type of integration to create. See below for more details. check--location, -l text The directory where files will be written None --non-interactive, -ni boolean Disable prompting for fields False--quiet, -q boolean Show less output False--dry-run, -n boolean Only show what would be created False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-dep","title":"ddev dep","text":"
Manage dependencies
Usage:
ddev dep [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-dep-freeze","title":"ddev dep freeze","text":"
Combine all dependencies for the Agent's static environment.
This reads and merges the dependency specs from individual integrations and writes them to agent_requirements.in
Usage:
ddev dep freeze [OPTIONS]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-dep-pin","title":"ddev dep pin","text":"
Pin a dependency for all checks that require it.
Usage:
ddev dep pin [OPTIONS] DEFINITION\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-dep-sync","title":"ddev dep sync","text":"
Synchronize integration dependency spec with that of the agent as a whole.
Reads dependency spec from agent_requirements.in and propagates it to all integrations. For each integration we propagate only the relevant parts (i.e. its direct dependencies).
Usage:
ddev dep sync [OPTIONS]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-dep-updates","title":"ddev dep updates","text":"
Automatically check for dependency updates
Usage:
ddev dep updates [OPTIONS]\n
Options:
Name Type Description Default --sync, -s boolean Update the dependency definitions False--include-security-deps, -i boolean Attempt to update security dependencies False--batch-size, -b integer The maximum number of dependencies to upgrade if syncing None --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-docs","title":"ddev docs","text":"
Manage documentation.
Usage:
ddev docs [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-docs-build","title":"ddev docs build","text":"
Build documentation.
Usage:
ddev docs build [OPTIONS]\n
Options:
Name Type Description Default --check boolean Ensure links are valid False--pdf boolean Also export the site as PDF False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-docs-serve","title":"ddev docs serve","text":"
Serve documentation.
Usage:
ddev docs serve [OPTIONS]\n
Options:
Name Type Description Default --dirty boolean Speed up reload time by only rebuilding edited pages (based on modified time). For development only. False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-env","title":"ddev env","text":"
Manage environments.
Usage:
ddev env [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-env-agent","title":"ddev env agent","text":"
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-env-config","title":"ddev env config","text":"
Manage the config file
Usage:
ddev env config [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-env-config-edit","title":"ddev env config edit","text":"
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-env-config-explore","title":"ddev env config explore","text":"
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-env-config-find","title":"ddev env config find","text":"
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-env-config-show","title":"ddev env config show","text":"
Show the contents of the config file.
Usage:
ddev env config show [OPTIONS] INTEGRATION ENVIRONMENT\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-env-reload","title":"ddev env reload","text":"
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-env-shell","title":"ddev env shell","text":"
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-env-show","title":"ddev env show","text":"
Show active or available environments.
Usage:
ddev env show [OPTIONS] INTEGRATION [ENVIRONMENT]\n
Options:
Name Type Description Default --ascii boolean Whether or not to only use ASCII characters False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-env-start","title":"ddev env start","text":"
Name Type Description Default --dev boolean Install the local version of the integration False--base boolean Install the local version of the base package, implicitly enabling the --dev option False--agent, -a text The Agent build to use e.g. a Docker image like datadog/agent:latest. You can also use the name of an Agent defined in the agents configuration section. None -e text Environment variables to pass to the Agent e.g. -e DD_URL=app.datadoghq.com -e DD_API_KEY=foobar None --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-env-stop","title":"ddev env stop","text":"
Stop environments. To stop all the running environments, use all as the integration name and the environment.
Usage:
ddev env stop [OPTIONS] INTEGRATION ENVIRONMENT\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-env-test","title":"ddev env test","text":"
Test environments.
This runs the end-to-end tests.
If no ENVIRONMENT is specified, active is selected which will test all environments that are currently running. You may choose all to test all environments whether or not they are running.
Testing active environments will not stop them after tests complete. Testing environments that are not running will start and stop them automatically.
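The selection and cleanup rules above can be summarised as a small sketch. Both functions and the sample environment names are hypothetical illustrations of the documented behaviour, not code taken from ddev.

```python
def select_environments(requested, running, available):
    # Hypothetical helper: requested is the ENVIRONMENT argument (or None),
    # running and available are lists of environment names.
    if requested is None or requested == 'active':
        return list(running)
    if requested == 'all':
        return list(available)
    return [requested]


def should_stop_after(env, running):
    # Environments that were already running are left running;
    # ones started only for the test run are stopped afterwards.
    return env not in running


available = ['py3.9-13.0', 'py3.9-14.0']
running = ['py3.9-14.0']
```

With these sample values, omitting ENVIRONMENT would test only py3.9-14.0 and leave it running, while all would also start, test, and stop py3.9-13.0.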
See these docs for how to pass ENVIRONMENT and PYTEST_ARGS:
https://datadoghq.dev/integrations-core/testing/
Usage:
ddev env test [OPTIONS] INTEGRATION [ENVIRONMENT] [PYTEST_ARGS]...\n
Options:
Name Type Description Default --dev boolean Install the local version of the integration False--base boolean Install the local version of the base package, implicitly enabling the --dev option False--agent, -a text The Agent build to use e.g. a Docker image like datadog/agent:latest. You can also use the name of an Agent defined in the agents configuration section. None -e text Environment variables to pass to the Agent e.g. -e DD_URL=app.datadoghq.com -e DD_API_KEY=foobar None --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta","title":"ddev meta","text":"
Anything here should be considered experimental.
This meta namespace can be used for an arbitrary number of niche or beta features without bloating the root namespace.
Usage:
ddev meta [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-catalog","title":"ddev meta catalog","text":"
Create a catalog with information about integrations
Usage:
ddev meta catalog [OPTIONS] CHECKS...\n
Options:
Name Type Description Default -f, --file text Output to file (it will be overwritten), you can pass \"tmp\" to generate a temporary file None --markdown, -m boolean Output to markdown instead of CSV False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-changes","title":"ddev meta changes","text":"
Show changes since a specific date.
Usage:
ddev meta changes [OPTIONS] SINCE\n
Options:
Name Type Description Default --out, -o boolean Output to file False--eager boolean Skip validation of commit subjects False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-create-example-commits","title":"ddev meta create-example-commits","text":"
Create branch commits from example repo
Usage:
ddev meta create-example-commits [OPTIONS] SOURCE_DIR\n
Options:
Name Type Description Default --prefix, -p text Optional text to prefix each commit `` --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-dash","title":"ddev meta dash","text":"
Dashboard utilities
Usage:
ddev meta dash [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-dash-export","title":"ddev meta dash export","text":"
Export a Dashboard as JSON
Usage:
ddev meta dash export [OPTIONS] URL INTEGRATION\n
Options:
Name Type Description Default --author, -a text The owner of this integration's dashboard. Default is 'Datadog' Datadog--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-jmx","title":"ddev meta jmx","text":"
JMX utilities
Usage:
ddev meta jmx [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-jmx-query-endpoint","title":"ddev meta jmx query-endpoint","text":"
Query endpoint for JMX info
Usage:
ddev meta jmx query-endpoint [OPTIONS] HOST PORT [DOMAIN]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-manifest","title":"ddev meta manifest","text":"
Manifest utilities
Usage:
ddev meta manifest [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-manifest-migrate","title":"ddev meta manifest migrate","text":"
Helper tool to ease the migration of a manifest to a newer version, auto-filling fields when possible
Inputs:
integration: The name of the integration folder to perform the migration on
to_version: The schema version to upgrade the manifest to
Usage:
ddev meta manifest migrate [OPTIONS] INTEGRATION TO_VERSION\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-prom","title":"ddev meta prom","text":"
Prometheus utilities
Usage:
ddev meta prom [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-prom-info","title":"ddev meta prom info","text":"
Show metric info from a Prometheus endpoint.
Example: $ ddev meta prom info -e :8080/_status/vars
Usage:
ddev meta prom info [OPTIONS]\n
Options:
Name Type Description Default -e, --endpoint text N/A None -f, --file filename N/A None --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-prom-parse","title":"ddev meta prom parse","text":"
Interactively parse metric info from a Prometheus endpoint and write it to metadata.csv.
Usage:
ddev meta prom parse [OPTIONS] CHECK\n
Options:
Name Type Description Default -e, --endpoint text N/A None -f, --file filename N/A None --here, -x boolean Output to the current location False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-scripts","title":"ddev meta scripts","text":"
Miscellaneous scripts that may be useful.
Usage:
ddev meta scripts [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-scripts-email2ghuser","title":"ddev meta scripts email2ghuser","text":"
Given an email, attempt to find a GitHub username associated with the email.
$ ddev meta scripts email2ghuser example@datadoghq.com
Usage:
ddev meta scripts email2ghuser [OPTIONS] EMAIL\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-scripts-generate-metrics","title":"ddev meta scripts generate-metrics","text":"
Generate metrics with fake values for an integration
You can provide the site and API key as options:
$ ddev meta scripts generate-metrics --site --api-key
It's easier however to switch ddev's org setting temporarily:
$ ddev -o meta scripts generate-metrics
Usage:
ddev meta scripts generate-metrics [OPTIONS] INTEGRATION\n
Options:
Name Type Description Default --site text The datadog SITE to use, e.g. \"datadoghq.com\". If not provided we will use ddev config org settings. None --api-key text The API key. If not provided we will use ddev config org settings. None --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-scripts-metrics2md","title":"ddev meta scripts metrics2md","text":"
Convert a check's metadata.csv file to a Markdown table, which will be copied to your clipboard.
By default it will be compact and only contain the most useful fields. If you wish to use arbitrary metric data, you may set the check to cb to target the current contents of your clipboard.
Usage:
ddev meta scripts metrics2md [OPTIONS] CHECK [FIELDS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-scripts-remove-labels","title":"ddev meta scripts remove-labels","text":"
Remove all labels from an issue or pull request. This is useful when there are too many labels and its state cannot be modified (known GitHub issue).
$ ddev meta scripts remove-labels 5626
Usage:
ddev meta scripts remove-labels [OPTIONS] ISSUE_NUMBER\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-scripts-serve-openmetrics-payload","title":"ddev meta scripts serve-openmetrics-payload","text":"
Serve and collect metrics from OpenMetrics files with a real Agent
$ ddev meta scripts serve-openmetrics-payload ray payload1.txt payload2.txt
Usage:
ddev meta scripts serve-openmetrics-payload [OPTIONS] INTEGRATION\n [PAYLOADS]...\n
Options:
Name Type Description Default -c, --config text Path to the config file to use for the integration. The openmetrics_endpoint option will be overridden to use the right URL. If not provided, the openmetrics_endpoint will be the only option configured. None --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-scripts-upgrade-python","title":"ddev meta scripts upgrade-python","text":"
Upgrade the Python version of all test environments.
$ ddev meta scripts upgrade-python 3.11
Usage:
ddev meta scripts upgrade-python [OPTIONS] VERSION\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-snmp","title":"ddev meta snmp","text":"
SNMP utilities
Usage:
ddev meta snmp [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-snmp-generate-profile-from-mibs","title":"ddev meta snmp generate-profile-from-mibs","text":"
Generate an SNMP profile from MIBs. Accepts a directory path containing MIB files to be used as the source for generating the profile, along with a filter if a device or family of devices supports only a subset of OIDs from a MIB.
filters is the path to a yaml file containing a collection of MIBs, with their list of MIB node names to be included. For example:
Note that each MIB:node_name corresponds to exactly one OID. However, some MIBs report legacy nodes that are overwritten.
To resolve, edit the MIB by removing legacy values manually before loading them with this profile generator. If a MIB is fully supported, it can be omitted from the filter as MIBs not found in a filter will be fully loaded. If a MIB is not fully supported, it can be listed with an empty node list, as CISCO-SYSLOG-MIB in the example.
-a, --aliases is an option to provide the path to a YAML file containing a list of aliases to be used as metric tags for tables, in the following format:
MIB tables usually define a column OID, within the same table or from a different table or even a different MIB, whose value can be used to index entries. This is the INDEX field in row nodes. As an example, entPhysicalContainsTable in ENTITY-MIB
Sometimes indexes are columns from another table, and we might want to use another column as it could have more human-readable information: for example, we might prefer to see the interface name rather than its numerical table index. This can be achieved using metric_tag_aliases
Return a list of SNMP metrics and copy its YAML dump to the clipboard. Metric tags need to be added manually.
Usage:
ddev meta snmp generate-profile-from-mibs [OPTIONS] [MIB_FILES]...\n
Options:
Name Type Description Default -f, --filters text Path to OIDs filter None -a, --aliases text Path to metric tag aliases None --debug, -d boolean Include debug output False--interactive, -i boolean Prompt to confirm before saving to a file False--source, -s text Source of the MIBs files. Can be a url or a path for a directory https://raw.githubusercontent.com:443/DataDog/mibs.snmplabs.com/master/asn1/@mib@--compiled_mibs_path, -c text Source of compiled MIBs files. Can be a url or a path for a directory https://raw.githubusercontent.com/DataDog/mibs.snmplabs.com/master/json/@mib@--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-snmp-generate-traps-db","title":"ddev meta snmp generate-traps-db","text":"
Generate YAML- or JSON-formatted documents containing various information about traps. These files can be used by the Datadog Agent to enrich trap data. This command is intended for \"Network Devices Monitoring\" users who need to enrich traps that are not automatically supported by Datadog.
The expected workflow is as follows:
1- Identify a type of device that is sending traps that Datadog does not already recognize.
2- Fetch all the MIBs that Datadog does not support.
3- Run ddev meta snmp generate-traps-db -o ./output_dir/ /path/to/my/mib1 /path/to/my/mib2
You'll need to install pysmi manually beforehand.
Usage:
ddev meta snmp generate-traps-db [OPTIONS] MIB_FILES...\n
Options:
Name Type Description Default --mib-sources, -s text Url or a path to a directory containing the dependencies for [mib_files...].Traps defined in these files are ignored. None --output-dir, -o directory Path to a directory where to store the created traps database file per MIB.Recommended option, do not use with --output-file None --output-file file Path to a file to store a compacted version of the traps database file. Do not use with --output-dir None --output-format choice (yaml | json) Use json instead of yaml for the output file(s). yaml--no-descr boolean Removes descriptions from the generated file(s) when set (more compact). False--debug, -d boolean Include debug output False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-snmp-translate-profile","title":"ddev meta snmp translate-profile","text":"
Do OID translation in an SNMP profile. This isn't a plain replacement, as it doesn't preserve comments and indentation, but it should automate most of the work.
You'll need to install pysnmp and pysnmp-mibs manually beforehand.
Usage:
ddev meta snmp translate-profile [OPTIONS] PROFILE_PATH\n
Options:
Name Type Description Default --mib_source_url text Source url to fetch missing MIBS https://raw.githubusercontent.com:443/DataDog/mibs.snmplabs.com/master/asn1/@mib@--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-snmp-validate-mib-filenames","title":"ddev meta snmp validate-mib-filenames","text":"
Validate MIB file names. Frameworks used to load MIB files expect each MIB file name to match the MIB name.
Usage:
ddev meta snmp validate-mib-filenames [OPTIONS] [MIB_FILES]...\n
Options:
Name Type Description Default --interactive, -i boolean Prompt to confirm before renaming all invalid MIB files False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-snmp-validate-profile","title":"ddev meta snmp validate-profile","text":"
Validate SNMP profiles
Usage:
ddev meta snmp validate-profile [OPTIONS]\n
Options:
Name Type Description Default -f, --file text Path to a profile file to validate None -d, --directory text Path to a directory of profiles to validate None -v, --verbose boolean Increase verbosity of error messages False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-windows","title":"ddev meta windows","text":"
Windows utilities
Usage:
ddev meta windows [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-windows-pdh","title":"ddev meta windows pdh","text":"
PDH utilities
Usage:
ddev meta windows pdh [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-meta-windows-pdh-browse","title":"ddev meta windows pdh browse","text":"
Explore performance counters.
You'll need to install pywin32 manually beforehand.
Usage:
ddev meta windows pdh browse [OPTIONS] [COUNTERSET]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release","title":"ddev release","text":"
Manage the release of integrations.
Usage:
ddev release [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-agent","title":"ddev release agent","text":"
A collection of tasks related to the Datadog Agent.
Usage:
ddev release agent [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-agent-changelog","title":"ddev release agent changelog","text":"
Generates a markdown file containing the list of checks that changed for a given Agent release. Agent version numbers are derived by inspecting tags on integrations-core, so running this tool might provide unexpected results if the repo is not up to date with the Agent release process.
If neither --since nor --to is passed (the most common use case), the tool will generate the whole changelog since Agent version 6.3.0 (before that point we don't have enough information to build the log).
Usage:
ddev release agent changelog [OPTIONS]\n
Options:
Name Type Description Default --since text Initial Agent version 6.3.0--to text Final Agent version None --write, -w boolean Write to the changelog file, if omitted contents will be printed to stdout False--force, -f boolean Replace an existing file False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-agent-integrations","title":"ddev release agent integrations","text":"
Generates a markdown file containing the list of integrations shipped in a given Agent release. Agent version numbers are derived by inspecting tags on integrations-core, so running this tool might provide unexpected results if the repo is not up to date with the Agent release process.
If neither --since nor --to is passed (the most common use case), the tool will generate the list for every Agent since version 6.3.0 (before that point we don't have enough information to build the log).
Usage:
ddev release agent integrations [OPTIONS]\n
Options:
Name Type Description Default --since text Initial Agent version 6.3.0--to text Final Agent version None --write, -w boolean Write to file, if omitted contents will be printed to stdout False--force, -f boolean Replace an existing file False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-agent-integrations-changelog","title":"ddev release agent integrations-changelog","text":"
Update integration CHANGELOG.md by adding the Agent version.
Agent version is only added to the integration versions released with a specific Agent release.
Name Type Description Default --since text Initial Agent version 6.3.0--to text Final Agent version None --write, -w boolean Write to the changelog file, if omitted contents will be printed to stdout False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-branch","title":"ddev release branch","text":"
Manage Agent release branches.
Usage:
ddev release branch [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-branch-create","title":"ddev release branch create","text":"
Create a branch for a release of the Agent.
BRANCH_NAME should match this pattern: ^\\d+.\\d+.x$, for example 7.52.x.
This command will also create the backport/<BRANCH_NAME> label in GitHub for this release branch.
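As a rough illustration of the naming rule, the sketch below checks a candidate branch name and derives the matching label. The helper names are hypothetical and not part of ddev, and the regular expression is an assumed equivalent of the documented pattern that uses character classes and treats both dots as literal.

```python
import re

# Assumed equivalent of the documented pattern: digits, a dot, digits, then '.x'.
BRANCH_NAME = re.compile('[0-9]+[.][0-9]+[.]x')


def is_release_branch(name):
    # fullmatch anchors the pattern at both ends of the string.
    return BRANCH_NAME.fullmatch(name) is not None


def backport_label(name):
    # The label follows the documented backport/<BRANCH_NAME> form.
    return 'backport/' + name
```

For example, is_release_branch('7.52.x') holds while is_release_branch('7.52.0') does not.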
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-branch-tag","title":"ddev release branch tag","text":"
Tag the release branch either as release candidate or final release.
Usage:
ddev release branch tag [OPTIONS]\n
Options:
Name Type Description Default --final / --rc boolean Whether we're tagging the final release or a release candidate (rc). False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-build","title":"ddev release build","text":"
Build a wheel for a check as it is on the repo HEAD
Usage:
ddev release build [OPTIONS] CHECK\n
Options:
Name Type Description Default --sdist, -s boolean N/A False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-changelog","title":"ddev release changelog","text":"
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-changelog-fix","title":"ddev release changelog fix","text":"
Fix changelog entries.
This command is only needed if you are manually writing to the changelog, for instance for marketplace and extras integrations. Don't use this in integrations-core because the changelogs there are generated automatically.
The first line of every new changelog entry must include the PR number in which the change occurred. This command will apply this suffix to manually added entries if it is missing.
Usage:
ddev release changelog fix [OPTIONS]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-changelog-new","title":"ddev release changelog new","text":"
This creates new changelog entries in Markdown format.
If the ENTRY_TYPE is not specified, you will be prompted.
The --message option can be used to specify the changelog text. If this is not supplied, an editor will be opened for you to manually write the entry. The changelog text that is opened defaults to the PR title, followed by the most recent commit subject. If that is sufficient, then you may close the editor tab immediately.
By default, changelog entries will be created for all integrations that have changed code. To create entries only for specific targets, you may pass them as additional arguments after the entry type.
Usage:
ddev release changelog new [OPTIONS] [ENTRY_TYPE] [TARGETS]...\n
Options:
Name Type Description Default --message, -m text The changelog text None --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-list","title":"ddev release list","text":"
Show all versions of an integration.
Usage:
ddev release list [OPTIONS] INTEGRATION\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-make","title":"ddev release make","text":"
Perform a set of operations needed to release checks:
update the version in __about__.py
update the changelog
update the requirements-agent-release.txt file
update in-toto metadata
commit the above changes
You can release everything at once by setting the check to all.
If you run into issues signing, ensure you ran gpg --import <YOUR_KEY_ID>.gpg.pub
Usage:
ddev release make [OPTIONS] CHECKS...\n
Options:
Name Type Description Default --version text N/A None --end text N/A None --new boolean Ensure versions are at 1.0.0 False--skip-sign boolean Skip the signing of release metadata False--sign-only boolean Only sign release metadata False--exclude text Comma-separated list of checks to skip None --allow-master boolean Allow ddev to commit directly to master. Forbidden for core. False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-show","title":"ddev release show","text":"
To avoid GitHub's public API rate limits, you need to set github.user/github.token in your config file or use the DD_GITHUB_USER/DD_GITHUB_TOKEN environment variables.
Usage:
ddev release show [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-show-changes","title":"ddev release show changes","text":"
Show all the pending PRs for a given check.
Usage:
ddev release show changes [OPTIONS] CHECK\n
Options:
Name Type Description Default --tag-pattern text The regex pattern for the format of the tag. Required if the tag doesn't follow semver None --tag-prefix text Specify the prefix of the tag to use if the tag doesn't follow semver None --dry-run, -n boolean Run the command in dry-run mode False--since text The git ref to use instead of auto-detecting the tag to view changes since None --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-show-ready","title":"ddev release show ready","text":"
Show all the checks that can be released.
Usage:
ddev release show ready [OPTIONS]\n
Options:
Name Type Description Default --quiet, -q boolean N/A False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-stats","title":"ddev release stats","text":"
A collection of tasks to generate reports about releases.
Usage:
ddev release stats [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-stats-merged-prs","title":"ddev release stats merged-prs","text":"
Prints the PRs merged between the first RC and the current RC/final build
Usage:
ddev release stats merged-prs [OPTIONS]\n
Options:
Name Type Description Default --from-ref, -f text Reference to start stats on (first RC tagged) _required --to-ref, -t text Reference to end stats at (current RC/final tag) _required --release-milestone, -r text Github release milestone _required --exclude-releases, -e boolean Flag to exclude the release PRs from the list False--export-csv text CSV file where the list will be exported None --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-stats-report","title":"ddev release stats report","text":"
Prints some release stats we want to track
Usage:
ddev release stats report [OPTIONS]\n
Options:
Name Type Description Default --from-ref, -f text Reference to start stats on (first RC tagged) _required --to-ref, -t text Reference to end stats at (current RC/final tag) _required --release-milestone, -r text Github release milestone _required --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-tag","title":"ddev release tag","text":"
Tag the HEAD of the git repo with the current release number for a specific check. The tag is pushed to origin by default.
You can tag everything at once by setting the check to all.
Notice: specifying a different version than the one in __about__.py is a maintenance task that should be run under very specific circumstances (e.g. re-align an old release performed on the wrong commit).
Usage:
ddev release tag [OPTIONS] CHECK [VERSION]\n
Options:
Name Type Description Default --push / --no-push boolean N/A True--dry-run, -n boolean N/A False--skip-prerelease boolean N/A False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-release-upload","title":"ddev release upload","text":"
Release a specific check to PyPI as it is on the repo HEAD.
Usage:
ddev release upload [OPTIONS] CHECK\n
Options:
Name Type Description Default --sdist, -s boolean N/A False--dry-run, -n boolean N/A False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-run","title":"ddev run","text":"
Run commands in the proper repo.
Usage:
ddev run [OPTIONS] [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-status","title":"ddev status","text":"
Show information about the current environment.
Usage:
ddev status [OPTIONS]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-test","title":"ddev test","text":"
Run unit and integration tests.
Please see these docs to know how to pass TARGET_SPEC and PYTEST_ARGS:
https://datadoghq.dev/integrations-core/testing/
Usage:
ddev test [OPTIONS] [TARGET_SPEC] [PYTEST_ARGS]...\n
Options:
Name Type Description Default --lint, -s boolean Run only lint & style checks False--fmt, -fs boolean Run only the code formatter False--bench, -b boolean Run only benchmarks False--latest boolean Only verify support of new product versions False--cov, -c boolean Measure code coverage False--compat boolean Check compatibility with the minimum allowed Agent version. Implies --recreate. False--ddtrace boolean Enable tracing during test execution False--memray boolean Measure memory usage during test execution False--recreate, -r boolean Recreate environments from scratch False--list, -l boolean Show available test environments False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate","title":"ddev validate","text":"
Verify certain aspects of the repo.
Usage:
ddev validate [OPTIONS] COMMAND [ARGS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-agent-reqs","title":"ddev validate agent-reqs","text":"
Verify that the check versions are in sync with the requirements-agent-release.txt file.
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate all checks.
Usage:
ddev validate agent-reqs [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-all","title":"ddev validate all","text":"
Run all CI validations for a repo.
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate all checks.
Usage:
ddev validate all [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-ci","title":"ddev validate ci","text":"
Validate CI infrastructure configuration.
Usage:
ddev validate ci [OPTIONS]\n
Options:
Name Type Description Default --sync boolean Update the CI configuration False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-codeowners","title":"ddev validate codeowners","text":"
Validate that every integration has an entry in the CODEOWNERS file.
Usage:
ddev validate codeowners [OPTIONS]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-config","title":"ddev validate config","text":"
Validate default configuration files.
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate all checks.
Usage:
ddev validate config [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --sync, -s boolean Generate example configuration files based on specifications False--verbose, -v boolean Verbose mode False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-dashboards","title":"ddev validate dashboards","text":"
Validate all Dashboard definition files.
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate all checks.
Usage:
ddev validate dashboards [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --fix boolean Attempt to fix errors False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-dep","title":"ddev validate dep","text":"
This command will:
Verify the uniqueness of dependency versions across all checks, or optionally a single check
Verify all the dependencies are pinned.
Verify the embedded Python environment defined in the base check and requirements listed in every integration are compatible.
Verify each check specifies a CHECKS_BASE_REQ variable for the datadog-checks-base requirement
Optionally verify that the datadog-checks-base requirement is lower-bounded
Optionally verify that the datadog-checks-base requirement satisfies a specific version
Usage:
ddev validate dep [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --require-base-check-version boolean Require specific version for datadog-checks-base requirement False--min-base-check-version text Specify minimum version for datadog-checks-base requirement, e.g. 11.0.0 None --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-eula","title":"ddev validate eula","text":"
Validate all EULA definition files.
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate all checks.
Usage:
ddev validate eula [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-http","title":"ddev validate http","text":"
Validate all integrations for usage of HTTP wrapper.
If integrations is specified, only those will be validated. An 'all' check value will validate all checks.
Usage:
ddev validate http [OPTIONS] [INTEGRATIONS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-imports","title":"ddev validate imports","text":"
Validate proper imports in checks.
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate all checks.
Usage:
ddev validate imports [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --autofix boolean Apply suggested fix False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-integration-style","title":"ddev validate integration-style","text":"
Validate that a check follows style guidelines.
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate all checks.
Name Type Description Default --verbose, -v boolean Verbose mode False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-jmx-metrics","title":"ddev validate jmx-metrics","text":"
Validate all default JMX metrics definitions.
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate all checks.
Usage:
ddev validate jmx-metrics [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --verbose, -v boolean Verbose mode False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-labeler","title":"ddev validate labeler","text":"
Validate labeler configuration.
Usage:
ddev validate labeler [OPTIONS]\n
Options:
Name Type Description Default --sync boolean Update the labeler configuration False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-legacy-signature","title":"ddev validate legacy-signature","text":"
Validate that no integration uses the legacy signature.
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate all checks.
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-license-headers","title":"ddev validate license-headers","text":"
Validate license headers in Python code files.
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate all Python files.
Usage:
ddev validate license-headers [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --fix boolean Attempt to fix errors False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-licenses","title":"ddev validate licenses","text":"
Validate third-party license list
Usage:
ddev validate licenses [OPTIONS]\n
Options:
Name Type Description Default --sync, -s boolean Generate the LICENSE-3rdparty.csv file False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-manifest","title":"ddev validate manifest","text":"
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-metadata","title":"ddev validate metadata","text":"
Validate metadata.csv files
If integrations is specified, only those integrations will be validated. An 'all' or empty value will validate all metadata.csv files, and a 'changed' value will validate changed integrations.
Name Type Description Default --check-duplicates boolean Output warnings if there are duplicate short names and descriptions False--show-warnings, -w boolean Show warnings in addition to failures False--sync boolean Update the file False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-models","title":"ddev validate models","text":"
Validate configuration data models.
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate all checks.
Usage:
ddev validate models [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --sync, -s boolean Generate data models based on specifications False--verbose, -v boolean Verbose mode False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-openmetrics","title":"ddev validate openmetrics","text":"
Validate OpenMetrics metric limit.
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate nothing.
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-package","title":"ddev validate package","text":"
Validate all files for Python package metadata.
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate all files.
Usage:
ddev validate package [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-readmes","title":"ddev validate readmes","text":"
Validates README files.
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate all README files.
Usage:
ddev validate readmes [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --format-links, -fl boolean Automatically format links False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-saved-views","title":"ddev validate saved-views","text":"
Validates saved view files
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate all saved view files.
Usage:
ddev validate saved-views [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-service-checks","title":"ddev validate service-checks","text":"
Validate all service_checks.json files.
If check is specified, only that check will be validated. If the check value is 'changed', only changed checks will be validated. An 'all' or empty check value will validate all checks.
Usage:
ddev validate service-checks [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --sync boolean Generate example configuration files based on specifications False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-typos","title":"ddev validate typos","text":"
Validate spelling in the source code.
If check is specified, only that check's directory is validated. This command uses the codespell command line tool to detect spelling errors.
Usage:
ddev validate typos [OPTIONS] [CHECK]\n
Options:
Name Type Description Default --fix boolean Apply suggested fix False--help boolean Show this message and exit. False"},{"location":"ddev/cli/#ddev-validate-version","title":"ddev validate version","text":"
Check that the integration version is defined and makes sense.
It should exist.
In Python packages the CHANGELOG should be automatically generated and match __about__.py.
In new Python packages the CHANGELOG should have no version and __about__.py should have 0.0.1 as the version.
For now the validation is limited to integrations-core. INTEGRATIONS can be one or more integrations or the special value \"all\"
Usage:
ddev validate version [OPTIONS] [INTEGRATIONS]...\n
Options:
Name Type Description Default --help boolean Show this message and exit. False"},{"location":"ddev/configuration/","title":"Configuration","text":"
All configuration can be managed entirely by the ddev config command group. To locate the TOML config file, run:
All CLI commands are aware of the current repository context, defined by the option repo. This option should be a reference to a key in repos which is set to the path of a supported repository. For example, this configuration:
would make it so running e.g. ddev test nginx will look for an integration named nginx in /path/to/integrations-core no matter what directory you are in. If the selected path does not exist, then the current directory will be used.
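The lookup described above can be pictured with a small sketch. This is not ddev's implementation: resolve_repo_path is a hypothetical helper, and the config is assumed to be an already-parsed dict with the repo and repos keys.

```python
import os


def resolve_repo_path(config, current_dir):
    # Hypothetical helper, not ddev's actual implementation.
    # 'repo' names a key in 'repos', whose value is the repository path.
    selected = config.get('repo', '')
    path = config.get('repos', {}).get(selected, '')
    if path and os.path.isdir(path):
        return path
    # Fall back to the current directory when the selected path does not exist.
    return current_dir
```

With repo set to core and repos.core pointing at an existing directory, that directory is returned regardless of where the command is run.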
For running environments with a live Agent, you can select a specific build version to use with the option agent. This option should be a reference to a key in agents which is a mapping of environment types to Agent versions. For example, this configuration:
would make it so environments that define the type as docker will use the Docker image that was built with the latest commit to the datadog-agent repo.
You can switch to using a particular organization with the option org. This option should be a reference to a key in orgs which is a mapping containing data specific to the organization. For example, this configuration:
To avoid GitHub's public API rate limits, you need to set github.user/github.token in your config file or use the DD_GITHUB_USER/DD_GITHUB_TOKEN environment variables.
Run ddev config show to see if your GitHub user and token are set.
If not:
Run ddev config set github.user <YOUR_GITHUB_USERNAME>
Create a personal access token with public_repo and read:org permissions
Run ddev config set github.token then paste the token
Setting dd_check_style to true will enable 2 environments for enforcing our style conventions:
style - This will check the formatting and will error if any issues are found. You may use the -s/--lint flag of ddev test to execute only this environment.
format_style - This will format the code for you, resolving the most common issues caught by the style environment. You can run the formatter by using the -fs/--fmt flag of ddev test.
Our pytest plugin makes a few fixtures available globally for use during tests. Also, it's responsible for managing the control flow of E2E environments.
Most tests will execute checks via the run method of the AgentCheck interface (if the check is stateful).
A consequence of this is that, unlike with the check method, exceptions are not propagated to the caller. As a result, an exception cannot be asserted in a test, and errors are silently ignored.
The dd_run_check fixture takes a check instance and executes it while also propagating any exceptions like normal.
You can use the extract_message option to condense any exception message to just the original message rather than the full traceback.
def test_config(dd_run_check):\n check = AwesomeCheck('awesome', {}, [{'port': 'foo'}])\n\n with pytest.raises(Exception, match='^Option `port` must be an integer$'):\n dd_run_check(check, extract_message=True)\n
The dd_agent_check fixture will run the integration with a given configuration on a live Agent and return a populated aggregator. It accepts a single dict configuration representing either:
a single instance
a full configuration with top level keys instances, init_config, etc.
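To make the two accepted shapes concrete, here is a minimal sketch of how a single instance could be normalized into a full configuration. The normalize_config helper is hypothetical, and telling the shapes apart by the presence of a top level instances key is an assumption rather than documented behavior.

```python
def normalize_config(config):
    # Assumption: a dict with a top-level 'instances' key is a full configuration.
    if 'instances' in config:
        return dict(config)
    # Otherwise treat the dict as a single instance and wrap it.
    return {'init_config': {}, 'instances': [config]}


single = normalize_config({'port': 5432})
full = normalize_config({'init_config': {}, 'instances': [{'port': 5432}]})
```

Both calls end up with the same full configuration.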
Internally, this is a wrapper around ddev env check and you can pass through any supported options or flags.
This fixture can only be used from tests marked as e2e. For example:
Occasionally, you will need to persist some data only known at the time of environment creation (like a generated token) through the test and environment tear down phases.
To do so, use the following fixtures:
dd_save_state - When executing the necessary steps to spin up an environment you may use this to save any object that can be serialized to JSON. For example:
dd_save_state('my_data', {'foo': 'bar'})\n
dd_get_state - This may be used to retrieve the data:
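The round trip can be sketched without a live environment. The StateStore class below is a hypothetical in-memory stand-in for whatever backs the two fixtures, and the default keyword on the getter is an assumption.

```python
import json


class StateStore:
    # Hypothetical in-memory stand-in for the storage behind the two fixtures.
    def __init__(self):
        self._data = {}

    def save(self, key, value):
        # Serialize eagerly so values that are not JSON-serializable fail here.
        self._data[key] = json.dumps(value)

    def get(self, key, default=None):
        if key not in self._data:
            return default
        return json.loads(self._data[key])


store = StateStore()
dd_save_state = store.save
dd_get_state = store.get

# Saved while spinning up the environment...
dd_save_state('my_data', {'foo': 'bar'})
# ...and read back later, for example during tear down.
my_data = dd_get_state('my_data', default={})
```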
The mock_http_response fixture mocks HTTP requests for the lifetime of a test.
The fixture can be used to mock the response of an endpoint. In the following example, we can mock the Prometheus output.
def test(mock_http_response):\n mock_http_response(\n \"\"\"\n # HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.\n # TYPE go_memstats_alloc_bytes gauge\n go_memstats_alloc_bytes 6.396288e+06\n \"\"\"\n )\n ...\n
The fixture dd_environment_runner manages communication between environments and the ddev env command group. You will never use it directly as it runs automatically.
It acts upon a fixture named dd_environment that every integration's test suite will define if E2E testing on a live Agent is desired. This fixture is responsible for starting and stopping environments and must adhere to the following requirements:
It yields a single dict representing the default configuration the Agent will use. It must be either:
a single instance
a full configuration with top level keys instances, init_config, etc.
Additionally, you can pass a second dict containing metadata.
The setup logic must occur before the yield and the tear down logic must occur after it. Also, both steps must only execute based on the value of environment variables.
Setup - only if DDEV_E2E_UP is not set to false
Tear down - only if DDEV_E2E_DOWN is not set to false
Note
The provided Docker and Terraform environment runner utilities will do this automatically for you.
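For a custom environment that does not use those utilities, the gating described above could look roughly like the sketch below. The environment and step_enabled names are hypothetical, and comparing against the exact string false is an assumption about how the variables are read.

```python
import os
from contextlib import contextmanager


def step_enabled(variable):
    # A step runs unless its variable is set to the string 'false'.
    return os.environ.get(variable) != 'false'


@contextmanager
def environment(start, stop, config):
    # Setup happens before the yield, only if DDEV_E2E_UP allows it.
    if step_enabled('DDEV_E2E_UP'):
        start()
    try:
        yield config
    finally:
        # Tear down happens after the yield, only if DDEV_E2E_DOWN allows it.
        if step_enabled('DDEV_E2E_DOWN'):
            stop()
```

In a dd_environment fixture, the body of environment would sit directly in the fixture function, with the yield handing the configuration to the test session.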
env_type - This is the type of interface that will be used to interact with the Agent. Currently, we support docker (default) and local.
env_vars - A dict of environment variables and their values that will be present when starting the Agent.
docker_volumes - A list of str representing Docker volume mounts if env_type is docker e.g. /local/path:/agent/container/path:ro.
docker_platform - The container architecture to use if env_type is docker. Currently, we support linux (default) and windows.
logs_config - A list of configs that will be used by the Logs Agent. You will never need to use this directly, but rather via higher level abstractions.
Most integrations monitor services like databases or web servers, rather than system properties like CPU usage. For such cases, you'll want to spin up an environment and gracefully tear it down when tests finish.
We define all environment actions in a fixture called dd_environment that looks semantically like this:
This is not only used for regular tests, but is also the basis of our E2E testing. The start command executes everything before the yield and the stop command executes everything after it.
We provide a few utilities for common environment types.
The terraform_run utility makes it easy to create services from a directory of Terraform files.
from datadog_checks.dev.terraform import terraform_run\n\n@pytest.fixture(scope='session')\ndef dd_environment():\n with terraform_run(os.path.join(HERE, 'terraform')):\n yield ...\n
Currently, we only use this for services that would be too complex to set up with Docker (like OpenStack) or things that cannot be provided by Docker (like vSphere). We provide some ready-to-use cloud templates that are available for referencing by default. We prefer using GCP when possible.
Terraform E2E tests are not run in our public CI as that would needlessly slow down builds.
The mocker fixture is provided by the pytest-mock plugin. This fixture automatically restores anything that was mocked at the end of each test and is more ergonomic to use than stacking decorators or nesting context managers.
The benchmark fixture is provided by the pytest-benchmark plugin. It enables the profiling of functions with the low-overhead cProfile module.
It is quite useful for seeing the approximate time a given check takes to run, as well as gaining insight into any potential performance bottlenecks. You would use it like this:
def test_large_payload(benchmark, dd_run_check):\n check = AwesomeCheck('awesome', {}, [instance])\n\n # Run once to get any initialization out of the way.\n dd_run_check(check)\n\n benchmark(dd_run_check, check)\n
To add benchmarks, define a bench environment in hatch.toml:
[envs.bench]\n
By default, the test command skips all benchmark environments. To run only benchmark environments, use the --bench/-b flag. The results are sorted by tottime, which is the total time spent in the given function, excluding time spent in calls to sub-functions.
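To see what sorting by tottime means outside of pytest-benchmark, the sketch below profiles a toy workload with the standard library cProfile and pstats modules directly. The function names are made up for illustration, and this is not how the plugin itself is invoked.

```python
import cProfile
import io
import pstats


def inner(n):
    # Most of the time is spent here and in the generator it drives.
    return sum(i * i for i in range(n))


def outer():
    # Little time is spent in outer itself, so its tottime stays small
    # even though its cumulative time covers every call to inner.
    return [inner(1000) for _ in range(50)]


def profile_by_tottime(func):
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    stream = io.StringIO()
    # Sort by time spent in each function, excluding sub-function calls.
    pstats.Stats(profiler, stream=stream).sort_stats('tottime').print_stats()
    return stream.getvalue()


report = profile_by_tottime(outer)
```

The report lists tottime and cumtime side by side, which makes it easy to tell a function that is slow on its own from one that merely calls slow code.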
We provide an easy way to utilize log collection with E2E Docker environments.
Pass mount_logs=True to docker_run. This will use the logs example in the integration's config spec. For example, the following defines 2 example log files:
If mount_logs is a sequence of int, only the selected indices (starting at 1) will be used. So, using the Apache example above, to only monitor the error log you would set it to [2].
In lieu of a config spec, for whatever reason, you may set mount_logs to a dict containing the standard logs key.
All requested log files are available to reference as environment variables for any Docker calls as DD_LOG_<LOG_CONFIG_INDEX> where the indices start at 1.
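The 1-based selection and the variable naming can be sketched as follows. Both helpers are hypothetical, the Apache paths are made-up placeholders, and keeping the original index for a log after filtering is an assumption that this page does not confirm.

```python
def select_log_configs(log_configs, mount_logs):
    # Hypothetical helper mirroring the documented behavior of mount_logs.
    # Keys are 1-based positions in the example logs configuration.
    indexed = dict(enumerate(log_configs, 1))
    if mount_logs is True:
        return indexed
    return {index: config for index, config in indexed.items() if index in mount_logs}


def log_env_var(index):
    # Follows the documented DD_LOG_<LOG_CONFIG_INDEX> naming.
    return 'DD_LOG_{}'.format(index)


# Placeholder example with an access log followed by an error log.
example_logs = [
    {'type': 'file', 'path': '/var/log/apache2/access.log'},
    {'type': 'file', 'path': '/var/log/apache2/error.log'},
]
selected = select_log_configs(example_logs, [2])
env_names = [log_env_var(index) for index in selected]
```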
A convenient context manager for safely setting up and tearing down Docker environments.
Parameters:
compose_file (str):\n A path to a Docker compose file. A custom tear\n down is not required when using this.\nbuild (bool):\n Whether or not to build images for when `compose_file` is provided\nservice_name (str):\n Optional name for when ``compose_file`` is provided\nup (callable):\n A custom setup callable\ndown (callable):\n A custom tear down callable. This is required when using a custom setup.\non_error (callable):\n A callable called in case of an unhandled exception\nsleep (float):\n Number of seconds to wait before yielding. This occurs after all conditions are successful.\nendpoints (list[str]):\n Endpoints to verify access for before yielding. Shorthand for adding\n `CheckEndpoints(endpoints)` to the `conditions` argument.\nlog_patterns (list[str | re.Pattern]):\n Regular expression patterns to find in Docker logs before yielding.\n This is only available when `compose_file` is provided. Shorthand for adding\n `CheckDockerLogs(compose_file, log_patterns, 'all')` to the `conditions` argument.\nmount_logs (bool):\n Whether or not to mount log files in Agent containers based on example logs configuration\nconditions (callable):\n A list of callable objects that will be executed before yielding to check for errors\nenv_vars (dict[str, str]):\n A dictionary to update `os.environ` with during execution\nwrappers (list[callable]):\n A list of context managers to use during execution\nattempts (int):\n Number of attempts to run `up` and the `conditions` successfully. Defaults to 2 in CI\nattempts_wait (int):\n Time to wait between attempts\n
Source code in datadog_checks_dev/datadog_checks/dev/docker.py
@contextmanager\ndef docker_run(\n compose_file=None,\n build=False,\n service_name=None,\n up=None,\n down=None,\n on_error=None,\n sleep=None,\n endpoints=None,\n log_patterns=None,\n mount_logs=False,\n conditions=None,\n env_vars=None,\n wrappers=None,\n attempts=None,\n attempts_wait=1,\n):\n \"\"\"\n A convenient context manager for safely setting up and tearing down Docker environments.\n\n Parameters:\n\n compose_file (str):\n A path to a Docker compose file. A custom tear\n down is not required when using this.\n build (bool):\n Whether or not to build images for when `compose_file` is provided\n service_name (str):\n Optional name for when ``compose_file`` is provided\n up (callable):\n A custom setup callable\n down (callable):\n A custom tear down callable. This is required when using a custom setup.\n on_error (callable):\n A callable called in case of an unhandled exception\n sleep (float):\n Number of seconds to wait before yielding. This occurs after all conditions are successful.\n endpoints (list[str]):\n Endpoints to verify access for before yielding. Shorthand for adding\n `CheckEndpoints(endpoints)` to the `conditions` argument.\n log_patterns (list[str | re.Pattern]):\n Regular expression patterns to find in Docker logs before yielding.\n This is only available when `compose_file` is provided. Shorthand for adding\n `CheckDockerLogs(compose_file, log_patterns, 'all')` to the `conditions` argument.\n mount_logs (bool):\n Whether or not to mount log files in Agent containers based on example logs configuration\n conditions (callable):\n A list of callable objects that will be executed before yielding to check for errors\n env_vars (dict[str, str]):\n A dictionary to update `os.environ` with during execution\n wrappers (list[callable]):\n A list of context managers to use during execution\n attempts (int):\n Number of attempts to run `up` and the `conditions` successfully. 
Defaults to 2 in CI\n attempts_wait (int):\n Time to wait between attempts\n \"\"\"\n if compose_file and up:\n raise TypeError('You must select either a compose file or a custom setup callable, not both.')\n\n if compose_file is not None:\n if not isinstance(compose_file, str):\n raise TypeError('The path to the compose file is not a string: {}'.format(repr(compose_file)))\n\n set_up = ComposeFileUp(compose_file, build=build, service_name=service_name)\n if down is not None:\n tear_down = down\n else:\n tear_down = ComposeFileDown(compose_file)\n if on_error is None:\n on_error = ComposeFileLogs(compose_file)\n else:\n set_up = up\n tear_down = down\n\n docker_conditions = []\n\n if log_patterns is not None:\n if compose_file is None:\n raise ValueError(\n 'The `log_patterns` convenience is unavailable when using '\n 'a custom setup. Please use a custom condition instead.'\n )\n docker_conditions.append(CheckDockerLogs(compose_file, log_patterns, 'all'))\n\n if conditions is not None:\n docker_conditions.extend(conditions)\n\n wrappers = list(wrappers) if wrappers is not None else []\n\n if mount_logs:\n if isinstance(mount_logs, dict):\n wrappers.append(shared_logs(mount_logs['logs']))\n # Easy mode, read example config\n else:\n # An extra level deep because of the context manager\n check_root = find_check_root(depth=2)\n\n example_log_configs = _read_example_logs_config(check_root)\n if mount_logs is True:\n wrappers.append(shared_logs(example_log_configs))\n elif isinstance(mount_logs, (list, set)):\n wrappers.append(shared_logs(example_log_configs, mount_whitelist=mount_logs))\n else:\n raise TypeError(\n 'mount_logs: expected True, a list or a set, but got {}'.format(type(mount_logs).__name__)\n )\n\n with environment_run(\n up=set_up,\n down=tear_down,\n on_error=on_error,\n sleep=sleep,\n endpoints=endpoints,\n conditions=docker_conditions,\n env_vars=env_vars,\n wrappers=wrappers,\n attempts=attempts,\n attempts_wait=attempts_wait,\n ) as result:\n yield 
result\n
Determine the hostname Docker uses based on the environment, defaulting to localhost.
Source code in datadog_checks_dev/datadog_checks/dev/docker.py
def get_docker_hostname():\n \"\"\"\n Determine the hostname Docker uses based on the environment, defaulting to `localhost`.\n \"\"\"\n return urlparse(os.getenv('DOCKER_HOST', '')).hostname or 'localhost'\n
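To see how the fallback behaves, here is a small sketch of the same expression that reads from a plain dictionary instead of os.environ, so it can be tried without touching the real environment. The function name docker_hostname and the sample DOCKER_HOST values are assumptions made for illustration.

```python
from urllib.parse import urlparse


def docker_hostname(environ):
    """Same expression as get_docker_hostname, but reading from a mapping."""
    return urlparse(environ.get('DOCKER_HOST', '')).hostname or 'localhost'


# A TCP endpoint carries a hostname, so that hostname is returned.
print(docker_hostname({'DOCKER_HOST': 'tcp://192.168.99.100:2376'}))

# A Unix socket URL has no network location, so the default applies.
print(docker_hostname({'DOCKER_HOST': 'unix:///var/run/docker.sock'}))

# An unset variable also falls back to the default.
print(docker_hostname({}))
```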
Get a Docker container's IP address from its ID or name.
Source code in datadog_checks_dev/datadog_checks/dev/docker.py
def get_container_ip(container_id_or_name):\n \"\"\"\n Get a Docker container's IP address from its ID or name.\n \"\"\"\n command = [\n 'docker',\n 'inspect',\n '-f',\n '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}',\n container_id_or_name,\n ]\n\n return run_command(command, capture='out', check=True).stdout.strip()\n
Returns a bool indicating whether or not a compose file has any active services.
Source code in datadog_checks_dev/datadog_checks/dev/docker.py
def compose_file_active(compose_file):\n \"\"\"\n Returns a `bool` indicating whether or not a compose file has any active services.\n \"\"\"\n command = ['docker', 'compose', '-f', compose_file, 'ps']\n lines = run_command(command, capture='out', check=True).stdout.strip().splitlines()\n\n return len(lines) > 1\n
A convenient context manager for safely setting up and tearing down Terraform environments.
Parameters:
directory (str):\n A path containing Terraform files\nsleep (float):\n Number of seconds to wait before yielding. This occurs after all conditions are successful.\nendpoints (list[str]):\n Endpoints to verify access for before yielding. Shorthand for adding\n `CheckEndpoints(endpoints)` to the `conditions` argument.\nconditions (list[callable]):\n A list of callable objects that will be executed before yielding to check for errors\nenv_vars (dict[str, str]):\n A dictionary to update `os.environ` with during execution\nwrappers (list[callable]):\n A list of context managers to use during execution\n
Source code in datadog_checks_dev/datadog_checks/dev/terraform.py
@contextmanager\ndef terraform_run(directory, sleep=None, endpoints=None, conditions=None, env_vars=None, wrappers=None):\n \"\"\"\n A convenient context manager for safely setting up and tearing down Terraform environments.\n\n Parameters:\n\n directory (str):\n A path containing Terraform files\n sleep (float):\n Number of seconds to wait before yielding. This occurs after all conditions are successful.\n endpoints (list[str]):\n Endpoints to verify access for before yielding. Shorthand for adding\n `CheckEndpoints(endpoints)` to the `conditions` argument.\n conditions (list[callable]):\n A list of callable objects that will be executed before yielding to check for errors\n env_vars (dict[str, str]):\n A dictionary to update `os.environ` with during execution\n wrappers (list[callable]):\n A list of context managers to use during execution\n \"\"\"\n if not shutil.which('terraform'):\n pytest.skip('Terraform not available')\n\n set_up = TerraformUp(directory)\n tear_down = TerraformDown(directory)\n\n with environment_run(\n up=set_up,\n down=tear_down,\n sleep=sleep,\n endpoints=endpoints,\n conditions=conditions,\n env_vars=env_vars,\n wrappers=wrappers,\n ) as result:\n yield result\n
This is not meant to be an exhaustive list of all the things we use, but rather a token of appreciation for the services and open source software we publicly benefit from.
The Python programming language, the default language of Agent Integrations, enables us and contributors to think about problems abstractly and express intent as clearly and concisely as possible.
A huge thanks to everyone involved in maintaining PyPI. We rely on it for providing all dependencies for not only tests, but also all Datadog Agent deployments.
Azure Pipelines is used for testing all Agent Integrations. A special shout-out to Microsoft for being extremely generous with our allowance of parallel runners; only they were able to meet the requirements of our unique monorepo.
GitHub Actions is used for all repository automation, like documentation deployment and pull request labeling.
"},{"location":"faq/faq/","title":"FAQ","text":""},{"location":"faq/faq/#integration-vs-check","title":"Integration vs Check","text":"
A Check is any integration whose execution is triggered directly in code by the Datadog Agent. Therefore, all Agent-based integrations written in Python or Go are considered Checks.
"},{"location":"faq/faq/#why-test-tests","title":"Why test tests","text":"
We track coverage of test code in all cases because a drop in coverage for test code means that a test function, or part of one, is never called. For an example see this test bug fixed thanks to test coverage. See pyca/pynacl#290 and #4280 for more details.
Often, libraries that interact with a product will name their packages after the product. So if you name a file <PRODUCT_NAME>.py, and inside try to import the library of the same name, you will get import errors that will be difficult to diagnose.
Never name a Python file the same as the integration's name.
The base classes may freely add new attributes for new features. Therefore to avoid collisions it is recommended that attribute names be prefixed with underscores, especially for names that are generic. For an example, see below.
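A minimal sketch of the collision this guideline avoids. BaseCheck is a hypothetical stand-in rather than the real base class, and the timeout attribute is an invented example of a feature added in a later release.

```python
class BaseCheck:
    """Hypothetical stand-in for a base class shipped by the framework."""

    def __init__(self, instance):
        self.instance = instance
        # Imagine a later release of the base class introduces this attribute.
        self.timeout = 10

    def request_timeout(self):
        return self.timeout


class CollidingCheck(BaseCheck):
    def __init__(self, instance):
        super().__init__(instance)
        # Generic public name: silently overwrites the base class attribute.
        self.timeout = instance.get('query_timeout', 'never')


class SafeCheck(BaseCheck):
    def __init__(self, instance):
        super().__init__(instance)
        # Underscore prefix keeps the integration's state out of the way.
        self._timeout = instance.get('query_timeout', 'never')


print(CollidingCheck({}).request_timeout())
print(SafeCheck({}).request_timeout())
```

In the colliding variant the base class method starts returning the integration's value, which is the kind of breakage that only shows up after a base package upgrade.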
Since Agent v6, every instance of AgentCheck corresponds to a single YAML instance of an integration defined in the instances array of user configuration. As such, the instance argument the check method accepts is redundant and wasteful since you are parsing the same configuration at every run.
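One way to act on this is to parse the configuration once in the constructor and ignore the argument passed to check. The sketch below uses a hypothetical stand-in class that mimics only the constructor shape, so it runs without the Agent; the server and port options are likewise invented for illustration.

```python
class AgentCheckStandIn:
    """Hypothetical stand-in that mimics only the constructor shape."""

    def __init__(self, name, init_config, instances):
        self.name = name
        self.init_config = init_config
        # One check object corresponds to exactly one configured instance.
        self.instance = instances[0]


class AwesomeCheck(AgentCheckStandIn):
    def __init__(self, name, init_config, instances):
        super().__init__(name, init_config, instances)
        # Parse and validate the configuration a single time.
        self._server = self.instance.get('server', 'localhost')
        self._port = int(self.instance.get('port', 8080))

    def check(self, _):
        # The argument is ignored; the parsed values are reused on every run.
        return '{}:{}'.format(self._server, self._port)


check = AwesomeCheck('awesome', {}, [{'server': 'db', 'port': '5432'}])
print(check.check(None))
```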
If you would like to create a default dashboard for an integration, follow the guidelines in the Best Practices section.
"},{"location":"guidelines/dashboards/#exporting-a-dashboard-payload","title":"Exporting a dashboard payload","text":"
When you've created a dashboard in the Datadog UI, you can export the dashboard payload to be included in its integration's assets directory.
Ensure that you have set an api_key and app_key for the org that contains the new dashboard in the ddev configuration.
Run the following command to export the dashboard:
ddev meta dash export <URL_OF_DASHBOARD> <INTEGRATION>\n
Tip
If the dashboard is for a contributor-maintained integration in the integration-extras repo, run ddev --extras meta ... instead of ddev meta ....
The command will add the dashboard definition to the manifest.json file of the integration. The dashboard JSON payload will be available in /assets/dashboards/<DASHBOARD_TITLE>.json.
Tip
The dashboard is available at /dash/integration/<DASHBOARD_KEY> in each region, where <DASHBOARD_KEY> is the key assigned to this dashboard in the integration's manifest.json file. This is useful when you want to link to another dashboard from inside your dashboard.
Commit the changes and create a pull request.
"},{"location":"guidelines/dashboards/#verify-the-preset-dashboard","title":"Verify the Preset Dashboard","text":"
Once your PR is merged and synced on production, you can find your dashboard in the Dashboard List page.
Tip
Make sure the integration tile is Installed in order to see the preset dashboard in the list.
Ensure logos render correctly on the Dashboard List page and within the preset dashboard.
"},{"location":"guidelines/dashboards/#best-practices","title":"Best Practices","text":""},{"location":"guidelines/dashboards/#why-are-dashboard-best-practices-useful","title":"Why are dashboard best practices useful?","text":"
A dashboard that follows best practices helps users consume data quickly. Best practices reduce friction when figuring out where to search for specific information or how to interpret data and find meaning. Additionally, guidelines give dashboard makers a starting point when creating a new dashboard.
Attention-grabbing \"about\" section with a banner image, concise copy, useful links, and a good typography hierarchy
A brief, annotated \"overview\" section with the most important data, right at the top
Simple graph titles and title-case group names
Nearly symmetrical in high density mode
Well formatted, concise notes explaining the value or purpose of data in each group. Try the presets \"caption\", \"annotation\", or \"header\", or pick your own combination of styles. Avoid using the smallest font size for notes that are long or include complex formatting, like bulleted lists or code blocks.
All widgets are placed within a group based on thematic organization, rather than directly on the background of the dashboard
Query value widgets have a timeseries background (e.g. \"Bars\") instead of being blank
Visualizations with obvious thresholds or zones use semantic formatting for graphs or custom red/green/yellow text formatting for query values.
Color coordination between group headers, notes within groups, and graphs within groups (e.g. all group headers or note widgets the same color). If you've applied a vivid green to all group headers, try making its notes light green.
Legends for each graph. Legends make it easy to read a graph without having to hover over each series or maximize the widget. Make sure you use aliases so the legend is easy to read. Automatic mode for legends is a great option that hides legends when space is tight and shows them when there's room.
Adjacent graphs have aligned x-axes. If one graph is showing a legend and the other isn't, the x-axes won't align\u2014make sure they either both show a legend or both do not.
For timeseries, base the display type on the type of metric.
Type of metric | Display type
Volume (e.g. number of connections) | area
Counts (e.g. number of errors) | bars
Multiple groups or default | lines
"},{"location":"guidelines/dashboards/#creating-a-new-dashboard","title":"Creating a New Dashboard","text":"
After selecting New Dashboard, you will have the option to choose from: Dashboard, Screenboard, and Timeboard. Dashboard is recommended.
Add a logo to the dashboard header. The integration logo will automatically appear in the header if the icon exists here and the integration_id matches the icon name. That means it will only appear when the dashboard you're working on is made into the official integration board.
Include the integration name in the dashboard title. (e.g. \"Elasticsearch Overview Dashboard\").
Warning
Avoid using - (hyphen) in the dashboard title as the dashboard URL is generated from the title.
"},{"location":"guidelines/dashboards/#standard-groups-to-include","title":"Standard Groups to Include","text":"
Always include an About group for the integration containing a brief description and helpful links. Edit the About group and select the \"banner\" display option (with the \"Show Title\" option unchecked), then link to a banner image like this: /static/images/integration_dashboard/your-image.png. For instructions on how to create and upload a banner image, go to the DRUIDS logo gallery, click the relevant logo, and click the Dashboard Banner tab. The About section should contain content, not data; avoid making the About section full-width. Consider copying the content in the About section into the hovercard that appears when hovering over the dashboard title.
Also include an Overview group at the top of the dashboard. It should contain service checks (e.g. liveness or readiness checks), a few of the most important metrics, and a monitor summary if you have pre-existing monitors for this integration. The Overview section should contain data.
If log collection is enabled, make a Logs group. Insert a timeseries widget showing a bar graph of logs by status over time. Also include a log stream of logs with the \"Error\" or \"Critical\" status.
Tip
Consider turning groups into powerpacks if they appear repeatedly in dashboards irrespective of the integration type, so that you can insert the entire group with the correct formatting with a few clicks rather than adding the same widgets from scratch each time.\n
Research the metrics supported by the integration and consider grouping them in relevant categories. Groups containing prioritized metrics that are key to the performance and overview of the integration should be closer to the top. Some considerations when deciding which widgets should be grouped together:
Go from macro to micro levels within the system (e.g. for a database integration's dashboard, you could group node metrics in one group, index metrics in the next group, shard metrics in the third group)
Go from upstream to downstream sections within the system (e.g. for a data streams integration's dashboard, you could group producer metrics in one group, broker metrics in the next group, and consumer metrics in the third group)
Group together metrics that lead to the same actionable insights (e.g. all indexing metrics that reveal which indexes/shards should be optimized could all go in one group, while resource utilization metrics like disk space or memory usage that inform allocation and redistribution decisions should be grouped together in a separate group).
Template variables allow you to dynamically filter one or more widgets in a dashboard. Template variables must be universal and accessible by any user or account using the monitored service. Make sure all relevant graphs are listening to the relevant template variable filters. Template variables should be customized based on the type of technology.
Type of integration technology | Typical template variable
Database | Shards
Data Streaming | Consumer
ML Model Serving | Model
Tip
Adding *=scope as a template variable is useful since users can access all their own tags.
Prioritize concise graph titles that start with the most important information. Avoid common phrases such as \"number of\", and don't include the integration title (e.g. \"Memcached Load\").
Concise title (good) | Verbose title (bad)
Events per node | Number of Kubernetes events per node
Pending tasks: [$node_name] | Total number of pending tasks in [$node_name]
Read/write operations | Number of read/write operations
Connections to server - rate | Rate of connections to server
Load | Memcached Load
Avoid repeating the group title or integration name in every widget in a group, especially if the widgets are query values with a custom unit of the same name. Note the word \"shards\" in each widget title in the group named \"shards\".
Always alias formulas
Group titles should be title case. Widget titles should be sentence case.
If you're showing a legend, make sure the aliases are easy to understand.
Graph titles should summarize the queried metric. Do not indicate the unit in the graph title because unit types are displayed automatically from metadata. An exception to this is if the calculation of the query represents a different type of unit.
Which widgets best represent your data? Try using a mix of widget types and sizes. Explore visualizations and formatting options until you're confident your dashboard is as clear as it can be. Sometimes a whole dashboard of timeseries is ok, but other times variety can improve things. The most commonly used metric widgets are timeseries, query values, and tables. For more information on the available widget types, see the list of supported dashboard widgets.
Try to make the left and right halves of your dashboard symmetrical in high density mode. Users with large monitors will see your dashboard in high density mode by default, so it's important to make sure the group relationships make sense, and the dashboard looks good. You can adjust group heights to achieve this, and move groups between the left and right halves.
a. (perfectly symmetrical)
b. (close enough)
Timeseries widgets should be at least 4 columns wide in order not to appear squashed on smaller displays.
Stream widgets should be at least 6 columns wide (half the dashboard width) for readability. You should place them at the end of a dashboard so they don't \"trap\" scrolling. It's useful to put stream widgets in a group by themselves so they can be collapsed. Add an event stream only if the service monitored by the dashboard is reporting events. Use sources:service_name.
Always check a dashboard at 1280px wide and 2560px wide to see how it looks on a smaller laptop and a larger monitor. The most common screen widths for dashboards are 1920, 1680, 1440, 2560, and 1280px, making up more than half of all dashboard page views combined.
Tip
If your monitor isn't large enough for high density mode, use the browser zoom controls to zoom out.
"},{"location":"guidelines/pr/","title":"Pull requests","text":""},{"location":"guidelines/pr/#separation-of-concerns","title":"Separation of concerns","text":"
Every pull request should do one thing only for easier Git management. For example, if you are editing documentation and notice an error in the shipped example configuration, fix the error in a separate pull request. Doing so enables a clean cherry-pick or revert of the bug fix should the need arise.
Different guidelines apply depending on which repo you are contributing to.
integrations-extras and marketplace | integrations-core
Every PR must add a changelog entry to each integration that has had its shipped code modified.
Each integration that can be installed on the Agent has its own CHANGELOG.md file at the root of its directory. Entries accumulate under the Unreleased section and at release time get put under their own section. For example:
# CHANGELOG - Foo\n\n## Unreleased\n\n***Changed***:\n\n* Made a breaking change ([#9000](https://github.com/DataDog/repo/pull/9000))\n\n Here's some extra context [...]\n\n***Added***:\n\n* Add a cool feature ([#42](https://github.com/DataDog/repo/pull/42))\n\n## 1.2.3 / 2081-04-01\n\n***Fixed***:\n\n...\n
For changelog types, we adhere to those defined by Keep a Changelog:
Added for new features or any non-trivial refactors.
Changed for changes in existing functionality.
Deprecated for soon-to-be removed features.
Removed for now removed features.
Fixed for any bug fixes.
Security in case of vulnerabilities.
The first line of every new changelog entry must end with a link to the PR in which the change occurred. To automatically apply this suffix to manually added entries, you may run the release changelog fix command. To create new entries, you may use the release changelog new command.
Tip
You may apply the changelog/no-changelog label to remove the CI check for changelog entries.
Formatting rules
If you are contributing to integrations-core, all you need to do is use the release changelog new command. It adds files to the changelog.d folder inside each integration that you have modified. Commit these files and push them to your PR.
If you decide that you do not need a changelog because the change you made won't be shipped with the Agent, add the changelog/no-changelog label to the PR.
The header for an integration version should be in the following format: version number / YYYY-MM-DD / Agent Version Number. The Agent version number is not necessary, but a valid version number and date are required. The first header after the file's title can be Unreleased. The content under this section is the same as any other.
Version is formatted incorrectly on line {line number}: The version you entered is not a valid version, or there is no / separator between the version and date in your header.
Date is formatted incorrectly on line {line number}: The date must be formatted as YYYY-MM-DD, with no spaces in between.
The changelog header must be capitalized and written in this format: ***HEADER***:. Note that it should be bold and italicized.
Changelog type is incorrect on line {line count}: The changelog header on that line is not one of the six valid changelog types.
Changelog header order is incorrect on line {line count}: The changelog header on that line is in the wrong order. Double check the ordering of the changelogs and ensure that the headers for the changelog types are correctly ordered by priority.
Changelogs should start with asterisks, on line {line count}: All changelog details below each header should be bullet points, using asterisks.
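The header rules above can be approximated with a short validator. This is a simplified sketch, not the actual checker: it assumes second-level headers as in the example changelog, accepts only plain X.Y.Z version numbers, does not verify header ordering, and returns short error labels of its own invention.

```python
import re
from datetime import datetime

VERSION = re.compile(r'^\d+\.\d+\.\d+$')  # simplification: plain X.Y.Z only
DATE = re.compile(r'^\d{4}-\d{2}-\d{2}$')
TYPE_HEADER = re.compile(r'^\*\*\*(\w+)\*\*\*:$')
CHANGELOG_TYPES = {'Added', 'Changed', 'Deprecated', 'Removed', 'Fixed', 'Security'}


def check_line(line):
    """Return a short error label for a header line, or None if it looks valid."""
    if line.startswith('## '):
        title = line[3:]
        if title == 'Unreleased':
            return None
        # Expected shape: version / YYYY-MM-DD, optionally followed by / Agent version.
        parts = title.split(' / ')
        if len(parts) < 2 or not VERSION.match(parts[0]):
            return 'version'
        if not DATE.match(parts[1]):
            return 'date'
        try:
            datetime.strptime(parts[1], '%Y-%m-%d')
        except ValueError:
            return 'date'
        return None
    if line.startswith('***'):
        match = TYPE_HEADER.match(line)
        if match is None or match.group(1) not in CHANGELOG_TYPES:
            return 'type'
    return None


for sample in ('## 1.2.3 / 2081-04-01', '## 1.2.3 2081-04-01', '***Improved***:'):
    print(sample, '->', check_line(sample))
```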
A tool to sort imports lexicographically, by section, and by type. We use the 5 standard sections: __future__, stdlib, third party, first party, and local.
datadog_checks is configured as a first party namespace.
An easy-to-use wrapper around pycodestyle and pyflakes. We select everything it provides and only ignore a few things to give precedence to other tools.
A flake8 plugin for finding likely bugs and design problems in programs. We enable:
B001: Do not use bare except:, it also catches unexpected events like memory errors, interrupts, system exit, and so on. Prefer except Exception:.
B003: Assigning to os.environ doesn't clear the environment. Subprocesses are going to see outdated variables, in disagreement with the current process. Use os.environ.clear() or the env= argument to Popen.
B006: Do not use mutable data structures for argument defaults. All calls reuse one instance of that data structure, persisting changes between them.
B007: Loop control variable not used within the loop body. If this is intended, start the name with an underscore.
B301: Python 3 does not include .iter* methods on dictionaries. The default behavior is to return iterables. Simply remove the iter prefix from the method. For Python 2 compatibility, also prefer the Python 3 equivalent if you expect the size of the dict to be small and bounded. The performance regression on Python 2 will be negligible and the code is going to be the clearest. Alternatively, use six.iter*.
B305: .next() is not a thing on Python 3. Use the next() builtin. For Python 2 compatibility, use six.next().
B306: BaseException.message has been deprecated as of Python 2.6 and is removed in Python 3. Use str(e) to access the user-readable message. Use e.args to access arguments passed to the exception.
B902: Invalid first argument used for method. Use self for instance methods, and cls for class methods.
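Of the rules listed above, B006 is the one that most often hides a real bug, so here is a small runnable illustration. The function names are made up for the example.

```python
def append_bad(item, bucket=[]):  # B006: the default list is created only once
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    # A fresh list is created on every call that omits the argument.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first_bad = append_bad(1)
second_bad = append_bad(2)
# Both names refer to the same shared default list.
print(first_bad is second_bad, second_bad)

first_good = append_good(1)
second_good = append_good(2)
# Each call received its own list.
print(first_good, second_good)
```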
A comment-based type checker allowing a mix of dynamic and static typing. This is optional for now. In order to enable mypy for a specific integration, open its hatch.toml file and add the lines in the correct section:
The mypy-args setting defines the mypy command-line options for this specific integration. --py2 makes sure the integration is compatible with Python 2.7. Here are some useful flags you can add:
--check-untyped-defs: Type-checks the interior of functions without type annotations.
--disallow-untyped-defs: Disallows defining functions without type annotations or with incomplete type annotations.
The datadog_checks/ tests/ arguments represent the list of files that mypy should type check. Feel free to edit them as desired, including removing tests/ (if you'd prefer to not type-check the test suite), or targeting specific files (when doing partial type checking).
Note that there is a default configuration in the mypy.ini file.
Prometheus is an open source monitoring system for timeseries metric data. Many Datadog integrations collect metrics based on Prometheus exported data sets.
Prometheus-based integrations use the OpenMetrics exposition format to collect metrics.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/base_check.py
class OpenMetricsBaseCheck(OpenMetricsScraperMixin, AgentCheck):\n \"\"\"\n OpenMetricsBaseCheck is a class that helps scrape endpoints that emit Prometheus metrics only\n with YAML configurations.\n\n Minimal example configuration:\n\n instances:\n - prometheus_url: http://example.com/endpoint\n namespace: \"foobar\"\n metrics:\n - bar\n - foo\n\n Agent 6 signature:\n\n OpenMetricsBaseCheck(name, init_config, instances, default_instances=None, default_namespace=None)\n\n \"\"\"\n\n DEFAULT_METRIC_LIMIT = 2000\n\n HTTP_CONFIG_REMAPPER = {\n 'ssl_verify': {'name': 'tls_verify'},\n 'ssl_cert': {'name': 'tls_cert'},\n 'ssl_private_key': {'name': 'tls_private_key'},\n 'ssl_ca_cert': {'name': 'tls_ca_cert'},\n 'prometheus_timeout': {'name': 'timeout'},\n 'request_size': {'name': 'request_size', 'default': 10},\n }\n\n # Allow tracing for openmetrics integrations\n def __init_subclass__(cls, **kwargs):\n super().__init_subclass__(**kwargs)\n return traced_class(cls)\n\n def __init__(self, *args, **kwargs):\n \"\"\"\n The base class for any Prometheus-based integration.\n \"\"\"\n args = list(args)\n default_instances = kwargs.pop('default_instances', None) or {}\n default_namespace = kwargs.pop('default_namespace', None)\n\n legacy_kwargs_in_args = args[4:]\n del args[4:]\n\n if len(legacy_kwargs_in_args) > 0:\n default_instances = legacy_kwargs_in_args[0] or {}\n if len(legacy_kwargs_in_args) > 1:\n default_namespace = legacy_kwargs_in_args[1]\n\n super(OpenMetricsBaseCheck, self).__init__(*args, **kwargs)\n self.config_map = {}\n self._http_handlers = {}\n self.default_instances = default_instances\n self.default_namespace = default_namespace\n\n # pre-generate the scraper configurations\n\n if 'instances' in kwargs:\n instances = kwargs['instances']\n elif len(args) == 4:\n # instances from agent 5 signature\n instances = args[3]\n elif isinstance(args[2], (tuple, list)):\n # instances from agent 6 signature\n instances = args[2]\n else:\n instances = None\n\n if 
instances is not None:\n for instance in instances:\n possible_urls = instance.get('possible_prometheus_urls')\n if possible_urls is not None:\n for url in possible_urls:\n try:\n new_instance = deepcopy(instance)\n new_instance.update({'prometheus_url': url})\n scraper_config = self.get_scraper_config(new_instance)\n response = self.send_request(url, scraper_config)\n response.raise_for_status()\n instance['prometheus_url'] = url\n self.get_scraper_config(instance)\n break\n except (IOError, requests.HTTPError, requests.exceptions.SSLError) as e:\n self.log.info(\"Couldn't connect to %s: %s, trying next possible URL.\", url, str(e))\n else:\n raise CheckException(\n \"The agent could not connect to any of the following URLs: %s.\" % possible_urls\n )\n else:\n self.get_scraper_config(instance)\n\n def check(self, instance):\n # Get the configuration for this specific instance\n scraper_config = self.get_scraper_config(instance)\n\n # We should be specifying metrics for checks that are vanilla OpenMetricsBaseCheck-based\n if not scraper_config['metrics_mapper']:\n raise CheckException(\n \"You have to collect at least one metric from the endpoint: {}\".format(scraper_config['prometheus_url'])\n )\n\n self.process(scraper_config)\n\n def get_scraper_config(self, instance):\n \"\"\"\n Validates the instance configuration and creates a scraper configuration for a new instance.\n If the endpoint already has a corresponding configuration, return the cached configuration.\n \"\"\"\n endpoint = instance.get('prometheus_url')\n\n if endpoint is None:\n raise CheckException(\"Unable to find prometheus URL in config file.\")\n\n # If we've already created the corresponding scraper configuration, return it\n if endpoint in self.config_map:\n return self.config_map[endpoint]\n\n # Otherwise, we create the scraper configuration\n config = self.create_scraper_configuration(instance)\n\n # Add this configuration to the config_map\n self.config_map[endpoint] = config\n\n return 
config\n\n def _finalize_tags_to_submit(self, _tags, metric_name, val, metric, custom_tags=None, hostname=None):\n \"\"\"\n Format the finalized tags\n This is generally a noop, but it can be used to change the tags before sending metrics\n \"\"\"\n return _tags\n\n def _filter_metric(self, metric, scraper_config):\n \"\"\"\n Used to filter metrics at the beginning of the processing, by default no metric is filtered\n \"\"\"\n return False\n
The base class for any Prometheus-based integration.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/base_check.py
def __init__(self, *args, **kwargs):\n \"\"\"\n The base class for any Prometheus-based integration.\n \"\"\"\n args = list(args)\n default_instances = kwargs.pop('default_instances', None) or {}\n default_namespace = kwargs.pop('default_namespace', None)\n\n legacy_kwargs_in_args = args[4:]\n del args[4:]\n\n if len(legacy_kwargs_in_args) > 0:\n default_instances = legacy_kwargs_in_args[0] or {}\n if len(legacy_kwargs_in_args) > 1:\n default_namespace = legacy_kwargs_in_args[1]\n\n super(OpenMetricsBaseCheck, self).__init__(*args, **kwargs)\n self.config_map = {}\n self._http_handlers = {}\n self.default_instances = default_instances\n self.default_namespace = default_namespace\n\n # pre-generate the scraper configurations\n\n if 'instances' in kwargs:\n instances = kwargs['instances']\n elif len(args) == 4:\n # instances from agent 5 signature\n instances = args[3]\n elif isinstance(args[2], (tuple, list)):\n # instances from agent 6 signature\n instances = args[2]\n else:\n instances = None\n\n if instances is not None:\n for instance in instances:\n possible_urls = instance.get('possible_prometheus_urls')\n if possible_urls is not None:\n for url in possible_urls:\n try:\n new_instance = deepcopy(instance)\n new_instance.update({'prometheus_url': url})\n scraper_config = self.get_scraper_config(new_instance)\n response = self.send_request(url, scraper_config)\n response.raise_for_status()\n instance['prometheus_url'] = url\n self.get_scraper_config(instance)\n break\n except (IOError, requests.HTTPError, requests.exceptions.SSLError) as e:\n self.log.info(\"Couldn't connect to %s: %s, trying next possible URL.\", url, str(e))\n else:\n raise CheckException(\n \"The agent could not connect to any of the following URLs: %s.\" % possible_urls\n )\n else:\n self.get_scraper_config(instance)\n
"},{"location":"legacy/prometheus/#datadog_checks.base.checks.openmetrics.base_check.OpenMetricsBaseCheck.check","title":"check(instance)","text":"Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/base_check.py
def check(self, instance):\n # Get the configuration for this specific instance\n scraper_config = self.get_scraper_config(instance)\n\n # We should be specifying metrics for checks that are vanilla OpenMetricsBaseCheck-based\n if not scraper_config['metrics_mapper']:\n raise CheckException(\n \"You have to collect at least one metric from the endpoint: {}\".format(scraper_config['prometheus_url'])\n )\n\n self.process(scraper_config)\n
Validates the instance configuration and creates a scraper configuration for a new instance. If the endpoint already has a corresponding configuration, returns the cached configuration.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/base_check.py
def get_scraper_config(self, instance):\n \"\"\"\n Validates the instance configuration and creates a scraper configuration for a new instance.\n If the endpoint already has a corresponding configuration, return the cached configuration.\n \"\"\"\n endpoint = instance.get('prometheus_url')\n\n if endpoint is None:\n raise CheckException(\"Unable to find prometheus URL in config file.\")\n\n # If we've already created the corresponding scraper configuration, return it\n if endpoint in self.config_map:\n return self.config_map[endpoint]\n\n # Otherwise, we create the scraper configuration\n config = self.create_scraper_configuration(instance)\n\n # Add this configuration to the config_map\n self.config_map[endpoint] = config\n\n return config\n
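The caching behavior of `get_scraper_config` can be sketched in isolation. The example below assumes a hypothetical `ScraperConfigCache` helper with a caller-supplied `factory` in place of `create_scraper_configuration`; neither name exists in `datadog_checks_base`, they only illustrate the lookup keyed by `prometheus_url`.

```python
class ScraperConfigCache(object):
    # Hypothetical helper that mirrors the lookup logic shown above.
    def __init__(self, factory):
        self._factory = factory  # stand-in for create_scraper_configuration
        self._configs = {}       # plays the role of config_map

    def get(self, instance):
        endpoint = instance.get('prometheus_url')
        if endpoint is None:
            raise ValueError('prometheus_url is required')
        # Build the configuration only the first time an endpoint is seen.
        if endpoint not in self._configs:
            self._configs[endpoint] = self._factory(instance)
        return self._configs[endpoint]


cache = ScraperConfigCache(lambda instance: dict(instance))
first = cache.get({'prometheus_url': 'http://localhost:9090/metrics', 'namespace': 'demo'})
second = cache.get({'prometheus_url': 'http://localhost:9090/metrics', 'namespace': 'changed'})
```

Because the key is the endpoint alone, `second` is the same object as `first` and still carries the `demo` namespace: a later instance that reuses a URL with different settings receives the configuration built for the first one.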
"},{"location":"legacy/prometheus/#datadog_checks.base.checks.openmetrics.mixins.OpenMetricsScraperMixin","title":"datadog_checks.base.checks.openmetrics.mixins.OpenMetricsScraperMixin","text":"Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/mixins.py
class OpenMetricsScraperMixin(object):\n # pylint: disable=E1101\n # This class is not supposed to be used by itself, it provides scraping behavior but\n # need to be within a check in the end\n\n # indexes in the sample tuple of core.Metric\n SAMPLE_NAME = 0\n SAMPLE_LABELS = 1\n SAMPLE_VALUE = 2\n\n MICROS_IN_S = 1000000\n\n MINUS_INF = float(\"-inf\")\n\n TELEMETRY_GAUGE_MESSAGE_SIZE = \"payload.size\"\n TELEMETRY_COUNTER_METRICS_BLACKLIST_COUNT = \"metrics.blacklist.count\"\n TELEMETRY_COUNTER_METRICS_INPUT_COUNT = \"metrics.input.count\"\n TELEMETRY_COUNTER_METRICS_IGNORE_COUNT = \"metrics.ignored.count\"\n TELEMETRY_COUNTER_METRICS_PROCESS_COUNT = \"metrics.processed.count\"\n\n METRIC_TYPES = ['counter', 'gauge', 'summary', 'histogram']\n\n KUBERNETES_TOKEN_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'\n METRICS_WITH_COUNTERS = {\"counter\", \"histogram\", \"summary\"}\n\n def __init__(self, *args, **kwargs):\n # Initialize AgentCheck's base class\n super(OpenMetricsScraperMixin, self).__init__(*args, **kwargs)\n\n def create_scraper_configuration(self, instance=None):\n \"\"\"\n Creates a scraper configuration.\n\n If instance does not specify a value for a configuration option, the value will default to the `init_config`.\n Otherwise, the `default_instance` value will be used.\n\n A default mixin configuration will be returned if there is no instance.\n \"\"\"\n if 'openmetrics_endpoint' in instance:\n raise CheckException('The setting `openmetrics_endpoint` is only available for Agent version 7 or later')\n\n # We can choose to create a default mixin configuration for an empty instance\n if instance is None:\n instance = {}\n\n # Supports new configuration options\n config = copy.deepcopy(instance)\n\n # Set the endpoint\n endpoint = instance.get('prometheus_url')\n if instance and endpoint is None:\n raise CheckException(\"You have to define a prometheus_url for each prometheus instance\")\n\n # Set the bearer token authorization to 
customer value, then get the bearer token\n self.update_prometheus_url(instance, config, endpoint)\n\n # `NAMESPACE` is the prefix metrics will have. Need to be hardcoded in the\n # child check class.\n namespace = instance.get('namespace')\n # Check if we have a namespace\n if instance and namespace is None:\n if self.default_namespace is None:\n raise CheckException(\"You have to define a namespace for each prometheus check\")\n namespace = self.default_namespace\n\n config['namespace'] = namespace\n\n # Retrieve potential default instance settings for the namespace\n default_instance = self.default_instances.get(namespace, {})\n\n def _get_setting(name, default):\n return instance.get(name, default_instance.get(name, default))\n\n # `metrics_mapper` is a dictionary where the keys are the metrics to capture\n # and the values are the corresponding metrics names to have in datadog.\n # Note: it is empty in the parent class but will need to be\n # overloaded/hardcoded in the final check not to be counted as custom metric.\n\n # Metrics are preprocessed if no mapping\n metrics_mapper = {}\n # We merge list and dictionaries from optional defaults & instance settings\n metrics = default_instance.get('metrics', []) + instance.get('metrics', [])\n for metric in metrics:\n if isinstance(metric, string_types):\n metrics_mapper[metric] = metric\n else:\n metrics_mapper.update(metric)\n\n config['metrics_mapper'] = metrics_mapper\n\n # `_wildcards_re` is a Pattern object used to match metric wildcards\n config['_wildcards_re'] = None\n\n wildcards = set()\n for metric in config['metrics_mapper']:\n if \"*\" in metric:\n wildcards.add(translate(metric))\n\n if wildcards:\n config['_wildcards_re'] = compile('|'.join(wildcards))\n\n # `prometheus_metrics_prefix` allows to specify a prefix that all\n # prometheus metrics should have. 
This can be used when the prometheus\n # endpoint we are scraping allows adding a custom prefix to its\n # metrics.\n config['prometheus_metrics_prefix'] = instance.get(\n 'prometheus_metrics_prefix', default_instance.get('prometheus_metrics_prefix', '')\n )\n\n # `label_joins` holds the configuration for extracting 1:1 labels from\n # a target metric to all metrics matching the label, example:\n # self.label_joins = {\n # 'kube_pod_info': {\n # 'labels_to_match': ['pod'],\n # 'labels_to_get': ['node', 'host_ip']\n # }\n # }\n config['label_joins'] = default_instance.get('label_joins', {})\n config['label_joins'].update(instance.get('label_joins', {}))\n\n # `_label_mapping` holds the additional label info to add for a specific\n # label value, example:\n # self._label_mapping = {\n # 'pod': {\n # 'dd-agent-9s1l1': {\n # \"node\": \"yolo\",\n # \"host_ip\": \"yey\"\n # }\n # }\n # }\n config['_label_mapping'] = {}\n\n # `_active_label_mapping` holds a dictionary of label values found during the run\n # to clean up the label_mapping of unused values, example:\n # self._active_label_mapping = {\n # 'pod': {\n # 'dd-agent-9s1l1': True\n # }\n # }\n config['_active_label_mapping'] = {}\n\n # `_watched_labels` holds the sets of labels to watch for enrichment\n config['_watched_labels'] = {}\n\n config['_dry_run'] = True\n\n # Some metrics are ignored because they are duplicates or introduce a\n # very high cardinality. 
Metrics included in this list will be silently\n # skipped without a 'Unable to handle metric' debug line in the logs\n config['ignore_metrics'] = instance.get('ignore_metrics', default_instance.get('ignore_metrics', []))\n config['_ignored_metrics'] = set()\n\n # `_ignored_re` is a Pattern object used to match ignored metric patterns\n config['_ignored_re'] = None\n ignored_patterns = set()\n\n # Separate ignored metric names and ignored patterns in different sets for faster lookup later\n for metric in config['ignore_metrics']:\n if '*' in metric:\n ignored_patterns.add(translate(metric))\n else:\n config['_ignored_metrics'].add(metric)\n\n if ignored_patterns:\n config['_ignored_re'] = compile('|'.join(ignored_patterns))\n\n # Ignore metrics based on label keys or specific label values\n config['ignore_metrics_by_labels'] = instance.get(\n 'ignore_metrics_by_labels', default_instance.get('ignore_metrics_by_labels', {})\n )\n\n # If you want to send the buckets as tagged values when dealing with histograms,\n # set send_histograms_buckets to True, set to False otherwise.\n config['send_histograms_buckets'] = is_affirmative(\n instance.get('send_histograms_buckets', default_instance.get('send_histograms_buckets', True))\n )\n\n # If you want the bucket to be non cumulative and to come with upper/lower bound tags\n # set non_cumulative_buckets to True, enabled when distribution metrics are enabled.\n config['non_cumulative_buckets'] = is_affirmative(\n instance.get('non_cumulative_buckets', default_instance.get('non_cumulative_buckets', False))\n )\n\n # Send histograms as datadog distribution metrics\n config['send_distribution_buckets'] = is_affirmative(\n instance.get('send_distribution_buckets', default_instance.get('send_distribution_buckets', False))\n )\n\n # Non cumulative buckets are mandatory for distribution metrics\n if config['send_distribution_buckets'] is True:\n config['non_cumulative_buckets'] = True\n\n # If you want to send `counter` metrics as 
monotonic counts, set this value to True.\n # Set to False if you want to instead send those metrics as `gauge`.\n config['send_monotonic_counter'] = is_affirmative(\n instance.get('send_monotonic_counter', default_instance.get('send_monotonic_counter', True))\n )\n\n # If you want `counter` metrics to be submitted as both gauges and monotonic counts. Set this value to True.\n config['send_monotonic_with_gauge'] = is_affirmative(\n instance.get('send_monotonic_with_gauge', default_instance.get('send_monotonic_with_gauge', False))\n )\n\n config['send_distribution_counts_as_monotonic'] = is_affirmative(\n instance.get(\n 'send_distribution_counts_as_monotonic',\n default_instance.get('send_distribution_counts_as_monotonic', False),\n )\n )\n\n config['send_distribution_sums_as_monotonic'] = is_affirmative(\n instance.get(\n 'send_distribution_sums_as_monotonic',\n default_instance.get('send_distribution_sums_as_monotonic', False),\n )\n )\n\n # If the `labels_mapper` dictionary is provided, the metrics labels names\n # in the `labels_mapper` will use the corresponding value as tag name\n # when sending the gauges.\n config['labels_mapper'] = default_instance.get('labels_mapper', {})\n config['labels_mapper'].update(instance.get('labels_mapper', {}))\n # Rename bucket \"le\" label to \"upper_bound\"\n config['labels_mapper']['le'] = 'upper_bound'\n\n # `exclude_labels` is an array of label names to exclude. Those labels\n # will just not be added as tags when submitting the metric.\n config['exclude_labels'] = default_instance.get('exclude_labels', []) + instance.get('exclude_labels', [])\n\n # `include_labels` is an array of label names to include. 
If these labels are not in\n # the `exclude_labels` list, then they are added as tags when submitting the metric.\n config['include_labels'] = default_instance.get('include_labels', []) + instance.get('include_labels', [])\n\n # `type_overrides` is a dictionary where the keys are prometheus metric names\n # and the values are a metric type (name as string) to use instead of the one\n # listed in the payload. It can be used to force a type on untyped metrics.\n # Note: it is empty in the parent class but will need to be\n # overloaded/hardcoded in the final check not to be counted as custom metric.\n config['type_overrides'] = default_instance.get('type_overrides', {})\n config['type_overrides'].update(instance.get('type_overrides', {}))\n\n # `_type_override_patterns` is a dictionary where we store Pattern objects\n # that match metric names as keys, and their corresponding metric type overrides as values.\n config['_type_override_patterns'] = {}\n\n with_wildcards = set()\n for metric, type in iteritems(config['type_overrides']):\n if '*' in metric:\n config['_type_override_patterns'][compile(translate(metric))] = type\n with_wildcards.add(metric)\n\n # cleanup metric names with wildcards from the 'type_overrides' dict\n for metric in with_wildcards:\n del config['type_overrides'][metric]\n\n # Some metrics are retrieved from different hosts and often\n # a label can hold this information, this transfers it to the hostname\n config['label_to_hostname'] = instance.get('label_to_hostname', default_instance.get('label_to_hostname', None))\n\n # In combination to label_as_hostname, allows to add a common suffix to the hostnames\n # submitted. 
This can be used for instance to discriminate hosts between clusters.\n config['label_to_hostname_suffix'] = instance.get(\n 'label_to_hostname_suffix', default_instance.get('label_to_hostname_suffix', None)\n )\n\n # Add a 'health' service check for the prometheus endpoint\n config['health_service_check'] = is_affirmative(\n instance.get('health_service_check', default_instance.get('health_service_check', True))\n )\n\n # Can either be only the path to the certificate and thus you should specify the private key\n # or it can be the path to a file containing both the certificate & the private key\n config['ssl_cert'] = instance.get('ssl_cert', default_instance.get('ssl_cert', None))\n\n # Needed if the certificate does not include the private key\n #\n # /!\\ The private key to your local certificate must be unencrypted.\n # Currently, Requests does not support using encrypted keys.\n config['ssl_private_key'] = instance.get('ssl_private_key', default_instance.get('ssl_private_key', None))\n\n # The path to the trusted CA used for generating custom certificates\n config['ssl_ca_cert'] = instance.get('ssl_ca_cert', default_instance.get('ssl_ca_cert', None))\n\n # Whether or not to validate SSL certificates\n config['ssl_verify'] = is_affirmative(instance.get('ssl_verify', default_instance.get('ssl_verify', True)))\n\n # Extra http headers to be sent when polling endpoint\n config['extra_headers'] = default_instance.get('extra_headers', {})\n config['extra_headers'].update(instance.get('extra_headers', {}))\n\n # Timeout used during the network request\n config['prometheus_timeout'] = instance.get(\n 'prometheus_timeout', default_instance.get('prometheus_timeout', 10)\n )\n\n # Authentication used when polling endpoint\n config['username'] = instance.get('username', default_instance.get('username', None))\n config['password'] = instance.get('password', default_instance.get('password', None))\n\n # Custom tags that will be sent with each metric\n config['custom_tags'] 
= instance.get('tags', [])\n\n # Some tags can be ignored to reduce the cardinality.\n # This can be useful for cost optimization in containerized environments\n # when the openmetrics check is configured to collect custom metrics.\n # Even when the Agent's Tagger is configured to add low-cardinality tags only,\n # some tags can still generate unwanted metric contexts (e.g pod annotations as tags).\n ignore_tags = instance.get('ignore_tags', default_instance.get('ignore_tags', []))\n if ignore_tags:\n ignored_tags_re = compile('|'.join(set(ignore_tags)))\n config['custom_tags'] = [tag for tag in config['custom_tags'] if not ignored_tags_re.search(tag)]\n\n # Additional tags to be sent with each metric\n config['_metric_tags'] = []\n\n # List of strings to filter the input text payload on. If any line contains\n # one of these strings, it will be filtered out before being parsed.\n # INTERNAL FEATURE, might be removed in future versions\n config['_text_filter_blacklist'] = []\n\n # Refresh the bearer token every 60 seconds by default.\n # Ref https://github.com/DataDog/datadog-agent/pull/11686\n config['bearer_token_refresh_interval'] = instance.get(\n 'bearer_token_refresh_interval', default_instance.get('bearer_token_refresh_interval', 60)\n )\n\n config['telemetry'] = is_affirmative(instance.get('telemetry', default_instance.get('telemetry', False)))\n\n # The metric name services use to indicate build information\n config['metadata_metric_name'] = instance.get(\n 'metadata_metric_name', default_instance.get('metadata_metric_name')\n )\n\n # Map of metadata key names to label names\n config['metadata_label_map'] = instance.get(\n 'metadata_label_map', default_instance.get('metadata_label_map', {})\n )\n\n config['_default_metric_transformers'] = {}\n if config['metadata_metric_name'] and config['metadata_label_map']:\n config['_default_metric_transformers'][config['metadata_metric_name']] = self.transform_metadata\n\n # Whether or not to enable flushing of the 
first value of monotonic counts\n config['_flush_first_value'] = False\n\n # Whether to use process_start_time_seconds to decide if counter-like values should be flushed\n # on first scrape.\n config['use_process_start_time'] = is_affirmative(_get_setting('use_process_start_time', False))\n\n return config\n\n def get_http_handler(self, scraper_config):\n \"\"\"\n Get http handler for a specific scraper config.\n The http handler is cached using `prometheus_url` as key.\n The http handler doesn't use the cache if a bearer token is used to allow refreshing it.\n \"\"\"\n prometheus_url = scraper_config['prometheus_url']\n bearer_token = scraper_config['_bearer_token']\n if prometheus_url in self._http_handlers and bearer_token is None:\n return self._http_handlers[prometheus_url]\n\n # TODO: Deprecate this behavior in Agent 8\n if scraper_config['ssl_ca_cert'] is False:\n scraper_config['ssl_verify'] = False\n\n # TODO: Deprecate this behavior in Agent 8\n if scraper_config['ssl_verify'] is False:\n scraper_config.setdefault('tls_ignore_warning', True)\n\n http_handler = self._http_handlers[prometheus_url] = RequestsWrapper(\n scraper_config, self.init_config, self.HTTP_CONFIG_REMAPPER, self.log\n )\n\n headers = http_handler.options['headers']\n\n bearer_token = scraper_config['_bearer_token']\n if bearer_token is not None:\n headers['Authorization'] = 'Bearer {}'.format(bearer_token)\n\n # TODO: Determine if we really need this\n headers.setdefault('accept-encoding', 'gzip')\n\n # Explicitly set the content type we accept\n headers.setdefault('accept', 'text/plain')\n\n return http_handler\n\n def reset_http_config(self):\n \"\"\"\n You may need to use this when configuration is determined dynamically during every\n check run, such as when polling an external resource like the Kubelet.\n \"\"\"\n self._http_handlers.clear()\n\n def update_prometheus_url(self, instance, config, endpoint):\n if not endpoint:\n return\n\n config['prometheus_url'] = endpoint\n # 
Whether or not to use the service account bearer token for authentication.\n # Can be explicitly set to true or false to send or not the bearer token.\n # If set to the `tls_only` value, the bearer token will be sent only to https endpoints.\n # If 'bearer_token_path' is not set, we use /var/run/secrets/kubernetes.io/serviceaccount/token\n # as a default path to get the token.\n namespace = instance.get('namespace')\n default_instance = self.default_instances.get(namespace, {})\n bearer_token_auth = instance.get('bearer_token_auth', default_instance.get('bearer_token_auth', False))\n if bearer_token_auth == 'tls_only':\n config['bearer_token_auth'] = config['prometheus_url'].startswith(\"https://\")\n else:\n config['bearer_token_auth'] = is_affirmative(bearer_token_auth)\n\n # Can be used to get a service account bearer token from files\n # other than /var/run/secrets/kubernetes.io/serviceaccount/token\n # 'bearer_token_auth' should be enabled.\n config['bearer_token_path'] = instance.get('bearer_token_path', default_instance.get('bearer_token_path', None))\n\n # The service account bearer token to be used for authentication\n config['_bearer_token'] = self._get_bearer_token(config['bearer_token_auth'], config['bearer_token_path'])\n config['_bearer_token_last_refresh'] = time.time()\n\n def parse_metric_family(self, response, scraper_config):\n \"\"\"\n Parse the MetricFamily from a valid `requests.Response` object to provide a MetricFamily object.\n The text format uses iter_lines() generator.\n \"\"\"\n if response.encoding is None:\n response.encoding = 'utf-8'\n input_gen = response.iter_lines(decode_unicode=True)\n if scraper_config['_text_filter_blacklist']:\n input_gen = self._text_filter_input(input_gen, scraper_config)\n\n for metric in text_fd_to_metric_families(input_gen):\n self._send_telemetry_counter(\n self.TELEMETRY_COUNTER_METRICS_INPUT_COUNT, len(metric.samples), scraper_config\n )\n type_override = 
scraper_config['type_overrides'].get(metric.name)\n if type_override:\n metric.type = type_override\n elif scraper_config['_type_override_patterns']:\n for pattern, new_type in iteritems(scraper_config['_type_override_patterns']):\n if pattern.search(metric.name):\n metric.type = new_type\n break\n if metric.type not in self.METRIC_TYPES:\n continue\n metric.name = self._remove_metric_prefix(metric.name, scraper_config)\n yield metric\n\n def _text_filter_input(self, input_gen, scraper_config):\n \"\"\"\n Filters out the text input line by line to avoid parsing and processing\n metrics we know we don't want to process. This only works on `text/plain`\n payloads, and is an INTERNAL FEATURE implemented for the kubelet check\n :param input_get: line generator\n :output: generator of filtered lines\n \"\"\"\n for line in input_gen:\n for item in scraper_config['_text_filter_blacklist']:\n if item in line:\n self._send_telemetry_counter(self.TELEMETRY_COUNTER_METRICS_BLACKLIST_COUNT, 1, scraper_config)\n break\n else:\n # No blacklist matches, passing the line through\n yield line\n\n def _remove_metric_prefix(self, metric, scraper_config):\n prometheus_metrics_prefix = scraper_config['prometheus_metrics_prefix']\n return metric[len(prometheus_metrics_prefix) :] if metric.startswith(prometheus_metrics_prefix) else metric\n\n def scrape_metrics(self, scraper_config):\n \"\"\"\n Poll the data from Prometheus and return the metrics as a generator.\n \"\"\"\n response = self.poll(scraper_config)\n if scraper_config['telemetry']:\n if 'content-length' in response.headers:\n content_len = int(response.headers['content-length'])\n else:\n content_len = len(response.content)\n self._send_telemetry_gauge(self.TELEMETRY_GAUGE_MESSAGE_SIZE, content_len, scraper_config)\n try:\n # no dry run if no label joins\n if not scraper_config['label_joins']:\n scraper_config['_dry_run'] = False\n elif not scraper_config['_watched_labels']:\n watched = scraper_config['_watched_labels']\n 
watched['sets'] = {}\n watched['keys'] = {}\n watched['singles'] = set()\n for key, val in iteritems(scraper_config['label_joins']):\n labels = []\n if 'labels_to_match' in val:\n labels = val['labels_to_match']\n elif 'label_to_match' in val:\n self.log.warning(\"`label_to_match` is being deprecated, please use `labels_to_match`\")\n if isinstance(val['label_to_match'], list):\n labels = val['label_to_match']\n else:\n labels = [val['label_to_match']]\n\n if labels:\n s = frozenset(labels)\n watched['sets'][key] = s\n watched['keys'][key] = ','.join(s)\n if len(labels) == 1:\n watched['singles'].add(labels[0])\n\n for metric in self.parse_metric_family(response, scraper_config):\n yield metric\n\n # Set dry run off\n scraper_config['_dry_run'] = False\n # Garbage collect unused mapping and reset active labels\n for metric, mapping in list(iteritems(scraper_config['_label_mapping'])):\n for key in list(mapping):\n if (\n metric in scraper_config['_active_label_mapping']\n and key not in scraper_config['_active_label_mapping'][metric]\n ):\n del scraper_config['_label_mapping'][metric][key]\n scraper_config['_active_label_mapping'] = {}\n finally:\n response.close()\n\n def process(self, scraper_config, metric_transformers=None):\n \"\"\"\n Polls the data from Prometheus and submits them as Datadog metrics.\n `endpoint` is the metrics endpoint to use to poll metrics from Prometheus\n\n Note that if the instance has a `tags` attribute, it will be pushed\n automatically as additional custom tags and added to the metrics\n \"\"\"\n\n transformers = scraper_config['_default_metric_transformers'].copy()\n if metric_transformers:\n transformers.update(metric_transformers)\n\n counter_buffer = []\n agent_start_time = None\n process_start_time = None\n if not scraper_config['_flush_first_value'] and scraper_config['use_process_start_time']:\n agent_start_time = datadog_agent.get_process_start_time()\n\n if scraper_config['bearer_token_auth']:\n 
self._refresh_bearer_token(scraper_config)\n\n for metric in self.scrape_metrics(scraper_config):\n if agent_start_time is not None:\n if metric.name == 'process_start_time_seconds' and metric.samples:\n min_metric_value = min(s[self.SAMPLE_VALUE] for s in metric.samples)\n if process_start_time is None or min_metric_value < process_start_time:\n process_start_time = min_metric_value\n if metric.type in self.METRICS_WITH_COUNTERS:\n counter_buffer.append(metric)\n continue\n\n self.process_metric(metric, scraper_config, metric_transformers=transformers)\n\n if agent_start_time and process_start_time and agent_start_time < process_start_time:\n # If agent was started before the process, we assume counters were started recently from zero,\n # and thus we can compute the rates.\n scraper_config['_flush_first_value'] = True\n\n for metric in counter_buffer:\n self.process_metric(metric, scraper_config, metric_transformers=transformers)\n\n scraper_config['_flush_first_value'] = True\n\n def transform_metadata(self, metric, scraper_config):\n labels = metric.samples[0][self.SAMPLE_LABELS]\n for metadata_name, label_name in iteritems(scraper_config['metadata_label_map']):\n if label_name in labels:\n self.set_metadata(metadata_name, labels[label_name])\n\n def _metric_name_with_namespace(self, metric_name, scraper_config):\n namespace = scraper_config['namespace']\n if not namespace:\n return metric_name\n return '{}.{}'.format(namespace, metric_name)\n\n def _telemetry_metric_name_with_namespace(self, metric_name, scraper_config):\n namespace = scraper_config['namespace']\n if not namespace:\n return '{}.{}'.format('telemetry', metric_name)\n return '{}.{}.{}'.format(namespace, 'telemetry', metric_name)\n\n def _send_telemetry_gauge(self, metric_name, val, scraper_config):\n if scraper_config['telemetry']:\n metric_name_with_namespace = self._telemetry_metric_name_with_namespace(metric_name, scraper_config)\n # Determine the tags to send\n custom_tags = 
scraper_config['custom_tags']\n tags = list(custom_tags)\n tags.extend(scraper_config['_metric_tags'])\n self.gauge(metric_name_with_namespace, val, tags=tags)\n\n def _send_telemetry_counter(self, metric_name, val, scraper_config, extra_tags=None):\n if scraper_config['telemetry']:\n metric_name_with_namespace = self._telemetry_metric_name_with_namespace(metric_name, scraper_config)\n # Determine the tags to send\n custom_tags = scraper_config['custom_tags']\n tags = list(custom_tags)\n tags.extend(scraper_config['_metric_tags'])\n if extra_tags:\n tags.extend(extra_tags)\n self.count(metric_name_with_namespace, val, tags=tags)\n\n def _store_labels(self, metric, scraper_config):\n # If targeted metric, store labels\n if metric.name not in scraper_config['label_joins']:\n return\n\n watched = scraper_config['_watched_labels']\n matching_labels = watched['sets'][metric.name]\n mapping_key = watched['keys'][metric.name]\n\n labels_to_get = scraper_config['label_joins'][metric.name]['labels_to_get']\n get_all = '*' in labels_to_get\n match_all = mapping_key == '*'\n for sample in metric.samples:\n # metadata-only metrics that are used for label joins are always equal to 1\n # this is required for metrics where all combinations of a state are sent\n # but only the active one is set to 1 (others are set to 0)\n # example: kube_pod_status_phase in kube-state-metrics\n if sample[self.SAMPLE_VALUE] != 1:\n continue\n\n sample_labels = sample[self.SAMPLE_LABELS]\n sample_labels_keys = sample_labels.keys()\n\n if match_all or matching_labels.issubset(sample_labels_keys):\n label_dict = {}\n\n if get_all:\n for label_name, label_value in iteritems(sample_labels):\n if label_name in matching_labels:\n continue\n label_dict[label_name] = label_value\n else:\n for label_name in labels_to_get:\n if label_name in sample_labels:\n label_dict[label_name] = sample_labels[label_name]\n\n if match_all:\n mapping_value = '*'\n else:\n mapping_value = ','.join([sample_labels[l] for l in 
matching_labels])\n\n scraper_config['_label_mapping'].setdefault(mapping_key, {}).setdefault(mapping_value, {}).update(\n label_dict\n )\n\n def _join_labels(self, metric, scraper_config):\n # Filter metric to see if we can enrich with joined labels\n if not scraper_config['label_joins']:\n return\n\n label_mapping = scraper_config['_label_mapping']\n active_label_mapping = scraper_config['_active_label_mapping']\n\n watched = scraper_config['_watched_labels']\n sets = watched['sets']\n keys = watched['keys']\n singles = watched['singles']\n\n for sample in metric.samples:\n sample_labels = sample[self.SAMPLE_LABELS]\n sample_labels_keys = sample_labels.keys()\n\n # Match with wildcard label\n # Label names are [a-zA-Z0-9_]*, so no risk of collision\n if '*' in singles:\n active_label_mapping.setdefault('*', {})['*'] = True\n\n if '*' in label_mapping and '*' in label_mapping['*']:\n sample_labels.update(label_mapping['*']['*'])\n\n # Match with single labels\n matching_single_labels = singles.intersection(sample_labels_keys)\n for label in matching_single_labels:\n mapping_key = label\n mapping_value = sample_labels[label]\n\n active_label_mapping.setdefault(mapping_key, {})[mapping_value] = True\n\n if mapping_key in label_mapping and mapping_value in label_mapping[mapping_key]:\n sample_labels.update(label_mapping[mapping_key][mapping_value])\n\n # Match with tuples of labels\n for key, mapping_key in iteritems(keys):\n if mapping_key in matching_single_labels:\n continue\n\n matching_labels = sets[key]\n\n if matching_labels.issubset(sample_labels_keys):\n matching_values = [sample_labels[l] for l in matching_labels]\n mapping_value = ','.join(matching_values)\n\n active_label_mapping.setdefault(mapping_key, {})[mapping_value] = True\n\n if mapping_key in label_mapping and mapping_value in label_mapping[mapping_key]:\n sample_labels.update(label_mapping[mapping_key][mapping_value])\n\n def _ignore_metrics_by_label(self, scraper_config, metric_name, sample):\n 
ignore_metrics_by_label = scraper_config['ignore_metrics_by_labels']\n sample_labels = sample[self.SAMPLE_LABELS]\n for label_key, label_values in ignore_metrics_by_label.items():\n if not label_values:\n self.log.debug(\n \"Skipping filter label `%s` with an empty values list, did you mean to use '*' wildcard?\", label_key\n )\n elif '*' in label_values:\n # Wildcard '*' means all metrics with label_key will be ignored\n self.log.debug(\"Detected wildcard for label `%s`\", label_key)\n if label_key in sample_labels.keys():\n self.log.debug(\"Skipping metric `%s` due to label key matching: %s\", metric_name, label_key)\n return True\n else:\n for val in label_values:\n if label_key in sample_labels and sample_labels[label_key] == val:\n self.log.debug(\n \"Skipping metric `%s` due to label `%s` value matching: %s\", metric_name, label_key, val\n )\n return True\n return False\n\n def process_metric(self, metric, scraper_config, metric_transformers=None):\n \"\"\"\n Handle a Prometheus metric according to the following flow:\n - search `scraper_config['metrics_mapper']` for a prometheus.metric to datadog.metric mapping\n - call check method with the same name as the metric\n - log info if none of the above worked\n\n `metric_transformers` is a dict of `<metric name>:<function to run when the metric name is encountered>`\n \"\"\"\n # If targeted metric, store labels\n self._store_labels(metric, scraper_config)\n\n if scraper_config['ignore_metrics']:\n if metric.name in scraper_config['_ignored_metrics']:\n self._send_telemetry_counter(\n self.TELEMETRY_COUNTER_METRICS_IGNORE_COUNT, len(metric.samples), scraper_config\n )\n return # Ignore the metric\n\n if scraper_config['_ignored_re'] and scraper_config['_ignored_re'].search(metric.name):\n # Metric must be ignored\n scraper_config['_ignored_metrics'].add(metric.name)\n self._send_telemetry_counter(\n self.TELEMETRY_COUNTER_METRICS_IGNORE_COUNT, len(metric.samples), scraper_config\n )\n return # Ignore the 
metric\n\n self._send_telemetry_counter(self.TELEMETRY_COUNTER_METRICS_PROCESS_COUNT, len(metric.samples), scraper_config)\n\n if self._filter_metric(metric, scraper_config):\n return # Ignore the metric\n\n # Filter metric to see if we can enrich with joined labels\n self._join_labels(metric, scraper_config)\n\n if scraper_config['_dry_run']:\n return\n\n try:\n self.submit_openmetric(scraper_config['metrics_mapper'][metric.name], metric, scraper_config)\n except KeyError:\n if metric_transformers is not None and metric.name in metric_transformers:\n try:\n # Get the transformer function for this specific metric\n transformer = metric_transformers[metric.name]\n transformer(metric, scraper_config)\n except Exception as err:\n self.log.warning('Error handling metric: %s - error: %s', metric.name, err)\n\n return\n # check for wildcards in transformers\n for transformer_name, transformer in iteritems(metric_transformers):\n if transformer_name.endswith('*') and metric.name.startswith(transformer_name[:-1]):\n transformer(metric, scraper_config, transformer_name)\n\n # try matching wildcards\n if scraper_config['_wildcards_re'] and scraper_config['_wildcards_re'].search(metric.name):\n self.submit_openmetric(metric.name, metric, scraper_config)\n return\n\n self.log.debug(\n 'Skipping metric `%s` as it is not defined in the metrics mapper, '\n 'has no transformer function, nor does it match any wildcards.',\n metric.name,\n )\n\n def poll(self, scraper_config, headers=None):\n \"\"\"\n Returns a valid `requests.Response`, otherwise raise requests.HTTPError if the status code of the\n response isn't valid - see `response.raise_for_status()`\n\n The caller needs to close the requests.Response.\n\n Custom headers can be added to the default headers.\n \"\"\"\n endpoint = scraper_config.get('prometheus_url')\n\n # Should we send a service check for when we make a request\n health_service_check = scraper_config['health_service_check']\n service_check_name = 
self._metric_name_with_namespace('prometheus.health', scraper_config)\n service_check_tags = ['endpoint:{}'.format(endpoint)]\n service_check_tags.extend(scraper_config['custom_tags'])\n\n try:\n response = self.send_request(endpoint, scraper_config, headers)\n except requests.exceptions.SSLError:\n self.log.error(\"Invalid SSL settings for requesting %s endpoint\", endpoint)\n raise\n except IOError:\n if health_service_check:\n self.service_check(service_check_name, AgentCheck.CRITICAL, tags=service_check_tags)\n raise\n try:\n response.raise_for_status()\n if health_service_check:\n self.service_check(service_check_name, AgentCheck.OK, tags=service_check_tags)\n return response\n except requests.HTTPError:\n response.close()\n if health_service_check:\n self.service_check(service_check_name, AgentCheck.CRITICAL, tags=service_check_tags)\n raise\n\n def send_request(self, endpoint, scraper_config, headers=None):\n kwargs = {}\n if headers:\n kwargs['headers'] = headers\n\n http_handler = self.get_http_handler(scraper_config)\n\n return http_handler.get(endpoint, stream=True, **kwargs)\n\n def get_hostname_for_sample(self, sample, scraper_config):\n \"\"\"\n Expose the label_to_hostname mapping logic to custom handler methods\n \"\"\"\n return self._get_hostname(None, sample, scraper_config)\n\n def submit_openmetric(self, metric_name, metric, scraper_config, hostname=None):\n \"\"\"\n For each sample in the metric, report it as a gauge with all labels as tags\n except if a labels `dict` is passed, in which case keys are label names we'll extract\n and corresponding values are tag names we'll use (eg: {'node': 'node'}).\n\n Histograms generate a set of values instead of a unique metric.\n `send_histograms_buckets` is used to specify if you want to\n send the buckets as tagged values when dealing with histograms.\n\n `custom_tags` is an array of `tag:value` that will be added to the\n metric when sending the gauge to Datadog.\n \"\"\"\n if metric.type in 
[\"gauge\", \"counter\", \"rate\"]:\n metric_name_with_namespace = self._metric_name_with_namespace(metric_name, scraper_config)\n for sample in metric.samples:\n if self._ignore_metrics_by_label(scraper_config, metric_name, sample):\n continue\n\n val = sample[self.SAMPLE_VALUE]\n if not self._is_value_valid(val):\n self.log.debug(\"Metric value is not supported for metric %s\", sample[self.SAMPLE_NAME])\n continue\n custom_hostname = self._get_hostname(hostname, sample, scraper_config)\n # Determine the tags to send\n tags = self._metric_tags(metric_name, val, sample, scraper_config, hostname=custom_hostname)\n if metric.type == \"counter\" and scraper_config['send_monotonic_counter']:\n self.monotonic_count(\n metric_name_with_namespace,\n val,\n tags=tags,\n hostname=custom_hostname,\n flush_first_value=scraper_config['_flush_first_value'],\n )\n elif metric.type == \"rate\":\n self.rate(metric_name_with_namespace, val, tags=tags, hostname=custom_hostname)\n else:\n self.gauge(metric_name_with_namespace, val, tags=tags, hostname=custom_hostname)\n\n # Metric is a \"counter\" but legacy behavior has \"send_as_monotonic\" defaulted to False\n # Submit metric as monotonic_count with appended name\n if metric.type == \"counter\" and scraper_config['send_monotonic_with_gauge']:\n self.monotonic_count(\n metric_name_with_namespace + '.total',\n val,\n tags=tags,\n hostname=custom_hostname,\n flush_first_value=scraper_config['_flush_first_value'],\n )\n elif metric.type == \"histogram\":\n self._submit_gauges_from_histogram(metric_name, metric, scraper_config)\n elif metric.type == \"summary\":\n self._submit_gauges_from_summary(metric_name, metric, scraper_config)\n else:\n self.log.error(\"Metric type %s unsupported for metric %s.\", metric.type, metric_name)\n\n def _get_hostname(self, hostname, sample, scraper_config):\n \"\"\"\n If hostname is None, look at label_to_hostname setting\n \"\"\"\n if (\n hostname is None\n and scraper_config['label_to_hostname'] is 
not None\n and sample[self.SAMPLE_LABELS].get(scraper_config['label_to_hostname'])\n ):\n hostname = sample[self.SAMPLE_LABELS][scraper_config['label_to_hostname']]\n suffix = scraper_config['label_to_hostname_suffix']\n if suffix is not None:\n hostname += suffix\n\n return hostname\n\n def _submit_gauges_from_summary(self, metric_name, metric, scraper_config, hostname=None):\n \"\"\"\n Extracts metrics from a prometheus summary metric and sends them as gauges\n \"\"\"\n for sample in metric.samples:\n val = sample[self.SAMPLE_VALUE]\n if not self._is_value_valid(val):\n self.log.debug(\"Metric value is not supported for metric %s\", sample[self.SAMPLE_NAME])\n continue\n if self._ignore_metrics_by_label(scraper_config, metric_name, sample):\n continue\n custom_hostname = self._get_hostname(hostname, sample, scraper_config)\n if sample[self.SAMPLE_NAME].endswith(\"_sum\"):\n tags = self._metric_tags(metric_name, val, sample, scraper_config, hostname=custom_hostname)\n self._submit_distribution_count(\n scraper_config['send_distribution_sums_as_monotonic'],\n scraper_config['send_monotonic_with_gauge'],\n \"{}.sum\".format(self._metric_name_with_namespace(metric_name, scraper_config)),\n val,\n tags=tags,\n hostname=custom_hostname,\n flush_first_value=scraper_config['_flush_first_value'],\n )\n elif sample[self.SAMPLE_NAME].endswith(\"_count\"):\n tags = self._metric_tags(metric_name, val, sample, scraper_config, hostname=custom_hostname)\n self._submit_distribution_count(\n scraper_config['send_distribution_counts_as_monotonic'],\n scraper_config['send_monotonic_with_gauge'],\n \"{}.count\".format(self._metric_name_with_namespace(metric_name, scraper_config)),\n val,\n tags=tags,\n hostname=custom_hostname,\n flush_first_value=scraper_config['_flush_first_value'],\n )\n else:\n try:\n quantile = sample[self.SAMPLE_LABELS][\"quantile\"]\n except KeyError:\n # TODO: In the Prometheus spec the 'quantile' label is optional, but it's not clear yet\n # what we should 
do in this case. Let's skip for now and submit the rest of metrics.\n message = (\n '\"quantile\" label not present in metric %r. '\n 'Quantile-less summary metrics are not currently supported. Skipping...'\n )\n self.log.debug(message, metric_name)\n continue\n\n sample[self.SAMPLE_LABELS][\"quantile\"] = str(float(quantile))\n tags = self._metric_tags(metric_name, val, sample, scraper_config, hostname=custom_hostname)\n self.gauge(\n \"{}.quantile\".format(self._metric_name_with_namespace(metric_name, scraper_config)),\n val,\n tags=tags,\n hostname=custom_hostname,\n )\n\n def _submit_gauges_from_histogram(self, metric_name, metric, scraper_config, hostname=None):\n \"\"\"\n Extracts metrics from a prometheus histogram and sends them as gauges\n \"\"\"\n if scraper_config['non_cumulative_buckets']:\n self._decumulate_histogram_buckets(metric)\n for sample in metric.samples:\n val = sample[self.SAMPLE_VALUE]\n if not self._is_value_valid(val):\n self.log.debug(\"Metric value is not supported for metric %s\", sample[self.SAMPLE_NAME])\n continue\n if self._ignore_metrics_by_label(scraper_config, metric_name, sample):\n continue\n custom_hostname = self._get_hostname(hostname, sample, scraper_config)\n if sample[self.SAMPLE_NAME].endswith(\"_sum\") and not scraper_config['send_distribution_buckets']:\n tags = self._metric_tags(metric_name, val, sample, scraper_config, hostname)\n self._submit_distribution_count(\n scraper_config['send_distribution_sums_as_monotonic'],\n scraper_config['send_monotonic_with_gauge'],\n \"{}.sum\".format(self._metric_name_with_namespace(metric_name, scraper_config)),\n val,\n tags=tags,\n hostname=custom_hostname,\n flush_first_value=scraper_config['_flush_first_value'],\n )\n elif sample[self.SAMPLE_NAME].endswith(\"_count\") and not scraper_config['send_distribution_buckets']:\n tags = self._metric_tags(metric_name, val, sample, scraper_config, hostname)\n if scraper_config['send_histograms_buckets']:\n 
tags.append(\"upper_bound:none\")\n self._submit_distribution_count(\n scraper_config['send_distribution_counts_as_monotonic'],\n scraper_config['send_monotonic_with_gauge'],\n \"{}.count\".format(self._metric_name_with_namespace(metric_name, scraper_config)),\n val,\n tags=tags,\n hostname=custom_hostname,\n flush_first_value=scraper_config['_flush_first_value'],\n )\n elif scraper_config['send_histograms_buckets'] and sample[self.SAMPLE_NAME].endswith(\"_bucket\"):\n if scraper_config['send_distribution_buckets']:\n self._submit_sample_histogram_buckets(metric_name, sample, scraper_config, hostname)\n elif \"Inf\" not in sample[self.SAMPLE_LABELS][\"le\"] or scraper_config['non_cumulative_buckets']:\n sample[self.SAMPLE_LABELS][\"le\"] = str(float(sample[self.SAMPLE_LABELS][\"le\"]))\n tags = self._metric_tags(metric_name, val, sample, scraper_config, hostname)\n self._submit_distribution_count(\n scraper_config['send_distribution_counts_as_monotonic'],\n scraper_config['send_monotonic_with_gauge'],\n \"{}.count\".format(self._metric_name_with_namespace(metric_name, scraper_config)),\n val,\n tags=tags,\n hostname=custom_hostname,\n flush_first_value=scraper_config['_flush_first_value'],\n )\n\n def _compute_bucket_hash(self, tags):\n # we need the unique context for all the buckets\n # hence we remove the \"le\" tag\n return hash(frozenset(sorted((k, v) for k, v in iteritems(tags) if k != 'le')))\n\n def _decumulate_histogram_buckets(self, metric):\n \"\"\"\n Decumulate buckets in a given histogram metric and adds the lower_bound label (le being upper_bound)\n \"\"\"\n bucket_values_by_context_upper_bound = {}\n for sample in metric.samples:\n if sample[self.SAMPLE_NAME].endswith(\"_bucket\"):\n context_key = self._compute_bucket_hash(sample[self.SAMPLE_LABELS])\n if context_key not in bucket_values_by_context_upper_bound:\n bucket_values_by_context_upper_bound[context_key] = {}\n 
bucket_values_by_context_upper_bound[context_key][float(sample[self.SAMPLE_LABELS][\"le\"])] = sample[\n self.SAMPLE_VALUE\n ]\n\n sorted_buckets_by_context = {}\n for context in bucket_values_by_context_upper_bound:\n sorted_buckets_by_context[context] = sorted(bucket_values_by_context_upper_bound[context])\n\n # Tuples (lower_bound, upper_bound, value)\n bucket_tuples_by_context_upper_bound = {}\n for context in sorted_buckets_by_context:\n for i, upper_b in enumerate(sorted_buckets_by_context[context]):\n if i == 0:\n if context not in bucket_tuples_by_context_upper_bound:\n bucket_tuples_by_context_upper_bound[context] = {}\n if upper_b > 0:\n # positive buckets start at zero\n bucket_tuples_by_context_upper_bound[context][upper_b] = (\n 0,\n upper_b,\n bucket_values_by_context_upper_bound[context][upper_b],\n )\n else:\n # negative buckets start at -inf\n bucket_tuples_by_context_upper_bound[context][upper_b] = (\n self.MINUS_INF,\n upper_b,\n bucket_values_by_context_upper_bound[context][upper_b],\n )\n continue\n tmp = (\n bucket_values_by_context_upper_bound[context][upper_b]\n - bucket_values_by_context_upper_bound[context][sorted_buckets_by_context[context][i - 1]]\n )\n bucket_tuples_by_context_upper_bound[context][upper_b] = (\n sorted_buckets_by_context[context][i - 1],\n upper_b,\n tmp,\n )\n\n # modify original metric to inject lower_bound & modified value\n for i, sample in enumerate(metric.samples):\n if not sample[self.SAMPLE_NAME].endswith(\"_bucket\"):\n continue\n\n context_key = self._compute_bucket_hash(sample[self.SAMPLE_LABELS])\n matching_bucket_tuple = bucket_tuples_by_context_upper_bound[context_key][\n float(sample[self.SAMPLE_LABELS][\"le\"])\n ]\n # Replacing the sample tuple\n sample[self.SAMPLE_LABELS][\"lower_bound\"] = str(matching_bucket_tuple[0])\n metric.samples[i] = Sample(sample[self.SAMPLE_NAME], sample[self.SAMPLE_LABELS], matching_bucket_tuple[2])\n\n def _submit_sample_histogram_buckets(self, metric_name, sample, 
scraper_config, hostname=None):\n if \"lower_bound\" not in sample[self.SAMPLE_LABELS] or \"le\" not in sample[self.SAMPLE_LABELS]:\n self.log.warning(\n \"Metric: %s was not containing required bucket boundaries labels: %s\",\n metric_name,\n sample[self.SAMPLE_LABELS],\n )\n return\n sample[self.SAMPLE_LABELS][\"le\"] = str(float(sample[self.SAMPLE_LABELS][\"le\"]))\n sample[self.SAMPLE_LABELS][\"lower_bound\"] = str(float(sample[self.SAMPLE_LABELS][\"lower_bound\"]))\n if sample[self.SAMPLE_LABELS][\"le\"] == sample[self.SAMPLE_LABELS][\"lower_bound\"]:\n # this can happen for -inf/-inf bucket that we don't want to send (always 0)\n self.log.warning(\n \"Metric: %s has bucket boundaries equal, skipping: %s\", metric_name, sample[self.SAMPLE_LABELS]\n )\n return\n tags = self._metric_tags(metric_name, sample[self.SAMPLE_VALUE], sample, scraper_config, hostname)\n self.submit_histogram_bucket(\n self._metric_name_with_namespace(metric_name, scraper_config),\n sample[self.SAMPLE_VALUE],\n float(sample[self.SAMPLE_LABELS][\"lower_bound\"]),\n float(sample[self.SAMPLE_LABELS][\"le\"]),\n True,\n hostname,\n tags,\n flush_first_value=scraper_config['_flush_first_value'],\n )\n\n def _submit_distribution_count(\n self,\n monotonic,\n send_monotonic_with_gauge,\n metric_name,\n value,\n tags=None,\n hostname=None,\n flush_first_value=False,\n ):\n if monotonic:\n self.monotonic_count(metric_name, value, tags=tags, hostname=hostname, flush_first_value=flush_first_value)\n else:\n self.gauge(metric_name, value, tags=tags, hostname=hostname)\n if send_monotonic_with_gauge:\n self.monotonic_count(\n metric_name + \".total\", value, tags=tags, hostname=hostname, flush_first_value=flush_first_value\n )\n\n def _metric_tags(self, metric_name, val, sample, scraper_config, hostname=None):\n custom_tags = scraper_config['custom_tags']\n _tags = list(custom_tags)\n _tags.extend(scraper_config['_metric_tags'])\n for label_name, label_value in 
iteritems(sample[self.SAMPLE_LABELS]):\n if label_name not in scraper_config['exclude_labels']:\n if label_name in scraper_config['include_labels'] or len(scraper_config['include_labels']) == 0:\n tag_name = scraper_config['labels_mapper'].get(label_name, label_name)\n _tags.append('{}:{}'.format(to_native_string(tag_name), to_native_string(label_value)))\n return self._finalize_tags_to_submit(\n _tags, metric_name, val, sample, custom_tags=custom_tags, hostname=hostname\n )\n\n def _is_value_valid(self, val):\n return not (isnan(val) or isinf(val))\n\n def _get_bearer_token(self, bearer_token_auth, bearer_token_path):\n if bearer_token_auth is False:\n return None\n\n path = None\n if bearer_token_path is not None:\n if isfile(bearer_token_path):\n path = bearer_token_path\n else:\n self.log.error(\"File not found: %s\", bearer_token_path)\n elif isfile(self.KUBERNETES_TOKEN_PATH):\n path = self.KUBERNETES_TOKEN_PATH\n\n if path is None:\n self.log.error(\"Cannot get bearer token from bearer_token_path or auto discovery\")\n raise IOError(\"Cannot get bearer token from bearer_token_path or auto discovery\")\n\n try:\n with open(path, 'r') as f:\n return f.read().rstrip()\n except Exception as err:\n self.log.error(\"Cannot get bearer token from path: %s - error: %s\", path, err)\n raise\n\n def _refresh_bearer_token(self, scraper_config):\n \"\"\"\n Refreshes the bearer token if the refresh interval is elapsed.\n \"\"\"\n now = time.time()\n if now - scraper_config['_bearer_token_last_refresh'] > scraper_config['bearer_token_refresh_interval']:\n scraper_config['_bearer_token'] = self._get_bearer_token(\n scraper_config['bearer_token_auth'], scraper_config['bearer_token_path']\n )\n scraper_config['_bearer_token_last_refresh'] = now\n\n def _histogram_convert_values(self, metric_name, converter):\n def _convert(metric, scraper_config=None):\n for index, sample in enumerate(metric.samples):\n val = sample[self.SAMPLE_VALUE]\n if not self._is_value_valid(val):\n 
self.log.debug(\"Metric value is not supported for metric %s\", sample[self.SAMPLE_NAME])\n continue\n if sample[self.SAMPLE_NAME].endswith(\"_sum\"):\n lst = list(sample)\n lst[self.SAMPLE_VALUE] = converter(val)\n metric.samples[index] = tuple(lst)\n elif sample[self.SAMPLE_NAME].endswith(\"_bucket\") and \"Inf\" not in sample[self.SAMPLE_LABELS][\"le\"]:\n sample[self.SAMPLE_LABELS][\"le\"] = str(converter(float(sample[self.SAMPLE_LABELS][\"le\"])))\n self.submit_openmetric(metric_name, metric, scraper_config)\n\n return _convert\n\n def _histogram_from_microseconds_to_seconds(self, metric_name):\n return self._histogram_convert_values(metric_name, lambda v: v / self.MICROS_IN_S)\n\n def _histogram_from_seconds_to_microseconds(self, metric_name):\n return self._histogram_convert_values(metric_name, lambda v: v * self.MICROS_IN_S)\n\n def _summary_convert_values(self, metric_name, converter):\n def _convert(metric, scraper_config=None):\n for index, sample in enumerate(metric.samples):\n val = sample[self.SAMPLE_VALUE]\n if not self._is_value_valid(val):\n self.log.debug(\"Metric value is not supported for metric %s\", sample[self.SAMPLE_NAME])\n continue\n if sample[self.SAMPLE_NAME].endswith(\"_count\"):\n continue\n else:\n lst = list(sample)\n lst[self.SAMPLE_VALUE] = converter(val)\n metric.samples[index] = tuple(lst)\n self.submit_openmetric(metric_name, metric, scraper_config)\n\n return _convert\n\n def _summary_from_microseconds_to_seconds(self, metric_name):\n return self._summary_convert_values(metric_name, lambda v: v / self.MICROS_IN_S)\n\n def _summary_from_seconds_to_microseconds(self, metric_name):\n return self._summary_convert_values(metric_name, lambda v: v * self.MICROS_IN_S)\n
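The bucket decumulation performed by `_decumulate_histogram_buckets` in the class source above can be hard to follow in flattened form. The following is a minimal standalone sketch of the same idea, not the mixin's implementation: the `decumulate` function and its plain-dict input are hypothetical, and it handles a single label context only.

```python
import math


def decumulate(cumulative_by_upper_bound):
    """Turn cumulative bucket counts into (lower_bound, upper_bound, count) tuples.

    Hypothetical helper: the input maps each `le` upper bound to its
    cumulative count for one label context.
    """
    buckets = []
    previous_upper = None
    previous_count = 0
    for upper in sorted(cumulative_by_upper_bound):
        count = cumulative_by_upper_bound[upper]
        if previous_upper is None:
            # First bucket: positive buckets start at zero, others at -inf.
            lower = 0 if upper > 0 else -math.inf
            buckets.append((lower, upper, count))
        else:
            # Later buckets keep only what the previous bucket did not count.
            buckets.append((previous_upper, upper, count - previous_count))
        previous_upper, previous_count = upper, count
    return buckets


example = decumulate({0.5: 2, 1.0: 5, math.inf: 7})
```

Each tuple's lower bound is what the mixin stores in the `lower_bound` label, while `le` stays the upper bound.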
Parses a valid requests.Response object and yields MetricFamily objects. The text format is read through the response's iter_lines() generator.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/mixins.py
def parse_metric_family(self, response, scraper_config):\n \"\"\"\n Parse the MetricFamily from a valid `requests.Response` object to provide a MetricFamily object.\n The text format uses iter_lines() generator.\n \"\"\"\n if response.encoding is None:\n response.encoding = 'utf-8'\n input_gen = response.iter_lines(decode_unicode=True)\n if scraper_config['_text_filter_blacklist']:\n input_gen = self._text_filter_input(input_gen, scraper_config)\n\n for metric in text_fd_to_metric_families(input_gen):\n self._send_telemetry_counter(\n self.TELEMETRY_COUNTER_METRICS_INPUT_COUNT, len(metric.samples), scraper_config\n )\n type_override = scraper_config['type_overrides'].get(metric.name)\n if type_override:\n metric.type = type_override\n elif scraper_config['_type_override_patterns']:\n for pattern, new_type in iteritems(scraper_config['_type_override_patterns']):\n if pattern.search(metric.name):\n metric.type = new_type\n break\n if metric.type not in self.METRIC_TYPES:\n continue\n metric.name = self._remove_metric_prefix(metric.name, scraper_config)\n yield metric\n
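The type override step in `parse_metric_family` can be sketched in isolation. This is an illustrative approximation only: `resolve_type`, `is_kept`, and the contents of `SUPPORTED_TYPES` are assumptions standing in for the mixin's own logic and its `METRIC_TYPES` attribute.

```python
import re

# Assumed stand-in for the mixin's METRIC_TYPES attribute.
SUPPORTED_TYPES = ("counter", "gauge", "histogram", "summary")


def resolve_type(name, declared_type, type_overrides, override_patterns):
    """Return the effective metric type: exact override, then first regex match."""
    exact = type_overrides.get(name)
    if exact:
        return exact
    for pattern, new_type in override_patterns.items():
        if pattern.search(name):
            return new_type
    return declared_type


def is_kept(metric_type):
    """Metric families with an unsupported type are silently dropped."""
    return metric_type in SUPPORTED_TYPES


patterns = {re.compile(r"_total$"): "counter"}
exact_hit = resolve_type("queue_depth", "untyped", {"queue_depth": "gauge"}, patterns)
pattern_hit = resolve_type("jobs_total", "untyped", {}, patterns)
unchanged = resolve_type("other_metric", "untyped", {}, patterns)
```

An exact name in `type_overrides` always wins over a pattern, and a family whose final type is unsupported never reaches the caller.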
Poll the data from Prometheus and return the metrics as a generator.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/mixins.py
def scrape_metrics(self, scraper_config):\n \"\"\"\n Poll the data from Prometheus and return the metrics as a generator.\n \"\"\"\n response = self.poll(scraper_config)\n if scraper_config['telemetry']:\n if 'content-length' in response.headers:\n content_len = int(response.headers['content-length'])\n else:\n content_len = len(response.content)\n self._send_telemetry_gauge(self.TELEMETRY_GAUGE_MESSAGE_SIZE, content_len, scraper_config)\n try:\n # no dry run if no label joins\n if not scraper_config['label_joins']:\n scraper_config['_dry_run'] = False\n elif not scraper_config['_watched_labels']:\n watched = scraper_config['_watched_labels']\n watched['sets'] = {}\n watched['keys'] = {}\n watched['singles'] = set()\n for key, val in iteritems(scraper_config['label_joins']):\n labels = []\n if 'labels_to_match' in val:\n labels = val['labels_to_match']\n elif 'label_to_match' in val:\n self.log.warning(\"`label_to_match` is being deprecated, please use `labels_to_match`\")\n if isinstance(val['label_to_match'], list):\n labels = val['label_to_match']\n else:\n labels = [val['label_to_match']]\n\n if labels:\n s = frozenset(labels)\n watched['sets'][key] = s\n watched['keys'][key] = ','.join(s)\n if len(labels) == 1:\n watched['singles'].add(labels[0])\n\n for metric in self.parse_metric_family(response, scraper_config):\n yield metric\n\n # Set dry run off\n scraper_config['_dry_run'] = False\n # Garbage collect unused mapping and reset active labels\n for metric, mapping in list(iteritems(scraper_config['_label_mapping'])):\n for key in list(mapping):\n if (\n metric in scraper_config['_active_label_mapping']\n and key not in scraper_config['_active_label_mapping'][metric]\n ):\n del scraper_config['_label_mapping'][metric][key]\n scraper_config['_active_label_mapping'] = {}\n finally:\n response.close()\n
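The first scrape builds a lookup of watched labels from `label_joins`. A rough standalone sketch of that step follows; `build_watched_labels` and the example metric names are hypothetical, and the deprecation warning is omitted.

```python
def build_watched_labels(label_joins):
    """Collect the label sets, joined keys, and single labels to watch."""
    watched = {"sets": {}, "keys": {}, "singles": set()}
    for key, val in label_joins.items():
        if "labels_to_match" in val:
            labels = val["labels_to_match"]
        elif "label_to_match" in val:
            # Deprecated spelling: accepts either one label or a list of labels.
            legacy = val["label_to_match"]
            labels = legacy if isinstance(legacy, list) else [legacy]
        else:
            labels = []

        if labels:
            label_set = frozenset(labels)
            watched["sets"][key] = label_set
            watched["keys"][key] = ",".join(label_set)
            if len(labels) == 1:
                watched["singles"].add(labels[0])
    return watched


watched = build_watched_labels(
    {
        "example_pod_info": {"labels_to_match": ["pod"]},
        "example_deployment_labels": {"label_to_match": "deployment"},
    }
)
```

Single-label joins are tracked separately in `singles` so they can be matched with a set intersection rather than a subset test per sample.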
Polls the data from Prometheus and submits them as Datadog metrics. The endpoint polled is the one set in scraper_config['prometheus_url'].
Note that if the instance has a tags attribute, its values are automatically added to the metrics as additional custom tags.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/mixins.py
def process(self, scraper_config, metric_transformers=None):\n \"\"\"\n Polls the data from Prometheus and submits them as Datadog metrics.\n `endpoint` is the metrics endpoint to use to poll metrics from Prometheus\n\n Note that if the instance has a `tags` attribute, it will be pushed\n automatically as additional custom tags and added to the metrics\n \"\"\"\n\n transformers = scraper_config['_default_metric_transformers'].copy()\n if metric_transformers:\n transformers.update(metric_transformers)\n\n counter_buffer = []\n agent_start_time = None\n process_start_time = None\n if not scraper_config['_flush_first_value'] and scraper_config['use_process_start_time']:\n agent_start_time = datadog_agent.get_process_start_time()\n\n if scraper_config['bearer_token_auth']:\n self._refresh_bearer_token(scraper_config)\n\n for metric in self.scrape_metrics(scraper_config):\n if agent_start_time is not None:\n if metric.name == 'process_start_time_seconds' and metric.samples:\n min_metric_value = min(s[self.SAMPLE_VALUE] for s in metric.samples)\n if process_start_time is None or min_metric_value < process_start_time:\n process_start_time = min_metric_value\n if metric.type in self.METRICS_WITH_COUNTERS:\n counter_buffer.append(metric)\n continue\n\n self.process_metric(metric, scraper_config, metric_transformers=transformers)\n\n if agent_start_time and process_start_time and agent_start_time < process_start_time:\n # If agent was started before the process, we assume counters were started recently from zero,\n # and thus we can compute the rates.\n scraper_config['_flush_first_value'] = True\n\n for metric in counter_buffer:\n self.process_metric(metric, scraper_config, metric_transformers=transformers)\n\n scraper_config['_flush_first_value'] = True\n
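The decision to flush the first counter value depends on comparing the Agent's start time with the scraped `process_start_time_seconds`. A minimal sketch of that comparison, assuming a hypothetical `should_flush_first_value` helper that receives the raw sample values:

```python
def should_flush_first_value(agent_start_time, process_start_times):
    """Return True when the Agent started before the monitored process.

    `agent_start_time` is an epoch timestamp or None, and
    `process_start_times` holds the `process_start_time_seconds` sample values.
    """
    if not agent_start_time or not process_start_times:
        return False
    # The earliest reported start time is the one that matters.
    process_start_time = min(process_start_times)
    return bool(process_start_time) and agent_start_time < process_start_time


agent_first = should_flush_first_value(100.0, [150.0, 120.0])
process_first = should_flush_first_value(100.0, [90.0, 150.0])
no_agent_time = should_flush_first_value(None, [150.0])
```

When the Agent is older than the process, counters are assumed to have started from zero recently, so their first value can be submitted instead of being held back.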
Returns a valid requests.Response; otherwise raises requests.HTTPError if the status code of the response isn't valid (see response.raise_for_status()).
The caller needs to close the requests.Response.
Custom headers can be added to the default headers.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/mixins.py
def poll(self, scraper_config, headers=None):\n \"\"\"\n Returns a valid `requests.Response`, otherwise raise requests.HTTPError if the status code of the\n response isn't valid - see `response.raise_for_status()`\n\n The caller needs to close the requests.Response.\n\n Custom headers can be added to the default headers.\n \"\"\"\n endpoint = scraper_config.get('prometheus_url')\n\n # Should we send a service check for when we make a request\n health_service_check = scraper_config['health_service_check']\n service_check_name = self._metric_name_with_namespace('prometheus.health', scraper_config)\n service_check_tags = ['endpoint:{}'.format(endpoint)]\n service_check_tags.extend(scraper_config['custom_tags'])\n\n try:\n response = self.send_request(endpoint, scraper_config, headers)\n except requests.exceptions.SSLError:\n self.log.error(\"Invalid SSL settings for requesting %s endpoint\", endpoint)\n raise\n except IOError:\n if health_service_check:\n self.service_check(service_check_name, AgentCheck.CRITICAL, tags=service_check_tags)\n raise\n try:\n response.raise_for_status()\n if health_service_check:\n self.service_check(service_check_name, AgentCheck.OK, tags=service_check_tags)\n return response\n except requests.HTTPError:\n response.close()\n if health_service_check:\n self.service_check(service_check_name, AgentCheck.CRITICAL, tags=service_check_tags)\n raise\n
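Because the caller owns the response returned by `poll`, it helps to close it in a way that survives exceptions. The sketch below shows one possible pattern using `contextlib.closing`; `FakeResponse` and `fake_poll` are hypothetical stand-ins so the example runs without a network call or the mixin.

```python
from contextlib import closing


class FakeResponse:
    """Hypothetical stand-in for requests.Response, only for this sketch."""

    def __init__(self, lines):
        self._lines = lines
        self.closed = False

    def iter_lines(self):
        return iter(self._lines)

    def close(self):
        self.closed = True


def fake_poll():
    # Stands in for `self.poll(scraper_config)` on a real check.
    return FakeResponse(["metric_a 1", "metric_b 2"])


response = fake_poll()
with closing(response):
    # The response is closed on exit, even if iterating raises.
    lines = list(response.iter_lines())
```

`scrape_metrics` achieves the same effect with a `try`/`finally` block around its generator body.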
Reports each sample in the metric as a gauge with all labels as tags, unless a labels dict is passed, in which case its keys are the label names to extract and its values are the tag names to use (e.g. {'node': 'node'}).
Histograms generate a set of values instead of a unique metric. send_histograms_buckets is used to specify if you want to send the buckets as tagged values when dealing with histograms.
custom_tags is an array of tag:value that will be added to the metric when sending the gauge to Datadog.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/mixins.py
def submit_openmetric(self, metric_name, metric, scraper_config, hostname=None):\n \"\"\"\n For each sample in the metric, report it as a gauge with all labels as tags\n except if a labels `dict` is passed, in which case keys are label names we'll extract\n and corresponding values are tag names we'll use (eg: {'node': 'node'}).\n\n Histograms generate a set of values instead of a unique metric.\n `send_histograms_buckets` is used to specify if you want to\n send the buckets as tagged values when dealing with histograms.\n\n `custom_tags` is an array of `tag:value` that will be added to the\n metric when sending the gauge to Datadog.\n \"\"\"\n if metric.type in [\"gauge\", \"counter\", \"rate\"]:\n metric_name_with_namespace = self._metric_name_with_namespace(metric_name, scraper_config)\n for sample in metric.samples:\n if self._ignore_metrics_by_label(scraper_config, metric_name, sample):\n continue\n\n val = sample[self.SAMPLE_VALUE]\n if not self._is_value_valid(val):\n self.log.debug(\"Metric value is not supported for metric %s\", sample[self.SAMPLE_NAME])\n continue\n custom_hostname = self._get_hostname(hostname, sample, scraper_config)\n # Determine the tags to send\n tags = self._metric_tags(metric_name, val, sample, scraper_config, hostname=custom_hostname)\n if metric.type == \"counter\" and scraper_config['send_monotonic_counter']:\n self.monotonic_count(\n metric_name_with_namespace,\n val,\n tags=tags,\n hostname=custom_hostname,\n flush_first_value=scraper_config['_flush_first_value'],\n )\n elif metric.type == \"rate\":\n self.rate(metric_name_with_namespace, val, tags=tags, hostname=custom_hostname)\n else:\n self.gauge(metric_name_with_namespace, val, tags=tags, hostname=custom_hostname)\n\n # Metric is a \"counter\" but legacy behavior has \"send_as_monotonic\" defaulted to False\n # Submit metric as monotonic_count with appended name\n if metric.type == \"counter\" and scraper_config['send_monotonic_with_gauge']:\n self.monotonic_count(\n 
metric_name_with_namespace + '.total',\n val,\n tags=tags,\n hostname=custom_hostname,\n flush_first_value=scraper_config['_flush_first_value'],\n )\n elif metric.type == \"histogram\":\n self._submit_gauges_from_histogram(metric_name, metric, scraper_config)\n elif metric.type == \"summary\":\n self._submit_gauges_from_summary(metric_name, metric, scraper_config)\n else:\n self.log.error(\"Metric type %s unsupported for metric %s.\", metric.type, metric_name)\n
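How a gauge, counter, or rate sample is submitted depends on two settings. The sketch below lists the submissions for one sample; `plan_submissions` is a hypothetical helper, and it assumes the extra `.total` count is sent only on the gauge path, which the flattened indentation here does not confirm.

```python
def plan_submissions(metric_type, name, send_monotonic_counter, send_monotonic_with_gauge):
    """Return the (submission method, metric name) pairs for one sample."""
    if metric_type == "counter" and send_monotonic_counter:
        return [("monotonic_count", name)]
    if metric_type == "rate":
        return [("rate", name)]

    plan = [("gauge", name)]
    # Assumption: the legacy `.total` count accompanies the gauge only.
    if metric_type == "counter" and send_monotonic_with_gauge:
        plan.append(("monotonic_count", name + ".total"))
    return plan


monotonic = plan_submissions("counter", "ns.requests", True, False)
legacy = plan_submissions("counter", "ns.requests", False, True)
plain_gauge = plan_submissions("gauge", "ns.temperature", True, True)
```

Under this reading, a counter is never sent as both a monotonic count and a gauge under the same name.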
Handle a Prometheus metric according to the following flow:
- search scraper_config['metrics_mapper'] for a prometheus.metric to datadog.metric mapping
- call check method with the same name as the metric
- log info if none of the above worked
metric_transformers is a dict of <metric name>:<function to run when the metric name is encountered>
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/mixins.py
def process_metric(self, metric, scraper_config, metric_transformers=None):\n \"\"\"\n Handle a Prometheus metric according to the following flow:\n - search `scraper_config['metrics_mapper']` for a prometheus.metric to datadog.metric mapping\n - call check method with the same name as the metric\n - log info if none of the above worked\n\n `metric_transformers` is a dict of `<metric name>:<function to run when the metric name is encountered>`\n \"\"\"\n # If targeted metric, store labels\n self._store_labels(metric, scraper_config)\n\n if scraper_config['ignore_metrics']:\n if metric.name in scraper_config['_ignored_metrics']:\n self._send_telemetry_counter(\n self.TELEMETRY_COUNTER_METRICS_IGNORE_COUNT, len(metric.samples), scraper_config\n )\n return # Ignore the metric\n\n if scraper_config['_ignored_re'] and scraper_config['_ignored_re'].search(metric.name):\n # Metric must be ignored\n scraper_config['_ignored_metrics'].add(metric.name)\n self._send_telemetry_counter(\n self.TELEMETRY_COUNTER_METRICS_IGNORE_COUNT, len(metric.samples), scraper_config\n )\n return # Ignore the metric\n\n self._send_telemetry_counter(self.TELEMETRY_COUNTER_METRICS_PROCESS_COUNT, len(metric.samples), scraper_config)\n\n if self._filter_metric(metric, scraper_config):\n return # Ignore the metric\n\n # Filter metric to see if we can enrich with joined labels\n self._join_labels(metric, scraper_config)\n\n if scraper_config['_dry_run']:\n return\n\n try:\n self.submit_openmetric(scraper_config['metrics_mapper'][metric.name], metric, scraper_config)\n except KeyError:\n if metric_transformers is not None and metric.name in metric_transformers:\n try:\n # Get the transformer function for this specific metric\n transformer = metric_transformers[metric.name]\n transformer(metric, scraper_config)\n except Exception as err:\n self.log.warning('Error handling metric: %s - error: %s', metric.name, err)\n\n return\n # check for wildcards in transformers\n for transformer_name, transformer 
in iteritems(metric_transformers):\n if transformer_name.endswith('*') and metric.name.startswith(transformer_name[:-1]):\n transformer(metric, scraper_config, transformer_name)\n\n # try matching wildcards\n if scraper_config['_wildcards_re'] and scraper_config['_wildcards_re'].search(metric.name):\n self.submit_openmetric(metric.name, metric, scraper_config)\n return\n\n self.log.debug(\n 'Skipping metric `%s` as it is not defined in the metrics mapper, '\n 'has no transformer function, nor does it match any wildcards.',\n metric.name,\n )\n
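The wildcard fallback at the end of process_metric depends on the `_wildcards_re` pattern that create_scraper_configuration builds from the metrics mapper. The following is a minimal sketch of that idea, assuming `translate` and `compile` refer to `fnmatch.translate` and `re.compile`; the mapper contents and the helper name `build_wildcards_re` are hypothetical, and the patterns are sorted here only to keep the result deterministic.

```python
import re
from fnmatch import translate

# Hypothetical mapper: one exact metric name and two wildcard entries.
metrics_mapper = {
    'go_goroutines': 'goroutines',
    'http_*': 'http_*',
    'process_cpu_*': 'process_cpu_*',
}


def build_wildcards_re(mapper):
    # Turn every shell-style wildcard key into a regex and OR them together.
    patterns = sorted(translate(name) for name in mapper if '*' in name)
    return re.compile('|'.join(patterns)) if patterns else None


wildcards_re = build_wildcards_re(metrics_mapper)

# Names that are not exact keys are checked against the combined pattern.
matches_wildcard = wildcards_re.search('http_requests_total') is not None
```

Because the pattern is applied with `search`, an exact mapper entry is still looked up first and the wildcard pattern only acts as a fallback.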
If instance does not specify a value for a configuration option, the value will default to the init_config. Otherwise, the default_instance value will be used.
A default mixin configuration will be returned if there is no instance.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/mixins.py
def create_scraper_configuration(self, instance=None):\n \"\"\"\n Creates a scraper configuration.\n\n If instance does not specify a value for a configuration option, the value will default to the `init_config`.\n Otherwise, the `default_instance` value will be used.\n\n A default mixin configuration will be returned if there is no instance.\n \"\"\"\n if 'openmetrics_endpoint' in instance:\n raise CheckException('The setting `openmetrics_endpoint` is only available for Agent version 7 or later')\n\n # We can choose to create a default mixin configuration for an empty instance\n if instance is None:\n instance = {}\n\n # Supports new configuration options\n config = copy.deepcopy(instance)\n\n # Set the endpoint\n endpoint = instance.get('prometheus_url')\n if instance and endpoint is None:\n raise CheckException(\"You have to define a prometheus_url for each prometheus instance\")\n\n # Set the bearer token authorization to customer value, then get the bearer token\n self.update_prometheus_url(instance, config, endpoint)\n\n # `NAMESPACE` is the prefix metrics will have. 
Need to be hardcoded in the\n # child check class.\n namespace = instance.get('namespace')\n # Check if we have a namespace\n if instance and namespace is None:\n if self.default_namespace is None:\n raise CheckException(\"You have to define a namespace for each prometheus check\")\n namespace = self.default_namespace\n\n config['namespace'] = namespace\n\n # Retrieve potential default instance settings for the namespace\n default_instance = self.default_instances.get(namespace, {})\n\n def _get_setting(name, default):\n return instance.get(name, default_instance.get(name, default))\n\n # `metrics_mapper` is a dictionary where the keys are the metrics to capture\n # and the values are the corresponding metrics names to have in datadog.\n # Note: it is empty in the parent class but will need to be\n # overloaded/hardcoded in the final check not to be counted as custom metric.\n\n # Metrics are preprocessed if no mapping\n metrics_mapper = {}\n # We merge list and dictionaries from optional defaults & instance settings\n metrics = default_instance.get('metrics', []) + instance.get('metrics', [])\n for metric in metrics:\n if isinstance(metric, string_types):\n metrics_mapper[metric] = metric\n else:\n metrics_mapper.update(metric)\n\n config['metrics_mapper'] = metrics_mapper\n\n # `_wildcards_re` is a Pattern object used to match metric wildcards\n config['_wildcards_re'] = None\n\n wildcards = set()\n for metric in config['metrics_mapper']:\n if \"*\" in metric:\n wildcards.add(translate(metric))\n\n if wildcards:\n config['_wildcards_re'] = compile('|'.join(wildcards))\n\n # `prometheus_metrics_prefix` allows to specify a prefix that all\n # prometheus metrics should have. 
This can be used when the prometheus\n # endpoint we are scrapping allows to add a custom prefix to it's\n # metrics.\n config['prometheus_metrics_prefix'] = instance.get(\n 'prometheus_metrics_prefix', default_instance.get('prometheus_metrics_prefix', '')\n )\n\n # `label_joins` holds the configuration for extracting 1:1 labels from\n # a target metric to all metric matching the label, example:\n # self.label_joins = {\n # 'kube_pod_info': {\n # 'labels_to_match': ['pod'],\n # 'labels_to_get': ['node', 'host_ip']\n # }\n # }\n config['label_joins'] = default_instance.get('label_joins', {})\n config['label_joins'].update(instance.get('label_joins', {}))\n\n # `_label_mapping` holds the additionals label info to add for a specific\n # label value, example:\n # self._label_mapping = {\n # 'pod': {\n # 'dd-agent-9s1l1': {\n # \"node\": \"yolo\",\n # \"host_ip\": \"yey\"\n # }\n # }\n # }\n config['_label_mapping'] = {}\n\n # `_active_label_mapping` holds a dictionary of label values found during the run\n # to cleanup the label_mapping of unused values, example:\n # self._active_label_mapping = {\n # 'pod': {\n # 'dd-agent-9s1l1': True\n # }\n # }\n config['_active_label_mapping'] = {}\n\n # `_watched_labels` holds the sets of labels to watch for enrichment\n config['_watched_labels'] = {}\n\n config['_dry_run'] = True\n\n # Some metrics are ignored because they are duplicates or introduce a\n # very high cardinality. 
Metrics included in this list will be silently\n # skipped without a 'Unable to handle metric' debug line in the logs\n config['ignore_metrics'] = instance.get('ignore_metrics', default_instance.get('ignore_metrics', []))\n config['_ignored_metrics'] = set()\n\n # `_ignored_re` is a Pattern object used to match ignored metric patterns\n config['_ignored_re'] = None\n ignored_patterns = set()\n\n # Separate ignored metric names and ignored patterns in different sets for faster lookup later\n for metric in config['ignore_metrics']:\n if '*' in metric:\n ignored_patterns.add(translate(metric))\n else:\n config['_ignored_metrics'].add(metric)\n\n if ignored_patterns:\n config['_ignored_re'] = compile('|'.join(ignored_patterns))\n\n # Ignore metrics based on label keys or specific label values\n config['ignore_metrics_by_labels'] = instance.get(\n 'ignore_metrics_by_labels', default_instance.get('ignore_metrics_by_labels', {})\n )\n\n # If you want to send the buckets as tagged values when dealing with histograms,\n # set send_histograms_buckets to True, set to False otherwise.\n config['send_histograms_buckets'] = is_affirmative(\n instance.get('send_histograms_buckets', default_instance.get('send_histograms_buckets', True))\n )\n\n # If you want the bucket to be non cumulative and to come with upper/lower bound tags\n # set non_cumulative_buckets to True, enabled when distribution metrics are enabled.\n config['non_cumulative_buckets'] = is_affirmative(\n instance.get('non_cumulative_buckets', default_instance.get('non_cumulative_buckets', False))\n )\n\n # Send histograms as datadog distribution metrics\n config['send_distribution_buckets'] = is_affirmative(\n instance.get('send_distribution_buckets', default_instance.get('send_distribution_buckets', False))\n )\n\n # Non cumulative buckets are mandatory for distribution metrics\n if config['send_distribution_buckets'] is True:\n config['non_cumulative_buckets'] = True\n\n # If you want to send `counter` metrics as 
monotonic counts, set this value to True.\n # Set to False if you want to instead send those metrics as `gauge`.\n config['send_monotonic_counter'] = is_affirmative(\n instance.get('send_monotonic_counter', default_instance.get('send_monotonic_counter', True))\n )\n\n # If you want `counter` metrics to be submitted as both gauges and monotonic counts. Set this value to True.\n config['send_monotonic_with_gauge'] = is_affirmative(\n instance.get('send_monotonic_with_gauge', default_instance.get('send_monotonic_with_gauge', False))\n )\n\n config['send_distribution_counts_as_monotonic'] = is_affirmative(\n instance.get(\n 'send_distribution_counts_as_monotonic',\n default_instance.get('send_distribution_counts_as_monotonic', False),\n )\n )\n\n config['send_distribution_sums_as_monotonic'] = is_affirmative(\n instance.get(\n 'send_distribution_sums_as_monotonic',\n default_instance.get('send_distribution_sums_as_monotonic', False),\n )\n )\n\n # If the `labels_mapper` dictionary is provided, the metrics labels names\n # in the `labels_mapper` will use the corresponding value as tag name\n # when sending the gauges.\n config['labels_mapper'] = default_instance.get('labels_mapper', {})\n config['labels_mapper'].update(instance.get('labels_mapper', {}))\n # Rename bucket \"le\" label to \"upper_bound\"\n config['labels_mapper']['le'] = 'upper_bound'\n\n # `exclude_labels` is an array of label names to exclude. Those labels\n # will just not be added as tags when submitting the metric.\n config['exclude_labels'] = default_instance.get('exclude_labels', []) + instance.get('exclude_labels', [])\n\n # `include_labels` is an array of label names to include. 
If these labels are not in\n # the `exclude_labels` list, then they are added as tags when submitting the metric.\n config['include_labels'] = default_instance.get('include_labels', []) + instance.get('include_labels', [])\n\n # `type_overrides` is a dictionary where the keys are prometheus metric names\n # and the values are a metric type (name as string) to use instead of the one\n # listed in the payload. It can be used to force a type on untyped metrics.\n # Note: it is empty in the parent class but will need to be\n # overloaded/hardcoded in the final check not to be counted as custom metric.\n config['type_overrides'] = default_instance.get('type_overrides', {})\n config['type_overrides'].update(instance.get('type_overrides', {}))\n\n # `_type_override_patterns` is a dictionary where we store Pattern objects\n # that match metric names as keys, and their corresponding metric type overrides as values.\n config['_type_override_patterns'] = {}\n\n with_wildcards = set()\n for metric, type in iteritems(config['type_overrides']):\n if '*' in metric:\n config['_type_override_patterns'][compile(translate(metric))] = type\n with_wildcards.add(metric)\n\n # cleanup metric names with wildcards from the 'type_overrides' dict\n for metric in with_wildcards:\n del config['type_overrides'][metric]\n\n # Some metrics are retrieved from different hosts and often\n # a label can hold this information, this transfers it to the hostname\n config['label_to_hostname'] = instance.get('label_to_hostname', default_instance.get('label_to_hostname', None))\n\n # In combination to label_as_hostname, allows to add a common suffix to the hostnames\n # submitted. 
This can be used for instance to discriminate hosts between clusters.\n config['label_to_hostname_suffix'] = instance.get(\n 'label_to_hostname_suffix', default_instance.get('label_to_hostname_suffix', None)\n )\n\n # Add a 'health' service check for the prometheus endpoint\n config['health_service_check'] = is_affirmative(\n instance.get('health_service_check', default_instance.get('health_service_check', True))\n )\n\n # Can either be only the path to the certificate and thus you should specify the private key\n # or it can be the path to a file containing both the certificate & the private key\n config['ssl_cert'] = instance.get('ssl_cert', default_instance.get('ssl_cert', None))\n\n # Needed if the certificate does not include the private key\n #\n # /!\\ The private key to your local certificate must be unencrypted.\n # Currently, Requests does not support using encrypted keys.\n config['ssl_private_key'] = instance.get('ssl_private_key', default_instance.get('ssl_private_key', None))\n\n # The path to the trusted CA used for generating custom certificates\n config['ssl_ca_cert'] = instance.get('ssl_ca_cert', default_instance.get('ssl_ca_cert', None))\n\n # Whether or not to validate SSL certificates\n config['ssl_verify'] = is_affirmative(instance.get('ssl_verify', default_instance.get('ssl_verify', True)))\n\n # Extra http headers to be sent when polling endpoint\n config['extra_headers'] = default_instance.get('extra_headers', {})\n config['extra_headers'].update(instance.get('extra_headers', {}))\n\n # Timeout used during the network request\n config['prometheus_timeout'] = instance.get(\n 'prometheus_timeout', default_instance.get('prometheus_timeout', 10)\n )\n\n # Authentication used when polling endpoint\n config['username'] = instance.get('username', default_instance.get('username', None))\n config['password'] = instance.get('password', default_instance.get('password', None))\n\n # Custom tags that will be sent with each metric\n config['custom_tags'] 
= instance.get('tags', [])\n\n # Some tags can be ignored to reduce the cardinality.\n # This can be useful for cost optimization in containerized environments\n # when the openmetrics check is configured to collect custom metrics.\n # Even when the Agent's Tagger is configured to add low-cardinality tags only,\n # some tags can still generate unwanted metric contexts (e.g pod annotations as tags).\n ignore_tags = instance.get('ignore_tags', default_instance.get('ignore_tags', []))\n if ignore_tags:\n ignored_tags_re = compile('|'.join(set(ignore_tags)))\n config['custom_tags'] = [tag for tag in config['custom_tags'] if not ignored_tags_re.search(tag)]\n\n # Additional tags to be sent with each metric\n config['_metric_tags'] = []\n\n # List of strings to filter the input text payload on. If any line contains\n # one of these strings, it will be filtered out before being parsed.\n # INTERNAL FEATURE, might be removed in future versions\n config['_text_filter_blacklist'] = []\n\n # Refresh the bearer token every 60 seconds by default.\n # Ref https://github.com/DataDog/datadog-agent/pull/11686\n config['bearer_token_refresh_interval'] = instance.get(\n 'bearer_token_refresh_interval', default_instance.get('bearer_token_refresh_interval', 60)\n )\n\n config['telemetry'] = is_affirmative(instance.get('telemetry', default_instance.get('telemetry', False)))\n\n # The metric name services use to indicate build information\n config['metadata_metric_name'] = instance.get(\n 'metadata_metric_name', default_instance.get('metadata_metric_name')\n )\n\n # Map of metadata key names to label names\n config['metadata_label_map'] = instance.get(\n 'metadata_label_map', default_instance.get('metadata_label_map', {})\n )\n\n config['_default_metric_transformers'] = {}\n if config['metadata_metric_name'] and config['metadata_label_map']:\n config['_default_metric_transformers'][config['metadata_metric_name']] = self.transform_metadata\n\n # Whether or not to enable flushing of the 
first value of monotonic counts\n config['_flush_first_value'] = False\n\n # Whether to use process_start_time_seconds to decide if counter-like values should be flushed\n # on first scrape.\n config['use_process_start_time'] = is_affirmative(_get_setting('use_process_start_time', False))\n\n return config\n
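Most options in the source above are resolved with the same lookup chain: the instance value wins, then the value from `default_instance`, then a hard-coded default. A minimal standalone sketch of that chain follows; the function is modeled on the nested `_get_setting` helper, and the sample settings are hypothetical.

```python
def get_setting(instance, default_instance, name, default=None):
    # The instance wins, then the namespace defaults, then the hard-coded default.
    return instance.get(name, default_instance.get(name, default))


# Hypothetical settings used only for illustration.
default_instance = {'prometheus_timeout': 20, 'send_monotonic_counter': False}
instance = {'prometheus_url': 'http://localhost:9090/metrics', 'prometheus_timeout': 5}

timeout = get_setting(instance, default_instance, 'prometheus_timeout', 10)
monotonic = get_setting(instance, default_instance, 'send_monotonic_counter', True)
ssl_verify = get_setting(instance, default_instance, 'ssl_verify', True)
```

Here `timeout` comes from the instance, `monotonic` from the default instance, and `ssl_verify` from the hard-coded default.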
Some options can be set globally in init_config (with instances taking precedence). For complete documentation of every option, see the associated configuration templates for the instances and init_config sections.
"},{"location":"legacy/prometheus/#config-changes-between-versions","title":"Config changes between versions","text":"
There are config option changes between OpenMetrics V1 and V2, so check if any updated OpenMetrics instances use deprecated options and update accordingly.
Note: The type_overrides option is incorporated in the metrics option. This metrics option defines the list of metrics to collect from the openmetrics_endpoint, and it can be used to remap the names and types of exposed metrics as well as to match exposed metrics with regular expressions.
share_labels is used to join labels with a 1:1 mapping and can take other parameters for sharing. More information can be found in the conf.yaml.example.
All HTTP options are also supported.
Source code in datadog_checks_base/datadog_checks/base/checks/openmetrics/base_check.py
class StandardFields(object):\n pass\n
"},{"location":"legacy/prometheus/#prometheus-to-datadog-metric-types","title":"Prometheus to Datadog metric types","text":"
The OpenMetrics Base Check supports various configurations for submitting Prometheus metrics to Datadog. We currently support Prometheus gauge, counter, histogram, and summary metric types.
A Prometheus counter is a cumulative metric that represents a single monotonically increasing counter whose value can only increase or be reset to zero on restart.
| Config Option | Value | Datadog Metric Submitted |
| send_monotonic_counter | true (default) | monotonic_count |
| send_monotonic_counter | false | gauge |"},{"location":"legacy/prometheus/#histogram","title":"Histogram","text":"
A Prometheus histogram samples observations and counts them in configurable buckets along with a sum of all observed values.
Histogram metrics ending in:
_sum represent the total sum of all observed values. Sums generally behave like counters, but a negative observation is possible, in which case the sum would not behave like a typical always-increasing counter.
_count represent the total number of events that have been observed.
_bucket represent the cumulative counters for the observation buckets. Note that buckets are only submitted if send_histograms_buckets is enabled.
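Since the _bucket samples are cumulative, each bucket also counts every observation from the smaller buckets. The sketch below shows one way to turn cumulative counts into per-bucket counts with lower and upper bounds. It is an illustration of the arithmetic only, not the check's implementation; the function name and the choice of negative infinity as the first lower bound are assumptions.

```python
def decumulate(buckets):
    # buckets: list of (upper_bound, cumulative_count) pairs sorted by upper bound.
    result = []
    previous_bound, previous_count = float('-inf'), 0
    for upper_bound, cumulative_count in buckets:
        # The per-bucket count is the difference with the previous cumulative count.
        result.append((previous_bound, upper_bound, cumulative_count - previous_count))
        previous_bound, previous_count = upper_bound, cumulative_count
    return result


# Hypothetical histogram: 3 observations <= 0.1, 7 <= 0.5 and 10 in total.
cumulative = [(0.1, 3), (0.5, 7), (float('inf'), 10)]
per_bucket = decumulate(cumulative)
counts = [count for _, _, count in per_bucket]
```

The per-bucket counts always add up to the last cumulative count, which is also the value reported by the _count sample.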
| Subtype | Config Option | Value | Datadog Metric Submitted |
| | send_distribution_buckets | true | The entire histogram can be submitted as a single distribution metric. If the option is enabled, none of the subtype metrics will be submitted. |
| _sum | send_distribution_sums_as_monotonic | false (default) | gauge |
| _sum | send_distribution_sums_as_monotonic | true | monotonic_count |
| _count | send_distribution_counts_as_monotonic | false (default) | gauge |
| _count | send_distribution_counts_as_monotonic | true | monotonic_count |
| _bucket | non_cumulative_buckets | false (default) | gauge |
| _bucket | non_cumulative_buckets | true | monotonic_count under .count metric name if send_distribution_counts_as_monotonic is enabled. Otherwise, gauge. |"},{"location":"legacy/prometheus/#summary","title":"Summary","text":"
Prometheus summary metrics are similar to histograms but allow configurable quantiles.
Summary metrics ending in:
_sum represent the total sum of all observed values. Sums generally behave like counters, but a negative observation is possible, in which case the sum would not behave like a typical always-increasing counter.
_count represent the total number of events that have been observed.
metrics with labels like {quantile=\"<\u03c6>\"} represent the streaming quantiles of observed events.
The default values for optional settings are populated in defaults.py and are derived from the value property of config spec options. The precedence is the default key followed by the example key (if it appears to represent a real value rather than an illustrative example and the type is a primitive). In all other cases, the default is None, which means there is no default getter function.
If such a validator exists in validators.py, then it is called once with the raw config that was supplied by the user. The returned mapping is used as the input config for the subsequent stages.
The value of each field goes through the following steps.
"},{"location":"meta/config-models/#default-value-population","title":"Default value population","text":"
If a field was not supplied by the user nor during the initialization stage, then its default value is taken from defaults.py. This stage is skipped for required fields.
"},{"location":"meta/config-models/#custom-field-validators","title":"Custom field validators","text":"
The contents of validators.py are entirely custom and contain functions to perform extra validation if necessary.
Such validators are called for the appropriate field of the proper model. The returned value is used as the new value of the option for the subsequent stages.
Note
This only occurs if the option was supplied by the user.
"},{"location":"meta/config-models/#pre-defined-field-validators","title":"Pre-defined field validators","text":"
A validators key under the value property of config spec options is considered. Every entry refers to a relative import path to a field validator under datadog_checks.base.utils.models.validation and is executed in the defined order.
Note
This only occurs if the option was supplied by the user.
"},{"location":"meta/config-models/#conversion-to-immutable-types","title":"Conversion to immutable types","text":"
Every list is converted to tuple and every dict is converted to types.MappingProxyType.
Note
A field or nested field would only be a dict when it is defined as a mapping with arbitrary keys. Otherwise, it would be a model with its own properties as usual.
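The conversion described above can be sketched with a small recursive helper. This is an illustration of the idea rather than the code the models use, and the name `make_immutable` is hypothetical.

```python
from types import MappingProxyType


def make_immutable(value):
    # Dicts become read-only mapping proxies, lists become tuples,
    # and nested containers are converted recursively.
    if isinstance(value, dict):
        return MappingProxyType({key: make_immutable(item) for key, item in value.items()})
    if isinstance(value, list):
        return tuple(make_immutable(item) for item in value)
    return value


# Hypothetical field values used only for illustration.
frozen = make_immutable({'tags': ['env:dev', 'team:a'], 'headers': {'X-Token': 'abc'}})
```

Trying to assign to `frozen['tags']` raises a `TypeError`, which is what makes accidental mutation of the configuration visible early.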
If such a validator exists in validators.py, then it is called with the final constructed model. At this point, it cannot be mutated, so you can only raise errors.
Every integration has a specification detailing all the options that influence behavior. These YAML files are located at <INTEGRATION>/assets/configuration/spec.yaml.
name - This is the name of the file the Agent will look for (REQUIRED)
example_name - This is the name of the example file the Agent will ship. If none is provided, the default will be conf.yaml.example. The exceptions are as follows:
Auto-discovery files, which are named auto_conf.yaml
Python-based core check default files, which are named conf.yaml.default
description - Information about the option. This can be a multi-line string, but each line must contain fewer than 120 characters (REQUIRED).
required - Whether or not the option is required for basic functionality. It defaults to false.
hidden - Whether or not the option should be hidden from public exposure. It defaults to false.
display_priority - An integer representing the relative visual rank the option should take on compared to other options when publicly exposed. It defaults to 0, meaning that every option will be displayed in the order defined in the spec.
deprecation - If the option is deprecated, a mapping of relevant information. For example:
deprecation:\n Agent version: 8.0.0\n Migration: |\n do this\n and that\n
multiple - Whether or not options may be selected multiple times like instances or just once like init_config
multiple_instances_defined - Whether or not we separate the definition into multiple instances or just one
metadata_tags - A list of tags (like docs:foo) that can be used for unexpected use cases
options - Nested options, indicating that this is a section like instances or logs
value - The expected type data
There are 2 types of options: those with and without a value. Those with a value attribute are the actual user-controlled settings that influence behavior like username. Those without are expected to be sections and therefore must have an options attribute. An option cannot have both attributes.
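The rule that an option carries either a value or nested options, but never both, can be expressed as a small check. The following is a sketch under that reading of the paragraph above; the function name and the error messages are hypothetical and are not taken from the spec validator.

```python
def classify_option(option):
    has_value = 'value' in option
    has_options = 'options' in option
    if has_value and has_options:
        raise ValueError('option %r cannot define both value and options' % option.get('name'))
    if not has_value and not has_options:
        raise ValueError('option %r must define either value or options' % option.get('name'))
    # Options with a value are user-controlled settings, the rest are sections.
    return 'setting' if has_value else 'section'


# Hypothetical options used only for illustration.
username = {'name': 'username', 'value': {'type': 'string'}}
instances = {'name': 'instances', 'options': [username]}

kinds = [classify_option(username), classify_option(instances)]
```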
Options with a value (non-section) also support:
secret - Whether or not consumers should treat the option as sensitive information like password. It defaults to false.
Info
The option vs section logic was chosen instead of going fully typed to avoid deeply nested values.
The type system is based on a loose subset of OpenAPI 3 data types.
The differences are:
Only the minimum and maximum numeric modifiers are supported
Only the pattern string modifier is supported
The properties object modifier is not a map, but rather a list of maps with a required name attribute. This is so consumers will load objects consistently regardless of language guarantees regarding map key order.
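To illustrate the last difference, here is a hypothetical object value written as a Python structure, with properties given as a list of maps that each carry a name. The property names and the helper are assumptions made for the example.

```python
# Hypothetical value of an object-typed option.
value = {
    'type': 'object',
    'properties': [
        {'name': 'host', 'type': 'string'},
        {'name': 'port', 'type': 'integer', 'minimum': 1, 'maximum': 65535},
    ],
}


def property_names(value):
    # A list keeps the declared order no matter how the consumer's language orders map keys.
    return [prop['name'] for prop in value['properties']]


names = property_names(value)
```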
Values also support 1 field of our own:
example - An example value, only required if the type is boolean. The default is <OPTION_NAME>.
Every option may reference pre-defined templates using a key called template. The template format looks like path/to/template_file where path/to must point to an existing directory relative to a template directory and template_file must have the file extension .yaml or .yml.
You can use custom templates that will take precedence over the pre-defined templates by using the template_paths parameter of the ConfigSpec class.
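The lookup order described above can be sketched as a search over a list of template directories, with custom directories listed before the pre-defined one. This is a simplified illustration and not the ConfigTemplates implementation; the function name and the directory layout are hypothetical.

```python
import os
import tempfile


def resolve_template(template, search_dirs):
    # Earlier directories take precedence, and both extensions are accepted.
    for directory in search_dirs:
        for extension in ('.yaml', '.yml'):
            candidate = os.path.join(directory, *template.split('/')) + extension
            if os.path.isfile(candidate):
                return candidate
    return None


# Build two throwaway template directories that both define instances/default.
with tempfile.TemporaryDirectory() as root:
    custom = os.path.join(root, 'custom')
    builtin = os.path.join(root, 'builtin')
    for base in (custom, builtin):
        os.makedirs(os.path.join(base, 'instances'))
        open(os.path.join(base, 'instances', 'default.yaml'), 'w').close()
    found = resolve_template('instances/default', [custom, builtin])
    found_in_custom = found.startswith(custom)
```

Since the custom directory is searched first, its copy of the template shadows the pre-defined one.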
The example consumer uses each spec to render the example configuration files that are shipped with every Agent and individual Integration release.
It respects a few extra option-level attributes:
example - A complete example of an option in lieu of a strictly typed value attribute
enabled - Whether or not to un-comment the option, overriding the behavior of required
display_priority - This is an integer affecting the order in which options are displayed, with higher values indicating higher priority. The default is 0.
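One way to picture the effect of display_priority is a stable sort on the negated priority, so that higher values come first and ties keep the order defined in the spec. The sketch below assumes exactly that behavior; the helper name and the sample options are hypothetical.

```python
def order_for_display(options):
    # sorted() is stable, so options with equal priority keep their spec order.
    return sorted(options, key=lambda option: -option.get('display_priority', 0))


# Hypothetical options used only for illustration.
options = [
    {'name': 'host'},
    {'name': 'tags', 'display_priority': -1},
    {'name': 'port', 'display_priority': 5},
    {'name': 'timeout'},
]

ordered = [option['name'] for option in order_for_display(options)]
```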
It also respects a few extra fields under the value attribute of each option:
display_default - This is the default value that will be shown in the header of each option, useful if it differs from the example. You may set it to null explicitly to disable showing this part of the header.
compact_example - Whether or not to display complex types like arrays in their most compact representation. It defaults to false.
Use the --sync flag of the config validation command to render the example configuration files.
"},{"location":"meta/config-specs/#data-model-consumer","title":"Data model consumer","text":"
The model consumer uses each spec to render the pydantic models that checks use to validate and interface with configuration. The models are shipped with every Agent and individual Integration release.
It respects a few extra fields under the value attribute of each option:
default - This is the default value that options will be set to, taking precedence over the example.
validators - This refers to an array of pre-defined field validators to use. Every entry will refer to a relative import path to a field validator under datadog_checks.base.utils.models.validation and will be executed in the defined order.
Use the --sync flag of the model validation command to render the data model files.
"},{"location":"meta/config-specs/#api","title":"API","text":""},{"location":"meta/config-specs/#datadog_checks.dev.tooling.configuration.ConfigSpec","title":"datadog_checks.dev.tooling.configuration.ConfigSpec","text":"Source code in datadog_checks_dev/datadog_checks/dev/tooling/configuration/core.py
class ConfigSpec(object):\n def __init__(self, contents: str, template_paths: List[str] = None, source: str = None, version: str = None):\n \"\"\"\n Parameters:\n\n contents:\n the raw text contents of a spec\n template_paths:\n a sequence of directories that will take precedence when looking for templates\n source:\n a textual representation of what the spec refers to, usually an integration name\n version:\n the version of the spec to default to if the spec does not define one\n \"\"\"\n self.contents = contents\n self.source = source\n self.version = version\n self.templates = ConfigTemplates(template_paths)\n self.data: Union[dict, None] = None\n self.errors = []\n\n def load(self) -> None:\n \"\"\"\n This function de-serializes the specification and:\n 1. fills in default values\n 2. populates any selected templates\n 3. accumulates all error/warning messages\n If the `errors` attribute is empty after this is called, the `data` attribute\n will be the fully resolved spec object.\n \"\"\"\n if self.data is not None and not self.errors:\n return\n\n try:\n self.data = yaml.safe_load(self.contents)\n except Exception as e:\n self.errors.append(f'{self.source}: Unable to parse the configuration specification: {e}')\n return\n\n spec_validator(self.data, self)\n
contents:\n the raw text contents of a spec\ntemplate_paths:\n a sequence of directories that will take precedence when looking for templates\nsource:\n a textual representation of what the spec refers to, usually an integration name\nversion:\n the version of the spec to default to if the spec does not define one\n
Source code in datadog_checks_dev/datadog_checks/dev/tooling/configuration/core.py
def __init__(self, contents: str, template_paths: List[str] = None, source: str = None, version: str = None):\n \"\"\"\n Parameters:\n\n contents:\n the raw text contents of a spec\n template_paths:\n a sequence of directories that will take precedence when looking for templates\n source:\n a textual representation of what the spec refers to, usually an integration name\n version:\n the version of the spec to default to if the spec does not define one\n \"\"\"\n self.contents = contents\n self.source = source\n self.version = version\n self.templates = ConfigTemplates(template_paths)\n self.data: Union[dict, None] = None\n self.errors = []\n
This function de-serializes the specification and:
1. fills in default values
2. populates any selected templates
3. accumulates all error/warning messages
If the errors attribute is empty after this is called, the data attribute will be the fully resolved spec object.
Source code in datadog_checks_dev/datadog_checks/dev/tooling/configuration/core.py
def load(self) -> None:\n \"\"\"\n This function de-serializes the specification and:\n 1. fills in default values\n 2. populates any selected templates\n 3. accumulates all error/warning messages\n If the `errors` attribute is empty after this is called, the `data` attribute\n will be the fully resolved spec object.\n \"\"\"\n if self.data is not None and not self.errors:\n return\n\n try:\n self.data = yaml.safe_load(self.contents)\n except Exception as e:\n self.errors.append(f'{self.source}: Unable to parse the configuration specification: {e}')\n return\n\n spec_validator(self.data, self)\n
Our CI deploys the documentation to GitHub Pages if any changes occur on commits to the master branch.
Danger
Never make documentation non-deterministic as it will trigger deploys for every single commit.
For example, say you want to display the valid values of a CLI option and the enumeration is represented as a set. Formatting the sequence directly will produce inconsistent results because sets do not guarantee order like dictionaries do, so you must sort it first.
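A minimal sketch of the sorted approach follows, with a hypothetical set of choices and a hypothetical helper name.

```python
# Hypothetical set of valid values for a CLI option.
platforms = {'windows', 'linux', 'macos'}


def format_choices(values):
    # Sorting first makes the rendered text identical on every build.
    return ', '.join(sorted(values))


rendered = format_choices(platforms)
```

The rendered string no longer depends on the iteration order of the set, so rebuilding the documentation from an unchanged commit produces the same output.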
We use the official labeler action to automatically add labels to pull requests.
The labeler is configured to add the following:
| Label | Condition |
| integration/<NAME> | any directory at the root that actually contains an integration |
| documentation | any Markdown, config specs, manifest.json, or anything in /docs/ |
| dev/testing | GitHub Actions or Codecov config |
| dev/tooling | GitLab or GitHub Actions config, or ddev |
| dependencies | any change in shipped dependencies |
| release | any base package, dev package, or integration release |
| changelog/no-changelog | any release, or if all files don't modify code that is shipped |"},{"location":"meta/ci/testing/","title":"Testing","text":""},{"location":"meta/ci/testing/#workflows","title":"Workflows","text":"
Master - Runs tests on Python 3 for every target on merges to the master branch
PR - Runs tests on Python 2 & 3 for any modified target in a pull request as long as the base or developer packages were not modified
PR All - Runs tests on Python 2 & 3 for every target in a pull request if the base or developer packages were modified
Nightly minimum base package test - Runs tests for every target once nightly using the minimum declared required version of the base package
Nightly Python 2 tests - Runs tests on Python 2 for every target once nightly
Test Agent release - Runs tests for every target when manually scheduled using specific versions of the Agent for E2E tests
This workflow is meant to be used on pull requests.
First it computes the job matrix based on what was changed. Since this is time sensitive, rather than fetching the entire history we use GitHub's API to find out the precise depth to fetch in order to reach the merge base. Then it runs the test workflow for every job in the matrix.
Note
Changes that match any of the following patterns inside a directory will trigger the testing of that target:
assets/configuration/**/*
tests/**/*
*.py
hatch.toml
metadata.csv
pyproject.toml
Warning
A matrix is limited to 256 jobs. Rather than allowing a workflow error, the matrix generator will enforce the cap and emit a warning.
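The cap can be pictured as a truncation step that warns instead of failing. The following is a sketch of that behavior and not the actual matrix generator; the function name, the warning text, and the choice to keep the first entries are assumptions.

```python
import warnings

MAX_JOBS = 256


def cap_matrix(jobs, limit=MAX_JOBS):
    # Keep the workflow valid by truncating the matrix and reporting what happened.
    if len(jobs) > limit:
        warnings.warn('Job matrix has %d entries, keeping the first %d' % (len(jobs), limit))
        return jobs[:limit]
    return jobs
```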
This workflow runs a single job that is the foundation of how all tests are executed. Depending on the input parameters, the order of operations is as follows:
Checkout code (on pull requests this is a merge commit)
Set up Python 2.7
Set up the Python version the Agent currently ships
Some targets require additional set up such as the installation of system dependencies. Therefore, all such logic is put into scripts that live under /.ddev/ci/scripts.
As targets may need different set up on different platforms, all scripts live under a directory named after the platform ID. All scripts in the directory are executed in lexicographical order. Files in the scripts directory whose names begin with an underscore are not executed.
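As a rough illustration of the selection rule above, the following Python sketch lists the scripts that would run for one platform directory. The function name scripts_to_run is hypothetical and the real runner may be implemented differently.

```python
from pathlib import Path


def scripts_to_run(scripts_dir):
    # Names are sorted lexicographically; names starting with an underscore are skipped.
    return sorted(
        path.name
        for path in Path(scripts_dir).iterdir()
        if path.is_file() and not path.name.startswith('_')
    )
```

Prefixing script names with numbers (for example 10_install.sh before 20_configure.sh) is one way to make the execution order explicit.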
The step that executes these scripts is the only step that has access to secrets.
Since environment variables defined in a workflow do not propagate to reusable workflows, secrets must be passed as a JSON string representing a map.
Both the PR test and Test target reusable workflows for testing accept a setup-env-vars input parameter that defines the environment variables for the setup step. For example:
If environment variables need to be available for testing, you can add a script that writes to the file defined by the GITHUB_ENV environment variable:
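A minimal Python sketch of such a script is shown here, assuming the variables arrive as a JSON string representing a map and that all values fit on a single line. The function name export_env_vars is hypothetical; the only GitHub Actions behaviour relied upon is that NAME=value lines appended to the GITHUB_ENV file become environment variables for later steps.

```python
import json
import os


def export_env_vars(json_map, env_file=None):
    # Parse the JSON map and append one NAME=value line per entry.
    env_file = env_file or os.environ['GITHUB_ENV']
    variables = json.loads(json_map)
    with open(env_file, 'a', encoding='utf-8') as f:
        for name, value in variables.items():
            print(f'{name}={value}', file=f)
    return variables
```

Multi-line values need the delimiter syntax documented by GitHub and are deliberately left out of this sketch.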
Configuration for targets lives under the overrides.ci key inside a /.ddev/config.toml file.
Note
Targets are referenced by the name of their directory.
"},{"location":"meta/ci/testing/#platforms","title":"Platforms","text":"Name ID Default runner Linux linux Ubuntu 22.04 Windows windows Windows Server 2022 macOS macos macOS 12
If an integration's manifest.json indicates that the only supported platform is Windows then that will be used to run tests, otherwise they will run on Linux.
To override the platform(s) used, one can set the overrides.ci.<TARGET>.platforms array. For example:
During testing we use ddtrace to submit APM data to the Datadog Agent. To avoid every job pulling the Agent, these HTTP trace requests are captured and saved to a newline-delimited JSON file.
A workflow then runs after all jobs are finished and replays the requests to the Agent. At the end, the artifact is deleted to avoid needless storage persistence and to ensure that, if individual jobs are rerun, only the new traces are submitted.
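The capture and replay steps rely on the newline-delimited JSON format, which can be sketched in Python as follows. The helper names and the shape of each request dictionary are assumptions made for illustration; the actual capture code is not shown in this document.

```python
import json


def save_request(path, request):
    # Append one captured request as a single JSON line.
    with open(path, 'a', encoding='utf-8') as f:
        print(json.dumps(request), file=f)


def load_requests(path):
    # Yield captured requests in the order they were written.
    with open(path, encoding='utf-8') as f:
        for line in f:
            if line.strip():
                yield json.loads(line)
```

Because each request occupies exactly one line, files produced by separate jobs can be concatenated before replaying them.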
We maintain a public dashboard for monitoring our CI.
A workflow runs on merges to the master branch that, if the files defining the dependencies have not changed, saves the dependencies shared by all targets for the current Python version for each platform.
During testing the cache is restored, with a fallback to an older compatible version of the cache.
The first command invocation is extraordinarily slow (see actions/runner-images#6561). Bash appears to be the least affected so we set that as the default shell for all workflows that run commands.
Note
The official checkout action is affected by a similar issue (see actions/checkout#1246) that has been narrowed down to disk I/O.
Various validations are run to check for correctness. There is a reusable workflow that repositories may call with input parameters defining which validations to use, with each input parameter corresponding to a subcommand under the ddev validate command group.
This validates that each integration version is in sync with the requirements-agent-release.txt file. It is uncommon for this to fail because the release process is automated.
This validates that all CI entries for integrations are valid. This includes checking if the integration has the correct Codecov config, and has a valid CI entry if it is testable.
Tip
Run ddev validate ci --sync to resolve most errors.
This validates that every integration has a codeowner entry. If this validation fails, add an entry in the codeowners file corresponding to any newly added integration.
Note
This validation is only enabled for integrations-extras.
This verifies that the config specs for all integrations are valid by enforcing our configuration spec schema. The most common failure is some version of File <INTEGRATION_SPEC> needs to be synced. To resolve this issue, you can run ddev validate config --sync.
If you see failures regarding formatting or missing parameters, see our config spec documentation for more details on how to construct configuration specs.
This validates that the manifest files contain required fields, are formatted correctly, and don't contain common errors. See the Datadog docs for more detailed constraints.
This ensures that every integration's README.md file is formatted correctly. The main purpose of this validation is to ensure that any image linked in the readme exists and that all images are located in an integration's /image directory.
"},{"location":"tutorials/jmx/integration/#step-1-create-a-jmx-integration-scaffolding","title":"Step 1: Create a JMX integration scaffolding","text":"
ddev create --type jmx MyJMXIntegration\n
A JMX integration contains specific init configs and instance configs:
init_config:\n is_jmx: true # tells the Agent that the integration is a JMX type of integration\n collect_default_metrics: true # if true, metrics declared in `metrics.yaml` are collected\n\ninstances:\n - host: <HOST> # JMX hostname\n port: <PORT> # JMX port\n ...\n
Other init and instance configs can be found on the JMX integration page.
"},{"location":"tutorials/jmx/integration/#step-2-define-metrics-you-want-to-collect","title":"Step 2: Define metrics you want to collect","text":"
Select the metrics you want to collect from JMX. Available metrics can usually be found in the official documentation of the service you want to monitor.
You can also use tools like VisualVM, JConsole or jmxterm to explore the available JMX beans and their descriptions.
"},{"location":"tutorials/logs/http-crawler/#define-an-agent-check","title":"Define an Agent Check","text":"
We start by registering an implementation for our integration. At first it is empty; we will expand on it step by step.
Open datadog_checks/acme/check.py in our editor and put the following there:
from datadog_checks.base.checks.logs.crawler.base import LogCrawlerCheck\n\n\nclass AcmeCheck(LogCrawlerCheck):\n __NAMESPACE__ = 'acme'\n
Now we'll run something we will refer to as the check command:
ddev env agent acme py3.11 check\n
We'll see the following error:
Can't instantiate abstract class AcmeCheck with abstract method get_log_streams\n
We need to define the get_log_streams method. As stated in the docs, it must return an iterator over LogStream subclasses. The next section describes this further.
"},{"location":"tutorials/logs/http-crawler/#define-a-stream-of-logs","title":"Define a Stream of Logs","text":"
In the same file, add a LogStream subclass and return it (wrapped in a list) from AcmeCheck.get_log_streams:
from datadog_checks.base.checks.logs.crawler.base import LogCrawlerCheck\nfrom datadog_checks.base.checks.logs.crawler.stream import LogStream\n\nclass AcmeCheck(LogCrawlerCheck):\n __NAMESPACE__ = 'acme'\n\n def get_log_streams(self):\n return [AcmeLogStream(check=self, name='ACME log stream')]\n\nclass AcmeLogStream(LogStream):\n \"\"\"Stream of Logs from ACME\"\"\"\n
Now running the check command will show a new error:
TypeError: Can't instantiate abstract class AcmeLogStream with abstract method records\n
Once again we need to define a method, this time LogStream.records. This method accepts a cursor argument. We ignore this argument for now and explain it later.
from datadog_checks.base.checks.logs.crawler.stream import LogRecord, LogStream\nfrom datadog_checks.base.utils.time import get_timestamp\n\n... # Skip AcmeCheck to focus on LogStream.\n\n\nclass AcmeLogStream(LogStream):\n \"\"\"Stream of Logs from ACME\"\"\"\n\n def records(self, cursor=None):\n return [\n LogRecord(\n data={'message': 'This is a log from ACME.', 'level': 'info'},\n cursor={'timestamp': get_timestamp()},\n )\n ]\n
There are several things going on here. AcmeLogStream.records returns an iterator over LogRecord objects. For simplicity here we return a list with just one record. After we understand what each LogRecord looks like we can discuss how to generate multiple records.
"},{"location":"tutorials/logs/http-crawler/#what-is-a-log-record","title":"What is a Log Record?","text":"
The LogRecord class has 2 fields. In data we put anything that we want to submit as a log to Datadog. In cursor we store a unique identifier for this specific LogRecord.
We use the cursor field to checkpoint our progress as we scrape the external API. In other words, every time our integration completes its run we save the last cursor we submitted. We can then resume scraping from this cursor. That's what the cursor argument to the records method is for. The very first time the integration runs this cursor is None because we have no checkpoints. For every subsequent integration run, the cursor will be set to the LogRecord.cursor of the last LogRecord yielded or returned from records.
Some things to consider when defining cursors:
Use UTC time stamps!
Only using the timestamp as a unique identifier may not be enough. We can have different records with the same timestamp.
One popular identifier is the order of the log record in the stream. Whether this works or not depends on the API we are crawling.
"},{"location":"tutorials/logs/http-crawler/#scraping-for-log-records","title":"Scraping for Log Records","text":"
In our toy example we returned a list with just one record. In practice we will need to create a list or lazy iterator over LogRecords. We will construct them from data that we collect from the external API, in this case the one from ACME.
Below are some tips and considerations when scraping external APIs:
Use the cursor argument to checkpoint your progress.
The Agent schedules an integration run approximately every 10-15 seconds.
The intake won't accept logs that are older than 18 hours. For better performance skip such logs as you generate LogRecord items.
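The tips above can be combined into a lazy iterator. The sketch below is standalone Python that uses plain dictionaries in place of LogRecord so it runs without the Agent; the entry fields (timestamp, sequence, message) and the function name iter_new_entries are hypothetical. In a real check the same logic would live in LogStream.records and yield LogRecord objects.

```python
import time

MAX_LOG_AGE_SECONDS = 18 * 60 * 60  # intake limit mentioned above


def iter_new_entries(entries, cursor=None, now=None):
    # `entries` is assumed to be sorted by (timestamp, sequence), with UTC epoch timestamps.
    now = time.time() if now is None else now
    last_seen = (cursor['timestamp'], cursor['sequence']) if cursor else None
    for entry in entries:
        key = (entry['timestamp'], entry['sequence'])
        if last_seen is not None and key <= last_seen:
            continue  # already submitted in a previous run
        if now - entry['timestamp'] > MAX_LOG_AGE_SECONDS:
            continue  # too old to be accepted by the intake
        yield {
            'data': {'message': entry['message']},
            'cursor': {'timestamp': entry['timestamp'], 'sequence': entry['sequence']},
        }
```

Pairing the timestamp with a sequence number keeps the cursor unique even when several entries share the same timestamp.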
SNMP is a protocol for gathering metrics from network devices, but automated testing of the integration would not be practical nor reliable if we used actual devices.
Our approach is to use a simulated SNMP device that responds to SNMP queries using simulation data.
This simulated device is brought up as a Docker container when starting the SNMP test environment using:
The community_string must match the corresponding device .snmprec file name. For example, myprofile.snmprec gives community_string: myprofile. This also applies to walk files: myprofile.snmpwalk gives community_string: myprofile.
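The naming rule above amounts to dropping the file extension, as this small Python sketch shows. The helper name community_string_for is hypothetical and only exists to illustrate the mapping.

```python
from pathlib import PurePath

SIMULATION_SUFFIXES = {'.snmprec', '.snmpwalk'}


def community_string_for(file_name):
    # The community string is the file name without its simulation data extension.
    path = PurePath(file_name)
    if path.suffix not in SIMULATION_SUFFIXES:
        raise ValueError(f'Not a simulation data file: {file_name}')
    return path.stem
```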
To find the IP address of the SNMP container, run:
Make sure you have the Net-SNMP tools installed on your machine. These should come pre-installed by default on Linux and macOS. If necessary, you can download them from the Net-SNMP website.
To query a specific OID from a device, we can use the snmpget command.
For example, the following command will query sysDescr OID of an SNMP device, which returns its human-readable description:
$ snmpget -v 2c -c public -IR 127.0.0.1:1161 system.sysDescr.0\nSNMPv2-MIB::sysDescr.0 = STRING: Linux 41ba948911b9 4.9.87-linuxkit-aufs #1 SMP Wed Mar 14 15:12:16 UTC 2018 x86_64\nSNMPv2-MIB::sysORUpTime.1 = Timeticks: (9) 0:00:00.09\n
Let's break this command down:
snmpget: this command sends an SNMP GET request, and can be used to query the value of an OID. Here, we are requesting the system.sysDescr.0 OID.
-v 2c: instructs your SNMP client to send the request using SNMP version 2c. See SNMP Versions.
-c public: instructs the SNMP client to send the community string public along with our request. (This is a form of authentication provided by SNMP v2. See SNMP Versions.)
127.0.0.1:1161: this is the host and port where the simulated SNMP agent is available. (Confirm the port used by the ddev environment by inspecting the Docker port mapping via $ docker ps.)
system.sysDescr.0: this is the OID that the client should request. In practice this can refer to either a fully-resolved OID (e.g. 1.3.6.1.4.1[...]), or a label (e.g. sysDescr.0).
-IR: this option allows us to use labels for OIDs that aren't in the generic 1.3.6.1.2.1.* sub-tree (see: The OID tree). TL;DR: always use this option when working with OIDs coming from vendor-specific MIBs.
Tip
If the above command fails, try using the explicit OID like so:
$ snmpget -v 2c -c public -IR 127.0.0.1:1161 iso.3.6.1.2.1.1.1.0\n
To generate simulation data for tables automatically, use the mib2dev.py tool shipped with snmpsim. This tool will be renamed as snmpsim-record-mibs in the upcoming 1.0 release of the library.
First, install snmpsim:
pip install snmpsim\n
Then run the tool, specifying the MIB with the start and stop OIDs (which can correspond to, e.g., the first and last columns in the table respectively).
mib2dev has a known issue with IF-MIB::ifPhysAddress, which is expected to contain a hexadecimal string, but mib2dev fills it with an arbitrary string. To fix this, provide a valid hex string when prompted on the command line:
# Synthesizing row #1 of table 1.3.6.1.2.1.2.2.1\n*** Inconsistent value: Display format eval failure: b'driving kept zombies quaintly forward zombies': invalid literal for int() with base 16: 'driving kept zombies quaintly forward zombies'caused by <class 'ValueError'>: invalid literal for int() with base 16: 'driving kept zombies quaintly forward zombies'\n*** See constraints and suggest a better one for:\n# Table IF-MIB::ifTable\n# Row IF-MIB::ifEntry\n# Index IF-MIB::ifIndex (type InterfaceIndex)\n# Column IF-MIB::ifPhysAddress (type PhysAddress)\n# Value ['driving kept zombies quaintly forward zombies'] ? 001122334455\n
"},{"location":"tutorials/snmp/how-to/#generate-simulation-data-from-a-walk","title":"Generate simulation data from a walk","text":"
As an alternative to .snmprec files, it is possible to use a walk as simulation data. This is especially useful when debugging live devices, since you can export the device walk and use this real data locally.
To do so, paste the output of a walk query into a .snmpwalk file, and add this file to the test data directory. Then, pass the name of the walk file as the community_string. For more information, see Test SNMP profiles locally.
"},{"location":"tutorials/snmp/how-to/#find-where-mibs-are-installed-on-your-machine","title":"Find where MIBs are installed on your machine","text":"
Since community resources that list MIBs and OIDs are best effort, the MIB you are investigating may not be present or may not be available in its latest version.
In that case, you can use the snmptranslate CLI tool to output similar information for MIBs installed on your system. This tool is part of Net-SNMP - see SNMP queries prerequisites.
Steps
Run $ snmptranslate -m <MIBNAME> -Tz -On to get a complete list of OIDs in the <MIBNAME> MIB along with their labels.
Redirect to a file for nicer formatting as needed.
Use the -M <DIR> option to specify the directory where snmptranslate should look for MIBs. Useful if you want to inspect a MIB you've just downloaded but not moved to the default MIB directory.
Tip
Use -Tp for an alternative tree-like formatting.
"},{"location":"tutorials/snmp/introduction/","title":"Introduction to SNMP","text":"
In this introduction, we'll cover general information about the SNMP protocol, including key concepts such as OIDs and MIBs.
If you're already familiar with the SNMP protocol, feel free to skip to the next page.
"},{"location":"tutorials/snmp/introduction/#what-is-snmp","title":"What is SNMP?","text":""},{"location":"tutorials/snmp/introduction/#overview","title":"Overview","text":"
SNMP (Simple Network Management Protocol) is a protocol for monitoring network devices. It uses UDP and supports both a request/response model (commands and queries) and a notification model (traps, informs).
In the request/response model, the SNMP manager (e.g. the Datadog Agent) issues an SNMP command (GET, GETNEXT, BULK) to an SNMP agent (e.g. a network device).
SNMP was born in the 1980s, so it has been around for a long time. While more modern alternatives like NETCONF and OpenConfig have been gaining attention, a large number of network devices still use SNMP as their primary monitoring interface.
The SNMP protocol exists in 3 versions: v1 (legacy), v2c, and v3.
The main differences between v1/v2c and v3 are the authentication mechanism and transport layer, as summarized below.
Version Authentication Transport layer v1/v2c Password (the community string) Plain text only v3 Username/password Support for packet signing and encryption"},{"location":"tutorials/snmp/introduction/#oids","title":"OIDs","text":""},{"location":"tutorials/snmp/introduction/#what-is-an-oid","title":"What is an OID?","text":"
Identifiers for queryable quantities
An OID, also known as an Object Identifier, is an identifier for a quantity (\"object\") that can be retrieved from an SNMP device. Such quantities may include uptime, temperature, network traffic, etc (quantities available will vary across devices).
To make them processable by machines, OIDs are represented as dot-separated sequences of numbers, e.g. 1.3.6.1.2.1.1.1.
Global definition
OIDs are globally defined, which means they have the same meaning regardless of the device that processes the SNMP query. For example, querying the 1.3.6.1.2.1.1.1 OID (also known as sysDescr) on any SNMP agent will make it return the system description. (More on the OID/label mapping can be found in the MIBs section below.)
Not all OIDs contain metrics data
OIDs can refer to various types of objects, such as strings, numbers, tables, etc.
In particular, this means that only a fraction of OIDs refer to numerical quantities that can actually be sent as metrics to Datadog. However, non-numerical OIDs can also be useful, especially for tagging.
"},{"location":"tutorials/snmp/introduction/#the-oid-tree","title":"The OID tree","text":"
OIDs are structured in a tree-like fashion. Each number in the OID represents a node in the tree.
The wildcard notation is often used to refer to a sub-tree of OIDs, e.g. 1.3.6.1.2.*.
It so happens that there are two main OID sub-trees: a sub-tree for general-purpose OIDs, and a sub-tree for vendor-specific OIDs.
Located under the sub-tree: 1.3.6.1.4.1.* (a.k.a. enterprises).
These OIDs are defined and managed by network device vendors themselves.
Each vendor is assigned its own enterprise sub-tree in the form of 1.3.6.1.4.1.<N>.*.
For example:
1.3.6.1.4.1.2.* is the sub-tree for IBM-specific OIDs.
1.3.6.1.4.1.9.* is the sub-tree for Cisco-specific OIDs.
The full list of vendor sub-trees can be found here: SNMP OID 1.3.6.1.4.1.
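The tree structure and the wildcard notation can be made concrete with a short Python sketch. The function names are hypothetical; the only facts used are that OIDs are dot-separated numbers and that vendor OIDs live under 1.3.6.1.4.1.<N>.

```python
ENTERPRISES = '1.3.6.1.4.1'


def parse_oid(oid):
    # Turn a dotted OID into a tuple of integers, one per tree node.
    return tuple(int(part) for part in oid.strip('.').split('.'))


def in_subtree(oid, root):
    # `root` is written without the trailing .* wildcard.
    oid_parts, root_parts = parse_oid(oid), parse_oid(root)
    return oid_parts[:len(root_parts)] == root_parts


def enterprise_number(oid):
    # Return the vendor number <N> for OIDs under 1.3.6.1.4.1.<N>, else None.
    if not in_subtree(oid, ENTERPRISES):
        return None
    parts = parse_oid(oid)
    depth = len(parse_oid(ENTERPRISES))
    return parts[depth] if len(parts) > depth else None
```

For instance, enterprise_number applied to 1.3.6.1.4.1.9.9.109.1.1.1 gives 9, the Cisco sub-tree listed above.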
"},{"location":"tutorials/snmp/introduction/#notable-oids","title":"Notable OIDs","text":"OID Label Description 1.3.6.1.2.1.2sysObjectId An OID whose value is an OID that represents the device make and model (yes, it's a bit meta). 1.3.6.1.2.1.1.1sysDescr A human-readable, free-form description of the device. 1.3.6.1.2.1.1.3sysUpTimeInstance The device uptime."},{"location":"tutorials/snmp/introduction/#mibs","title":"MIBs","text":""},{"location":"tutorials/snmp/introduction/#what-is-an-mib","title":"What is an MIB?","text":"
OIDs are grouped in modules called MIBs (Management Information Base). An MIB describes the hierarchy of a given set of OIDs. (This is somewhat analogous to a dictionary that contains the definitions for each word in a spoken language.)
For example, the IF-MIB describes the hierarchy of OIDs within the sub-tree 1.3.6.1.2.1.2.*. These OIDs contain metrics about the network interfaces available on the device. (Note how its location under the 1.3.6.1.2.* sub-tree indicates that it is a generic MIB, available on most network devices.)
As part of the description of OIDs, an MIB defines a human-readable label for each OID. For example, SNMPv2-MIB describes the OID 1.3.6.1.2.1.1.1 and assigns it the label sysDescr. The operation that consists in finding the OID from a label is called OID resolution.
"},{"location":"tutorials/snmp/introduction/#tools-and-resources","title":"Tools and resources","text":"
The following resources can be useful when working with MIBs:
MIB Discovery: a search engine for OIDs. Use it to find what an OID corresponds to, which MIB it comes from, what label it is known as, etc.
Circitor MIB files repository: a repository and search engine where one can download actual .mib files.
SNMP Labs MIB repository: alternate repo of many common MIBs. Note: this site hosts the underlying MIBs which the pysnmp-mibs library (used by the SNMP Python check) actually validates against. Double check any MIB you get from an alternate source with what is in this repo.
Tutorials: Internet Management and SNMP (YouTube) (In-depth videos about SNMP architecture, MIBs, protocol data structures, security models, monitoring code examples, etc.)
"},{"location":"tutorials/snmp/profile-format/","title":"Profile Format Reference","text":""},{"location":"tutorials/snmp/profile-format/#overview","title":"Overview","text":"
SNMP profiles are our way of providing out-of-the-box monitoring for certain makes and models of network devices.
An SNMP profile is materialised as a YAML file with the following structure:
sysobjectid: <x.y.z...>\n\n# extends:\n# <Optional list of base profiles to extend from...>\n\nmetrics:\n # <List of metrics to collect...>\n\n# metric_tags:\n# <List of tags to apply to collected metrics. Required for table metrics, optional otherwise>\n
This field can be used to include metrics and metric tags from other so-called base profiles. Base profiles can derive from other base profiles to build a hierarchy of reusable profile mixins.
Important
All device profiles should extend from the _base.yaml profile, which defines items that should be collected for all devices.
Example:
extends:\n - _base.yaml\n - _generic-if.yaml # Include basic metrics from IF-MIB.\n
Entries in the metrics field define which metrics will be collected by the profile. They can reference either a single OID (a.k.a. symbol), or an SNMP table.
An SNMP symbol is an object with a scalar type (e.g. Counter32, Integer32, OctetString, etc).
In a MIB file, a symbol can be recognized as an OBJECT-TYPE node with a scalar SYNTAX, placed under an OBJECT IDENTIFIER node (which is often the root OID of the MIB):
In profiles, tables can be specified as entries containing the MIB, table and symbols fields. The syntax for the value contained in each row is typically <TABLE_OID>.1.<COLUMN_ID>.<INDEX>:
metrics:\n # Example for the dummy table above:\n - MIB: EXAMPLE-MIB\n table:\n # Identification of the table which metrics come from.\n OID: 1.3.6.1.4.1.10\n name: exampleTable\n symbols:\n # List of symbols ('columns') to retrieve.\n # Same format as for a single OID.\n # The value from each row (index) in the table will be collected `<TABLE_OID>.1.<COLUMN_ID>.<INDEX>`\n - OID: 1.3.6.1.4.1.10.1.1\n name: exampleColumn1\n - OID: 1.3.6.1.4.1.10.1.2\n name: exampleColumn2\n # ...\n\n # More realistic example:\n - MIB: CISCO-PROCESS-MIB\n table:\n # Each row in this table contains information about a CPU unit of the device.\n OID: 1.3.6.1.4.1.9.9.109.1.1.1\n name: cpmCPUTotalTable\n symbols:\n - OID: 1.3.6.1.4.1.9.9.109.1.1.1.1.12\n name: cpmCPUMemoryUsed\n # ...\n
Table metrics require metric_tags to identify each row's metric. It is possible to add tags to metrics retrieved from a table in three ways:
"},{"location":"tutorials/snmp/profile-format/#using-a-column-within-the-same-table","title":"Using a column within the same table","text":"
metrics:\n - MIB: IF-MIB\n table:\n OID: 1.3.6.1.2.1.2.2\n name: ifTable\n symbols:\n - OID: 1.3.6.1.2.1.2.2.1.14\n name: ifInErrors\n # ...\n metric_tags:\n # Add an 'interface' tag to each metric of each row,\n # whose value is obtained from the 'ifDescr' column of the row.\n # This allows querying metrics by interface, e.g. 'interface:eth0'.\n - tag: interface\n symbol:\n OID: 1.3.6.1.2.1.2.2.1.2\n name: ifDescr\n
"},{"location":"tutorials/snmp/profile-format/#using-a-column-from-a-different-table-with-identical-indexes","title":"Using a column from a different table with identical indexes","text":"
"},{"location":"tutorials/snmp/profile-format/#using-a-column-from-a-different-table-with-different-indexes","title":"Using a column from a different table with different indexes","text":"
If the external table has different indexes, use index_transform to select a subset of the full index. index_transform is a list of start/end ranges to extract from the current table index to match the external table index. start and end are inclusive.
External table indexes must be a subset of the indexes of the current table, or same indexes in a different order.
Example
In the example above, the index of cpiPduBranchTable looks like 1.6.0.36.155.53.3.246: the first digit is the cpiPduBranchId index and the rest is the cpiPduBranchMac index. The index of cpiPduTable looks like 6.0.36.155.53.3.246 and represents cpiPduMac (equivalent to cpiPduBranchMac).
By using the index_transform with start 1 and end 7, we extract 6.0.36.155.53.3.246 from 1.6.0.36.155.53.3.246 (cpiPduBranchTable full index), and then use it to match 6.0.36.155.53.3.246 (cpiPduTable full index).
index_transform can be more complex. The following definition will extract 2.3.5.6.7 from 1.2.3.4.5.6.7.
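The inclusive start/end semantics can be sketched in Python. This is not the Agent's implementation: the function name apply_index_transform is hypothetical, and the two ranges used for the second case (1 to 2 and 4 to 6) are an assumption chosen to reproduce the 2.3.5.6.7 result.

```python
def apply_index_transform(index, ranges):
    # Positions are zero-based and both `start` and `end` are inclusive.
    parts = index.split('.')
    selected = []
    for item in ranges:
        selected.extend(parts[item['start']:item['end'] + 1])
    return '.'.join(selected)


# Single range: drop the leading cpiPduBranchId digit.
pdu_index = apply_index_transform('1.6.0.36.155.53.3.246', [{'start': 1, 'end': 7}])

# Several ranges (assumed values): skip the first and fourth positions.
complex_index = apply_index_transform(
    '1.2.3.4.5.6.7', [{'start': 1, 'end': 2}, {'start': 4, 'end': 6}]
)
```

Here pdu_index evaluates to 6.0.36.155.53.3.246 and complex_index to 2.3.5.6.7.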
"},{"location":"tutorials/snmp/profile-format/#mapping-column-to-tag-string-value","title":"Mapping column to tag string value","text":"
You can use the following syntax to map OID values to tag string values. In the example below, the submitted metrics will be snmp.ifInOctets with tags like if_type:regular1822. Available in Agent 7.45+.
"},{"location":"tutorials/snmp/profile-format/#using-an-index","title":"Using an index","text":"
Important: \"index\" refers to one digit of the index part of the row OID. For example, if the column OID is 1.2.3.1.2 and the row OID is 1.2.3.1.2.7.8.9, the full index is 7.8.9. In this example, index: 1 refers to 7 and index: 2 refers to 8, and so on.
Here is a specific example of an OID with multiple positions in the index (OID ref):
cfwConnectionStatEntry OBJECT-TYPE\n SYNTAX CfwConnectionStatEntry\n ACCESS not-accessible\n STATUS mandatory\n DESCRIPTION\n \"An entry in the table, containing information about a\n firewall statistic.\"\n INDEX { cfwConnectionStatService, cfwConnectionStatType }\n ::= { cfwConnectionStatTable 1 }\n
The index in this case is a combination of cfwConnectionStatService and cfwConnectionStatType. Inspecting the OBJECT-TYPE of cfwConnectionStatService reveals the SYNTAX as Services (OID ref):
cfwConnectionStatService OBJECT-TYPE\n SYNTAX Services\n MAX-ACCESS not-accessible\n STATUS current\n DESCRIPTION\n \"The identification of the type of connection providing\n statistics.\"\n ::= { cfwConnectionStatEntry 1 }\n
For example, when we fetch the value of cfwConnectionStatValue, the OID with the index looks like 1.3.6.1.4.1.9.9.147.1.2.2.2.1.5.20.2 = 4087850099. Here the indexes are 20.2 (1.3.6.1.4.1.9.9.147.1.2.2.2.1.5.<service type>.<stat type>). Here is how we would specify this configuration in the YAML (as seen in the corresponding profile packaged with the Agent):
metrics:\n - MIB: CISCO-FIREWALL-MIB\n table:\n OID: 1.3.6.1.4.1.9.9.147.1.2.2.2\n name: cfwConnectionStatTable\n symbols:\n - OID: 1.3.6.1.4.1.9.9.147.1.2.2.2.1.5\n name: cfwConnectionStatValue\n metric_tags:\n - index: 1 # capture first index digit\n tag: service_type\n - index: 2 # capture second index digit\n tag: stat_type\n
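To see how index positions turn into tag values, here is a small Python sketch that mirrors the configuration above. It is an illustration only; the function name index_tags and the dictionary layout are assumptions.

```python
def index_tags(column_oid, row_oid, tag_positions):
    # Everything after the column OID is the full index of the row.
    prefix = column_oid + '.'
    if not row_oid.startswith(prefix):
        raise ValueError('row OID is not under the column OID')
    index_parts = row_oid[len(prefix):].split('.')
    # Positions are 1-based, matching `index: 1`, `index: 2` in the profile.
    return {tag: index_parts[position - 1] for position, tag in tag_positions.items()}


tags = index_tags(
    '1.3.6.1.4.1.9.9.147.1.2.2.2.1.5',
    '1.3.6.1.4.1.9.9.147.1.2.2.2.1.5.20.2',
    {1: 'service_type', 2: 'stat_type'},
)
```

With the row OID from the example, tags maps service_type to 20 and stat_type to 2.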
"},{"location":"tutorials/snmp/profile-format/#mapping-index-to-tag-string-value","title":"Mapping index to tag string value","text":"
You can use the following syntax to map indexes to tag string values. In the example below, the submitted metrics will be snmp.ipSystemStatsHCInReceives with tags like ipversion:ipv6.
General guidelines on Datadog tagging also apply to table metric tags.
In particular, be mindful of the kind of value contained in the columns used as tag sources. E.g. avoid using a DisplayString (an arbitrarily long human-readable text description) or unbounded sources (timestamps, IDs...) as tag values.
Good candidates for tag values include short strings, enums, or integer indexes.
"},{"location":"tutorials/snmp/profile-format/#metric-type-inference","title":"Metric type inference","text":"
By default, the Datadog metric type of a symbol will be inferred from the SNMP type (i.e. the MIB SYNTAX):
| SNMP type | Inferred metric type |
| --- | --- |
| Counter32 | rate |
| Counter64 | rate |
| Gauge32 | gauge |
| Integer | gauge |
| Integer32 | gauge |
| CounterBasedGauge64 | gauge |
| Opaque | gauge |
SNMP types not listed in this table are submitted as gauge by default.
Sometimes the inferred type may not be what you want. Typically, OIDs that represent \"total number of X\" are defined as Counter32 in MIBs, but you probably want to submit them as monotonic_count instead of a rate.
For such cases, you can define a metric_type. Possible values and their effect are listed below.
| Forced type | Description |
| --- | --- |
| gauge | Submit as a gauge. |
| rate | Submit as a rate. |
| percent | Multiply by 100 and submit as a rate. |
| monotonic_count | Submit as a monotonic count. |
| monotonic_count_and_rate | Submit 2 copies of the metric: one as a monotonic count, and one as a rate (suffixed with .rate). |
| flag_stream | Submit each flag of a flag stream as individual metric with value 0 or 1. See Flag Stream section. |
This works on both symbol and table metrics:
metrics:\n # On a symbol:\n - MIB: TCP-MIB\n symbol:\n OID: 1.3.6.1.2.1.6.5\n name: tcpActiveOpens\n metric_type: monotonic_count\n # On a table, apply same metric_type to all metrics:\n - MIB: IP-MIB\n table:\n OID: 1.3.6.1.2.1.4.31.1\n name: ipSystemStatsTable\n metric_type: monotonic_count\n symbols:\n - OID: 1.3.6.1.2.1.4.31.1.1.4\n name: ipSystemStatsHCInReceives\n - OID: 1.3.6.1.2.1.4.31.1.1.6\n name: ipSystemStatsHCInOctets\n # On a table, apply different metric_type per metric:\n - MIB: IP-MIB\n table:\n OID: 1.3.6.1.2.1.4.31.1\n name: ipSystemStatsTable\n symbols:\n - OID: 1.3.6.1.2.1.4.31.1.1.4\n name: ipSystemStatsHCInReceives\n metric_type: monotonic_count\n - OID: 1.3.6.1.2.1.4.31.1.1.6\n name: ipSystemStatsHCInOctets\n metric_type: gauge\n
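The inference rules and the metric_type override can be summarized in a few lines of Python. This is a sketch of the documented behaviour, not the check's source code; the names INFERRED_TYPES and resolve_metric_type are hypothetical.

```python
# Mapping taken from the inference table above.
INFERRED_TYPES = {
    'Counter32': 'rate',
    'Counter64': 'rate',
    'Gauge32': 'gauge',
    'Integer': 'gauge',
    'Integer32': 'gauge',
    'CounterBasedGauge64': 'gauge',
    'Opaque': 'gauge',
}


def resolve_metric_type(snmp_type, forced_type=None):
    # An explicit metric_type wins; unlisted SNMP types default to gauge.
    if forced_type is not None:
        return forced_type
    return INFERRED_TYPES.get(snmp_type, 'gauge')
```

For example, a Counter32 symbol resolves to rate unless the profile sets metric_type: monotonic_count.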
When the value is a flag stream like 010101, you can use metric_type: flag_stream to submit each flag as an individual metric with value 0 or 1. Two options are required when using flag_stream:
options.placement: position of the flag in the flag stream (1-based indexing, first element is placement 1).
options.metric_suffix: suffix appended to the metric name for a specific flag, usually matching the name of the flag.
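The two options can be illustrated with a short Python sketch. The helper names are hypothetical, and joining the metric name and suffix with a dot is an assumption about the final metric name.

```python
def flag_value(flag_stream, placement):
    # `placement` uses 1-based indexing: placement 1 is the first character.
    if not 1 <= placement <= len(flag_stream):
        raise ValueError(f'placement {placement} is outside the flag stream')
    return int(flag_stream[placement - 1])


def flag_metrics(metric_name, flag_stream, flags):
    # `flags` maps each placement to its metric_suffix.
    return {
        f'{metric_name}.{suffix}': flag_value(flag_stream, placement)
        for placement, suffix in flags.items()
    }
```

With the flag stream 010101, placement 1 yields 0 and placement 2 yields 1.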
An snmp.myDevice metric is sent, with a value of 1 and tagged by statuses. This allows you to monitor status changes, number of devices per state, etc., in Datadog.
This field is used to apply tags to all metrics collected by the profile. It has the same meaning as the instance-level config option (see conf.yaml.example).
Several collection methods are supported, as illustrated below:
"},{"location":"tutorials/snmp/profile-format/#value-from-multiple-oids-symbols","title":"Value from multiple OIDs (symbols)","text":"
When the value might come from multiple symbols, we try to get the value from the first symbol. If the value can't be fetched (e.g. the OID is not available on the device), we try the second symbol, and so on.
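This fallback order can be sketched in Python as follows. The function name first_available and the fetch callable are hypothetical stand-ins for the actual SNMP query, which is assumed to return None when an OID is not available.

```python
def first_available(symbols, fetch):
    # Try each symbol in order and return the first value that could be fetched.
    for symbol in symbols:
        value = fetch(symbol)
        if value is not None:
            return value
    return None
```

Passing a dictionary's get method as fetch is a convenient way to try this out without a device.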
In the examples above, the OID value is an SNMP OctetString value 22C and we want 22 to be submitted as value for snmp.temperature.
"},{"location":"tutorials/snmp/profile-format/#extract_value-can-be-used-to-trim-surrounding-non-printable-characters","title":"extract_value can be used to trim surrounding non-printable characters","text":"
If the raw SNMP OctetString value contains leading or trailing non-printable characters, you can use extract_value regex like ([a-zA-Z0-9_]+) to ignore them.
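Both uses of extract_value boil down to taking the first capture group of a regular expression, which the Python sketch below reproduces. The function here is a stand-in rather than the check's code, and the pattern ([0-9]+)C used for the temperature case is an illustrative assumption.

```python
import re


def extract_value(pattern, raw):
    # Return the first capture group of the first match, or None if nothing matches.
    match = re.search(pattern, raw)
    return match.group(1) if match else None


temperature = extract_value('([0-9]+)C', '22C')

# chr(0) stands in for a non-printable character around the real value.
hostname = extract_value('([a-zA-Z0-9_]+)', chr(0) + 'router_1' + chr(0))
```

Here temperature is the string 22 and hostname is router_1, with the surrounding non-printable characters dropped.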
If you see MAC Address in tags being encoded as 0x000000000000 instead of 00:00:00:00:00:00, then you can use format: mac_address to format the MAC Address to 00:00:00:00:00:00 format.
If you see IP Address in tags being encoded as 0x0a430007 instead of 10.67.0.7, then you can use format: ip_address to format the IP Address to 10.67.0.7 format.
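The effect of the two format options can be reproduced in Python. This is a sketch for IPv4 addresses and hex-encoded values only; the function names are hypothetical.

```python
import ipaddress


def _hex_digits(value):
    # Drop an optional 0x prefix.
    return value[2:] if value.lower().startswith('0x') else value


def format_mac_address(value):
    # Group the hex digits in pairs separated by colons.
    digits = _hex_digits(value)
    return ':'.join(digits[i:i + 2] for i in range(0, len(digits), 2))


def format_ip_address(value):
    # Interpret the four bytes as an IPv4 address.
    return str(ipaddress.IPv4Address(bytes.fromhex(_hex_digits(value))))
```

Applied to the values above, 0x000000000000 becomes 00:00:00:00:00:00 and 0x0a430007 becomes 10.67.0.7.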
Generally, you'll want to search the web and find out about the following:
Device name, manufacturer, and device sysobjectid.
Understand what the device does, and what it is used for. (Which metrics are relevant varies between routers, switches, bridges, etc. See Networking hardware.)
E.g. from the HP iLO Wikipedia page, we can see that iLO4 devices are used by system administrators for remote management of embedded servers.
Available versions of the device, and which ones we target.
E.g. HP iLO devices exist in multiple versions (version 3, version 4...). Here, we are specifically targeting HP iLO4.
Supported MIBs and OIDs (often available in official documentation), and associated MIB files.
E.g. we can see that HP provides a MIB package for iLO devices here.
Now that we have gathered some basic information about the device and its SNMP interfaces, we should decide which metrics we want to collect. (Devices often expose thousands of metrics through SNMP. We certainly don't want to collect them all.)
Devices typically expose thousands of OIDs that can span dozens of MIBs, so this can feel daunting at first. Remember, never give up!
Some guidelines to help you in this process:
10-40 metrics is a good amount already.
Explore base profiles to see which ones could be applicable to the device.
Explore manufacturer-specific MIB files looking for metrics such as:
General health: status gauges...
Network traffic: bytes in/out, errors in/out, ...
CPU and memory usage.
Temperature: temperature sensors, thermal condition, ...
sysobjectid can also be a wildcard pattern to match a sub-tree of devices, e.g. 1.3.6.1.131.12.4.*.
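The wildcard semantics can be approximated with Python's fnmatch module. This is only a sketch of the idea of matching a sub-tree; the helper name and sample OIDs are hypothetical, and the integration's own matching rules may differ in detail.

```python
from fnmatch import fnmatchcase


def matches_profile(device_sysobjectid, pattern):
    # The trailing * stands for any suffix under the given OID prefix
    return fnmatchcase(device_sysobjectid, pattern)


pattern = '1.3.6.1.131.12.4.*'
in_subtree = matches_profile('1.3.6.1.131.12.4.7', pattern)
outside_subtree = matches_profile('1.3.6.1.131.12.5.7', pattern)
```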
"},{"location":"tutorials/snmp/profiles/#generate-a-profile-file-from-a-collection-of-mibs","title":"Generate a profile file from a collection of MIBs","text":"
You can use ddev to create a profile from a list of MIBs.
$ ddev meta snmp generate-profile-from-mibs --help\n
This script requires a list of ASN.1 MIB files as input arguments, and copies to the clipboard a list of metrics that can be used to create a profile.
Will include system, interfaces and ip nodes from RFC1213-MIB, no node from CISCO-SYSLOG-MIB, and node snmpEngine from SNMP-FRAMEWORK-MIB.
Note that each MIB:node_name corresponds to exactly one OID. However, some MIBs report legacy nodes that are overwritten.
To resolve this, edit the MIB files by removing legacy values manually before loading them with this profile generator. If a MIB is fully supported, it can be omitted from the filter, as MIBs not found in the filter are fully loaded. If a MIB is not fully supported, it can be listed with an empty node list, like CISCO-SYSLOG-MIB in the example.
-a, --aliases is an option to provide the path to a YAML file containing a list of aliases to be used as metric tags for tables, in the following format:
MIB tables usually define one or more indexes, either as columns within the same table or as columns from a different table or even a different MIB. The index value can be used to tag the table's metrics. This is defined in the INDEX field of row nodes.
As an example, entPhysicalContainsTable in ENTITY-MIB is as follows:
entPhysicalContainsEntry OBJECT-TYPE\nSYNTAX EntPhysicalContainsEntry\nMAX-ACCESS not-accessible\nSTATUS current\nDESCRIPTION\n \"A single container/'containee' relationship.\"\nINDEX { entPhysicalIndex, entPhysicalChildIndex } <== this is the index definition\n::= { entPhysicalContainsTable 1 }\n
or its JSON dump, where INDEX is replaced by indices:
Indexes can be replaced by another MIB symbol that is more human friendly. For example, you might prefer to see the interface name rather than its numerical table index. This can be achieved using metric_tag_aliases.
"},{"location":"tutorials/snmp/profiles/#add-unit-tests","title":"Add unit tests","text":"
Add a unit test in test_profiles.py to verify that the metric is successfully collected by the integration when the profile is enabled. (These unit tests are mostly used to prevent regressions and will help with maintenance.)
"},{"location":"tutorials/snmp/profiles/#rinse-and-repeat","title":"Rinse and repeat","text":"
We have now covered the basic workflow \u2014 add metrics, expand tests, add simulation data. You can now go ahead and add more metrics to the profile!
Congratulations! You should now be able to write a basic SNMP profile.
We kept this tutorial as simple as possible, but profiles offer many more options to collect metrics from SNMP devices.
To learn more about what can be done in profiles, read the Profile format reference.
To learn more about .snmprec files, see the Simulation data format reference.
"},{"location":"tutorials/snmp/sim-format/","title":"Simulation Data Format Reference","text":""},{"location":"tutorials/snmp/sim-format/#conventions","title":"Conventions","text":"
Simulation data for profiles is contained in .snmprec files located in the tests directory.
Simulation files must be named after the SNMP community string used in the profile unit tests. For example: cisco-nexus.snmprec.
Adding simulation data for tables can be particularly tedious. This section documents the manual process, but automatic generation is possible \u2014 see How to generate table simulation data.
For table metrics, add one copy of the metric per row, appending the index to the OID.
For example, to simulate 3 rows in the table 1.3.6.1.4.1.6.13 that has OIDs 1.3.6.1.4.1.6.13.1.6 and 1.3.6.1.4.1.6.13.1.8, you could write:
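As an illustration, the rows can also be produced with a short Python sketch. It assumes the usual .snmprec line layout of OID, type tag, and value separated by pipes, with tag 2 standing for an integer; the row indexes 1 to 3 and the values are made up for the example.

```python
def table_rows(column_oids, indexes, tag, value_for):
    # One line per column and per row: the row index is appended to the column OID
    lines = []
    for oid in column_oids:
        for index in indexes:
            lines.append('{}.{}|{}|{}'.format(oid, index, tag, value_for(oid, index)))
    return lines


columns = ['1.3.6.1.4.1.6.13.1.6', '1.3.6.1.4.1.6.13.1.8']
# Hypothetical integer values: ten times the row index
rows = table_rows(columns, [1, 2, 3], 2, lambda oid, index: index * 10)
print('\n'.join(rows))
```

Keeping the lines sorted by OID, as this sketch does for the two columns, keeps the file easy to compare against a real snmpwalk of the device.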
If the table uses table metric tags, you may need to add additional OID simulation data for those tags.
"},{"location":"tutorials/snmp/tools/","title":"Tools","text":""},{"location":"tutorials/snmp/tools/#using-tcpdump-with-snmp","title":"Using tcpdump with SNMP","text":"
The tcpdump command shows the exact request and response content of SNMP GET, GETNEXT and other SNMP calls.
In a shell run tcpdump:
tcpdump -vv -nni lo0 -T snmp host localhost and port 161\n
-nn: turn off host and protocol name resolution (to avoid generating DNS packets)
-i INTERFACE: listen on INTERFACE (default: lowest numbered interface)
-T snmp: type/protocol, snmp in our case
In a separate shell, run snmpwalk or snmpget:
snmpwalk -O n -v2c -c <COMMUNITY_STRING> localhost:1161 1.3.6\n
After you've run snmpwalk, you'll see results like this from tcpdump:
tcpdump -vv -nni lo0 -T snmp host localhost and port 161\ntcpdump: listening on lo0, link-type NULL (BSD loopback), capture size 262144 bytes\n17:25:43.639639 IP (tos 0x0, ttl 64, id 29570, offset 0, flags [none], proto UDP (17), length 76, bad cksum 0 (->91d)!)\n 127.0.0.1.59540 > 127.0.0.1.1161: { SNMPv2c C=\"cisco-nexus\" { GetRequest(28) R=1921760388 .1.3.6.1.2.1.1.2.0 } }\n17:25:43.645088 IP (tos 0x0, ttl 64, id 26543, offset 0, flags [none], proto UDP (17), length 88, bad cksum 0 (->14e4)!)\n 127.0.0.1.1161 > 127.0.0.1.59540: { SNMPv2c C=\"cisco-nexus\" { GetResponse(40) R=1921760388 .1.3.6.1.2.1.1.2.0=.1.3.6.1.4.1.9.12.3.1.3.1.2 } }\n
"},{"location":"tutorials/snmp/tools/#from-the-docker-agent-container","title":"From the Docker Agent container","text":"
If you want to run snmpget, snmpwalk, and tcpdump from the Docker Agent container, you can install them by running the following commands (in the container):