Terraform module for creating Azure Data Factory and its components
Currently, this module can provision Data Factory Studio, an Integration Runtime within a managed network, Diagnostic Settings, and managed private endpoints.
data "azurerm_databricks_workspace" "example" {
  name                = "example-adb-workspace"
  resource_group_name = "example-rg"
}

data "azurerm_log_analytics_workspace" "example" {
  name                = "example-law-workspace"
  resource_group_name = "example-rg"
}

module "data_factory" {
  source = "data-platform-hq/data-factory/azurerm"

  project        = "datahq"
  env            = "example"
  location       = "eastus"
  resource_group = "example-rg"
  key_vault_name = "example-key-vault"

  # Target Log Analytics Workspace used by Diagnostic Settings for log/metrics storage
  log_analytics_workspace = {
    (data.azurerm_log_analytics_workspace.example.name) = data.azurerm_log_analytics_workspace.example.id
  }

  # Set of objects with parameters to create managed private endpoints in the Integration Runtime managed network.
  managed_private_endpoint = [{
    name               = "adb"
    target_resource_id = data.azurerm_databricks_workspace.example.id
    subresource_name   = "databricks_ui_api"
  }]
}
Be aware that the private endpoint connection is created in a Pending state and requires manual approval.
To finish the configuration, open the Azure Databricks service (or whichever Azure service you connect to) and select "Networking" in the Settings section. Switch to the "Private endpoint connections" tab, select the newly created connection (it should be in the Pending state), and press the "Approve" button.
If your deployment creates multiple managed private endpoints for different Azure services, you must approve all of them.
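
As an illustration of the multi-endpoint case, the sketch below passes two entries to `managed_private_endpoint`. It reuses the Databricks data source from the usage example; the storage account data source, its name, and the `blob` subresource are assumptions added for illustration and are not part of this module's documentation. Each resulting connection would still need to be approved on its own target resource.

```hcl
# Hypothetical second target: an existing storage account (name is a placeholder).
data "azurerm_storage_account" "example" {
  name                = "examplestorageacct"
  resource_group_name = "example-rg"
}

module "data_factory" {
  source = "data-platform-hq/data-factory/azurerm"

  project        = "datahq"
  env            = "example"
  location       = "eastus"
  resource_group = "example-rg"

  # One object per managed private endpoint; every connection starts as Pending.
  managed_private_endpoint = [
    {
      name               = "adb"
      target_resource_id = data.azurerm_databricks_workspace.example.id
      subresource_name   = "databricks_ui_api"
    },
    {
      name               = "storage-blob" # hypothetical endpoint name
      target_resource_id = data.azurerm_storage_account.example.id
      subresource_name   = "blob" # assumed subresource for Blob storage
    },
  ]
}
```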
Name | Version |
---|---|
terraform | >= 1.0.0 |
azurerm | >= 4.0.1 |
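
A root configuration that satisfies these version constraints might look like the following minimal sketch. The constraints are taken from the table above; the provider block is an assumption about a typical setup rather than something this module prescribes.

```hcl
terraform {
  required_version = ">= 1.0.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">= 4.0.1"
    }
  }
}

provider "azurerm" {
  # The features block is mandatory, even when empty.
  features {}

  # Assumption: the subscription is supplied through the ARM_SUBSCRIPTION_ID
  # environment variable; alternatively set subscription_id here.
}
```
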
Name | Version |
---|---|
azurerm | >= 4.0.1 |
No modules.
Name | Type |
---|---|
azurerm_data_factory.this | resource |
azurerm_data_factory_integration_runtime_azure.auto_resolve | resource |
azurerm_data_factory_integration_runtime_self_hosted.this | resource |
azurerm_data_factory_managed_private_endpoint.this | resource |
azurerm_monitor_diagnostic_setting.this | resource |
azurerm_role_assignment.data_factory | resource |
azurerm_monitor_diagnostic_categories.this | data source |
Name | Description | Type | Default | Required |
---|---|---|---|---|
analytics_destination_type | Log Analytics destination type | string | "Dedicated" | no |
cleanup_enabled | If set to false, the cluster is not recycled and is reused by the next data flow activity run until the TTL (time to live) is reached | bool | true | no |
compute_type | Compute type of the cluster that executes the data flow job: General, ComputeOptimized, or MemoryOptimized | string | "General" | no |
core_count | Core count of the cluster that executes the data flow job: 8, 16, 32, 48, 144, or 272 | number | 8 | no |
custom_adf_name | Specifies the name of the Data Factory | string | null | no |
custom_default_ir_name | Specifies the name of the Managed Integration Runtime | string | null | no |
custom_diagnostics_name | Specifies the name of the Diagnostic Settings that monitor ADF | string | null | no |
custom_shir_name | Specifies the name of the Self Hosted Integration Runtime | string | null | no |
env | Environment name | string | n/a | yes |
global_parameter | Configuration of Data Factory global parameters | list(object({ | [] | no |
location | Azure location | string | n/a | yes |
log_analytics_workspace | Log Analytics Workspace name to ID map | map(string) | {} | no |
managed_private_endpoint | The ID and subresource name of the Private Link enabled remote resource that this Data Factory private endpoint should be connected to | set(object({ | [] | no |
managed_virtual_network_enabled | Is Managed Virtual Network enabled? | bool | true | no |
permissions | Data Factory permission map | list(map(string)) | [ | no |
project | Project name | string | n/a | yes |
public_network_enabled | Is the Data Factory visible to the public network? | bool | false | no |
resource_group | The name of the resource group in which to create the Data Factory | string | n/a | yes |
self_hosted_integration_runtime_enabled | Whether to create a Self Hosted Integration Runtime | bool | false | no |
tags | A mapping of tags to assign to the resource | map(any) | {} | no |
time_to_live_min | TTL for the Integration Runtime | string | 15 | no |
virtual_network_enabled | Managed Virtual Network for the Integration Runtime | bool | true | no |
vsts_configuration | Code storage configuration map | map(string) | {} | no |
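
To show how the optional inputs combine, here is a minimal sketch that overrides the data flow cluster settings and enables the self-hosted runtime. Input names and allowed values come from the table above; the concrete values and the tag are illustrative assumptions only.

```hcl
module "data_factory" {
  source = "data-platform-hq/data-factory/azurerm"

  # Inputs marked as required in the table
  project        = "datahq"
  env            = "example"
  location       = "eastus"
  resource_group = "example-rg"

  # Data flow cluster settings (illustrative values)
  compute_type     = "MemoryOptimized"
  core_count       = 16
  time_to_live_min = 30
  cleanup_enabled  = false # reuse the cluster until the TTL is reached

  # Additionally create a Self Hosted Integration Runtime
  self_hosted_integration_runtime_enabled = true

  tags = {
    owner = "data-team" # hypothetical tag
  }
}
```
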
Name | Description |
---|---|
default_integration_runtime_name | Data Factory Default Integration Runtime Name |
id | Data Factory ID |
identity | Data Factory Managed Identity |
name | Data Factory Name |
self_hosted_integration_runtime_key | Self hosted integration runtime primary authorization key |
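
The outputs can be consumed from the calling configuration as in the sketch below. It assumes the module is instantiated as `module "data_factory"`, as in the usage example; the root output names are hypothetical. The authorization key is marked sensitive so that Terraform keeps it out of plain CLI output.

```hcl
output "data_factory_id" {
  value = module.data_factory.id
}

output "data_factory_name" {
  value = module.data_factory.name
}

output "shir_auth_key" {
  # Primary authorization key used to register a self-hosted runtime node
  value     = module.data_factory.self_hosted_integration_runtime_key
  sensitive = true
}
```
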
Apache 2 Licensed. For more information, please see LICENSE.