Databricks table scan for purview capability + ARM templates for ADF deployment #71

Draft
wants to merge 21 commits into
base: main

Conversation

abdale
Contributor

@abdale abdale commented Feb 23, 2021

This PR includes:

  • Databricks notebook to scan tables and push to Purview
  • ARM templates for an ADF pipeline that orchestrates this Databricks notebook (some template values still need to be filled in)

@abdale abdale requested a review from marvinbuss as a code owner February 23, 2021 19:48
@abdale abdale self-assigned this Feb 24, 2021
@marvinbuss marvinbuss added the enhancement New feature or request label Feb 25, 2021
@marvinbuss marvinbuss marked this pull request as draft February 26, 2021 17:33
@abdale
Contributor Author

abdale commented Mar 2, 2021

@marvinbuss all schema validation checks now pass.

A few things that need your input/support:

(1) The new data factory has these seven parameters:

  1. tenantId
  2. purviewClientId
  3. purviewAccountName
  4. dataLandingZoneName
  5. databricksWorkspaceUrl
  6. purviewSecretPath
  7. databricksAccessToken

The config and YAML files need to be updated so that they supply these parameters.
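As a rough illustration of what supplying these parameters could look like, here is a minimal Python sketch that assembles an ARM deployment parameters file from the seven names listed above. The helper name `build_parameters_file` and the placeholder values are hypothetical and not part of this PR. Secrets such as databricksAccessToken would normally come from Key Vault rather than being written as plain values.

```python
import json

# The seven parameter names listed in the comment above.
REQUIRED_PARAMETERS = [
    "tenantId",
    "purviewClientId",
    "purviewAccountName",
    "dataLandingZoneName",
    "databricksWorkspaceUrl",
    "purviewSecretPath",
    "databricksAccessToken",
]


def build_parameters_file(values):
    """Return an ARM parameters document (hypothetical helper, not from the PR)."""
    # Fail early if any of the seven parameters is absent or empty.
    missing = [name for name in REQUIRED_PARAMETERS if not values.get(name)]
    if missing:
        raise ValueError("Missing parameter values: " + ", ".join(missing))
    return {
        "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
        "contentVersion": "1.0.0.0",
        "parameters": {name: {"value": values[name]} for name in REQUIRED_PARAMETERS},
    }


if __name__ == "__main__":
    # Placeholder values only; real values depend on the target environment.
    example = {name: "<" + name + ">" for name in REQUIRED_PARAMETERS}
    print(json.dumps(build_parameters_file(example), indent=2))
```

Failing on missing values keeps a deployment from starting with a half-filled parameter set.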

(2) The Key Vault that stores the secrets needs to be configured.

(3) The path to the Databricks notebook needs to be updated in the deployment template (line 278).

(4) A service principal with read access to Purview is needed; it will be used for parameters 2 and 6 (purviewClientId and purviewSecretPath).
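For context on item (4), a service principal would typically obtain a token through the Azure AD client credentials flow. The sketch below only builds the request and does not send it. The function name `build_token_request` is hypothetical, and it assumes the client secret has already been read from wherever purviewSecretPath points. The notebook in this PR may authenticate differently.

```python
from urllib.parse import urlencode


def build_token_request(tenant_id, client_id, client_secret):
    """Build the URL and form body for a client credentials token request.

    Hypothetical helper for illustration; nothing is sent over the network.
    """
    url = "https://login.microsoftonline.com/" + tenant_id + "/oauth2/v2.0/token"
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,          # corresponds to purviewClientId
        "client_secret": client_secret,  # assumed to be resolved via purviewSecretPath
        "scope": "https://purview.azure.net/.default",
    }
    # The token endpoint expects application/x-www-form-urlencoded data.
    return url, urlencode(form)


if __name__ == "__main__":
    url, body = build_token_request("<tenantId>", "<purviewClientId>", "<secret>")
    print(url)
    print(body)
```

Posting the body to the URL with any HTTP client should return a JSON document containing an access token if the service principal is configured correctly.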

@renepajta
Member

FYI: the Hive connector is in preview and also supports the Databricks metastore. https://docs.microsoft.com/en-us/azure/purview/register-scan-hive-metastore-source

@marvinbuss
Collaborator

@renepajta That is why this PR was not merged. We will probably need to work on automating this fairly soon.
