There are many ways to create interoperability between SAS Viya and Databricks. While some of these are explicitly documented as product features, many interoperability patterns worth considering are not explicitly identified in the documentation. This repo provides examples of the available options and identifies important feature and version considerations.
These examples were developed in the following environments:
While reference code is provided as SAS files and Jupyter notebooks, the interoperability methods can all be inspected from Databricks notebooks. The notebooks are numbered, so working through them in order is one optional way to cover all the material.
SAS Interoperability
cloud | method / interface | description | Notebooks |
---|---|---|---|
az | SASPy | Python API for SAS, maintained by SAS | 01-SASPy_SETUP.py 01-SASPy.py |
az | SAS/ACCESS Interface to JDBC | Library and Dataset methods using JDBC | 02-SAS-ACCESS_JDBC.py |
az | SAS/ACCESS Interface to ODBC | Library and Dataset methods using ODBC | 03-SAS-ACCESS_ODBC.py |
az | SAS/ACCESS Interface to Spark | Library and Dataset methods using ODBC | 04-SAS-ACCESS_Spark.py |
az | Azure Access Method Databricks External Location | Create a single shared data source in ADLS accessed directly by SAS & Databricks | 05-External_Location_ADLS.py |
az | Delimited Files | Use shared ADLS location to read/write text files | 06-Text_FIles_ADLS.py |
az | ORC Engine | Use shared ADLS location to read/write ORC files | 07-ORC_ADLS.py |
az | Parquet Engine | Use shared ADLS location to read/write Parquet files | 08-Parquet_ADLS.py |
az | Export sas7bdat spark-sas7bdat pandas.read_sas | Use shared ADLS location to read/write SAS files | 09-sas7bdat_ADLS.py |
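The ADLS-based methods in the table all depend on SAS and Databricks pointing at the same storage path. As a minimal sketch, assuming the standard `abfss://` URI scheme used for ADLS Gen2, a small helper can keep that path in one place. The function name, container, storage account, and folder below are hypothetical placeholders, not values from this repo.

```python
def abfss_uri(container: str, storage_account: str, path: str = "") -> str:
    """Build an ADLS Gen2 URI: abfss://<container>@<account>.dfs.core.windows.net/<path>."""
    base = f"abfss://{container}@{storage_account}.dfs.core.windows.net"
    # Drop leading/trailing slashes so the pieces join with exactly one separator.
    clean = path.strip("/")
    return f"{base}/{clean}" if clean else base


# Hypothetical names for illustration only.
shared_parquet = abfss_uri("shared", "examplestorageacct", "/interop/parquet/cars")
print(shared_parquet)
```

In a Databricks notebook, the resulting string could be passed to a reader such as `spark.read.parquet(...)`, while the SAS side would reference the same container and folder through its own ADLS configuration.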
Because SAS has many deployment patterns and documentation sources, it is important to use the documentation specific to your deployment. These interoperability examples were developed using an Azure SAS Viya Pay-As-You-Go managed application. As of 19 APR 2023, the managed application used SAS Viya version 2023.01, so the documentation versions used during the development of these interoperability patterns are:
- SAS® Viya® Platform Administration (2023.01)
- SAS® Viya® Platform Operations (2023.01)
- Programming Documentation for the SAS® Viya® Platform (2023.01)
- SAS Embedded Process for Spark Action (Databricks)
- Provide examples that leverage S3 instead of ADLS
- Databricks SQL Connector - included for consideration since the SAS Viya PAYG environment includes Jupyter and SAS allows for PROC PYTHON
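
For the Databricks SQL Connector item, the following is a rough sketch of what a query from Jupyter or PROC PYTHON might look like, assuming the `databricks-sql-connector` package is installed. The environment variable names and both helper functions are assumptions made for illustration; they are not part of this repo.

```python
import os
from typing import Mapping


def connection_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect connector settings from environment variables (variable names are an assumption)."""
    return {
        "server_hostname": env["DATABRICKS_SERVER_HOSTNAME"],
        "http_path": env["DATABRICKS_HTTP_PATH"],
        "access_token": env["DATABRICKS_TOKEN"],
    }


def run_query(query: str, settings: dict) -> list:
    """Run one SQL statement against Databricks and return all rows."""
    # Imported lazily so connection_settings() works without the package installed.
    from databricks import sql  # pip install databricks-sql-connector

    with sql.connect(**settings) as connection:
        with connection.cursor() as cursor:
            cursor.execute(query)
            return cursor.fetchall()
```

With the three variables set, a call such as `run_query("SELECT 1", connection_settings())` would return the rows from the SQL warehouse or cluster identified by the HTTP path.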