Scroll ignore | ||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ||||||||||||||||
About Collectors
Insert excerpt | ||||||
---|---|---|---|---|---|---|
|
...
Setting up Azure Synapse for metadata extraction using a service principal (Application).
Step 1: Registing an Entra App for KADA Azure Synapse Collector Application
Create an Entra Application for the kada synapse collector or reusing an existing kada application in Entra. TBD (Same steps as setup for power bi / AD integration. https://docs.kada.ai/home/powerbi#PowerBI-Step2)RegisteringPowerBIAppinAzureAD
Generate a secret for the Entra Application and note down the application id, tenant id.
Step 2: Establish Azure Synapse Access
Note |
---|
Apply in MASTER using an Azure Synapse Admin user |
Code Block |
---|
CREATE USER [<SERVICE_PRINCIPAL_<ENTRA APPLICATION NAME>] FROM EXTERNAL PROVIDER; |
Note |
---|
Apply per database in scope for metadata collection. |
Code Block |
---|
CREATE USER [<SERVICE_PRINCIPAL_<ENTRA APPLICATION NAME>] FROM EXTERNAL PROVIDER; GRANT VIEW DEFINITION TO [<SERVICE_PRINCIPAL_<ENTRA APPLICATION NAME>]; GRANT VIEW DATABASE STATE TO [<SERVICE_PRINCIPAL_<ENTRA APPLICATION NAME>]; |
The following table should also be available to SELECT by the user created in each database
INFORMATION_SCHEMA.ROUTINES
INFORMATION_SCHEMA.VIEWS
INFORMATION_SCHEMA.TABLE_CONSTRAINTS
INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE
INFORMATION_SCHEMA.TABLES
INFORMATION_SCHEMA.COLUMNS
INFORMATION_SCHEMA.VIEWS
sys.foreign_key_columns
sys.objects
sys.tables
sys.schemas
sys.columns
sys.databases
Note |
---|
Synapse has the concept of a serverless and dedicated sql pool per workspace. If you use both serverless and dedicated sql pools in a workspace you will need to onboard each as a separate source in K. For serverless sql pools the master database can’t be selected for metadata extraction. To onboard multiple Synapse workspace each workspace will need to be onboarded as a new source in K. |
...
Step 1: Create the Source in K
...
FIELD | FIELD TYPE | DESCRIPTION | EXAMPLE |
---|---|---|---|
client | string | Onboarded client in Azure to access Azure Synapse |
|
secret | string | Onboarded client secret in Azure to access Azure Synapse |
|
tenant | string | Tenant ID of where Azure Synapse exists | |
server | string | Azure Synapse server. If using a custom port append with comma | “<workspace-name>.sql.azuresynapse.net,1433” |
host | string | The onboarded host value in K, generally this would be the same as the server value, depending on what you onboard it as. | “<workspace-name>.sql.azuresynapse.net,1433” |
database_name | string | The name of the database that will be used to test the connection | master Note for serverlesss. sqlpools master is not accessible. |
databases | list<string> | A list of databases to extract from SQLServer Azure | [“dwh”, “adw”] |
driver | string | This is the ODBC driver, generally its ODBC Driver 17 for SQL Server, if you another driver installed please use that instead. | “ODBC Driver 17 for SQL Server” |
meta_only | boolean | Do you want to extract metadata only without enabling extended events? We currently only support true | true |
output_path | string | Absolute path to the output location where files are to be written | “/tmp/output” |
mask | boolean | To enable masking or not | true |
compress | boolean | To gzip the output or not | true |
connection_timeout | integer | Timeout in seconds allowed against Synapse Sql Pool connection, this is defaulted as 30 | 30 |
...
Code Block | ||
---|---|---|
| ||
{
"client": "",
"secret": "",
"tenant": "",
"server": "",
"host": "",
"driver": "ODBC Driver 17 for SQL Server",
"database_name": "master",
"databases": [""],
"output_path": "/tmp/output",
"mask": true,
"compress": true,
"meta_only": true,
"connection_timeout": 30
} |
...