
This page walks through the setup of Azure Data Factory in K.


Step 1) Enabling Azure Data Factory Admin APIs to be accessible to an AD Group

This step is performed by the Azure Data Factory Admin

  • Under Azure services click on Data factories

  • Locate the Data Factory that you would like to connect to K

  • Click on Overview and copy the following details for use in a later step (see the sketch after this list):

    • Factory name

    • Resource group name

    • Subscription ID
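
These three values together identify your Data Factory to the Azure Resource Manager REST API. As a minimal sketch (Python, with illustrative placeholder values only), they compose into the factory's resource URL like so:

    # Compose the Azure Resource Manager URL for a Data Factory.
    # The placeholder values are illustrative -- substitute the values
    # copied from your Data Factory's Overview page.
    subscription_id = "00000000-0000-0000-0000-000000000000"
    resource_group = "my-resource-group"
    factory_name = "my-data-factory"

    factory_url = (
        "https://management.azure.com"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        "/providers/Microsoft.DataFactory"
        f"/factories/{factory_name}"
    )
    print(factory_url)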


Step 2) Registering Azure Data Factory App in Azure AD

This step is performed by the Azure AD Admin

  • Log in to your company’s Azure Portal and open the Azure Active Directory page

  • Select App registrations in the side panel and click New registration

  • Complete the registration form

    • Name: Enter a name for the integration e.g. KADA Azure Data Factory API Integration

    • Supported account types: Select Accounts in this organizational directory only

    • Redirect URL: Add Web / https://www.kada.ai

  • Click Register to complete the registration

  • Click on the newly created KADA Azure Data Factory API Integration App

  • Save the Application (client) ID and Directory (tenant) ID for use in a later step

  • Click on Endpoints and save the URL for OpenID Connect metadata document for use in a later step

  • Select Certificates & secrets in the side panel and click New client secret

  • Complete the new secret form and save the Secret Value for use in a later step

Make sure you send all of the information from Step 1 and Step 2 to the K Admin so that they can complete Step 4. A sketch for optionally verifying these credentials follows the list.

  • Factory name

  • Resource group name

  • Subscription ID

  • Application (client) ID

  • Directory (tenant) ID

  • Secret Value
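
Before handing these details over, the Azure AD Admin can optionally verify that the App's credentials work by requesting a token through the OAuth 2.0 client credentials flow. A minimal sketch in Python, assuming the requests library is installed and using illustrative placeholder values:

    import requests

    # Illustrative placeholders -- substitute the values saved in Step 2.
    tenant_id = "00000000-0000-0000-0000-000000000000"
    client_id = "11111111-1111-1111-1111-111111111111"
    client_secret = "<secret value>"

    # OAuth 2.0 client credentials flow against the Azure AD v2.0 token
    # endpoint, scoped to the Azure Resource Manager API.
    token_url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    response = requests.post(
        token_url,
        data={
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": "https://management.azure.com/.default",
        },
    )
    response.raise_for_status()
    access_token = response.json()["access_token"]
    print("Token acquired successfully")

A successful response confirms the Tenant ID, Client ID and Secret Value are valid; reading the Data Factory itself also requires the role assignment in Step 3.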


Step 3) Update your Azure Data Factory access control

This step is performed by the Azure Data Factory Admin

To ensure your Azure Data Factory can connect to K, you will need to grant the App registered in Step 2 the correct Role Assignment (see the verification sketch after the steps below)

  • Click on Access control (IAM) in the panel and click Add

  • Select the Data Factory Contributor role and assign it to the App registered in Step 2
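
To confirm the role assignment has taken effect, you can optionally list the factory's pipelines using a token acquired as in the Step 2 sketch. A minimal sketch in Python with illustrative placeholder values (2018-06-01 is the commonly used Data Factory REST API version):

    import requests

    # Illustrative placeholders -- use the values from Step 1 and a token
    # acquired via the client credentials flow sketched in Step 2.
    subscription_id = "00000000-0000-0000-0000-000000000000"
    resource_group = "my-resource-group"
    factory_name = "my-data-factory"
    access_token = "<access token from the Step 2 sketch>"

    # A 200 response confirms the App can read the factory, i.e. the
    # Data Factory Contributor assignment has propagated.
    pipelines_url = (
        "https://management.azure.com"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.DataFactory/factories/{factory_name}"
        "/pipelines?api-version=2018-06-01"
    )
    response = requests.get(
        pipelines_url,
        headers={"Authorization": f"Bearer {access_token}"},
    )
    response.raise_for_status()
    print(f"OK -- {len(response.json().get('value', []))} pipelines visible")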


Step 4) Add Azure Data Factory as a New Source

This step is performed by the K Admin

  • Select Platform Settings in the side bar

  • In the pop-out side panel, under Integrations click on Sources

  • Click Add Source and select AZURE_DATA_FACTORY

  • Select Direct Connect, add your Azure Data Factory details, and click Next

  • Fill in the Source Settings and click Save & Next

    • Name: Give the Azure Data Factory source a name in K. If you have multiple ADFs, each one will need to have a unique name

    • Host: Enter the URL, e.g. adf.azure.com

    • Timeout: The default is 10, but the API can sometimes take longer to respond, so we recommend increasing it to 20

    • Add the Host / Database Mapping details. See Host / Database Mapping for more details

    • Select Enable Workspace Filtering if you wish to load only select Workspaces

  • Add Connection Details and click Save & Next

    • Tenant ID: Add the Directory (tenant) ID copied from step 2

    • Client ID: Add the Application (client) ID copied from Step 2

    • Client Secret: Add the Secret Value copied from Step 2

  • Test your connection and click Next

  • If you selected Enable Workspace Filtering, select the Workspaces you want to load. If you have a lot of Workspaces, the list may take some time to load.

  • Click Finish Setup


Step 5) Schedule Azure Data Factory source load

  • Select Platform Settings in the side bar

  • In the pop-out side panel, under Integrations click on Sources

  • Locate your new Azure Data Factory Source and click on the Schedule Settings (clock) icon to set the schedule

Note that schedule changes can take up to 15 minutes to propagate.


Step 6) Manually run an ad hoc load to test Azure Data Factory

  • Next to your new Source, click on the Run manual load icon

    Confirm how you want the source to be loaded

  • After the source load is triggered, a pop-up bar will appear that takes you to the Monitor tab in the Batch Manager page, where you can view the progress of source loads

A manual source load will also require a manual run of the following jobs to load all metrics and indexes with the manually loaded metadata. Both can be found in the Batch Manager page:

  • DAILY

  • GATHER_METRICS_AND_STATS

 

Troubleshooting failed loads

  • If the job failed at the extraction step

    • Check the error. Contact KADA Support if required.

    • Rerun the source job

  • If the job failed at the load step, the failed directory in the landing folder will contain the file with issues.

    • Find the bad record and fix the file

    • Rerun the source job
