
[M] DBT Core (Self Hosted DBT)

DBT Core is the self-managed version of DBT. Artifacts generated by DBT Core need to be renamed to include their project name and then pushed to the K landing directory.

Files to be loaded

DBT generates manifest, catalog and run_results files that can be used to load metadata and logs into K.

The following are the naming conventions for these dbt artefacts.

The project name can be found in the dbt_project.yml file.

<project_name>_manifest.json

generated by dbt run / compile

<project_name>_catalog.json

generated by dbt docs generate

<project_name>_run_results_YYYYMMDDHHmmss.json

generated from dbt run

Optional, but required if you want usage information in KADA.

mapping.json

manually created for your environment.

See Generating the mapping.json
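The naming conventions above can be applied with a small script. The following is a minimal sketch, not a KADA-provided tool. It assumes the artifacts sit in dbt's default target directory under their default names (manifest.json, catalog.json, run_results.json), and that the timestamp is taken in UTC at staging time. The function name stage_dbt_artifacts and the staging directory are hypothetical.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path


def stage_dbt_artifacts(target_dir, staging_dir, project_name, now=None):
    """Copy dbt artifacts into staging_dir using the naming convention above.

    Returns the list of staged file paths. Artifacts that are missing from
    target_dir are skipped, since not every dbt command produces all of them.
    """
    target = Path(target_dir)
    staging = Path(staging_dir)
    staging.mkdir(parents=True, exist_ok=True)

    # Assumption: UTC time at staging, formatted as YYYYMMDDHHmmss.
    stamp = (now or datetime.now(timezone.utc)).strftime("%Y%m%d%H%M%S")

    renames = {
        "manifest.json": f"{project_name}_manifest.json",
        "catalog.json": f"{project_name}_catalog.json",
        "run_results.json": f"{project_name}_run_results_{stamp}.json",
    }

    staged = []
    for source_name, landing_name in renames.items():
        source = target / source_name
        if not source.exists():
            continue  # this artifact was not generated by the last dbt command
        destination = staging / landing_name
        shutil.copy2(source, destination)  # copy, leaving dbt's own files intact
        staged.append(destination)
    return staged
```

For example, stage_dbt_artifacts("target", "kada_staging", "raw") would leave the renamed copies in kada_staging, ready to be sent to the landing directory. Copying rather than renaming in place keeps the original files where dbt expects them.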

Generating the mapping.json

A mapping file is required to map dbt projects to KADA sources. For example:

{ "raw": "af33141.australia-east.azure", "transformed": "af33141.australia-east.azure" }


The file is a JSON payload containing key-value pairs:

  • key - dbt project name.

  • value - host of the database matching the KADA onboarded source’s host.
    This can be found in Platform Admin > Sources > Edit source > See host name.

 

You can find the project name as the value of the name key in the dbt_project.yml file.

 

Update the mapping file when you create a new project in dbt.
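Because the mapping file is maintained by hand, it can help to generate it from a script so that it is always valid JSON. The following is a minimal sketch under that assumption. The function name write_mapping is hypothetical, and the project names and host values are the ones from the example above, not values for your environment.

```python
import json
from pathlib import Path


def write_mapping(mapping, output_path="mapping.json"):
    """Validate a {dbt project name: source host} dict and write it as JSON."""
    for project, host in mapping.items():
        if not isinstance(project, str) or not project.strip():
            raise ValueError(f"Invalid dbt project name: {project!r}")
        if not isinstance(host, str) or not host.strip():
            raise ValueError(f"Missing host for dbt project {project!r}")

    path = Path(output_path)
    path.write_text(json.dumps(mapping, indent=2) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    # Keys are dbt project names; values are the hosts of the onboarded sources.
    write_mapping(
        {
            "raw": "af33141.australia-east.azure",
            "transformed": "af33141.australia-east.azure",
        }
    )
```

When a new dbt project is created, add its entry to the dictionary and regenerate the file before uploading it to the landing directory.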

Onboarding dbt source in KADA

  1. Create a DBT source by going to Platform Settings > Integration / Sources

    Click Add Source and choose Dbt

     

  2. Select Load DBT File

  3. Fill in the following details about your DBT instance

    1. Name: Give your DBT source a name

    2. Host: Add the host name of your DBT server

    3. DBT Server: Same as the host name of your DBT server

    4. Ignore DBT Cloud Account ID


       

  4. Click Finish Setup

     

  5. Obtain credentials for the KADA Platform landing directory by contacting KADA Support for keys

  6. Manually create the mapping.json file and upload it to the landing directory.

  7. Configure the orchestration tool (e.g. Airflow) that executes dbt to:

    1. Rename the files as per the naming conventions in Files to be loaded

    2. Send the files to the landing directory.

      1. The landing directory can be found by going to Platform Settings > Sources > Select Edit on the DBT source you created




        You will see the landing directory at the bottom

         

  8. Go back to Sources and Select Edit Schedule

     

  9. Select a load frequency that aligns with your pipeline