
Metadata enrichment via API

K provides an API that accepts a JSON payload to bulk update objects in K.

You can use this API to bulk update metadata such as descriptions, classifications and owners for objects such as tables and reports.

 

1. Preparing the data to upload

The following information is used by the API to update the object metadata

The JSON payload must be UTF-8 encoded.
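As a minimal sketch of the encoding requirement, the snippet below serialises a list of rows to UTF-8 encoded JSON bytes using only the Python standard library. The helper name `encode_payload` and the sample description text are illustrative assumptions; the row fields follow the data template on this page.

```python
import json


def encode_payload(rows):
    """Serialise a list of row dicts to UTF-8 encoded JSON bytes."""
    # ensure_ascii=False keeps non-ASCII characters as-is instead of \u escapes.
    # Both forms are valid JSON; this only keeps the payload readable.
    return json.dumps(rows, ensure_ascii=False).encode("utf-8")


rows = [
    {
        "id": 1,
        "object_type": "TABLE",
        "object_meta": {
            "source_id": 1008,
            "database": "adventureworks",
            "schema": "sales",
            "table": "customer",
        },
        # Hypothetical description containing a non-ASCII character.
        "description": "Customer master data (café accounts included)",
    }
]

body = encode_payload(rows)

# Decoding the bytes as UTF-8 and parsing them gives back the original rows.
assert json.loads(body.decode("utf-8")) == rows
```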

 

Data template

ATTRIBUTE

DESCRIPTION

REQUIRED

DATA TYPE

Comments

id

Row ID for each object.

The Row ID must be unique per object

This allows the caller to link rejected rows and error messages in the response back to the objects in the request payload.

Y

INTEGER

A generated ID that lets rejected rows in the error messages be traced back to the request. How the ID is generated is up to the caller, as long as it is unique within the payload; duplicate IDs may make the error messages ambiguous.

object_k_id

The ID attribute for a K object.

Only populate this if you need to reprocess objects that errored because the name was not unique.

This can happen for Reports and Sheets where the object_meta (below) is not unique.

 

N

Only required when reprocessing objects where the name is not unique

UUID

The ID can be found in K by searching for the object and taking the UUID from the URL of the object's profile page, e.g. 4daa6fa3-3b32-3f07-b587-67bcd8133d35

from the URL

https://<yourdomain>.kada.ai/profile/content/4daa6fa3-3b32-3f07-b587-67bcd8133d35

 

If you know the object's UUID in K, specify it here and it will be used instead of object_meta to identify the object. This is helpful in cases such as reports, where multiple reports can share the same name but have different IDs, and where report IDs are generally derived from the source system ID rather than uniquely generated by K.

object_type

valid values: SCHEMA/TABLE/COLUMN/REPORT/SHEET

 

Y

STRING

Note these must be UPPER CASE.

object_meta

Attributes to uniquely identify the object in K

NOTE that object_meta properties are different per object_type.

 

Y

JSON code blocks

Example for a TABLE. See more examples below

"object_meta": { "source_id": 1008, "database": "adventureworks", "schema": "sales", "table": "customer" }

Every object_meta requires a source_id. This value can be found in Admin > Sources.

Get reference list of sources

https://<your-domain>/api/sources

The remaining properties are specific to the object type and can differ for each type. Generally they are the information needed to identify the object.

 

Get source ID from the UI

Go to Platform Settings. Click on Sources. The Source ID is displayed next to each source you have integrated.

 

 

 

description

The description for the object.

Any existing description will be overwritten.

If a blank description is provided it will be ignored.

N

STRING

From this attribute onwards, at least one attribute should be provided; otherwise there is nothing to enrich for the object.

business_name

This will populate the business name for the object.

Any existing business name will be overwritten.

If a blank business name is provided it will be ignored.

N

STRING

This will populate the alternate name.

business_logic

A description of the business logic used to create the object.

N

STRING

Before adding business logic, check that similar business logic does not already exist for the object, to avoid creating duplicates.

owners

A list of owner usernames of onboarded users (typically email).

If platform_profile_user_lookups is set to data_roles then each user must be linked to the Data Owner instance.

Set to an empty list ("owners": []) if the property is included but there are no owners.

 

N

LIST<STRING>

 

A list of owners, each identified by the username of the onboarded user (if synced with AD, this is the user's email).

Note: Check whether you have a set of owners configured or whether any user can be an owner. This is configured in Platform settings - Customisation.

 

API to get list of owners

https://<your-domain>/api/collectioninstances/16e05af2-13fa-301d-b7fa-69d48bc71d7d/related?object_name=USER&relationship=MEMBERS

stewards

A list of steward usernames of onboarded users (typically email).

If platform_profile_user_lookups is set to data_roles then each user must be linked to the Data Steward instance.

Set to an empty list ("stewards": []) if the property is included but there are no stewards.

 

 

N

LIST<STRING>

A list of stewards, each identified by the username of the onboarded user (if synced with AD, this is the user's email). Check whether platform_profile_user_lookups is set to data_roles or any_user.

Note: Check whether you have a set of stewards configured or whether any user can be a steward. This is configured in Platform settings - Customisation.

 

API to get a list of stewards

https://<your-domain>/api/collectioninstances/d13d6b10-a535-3718-854d-459f086ad057/related?object_name=USER&relationship=MEMBERS

verified_use_cases

A list of IDs for the verified use cases to be linked to this object.

Set to an empty list ("verified_use_cases": []) if the property is included but there are no verified use cases.

 

N

LIST<UUID>

UUIDs of the verified use cases. Verified use cases, classifications and domains are kept as separate attributes because their links differ, which makes them easier for the job to process.

 

API to get a list of verified use cases

https://<your-domain>/api/collectioninstances?collection_id=0ffdae93-b46c-38c2-8511-8183f1d89c8d

classifications

The ID of the classification to be linked to this object.

This should be a list containing a single value.

Set to an empty list ("classifications": []) if the property is included but there are no classifications.

 

N

LIST<UUID>

UUID of classifications

 

API to get a list of classifications

https://<your-domain>/api/collectioninstances?collection_id=e1fd2337-1b0e-3841-8d60-7c85daa1707e

domains

A list of IDs of the domains to be linked to this object.

Set to an empty list ("domains": []) if the property is included but there are no domains.

 

N

LIST<UUID>

UUID of domains

 

API to get a list of domains

https://<your-domain>/api/collectioninstances?collection_id=73f92690-2edd-33a9-ad32-4f9c225a5c18
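The rules in the data template can be checked on the client before uploading. The following is a minimal sketch, not the API's own validation: the function name `validate_rows` and the wording of the messages are assumptions, and the checks cover only the rules stated on this page (integer unique id, upper-case object_type, a source_id in object_meta, a UUID for object_k_id when present, and at least one enrichment attribute).

```python
import uuid

VALID_OBJECT_TYPES = {"SCHEMA", "TABLE", "COLUMN", "REPORT", "SHEET"}
ENRICHMENT_KEYS = (
    "description", "business_name", "business_logic", "owners",
    "stewards", "verified_use_cases", "classifications", "domains",
)


def validate_rows(rows):
    """Return a list of (row id, problem) tuples; an empty list means no problems were found."""
    problems = []
    seen_ids = set()
    for row in rows:
        row_id = row.get("id")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(row_id, int) or isinstance(row_id, bool):
            problems.append((row_id, "id must be an integer"))
        elif row_id in seen_ids:
            problems.append((row_id, "id must be unique within the payload"))
        else:
            seen_ids.add(row_id)

        object_type = row.get("object_type")
        if not isinstance(object_type, str) or object_type not in VALID_OBJECT_TYPES:
            problems.append((row_id, "object_type must be one of the upper-case valid values"))

        meta = row.get("object_meta")
        if not isinstance(meta, dict) or "source_id" not in meta:
            problems.append((row_id, "object_meta must include source_id"))

        # object_k_id is optional; when present it should parse as a UUID.
        if "object_k_id" in row:
            try:
                uuid.UUID(str(row["object_k_id"]))
            except ValueError:
                problems.append((row_id, "object_k_id must be a UUID"))

        if not any(key in row for key in ENRICHMENT_KEYS):
            problems.append((row_id, "provide at least one enrichment attribute"))
    return problems
```

Running this over the payload before upload gives a list of row ids and messages to fix. The API may apply further checks that are not reproduced here.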

 

Example payloads

Schema

[
  {
    "id": 1,
    "object_k_id": "982c18cf-643a-356b-bdef-15d69919d1d1",
    "object_type": "SCHEMA",
    "object_meta": {
      "source_id": 1008,
      "database": "adventureworks",
      "schema": "sales"
    },
    "description": "my new schema",
    "business_name": "customer ledger",
    "business_logic": "some business logic",
    "owners": ["xyz@kada.ai"],
    "stewards": ["abc@kada.ai"],
    "verified_use_cases": ["982c18cf-643a-356b-bdef-15d69919d1d1"],
    "classifications": ["982c18cf-643a-356b-bdef-15d69919d1d1"],
    "domains": ["982c18cf-643a-356b-bdef-15d69919d1d1"]
  }
]

Table

[
  {
    "id": 1,
    "object_k_id": "982c18cf-643a-356b-bdef-15d69919d1d1",
    "object_type": "TABLE",
    "object_meta": {
      "source_id": 1008,
      "database": "adventureworks",
      "schema": "sales",
      "table": "customer"
    },
    "description": "my new table",
    "business_name": "customer ledger",
    "business_logic": "some business logic",
    "owners": ["xyz@kada.ai"],
    "stewards": ["abc@kada.ai"],
    "verified_use_cases": ["982c18cf-643a-356b-bdef-15d69919d1d1"],
    "classifications": ["982c18cf-643a-356b-bdef-15d69919d1d1"],
    "domains": ["982c18cf-643a-356b-bdef-15d69919d1d1"]
  }
]

Column

Report

Sheet

 

3. Setting up the API Key

Follow the API Guides to set up an access token.

API Guides

 

4. Using the API

API Endpoint

There is a single endpoint for uploading your update payload.
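A minimal sketch of preparing the upload request with the Python standard library. The endpoint URL and the authentication headers are not stated here, so both are passed in by the caller; take them from the API Guides. Using POST and the `application/json; charset=utf-8` content type are assumptions, and `build_upload_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_upload_request(endpoint_url, rows, auth_headers):
    """Build (but do not send) a request carrying the rows as UTF-8 encoded JSON."""
    body = json.dumps(rows, ensure_ascii=False).encode("utf-8")
    # Assumed content type; the payload itself must be UTF-8 encoded JSON.
    headers = {"Content-Type": "application/json; charset=utf-8"}
    # Caller-supplied authentication headers, as described in the API Guides.
    headers.update(auth_headers)
    # POST is an assumption; confirm the method against the API Guides.
    return urllib.request.Request(endpoint_url, data=body, headers=headers, method="POST")
```

The returned request can then be sent with `urllib.request.urlopen(request)` once the real endpoint and headers are known.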

Example

 

 

API Validation

The API will read your payload and perform a set of validations over the format of your payload.

If the validation fails, the API returns an error payload with additional detail on what has failed validation.

In this example the key "1" refers to the element at index 1 in the JSON file uploaded.
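A minimal sketch of linking validation errors back to the rows that caused them. The shape of the error payload used here (a JSON object whose keys are element indexes as strings) is an assumption based on the description above, and so is reading "index 1" as zero-based; adjust `index_base` if the API counts from 1. `rows_for_errors` is a hypothetical helper name.

```python
def rows_for_errors(rows, errors, index_base=0):
    """Pair each error entry with the payload row it refers to.

    rows   -- the list of row dicts that was uploaded
    errors -- assumed shape: {"<element index>": <error detail>, ...}
    """
    matched = []
    for key, detail in errors.items():
        try:
            position = int(key) - index_base
        except ValueError:
            # Key is not an element index (for example a payload-level error).
            position = -1
        row = rows[position] if 0 <= position < len(rows) else None
        matched.append((key, row, detail))
    return matched
```

Each returned tuple holds the error key, the matching row (or None when the key does not point at a row) and the error detail, so the row's own id can be reported alongside the message.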

 

API Responses

If the file passes validation, the data will be loaded into K. The response includes a job ID that you can use to check the current status of the load.

The status will be one of:

SUBMITTED
RUNNING
FAILED
KILLED
COMPLETE

In the event of FAILED, the response will include additional details about the error.
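A minimal sketch of polling a job until it reaches a final status. How the status is fetched is not shown here, so the caller supplies `fetch_status`, a hypothetical function that returns the current status string for the job. Treating FAILED, KILLED and COMPLETE as final, the 30 second poll interval and the 2 hour timeout are all assumptions.

```python
import time

# Statuses assumed to be final; SUBMITTED and RUNNING mean the job is still in progress.
TERMINAL_STATUSES = {"FAILED", "KILLED", "COMPLETE"}


def wait_for_job(fetch_status, poll_seconds=30.0, timeout_seconds=7200.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until a final status is returned or the timeout passes."""
    deadline = clock() + timeout_seconds
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job still {status} after {timeout_seconds} seconds")
        sleep(poll_seconds)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without waiting; in normal use leave them at their defaults.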

 

Considerations

There is no guarantee that the job will load all data provided in the payload.

Check that your updates have been applied and review any errors that are generated. Any performance issues should also be reviewed, e.g. a job that has been running for over 2 hours. Reach out to the KADA team for any support you may need.