Metadata enrichment via API

K provides an API that accepts a JSON payload to bulk update objects in K.

You can use this API to bulk update descriptions, classifications, owners and other metadata for tables, reports and other objects.

1. Preparing the data to upload

The API uses the following information to update the object metadata.

The JSON must be UTF-8 encoded.
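As a minimal sketch of the encoding requirement, assuming the payload is assembled as a Python list of dictionaries, the file can be written with an explicit UTF-8 encoding. The function name `write_payload` is illustrative and not part of K.

```python
import json
from pathlib import Path


def write_payload(rows, path):
    """Write enrichment rows to `path` as UTF-8 encoded JSON."""
    # ensure_ascii=False keeps non-ASCII characters as real UTF-8 bytes
    # instead of \uXXXX escapes; both forms are valid JSON.
    text = json.dumps(rows, ensure_ascii=False, indent=2)
    target = Path(path)
    target.write_text(text, encoding="utf-8")
    return target
```

Passing `encoding="utf-8"` explicitly avoids relying on the platform default encoding, which can differ between operating systems.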

Data template

ATTRIBUTE	DESCRIPTION	REQUIRED	DATA TYPE	Comments
id	Row ID for each object. The row ID must be unique per object. It allows the caller to link rejected rows and error messages in the response back to the objects in the request payload.	Y	INTEGER	A generated ID used to identify rejected rows in error messages. You can generate it however you like, as long as it is unique within the payload; duplicate IDs may make the error messages ambiguous.
object_k_id	The ID attribute for a K object. Only populate this if you need to reprocess objects that errored because the name was not unique. This can happen for Reports and Sheets where the object_meta (below) is not unique.	N. Only required when reprocessing objects where the name is not unique	UUID	The ID can be found in K by searching for the object and taking the UUID from the URL of the object’s profile page, e.g. `4daa6fa3-3b32-3f07-b587-67bcd8133d35` from the URL `https://<yourdomain>.kada.ai/profile/content/4daa6fa3-3b32-3f07-b587-67bcd8133d35`. If you know the UUID inside K, specify it and it will be used in preference to object_meta to identify the object. This is helpful in cases such as reports, where multiple reports can have the same name but different IDs, and where report IDs are generally derived from the system ID rather than uniquely generated by K.
object_type	valid values: SCHEMA/TABLE/COLUMN/REPORT/SHEET	Y	STRING	Note these must be UPPER CASE.
object_meta	Attributes that uniquely identify the object in K. Note that object_meta properties differ per object_type.	Y	JSON. Example for a TABLE (see more examples below): `"object_meta": { "source_id": 1008, "database": "adventureworks", "schema": "sales", "table": "customer" }`	Every object_meta requires a `source_id`. This value can be found in Admin > Sources. To get the source ID from the UI, go to Platform Settings and click on Sources; the source ID is displayed next to each source you have integrated. A reference list of sources is available from `https://<youdomain>/api/sources`. The remaining properties are specific to the object type and are generally the information needed to identify the object.
description	The description for the object. Any existing description will be overridden. If a blank description is provided it will be ignored.	N	STRING	From this attribute onwards, at least one attribute should be provided; otherwise there is nothing to enrich for the object.
business_name	The business name for the object. Any existing business name will be overridden. If a blank business name is provided it will be ignored.	N	STRING	Populates the alternate name.
business_logic	A description of the business logic used to create the object.	N	STRING	Check that similar business logic for the object does not already exist before creating it, to avoid duplicates.
owners	A list of owner usernames of onboarded users (typically email). If platform_profile_user_lookups is set to `data_roles` then the user must be linked to the `Data Owner` instance. Set to an empty list `"owners":[]` if the property is included but there are no owners.	N	LIST<STRING>	List of owners identified by the username of the onboarded user (if synced with AD this is the email of the user). Note: check whether you have a set of owners configured or whether any user can be an owner. This is configured in Platform Settings - Customisation. API to get a list of owners: `https://<your-domain>/api/collectioninstances/16e05af2-13fa-301d-b7fa-69d48bc71d7d/related?object_name=USER&relationship=MEMBERS`
stewards	A list of steward usernames of onboarded users (typically email). If platform_profile_user_lookups is set to `data_roles` then the user must be linked to the `Data Steward` instance. Set to an empty list `"stewards":[]` if the property is included but there are no stewards.	N	LIST<STRING>	List of stewards identified by the username of the onboarded user (if synced with AD this is the email of the user). Check the platform_profile_user_lookups value (data_roles vs any_user). Note: check whether you have a set of stewards configured or whether any user can be a steward. This is configured in Platform Settings - Customisation. API to get a list of stewards: `https://<your-domain>/api/collectioninstances/d13d6b10-a535-3718-854d-459f086ad057/related?object_name=USER&relationship=MEMBERS`
verified_use_cases	A list of IDs for the verified use cases to be linked to this object. Set to an empty list `"verified_use_cases":[]` if the property is included but there are no verified use cases.	N	LIST<UUID>	UUIDs of verified use cases. These are separated into these categories because the links are different and this is easier for the job to process. API to get a list of verified use cases: `https://<your-domain>/api/collectioninstances?collection_id=0ffdae93-b46c-38c2-8511-8183f1d89c8d`
classifications	The ID of the classification to be linked to this object. This should be a single value. Set to an empty list `"classifications":[]` if the property is included but there are no classifications.	N	LIST<UUID>	UUID of the classification. API to get a list of classifications: `https://<your-domain>/api/collectioninstances?collection_id=e1fd2337-1b0e-3841-8d60-7c85daa1707e`
domains	A list of IDs of the domains to be linked to this object. Set to an empty list `"domains":[]` if the property is included but there are no domains.	N	LIST<UUID>	UUIDs of domains. API to get a list of domains: `https://<your-domain>/api/collectioninstances?collection_id=73f92690-2edd-33a9-ad32-4f9c225a5c18`
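The rules in the table above can be checked on the client before uploading. The following is a rough sketch of such a pre-check, not the validation that K itself performs; the function name `check_rows` and the wording of the messages are illustrative assumptions.

```python
# Values and attribute names taken from the data template above.
VALID_TYPES = ("SCHEMA", "TABLE", "COLUMN", "REPORT", "SHEET")
ENRICHMENT_FIELDS = (
    "description", "business_name", "business_logic", "owners",
    "stewards", "verified_use_cases", "classifications", "domains",
)


def check_rows(rows):
    """Return a list of (row id, problem) tuples; an empty list means no problems found."""
    problems = []
    seen_ids = set()
    for row in rows:
        row_id = row.get("id")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(row_id, bool) or not isinstance(row_id, int):
            problems.append((row_id, "id must be an integer"))
        elif row_id in seen_ids:
            problems.append((row_id, "id is not unique within the payload"))
        else:
            seen_ids.add(row_id)
        if row.get("object_type") not in VALID_TYPES:
            problems.append((row_id, "object_type must be an upper-case valid value"))
        meta = row.get("object_meta")
        if not isinstance(meta, dict) or "source_id" not in meta:
            problems.append((row_id, "object_meta must include source_id"))
        if not any(field in row for field in ENRICHMENT_FIELDS):
            problems.append((row_id, "no enrichment attribute provided"))
    return problems
```

Running a check like this first catches simple mistakes, such as a lower-case `object_type`, before a job is submitted.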

Example payloads

Schema

[
  {
    "id": 1,
    "object_k_id": "982c18cf-643a-356b-bdef-15d69919d1d1",
    "object_type": "SCHEMA",
    "object_meta": {
      "source_id": 1008,
      "database": "adventureworks",
      "schema": "sales"
    },
    "description": "my new schema",
    "business_name": "customer ledger",
    "business_logic": "some business logic",
    "owners": ["xyz@kada.ai"],
    "stewards": ["abc@kada.ai"],
    "verified_use_cases": ["982c18cf-643a-356b-bdef-15d69919d1d1"],
    "classifications": ["982c18cf-643a-356b-bdef-15d69919d1d1"],
    "domains": ["982c18cf-643a-356b-bdef-15d69919d1d1"]
  }
]

Table

[
  {
    "id": 1,
    "object_k_id": "982c18cf-643a-356b-bdef-15d69919d1d1",
    "object_type": "TABLE",
    "object_meta": {
      "source_id": 1008,
      "database": "adventureworks",
      "schema": "sales",
      "table": "customer"
    },
    "description": "my new table",
    "business_name": "customer ledger",
    "business_logic": "some business logic",
    "owners": ["xyz@kada.ai"],
    "stewards": ["abc@kada.ai"],
    "verified_use_cases": ["982c18cf-643a-356b-bdef-15d69919d1d1"],
    "classifications": ["982c18cf-643a-356b-bdef-15d69919d1d1"],
    "domains": ["982c18cf-643a-356b-bdef-15d69919d1d1"]
  }
]

Column

[
  {
    "id": 1,
    "object_k_id": "982c18cf-643a-356b-bdef-15d69919d1d1",
    "object_type": "COLUMN",
    "object_meta": {
      "source_id": 1008,
      "database": "adventureworks",
      "schema": "sales",
      "table": "customer",
      "column": "customer_id"
    },
    "description": "my new column",
    "business_name": "customer ledger",
    "business_logic": "some business logic",
    "owners": ["xyz@kada.ai"],
    "stewards": ["abc@kada.ai"],
    "verified_use_cases": ["982c18cf-643a-356b-bdef-15d69919d1d1"],
    "classifications": ["982c18cf-643a-356b-bdef-15d69919d1d1"],
    "domains": ["982c18cf-643a-356b-bdef-15d69919d1d1"]
  }
]

Report

[
  {
    "id": 1,
    "object_k_id": "982c18cf-643a-356b-bdef-15d69919d1d1",
    "object_type": "REPORT",
    "object_meta": {
      "source_id": 1008,
      "name": "My brand spanking new report"
    },
    "description": "Some report",
    "business_name": "Some business name",
    "business_logic": "some business logic",
    "owners": ["xyz@kada.ai"],
    "stewards": ["abc@kada.ai"],
    "verified_use_cases": ["982c18cf-643a-356b-bdef-15d69919d1d1"],
    "classifications": ["982c18cf-643a-356b-bdef-15d69919d1d1"],
    "domains": ["982c18cf-643a-356b-bdef-15d69919d1d1"]
  }
]

Sheet

[
  {
    "id": 1,
    "object_k_id": "982c18cf-643a-356b-bdef-15d69919d1d1",
    "object_type": "SHEET",
    "object_meta": {
      "source_id": 1008,
      "name": "My brand spanking new report sheet"
    },
    "description": "Report tile of sorts",
    "business_name": "Some interesting business name",
    "owners": ["xyz@kada.ai"],
    "stewards": ["abc@kada.ai"],
    "verified_use_cases": ["982c18cf-643a-356b-bdef-15d69919d1d1"],
    "classifications": ["982c18cf-643a-356b-bdef-15d69919d1d1"],
    "domains": ["982c18cf-643a-356b-bdef-15d69919d1d1"]
  }
]
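The example payloads above show that `object_meta` carries different keys for each `object_type`. A small helper can make that explicit when generating rows in bulk. This is a sketch based only on the keys that appear in these examples; the helper name `build_object_meta` is hypothetical, and K may accept other properties that are not shown here.

```python
# Keys observed in the example payloads, in addition to source_id.
META_KEYS = {
    "SCHEMA": ("database", "schema"),
    "TABLE": ("database", "schema", "table"),
    "COLUMN": ("database", "schema", "table", "column"),
    "REPORT": ("name",),
    "SHEET": ("name",),
}


def build_object_meta(object_type, source_id, **parts):
    """Build object_meta for one object, rejecting missing or unexpected keys."""
    expected = META_KEYS[object_type]  # raises KeyError for an unknown object_type
    missing = [key for key in expected if key not in parts]
    extra = [key for key in parts if key not in expected]
    if missing or extra:
        raise ValueError(f"{object_type}: missing={missing}, unexpected={extra}")
    meta = {"source_id": source_id}
    meta.update((key, parts[key]) for key in expected)
    return meta
```

For example, `build_object_meta("REPORT", 1008, name="My brand spanking new report")` produces the same `object_meta` as the Report example.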

3. Setting up the API Key

Follow the API Guides to set up an access token.

API Guides

4. Using the API

API Endpoint

There is a single endpoint for uploading your update payload.

POST /api/metadata/enrichment

Example

curl --location --request POST 'https://<YOUR_DOMAIN>.kada.ai:443/api/metadata/enrichment' \
--header 'Authorization: Bearer <YOUR_AUTH_TOKEN>' \
--form 'data=@"/path/to/your/enrichment_file.json"'
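For callers who prefer Python to curl, the same request can be sketched with the standard library. This is an approximation of the curl example rather than an official client: the helper name `build_enrichment_request` is hypothetical, and the `application/json` content type on the file part is an assumption (curl may send a different part type for a `.json` file).

```python
import urllib.request
import uuid


def build_enrichment_request(base_url, token, payload_bytes, filename="enrichment_file.json"):
    """Build (but do not send) a multipart POST that mirrors the curl example."""
    boundary = uuid.uuid4().hex
    # One form field named "data" that carries the JSON file contents.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="data"; filename="{filename}"\r\n'
        "Content-Type: application/json\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/metadata/enrichment",
        data=head + payload_bytes + tail,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

The request can then be sent with `urllib.request.urlopen(request)` and the response body parsed with `json.load`. Building the request separately from sending it makes it easy to inspect the headers and body first.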

API Validation

The API will read your payload and perform a set of validations over the format of your payload.

If the validation fails, the API returns an error payload with additional detail on what failed validation.

In this example the key "1" refers to the element at index 1 in the uploaded JSON file.

{
  "errors": {
    "1": {
      "stewards": ["Not a valid list."]
    }
  }
}
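When a payload contains many rows, the nested error object is easier to review as a flat list. The sketch below assumes every validation response has the shape shown above, that is, an `errors` object keyed by element, then by field, with a list of messages for each field. The function name is illustrative.

```python
def flatten_validation_errors(response_body):
    """Turn the nested "errors" object into (key, field, message) tuples."""
    flat = []
    for key, fields in response_body.get("errors", {}).items():
        for field, messages in fields.items():
            for message in messages:
                flat.append((key, field, message))
    return flat
```

Applied to the example above, this yields a single tuple with the key `"1"`, the field `stewards` and the message `Not a valid list.`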

API Responses

If the file passes validation, the file data will be loaded into K. You will be provided a job id in the response so you can check the current status of the load.

GET /api/jobexecutions/<job id>

This returns one of the following statuses:

SUBMITTED
RUNNING
FAILED
KILLED
COMPLETE
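Since the load runs as a job, callers typically poll the job execution endpoint until the status stops changing. The sketch below assumes that FAILED, KILLED and COMPLETE are the terminal statuses and that the status is returned in a `job_status` field, as in the response example in this section. `fetch_job` is a hypothetical callable that performs the GET request and returns the parsed JSON; the 30-second interval and the 240-poll limit are arbitrary choices.

```python
import time

# Assumed terminal statuses; SUBMITTED and RUNNING are treated as in progress.
TERMINAL_STATUSES = ("FAILED", "KILLED", "COMPLETE")


def wait_for_job(fetch_job, job_id, interval_seconds=30, max_polls=240, sleep=time.sleep):
    """Poll until the job reaches a terminal status or max_polls is exhausted."""
    for attempt in range(max_polls):
        job = fetch_job(job_id)
        if job.get("job_status") in TERMINAL_STATUSES:
            return job
        # Do not sleep after the final poll.
        if attempt < max_polls - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

Passing the fetch and sleep functions as arguments keeps the loop independent of any particular HTTP library and makes it straightforward to test.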

In the event of FAILED, the response will include additional details about the error.

{
    "created_at": "2022-02-15T03:23:29.224114+00:00",
    "data_processed": true,
    "error": "1: Object does not exist\n2001: Invalid collection UUID <> provided",
    "id": 2,
    "job_reference": "61e1b879-5a28-404a-8350-6ffec5dc6097",
    "job_status": "FAILED",
    "platform_job_id": 106,
    "source_id": 1000,
    "updated_at": "2022-02-15T03:23:30.852532+00:00"
}
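The `error` string in this example appears to hold one message per line, each prefixed with a number. Assuming that number is the row `id` from the request payload and that the format is always `<id>: <message>` (neither is confirmed here), the messages can be grouped per row as follows. The function name is illustrative.

```python
def parse_job_errors(error_text):
    """Group the lines of a job `error` string into {row id: [messages]}."""
    by_id = {}
    for line in error_text.splitlines():
        if not line.strip():
            continue
        prefix, sep, message = line.partition(":")
        if not sep or not prefix.strip().isdigit():
            # Lines without a numeric prefix are kept under the key None.
            by_id.setdefault(None, []).append(line.strip())
            continue
        by_id.setdefault(int(prefix), []).append(message.strip())
    return by_id
```

The resulting dictionary can be matched against the `id` values in the uploaded payload to decide which rows to correct and resubmit.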

Considerations

There is no guarantee that the job will load all data provided in the payload.

Check that your updates have been applied and review any errors that are generated. Review any performance issues, e.g. a job that has been running for over 2 hours. Reach out to the KADA team for any support you may need.