Setup: Deploying K
Applicable to customer on-premise/cloud deployments
Setup Infrastructure
Kubernetes deployment
Platform:
Set up a Kubernetes cluster on any cloud provider (AWS, Azure, Google Cloud) or on-premise solution (e.g. OpenShift).
Environment sizing for deployment of KADA into your infrastructure:
Small environments (fewer than 1 million objects)
4 nodes, each with 4 CPU and 16GB memory; PV storage class on SSD disk, any IOPS
Large environments (more than 1 million objects)
5 nodes
4 nodes, each with 4 CPU and 16GB memory; PV storage class on SSD disk, any IOPS
1 node with 8 CPU and 32GB memory; PV storage class on SSD with a minimum of 1100 IOPS
For very complex environments (10M+ objects or a large volume of historical data), infrastructure requirements can scale out according to data volumes.
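To compare an existing cluster against the sizing above, the node capacities can be listed with kubectl. This is a minimal sketch, assuming kubectl is already configured for the target cluster; the function name list_node_capacity is illustrative and not part of the KADA package.

```shell
# List each node's CPU and memory capacity so it can be compared with the sizing above.
# Assumes kubectl is installed and pointed at the target cluster.
list_node_capacity() {
  kubectl get nodes \
    -o custom-columns='NAME:.metadata.name,CPU:.status.capacity.cpu,MEMORY:.status.capacity.memory'
}

# Usage:
#   list_node_capacity
```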
Storage:
Set up an object store such as AWS S3 or Azure Blob.
Minimum 200GB storage, to be mounted into Persistent volumes in the Kubernetes cluster. PV Class definitions need to be configured to meet the minimum IOPS requirements above.
Where the organisation defines its own PV definitions (e.g. OpenShift), set the Reclaim Policy to Retain. This is important to ensure that no data is lost during a prolonged outage at the Kubernetes layer.
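For volumes that already exist, the reclaim policy can be inspected and changed with kubectl. The sketch below assumes kubectl access with permission to patch PersistentVolumes; the function names are illustrative, and the PV name must be looked up in your own cluster.

```shell
# Print each PersistentVolume with its current reclaim policy and bound claim.
list_pv_reclaim_policies() {
  kubectl get pv \
    -o custom-columns='NAME:.metadata.name,RECLAIM:.spec.persistentVolumeReclaimPolicy,CLAIM:.spec.claimRef.name'
}

# Switch one PersistentVolume to the Retain policy.
# $1: PV name (look it up with list_pv_reclaim_policies).
set_pv_retain() {
  kubectl patch pv "$1" \
    -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
}

# Usage:
#   list_pv_reclaim_policies
#   set_pv_retain <pv-name>
```

New volumes take their policy from the storage class, so the class definition should also specify Retain where your organisation manages it.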
Networking:
Firewall rules may be required to enable access to HTTPS (443)
You may choose to use your own Kubernetes ingress services or use the one provided by KADA's configuration scripts.
Docker deployment (not recommended for production environments)
This setup requires a single machine with a minimum spec of 16CPU, 64GB MEM, 200GB (minimum) storage
Install docker: https://docs.docker.com/engine/install/
Install docker compose: https://docs.docker.com/compose/install/
Configuring Access to KADA Image Repository
The KADA installation will require access to the KADA Image repository.
The Kubernetes or Docker environment will need internet access to this repository to install the platform.
If your environment is air-gapped, please advise the KADA team and we will arrange an alternative method for loading images into your environment.
KADA will provide customers with a unique kada-client-[unique_customer_code]-key.json file to access the repository.
Kubernetes deployment
To set up the access key in your Kubernetes environment, run the following.
kubectl create secret docker-registry kada-image-credentials \
--docker-server=https://asia.gcr.io \
--docker-username=_json_key \
--docker-email=kada-client-[unique_customer_code]@kada-external.iam.gserviceaccount.com \
--docker-password="$(cat kada-client-[unique_customer_code]-key.json)"
kubectl patch serviceaccount <REPLACE WITH THE SERVICE ACCOUNT NAME OR default> \
-p "{\"imagePullSecrets\": [{\"name\": \"kada-image-credentials\"}]}"
# Run the following to test connectivity
docker pull busybox:1.28
docker pull asia.gcr.io/kada-external/postgres:1.7.0-pg11
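After patching the service account, it can be useful to confirm that the pull secret is actually attached. This is a sketch only, assuming kubectl access to the namespace where KADA is deployed; the function name is illustrative.

```shell
# Succeeds when the service account lists the given image pull secret.
# $1: service account name (for example "default"), $2: secret name.
service_account_has_pull_secret() {
  # jsonpath prints the attached secret names separated by spaces.
  names="$(kubectl get serviceaccount "$1" -o jsonpath='{.imagePullSecrets[*].name}')"
  for name in $names; do
    if [ "$name" = "$2" ]; then
      return 0
    fi
  done
  return 1
}

# Usage:
#   service_account_has_pull_secret default kada-image-credentials && echo "pull secret attached"
```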
Docker deployment
To set up the access key in your Docker environment, run the following.
docker login -u _json_key --password-stdin https://asia.gcr.io < /tmp/kada-client-[unique_customer_code]-key.json
# Run the following to test connectivity
docker pull busybox:1.28
docker pull asia.gcr.io/kada-external/postgres:1.7.0-pg11
KADA Platform Installation
KADA is packaged as a set of configuration files.
Download the latest package.
Kubernetes deployments
In keycloak/k8s/keycloak-kada-realm.yaml, replace DOMAIN_URL with the base URL of your installation, e.g. https://example.com
Platform credentials for internal services can be updated from their default values:
Edit postgres/k8s/credentials.yaml to set your own password (POSTGRES_PASS=).
Edit keycloak/k8s/keycloak-credentials.yaml to set your own password.
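The DOMAIN_URL replacement described above can be scripted rather than edited by hand. The following is a minimal sketch using sed; the function name set_domain_url is illustrative, and it assumes the base URL contains no | or & characters, which sed would treat specially.

```shell
# Replace every DOMAIN_URL placeholder in a config file with the installation's base URL.
# $1: file to edit, $2: base URL such as https://example.com
# The original file is kept next to it with a .bak suffix.
# Assumes the URL contains no "|" or "&" characters.
set_domain_url() {
  sed -i.bak "s|DOMAIN_URL|$2|g" "$1"
}

# Usage:
#   set_domain_url keycloak/k8s/keycloak-kada-realm.yaml https://example.com
```

The same function can be pointed at conf/kada-realm.json for Docker deployments.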
Generate CA Certificates
Generate CA certificates based on the domain name of the host.
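For a test environment where no organisational CA is available, a self-signed certificate can stand in. The sketch below uses openssl and is an assumption about tooling, not a KADA requirement; production installations should use certificates issued by your own CA. The output file names follow the cortex.crt and cortex.key names used in the Docker deployment section, and the -addext option needs OpenSSL 1.1.1 or later.

```shell
# Create a self-signed certificate and private key for a host name (test use only).
# $1: domain name, $2: output directory (defaults to the current directory).
# Writes <dir>/cortex.crt and <dir>/cortex.key, valid for 365 days.
generate_self_signed_cert() {
  domain="$1"
  out_dir="${2:-.}"
  openssl req -x509 -nodes -newkey rsa:2048 -days 365 \
    -subj "/CN=${domain}" \
    -addext "subjectAltName=DNS:${domain}" \
    -keyout "${out_dir}/cortex.key" \
    -out "${out_dir}/cortex.crt"
}

# Usage:
#   generate_self_signed_cert example.com conf
```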
Once generated, run the following command to upload the certs into Kubernetes.
If you are using your own Kubernetes ingress service, the service needs to map the ports as per cortex/k8s/ingress-service.yaml. Make sure the certs have been added to your ingress service.
If you are using the KADA ingress services, update cortex/k8s/ingress-service.yaml and set the following
Deploy the Kubernetes config to start the platform
Upload config
Check environment is up
Deploy ingress-service (if not using your own)
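For the "Check environment is up" step, one generic way to confirm the platform has started is to wait for the pods to report Ready. This is a sketch under assumptions: the namespace is whichever one you deployed KADA into, and the function name is illustrative. Pods belonging to completed jobs never become Ready, so a timeout here does not always indicate a failure.

```shell
# Wait until every pod in a namespace reports Ready, then list them.
# $1: namespace (defaults to "default"), $2: timeout (defaults to 600s).
wait_for_pods_ready() {
  namespace="${1:-default}"
  timeout="${2:-600s}"
  kubectl wait --namespace "$namespace" --for=condition=Ready pods --all --timeout="$timeout" &&
    kubectl get pods --namespace "$namespace"
}

# Usage:
#   wait_for_pods_ready default 600s
```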
Docker deployment
Edit kada_docker_compose.env and set the following value
In conf/kada-realm.json, replace DOMAIN_URL with the base URL of your installation, e.g. https://example.com
Generate CA certificates based on the domain name of the host. In conf/, rename and replace cortex.crt and cortex.key with your generated CA certificates.
Deploy the environment
KADA Platform Configuration
Platform Settings
At the bottom left of the screen, click the GEAR icon and select Platform Settings.
Then set up the following properties depending on your deployment setup
Integrating sources to KADA
KADA needs to be configured for each source that you want to integrate. Setup can be configured via the KADA front end. See [M - Done] How to: Onboard a new source
KADA Platform Initial load
Set up the following Platform Setting values for the initial load
KADA provides a built-in Batch manager for triggering the loading of sources.
See [M - Done] How to: Onboard a new source | 4. Manually Triggering Source loads
Once the sources have been loaded, manually trigger the following platform jobs. See [M - Done] How to: Manually run a data load from a source | Manually triggering a Platform job
1. GATHER_METRICS_AND_STATS
2. POST_PROCESS_QUERIES
3. DAILY
Schedule sources to load
KADA provides a scheduler to periodically load the sources you have configured.
Set up the following Platform Setting value to enable the scheduler to run.
Each source can now be scheduled to run. See [M - Done] How to: Onboard a new source | 3. Scheduling a Source
Upgrading KADA
KADA generally releases new updates each month. See our Release versions for the latest available version.
To check your version see [Not migrated - Redundant] How to: Check the version of K platform
If a new version is available, perform the following steps to upgrade your platform. Then follow any manual steps outlined in the release notes.
Kubernetes deployments
Docker deployments
KADA Integrations
1. Updating Kubernetes Configs for Source Onboarding
Some sources in KADA require additional configuration to establish connectivity. This section details the additional configuration steps per integration source.
Queries to extract from a source may need to be altered for a customer's deployment. These can be edited in the cerebrum-extract-scripts.yaml file. Each extract script is prefixed with the relevant vendor source name.
After editing any of the config yaml files, upload the edited yaml file and restart the cerebrum services for the new configurations to take effect.
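The upload-and-restart step can be sketched as below. It assumes the cerebrum services run as a Kubernetes Deployment; the deployment name passed in is hypothetical and should be confirmed with kubectl get deployments in your environment.

```shell
# Upload an edited config file and restart a deployment so it picks up the change.
# $1: path to the edited YAML file.
# $2: deployment name (hypothetical; confirm with `kubectl get deployments`).
apply_config_and_restart() {
  kubectl apply -f "$1" &&
    kubectl rollout restart "deployment/$2" &&
    kubectl rollout status "deployment/$2"
}

# Usage:
#   apply_config_and_restart cerebrum-extract-scripts.yaml <cerebrum-deployment-name>
```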
1.1. Teradata
Uses ODBC, so an update is required to the cerebrum-odbc-ini.yaml file.
Update the DBCName to the server for your Teradata. If there are multiple sources, create a new DSN entry for each one, making sure to use the same format as in the Kada Teradata Extractor populate example. Do not change the Driver path.
Permissions:
Read access to the DBC and PDCRINFO schemas. Specifically these tables:
PDCRINFO.DBQLogTbl, otherwise DBC.DBQLogTbl
PDCRINFO.DBQLSqlTbl, otherwise DBC.DBQLSqlTbl
DBC.TABLESV
DBC.INDICESV
DBC.ALL_RI_CHILDRENV
1.2. SQLServer 2012+
Uses ODBC, so an update is required to the cerebrum-odbc-ini.yaml file.
Update the Server and Port according to your SQLServer. If there are multiple sources, create a new DSN entry for each one, making sure to use the same format as in the Kada SQLServer Extractor populate example. Do not change the Driver or TDS_Version values.
Permissions
Read access to the information_schema per database
Permission to create extended events
Permission to read extended events log file.
Log Capture
SQLServer extended events need to be set up to capture query log data.
Here is a template for KADA’s extended events. Note that this will require some tuning depending on how much activity and the types of queries occurring in your SQLServer environment.
1.3. Oracle 11g+, Oracle Cloud and Oracle Analytics
Requires an Oracle wallet and the following items to be updated in the cerebrum-oic.yaml file:
cwallet.sso → Binary Text
ewallet.p12 → Binary Text
tnsnames.ora → Text
NB: sqlnet.ora, if updated, must have DIRECTORY="/etc/oracle_config"
To generate the binary or the text replacement, simply run to get the output in the console
You can use the output to replace the specific data/binaryData in each section of the cerebrum-oci.yaml file.
Alternatively if you have all the files, add each file as a --from-file argument to generate the whole config file again
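Both approaches above can be sketched with standard tools. This is an illustration under assumptions, not the exact command from the KADA package: it treats the config file as a Kubernetes ConfigMap (where binaryData values are base64 encoded), the ConfigMap name is hypothetical, and the function names are illustrative. The --dry-run=client flag needs kubectl 1.18 or later.

```shell
# Print a file as single-line base64, the encoding used for binaryData values.
# $1: file to encode (for example cwallet.sso).
encode_for_binary_data() {
  base64 < "$1" | tr -d '\n'
}

# Render a ConfigMap from the given files without applying it to the cluster.
# $1: ConfigMap name (hypothetical), remaining arguments: files to include.
render_oracle_configmap() {
  cm_name="$1"
  shift
  from_file_args=""
  for f in "$@"; do
    from_file_args="$from_file_args --from-file=$f"
  done
  # Word splitting is intended here, so file paths must not contain spaces.
  kubectl create configmap "$cm_name" $from_file_args --dry-run=client -o yaml
}

# Usage:
#   encode_for_binary_data cwallet.sso
#   render_oracle_configmap <configmap-name> cwallet.sso ewallet.p12 tnsnames.ora
```

Compare the rendered output with the existing file before replacing it, since the metadata in your packaged YAML may differ from what kubectl generates.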
Permissions
Read access to the SYS.DBA_* tables. Specifically the following tables:
SYS.DBA_PROCEDURES
SYS.DBA_VIEWS
SYS.DBA_MVIEWS
SYS.DBA_CONSTRAINTS
SYS.DBA_CONS_COLUMNS
SYS.DBA_TAB_COLUMNS
SYS.dba_hist_active_sess_history
SYS.dba_hist_snapshot
SYS.dba_users
SYS.dba_hist_sqltext
1.4. Snowflake
No additional configuration is required. Snowflake uses a python native driver.
Permissions
Read access to the SNOWFLAKE.ACCOUNT_USAGE schema
User must have access to role: ACCOUNTADMIN
Alternatively, grant access to another role. https://docs.snowflake.com/en/sql-reference/account-usage.html#enabling-account-usage-for-other-roles
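The Snowflake documentation linked above enables account usage for other roles by granting imported privileges on the SNOWFLAKE database. As a small sketch, the helper below prints that statement for a given role so it can be pasted into a worksheet or piped to a SQL client; the function name and the example role name are hypothetical, and the statement must be run by a user with the ACCOUNTADMIN role.

```shell
# Print the statement that lets another role read SNOWFLAKE.ACCOUNT_USAGE.
# $1: role name (hypothetical example: KADA_READER).
account_usage_grant_sql() {
  printf 'GRANT IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE TO ROLE %s;\n' "$1"
}

# Usage:
#   account_usage_grant_sql KADA_READER
```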
1.5. Tableau
Permissions
Read user for the workgroup Tableau postgres database
Tableau enabled for Metadata API
Tableau Server enabled for repository access
Create a Tableau Server user with one of the following access roles: Site Administrator Creator or Server Administrator
1.6 Informatica (coming soon)
Permissions
Read access to all Informatica repository tables.
1.7 DBT (coming soon)
Permissions
Read access to a location that contains the manifest.json and catalog.json files for each dbt project.