Greenplum
This page will guide you through the setup of Greenplum in K using the direct connect method.
Integration details
Scope | Included | Comments |
---|---|---|
Metadata | YES | |
Lineage | YES | |
Usage | No | |
Sensitive Data Scanner | No | |
Step 1) Establish Greenplum Access
You will need to create a user <kada user> for the K Platform.
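As a minimal sketch of creating that user, the statement below adds a login role. The role name kada_user and the password are placeholders assumed for illustration; substitute values that follow your own naming and password policy.

```sql
-- Hypothetical role name and placeholder password: replace both.
-- LOGIN allows the role to open database connections.
CREATE ROLE kada_user WITH LOGIN PASSWORD 'change-me';
```

Run this as a role that is allowed to create other roles, such as a superuser.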
In Greenplum, users generally have read access to the pg_catalog tables by default when a database is created.
If <kada user> does not have this access, grant it explicitly in each database, including any database created later:
GRANT USAGE ON SCHEMA pg_catalog TO <kada user>;
GRANT SELECT ON ALL TABLES IN SCHEMA pg_catalog TO <kada user>;
The user used for the extraction must also be able to connect to the databases needed for extraction.
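If connect access has been restricted, it can be granted per database. The sketch below assumes a role named kada_user and a database named my_database, both hypothetical. By default the PUBLIC role usually already holds CONNECT on new databases, so this grant may only be needed where that default has been revoked.

```sql
-- Hypothetical names: replace my_database and kada_user with your own.
-- Repeat once for every database you want onboarded.
GRANT CONNECT ON DATABASE my_database TO kada_user;
```

Depending on your environment, client authentication rules in pg_hba.conf may also need to allow this user to reach the host.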
PG Tables
The user must have access to the following pg_catalog tables in each applicable Greenplum database:
pg_class
pg_namespace
pg_proc
pg_database
pg_language
pg_type
pg_collation
pg_depend
pg_sequence
pg_constraint
pg_authid
pg_auth_members
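One way to confirm access to the tables listed above is to query the built-in has_table_privilege function. This is a sketch only, assuming the extraction role is named kada_user (a hypothetical name). Run it in each database you plan to onboard.

```sql
-- Reports whether the (hypothetical) role kada_user can SELECT from each
-- required catalog table in the current database.
SELECT c.relname AS catalog_table,
       has_table_privilege('kada_user', c.oid, 'SELECT') AS can_select
FROM pg_catalog.pg_class c
JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'pg_catalog'
  AND c.relname IN ('pg_class', 'pg_namespace', 'pg_proc', 'pg_database',
                    'pg_language', 'pg_type', 'pg_collation', 'pg_depend',
                    'pg_sequence', 'pg_constraint', 'pg_authid',
                    'pg_auth_members')
ORDER BY c.relname;
```

Any row with can_select = false points to a table that still needs a grant. A table that does not exist in your Greenplum version will simply not appear in the result.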
Databases
The user must also be able to connect to all databases that you want onboarded.
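To see which databases the user can currently connect to, a query along these lines may help. It is a sketch that assumes the role is named kada_user (hypothetical) and uses the built-in has_database_privilege function.

```sql
-- Lists every database that accepts connections and whether the
-- (hypothetical) role kada_user holds the CONNECT privilege on it.
SELECT d.datname,
       has_database_privilege('kada_user', d.oid, 'CONNECT') AS can_connect
FROM pg_catalog.pg_database d
WHERE d.datallowconn
ORDER BY d.datname;
```

Databases showing can_connect = false need a CONNECT grant before they can be onboarded.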
Step 2) Connecting K to Greenplum
Select Platform Settings in the side bar
In the pop-out side panel, under Integrations click on Sources
Click Add Source and select Greenplum
Select Direct Connect, add your Greenplum details, and click Next
Fill in the Source Settings and click Next
Name: The name you wish to give your Greenplum Instance in K
Host: Add your Greenplum host (found in your Greenplum URL)
Omit the https:// from the URL
Add the Connection details and click Save & Next when the connection is successful
Host: Use the same details you previously added in the Host setting
Username: Add the Greenplum user name you created in Step 1
Password: Add the Greenplum user password you created in Step 1
Test your connection and click Save
Select the Databases you wish to load into K and click Finish Setup
All databases will be listed. If you have a lot of databases this may take a few seconds to load
Return to the Sources page and locate the new Greenplum source that you loaded
Click on the clock icon to select Edit Schedule and set your preferred schedule for the Greenplum load
Note that a schedule change can take up to 15 minutes to propagate.
Step 3) Manually run an ad hoc load to test Greenplum setup
Next to your new Source, click on the Run manual load icon
Confirm how you want the source to be loaded
After the source load is triggered, a pop up bar will appear taking you to the Monitor tab in the Batch Manager page. This is the usual page you visit to view the progress of source loads
A manual source load also requires a manual run of the following jobs:
DAILY
GATHER_METRICS_AND_STATS
These jobs load all metrics and indexes with the manually loaded metadata. They can be found in the Batch Manager page
Troubleshooting failed loads
If the job failed at the extraction step
Check the error. Contact KADA Support if required.
Rerun the source job
If the job failed at the load step, the failed directory in the landing folder will contain the file with issues.
Find the bad record and fix the file
Rerun the source job