Managing Data Quality Outcomes in K
K can store, track, measure and manage data quality issues across your data ecosystem.
This knowledge page covers:
The 6 data quality dimensions that K uses
How to view data quality test results
Raising a data quality issue against a failed data quality test
Linking data quality tests to a data quality dimension
K can integrate and store Data Quality results from the following Data Quality Applications (DQ Apps):
Data Quality Dimensions
K has 6 predefined data quality dimensions in the Data Quality collection.
Data quality dimensions are measurement attributes of data that can help you assess the quality of your data and identify opportunities to improve data trust.
Accuracy
Accuracy is the degree to which data correctly reflects a real world object or an event being described. Accurate data should be verified against an authentic source.
Key questions include: Are product names, person names or addresses misspelt? Is any of the data out of date?
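As an illustration of an accuracy check, the sketch below compares values against a trusted reference list. It is not how K or any DQ App implements the test. The function name `inaccurate_values`, the record layout and the sample data are all assumptions made for this example.

```python
def inaccurate_values(records, field, reference_values):
    """Return records whose `field` value is not in the trusted reference set.

    Comparison ignores case and surrounding whitespace (an assumption of this sketch).
    """
    trusted = {v.strip().lower() for v in reference_values}
    return [r for r in records if str(r.get(field, "")).strip().lower() not in trusted]


# Hypothetical authoritative source and data under test
reference_products = {"Widget Pro", "Gadget Mini"}
sales = [
    {"sale_id": 1, "product": "Widget Pro"},
    {"sale_id": 2, "product": "Wigdet Pro"},   # misspelt product name
    {"sale_id": 3, "product": "gadget mini"},  # differs only by case
]

suspect = inaccurate_values(sales, "product", reference_products)
print([r["sale_id"] for r in suspect])
```

Only the misspelt record is flagged, because the sketch treats case differences as acceptable.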
Completeness
Completeness is the degree to which data is as comprehensive as its users expect. Data can be complete even if optional data is missing. As long as the data meets expectations, it is considered complete.
Key questions include: Is all the requisite information available? Do any data values have missing elements? Or are they in an unusable state?
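A minimal sketch of a completeness measure, assuming completeness is scored as the share of records whose required fields are all filled. The function name, the field names and the treatment of empty strings as missing are assumptions for illustration, not K's definition.

```python
def completeness(records, required_fields):
    """Share of records in which every required field is present and non-empty."""
    records = list(records)
    if not records:
        return 1.0  # assumption: an empty data set has nothing missing

    def is_filled(value):
        return value is not None and str(value).strip() != ""

    complete = sum(
        all(is_filled(r.get(f)) for f in required_fields) for r in records
    )
    return complete / len(records)


# Hypothetical customer records; `nickname` is optional and is not checked
customers = [
    {"id": 1, "name": "Ana", "email": "ana@example.com", "nickname": None},
    {"id": 2, "name": "Ben", "email": "", "nickname": "B"},
    {"id": 3, "name": "Cy", "email": "cy@example.com", "nickname": None},
    {"id": 4, "name": None, "email": "d@example.com", "nickname": None},
]

score = completeness(customers, required_fields=["id", "name", "email"])
print(score)
```

Records 1 and 3 count as complete even though the optional `nickname` is empty, which matches the definition above.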
Consistency
Consistency means that data across all systems reflects the same information and is in sync across the enterprise (e.g. customer address). Data consistency is often associated with data accuracy, and a data set that scores high on both will be a high-quality data set.
Key questions include: Are data values the same across the data sets? Are there any distinct occurrences of the same data instances that provide conflicting information?
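To illustrate the customer address example, the sketch below compares the same field in two systems and reports the keys whose values conflict. The system names, record layout and normalisation rule (ignore case and extra whitespace) are assumptions for this example.

```python
def normalise(value):
    """Lower-case and collapse whitespace so cosmetic differences are ignored."""
    return " ".join(str(value or "").split()).lower()


def find_inconsistencies(system_a, system_b, field):
    """Return keys present in both systems whose `field` values differ."""
    shared_keys = sorted(system_a.keys() & system_b.keys())
    return [
        key
        for key in shared_keys
        if normalise(system_a[key].get(field)) != normalise(system_b[key].get(field))
    ]


# Hypothetical extracts from two systems, keyed by customer id
crm = {
    "C1": {"address": "1 Main St"},
    "C2": {"address": "5 High Rd"},
    "C3": {"address": "9 Low Ln"},
}
billing = {
    "C1": {"address": "1 main  st "},  # same address, different formatting
    "C2": {"address": "7 High Rd"},    # conflicting address
}

conflicts = find_inconsistencies(crm, billing, "address")
print(conflicts)
```

Customers that exist in only one system (here `C3`) are not reported by this sketch; whether that counts as a consistency or a completeness issue is a design choice.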
Integrity
Integrity indicates that the attributes are maintained correctly, even as data gets stored and used in diverse systems. Data integrity ensures that all enterprise data can be traced and connected to other data.
Key questions include: Is any data missing important relationship linkages?
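One common integrity check looks for records whose link to a parent record is missing or broken. The sketch below assumes a simple orders-to-customers relationship; the names `orphaned_records`, `customer_id` and the sample data are hypothetical.

```python
def orphaned_records(children, parent_ids, foreign_key):
    """Return child records whose foreign key does not match any known parent id."""
    return [c for c in children if c.get(foreign_key) not in parent_ids]


# Hypothetical parent ids and child records
customer_ids = {1, 2}
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 3},     # customer 3 does not exist
    {"order_id": 12, "customer_id": None},  # link is missing altogether
]

orphans = orphaned_records(orders, customer_ids, "customer_id")
print([o["order_id"] for o in orphans])
```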
Timeliness
Timeliness refers to whether information is available when users expect and need it.
Key questions include: Is the data available when you need it?
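A timeliness check can be sketched as a comparison between when data arrived and when it was needed. The deadline, the timestamps and the function names below are assumptions for illustration only.

```python
from datetime import datetime, timedelta, timezone


def is_timely(arrived_at, needed_by):
    """True if the data arrived at or before the time it was needed."""
    return arrived_at <= needed_by


def delay(arrived_at, needed_by):
    """How late the data was; zero if it arrived on time."""
    return max(arrived_at - needed_by, timedelta(0))


# Hypothetical expectation: the daily load is needed by 06:00 UTC
needed_by = datetime(2024, 1, 15, 6, 0, tzinfo=timezone.utc)
on_time_load = datetime(2024, 1, 15, 5, 42, tzinfo=timezone.utc)
late_load = datetime(2024, 1, 15, 7, 10, tzinfo=timezone.utc)

print(is_timely(on_time_load, needed_by), delay(on_time_load, needed_by))
print(is_timely(late_load, needed_by), delay(late_load, needed_by))
```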
Uniqueness
Uniqueness is the most critical dimension for ensuring there is no duplication or overlap. Data uniqueness is measured against all records within a data set or across data sets, and indicates whether each record is a single recorded instance in the data set used.
Key questions include: Are there any duplicates within your data?
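A uniqueness check can be sketched by counting how often each key occurs and reporting the keys that occur more than once. Which fields make up the key is an assumption you would set for each data set; the names and sample data below are hypothetical.

```python
from collections import Counter


def duplicate_keys(records, key_fields):
    """Return each key that occurs more than once, with its number of occurrences."""
    counts = Counter(tuple(r.get(f) for f in key_fields) for r in records)
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical records where `email` is expected to be unique
contacts = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "b@example.com"},
    {"id": 3, "email": "a@example.com"},  # duplicate of record 1
]

duplicates = duplicate_keys(contacts, key_fields=["email"])
print(duplicates)
```

Passing several fields in `key_fields` checks uniqueness of the combination rather than of each field on its own.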
You can add additional dimensions by going to the Data Quality Dimension collection page and clicking ADD.
Viewing data quality results
If a data quality test has been loaded from a DQ App, you will see a Quality tab on the Data Profile Page for the target of the test (e.g. the results of a test run on a column will appear on the column profile page).
In the Summary section, you can review the data quality scores across all the different dimensions. You can review data quality scores at both the table level and the individual column level.
When you click on the Details section, you can review the data quality trend over time.
To see more details about the test results for a particular day, hover over that day and click on its bar. A list of all tests run that day will be displayed, showing whether each test passed or failed.
You can click on each individual test for more detail.
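This page does not describe how K calculates its data quality scores. Purely as an illustration of how pass/fail test results could roll up into a score per dimension, the sketch below assumes the score is a simple pass rate. The record layout and the function name are hypothetical and do not reflect K's data model.

```python
from collections import defaultdict


def pass_rate_by_dimension(results):
    """Return the fraction of passed tests for each data quality dimension."""
    totals = defaultdict(lambda: [0, 0])  # dimension -> [passed, total]
    for result in results:
        bucket = totals[result["dimension"]]
        bucket[0] += 1 if result["passed"] else 0
        bucket[1] += 1
    return {dim: passed / total for dim, (passed, total) in totals.items()}


# Hypothetical results for the tests run on one day
day_results = [
    {"test": "email_not_null", "dimension": "Completeness", "passed": True},
    {"test": "phone_not_null", "dimension": "Completeness", "passed": False},
    {"test": "customer_id_unique", "dimension": "Uniqueness", "passed": True},
]

scores = pass_rate_by_dimension(day_results)
print(scores)
```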
Linking data quality tests to a data quality dimension
We recommend linking data quality tests to the Data Quality Collection. This lets you see all of your data quality tests in a collection format and identify data quality themes.
To link a test:
Navigate to the data profile page of the data quality test you want to link.
Click on the Data Quality Dimension field in the Details panel.
Select the relevant Data Quality Dimension.
After you link the test to the specific Data Quality Dimension, you can click on the tag to view the Data Quality Dimension instance that has been linked.
Click on the Related tab to see a full list of all other Data Quality Tests linked to this specific dimension.