K is a Data Knowledge platform enabling data discovery, knowledge management and data governance for all data users.
This page will provide a brief introduction to K and its architecture.
Introduction to K
K is a Data Knowledge platform for discovering, profiling and understanding how data products (data sets, analyses, reports etc.) across an Enterprise are used.
K focuses on identifying and storing how users work with data. It leverages this information to help data producers improve their products, to help data owners take accountability for the proper use of their data, and to scale hidden knowledge to all data workers. The product vision is to become the central platform for all Enterprise data users to easily discover, understand and govern the use of data.
K Architecture
K Services
Component | Description |
Extractors | The service is used for connecting to data sources and tools, and for extracting and loading their metadata and logs. When using the K SaaS offering, the extractors can also be deployed as a collector service for on-premise sources if direct access between the on-premise source and the SaaS offering is not available. |
Profiler | The service is used to identify and profile data assets and their usage. A set of proprietary algorithms is used to automatically match and analyse data assets over their lifecycle. |
Identity | The service is used to integrate with the Enterprise Identity Management service to provide single sign-on. |
Search | The service provides fast, accurate and contextual search for all assets within K. |
Applications | The service is used to access dedicated applications built to solve specific data problems, e.g. migration assessment and impact assessment. |
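To make the division of labour between the Extractors and Profiler services more concrete, here is a minimal Python sketch. It is an illustration only, not K's implementation: the names `LogEvent`, `extract` and `profile`, and the `user,asset` log format, are assumptions made for this example, and the simple counting logic merely stands in for K's proprietary profiling algorithms, which are not described on this page.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEvent:
    """One usage record pulled from a source's logs (hypothetical shape)."""
    user: str
    asset: str


def extract(raw_lines):
    """Extractor step: parse 'user,asset' log lines into LogEvent records."""
    events = []
    for line in raw_lines:
        user, asset = line.strip().split(",")
        events.append(LogEvent(user=user, asset=asset))
    return events


def profile(events):
    """Profiler step: per asset, count usage events and distinct users."""
    usage = Counter(e.asset for e in events)
    users = {}
    for e in events:
        users.setdefault(e.asset, set()).add(e.user)
    return {
        asset: {"queries": usage[asset], "distinct_users": len(users[asset])}
        for asset in usage
    }


# Made-up sample log lines, for illustration only.
raw = ["ana,sales.orders", "ben,sales.orders", "ana,sales.orders", "ben,hr.staff"]
summary = profile(extract(raw))
print(summary["sales.orders"])
```

In this sketch the extractor only normalises raw log lines, while the profiler derives usage facts from them. Keeping the two steps separate mirrors the description above, where extraction may run in a collector close to an on-premise source.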
Interfaces
Component | Description |
API | This interface is used by applications and services to interact with and access data managed by K. |
Web Portal | This interface is used by end users (e.g. data managers, analysts etc.) to access K and its services. |
Notifications | This interface is used to engage with end users via push notifications, e.g. email. |
Stores
Component | Description |
Metadata | The metadata store is used to store the details of, and relationships between, data assets, reports, users, teams and other objects within the data ecosystem. |
Timeseries | The timeseries store is used to record each data asset, person or content item and its lifecycle over time. |
Index | Each object in the data ecosystem is added to a search index to enable the contextual search service. |
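The three stores can be pictured with a small Python sketch. This is a toy, in-memory illustration under stated assumptions, not K's actual schema or storage technology: the object identifiers, relationship names (`reads_from`, `owns`), event names, dates and helper functions are all hypothetical.

```python
from collections import defaultdict
from datetime import date

# Metadata store: objects plus typed relationships between them.
metadata = {
    "objects": {
        "table:sales.orders": {"type": "table"},
        "report:weekly_sales": {"type": "report"},
        "user:ana": {"type": "user"},
    },
    "relationships": [
        ("report:weekly_sales", "reads_from", "table:sales.orders"),
        ("user:ana", "owns", "report:weekly_sales"),
    ],
}

# Timeseries store: dated lifecycle events per object (made-up dates).
timeseries = defaultdict(list)
timeseries["table:sales.orders"].append((date(2024, 1, 5), "created"))
timeseries["table:sales.orders"].append((date(2024, 3, 1), "schema_changed"))

# Index: a toy inverted index from name tokens to object ids.
index = defaultdict(set)
for object_id in metadata["objects"]:
    name = object_id.split(":", 1)[1]
    for token in name.replace(".", "_").split("_"):
        index[token].add(object_id)


def search(term):
    """Return ids of objects whose name contains the token, sorted for stable output."""
    return sorted(index.get(term.lower(), set()))


def downstream_of(object_id):
    """Use 'reads_from' relationships to find objects that depend on object_id."""
    return [src for src, rel, dst in metadata["relationships"]
            if rel == "reads_from" and dst == object_id]


print(search("sales"))
print(downstream_of("table:sales.orders"))
```

Here the metadata store answers relationship questions (which reports read from a table), the timeseries store answers lifecycle questions (what happened to an asset and when), and the index answers lookup questions (which objects match a term).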
Inputs
Component | Description |
Data Sources | Data sources (e.g. Teradata, Hadoop, Snowflake, SQL Server etc.) where data is stored and used by the Enterprise data teams. K has integrators for many on-premise and cloud data sources and can also ingest custom data sources through the K ingestion framework. |
Data Tools | Reporting and Analytics applications (e.g. Tableau, Power BI etc.) used by the Enterprise data teams to create, manage and distribute content. K has integrators for common data tools and can also ingest custom data tools through the K ingestion framework. |
Identity | Identity provider and user management sources (e.g. LDAP, SAML, OpenID Connect) that can provide single sign-on as well as user and team data. |