Data-centric knowledge

From PKC
Revision as of 06:01, 21 February 2022 by Benkoo

Data-centric knowledge is an approach that explicitly applies Data Science concepts and data manipulation instruments to organize knowledge. Its universal applicability rests on the representability assumption of Kan extensions: all concepts and idealized knowledge are taken to be representable through functors from a domain of complex data types to uniquely identifiable data entries in set-theoretic format. This means that knowledge of any kind can be stored or represented using concrete data points stored in databases.

Data-Centric Knowledge in the context of MU

Knowledge is represented as a special kind of data, grounded in raw data and computed from previously established information content under a unifying context of MU data operations. Every piece of knowledge goes through the following stages to be given a representable handle for ongoing integration of knowledge content:

  1. Grounding Raw Data: This data set is collected from widely deployed user terminals or certified data sensors, and should always be annotated with timestamps and spatial tags that explicitly specify who collected the data, when, and where. This raw data content, especially the timestamps and the location/account that provided the data, will be used as a reference to determine the authenticity of the data.
  2. Inferred information: Information content is filtered, ordered, and prioritized on the basis of the raw data described above. This filtering procedure is conducted by a set of computational inference tools whose source code is version-controlled based on MU-compliant rules. Computational procedures specified using neural networks, Bayesian belief networks, system dynamics models, and other data-intensive inference mechanisms will have their training data sets included in the version-controlled data content.
  3. Action of Acknowledgment: Knowledge is represented as a set of causal relations that are explicitly coded as executable programs/contracts defined in MU-compliant PKCs. An action of acknowledgment can be triggered automatically by verified raw data and programmatically computed information content, or semi-automatically through human-in-the-loop authorization. The event of acknowledgment can be represented as a piece of authenticated data that possesses pragmatic value, such as a token of appreciation, an honor badge, or a cash payment. It is through the data on the event of acknowledgment that we register and represent knowledge content in MU.
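The three stages above can be sketched as a small data pipeline. This is only an illustrative sketch, not an MU or PKC implementation: the class names (RawDatum, InferredInfo, Acknowledgment), the SHA-256 fingerprint, the threshold rule, and the "badge" reward are all assumptions introduced here to show how a raw record tagged with who/when/where could be carried through inference to an acknowledgment event.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional
import hashlib
import json


@dataclass(frozen=True)
class RawDatum:
    """Stage 1 (hypothetical shape): an observation tagged with who, when, where."""
    account: str          # who provided the data
    timestamp: datetime   # when it was collected
    location: str         # where it was collected
    payload: dict         # the observed values

    def fingerprint(self) -> str:
        """Content hash that later records can cite as a reference."""
        body = json.dumps(
            {
                "account": self.account,
                "timestamp": self.timestamp.isoformat(),
                "location": self.location,
                "payload": self.payload,
            },
            sort_keys=True,
        )
        return hashlib.sha256(body.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class InferredInfo:
    """Stage 2 (hypothetical shape): output of a versioned inference procedure."""
    source: str         # fingerprint of the raw datum that was used
    model_version: str  # version tag covering inference code and training data
    score: float


@dataclass(frozen=True)
class Acknowledgment:
    """Stage 3 (hypothetical shape): the event that registers knowledge."""
    source: str
    model_version: str
    reward: str       # e.g. a badge or token; assumed for illustration
    approved_by: str  # "auto" or the name of a human reviewer


def infer(datum: RawDatum, model_version: str,
          scorer: Callable[[dict], float]) -> InferredInfo:
    """Run a scoring function over the payload and link it to its source."""
    return InferredInfo(datum.fingerprint(), model_version, scorer(datum.payload))


def acknowledge(info: InferredInfo, threshold: float,
                reviewer: Optional[str] = None) -> Optional[Acknowledgment]:
    """Acknowledge automatically above the threshold, or on human sign-off."""
    if info.score >= threshold:
        return Acknowledgment(info.source, info.model_version, "badge", "auto")
    if reviewer is not None:
        return Acknowledgment(info.source, info.model_version, "badge", reviewer)
    return None


datum = RawDatum(
    account="sensor-42",
    timestamp=datetime(2022, 2, 21, 6, 1, tzinfo=timezone.utc),
    location="site-A",
    payload={"temperature_c": 21.5},
)
info = infer(datum, "v1.0.0", lambda p: 1.0 if "temperature_c" in p else 0.0)
ack = acknowledge(info, threshold=0.5)
```

Each later record carries the fingerprint of the raw datum it was derived from, so an acknowledgment can be traced back to its grounding data and to the version of the inference procedure that produced it.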

Observability of Data-Centric Knowledge

MU is about bringing the power of data to both individual and organizational awareness. This means that data of different kinds will be continuously processed and reported to support project and resource management. Making data assets observable in suitable reporting formats is expected to improve the quality and quantity of human and organizational activities.

Report Generation as a Means to an End

Observability is only a means to an end. The goal is not to generate reports; the goal is to elevate awareness through data reporting in context. The format and frequency of report generation and data visualization are therefore a matter of craft, and need to be closely integrated with UI/UX design efforts.

Areas of Tasks regarding Data Management

Anshul Tiwari, a data engineer, stated that there are at least 11 areas of data management activities[1]:

  1. Data Governance
  2. Data Architecture
  3. Data Modeling and Design
  4. Data Storage and Operations
  5. Data Security
  6. Data Integration
  7. Documentation and Content
  8. Master Data Management
  9. Data Warehouse and Business Intelligence
  10. Metadata Management
  11. Data Quality

References

Related Pages