Difference between revisions of "Data-centric knowledge"

From PKC

Revision as of 06:31, 21 February 2022

Data-centric knowledge is an approach that explicitly applies Data Science concepts and modern data manipulation instruments to organize knowledge. The main motivation for organizing knowledge in a data-centric manner comes from Moore's Law, which points to a causal connection between the physical dimensions of data manipulation instruments and their impact on socio-technical dynamics. Effectively, Moore's Law established a functional relationship between dimensionless, scale-free data and the observable speed and scale of decision-making in commodity devices. This functional relationship, whether exponential or not, holds qualitatively, and it can be relaxed to encompass a larger class of decision-making problems, including shaping the intellectual framework of data-centric knowledge management.

Data-Centric Knowledge under the context of MU

Knowledge is represented as a special kind of data, based on raw data and computed from previously established information content, under a unifying context of MU data operations. Every piece of knowledge must go through the following stages to be given a representable handle for ongoing integration of knowledge content:

  1. Grounding Raw Data: This data set is collected from widely deployed user terminals or certified data sensors, and should always be annotated with timestamps and spatial tags that explicitly specify who collected the data, when, and where. This raw data content, especially the timestamps and the location or account that provided the data, is used as a reference to determine the authenticity of the data.
  2. Inferred Information: Information content is filtered, ordered, and prioritized on the basis of the raw data described above. This filtering is conducted by a set of computational inference tools whose source code is version-controlled under MU-compliant rules. Computational procedures specified using neural networks, Bayesian belief networks, system dynamics models, and other data-intensive inference mechanisms have their training data sets included in the version-controlled data content.
  3. Action of Acknowledgement: Knowledge is represented as a set of causal relations that are explicitly coded as executable programs or contracts defined in MU-compliant PKCs. An action of acknowledgement can be triggered automatically by verified raw data and programmatically computed information content, or semi-automatically through human-in-the-loop authorization. The event of acknowledgement can be represented as a piece of authenticated data that possesses pragmatic value, such as a token of appreciation, an honor badge, or a cash payment. It is through the data on the event of acknowledgement that knowledge content is registered and represented in MU.
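
The three stages above can be sketched as plain data records. This is only an illustration under stated assumptions, not the MU or PKC implementation: the class names `RawData`, `InferredInformation`, and `Acknowledgement`, their fields, the averaging step, and the threshold rule are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional, Tuple


@dataclass(frozen=True)
class RawData:
    """Stage 1: a reading annotated with who, when, and where (hypothetical fields)."""
    account: str
    collected_at: datetime
    location: str
    value: float


@dataclass(frozen=True)
class InferredInformation:
    """Stage 2: a result computed from raw data by a version-controlled tool."""
    sources: Tuple[RawData, ...]
    tool_version: str
    score: float


@dataclass(frozen=True)
class Acknowledgement:
    """Stage 3: the authenticated event that registers a piece of knowledge."""
    information: InferredInformation
    approved_by: str
    token: str


def infer(readings, tool_version):
    """Toy inference step: average the raw values (a stand-in for a real model)."""
    readings = tuple(readings)
    score = sum(r.value for r in readings) / len(readings)
    return InferredInformation(readings, tool_version, score)


def acknowledge(info, threshold, approver: Optional[str] = None):
    """Acknowledge automatically at or above the threshold, otherwise only with a human approver."""
    if info.score >= threshold:
        return Acknowledgement(info, "auto", "badge")
    if approver is not None:
        return Acknowledgement(info, approver, "badge")
    return None


collected = datetime(2022, 2, 21, 6, 31, tzinfo=timezone.utc)
readings = [
    RawData("alice", collected, "site-A", 0.5),
    RawData("bob", collected, "site-B", 1.0),
]
info = infer(readings, tool_version="v1")
event = acknowledge(info, threshold=0.7)
```

In this sketch the acknowledgement record keeps a reference to the information it acknowledges, which in turn keeps its raw sources, so the provenance chain from stage 3 back to stage 1 stays inspectable.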

Universal Knowledge Representation

Data-centric knowledge subscribes to a central thesis: all knowledge content can be universally represented by one data type, composed of ordered relations. Its universal applicability rests on the mathematical notion of Kan extension. On this view, Kan extension implies that all concepts and idealized knowledge are representable through functors, which are a kind of directed relation. This means that knowledge of any kind can be represented using ordered entries of concrete data points stored in scalable databases.
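
As a rough illustration of "ordered entries of concrete data points stored in a database", the sketch below stores directed relations as rows in an in-memory SQLite table using Python's standard `sqlite3` module. The table name `relation`, its columns, and the sample rows are assumptions made for this example and are not part of MU or PKC.

```python
import sqlite3

# One table, one data type: each row is an ordered (source, target) pair with a label.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE relation ("
    "source TEXT NOT NULL, target TEXT NOT NULL, label TEXT NOT NULL)"
)

# Hypothetical sample entries; the order of source and target is significant.
rows = [
    ("raw data", "information", "inferred into"),
    ("information", "acknowledgement", "triggers"),
    ("acknowledgement", "knowledge", "registers"),
]
conn.executemany("INSERT INTO relation VALUES (?, ?, ?)", rows)
conn.commit()


def targets_of(source):
    """Return every target directly reachable from the given source."""
    cursor = conn.execute(
        "SELECT target FROM relation WHERE source = ? ORDER BY target",
        (source,),
    )
    return [row[0] for row in cursor]
```

Calling `targets_of("raw data")` looks up the entries whose first component is `"raw data"`; reversing the pair would be a different relation, which is what makes the entries ordered.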

Composability and Universality

Given the assumption that knowledge content is universally representable, it follows that content can always be composed of these arrow-like universal components. The direct connection between composability and universality is a crucial insight for learning and teaching knowledge based on Data Science. Here, Data Science may be considered an extension or inclusion of quantum mechanics, in the sense of a scientific language explicitly designed to encompass data at all possible physical scales and in all forms. All scientific hypotheses of quantum mechanics are grounded in observable data and the interpretive mechanisms for these data. Therefore, instead of focusing only on the physical meaning of observable data, we can use the science of data interpretation as a generalized tool for all other areas of intellectual work. In any case, the logic of data is the grounding currency of science, and all data interpretation must follow a consistent set of logical rules. Even if one tries to extend the scope of certain logical assertions, that scope is itself denoted in a logically sound data set.
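
The idea that content is composed of arrow-like components can be made concrete with a small sketch. The `Arrow` type and the `compose` and `compose_path` helpers below are hypothetical names chosen for illustration; they show only the basic rule that two arrows compose when the target of the first matches the source of the second.

```python
from typing import NamedTuple, Sequence


class Arrow(NamedTuple):
    """A directed relation from a source to a target (hypothetical type)."""
    source: str
    target: str


def compose(first: Arrow, second: Arrow) -> Arrow:
    """Compose two arrows end to end; they must meet at a shared point."""
    if first.target != second.source:
        raise ValueError(
            f"cannot compose: {first.target!r} does not match {second.source!r}"
        )
    return Arrow(first.source, second.target)


def compose_path(arrows: Sequence[Arrow]) -> Arrow:
    """Fold a non-empty chain of arrows into a single arrow."""
    result = arrows[0]
    for arrow in arrows[1:]:
        result = compose(result, arrow)
    return result


path = [
    Arrow("raw data", "information"),
    Arrow("information", "acknowledgement"),
    Arrow("acknowledgement", "knowledge"),
]
end_to_end = compose_path(path)
```

Because composition only inspects endpoints, grouping the chain as `compose(compose(a, b), c)` or `compose(a, compose(b, c))` gives the same arrow, which is the associativity that composable components rely on.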

Observability of Data-Centric Knowledge

MU is about bringing the power of data to both individual and organizational awareness. This means that data of different kinds are continuously processed and reported to support project and resource management in general. Making data assets observable in suitable reporting formats should significantly improve the quality and quantity of human and organizational activities.

Report Generation as a Means to an End

It is necessary to note that observability is just a means to an end. The goal is not to generate reports; the goal is to elevate awareness through data reporting in context. Therefore, choosing the format and frequency of report generation and data visualization is a form of art that needs to be integrated extensively with UI/UX design efforts.

Areas of Tasks regarding Data Management

Anshul Tiwari, a data engineer, states that there are at least 11 areas of data management activities[1]:

  1. Data Governance
  2. Data Architecture
  3. Data Modeling and Design
  4. Data Storage and Operations
  5. Data Security
  6. Data Integration
  7. Documentation and Content
  8. Master Data Management
  9. Data Warehouse and Business Intelligence
  10. Meta Data Management
  11. Data Quality

References

Related Pages