
A Trustless Future: Verifiable Knowledge and Explainable AI

Why we need to build for Collective Wisdom

The year is 2030. Rapidly increasing crime rates have immensely strained the legal system, leaving government officials in the United States desperate to find solutions. To alleviate some of this burden, they decide to implement an AI-powered system to tackle a critical issue: the rise in recidivism rates. Designed to predict the likelihood of re-offense, the AI will streamline bail, sentencing, and parole decisions, saving taxpayers millions and public servants countless hours. Government officials, data scientists, and engineers train the model on historical criminal records, a logical decision. They spend months rigorously testing the system, running countless simulations to ensure the model works as intended.

Confident in their diligence, they launch the AI, trusting that cutting-edge technology and careful oversight will ease the burden on the legal system. But as the system goes live, troubling patterns emerge. Black defendants are disproportionately labeled as high-risk, leading to real-world implications like harsher bail and sentencing outcomes. The officials scramble to review the AI's explanations—why did the model make these decisions?

Except—and you may have picked up on this by now—this isn't a futuristic scenario. In 2016, the COMPAS Recidivism Algorithm was used to make similar predictions. What the researchers didn't account for was that the historical data was biased, disproportionately labeling Black defendants as high-risk and reinforcing systemic racial prejudices. The algorithm's biased assessments resulted in harsher bail, sentencing, and parole outcomes for many Black defendants, perpetuating racial inequalities in the criminal justice system.

Shouldn’t the data used by AI systems, along with the reasoning and decision-making models they employ, be traceable, auditable, and open to scrutiny—especially when they determine the fates of people who never agreed to their use? In the case of COMPAS, one of the most significant barriers to detecting bias was the algorithm's proprietary nature and its reliance on opaque data. Nor is COMPAS an isolated case; there are countless others, some high-profile, most unknown. When data is hoarded and proprietary models are insulated from public view, accountability falters and our collective potential diminishes. As we'll explore in the following sections, the Verifiable Internet, together with Explainable AI, or XAI, allows us to change our story; we can become a society that thrives on transparency and accountability.

Building the Verifiable Internet

Trust is for friends and family, not data. With the Verifiable Internet, we can build trust-minimized knowledge; data validity is cryptographically proven, so there's no need to take someone's word for it. To do this, we can use digital signatures and hash functions to confirm the legitimacy of data. All content, from news articles to datasets, has a unique cryptographic hash—a fixed-length fingerprint of the original data. Any alteration to the content results in a completely different hash, allowing for real-time integrity checks. By leveraging public-key infrastructure, or PKI, and blockchain-based systems, the content's origin can be authenticated and its integrity verified.
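The integrity-check half of this idea fits in a few lines of Python. This is a minimal sketch using only the standard-library `hashlib` module; the dataset string and function name are illustrative, not part of any real protocol:

```python
import hashlib

def content_hash(data: bytes) -> str:
    """Return the SHA-256 hash of a piece of content as a hex string."""
    return hashlib.sha256(data).hexdigest()

# A publisher announces the hash alongside the content.
original = b"Recidivism dataset v1: 7,214 records, collected 2013-2014"
published_hash = content_hash(original)

# A consumer re-hashes what they received and compares.
received = b"Recidivism dataset v1: 7,214 records, collected 2013-2014"
assert content_hash(received) == published_hash   # integrity confirmed

# Any alteration, however small, yields a completely different hash.
tampered = b"Recidivism dataset v1: 7,214 records, collected 2013-2015"
assert content_hash(tampered) != published_hash   # tampering detected
```

A digital signature over `published_hash` (using PKI) would additionally prove *who* published it; the hash alone only proves the content hasn't changed.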

Cryptographic verification eliminates the need to rely on centralized authorities or intermediaries to confirm the integrity of information. This feat of trust-minimization deters data manipulation and ensures transparency, giving data consumers confidence that what they receive is authentic and tamper-proof.

The Decentralized Knowledge Graph, or DKG, is foundational to the Verifiable Internet, structuring knowledge into discrete units known as Knowledge Assets. These assets are meticulously organized to be verifiable, ownable, and traceable, providing a clear and unalterable record of data provenance. Unlike traditional databases that centralize data storage and control, the DKG operates on a decentralized network of nodes, ensuring that no single entity has unilateral control over the data. This decentralization enhances security, reduces the risk of data tampering, and promotes transparency, as every Knowledge Asset is subject to continuous verification by the network.

Consider a research paper published on the current web. Typically, information consumers have no choice but to trust the paper's validity based on the reputation of its publisher or the institution behind it. Contrast this with the Verifiable Internet, which provides a way to cryptographically verify not only the paper's origin but also the data it utilized and every modification it underwent throughout its lifecycle. This thorough verification process means that each research element is transparently documented and immutable. If a person makes any intentional or accidental alterations, the cryptographic signatures associated with the Knowledge Asset will reflect these changes, allowing users to identify and assess them accurately.
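One simple way to picture "every modification throughout its lifecycle" is a hash chain: each revision's hash covers both its content and the previous revision's hash, so retroactively altering any earlier entry breaks every link after it. A minimal sketch, with illustrative record fields (this is not the DKG's actual data model):

```python
import hashlib
import json

def record_revision(history: list, content: str, author: str) -> None:
    """Append a revision whose hash covers the content AND the prior
    entry's hash, chaining every edit to the paper's full history."""
    prev_hash = history[-1]["hash"] if history else "genesis"
    payload = json.dumps({"content": content, "author": author, "prev": prev_hash})
    history.append({"content": content, "author": author, "prev": prev_hash,
                    "hash": hashlib.sha256(payload.encode()).hexdigest()})

def verify_history(history: list) -> bool:
    """Re-derive every hash from scratch; any retroactive edit breaks the chain."""
    prev_hash = "genesis"
    for entry in history:
        payload = json.dumps({"content": entry["content"],
                              "author": entry["author"], "prev": prev_hash})
        if entry["prev"] != prev_hash or \
           entry["hash"] != hashlib.sha256(payload.encode()).hexdigest():
            return False
        prev_hash = entry["hash"]
    return True

paper = []
record_revision(paper, "Draft v1", "alice")
record_revision(paper, "Draft v2: corrected sample size", "alice")
assert verify_history(paper)

paper[0]["content"] = "Draft v1 (quietly altered)"   # retroactive tampering
assert not verify_history(paper)
```

In a decentralized setting, signing each entry and anchoring the latest hash on-chain would prevent an attacker from simply regenerating the whole chain.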

Information provenance, the ability to trace the exact source and history of data, is the linchpin of the Verifiable Internet; the idea has roots in Tim Berners-Lee's vision of a web of linked, traceable data, long before the term "Verifiable Internet" came into public consciousness. Provenance is essential for AI models, which rely heavily on vast datasets to learn and make decisions. By ensuring that AI systems consume only trusted, traceable inputs, the Verifiable Internet mitigates the risks of data corruption, manipulation, and the hidden biases that unverified sources can carry.

For advocates of self-sovereignty, the Verifiable Internet is a compelling tool. It enhances data ownership and control, empowering individuals to dictate how their information is accessed, shared, and used. Individuals maintain full rights over their personal data. Personal Knowledge Graphs, or PKGs, are an individual's data, preferences, and interactions, ensuring that users are the sole custodians of their digital identity. With cryptographic verification, individuals can decide who accesses their PKG, how the data is shared, and why. Owning a PKG allows users to operate in a decentralized system where they retain full rights over their digital selves.

In addition to promoting data sovereignty, the Verifiable Internet is a powerful weapon against misinformation and disinformation. It allows users to confirm the authenticity of content before sharing it or making decisions based on it. In an era of fast-spreading false information, this system enables content to be cryptographically validated, ensuring that the information being shared is accurate and has not been tampered with. This contributes to a more transparent and trustworthy online ecosystem, empowering users to independently verify data and curb the spread of harmful misinformation.

XAI and the Verifiable Internet

AI models face serious challenges today, particularly regarding hallucinations—outputs that appear convincing but are fundamentally incorrect, often due to corrupted, biased, or unverifiable data.

The risk of model collapse, where models degrade by training on synthetic outputs, becomes increasingly concerning as AI-generated content proliferates. The Verifiable Internet is key to mitigating these risks, establishing that every dataset or piece of information used by AI systems is verifiable and traceable. In critical sectors like healthcare, verifying the provenance of medical studies could be the difference between life-saving diagnoses and harmful misinformation.

Still, verifiable data alone cannot guarantee transparency or accountability. XAI is essential because it provides human-interpretable insights into how AI systems make decisions, which helps determine whether their outputs are reliable. Combining XAI with the Verifiable Internet would make both the data and the processes that drive AI models fully transparent and open to scrutiny, enabling deeper trust in AI technology.

A promising approach to addressing AI hallucinations is the dRAG, or Decentralized Retrieval Augmented Generation framework, which merges neural AI models, like large language models, with symbolic AI, like Knowledge Graphs. By dynamically retrieving verifiable data from DKGs before generating responses, dRAG anchors AI outputs to reliable, referenceable data points, reducing reliance on purely probabilistic guesses. The transparency offered by XAI, combined with verifiable data, enables users and developers to understand how and why decisions are made—an ideal far removed from what exists today.
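The retrieval step can be sketched in miniature: look up facts relevant to the question, keep only those that still match their published hash, and hand the model a prompt anchored to that verified context. Everything here is a toy stand-in—the in-memory "knowledge graph", the keyword matching, and the function names are all illustrative, not the real dRAG or DKG interfaces:

```python
import hashlib

def h(s: str) -> str:
    return hashlib.sha256(s.encode()).hexdigest()

# Toy in-memory knowledge graph: each fact carries the hash it was
# published with, standing in for a Knowledge Asset on the DKG.
KNOWLEDGE_ASSETS = [
    {"fact": "Aspirin inhibits the COX-1 enzyme.",
     "hash": h("Aspirin inhibits the COX-1 enzyme.")},
    {"fact": "Aspirin was first synthesized in 1897.",
     "hash": h("Aspirin was first synthesized in 1897.")},
]

def tokens(text: str) -> set:
    return {w.strip("?.,!").lower() for w in text.split()}

def retrieve_verified(question: str) -> list:
    """Return facts that overlap the question AND still match their
    published hash; tampered or unverifiable facts are dropped."""
    return [a["fact"] for a in KNOWLEDGE_ASSETS
            if tokens(question) & tokens(a["fact"])
            and h(a["fact"]) == a["hash"]]

def build_grounded_prompt(question: str) -> str:
    """Anchor the model to verified context instead of free association."""
    context = "\n".join(f"- {f}" for f in retrieve_verified(question))
    return ("Answer using ONLY the verified facts below; "
            "if they are insufficient, say so.\n"
            f"{context}\nQ: {question}")

prompt = build_grounded_prompt("What enzyme does aspirin inhibit?")
```

A production system would use semantic search over SPARQL-queryable graph data rather than keyword overlap, but the principle is the same: the model only sees context whose provenance has been checked.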

Here's another compelling case for the Verifiable Internet: as synthetic AI-generated content increasingly floods the web, the Verifiable Internet could prevent model collapse. Authenticated, human-verified data will become indispensable for maintaining the quality of AI models, significantly increasing that data's value. Leveraging blockchain, decentralized storage systems, and cryptographic verification, the Verifiable Internet provides a reliable foundation for AI that remains explainable, auditable, and less prone to the risks of model degradation.

For this reason, tokenized reward systems should incentivize the creation of high-quality, human-authenticated data. These systems encourage contributions to decentralized knowledge pools by rewarding researchers, data scientists, and developers who generate valuable, verified knowledge assets. Reputation-based incentives can help ensure that contributors continue producing high-quality data.
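One way such an incentive could work is to split a reward pool across contributors, weighting each verified contribution by the contributor's reputation score. This is a hypothetical sketch—the field names, weighting formula, and numbers are illustrative, not any live protocol's tokenomics:

```python
def distribute_rewards(pool: float, contributions: list) -> dict:
    """Split a token reward pool across contributors, weighting each
    contributor's count of verified assets by their reputation score."""
    weights = {c["who"]: c["assets_verified"] * c["reputation"]
               for c in contributions}
    total = sum(weights.values())
    return {who: round(pool * w / total, 2) for who, w in weights.items()}

# Equal output, but alice's track record of verified work earns more.
payouts = distribute_rewards(100.0, [
    {"who": "alice", "assets_verified": 10, "reputation": 0.9},
    {"who": "bob",   "assets_verified": 10, "reputation": 0.3},
])
# payouts -> {"alice": 75.0, "bob": 25.0}
```

The design choice worth noting: multiplying volume by reputation means flooding the pool with low-quality assets earns little, so the incentive points toward verified quality rather than raw quantity.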

Leveraging XAI and the Verifiable Internet in DeSci

When it comes to decentralized science, or DeSci, integrating XAI into research workflows would significantly enhance efficiency and transparency, key value-adds for the DeSci framework. XAI can help automate complex research tasks, such as literature reviews, identifying relevant datasets, or generating hypotheses. The real value, though, lies in its transparency. Each AI decision is clear and traceable, meaning researchers can understand not only the results but also how the AI arrived at those conclusions.

This fits seamlessly with the Verifiable Internet, which ensures that all datasets, models, and research inputs are auditable and traceable to their source. Researchers working within DeSci can crowdsource the validation of scientific data, supporting open and decentralized collaboration. By leveraging XAI within a verifiable internet framework, research processes can simultaneously be accelerated and subjected to rigorous peer review, reducing the risk of bias or faulty conclusions.

Crowdsourced verification, central to the DeSci ethos, perfectly aligns with the verifiable internet because it empowers a global community of scientists to audit and authenticate research data collectively. By decentralizing this verification process, researchers ensure that no single entity controls the data's integrity. It also reduces reliance on traditional gatekeepers like academic journals, democratizing access to the research and the tools needed to validate it.

Final Thoughts

The failures in the case of the COMPAS Recidivism Algorithm likely weren't due to negligence or bad intentions. More probably, they stemmed from something less malicious but still egregious: an eagerness to deploy AI models trained on flawed, biased data. Junk data, seemingly innocuous, becomes far more ominous in an age where AI systems hold tremendous decision-making power—and COMPAS showed in 2016 that flawed, unchecked data can have real-world, life-altering consequences.

Here's what I think: we shouldn't need to rely on blind trust regarding the systems that impact our lives. Data—and the AI models built on that data—should be trustless, meaning verifiable, traceable, auditable, and explainable. The significance of verifiable data informing explainable AI models cannot be overstated. With XAI and the Verifiable Internet, we can see this vision manifest.

We're at a critical juncture as a species: we can choose a world where mega-rich, data-hoarding corporations hide behind black-box algorithms, or we can build for collaboration, tapping into the infinite wellspring of collective wisdom already abundant in humankind. To me, the latter is a future worth building.
