A bit of an overview and then I'll get into some of the more specific arguments in a thread:
This piece is in three parts:
First I trace the mutation of the liberatory ambitions of the #SemanticWeb into #KnowledgeGraphs, an underappreciated component in the architecture of #SurveillanceCapitalism. This mutation plays out against the backdrop of the broader platform capture of the web, rendering us as consumer-users of information services rather than empowered people communicating over informational protocols.
I then show how this platform logic influences two contemporary public information infrastructure projects: the NIH's Biomedical Data Translator and the NSF's Open Knowledge Network. I argue that projects like these, while well intentioned, demonstrate the fundamental limitations of platformatized public infrastructure and create new capacities for harm through their enmeshment in, and inevitable capture by, information conglomerates. The dream of a seamless "knowledge graph of everything" is unlikely to deliver on the utopian promises made by techno-solutionists, but it does create new opportunities for algorithmic oppression: automated conversion therapy, predictive policing, abuse of bureaucracy in "smart cities," and so on. Given the framing of corporate knowledge graphs, these projects are poised to create facilitating technologies (which the info conglomerates write about needing themselves) for a new kind of interoperable corporate data infrastructure, where a gradient of public to private information is traded between "open" and quasi-proprietary knowledge graphs to power derivative platforms and services.
When approaching "AI" from the perspective of the semantic web and knowledge graphs, it becomes apparent that the new generation of #LLMs is intended to serve as interfaces to knowledge graphs. These "augmented language models" are joint systems that combine a language model, as a means of interaction, with some underlying knowledge graph, integrated at multiple points in the computing ecosystem: e.g. mobile apps, assistants, search, and enterprise platforms. I concretize and extend prior criticism about the capacity of LLMs to concentrate power by capturing access to information in increasingly isolated platforms, and to expand surveillance by creating demand for extended personalized data graphs across multiple systems, from home surveillance to workplace, medical, and governmental data.
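To make the "LLM as interface to a knowledge graph" pattern concrete, here's a minimal sketch of the retrieve-then-prompt loop these joint systems use. Everything here is a hypothetical illustration — the toy graph, the `retrieve_facts` helper, and the prompt format are my own invention, not any vendor's actual API:

```python
# A toy knowledge graph as (subject, predicate, object) triples.
# In a deployed "augmented language model," this would be a large
# proprietary graph; here it's three hand-written facts.
KNOWLEDGE_GRAPH = {
    ("alice", "purchased", "treadmill"),
    ("alice", "lives_in", "springfield"),
    ("treadmill", "category", "fitness_equipment"),
}

def retrieve_facts(entity: str) -> list[tuple[str, str, str]]:
    """The 'retrieval' step: pull every triple mentioning the entity."""
    return sorted(t for t in KNOWLEDGE_GRAPH if entity in (t[0], t[2]))

def build_prompt(question: str, entity: str) -> str:
    """Splice retrieved graph facts into the language model's context,
    so the LM answers from the graph rather than from its weights."""
    facts = "\n".join(f"{s} {p} {o}" for s, p, o in retrieve_facts(entity))
    return f"Facts:\n{facts}\n\nQuestion: {question}\nAnswer:"

prompt = build_prompt("What did alice buy?", "alice")
print(prompt)
```

The point of the sketch is where the power sits: whoever controls `KNOWLEDGE_GRAPH` controls what the "assistant" can say, and every question asked becomes another signal to feed back into the graph.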
I pose Vulgar Linked Data as an alternative to the infrastructural pattern I call the Cloud Orthodoxy: rather than platforms operated by an informational priesthood, reorienting our public infrastructure efforts to support vernacular expression across heterogeneous #p2p mediums. This piece extends a prior work of mine, Decentralized Infrastructure for (Neuro)science, which has a more complete draft of what that might look like.
(I don't think you can pre-write threads on masto, so I'll post some thoughts as I write them under this) /1
The essential feature of knowledge graphs that makes them coproductive with surveillance capitalism is the much more fluid means of data integration they allow. Most contemporary corporations are data corporations, and their operation increasingly requires integrating far-flung and heterogeneous datasets, often stitched together from decades of acquisitions. Knowledge graphs are of course not universal, and there is wide variation in their deployment and use, but they power many of the largest information conglomerates. The graph structure of KGs, as well as the semantic constraints that can be imposed by controlled ontologies and schemas, makes them particularly well suited to the sprawling data conglomerate that typifies contemporary surveillance capitalism.
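A tiny sketch of why the triple model makes integration so fluid: two datasets with incompatible schemas collapse into one pile of (subject, predicate, object) triples and join on a shared identifier, with no schema migration. The datasets, field names, and `to_triples` helper are all invented for illustration:

```python
# Two "acquired" datasets with different schemas -- invented examples.
retail_records = [
    {"customer_id": "c42", "email": "a@example.com", "bought": "treadmill"},
]
health_records = [
    {"patient_email": "a@example.com", "condition": "hypertension"},
]

def to_triples(rows, key_field, id_prefix):
    """Flatten each record into (subject, predicate, object) triples,
    keyed by whichever field can serve as a linkable identifier."""
    triples = set()
    for row in rows:
        subject = f"{id_prefix}:{row[key_field]}"
        for field, value in row.items():
            triples.add((subject, field, value))
    return triples

# Set union *is* the integration step: no joins, no migrations.
graph = to_triples(retail_records, "email", "person") \
      | to_triples(health_records, "patient_email", "person")

# Both sources mapped to the same subject identifier, so the merged
# graph now links purchases to medical conditions.
subject = "person:a@example.com"
profile = {(p, o) for s, p, o in graph if s == subject}
print(profile)
```

That's the whole trick: each new acquisition just adds more triples, and any shared identifier (an email, a phone number, a government ID) silently fuses profiles across domains — which is exactly what makes the pattern so useful to a data conglomerate.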
I give a case study in RELX, parent of Elsevier and LexisNexis, among others, which is relatively explicit about how it operates as a gigantic graph of data with various overlay platforms.