Java is an interesting language for a Fediverse project because it's the one language with several mature implementations of Semantic Web tech (RDF, SPARQL, etc.). JSON-LD just works, out of the box. It was kind of shocking to see Apache Jena do in a few minutes of work what took me weeks in Deno!
And I learned about a piece of the Semantic Web ecosystem I wasn't familiar with before. Have you heard the good word of OWL?
This is actually a good example of something that really bothers me in #ActivityPub / AS2 / #JsonLD
From a security standpoint and from a performance standpoint there is a world of difference between replies returning:
A list of references
A list of Links
A list of Objects
But from the standpoint of AP/AS2 there is no difference between these. They are all the same, and the documentation provides no guidance on navigating this.
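To make the difference concrete, here is a minimal sketch of what a consumer ends up writing. The function names (`classify`, `fetches_needed`) are hypothetical, and it assumes a bare string is a reference, a dict typed `Link` or `Mention` is a Link, and any other dict is an embedded Object. The point is only that each shape implies a different number of follow-up requests:

```python
from typing import Any, List

# AS2 Link types this sketch knows about; extension link types are NOT handled.
LINK_TYPES = {"Link", "Mention"}


def as_list(value: Any) -> List[Any]:
    """AS2 lets a property hold one value or many; always hand back a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def classify(item: Any) -> str:
    """Return 'reference', 'link', or 'object' for a single entry."""
    if isinstance(item, str):
        return "reference"  # a bare URI: nothing is known without a fetch
    if isinstance(item, dict):
        types = set(as_list(item.get("type")))
        if types & LINK_TYPES:
            return "link"  # has an href plus some metadata, still needs a fetch
        return "object"  # embedded: usable as-is, but must be trusted or verified
    raise TypeError(f"unexpected entry of type {type(item).__name__}")


def fetches_needed(value: Any) -> int:
    """Count entries that would need another network call to resolve."""
    return sum(1 for item in as_list(value) if classify(item) != "object")
```

Three payloads that are "the same" as far as AS2 is concerned can cost zero extra requests or one per entry, and the embedded form raises the question of whether to trust content served by a third party.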
Ugh, at some point I need to sit down and actually evaluate the performance of JSON-centric vs. #JsonLD centric models, but that's a friggin ton of work.
I just keep running into cases where it would be good to have as a reference.
As I read through the discussions around the creation and revision of #RDF, three things are clear to my eye.
Usability was not the primary concern. It seems to have been widely believed that Other Tools™ would fill this gap and that RDF should focus first on a kind of expressibility.
Those Other Tools™ never materialized.
Most who use RDF-derived tooling seem to assume either that it gives them those tools or that Others™ will build them on top of their solution as well.
This is… whatever it is for #RDF or even for #JsonLD, but I'd argue that it is absolutely the wrong choice for something like ActivityPub.
Just in a general communication over unreliable radio networks/distributed systems sense, nothing about this is a good idea.
Because when I am building a social network—or most apps of this nature—I care about things like round trip times, number of network calls, and what pieces of information I have, don't have, or can/cannot get to make a decision.
No. But it does mean that if you are going to use it you need to layer on top of it some additional set of restrictions.
You could easily say, for example, that "tag must always be a list/set/bag of links" (or whatever), imposing a syntactic (a list) and semantic (links) restriction.
You could say "Create must always include the full object, adjusted for permissions, Update must always be just the URI referencing the object."
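A rough sketch of what layering those two restrictions on top might look like. The `validate` function and its error strings are made up for illustration, and "link" is approximated here as "a dict with an `href`":

```python
from typing import Any, Dict, List


def validate(activity: Dict[str, Any]) -> List[str]:
    """Check an activity against the two example restrictions; return any violations."""
    errors: List[str] = []
    kind = activity.get("type")
    obj = activity.get("object")

    # Semantic + syntactic rule: Create embeds the object, Update only references it.
    if kind == "Create" and not isinstance(obj, dict):
        errors.append("Create.object must be an embedded object")
    if kind == "Update" and not isinstance(obj, str):
        errors.append("Update.object must be a URI string")

    # Syntactic rule (a list) plus semantic rule (of links) for tag.
    if isinstance(obj, dict) and "tag" in obj:
        tag = obj["tag"]
        if not isinstance(tag, list):
            errors.append("tag must be a list")
        elif not all(isinstance(t, dict) and "href" in t for t in tag):
            errors.append("every tag entry must be a link with an href")

    return errors
```

With rules like these a receiver can reject a message up front instead of branching on every possible shape further down the pipeline.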
Yes, they are, and I'm not even advocating changing any of them (except AS2, separate debate). What I'm saying instead is that maybe part of your protocol is that it needs to actually constrain the lexical forms, not just be defined in terms of semantics, and also constrain the semantics, not just aim to capture "that which is expressible."
You can only kick these decisions down the road so many times.
Ignoring that I misspelled Bachelet in one case, these objects have a (simple) equivalence relationship with each other. (Example from Hogan, 2017, Canonical Forms for Isomorphic and Equivalent RDF Graphs: Algorithms for Leaning and Labelling Blank Nodes)
Some notes on my momentary distraction into complexity:
There's virtually no analysis of #JsonLD algorithms proper. Canonicalization is better studied in RDF, but only just.
Comparing RDF graphs that contain no blank nodes can be done in polynomial time. If there are blank nodes then the problem becomes GI-complete (which says basically "we couldn't prove this problem maps to NPC, but there are also no known P algorithms that solve it, even though it seems like it should map to NPC").
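The easy half of that is easy enough to sketch. Assuming triples are plain tuples of strings, with blank nodes written N-Triples style (`_:b0`) and literals kept in their quotes (both representation choices are mine, not from any library), comparing two ground graphs is just set equality:

```python
from typing import Iterable, Tuple

Triple = Tuple[str, str, str]


def has_blank_nodes(triples: Iterable[Triple]) -> bool:
    """True if any term looks like an N-Triples blank node label."""
    return any(term.startswith("_:") for triple in triples for term in triple)


def ground_graphs_equal(a: Iterable[Triple], b: Iterable[Triple]) -> bool:
    """Compare two blank-node-free graphs; refuses to guess when blank nodes appear."""
    a, b = list(a), list(b)
    if has_blank_nodes(a) or has_blank_nodes(b):
        # With blank nodes, labels may differ between equal graphs, so this
        # turns into an isomorphism check and set equality is not enough.
        raise ValueError("graph contains blank nodes; set comparison is not valid")
    return set(a) == set(b)
```

Order and duplicates don't matter because an RDF graph is a set of triples. The moment a `_:` label shows up, two graphs can be equivalent while sharing no identical triples at all, which is where the hard part starts.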
For #ActivityPub the question of "Why use #LinkedData?" has never been answered. There should be clear merits to wade through all the complexity that this choice brings, right?
Yes, it's ultra flexible, and you can define your own semantic #ontologies, and theoretically it could provide a robust extension mechanism to the AP protocol. Except that right now it doesn't.
What's the vision of a Linked Data #Fediverse? What great innovative #SocialNetworking #UX would it bring that makes it worthwhile?
What I mean by "robust extension mechanism" is more than a #JsonLD context. It comprises the entire set of tools and practices for writing quality #ActivityPub extensions defining data formats, msg exchange patterns, business logic, etc. so that I have the best chance of writing extensions that interoperate well. All this may include a way to discover which extensions an endpoint supports, and to find their docs/specs.
In truth I think comparing #JsonLD to XML is well-intentioned but incorrect. I suspect it comes from an "XML became this heavyweight thing, and JSON-LD adds these heavyweight things, therefore…"
There is some overlap in intent, but XML both a) had an easier time doing the same thing that JSON-LD is trying to do and b) was made obnoxious by what is, in many cases, the opposite set of problems from the ones people have with JSON-LD.
Actually, I think what bothers me most about #JsonLd is that it completely misses the point of having specs and standards in the first place. Specs and standards are coordination. The whole point is to clarify, define, agree, and document things so that coordination costs stay linear, at worst.
JSON-LD just says nope to that, and instead tries to replace coordination with computational wizardry. In my experience so far, it works as well as you might expect.
“The serialized JSON form of an Activity Streams 2.0 document MUST be consistent with what would be produced by the standard JSON-LD 1.0 Processing Algorithms and API [JSON-LD-API] Compaction Algorithm using, at least, the normative JSON-LD @context definition provided here.”
As much as I dislike #JsonLD for this use case (and oh can I tell you about why I dislike it for this use case), I actually think in many ways #ActivityPub would be better if it went all-in with it.
Because right now I can't use any of the features of JSON-LD that make it actually powerful, but I've inherited a lot of the problems with the worldview.
HTTP is similar (except I don't hate it for this use case): by not committing we end up in a half-state. Neither fish, flesh, nor fowl.
For any given consistently constructed message system M which contains within it a message schema M.F that describes M, a sufficiently complex message m cannot be unambiguously typed using only the information found in m.f.
Okay, it's probably not strictly true, but it seems to be that way often enough in practice.
I think overall a lot of #JsonLD-inspired protocols struggle with (3) in particular.
They are reasonably good at saying "this is that." They are very bad at giving me clues that "this is not that" without pulling them apart one element at a time.
For instance, if I take an #ActivityStreams Article object in AP I can easily see what it is (an article, so a multiparagraph work).
I'm developing an increasing suspicion that I have a different definition of what makes a protocol difficult to implement than other people? Along with what partial, incremental progress means here?
Yes, I am probably going to be ranting about the union type problem in #ActivityPub/ #JsonLD more than usual at least until I get this serialization pattern worked out…
…or until I decide to just chuck it all out of the window and only write or accept one format.