jonny (@jonny@neuromatch.social), (edited) to random

so we have been batting around the idea of some kinda paper bot for a while re: the question "how do we track discussions around scholarly work," and I am starting to think this paper-feeds project is the way to do it.

So say it is an AP instance and it has one primary bot user: you follow it and it follows you back. When you make a post with something that resolves to a DOI, that post is linked to that work. Any hashtags used in that post are added to that paper's keywords (assuming some basic moderation and word ban lists). Then keyword feeds are also represented as AP actors that can be followed and make a post per paper. I wonder if we can spoof the "in reply to" field to present all those posts as replies to that paper.
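A rough sketch of what that linking step could look like; the DOI regex, URL scheme, and helper names here are all made up for illustration, nothing from the actual repo:

```python
# Hypothetical sketch: link an incoming post to a paper by DOI and
# index its hashtags as keywords. All names are illustrative.
import re

DOI_PATTERN = re.compile(r'10\.\d{4,9}/[^\s"<>]+')
HASHTAG_PATTERN = re.compile(r"#(\w+)")
BANNED_TAGS = {"spam"}  # placeholder word ban list

def index_post(post_text: str, keywords_by_doi: dict[str, set[str]]) -> str | None:
    """Find the first DOI in a post and attach its hashtags as keywords."""
    match = DOI_PATTERN.search(post_text)
    if match is None:
        return None
    doi = match.group(0).rstrip(".,;")
    tags = {t.lower() for t in HASHTAG_PATTERN.findall(post_text)}
    keywords_by_doi.setdefault(doi, set()).update(tags - BANNED_TAGS)
    return doi

def paper_reply_stub(doi: str, content: str) -> dict:
    """Build an AP Note that presents the post as a reply to the paper."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Note",
        "content": content,
        # the "spoofed" inReplyTo: point at an object representing the paper
        "inReplyTo": f"https://example-feeds.instance/paper/{doi}",
    }
```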

So say the bot also has some simple microsyntax for linking your account to an ORCID: either directly in a profile field, by @'ing the bot and checking a rel=me link, or hell, even OAuth. Then you could also relate when the authors of given works talk about other works and use that as another proximity measure. Then you could make an author RSS feed/AP actor that is just the works someone publishes, and optionally the works they mention, so e.g. I could make an aggregate feed for the papers my friends are reading.
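Roughly, the rel=me check could look like this; a real version would parse the HTML properly rather than regexing it, but as a sketch:

```python
# Sketch of the rel=me check: fetch the fediverse profile page and look
# for a rel="me" link pointing at an ORCID. Illustrative only.
import re
import requests

ORCID_RE = re.compile(
    r'rel="me"[^>]*href="https://orcid\.org/(\d{4}-\d{4}-\d{4}-\d{3}[\dX])"'
    r'|href="https://orcid\.org/(\d{4}-\d{4}-\d{4}-\d{3}[\dX])"[^>]*rel="me"'
)

def orcid_from_profile(profile_url: str) -> str | None:
    """Return the ORCID iD claimed via a rel=me link on a profile page."""
    html = requests.get(profile_url, timeout=10).text
    m = ORCID_RE.search(html)
    if m is None:
        return None
    return m.group(1) or m.group(2)
```

For the link to actually be verified rather than just claimed, you'd also want to confirm the ORCID record links back to the fedi profile, same as Mastodon's own rel=me verification.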

Then you could have instances of this feed generator follow one another and broadcast aggregated similarity information at a paper level not linked to personal information, plus opt-in info like the fedi account <-> ORCID link. Since you're on AP already, you basically get that for free.
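No idea what the actual wire format would be, but the de-identified part could be as simple as paper-pair co-mention counts; this shape is pure speculation:

```python
# Purely illustrative shape for an instance-to-instance broadcast:
# paper-level co-mention counts, with no user identifiers attached.
def similarity_broadcast(instance_url: str, pairs: dict[tuple[str, str], int]) -> dict:
    """Aggregate co-mention counts between DOI pairs into a shareable payload."""
    return {
        "source": instance_url,
        "type": "PaperSimilarity",  # made-up type, not a real AP vocabulary term
        "pairs": [
            {"doi_a": a, "doi_b": b, "co_mentions": n}
            for (a, b), n in pairs.items()
        ],
    }
```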

Thinking about what would be useful for social discovery of scholarly works, and there are a lot of really interesting ideas once you start actually, y'know, doing it. Starting from a place of not having a product to sell or a platform to run means you avoid some of the scale and liability problems.

Edit: prior post here: https://neuromatch.social/@jonny/111688727690129033
And repo here: https://github.com/sneakers-the-rat/paper-feeds/
And I'll start tagging these with #PaperFeeds, but that last post has too many interactions to edit now

jonny (@jonny@neuromatch.social), to random

So hmm, if for #PaperFeeds we want to extract text from a paper and use it to generate a vector embedding for similarity analysis, could we just temporarily download the paper from #SciHub, compute on it, and discard it? The crime is unauthorized distribution, right? So if we don't serve the PDFs, that should be fine? Maybe? I mean, we could just analyze open access lit, but where is the fun in that. Then we could make an API endpoint for paper embeddings and share them among instances of the feed generator...
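Something like this, assuming pypdf and sentence-transformers; both are placeholders, any text extractor and embedding model would do:

```python
# Sketch of compute-and-discard: download a PDF to a temp file, embed
# its text, and delete the file immediately. URL and model are placeholders.
import os
import tempfile

import requests
from pypdf import PdfReader
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # any embedding model works

def embed_paper(pdf_url: str) -> list[float]:
    """Download a PDF, embed its text, and delete the file right away."""
    with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as f:
        f.write(requests.get(pdf_url, timeout=30).content)
        path = f.name
    try:
        text = " ".join(page.extract_text() or "" for page in PdfReader(path).pages)
        return model.encode(text).tolist()
    finally:
        os.remove(path)  # the PDF is never stored or served
```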
