sean

@sean@idf.social


KathyReid, to stackoverflow
@KathyReid@aus.social

I just issued a data deletion request to #StackOverflow to erase all of the associations between my name and the questions, answers and comments I have on the platform.

One of the key ways in which #RAG works to supplement #LLMs is based on proven associations. Higher ranked Stack Overflow members' answers will carry more weight in any #LLM that is produced.

By asking for my name to be disassociated from the textual data, it removes a semantic relationship that is helpful for determining which tokens of text to use in an #LLM.

If you sell out your user base without consultation, expect a backlash.

sean,

@KathyReid Good stuff! Out of curiosity… when you mention that higher ranked users' posts carry more weight… is there anywhere I can read more about this feature engineering? Are we talking about RAG/search-operators manually annotating CSS selectors to pull user-ranking info per-site? Related: after crawling user rank info, would a RAG/search-provider not keep the info in-cache, i.e. do account deletions actually trickle down to search engines' collection of valuable features?

sean,

@KathyReid The high volume of text for high rank users makes total sense from a training bias perspective, though I think anonymizing authors might not change this.

Unless the RAG provider uses specially designed extractors for user rank info in their corpus, I'm doubtful ML could pick up on a numerical rank like SO karma and figure out on its own that it should weight answers by this number. That's too much System 2 thinking for ML, IMO!

Still good to give big firms as little free data as possible, of course! ☺
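
To make the "specially designed extractor" point concrete, here is a minimal sketch of what explicit rank weighting could look like in a retrieval step. Everything in it is hypothetical: the `Passage` fields, the `rank_weight` formula, and the assumption that a provider has already extracted a reputation number per author. It is not a description of how any real RAG or search system works; it only shows that the boost has to be engineered in, and that it disappears once the author link is gone.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Passage:
    text: str
    similarity: float                 # hypothetical retriever score in [0, 1]
    author_reputation: Optional[int]  # None once the author link is removed


def rank_weight(reputation: Optional[int]) -> float:
    """Hypothetical boost: 1.0 for unknown authors, growing with log10 of reputation."""
    if reputation is None or reputation <= 0:
        return 1.0
    return 1.0 + math.log10(1 + reputation) / 10.0


def rerank(passages: List[Passage]) -> List[Passage]:
    """Order passages by similarity scaled by the (assumed) reputation boost."""
    return sorted(
        passages,
        key=lambda p: p.similarity * rank_weight(p.author_reputation),
        reverse=True,
    )
```

With this toy formula, a passage from a high-reputation author can outrank a slightly more similar anonymous one, and loses that edge as soon as `author_reputation` becomes `None`. Whether a cached copy of the reputation would survive a deletion request is a separate question that the sketch does not answer.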

alex, to random

Is there a way to adjust GitHub history in a particular repository to see differences by character rather than by line?

sean,

@alex I don't think so, but if you git clone the repository, then you can use git diff --word-diff locally to get something similar.
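
For true per-character granularity, git's documentation notes that `git diff --word-diff-regex=.` treats each character as a word. If git is not at hand, the same idea can be sketched with Python's standard `difflib`. The `char_diff` helper below is a hypothetical name, and its `[-...-]` / `{+...+}` markers simply imitate git's plain word-diff output; it compares two strings, not repository history.

```python
from difflib import SequenceMatcher


def char_diff(old: str, new: str) -> str:
    """Mark removed characters as [-...-] and added characters as {+...+}."""
    out = []
    # autojunk=False keeps long inputs from having frequent characters ignored.
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.append(old[i1:i2])
        if tag in ("delete", "replace"):
            out.append(f"[-{old[i1:i2]}-]")
        if tag in ("insert", "replace"):
            out.append(f"{{+{new[j1:j2]}+}}")
    return "".join(out)


if __name__ == "__main__":
    print(char_diff("cat", "cut"))
```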
