rysiek, (edited)
@rysiek@mstdn.social avatar

Hey @nextcloud I see you made "AI" the "centerpiece" of Hub 8?
https://news.itsfoss.com/nextcloud-hub-8/

What model are you using?
What data has it been trained on, and by whom?
Can I recreate your model from scratch?

Edit: the "centerpiece" part might have come from It's FOSS News, although Nextcloud messaging around AI is similarly excited.

vt52,
@vt52@ioc.exchange avatar

@rysiek well documented:

https://docs.nextcloud.com/server/28/admin_manual/ai/index.html

lots of options and they're transparent about tradeoffs
@nextcloud

rysiek,
@rysiek@mstdn.social avatar

@vt52 I would disagree on the transparency there. One of the points of the ethical assessment is:

> Is the training data available and free to use?

Consider how StackOverflow is basically arguing that the stuff people wrote on the site is "free to use" (as it is on a CC By-SA license), but the community outcry seems to suggest that they are not exactly on board with that interpretation.

LocalAI gets a Green rating, for example. But I cannot find info on the training data… 👀

@nextcloud

rysiek,
@rysiek@mstdn.social avatar

@vt52 also, there are multiple lawsuits currently that focus on the copyright question. If they go a particular way, it might potentially mean that:

  1. AI models are derivative works of the training data they are trained on.

  2. Anything done with the output from these AI models is in turn a derivative work of these models.

IOW, if works licensed under CC By-SA (Wikipedia, Wikimedia Commons) are in the training corpus, suddenly anything created with these models might be CC By-SA.

@nextcloud

rysiek,
@rysiek@mstdn.social avatar

@vt52 so, while CC By-SA works are free to use, the licensing terms are crucially important. This is in no way captured by the "Ethical AI Rating".

It also does not capture at all the labor issues around training of LLMs. Who tagged and categorized the data? We know how that works in the Big Tech space:
https://www.theguardian.com/technology/2024/apr/16/techscape-ai-gadgest-humane-ai-pin-chatgpt

@nextcloud

rysiek,
@rysiek@mstdn.social avatar

@vt52 and finally: does "Red" mean "unethical" and "Green" – "ethical"?

If yes, that would mean that @nextcloud is facilitating use of tech they themselves consider unethical.

If not, then why is this called "Ethical AI rating" in the first place?..

m0bi13,
@m0bi13@pol.social avatar

@rysiek

Nextcloud allows integration with multiple LLMs, either locally or via API. It depends on the user's choice.
They have created an ‘ethicality’ scale for the use of LLMs:
https://nextcloud.com/blog/nextcloud-ethical-ai-rating/
@nextcloud

jrp,

@m0bi @Michał "rysiek" Woźniak · 🇺🇦 I see no mention of the origins and ownership of their training data playing any role in their ethicality rating.

They only offer

"Is the software (both for inferencing and training) open source?
Is the trained model freely available for self-hosting?
Is the training data available and free to use?"

Seems fundamentally flawed, if I may say so.

@nextcloud

m0bi13,
@m0bi13@pol.social avatar

@jrp

> I see no mention of the origins of their training data

There is no "their" (Nextcloud) LLM, so there is no "their" data used for training. There are integration options only. You decide what you use with your Nextcloud instance, if you want an LLM in Nextcloud. It's optional.

jrp,

@m0bi If it's optional, why do they provide an (incomplete) ethicality rating at all?

m0bi13,
@m0bi13@pol.social avatar

@jrp

Don't ask me. I provided information on how Nextcloud's LLM integration works and that they are working on a ranking of the ‘ethics’ of using LLM.

tadzik,
@tadzik@social.tadzik.net avatar

@rysiek https://github.com/nextcloud/llm?tab=readme-ov-file#a-large-language-model-in-nextcloud cites three models to choose from, plus there's integration with SaaSes AIUI.

rysiek,
@rysiek@mstdn.social avatar

@tadzik well first of all:

> Note: This app is deprecated and no longer being maintained. Its successor is https://github.com/nextcloud/llm2

Secondly, it mentions it can use Llama, but also mentions that "the training data is freely available". I doubt both of these can be true. Unless I missed something major, Llama training data have not been released by Meta.

rysiek,
@rysiek@mstdn.social avatar

@tadzik the "llm2" Readme sends me to the Admin docs:
https://docs.nextcloud.com/server/latest/admin_manual/ai/overview.html

There they mention models like "text2image_stablediffusion2". So this applies:
https://en.wikipedia.org/wiki/Stable_Diffusion#Training_data
> An investigation by Bayerischer Rundfunk showed that LAION's datasets, hosted on Hugging Face, contain large amounts of private and sensitive data.

noodlejetski,
@noodlejetski@masto.ai avatar

@rysiek @nextcloud oh god no

tfiebig,
@tfiebig@wybt.net avatar

@rysiek @nextcloud IIRC they go for a "bring your own model" thing, and provide a traffic light system for assessing those.

rysiek,
@rysiek@mstdn.social avatar

@tfiebig wait so the "centerpiece" of @nextcloud is actually missing from the product?

I'm confused, that does not sound right.

tfiebig,
@tfiebig@wybt.net avatar

@rysiek @nextcloud They integrated it in all the places(tm) and you just have to plug in an API. LocalAI, o4 whatever other $thing you may fancy. And then all the funny functions work.

tfiebig,
@tfiebig@wybt.net avatar

@rysiek @nextcloud Can send screenshots later.

rysiek,
@rysiek@mstdn.social avatar

@tfiebig thanks, I do appreciate it!

But I'd really like @nextcloud to respond, too.

viktor,
@viktor@me.dm avatar

@rysiek @tfiebig @nextcloud Tobias is correct. Nextcloud isn't an AI company, we don't have our own LLMs. First, all AI features can be disabled because some folks don't like AI. Second, you can bring your own model now with Nextcloud Hub 8, or even integrate with OpenAI API. It's really up to you what flavor of AI you prefer to use. Your data, your choice.

rysiek,
@rysiek@mstdn.social avatar

@viktor right, thank you for responding!

The thing I am confused about, I guess, is the messaging around the "AI" assistant being the "centerpiece" of Nc Hub 8. What I understand from what you said and what I found in the docs, the Assistant requires some model to be available. Is there a model distributed with Nextcloud for that purpose?

@tfiebig @nextcloud
