mike,
@mike@fosstodon.org avatar

Looks like Microsoft is going to be building its tool, called Copilot, directly into their OS. They're also including Copilot in Office and Edge. Basically, all Microsoft products will have a built-in "assistant". If you're concerned about AI having access to your data, it's going to be much more difficult to avoid in the near future. Adding to the concern, this is black-box AI. It's not open. We have no idea what's going on inside that digital head.

https://www.cnet.com/tech/computing/windows-copilot-puts-ai-in-the-middle-of-microsofts-most-important-software/

adamsdesk,
@adamsdesk@fosstodon.org avatar

@mike I didn't want M$ to do Copilot on GitHub let alone anything else. Sounds awful.

mike,
@mike@fosstodon.org avatar

@adamsdesk I'm not opposed to a "Copilot" on a purely abstract level. I can see where an OS based assistant would be a benefit for people who don't want to be computer people and just want to do specific tasks. The OS of a computer SHOULD be invisible in my opinion. I'm NOT a fan of Microsoft or Google being the ones in charge of that copilot though. If it doesn't work correctly air gapped, it's wrong IMHO.

adamsdesk,
@adamsdesk@fosstodon.org avatar

@mike That is well said.

adamsdesk,
@adamsdesk@fosstodon.org avatar

@mike What're your thoughts on the legal side, based upon licensing?

mike,
@mike@fosstodon.org avatar

@adamsdesk I think the legal stuff is going to be a constant question that never really has a good answer. These LLMs are just really big probability engines tacking words together that statistically make sense. We all do that same thing in our head every time we write a sentence. The question boils down to how much can the LLM tack together that matches what I tacked together before it violates my intellectual property. I've got no good answers.
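
The "probability engine tacking words together" idea can be sketched as a toy bigram model: count which word follows which in some text, then sample continuations from those counts. A deliberately tiny illustration with a made-up corpus, nothing like a real LLM's internals:

```python
import random

# Toy "probability engine": count which word follows which in a tiny,
# made-up corpus, then tack words together by sampling from those counts.
corpus = "the cat sat on the mat and the cat slept on the mat".split()

follows = {}
for prev, nxt in zip(corpus, corpus[1:]):
    follows.setdefault(prev, []).append(nxt)

def generate(start, length=6):
    word, out = start, [start]
    for _ in range(length):
        options = follows.get(word)
        if not options:
            break
        # random.choice over a list with repeats samples proportionally
        # to how often each continuation was observed
        word = random.choice(options)
        out.append(word)
    return " ".join(out)

print(generate("the"))
```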

I think the question of training the model 1/

mike,
@mike@fosstodon.org avatar

@adamsdesk in the first place is a little bit more cut and dried. If I read your blog and learn something from it, or internalize a phrase that you've used in your post, I haven't "stolen" anything from you. I don't see where a language model reading your blog post and using that data would be different. Honestly, the data they get from it is even more abstract than what I would get.

Of course, there's an interesting caveat to all of this too, and that's rarity. Rarity creates odd circumstances. 2/

mike,
@mike@fosstodon.org avatar

@adamsdesk The MORE original your writing is, the more likely it is to be replicated specifically. For example, if I ask you to write Hello World in Python, the code you'd write wouldn't be significantly different from the code millions of other people have written when doing the same thing. The more complex and unique the request, the smaller the sampled data set the LLM has to work with, and the more likely it is to replicate a specific thing. That's bad.
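
For what it's worth, that convergence is easy to see: virtually every answer to the Hello World prompt reduces to the same single line, so "replication" there is meaningless.

```python
print("Hello, World!")  # millions of independently written versions look exactly like this
```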

So, three posts in and all I 3/

mike,
@mike@fosstodon.org avatar

@adamsdesk can hope is that I've accurately conveyed just how confusing this whole subject is for me. There's a lot of gray area and literal philosophy going on in there (Ship of Theseus anybody?). I don't think we're ever going to get to a place where everybody agrees on all of it. 4/4

adamsdesk,
@adamsdesk@fosstodon.org avatar

@mike Dang, I never looked at it that way. Thanks for the lengthy response. I may be looking at this wrong, but doesn't this in a sense make a copyright license useless then? If a license is supposed to be followed, then having an entire code block replicated without said license feels not right.

mike,
@mike@fosstodon.org avatar

@adamsdesk I wouldn't go so far as to say useless. I think we have to treat AI like we do people. Just like I can't credit every source of inspiration I have for everything I write (even for something as short as this reply), we can't expect an AI to worry about attribution for every single word. Licensing will have to work with a gradient approach. Entire code blocks of replicated code would be a license violation (just as they would be for a human). The question becomes where to draw the line, 1/

mike,
@mike@fosstodon.org avatar

@adamsdesk and that's when things get REALLY stupid. When it comes to an automated system generating massive amounts of text/images/video/whatever, we're going to rely on internal checks. How do we do that? We either draw a dark black line SOMEWHERE in there where 2 lines of code matching is fine, but 3 is a copyright violation (as an example, not to be taken literally), or we're going to have to use some automated system to compare what the AI generates to existing work to see if TOO much 2/

mike,
@mike@fosstodon.org avatar

@adamsdesk of that generated work matches prior work. What kind of automated system excels at analyzing massive quantities of data quickly? I bet you can guess. Neither solution sounds great. Any dark line we draw will be an arbitrary selection. It's hard to argue for 3 when 2 and 4 are both so close to the exact same thing. Equally, a model used to check whether something is AI-generated raises the same questions as the AI model did in the first place. Such a weird problem. 3/3
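
Both the "dark line" and the automated-comparison idea can be sketched with something as crude as n-gram overlap: flag generated text when too many of its word sequences appear verbatim in prior work. The sample texts, the 3-word window, and the framing below are arbitrary illustrative choices, not a real detection system:

```python
# Crude overlap check: what fraction of the generated text's 3-word
# sequences appear verbatim in prior work?
def ngrams(text, n=3):
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap_ratio(generated, prior_work, n=3):
    gen = ngrams(generated, n)
    if not gen:
        return 0.0
    prior = set()
    for doc in prior_work:
        prior |= ngrams(doc, n)
    return len(gen & prior) / len(gen)

prior = ["the quick brown fox jumps over the lazy dog"]
candidate = "a quick brown fox jumps over a sleeping dog"
print(f"{overlap_ratio(candidate, prior):.0%} of the candidate's trigrams match prior work")
# Prints 43% here; whether that's "fine" or "a violation" is exactly
# the arbitrary-line problem described above.
```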

adamsdesk,
@adamsdesk@fosstodon.org avatar

@mike Again, very valid points I never thought of. Thanks for the great response. I have a new way to look at all this now. Still not going to say I'm thrilled about it, though.

benjaminhollon,
@benjaminhollon@fosstodon.org avatar

@adamsdesk @mike

Yeah, interesting thoughts.

My personal line is that I think ML models shouldn't be allowed to be trained on data they don't have explicit permission to use. As someone wanting to go into the writing industry, I don't want my writing to be used to train ML models; having AI take your job is one thing, but training them to replace you and having no way to disallow that? That crosses a line for me.

mike,
@mike@fosstodon.org avatar

@benjaminhollon I guess the question becomes: what is the practical difference between an ML model reading your content and a person doing so? In either case, they're not explicitly copying it, and either could end up replacing an individual.

@adamsdesk

benjaminhollon,
@benjaminhollon@fosstodon.org avatar

@mike @adamsdesk

A good question.

As I see it, literature is a grand conversation (I've seen others draw the analogy too). People learn from each other and their own writing answers with new ideas, building on each other. I'd argue that it's impossible to learn from someone's writing style without understanding their writing on a deep level.

With people using an ML model, all they care about is the output. It's purely transactional; no conversation happens.

adamsdesk,
@adamsdesk@fosstodon.org avatar

@benjaminhollon @mike I agree with these thoughts, though Mike has a valid point to me.

benjaminradio,
@benjaminradio@fosstodon.org avatar

deleted_by_author

    benjaminhollon,
    @benjaminhollon@fosstodon.org avatar

    @benjaminradio @adamsdesk @mike

    That's fair. I think "replacing" was a strong word in my case; I don't think LLMs will ever replace truly great writers. But I think they make it harder to become a great writer because you have to slog through the skill level that they sit at, where your writing is seen as valueless because it's no better than an ML model.

    pixelherodev,

    @mike @adamsdesk Actually, legally, I think this is very wrong.

    https://en.wikipedia.org/wiki/Clean_room_design

    > The term implies that the design team works in an environment that is "clean" or demonstrably uncontaminated by any knowledge of the proprietary techniques used by the competitor.

    The whole point of clean-room engineering is that, if you're aware of copyrighted techniques, and you reproduce them, you're in violation of copyright, no matter how abstract your reasoning was.

    pixelherodev,

    @mike @adamsdesk That said, I don't think we should treat copyright this way.

    Rather, I think we should instead do the exact opposite: change the law so it's permissible for a person to do the same.

    mike,
    @mike@fosstodon.org avatar

    @pixelherodev I'm not a lawyer by any stretch of the imagination. I haven't even pretended to be one for Halloween, but is the style (technique) we use when we combine words copyrighted? Obviously the end results are (or could be), but putting one word after another in a particular way?

    @adamsdesk

    pixelherodev,

    @mike @adamsdesk IANAL either, but AFAIK: yes.

    adamsdesk,
    @adamsdesk@fosstodon.org avatar

    @pixelherodev Seems like a logical approach, though I'm not certain how that would work. @mike

    adamsdesk,
    @adamsdesk@fosstodon.org avatar

    @pixelherodev Another good point. The law should be adjusted in some form. @mike

    pixelherodev,

    @mike All AI is black box AI; that's half the point of using neural networks

    and I don't mean "oh, we don't want to have to explain it" (though that's definitely a factor); rather, if you're using AI, it's an acknowledgement that you don't know how to solve the problem.

    Of course you can't explain what the "AI" is thinking; if you could, you wouldn't need to use it!

    pixelherodev,

    @mike I sometimes have to append to a post and make it clear I'm not trying to be critical, after realizing the tone was wrong; this is the first time for the opposite.

    My general stance is that burning enough energy overnight to power millions of people for CENTURIES for the sake of capitalist profit is not just stupid but genuinely evil, on par with murder, if not worse.

    It's the equivalent of putting a knife in the back of every value that makes civilization and, even, humanity possible.

    mike,
    @mike@fosstodon.org avatar

    @pixelherodev This is a sticking point for me with AI. I don't mind AI. I like it, actually, but I don't like it when a few select entities control it. Microsoft and Google fighting over the collective future of AI is disturbing. I like seeing versions of the latest GPT software that will run locally. I like seeing a version of Stable Diffusion running locally. I understand that not everybody is going to want to do that, and it would be wasteful if they did. Maybe decentralization to the rescue?

    pixelherodev,

    @mike I'm talking about the training of them.

    Creating Stable Diffusion, or GPT4, requires burning insane amounts of energy. That on its own is an insurmountable ethical problem IMO

    mike,
    @mike@fosstodon.org avatar

    @pixelherodev Energy consumption is an interesting conundrum. Using nonrenewable energy wastefully is unethical in my mind. I have fewer concerns about using renewables. IF (and that's a big if) the models for Stable Diffusion or GPT-4 (or similar) were created in a datacenter powered entirely by renewable energy, I'd have fewer problems with it.

    pixelherodev,

    @mike I completely disagree.

    I live in New Jersey, for instance; we're 50% nuclear, and 50% natural gas.

    That means that, contrary to initial intuition, any energy I use that I don't have to is being fed 100% by natural gas, and 0% by nuclear.

    We don't have enough renewables right now to power everything we already do.

    If you use 100TWh from renewables for a New Cool Thing, whatever that thing is, that's 100TWh that didn't go to reducing our dependency on oil.
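
    That marginal-demand point can be put in toy numbers. Assume a grid with a fixed renewable supply and a fossil source that scales to meet whatever demand remains; the figures below are made up for illustration, not real grid data:

```python
# Made-up grid numbers (not real data) to illustrate the marginal-demand
# point: with renewables already fully used, every extra TWh of demand
# is met entirely by the dispatchable fossil source.
renewable_supply = 50.0   # TWh per year, fixed by installed capacity
base_demand = 100.0       # TWh per year, before the new project
new_project = 10.0        # TWh per year of added demand

fossil_before = max(0.0, base_demand - renewable_supply)
fossil_after = max(0.0, base_demand + new_project - renewable_supply)

print(f"fossil generation before: {fossil_before} TWh")  # 50.0
print(f"fossil generation after:  {fossil_after} TWh")   # 60.0
# The project's 10 TWh is covered 100% by fossil fuel, even though the
# grid is "half renewable" on paper.
```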

    pixelherodev,

    @mike In other words, energy supply isn't tied to its usage.

    If you add renewable capacity, and use that to replace current usage of oil, you're reducing the problem.

    If you then also tack on demand equivalent to that renewable capacity, you're making the problem worse again.

    There's also a limit to just how much energy we can actually get from renewables. Renewable means, effectively, infinite joules. It does NOT mean that we can handle infinite watts.

    mike,
    @mike@fosstodon.org avatar

    @pixelherodev It seems like you're still more concerned with where the energy is coming from rather than its usage. If (hypothetically) we invented perpetual motion tomorrow and were able to supply every person in the world with as much clean, renewable energy as they could use, would the amount of energy being used still matter? I think the answer is no. The amount of energy we use is only unethical due to how it's produced and its unequal distribution.

    pixelherodev,

    @mike I think it would still matter; waste is inherently wrong.

    Putting that aside, though, I do agree that misuse of energy from fossil fuels is far worse, which is why I think it's so important to be aware that "renewable-powered projects" are just a way for corpos to launder responsibility away from themselves to other people.

    mike,
    @mike@fosstodon.org avatar

    @pixelherodev It's only waste if nothing of value is gotten from its use, and "value" is a very subjective position. That's especially true when what's being "wasted" is in infinite supply.

    I don't disagree with your assessment of our current energy market, but I do think that this step may be necessary to get us to a position where we no longer require nonrenewable energy sources.

    pixelherodev,

    @mike Maybe, but I think we'd get there a lot faster if we started asking, "What parts of demand can we cut right now at minimal cost?" instead of "How can we do everything we do right now but, like, without hurting the environment?"

    mike,
    @mike@fosstodon.org avatar

    @pixelherodev I disagree, first and foremost because we use a finite amount. We can cut back, but how far we can cut back without living in caves is limited by practicality. We need to increase supply in a clean and efficient manner. This isn't to say that cutting back usage isn't important, but it's only a short term solution. We can't treat it like it's going to solve our problems in the long term.

    pixelherodev,

    @mike I think, practically, we can cut at least 5% without giving up any real luxury or major social shifts [many small changes, like increasing usage of public transit instead of cars, for instance].

    I think we could do 15-20% with just social changes [restructure the economy, eliminate pointless jobs - which is a lot of them - thus eliminating energy wasted on said pointless jobs].

    I think with just small losses in luxury, we could live in abundance with <50% of the energy we use now.

    mike,
    @mike@fosstodon.org avatar

    @pixelherodev I think we've been trying that method for as long as I've been alive with extremely mixed results. How many things that would also be good for the environment aren't being done because of high energy costs? Desalination of sea water to increase clean water supplies? AeroFarms for increased food production? Etc. Advancements that sit basically idle because they take too much energy to accomplish at scale?

    pixelherodev,

    @mike A very good point, yeah

    Though, I'd say that we could cut back specifically on energy uses that are not good for the environment without too much trouble?

    I'll have to think about this more :)

    benjaminhollon,
    @benjaminhollon@fosstodon.org avatar

    @pixelherodev @mike
    I do not know why reading this discussion gives me so much joy. 😂
