‘Impossible’ to create AI tools like ChatGPT without copyrighted material, OpenAI says

L4sBot@lemmy.world · 10 months ago

S410@lemmy.ml · edit-2 9 months ago

deleted by creator

AntY@lemmy.world · 10 months ago

The main difference between the two in your analogy, and the one that has great bearing on this particular problem, is that the machine learning model is a product to be monetized.

deweydecibel@lemmy.world · 10 months ago

And ultimately replace the humans it learned from.

Zoboomafoo@slrpnk.net · 10 months ago

Good, I want AI to do all my work for me

afraid_of_zombies@lemmy.world · 10 months ago

Yes clearly 90 years plus death of artist is acceptable

BURN@lemmy.world · 10 months ago

Also an “AI” is not human, and should not be regulated as such

afraid_of_zombies@lemmy.world · 10 months ago

Neither is a corporation and yet they claim first amendment rights.

BURN@lemmy.world · 10 months ago

That’s an entirely separate problem, but is certainly a problem

afraid_of_zombies@lemmy.world · 10 months ago

I don’t think it is. We have all this non-human stuff that we are awarding more rights than we have ourselves. You can’t put a corporation in jail but you can put me in jail. I don’t have freedom from religion but a corporation does.

BURN@lemmy.world · 10 months ago

Corporations are not people, and should not be treated as such.

If a company does something illegal, the penalty should be spread to the board. It’d make them think twice about breaking the law.

We should not be awarding human rights to non-human, non-sentient creations. LLMs and any kind of Generative AI are not human and should not in any case be treated as such.

afraid_of_zombies@lemmy.world · 10 months ago

Corporations are not people, and should not be treated as such.

Understand. Please tell Disney that they no longer own Mickey Mouse.

BURN@lemmy.world · 10 months ago

Again, I literally already said that it’s a problem.

IP law is also different than granting rights to corporations. Corporations SHOULD be allowed to own IP, provided they’ve compensated the creator.

LWD@lemm.ee · edit-2 10 months ago

deleted

testfactor@lemmy.world · 10 months ago

And real children aren’t in a capitalist society?

S410@lemmy.ml · edit-2 9 months ago

deleted by creator

MBM@lemmings.world · 10 months ago

Sounds like a solution would be to force the makers of any AI to either share the source code or prove that it’s not trained on copyrighted data.

GentlemanLoser@ttrpg.network · 10 months ago

Naive

Exatron@lemmy.world · 10 months ago

The difference here is that a child can’t absorb and suddenly use massive amounts of data.

S410@lemmy.ml · edit-2 9 months ago

deleted by creator

Barbarian@sh.itjust.works · edit-2 10 months ago

I really don’t understand this whole “learning” thing that everybody claims these models are doing.

A Markov chain algorithm with different inputs of text and the output of the next predicted word isn’t colloquially called “learning”, yet it’s fundamentally the same process, just less sophisticated.

They take input, apply a statistical model to it, and generate output derived from the input. Humans have creativity, lateral thinking and the ability to understand context and meaning. Most importantly, with art and creative writing, humans are trying to express something.

“AI” has none of these things, just a probability for which token goes next considering which tokens are there already.
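For anyone who hasn't seen one, here is a minimal sketch of the word-level Markov chain described above. It is only an illustration of "pick the next word from what followed the current word in the input", not how any real LLM is built. The function names (`build_chain`, `generate`) and the toy corpus are made up for this example.

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words that followed it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain, start, length, rng):
    """Walk the chain from `start`, sampling each next word from the
    followers of the current word (repeats make a follower more likely)."""
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:
            # The current word never had a successor in the input, so stop.
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)

# Hypothetical toy input, chosen only for illustration.
corpus = "the cat sat on the mat and the cat ate the fish"
chain = build_chain(corpus)
sample = generate(chain, "the", 8, random.Random(0))
print(sample)
```

Every word pair in the output already occurred in the input, which is the sense in which the output is "derived from the input".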

sus@programming.dev · edit-2 10 months ago

I don’t think “learning” is a word reserved only for high-minded creativeness. Just rote memorization and repetition is sometimes called learning. And there are many intermediate states between them.

agamemnonymous@sh.itjust.works · 10 months ago

Humans have creativity, lateral thinking and the ability to understand context and meaning

What evidence do you have that those aren’t just sophisticated, recursive versions of the same statistical process?

Barbarian@sh.itjust.works · edit-2 10 months ago

I think the best counter to this is to consider the zero learning state. A language model or art model without any training data at all will output static, basically. Random noise.

A group of humans socially isolated from the rest of the world will independently create art and music. It has happened an uncountable number of times. It seems to be a fairly automatic emergent property of human societies.

With that being the case, we can safely say that however creativity works, it’s not merely compositing things we’ve seen or heard before.

agamemnonymous@sh.itjust.works · 10 months ago

I disagree with this analysis. Socially isolated humans aren’t isolated, they still have nature to imitate. There’s no such thing as a human with no training data. We gather training data our whole life, possibly from the womb. Even in an isolated group, we still have others of the group to imitate, who in turn have ancestors, and again animals and natural phenomena. I would argue that all creativity is precisely compositing things we’ve seen or heard before.

testfactor@lemmy.world · 10 months ago

Out of curiosity, how far do you extend this logic?

Let’s say I’m an artist who does fractal art, and I do a line of images where I take JPEGs of copyright-protected art and use the data as a seed to my fractal generation function.

Have I then, in that instance, taken a copyrighted work and simply applied some static algorithm to it and passed it off as my own work, or have I done something truly transformative?

The final image I’m displaying as my own art has no meaningful visual cues to the original image, as it’s just lines and colors generated using the image as a seed, but I’ve also not applied any “human artistry” to it, as I’ve just run it through an algorithm.

Should I have to pay the original copyright holder?
If so, what makes that fundamentally different from me looking at the copyrighted image and drawing something that it inspired me to draw?
If not, what makes that fundamentally different from AI images?
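To make the hypothetical concrete, here is a rough sketch of what "using the image data as a seed" could look like. Everything in it is an assumption for illustration: the byte strings stand in for JPEG file contents, the hashing step and the chaos-game fractal are just one possible choice, and `seed_from_bytes` and `chaos_game` are made-up names.

```python
import hashlib
import random

def seed_from_bytes(data: bytes) -> int:
    """Collapse arbitrary file bytes into one integer seed via SHA-256."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def chaos_game(seed: int, n_points: int = 1000):
    """Generate points of a Sierpinski-style fractal with a seeded RNG.

    Each step moves halfway toward a randomly chosen triangle vertex.
    """
    rng = random.Random(seed)
    vertices = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.866)]
    x, y = rng.random(), rng.random()
    points = []
    for _ in range(n_points):
        vx, vy = rng.choice(vertices)
        x, y = (x + vx) / 2, (y + vy) / 2
        points.append((x, y))
    return points

# Placeholder bytes standing in for two different image files.
image_a = b"stand-in bytes for the first image"
image_b = b"stand-in bytes for the second image"

points_a = chaos_game(seed_from_bytes(image_a))
points_b = chaos_game(seed_from_bytes(image_b))
```

In this sketch the source image only decides which sequence of random choices gets made. The overall shape comes from the algorithm, and the same file always reproduces the same points.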

LWD@lemm.ee · edit-2 10 months ago

deleted

testfactor@lemmy.world · 10 months ago

I feel like you latched on to one sentence in my post and didn’t engage with the rest of it at all.

That sentence, in your defense, was my most poorly articulated, but I feel like you responded devoid of any context.

Am I to take it, from your response, that you think that a fractal image that uses a copyrighted image as a seed to its random number generator would be copyright infringement?

If so, how much do I, as the creator, have to “transform” that base binary string to make it “fair use” in your mind? Are random bit flips sufficient?
If so, how is me doing that different than having the machine do that as a tool? If not, how is that different than me editing the bits using a graphical tool?

LWD@lemm.ee · edit-2 10 months ago

deleted

testfactor@lemmy.world · 10 months ago

Fair on all counts. I guess my counter then would be: what is AI art other than running a bunch of pieces of other art through a computer system, then adding some “stuff you did” (to use your phrase) via a prompt, and then submitting the output as your own art?

That’s nearly identical to my fractal example, which I think you’re saying would actually be fair use?

HelloThere@sh.itjust.works · 10 months ago

It’s a question of scale. A single child cannot replace literally all artists, for example.

Exatron@lemmy.world · 10 months ago

The problem is that a human doesn’t absorb exact copies of what it learns from, and fair use doesn’t include taking entire works, shoving them in a box, and shaking it until something you want comes out.

S410@lemmy.ml · edit-2 9 months ago

deleted by creator

Exatron@lemmy.world · 10 months ago

Except they literally don’t. Human memory doesn’t retain an exact copy of things. Very good isn’t the same as exactly. And human beings can’t grab everything they see and instantly use it.

S410@lemmy.ml · edit-2 9 months ago

deleted by creator
