They exist at the current scale because we’re not regulating them, not “whether we like it or not.”
Absolutely not true. Regulations are both in place and in development, and none of them seem like they would prevent any of the applications currently on the market. I know the fearmongering side keeps arguing that a copyright case will stop the development of these but, to be clear, that’s not going to happen. All it’ll take to mitigate that is an extra line in a EULA, or investing in the dataset of someone who has that line in their EULA (Twitter and Reddit already, more to come for sure). The industry is actually quite fond of copyright-based training restrictions, as their main effect is most likely to be closing off open source alternatives and making it so that only Meta, Google, and MS/OpenAI can afford model training.
These are super not going away. Regulation is needed, but it’s not restricting or eliminating these applications in any way that would make a dent in the also poorly understood power consumption costs.
Regulating markets absolutely does prevent practices in those markets. Literally the point.
Yeah, who’s saying it doesn’t? It prevents the practices it prevents and allows the rest of the practices.
The regulation you’re going to see on this does not, in fact, prevent making LLMs or image generators, though. And it does not, in fact, prevent running them and selling them to people.
You guys have gotten it in your head that training data permissions are going to be the roadblock here, and they’re absolutely not going to be. There will be common-sense options, like opt-outs and opt-out defaults by mandate, just like there are on issues of data privacy under GDPR, but not absolute bans by any means.
So how much did opt-out defaults under GDPR stop social media and advertising companies from running social media and advertising data businesses?
Exactly.
What that will do is make it so you have to own a large set of accessible data, like social media companies do. They are positively salivating at the possibility that AI training will require paying them, since they’ll have a user agreement that demands allowing your data to be sold for training. Meanwhile, developers of open alternatives, who are currently running on a combination of openly accessible online data and monetized datasets put together specifically for research, will face more cost to develop alternatives. Ideally, the large AI corporations hope, the cost pressure will be too much and those developers will be bullied out of the market, or at least forced to lag behind in quality by several generations.
That’s what’s currently happening regarding regulation, along with a bunch of more reasonable guardrails about what you should and should not generate and so on. You’ll notice I didn’t mention anything about power or specific applications there. LLMs and image generators are not going away and their power consumption is not going to be impacted.