It is mentioned later in the article, but I think it's important to clearly draw a distinction between cases where (a) the "offender" is using the licensed work within the letter of the license but not its spirit, and (b) the "offender" has broken both the letter and the spirit of the license.
I've licensed multiple repositories under MIT, written under CC-BY, and published games under ORC. All of those licenses require attribution, something that AI, for example, explicitly ignores. In those situations, "Wait, no, not like that" isn't "I didn't expect you'd use it this way"; it's "you weren't authorized to use it this way."
This reminds me of “Fuck you, pay me”, a talk[0] given by Mike Monteiro on contract work (I believe the title is based on a quote from Goodfellas[1]).
[0] https://m.youtube.com/watch?v=jVkLVRt6c1U
[1] https://m.youtube.com/watch?v=P4nYgfV2oJA
Great post. I love the vampirism metaphor.
The internet as an open resource has been a tremendous boon to society as a whole. AI, likewise, acts as a force multiplier on top of this knowledge. You can learn anything. But AI also depletes the resources on which it builds.
An obvious way to counteract this is for the AI companies to give back generously through monetary donations or, at the very least, attributions. But unfortunately we see exactly the opposite.
If you train your AI on the commons, everything it generates should be in the commons. And all your profits should be shared with everyone.
If running your AI is incompatible with respecting copyright and intellectual property then you should not get to own a single bit of its output.
This is a perfect modern example of the Tragedy of the Commons, where the absence of governance mechanisms or social contracts around a shared resource (open knowledge) leads to exploitation that threatens the resource's sustainability for everyone.
While implementing a full game-theoretic solution may be challenging, a minimal viable approach could combine:
- Wikimedia's Enterprise API model for high-volume users
- Technical measures to identify and throttle non-contributing scrapers (a rough sketch follows below)
- Public transparency reports on AI company usage and contributions
- Industry certification program for "commons-friendly" AI development
This hybrid approach uses game theory principles to realign incentives while being practically implementable with current technologies and organizational structures.
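To make the "throttle non-contributing scrapers" item a bit more concrete, here is a minimal sketch of what such a measure could look like. Everything in it is an illustrative assumption rather than anything Wikimedia or any AI company actually runs: the tier names, the rates, the bot user-agent substrings, and the ScraperThrottle class are all made up for the example. The idea is a simple token-bucket limiter that gives paying Enterprise API keys a large request budget and known non-contributing bulk crawlers a very small one.

```python
# Minimal sketch of throttling non-contributing scrapers.
# All tiers, rates, keys, and user-agent substrings below are
# illustrative assumptions, not a real provider's configuration.
import time
from dataclasses import dataclass, field

RATE_LIMITS = {               # requests refilled per second, per tier (assumed values)
    "enterprise": 100.0,      # paying Enterprise API users
    "default": 10.0,          # ordinary readers and small tools
    "noncontributing_bot": 0.5,  # known bulk crawlers that give nothing back
}

# Illustrative crawler user-agent tokens; a real system would not rely on these alone.
KNOWN_BOT_SUBSTRINGS = ("GPTBot", "CCBot", "Bytespider")


@dataclass
class TokenBucket:
    """Classic token bucket: refill at `rate` per second up to `capacity`."""
    rate: float
    capacity: float
    tokens: float
    last_refill: float = field(default_factory=time.monotonic)

    def allow(self) -> bool:
        # Refill based on elapsed time, then try to spend one token.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last_refill) * self.rate)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class ScraperThrottle:
    """Classify each client into a tier and apply that tier's rate limit."""

    def __init__(self, enterprise_keys: set[str]):
        self.enterprise_keys = enterprise_keys
        self.buckets: dict[str, TokenBucket] = {}

    def classify(self, user_agent: str, api_key: str | None) -> str:
        if api_key in self.enterprise_keys:
            return "enterprise"
        if any(bot in user_agent for bot in KNOWN_BOT_SUBSTRINGS):
            return "noncontributing_bot"
        return "default"

    def allow(self, client_id: str, user_agent: str, api_key: str | None = None) -> bool:
        tier = self.classify(user_agent, api_key)
        cap = RATE_LIMITS[tier] * 2  # allow a small burst above the steady rate
        bucket = self.buckets.setdefault(
            client_id, TokenBucket(rate=RATE_LIMITS[tier], capacity=cap, tokens=cap)
        )
        return bucket.allow()


# Example: a known scraper with no enterprise key burns its tiny budget immediately.
throttle = ScraperThrottle(enterprise_keys={"paid-key-123"})
for i in range(5):
    print(i, throttle.allow("203.0.113.7", "Mozilla/5.0 (compatible; GPTBot/1.0)"))
```

In practice the classification would have to look at IP ranges and request patterns rather than trusting the user-agent string, but the incentive structure is the point: contribute (pay, attribute) and you get capacity; free-ride and you get throttled.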
I always appreciate a good post by Molly White :D
This is such good writing, and manages to offer a nuanced and informative new angle on an issue which has already been discussed at great length.
If you care about software freedom, you need to make all your software AGPL.
MIT is the "wait, no, not like that" license and GPL is a half-measure.
Non-commercial licenses are fine if you also provide a commercial option - who cares what the OSI thinks. (And you might want to look up who's a member of the OSI)
What I worry about is, once the robots become the main source of information, how the corps will restrict what information is fed into them to support whatever bias they have.
For example, just yesterday someone posted "noam chomsky is a genocide denier" so I went internet sleuthing to see what they were talking about. I first asked google and then ended up on the "Bosnian genocide denial" wikipedia page. I read the argument and checked the sources and concluded that, maybe, someone could make that claim.
Today, in response to TFA, I asked deepseek and received a well-rounded and, IMHO, unbiased response to the same question, which summarized the arguments from both sides. The only problem is they cite no sources, so you just have to trust the response or do as I did yesterday and go to the google.
Personally, if someone makes an extraordinary claim I'm going to go digging to find out what they're talking about, whether their argument is based on fact, and whether their conclusion actually follows from those facts. Take that ability away and we're just a bunch of sheep for the Silicon Valley Billionaires Club to fleece.