If you use AI-generated code, you currently cannot claim copyright on it in the US.
-
@jamie The funny thing about this whole thread is apparently I'd already blocked that guy some time ago, so I'm only seeing your side of the conversation. And…that's all I need to know anyway.

@jaredwhite @jamie Thanks for the tip for another hateful person to block.
-
If you use AI-generated code, you currently cannot claim copyright on it in the US. If you fail to disclose/disclaim exactly which parts were not written by a human, you forfeit your copyright claim on *the entire codebase*.
This means copyright notices and even licenses folks are putting on their vibe-coded GitHub repos are unenforceable. The AI-generated code, and possibly the whole project, becomes public domain.
Source: https://www.congress.gov/crs_external_products/LSB/PDF/LSB10922/LSB10922.8.pdf
@stroughtonsmith Is this relevant? I honestly don’t know a ton about this but I’m curious if you have thoughts on it…
-
@Verxion I think this is probably right:
-
@stroughtonsmith I think that’s fair. I seriously do and so I’m not disagreeing with you.
…the sad thing though (to me anyway) is that this means an indie dev is unlikely to be able to afford to retain ownership like a large corporation can.

-
@jamie I am afraid you are confusing registering copyright with the existence of copyright. They are not quite the same, and the differences are important.
Current law is that any human-created work is automatically copyrighted the moment it is created.
The link and screenshots you posted aren't about whether the human-written code mixed in with AI-written code is copyrighted—it is—they're about whether the copyright can be _registered_.
(1/2)
-
@jamie A copyrighted work that isn't registered is still copyrighted. It's not "in the public domain."
Registration, in the U.S., allows for certain copyright enforcement actions that can't be taken for unregistered works. But whether or not a work is registered has no bearing on whether it is copyrighted vs. in the public domain.
(2/2)
-
@jamie Just waiting for someone to find derivatives of their own GPL code in proprietary AI-generated code...
-
@jamie So proprietary projects that are made with LLMs can be leaked legally, since there's no copyright on them?
-
@christianschwaegerl
Maybe more like: sausages are vegan because an animal ate a vegan diet and then used those plant-based calories to grow its animal body, which was then packaged into a sausage. Very vegan ;)
-
@fsinn @jamie My understanding was that training an AI model on copyrighted work was fair use, because the actual "distribution"--when the AI generates something from a prompt--uses a de minimis amount of copyrighted content from an individual work, except if the user explicitly prompted something like, "Give me Homer Simpson surfing a space orca," at which point the AI company would throw the user all the way under the bus.
-
@christianschwaegerl @jamie @Azuaron @fsinn
Yes. Any "direct quoting" of copyrighted works, as text files on a disk, for example, would > only be a bunch of numbers < too. ASCII, Unicode, UTF-8, etc. are ways of encoding text into numbers, and displaying text representations (glyphs) of them later.
So LLMs hold "indirect" and maybe "abstract" (or not) numbers related to the copyrighted works. Not sure how that will or should work out, from a legal perspective.
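To make the "text is only numbers" point concrete, here is a minimal Python sketch. The function names `to_numbers` and `from_utf8` are made up for this illustration, and the sample string is arbitrary. It only shows the direct, reversible encoding of text into numbers; it says nothing about how an LLM stores what it was trained on.

```python
def to_numbers(text: str) -> tuple[list[int], list[int]]:
    """Return (Unicode code points, UTF-8 byte values) for the given text."""
    code_points = [ord(ch) for ch in text]   # one number per character
    utf8_bytes = list(text.encode("utf-8"))  # one or more bytes per character
    return code_points, utf8_bytes


def from_utf8(byte_values: list[int]) -> str:
    """Turn a list of UTF-8 byte values back into the original text."""
    return bytes(byte_values).decode("utf-8")


code_points, utf8_bytes = to_numbers("GPL é")
print(code_points)            # [71, 80, 76, 32, 233]
print(utf8_bytes)             # [71, 80, 76, 32, 195, 169]
print(from_utf8(utf8_bytes))  # GPL é
```

The round trip is lossless, which is why a text file of a copyrighted work is still a copy of that work even though the disk only holds numbers.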
-
@katrinatransfem @fsinn @jamie If the material is acquired legally, they don't need a specific "license" to use it as training material. Copyright holders don't get to determine how their work is used after it's acquired, except to prevent its distribution.
Now, for the even larger than normal scumbags like Anthropic and Meta that torrented millions of books, that's certainly a problem. But Google, for instance, actually bought all the books they scanned.
@Azuaron @katrinatransfem @fsinn @jamie
I think that the careless, abusive, and harmful "gathering" practices need to be challenged as misuse of others' computing resources and as the "distributed denial of service attacks" that they, in effect, are.
-
Additionally, AI-generated code can be copyright infringement if the AI basically generated a copy of some copyrighted code. And since AI is trained on lots of GPLed code, there is a high probability it will generate code that would need to be licensed accordingly.
There is no clean room implementation of anything with AI. The code is immediately tainted.
@Lapizistik In the US, courts have determined (for now, at least) that training an AI model on copyrighted works is considered "fair use". So it's basically legalized copyright laundering. Even code released under the GPL loses its infectiousness when laundered through an LLM.
I'd be very interested to see what other countries do around that, because it would determine which models are legal to use where.
-
@Lapizistik To be clear, I agree with you. It's a moral failure to make billions of dollars from other people's effort without compensating them at all.
-
@jik In other parts of this thread, this is being discussed. I was limited on space, so I took shortcuts. What I meant is that, in order to enforce your copyright, you need to prove you own the copyright. Registering it is the single most effective way to do that.
If you can't register your copyright, you (effectively) can't enforce it.
If you can't enforce your copyright, your copyright vs public domain is a distinction without a meaningful difference.
I couldn't fit all that in the post.
-
@jamie
This is not my point. Even if it _is_ “fair use”: if the LLM produces a 1:1 copy (minus some renamed variables) of some relevant piece of code, it is not producing something “new”. As a human I can learn from any code (copyrighted or not), but I cannot just take the code, rename some variables and publish it as my own creation. I would lose in court.¹
So technically, if you use an LLM to produce code for you, you need to check whether any relevant piece of it is a copy of anything that exists.
Clean room implementation requires the programmer to not have seen the original code but only the requirements.
__
¹ Otherwise you could just take any piece of copyrighted code, rename variables and say it is yours because an LLM has produced it.
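As a rough illustration of what "a 1:1 copy minus some renamed variables" looks like to a machine, here is a small Python sketch using the standard `tokenize` module. The helper name `normalized_tokens` and the sample snippets are invented for this example. It compares only two Python snippets that you already have in hand; it does not search "anything that exists", and it does not decide whether something is infringement.

```python
import io
import keyword
import tokenize


def normalized_tokens(source: str) -> list[str]:
    """Tokenize Python source, replacing each identifier with a placeholder.

    Identifiers are numbered in order of first appearance, so two snippets
    that differ only in names and comments yield the same list.
    """
    placeholders: dict[str, str] = {}
    result: list[str] = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in (tokenize.COMMENT, tokenize.NL, tokenize.ENDMARKER):
            continue  # ignore comments, blank lines, and the end marker
        if tok.type in (tokenize.NEWLINE, tokenize.INDENT, tokenize.DEDENT):
            # keep the block structure, but not the exact whitespace
            result.append(tokenize.tok_name[tok.type])
        elif tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            name = placeholders.setdefault(tok.string, f"id{len(placeholders)}")
            result.append(name)
        else:
            result.append(tok.string)
    return result


original = "def total(items):\n    s = 0\n    for x in items:\n        s += x\n    return s\n"
renamed = "def sum_all(values):\n    acc = 0\n    for v in values:  # loop\n        acc += v\n    return acc\n"
different = "def total(items):\n    return sum(items)\n"

print(normalized_tokens(original) == normalized_tokens(renamed))    # True
print(normalized_tokens(original) == normalized_tokens(different))  # False
```

Renaming variables does not change the normalized token stream, which is the sense in which such a copy is not "new". Reordered statements or restructured logic would defeat this simple comparison.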