I've asked Claude to implement a Rust port of a JS library, given source code in another directory: there's already one such implementation in OSS, mine.
-
@robertobottoni @horusiath I think the point here is that AI is a way to bypass license restrictions. As an author of GPL software, my intention is for it to stay open and be the foundation of a growing open community. AI uses this intellectual property without marking it as such. This way the original work is violated. This is also the reason why there will be less open source software over time, and that will have an impact on everything!
-
@holtwick Sorry, I should have read that more thoroughly. My answer was absolutely incoherent, so I deleted it. Thank you for pointing it out to me. @horusiath is of course absolutely correct in everything he says!
-
@robertobottoni @horusiath No problem, these are challenging times. AI is certainly an epochal shift in how we use technology. But such breaks come with a lot of ethical questions as well. Hopefully we can put #AI on the right track, the train is unstoppable anyway.
-
@holtwick @horusiath So true. In my opinion, negative consequences will dominate. As much as the technology can be of use, especially in our field, it will have a tremendous impact on disinformation, making it more difficult to distinguish right from wrong as biased sources of information multiply in everyday life.
The only solution lies in educated people and maybe platform regulation, which has to be discussed for FOSS from a different perspective.
-
It's sort of an eye-opening experience: as a person very familiar with the plagiarised source, you can see how the LLM is stitching together fragments of code seen somewhere else. It's just that most of the time we don't know the original code that the AI reused and cannot notice the stitches, so we consider it to be original writing.
gl;hf to all the people using AI to write code in a domain where GPL source was available.
@horusiath Their fundamental algorithm is to reproduce characteristics of the text they're trained on. That means writing words (which can be the same or a synonym) in the same order. In formal languages such as programming languages there are not many synonyms (that's the point: be concise and unambiguous). Dependencies across code blocks quickly constrain the possible word chainings.
So… basically they can only be a sophisticated code retrieval system.
-
@robertobottoni @holtwick @horusiath If you think that negative consequences will dominate then just stop using it. Every use is a promotion of the technology and will thus aggravate the negative consequences even more.
-
@Fedihacker IMO you underestimate how many ways there are to solve the same problem.
Besides, I'm talking about:
- Porting types that were not present in the source, but existed in past versions & my code.
- Porting names that didn't exist in the source, but exist in my implementation.
- Using highly un-idiomatic design choices, not present in the source. Tbh I haven't found them anywhere outside my lib. It's way too specific to be considered "anyone would write it this way".
-
@horusiath I've seen this in specialist domains: given nothing to start with, it gave me back the open-source vendor example near verbatim and couldn't generalise to handle errors correctly, or at all.
-
@jlink @holtwick @horusiath I wrote that with respect to the technology being almost inevitable in everyday life. Text and images are not labelled as AI-generated. Today you can still tell the difference in most cases, but that will get harder.
As a dev it is even harder to tell just by looking at code today. But still, the technology is a helper, and in my opinion we will all have to use it to some extent, especially when it comes to tedious, boring or time-consuming tasks like error tracking and doc writing.
-
I've asked Claude to implement a Rust port of a JS library, given source code in another directory: there's already one such implementation in OSS, mine. Result?
The AI blatantly plagiarised my OSS code, including parts that were not present in the source it was pointed to port.
When I prohibited it from touching outside implementations and told it to focus on translating the local directory... it ignored the command and plagiarised my work again. It did the same with a C# port and one existing impl.
@horusiath Generative AI should be called Derivative AI. It's a copyright/license washing machine for images, code, books...
-
@horusiath But you expected differently for ... why? Seems obvious that there's not much else it could do?
-
Yours was the only Markov chain it had available to tug on.
-
@horusiath
fun fact: this is not only a problem with GPL, but almost all code licenses (except for public-domain-likes) require you to keep the license/copyright header *at least in the code*, often also in documentation shipped with binaries (e.g. MIT license)