Are you using #Codeberg to host your favorite AI-assisted and otherwise vibecoded project because your desire for dopamine has utterly destroyed your willingness to learn new things?
-
@pojntfx There has, they used to call them "Google dorks" or "Advanced Search". Now if that didn't work 100% of the time, well, that applies to both cases (readjusting a query in a different form); it's beside the point anyway, because people are not looking for something that works for minor edge cases, they are looking for a way to look for information, and you can't even look up an omelette recipe anymore without SEO garbage taking up the first two pages.
@n0toose Yeah, the SEO slop is obviously horrendous. I'm ngl though, being able to use an LLM to search through say IndieWeb instead has been the first time in a long time that I've actually been able to find (non-code) answers to questions again. Used it for booking flights with niche airlines in a country I've never been to for example. That kind of stuff was always locked behind proprietary APIs for so long and now you can actually access them without them for the first time in forever.
-
@n0toose There were lots of proposals around criminalizing "unauthorized access" to services in the past few decades, about trying to make it so that only a "human" can access them, enforcing ToS legally ... I've really only seen them used against end users in practice (Reddit's anti-scraping policy/API shutdown, third-party clients for Signal, any reverse engineering project ever etc.)
A lot of these kinds of laws will have effects far, far worse than DDoSing public infrastructure IMHO.
@pojntfx I think one can be for scraping and making data available e.g. for researchers but against the specific manners in which startups break things; it's just hard to explain that to someone who doesn't operate infrastructure for people at scale.
Anyway, we just have to spend two or three times the price on SSDs I guess (see: greater societal impact), so if that's fine...
-
@pojntfx I also think that the notion of local LLMs letting you find niche papers exaggerates their abilities, and that the ability to use them depends on hardware that is not accessible anymore due to data center costs and IMO due to the overall war against general purpose computing.
@n0toose There are production constraints around all of this atm, yes. But much like how you can't fix the housing crisis without making it cheaper to build houses and actually building them, I don't think we can fix something like this without actually building out the fabs and getting supply up there w/ demand.
And local LLMs are very much "real" now. Try out Newelle or Alpaca on GNOME on your regular laptop - even mine can run them w/o issues now via Vulkan, and I don't have a lot of VRAM.
-
@pojntfx Yeah and you're treating this as if it's something to be taken for granted forever; the counterexamples are the equivalents of "searxng" or alternative search engines to me tbh.
-
@n0toose I mean yes, optimally a law like you mention would try and fix this, but I have 0 trust in any jurisdiction actually making a law like that. I'm pretty certain we'll instead end up in a world where only massive companies that can pay for IP licensing agreements can train models.
-
@pojntfx and you're using that to find obscure papers?
-
@n0toose I don't know about your experience, but any legal options I've found that try to solve this problem are worse than useless. I find nothing of relevance on Marginalia and other things like it.
-
@n0toose Yeah! I mean I just did yesterday, for that CRIU one. Try it
All you need is a CPU or anything that can do Vulkan. I use fully OSS drivers, even runs on those.
-
@pojntfx I mean, mildly interesting (and thanks for letting me know) but doesn't convince me on ethical grounds nevertheless.
-
@n0toose Fair, I understand that. It's ultimately your choice. If you work best w/o those tools - then that's great too.