Q: I want to wash my car.
-
@knowmadd@mastodon.world for a second, I read the question as, "The car is 50 meters away, should I walk or drive?" Then I realized it said "The car wash is 50 meters away," and I got why this would trick the AI.
LLMs work on the "attention" model to predict what output comes next. They are trained on which parts of the sentence deserve the most focus when predicting the result and generating an answer. If the meaning of a sentence can be changed entirely by just one short word, it is more likely to trip up an LLM.
-
@knowmadd clankers have no idea about real life. I hope we will see the end of this bullshit.
-
@knowmadd start pushing!
-
@knowmadd Did you also do a survey of how many people would be tricked by this question? I, for one, admit I am one, because my initial reaction to your post was: what's wrong with that answer?
-
I think you should walk to the carwash, dismantle it, walk back and rebuild it around your car. Once you have tested everything and made sure your permits are okay, etc., start the washing.
@bitchboss @knowmadd @MissGayle
Right. Of course, LLMs, lacking creative thinking, aren't able to come up with this by themselves.
-
@GerardThornley @knowmadd @MissGayle
What did we expect from an optimised translator/spell checker? Creativity? Reasoning? Ethics? Meh. It loosely strings things together and searches for combinations that appear in a piece of text that was once ripped off, and assumes without even reasoning that it must be the holy truth.
-
@knowmadd Google's gets it right, but then goes on to ramble about stuff. Someone needs to instruct these things not to analyse or "break this down" so much.
All in all, as expected, disappointing.
-
@knowmadd gemini

-
@knowmadd if you walk, you are, in fact, carrying heavy equipment: the car.

-
@erwinrossen really? My first reaction in my head was 'what a dumb question'.
-
@knowmadd This is a very sad reflection on the minds of people today, the inability to read a question fully, the wrong standards, the assumptions made, everything.
-
@knowmadd next, ask a reasonable question, and then simply state "Seahorse Emoji, now."
-
@knowmadd What I like most is that the Qwen website shows this little light bulb with the text “thinking completed.”

-
@knowmadd yeah, LLMs will replace us all ... they are so much better at {looking frantically through my notes} ... providing answers with high confidence that are utter nonsense.
-
@knowmadd I tried to reproduce the result with Gemini and ChatGPT. Either the AI has learned something new, or there is another reason for this. Neither fell for the trick question and even responded with irony in some cases.
-
@knowmadd i got this : "Verdict: Walking is the best choice here—it’s quick, eco-friendly, and practical for such a short distance. Plus, you’ll avoid driving a dirty car to the car wash!"