LLMs have no model of correctness, only typicality.

Uncategorized · 15 Posts · 11 Posters
• Paul Cantrell

    LLMs have no model of correctness, only typicality. So:

    “How much does it matter if it’s wrong?”

    It’s astonishing how frequently both providers and users of LLM-based services fail to ask this basic question — which I think has a fairly obvious answer in this case, one that the research bears out.

    (Repliers, NB: Research that confirms the seemingly obvious is useful and important, and “I already knew that” is not information that anyone is interested in except you.)

    1/ https://www.404media.co/chatbots-health-medical-advice-study/

Paul Cantrell
#2

    Despite the obviousness of the larger conclusion (“LLMs don’t give accurate medical advice”), this passage is…if not surprising, exactly, at least really really interesting.

    2/

• Paul Cantrell

      Despite the obviousness of the larger conclusion (“LLMs don’t give accurate medical advice”), this passage is…if not surprising, exactly, at least really really interesting.

      2/

Paul Cantrell
#3

      There’s a lesson here, perhaps, about the tangled relationship between what is •typical• and what is •correct•, and what it is that LLMs actually do:

      When medical professionals ask medical questions in technical medical language, the answers they get are typically correct.

When non-professionals ask medical questions in a perhaps medically ill-formed vernacular mode, the answers they get are typically wrong.

      The LLM readily models both of these things. Despite having no notion of correctness in either case, correctness is more statistically typical in one than the other.

      3/
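A minimal toy sketch of that point (Python, with made-up probabilities for two hypothetical prompt registers; nothing here comes from the study): sampling only what is typical produces different accuracy depending on which register the prompt resembles.

```python
import random

# Hypothetical numbers, purely illustrative (not taken from the study):
# the fraction of training-like text in each prompt "register" whose
# answer to a given medical question happens to be correct.
P_CORRECT_GIVEN_REGISTER = {
    "clinical":   0.90,  # question phrased in technical medical language
    "vernacular": 0.40,  # question phrased in everyday, ill-formed terms
}

def typical_answer_is_correct(register: str, rng: random.Random) -> bool:
    """Draw one 'typical' answer for this register and report whether it is correct.

    The toy model has no notion of correctness; it only reproduces whatever
    is statistically typical of text resembling the prompt.
    """
    return rng.random() < P_CORRECT_GIVEN_REGISTER[register]

def accuracy(register: str, n: int = 10_000, seed: int = 0) -> float:
    rng = random.Random(seed)
    return sum(typical_answer_is_correct(register, rng) for _ in range(n)) / n

for register in P_CORRECT_GIVEN_REGISTER:
    print(f"{register:>10}: ~{accuracy(register):.0%} of sampled answers are correct")
```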

• Paul Cantrell

        LLMs have no model of correctness, only typicality. So:

        “How much does it matter if it’s wrong?”

        It’s astonishing how frequently both providers and users of LLM-based services fail to ask this basic question — which I think has a fairly obvious answer in this case, one that the research bears out.

        (Repliers, NB: Research that confirms the seemingly obvious is useful and important, and “I already knew that” is not information that anyone is interested in except you.)

        1/ https://www.404media.co/chatbots-health-medical-advice-study/

George B
#4

        @inthehands

It has been so hard to explain that to family members who ask about LLMs. "But it's right most of the time" is one of the most common responses when I talk about how there is no internal sense of reality or truth, so they need to check every output to be sure.

• Paul Cantrell

          Despite the obviousness of the larger conclusion (“LLMs don’t give accurate medical advice”), this passage is…if not surprising, exactly, at least really really interesting.

          2/

Dave bauer
#5

          @inthehands Obvious to me. Having the same family doctor who knows you all for 20 years really is important and an immense privilege.

• Paul Cantrell

            Despite the obviousness of the larger conclusion (“LLMs don’t give accurate medical advice”), this passage is…if not surprising, exactly, at least really really interesting.

            2/

Troed Sångberg
#6

            @inthehands This is why experienced developers can make use of LLMs, and why LLMs won't replace them.

• Paul Cantrell

              There’s a lesson here, perhaps, about the tangled relationship between what is •typical• and what is •correct•, and what it is that LLMs actually do:

              When medical professionals ask medical questions in technical medical language, the answers they get are typically correct.

When non-professionals ask medical questions in a perhaps medically ill-formed vernacular mode, the answers they get are typically wrong.

              The LLM readily models both of these things. Despite having no notion of correctness in either case, correctness is more statistically typical in one than the other.

              3/

V
#7

              @inthehands This result makes sense - they generate *statistically likely* text based on a prompt, and the stolen words of basically the entire internet and several libraries worth of books.
              If the prompt is such that the text it generates is statistically-likely to be correct - the language used closely aligns with a medical textbook, diagnostic manual, etc. - it's more likely to generate text based on sources like that.
              If it sounds like a tweet, you're more likely to get a shitpost.

• V

                @inthehands This result makes sense - they generate *statistically likely* text based on a prompt, and the stolen words of basically the entire internet and several libraries worth of books.
                If the prompt is such that the text it generates is statistically-likely to be correct - the language used closely aligns with a medical textbook, diagnostic manual, etc. - it's more likely to generate text based on sources like that.
                If it sounds like a tweet, you're more likely to get a shitpost.

V
#8

                @inthehands It has no concept of what is correct, real, valuable, or meaningful - only what is statistically likely given a particular prompt.
                Which is a problem - because if you ask it a question, you need to know the correct answer, or have the means to verify it.
                Because it has no idea what the correct answer is.
                If you don't know enough to be able to verify the result, then you can't trust it.
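A minimal sketch of that constraint (Python; ask_llm is a hypothetical placeholder for whatever chatbot is in use): an answer is only usable when an independent check exists, and for most real questions there isn't one.

```python
from typing import Callable, Optional

def ask_llm(question: str) -> str:
    """Hypothetical stand-in for a chatbot call; returns whatever text the model emits."""
    return "42"  # placeholder response for the sketch

def answer_if_verifiable(question: str, check: Callable[[str], bool]) -> Optional[str]:
    """Return the model's answer only if an independent check accepts it.

    The check must not rely on the model itself; otherwise nothing has been verified.
    """
    answer = ask_llm(question)
    return answer if check(answer) else None

# Arithmetic is easy to verify without trusting the model...
print(answer_if_verifiable("What is 6 * 7?", lambda a: a.strip() == str(6 * 7)))
# ...but for questions like "is this symptom serious?" there is no such
# independent check, which is exactly the problem described above.
```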

• Paul Cantrell

                  There’s a lesson here, perhaps, about the tangled relationship between what is •typical• and what is •correct•, and what it is that LLMs actually do:

                  When medical professionals ask medical questions in technical medical language, the answers they get are typically correct.

When non-professionals ask medical questions in a perhaps medically ill-formed vernacular mode, the answers they get are typically wrong.

                  The LLM readily models both of these things. Despite having no notion of correctness in either case, correctness is more statistically typical in one than the other.

                  3/

Greg Whitehead
#9

@inthehands I continue to be well-served by treating LLMs as fancy autocomplete and not anthropomorphizing them. I feel like the chat interface is where things went sideways, making it too easy to believe that they "think".

• Paul Cantrell

                    Despite the obviousness of the larger conclusion (“LLMs don’t give accurate medical advice”), this passage is…if not surprising, exactly, at least really really interesting.

                    2/

Brian Marick
#10

                    @inthehands An aside. When people used to ask Dawn wasn’t it hard to treat animals because “they can’t tell you what’s wrong,” she’d answer that they also can’t lie about it. She thought the latter probably outweighed the former.

• Troed Sångberg

                      @inthehands This is why experienced developers can make use of LLMs, and why LLMs won't replace them.

Greg Lloyd
#11

                      @troed @inthehands

                      I see the high end #LLM experience like riding a good horse — exceptionally skilled in horsey things, moving fast, etc — an augmentation tool that’s exceptionally easy to use to augment your own abilities, not an #AI.

                      Ref 🧵https://federate.social/@Roundtrip/115549029949917075

• Paul Cantrell

                        LLMs have no model of correctness, only typicality. So:

                        “How much does it matter if it’s wrong?”

                        It’s astonishing how frequently both providers and users of LLM-based services fail to ask this basic question — which I think has a fairly obvious answer in this case, one that the research bears out.

                        (Repliers, NB: Research that confirms the seemingly obvious is useful and important, and “I already knew that” is not information that anyone is interested in except you.)

                        1/ https://www.404media.co/chatbots-health-medical-advice-study/

Tropical Chaos
#12

                        @inthehands chatbots are terrible, period.

• Paul Cantrell

                          There’s a lesson here, perhaps, about the tangled relationship between what is •typical• and what is •correct•, and what it is that LLMs actually do:

                          When medical professionals ask medical questions in technical medical language, the answers they get are typically correct.

When non-professionals ask medical questions in a perhaps medically ill-formed vernacular mode, the answers they get are typically wrong.

                          The LLM readily models both of these things. Despite having no notion of correctness in either case, correctness is more statistically typical in one than the other.

                          3/

Garrett Wollman
#13

                          @inthehands Worth noting, however, that when the training set captures a lot of outdated or irrelevant information, because the field has advanced rapidly since the model was trained, "typical" can start to diverge again. This can be mitigated if the practitioner knows to consult the latest information (either by reading it or by feeding it to the model as a part of the query) but of course they have to be aware of that. This is I suppose no worse than relying on the practitioner's knowledge.
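A minimal sketch of that mitigation (Python; build_prompt and the guideline excerpts are hypothetical, not any particular product's API): the current material is pasted into the query so the typical continuation is conditioned on it rather than on stale training text.

```python
def build_prompt(question: str, latest_excerpts: list[str]) -> str:
    """Assemble a query that carries the up-to-date material along with the question."""
    context = "\n\n".join(latest_excerpts)
    return (
        "Answer using only the guideline excerpts below. "
        "If they do not address the question, say so.\n\n"
        f"Excerpts:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_prompt(
    "What is the current first-line treatment for condition X?",
    [
        "<excerpt from this year's guideline>",
        "<excerpt from a recent review article>",
    ],
)
# The prompt would then go to whatever model is in use. The practitioner still
# has to know that the pasted material is itself current and relevant, which is
# the caveat raised above.
print(prompt)
```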

• Garrett Wollman

                            @inthehands Worth noting, however, that when the training set captures a lot of outdated or irrelevant information, because the field has advanced rapidly since the model was trained, "typical" can start to diverge again. This can be mitigated if the practitioner knows to consult the latest information (either by reading it or by feeding it to the model as a part of the query) but of course they have to be aware of that. This is I suppose no worse than relying on the practitioner's knowledge.

Garrett Wollman
#14

                            @inthehands OTOH, as practitioners come to rely on stochastic information retrieval for more and more diagnoses, as it confirms what they already know, it may cause them to assign more weight to the information in the model than is justified, overruling their own second thoughts. ("Computer says...")

• Paul Cantrell

                              There’s a lesson here, perhaps, about the tangled relationship between what is •typical• and what is •correct•, and what it is that LLMs actually do:

                              When medical professionals ask medical questions in technical medical language, the answers they get are typically correct.

When non-professionals ask medical questions in a perhaps medically ill-formed vernacular mode, the answers they get are typically wrong.

                              The LLM readily models both of these things. Despite having no notion of correctness in either case, correctness is more statistically typical in one than the other.

                              3/

mirth@mastodon.sdf.org
#15

@inthehands One of the factors in this mess is the heavily boosted notion that LLMs contain facts or knowledge. They do, coincidentally, sort of, but not really. A safer mental model is to think of them as a fuzzy virtual machine of sorts, not unlike a vibe-y JVM but programmed in something dressed as plain language. Garbage in, garbage out. Often anything in, garbage out.
