"#News #publishers limit #InternetArchive access due to #AI scraping concerns."https://www.niemanlab.org/2026/01/news-publishers-limit-internet-archive-access-due-to-ai-scraping-concerns/

    petersuber
    wrote last edited by petersuber@fediscience.org
    #1

    "#News #publishers limit #InternetArchive access due to #AI scraping concerns."
    https://www.niemanlab.org/2026/01/news-publishers-limit-internet-archive-access-due-to-ai-scraping-concerns/

    PS: I'm one who thinks AI training on copyrighted content is #FairUse and (separate point) even desirable in the case of academic research.
    https://fediscience.org/@petersuber/113443473594224752

    But this kind of training will create huge collateral damage -- indirectly through publisher action -- if it diminishes the @internetarchive.

    #Copyright #Journalism

      petersuber
      wrote last edited by
      #2

      Update. It's happening. "News Publishers Are Now Blocking The Internet Archive, And We May All Regret It."
      https://www.techdirt.com/2026/02/13/news-publishers-are-now-blocking-the-internet-archive-and-we-may-all-regret-it/

      @mmasnick is right: "In our rush to punish #AI companies, we’re destroying public goods that serve everyone…We’re sacrificing the historical record not because of proven harm, but because publishers are worried about what might happen. That’s a hell of a tradeoff."

      #Copyright #InternetArchive #Journalism #Publishers
      @internetarchive

        petersuber
        wrote last edited by petersuber@fediscience.org
        #3

        Update. But are #publishers right to worry that #AI companies can freely scrape the #WaybackMachine in order to train their tools? No, says Mark Graham, director of the Wayback Machine.
        https://www.techdirt.com/2026/02/17/preserving-the-web-is-not-the-problem-losing-it-is/

        "The Wayback Machine is built for human readers. We use rate limiting, filtering, and monitoring to prevent abusive access, and we watch for and actively respond to new scraping patterns as they emerge."

        #Copyright #InternetArchive #Journalism
        @internetarchive
