What's the state of the art in keeping scraping bots out of your webservers?

  • CMDR Yojimbosan 🅅⁂ wrote:
    #1

    What's the state of the art in keeping scraping bots out of your webservers?

    Is it rate limiting, a restrictive robots.txt with user-agent matching, and a proof-of-work page loader?

    Is there a well-maintained list of scraper user agents?
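
For readers unfamiliar with the first two techniques named in the question, here is a minimal Python sketch of user-agent matching combined with per-client token-bucket rate limiting. Everything in it is an assumption made for illustration: the names `RequestFilter`, `TokenBucket`, and `is_blocked_agent` are hypothetical, the three crawler tokens are only examples rather than a maintained list, and the limits are arbitrary. In practice this logic usually lives in the web server or a reverse proxy rather than in application code.

```python
import re
import time

# Illustrative crawler tokens only; a real deployment would load a maintained list.
BLOCKED_AGENT_PATTERNS = ["GPTBot", "CCBot", "Bytespider"]
BLOCKED_AGENT_RE = re.compile("|".join(BLOCKED_AGENT_PATTERNS), re.IGNORECASE)


def is_blocked_agent(user_agent):
    """Return True if the User-Agent header matches a blocked pattern."""
    return bool(BLOCKED_AGENT_RE.search(user_agent or ""))


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, now=None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = time.monotonic() if now is None else now

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill for the time that has passed, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class RequestFilter:
    """Hypothetical front-door check: block known agents, then rate limit per client."""

    def __init__(self, rate=1.0, capacity=5.0):
        self.rate = rate
        self.capacity = capacity
        self.buckets = {}

    def check(self, client_ip, user_agent, now=None):
        """Return the HTTP status to send: 403, 429, or 200."""
        if is_blocked_agent(user_agent):
            return 403
        bucket = self.buckets.get(client_ip)
        if bucket is None:
            bucket = self.buckets[client_ip] = TokenBucket(self.rate, self.capacity, now)
        return 200 if bucket.allow(now) else 429
```

User-agent matching only stops bots that identify themselves honestly, and per-IP limits are weak against scrapers that rotate addresses, which is presumably why the question also asks about proof-of-work.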


      Tekniquelly correct wrote:
      #2

      @yojimbo This is what I'm using: https://honeypot.net/2025/12/22/i-read-yann-espositos-blog.html


        Alex wrote:
        #3

        @yojimbo https://github.com/TecharoHQ/anubis
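
To make the "proof-of-work page loader" idea from the original question concrete, here is a generic hashcash-style sketch in Python. It is not Anubis's actual protocol or code; the function names, the `challenge:nonce` format, and the choice of SHA-256 with a leading-zero-bit target are all assumptions made for illustration. The server issues a random challenge, the client burns CPU searching for a nonce, and the server verifies the result with a single hash.

```python
import hashlib
import secrets


def make_challenge():
    """Server side: issue a random, unpredictable challenge string."""
    return secrets.token_hex(16)


def leading_zero_bits(digest):
    """Count the zero bits at the start of a bytes digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge, nonce, difficulty):
    """Server side: one hash is enough to check the client's work."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge, difficulty):
    """Client side: brute-force a nonce; expected cost is about 2**difficulty hashes."""
    nonce = 0
    while not verify(challenge, nonce, difficulty):
        nonce += 1
    return nonce


if __name__ == "__main__":
    challenge = make_challenge()
    nonce = solve(challenge, 16)
    print(challenge, nonce, verify(challenge, nonce, 16))
```

The asymmetry is the point of the design: a single visitor pays a small one-off cost, while a scraper fetching many pages pays it on every challenge. A real deployment would also need to bind the challenge to the client and expire it, which this sketch leaves out.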
