A few days ago, a client’s data center (well, actually a server room) "vanished" overnight.

Tags: sysadmin, horrorstories, ithorrorstories, monitoring
176 Posts 77 Posters 0 Views
  • Elena Rossini ⁂_ Elena Rossini ⁂

    @EnigmaRotor reading this at lunch in a cafe near my house and I keep chuckling and smiling from ear to ear. @stefano is such a treasure 🙌🏆

    ozoned
    wrote last edited by
    #27

    @_elena@mastodon.social When you direct the movie, can I star as the legendary @stefano@mastodon.bsd.cafe?

    • Bojan LandekićB Bojan Landekić

      @stefano so refreshing to read a quality tech tale on Mastodon. Thanks for sharing!

      Stefano Marinelli
      wrote last edited by
      #28

      @bojanlandekic thank you! I'm just trying to share some real-life experiences

      • James SewardJ James Seward

        @rhoot @stefano I have my cronjob scripts touch a file as their final action and my monitoring stuff alarms if the file is too old

        randomized
        wrote last edited by
        #29

        @jamesoff
        I have my backup scripts write their return code to a file.

        I monitor the file's content and mtime, and get an alert if the content is not 0 or the file is too old.

        I also regularly manually test backup restore.

        Then I can sleep

        @rhoot @stefano
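
The status-file approach described in this exchange can be sketched roughly as follows. This is only an illustration, not the posters' actual tooling: the function name `check_status_file`, the one-hour limit, and the convention that the file holds just the exit code are all assumptions.

```python
import time
from pathlib import Path


def check_status_file(path, max_age_seconds, now=None):
    """Return a list of problems with a backup status file (empty list = healthy).

    Assumes the backup job writes its exit code to `path` as its final action,
    so a missing or stale file means the job never finished.
    """
    now = time.time() if now is None else now
    status = Path(path)
    if not status.exists():
        return [f"{status}: status file is missing"]

    problems = []

    # A stale mtime means the job has not completed recently.
    age = now - status.stat().st_mtime
    if age > max_age_seconds:
        problems.append(f"{status}: last updated {age:.0f}s ago (limit {max_age_seconds}s)")

    # Any content other than "0" means the job reported a failure.
    content = status.read_text().strip()
    if content != "0":
        problems.append(f"{status}: backup exit code was {content!r}, expected '0'")

    return problems
```

A monitoring agent could call `check_status_file("/var/run/backup.status", 3600)` (a hypothetical path) on a schedule and raise an alert whenever the returned list is non-empty. Checking both the content and the mtime catches a job that failed as well as one that never ran.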

        • Ian Campbell 🏴N Ian Campbell 🏴

          @stefano This is such a good, if niche, example of "paying attention to the fundamentals and the alerts covers all sorts of things you'd never imagine happening."

          Thanks for sharing.

          Stefano Marinelli
          wrote last edited by
          #30

          @neurovagrant thank you! My rule is: we need moooarr alerts, as you never know how and when (not if - we know it will happen) your alerting system will break.

          • Stefano MarinelliS Stefano Marinelli

            A few days ago, a client’s data center (well, actually a server room) "vanished" overnight. My monitoring showed that all devices were unreachable. Not even the ISP routers responded, so I assumed a sudden connectivity drop. The strange part? Not even via 4G.

            I then suspected a power failure, but the UPS should have sent an alert.

            The office was closed for the holidays, but I contacted the IT manager anyway. He was home sick with a serious family issue, but he got moving.

            To make a long story short: the company deals in gold and precious metals. They have an underground bunker with two-meter-thick walls. They were targeted by a professional gang, which used a tactic seen in similar hits: identify the main power line, tamper with it at night, and send a massive voltage spike through it.

            The goal is to fry all alarm and surveillance systems. Even if battery-backed, they rarely survive a surge like that. Thieves count on the fact that during holidays, owners are away and fried systems can't send alerts. Monitoring companies often have reduced staff and might not notice the "silence" immediately.

            That is exactly what happened here. But there is a "but": they didn't account for my Uptime Kuma instance monitoring their MikroTik router, installed just weeks ago. Since it is an external check, it flagged the lack of response from all IPs without needing any alert to be sent from the inside.
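
The idea of an external check can be illustrated with a small sketch. This is not how Uptime Kuma works internally; it is a minimal stand-in that probes a site from the outside over TCP and treats total silence as a site-level event. The function names, the three-state classification, and the example targets are assumptions made for illustration.

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def site_status(targets, probe=tcp_reachable):
    """Classify a site as 'up', 'degraded', or 'down' from outside probes.

    `targets` is a list of (host, port) pairs; `probe` is injectable so the
    classification can be exercised without touching the network.
    """
    if not targets:
        raise ValueError("at least one target is required")
    results = {target: probe(*target) for target in targets}
    reachable = sum(results.values())
    if reachable == len(targets):
        state = "up"
    elif reachable == 0:
        # Nothing answers at all: no device inside could have reported this.
        state = "down"
    else:
        state = "degraded"
    return state, results
```

Run from a machine outside the monitored site, a call such as `site_status([("192.0.2.1", 443), ("192.0.2.2", 22)])` (documentation-range addresses used as placeholders) returns `"down"` only when every target is silent, which is the situation that no alert originating inside the site can report.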

            The team rushed to the site and found the mess. Luckily, they found an emergency electrical crew to bypass the damage and restore the cameras and alarms. They swapped the fried server UPS with a spare and everything came back up.

            The police warned that the chances of the crew returning the next night to "finish" the job were high, though seeing the systems back online would likely make them move on. They also warned that thieves sometimes break in just to destroy servers to wipe any video evidence.

            Nothing happened in the end. But in the meantime, I had to sync all their data off-site (thankfully they have dual 1Gbps FTTH), set up an emergency cluster, and ensure everything was redundant.

            Never rely only on internal monitoring. Never.

            #IT #SysAdmin #HorrorStories #ITHorrorStories #Monitoring

            stux⚡️
            wrote last edited by
            #31

            @stefano Great job!

            This is why I always run uptime monitoring on different servers in other places!

            Perfect!

              Stefano Marinelli
              wrote last edited by
              #32

              @_elena @EnigmaRotor thank you!

                Elena Rossini ⁂
                wrote last edited by
                #33

                @ozoned @stefano maybe! Especially if it’s motivation enough for you to keep practicing your Italian! 😂 and definitely at the very least a cameo with a line from Spaceballs

                  Stefano Marinelli
                  wrote last edited by
                  #34

                  @ozoned @_elena 😆 sure, just keep practicing your Italian

                    James Seward
                    wrote last edited by
                    #35

                    @randomized @rhoot @stefano how do you monitor your sleep 😛

                    • Elena Rossini ⁂_ Elena Rossini ⁂

                      @stefano you’re a hero Stefano! As your Fedi friend and documentary filmmaker I hope I get preferential treatment when one of your amazing stories gets optioned for a film 🤗

                      Stefano Marinelli
                      wrote last edited by
                      #36

                      @_elena Thank you! Sure, I will 👍
                      But, to be honest, I don't think any of those stories will ever be a film.

                      The big, most scary one is yet to come, anyway...

                        Stefano Marinelli
                        wrote last edited by
                        #37

                        @stux thank you! Yes, that's a very wise approach. I have some internal and external monitoring tools. And the monitoring tools monitoring the monitoring tools, with different technologies (so a bug won't hit all the tools at the same time). Yet, I always feel I need moooarrr monitoring 🙂
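
One common way to have "monitoring tools monitoring the monitoring tools" is a heartbeat watchdog: each monitor checks in periodically, and a separate process flags any monitor that has gone silent. The sketch below is an assumption-laden illustration, not the setup described in the post; the class name, the in-memory storage, and the monitor names are all hypothetical.

```python
import time


class HeartbeatWatchdog:
    """Track heartbeats from monitoring tools and report the ones that went silent."""

    def __init__(self, max_silence_seconds):
        self.max_silence = max_silence_seconds
        self.last_seen = {}

    def beat(self, name, now=None):
        """Record that the monitor called `name` is alive."""
        self.last_seen[name] = time.time() if now is None else now

    def silent(self, now=None):
        """Return the sorted names of monitors not heard from within the limit."""
        now = time.time() if now is None else now
        return sorted(
            name
            for name, seen in self.last_seen.items()
            if now - seen > self.max_silence
        )


# Example: two monitors check in, then one stops reporting.
watchdog = HeartbeatWatchdog(max_silence_seconds=60)
watchdog.beat("external-uptime-check", now=1000)
watchdog.beat("internal-nms", now=1050)
stale = watchdog.silent(now=1065)  # only the first monitor exceeds 60s of silence
```

For this to help, the watchdog would need to run on a different host, and ideally on a different software stack, than the monitors it watches, which matches the point about using different technologies so that one bug does not take out every tool at once.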

                          mkj
                          wrote last edited by
                          #38

                          @stefano But what monitors the monitor monitors? We need an audio technician in here, stat! 😉

                          @rhoot @jamesoff

                            ozoned
                            wrote last edited by
                            #39

                            @_elena@mastodon.social @stefano@mastodon.bsd.cafe How do I say "I knew it! I'm surrounded by assholes!" in Italian?

                              randomized
                              wrote last edited by
                              #40

                              @jamesoff
                              Sport watch 😁
                              @rhoot @stefano

                                James Seward
                                wrote last edited by
                                #41

                                @mkj @stefano @rhoot oh if audio's getting involved, you can use `ping -a` 😄

                                  Lasse Leegaard
                                  wrote last edited by
                                  #42

                                  @stefano 10+ years ago I started volunteering at a festival. Everything was new that year, including the small outdoor racks for the area field routers (Juniper MX80). They barely fit, but we managed. The racks were left in the sun in the summer. It was only when we enabled Observium (the LibreNMS predecessor), which graphs almost everything it gets from SNMP, that we discovered the inlet temperature was getting close to 80 degrees C. #monitorallthethings

                                    Lasse Leegaard
                                    wrote last edited by
                                    #43

                                    @stefano Since the racks were designed for outdoor use, they were watertight, had only small holes in the bottom for cables, and had very limited provision for air venting, such as downward-facing holes in the “roof”. They could supposedly float.

                                      Lasse Leegaard
                                      wrote last edited by
                                      #44

                                      @stefano We ended up cutting some wide cable pipes at an angle and duct-taping them to the router, so that one pipe covered the air inlet and another covered the air exhaust. The other ends of the ducts were led to the outside of the rack, lifted off the ground, and pointed downwards to keep water out. That provided fresh air and a way to get rid of the hot air. We also fashioned some shade with a sheet of plywood. The year after, we put some smaller equipment in 😎

                                        Rob\Viewdata UK
                                        wrote last edited by
                                        #45

                                        @darkling @stefano
                                        Ferranti Computer Systems, Cheadle (UK) circa 1982. I was a lowly apprentice, at the time working in the department that oversaw the various VAXen that most of the site used. Three full size machines and a handful of microVAX. Kept cool by *three* massive air conditioner units on the external wall. The server room was always chilly. /cont

                                          rasteri
                                          wrote last edited by
                                          #46

                                          @stefano I wonder how they generate a big enough power surge.
