Global Fire Alarm When Your Power Regulator Calls It Quits During The Holidays

What happens when a data center goes up in smoke—literally? In this episode, we sift through the potential ashes of a not-so-great day. Join us as we chat about servers, batteries, paperwork and holidays, and the surprisingly flammable side of computing. It’s a smoky mess, and we’re here for it.

Listen now on Apple Music, Spotify, Deezer, YouTube, or wherever you get your panic attacks.

Welcome to the official blog post for the legendary episode “Our First Datacentre Fire” from Jack Smith’s IT Horror Stories podcast! If you’ve ever worked in IT, spat out your coffee upon hearing “the backup failed,” or been paged on New Year’s Eve, settle in: this story is your spirit animal.

From retro servers and smoky basements to good old “MacGyver fixes,” come relive the drama, the disasters, and the questionable change management decisions that defined IT in the early 2000s. It’s a tale packed with nostalgia, server clusters, and more than a hint of chaos.

Back to the Early 2000s

“Do you have a DeLorean?”
Sadly, no one did. But if we could rev one up to 88 mph, it would drop us straight into the world of the early 2000s—a time before Facebook, Twitter, or even MySpace.

“Back then, the Internet was still a nice place to be. No bubble, no algorithms, just raw forums and Ask Jeeves at your service.”

“Ask Jeeves, you say?”
Why not? If you could have a butler who also searched the web, wouldn’t you?

The story takes us back to that special age of chunky Nokia phones (with batteries that lasted two weeks), dial-up modems, and server rooms humming with Windows 2000 clusters.

The Setting: A Global Logistics Company

Picture this: You work IT at a massive logistics company. Think trucks, planes, trains, and a network that can’t stop. The company runs 24/7, 364 days a year. And on the one day when virtually everything comes to a halt, New Year’s Eve, you’re on call.

This is the era when offsite backup meant swapping tapes, cloud storage was a pipe dream, and network failover meant running actual fiber between concrete buildings across the business park.

“If you’ve ever plugged your laptop into a Nokia and dialed in to troubleshoot, congratulations, you’re officially ‘old school IT.’”

The On-Call Drill

Let’s set the stage:

On-call system: Check
Laptops without WiFi: Check
A cell phone dial-in (a Nokia, obviously): Check
Company on full shutdown for just this one day: Check
Management says, “What could go wrong?”: Check

The perfect recipe.

Disaster Strikes: The Call No One Wants

It’s New Year’s Eve, and the party is in full swing. Then, the on-call phone rings.

“I am holding my glass of champagne, I am… holding my dessert, and I get a phone call from the contact person: ‘Hi, yeah, we have a fire.’”

Of course, it’s New Year’s. Of course it’s a practical joke.
Except… it isn’t.

The Spark

Back at HQ, the security team spots an entire building dropping offline: no sensors, no fire alarm, nothing. Everything is dead, and at first they joke about it: “The system wants a day off, too.”

Half an hour later, someone checks, wanders in, and is greeted by a room full of smoke. No beeping, no blaring sirens—just ominous, silent smoke.

“Early fireworks? No, but no fire alarm. Because the building was offline.”
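
The failure mode here is that silence looked like calm: a building that stops reporting cannot raise its own alarm. One common mitigation is a heartbeat (dead-man's-switch) check that runs somewhere else and treats a missing signal as an alert in itself. Below is a minimal Python sketch of that idea. The building names, the 60-second threshold, the timestamp, and the function name `silent_buildings` are all illustrative assumptions, not details from the episode.

```python
from datetime import datetime, timedelta

# Hypothetical threshold: how long a building may stay silent before we worry.
MAX_SILENCE = timedelta(seconds=60)


def silent_buildings(last_seen, now, max_silence=MAX_SILENCE):
    """Return the buildings whose last heartbeat is older than max_silence.

    last_seen maps a building name to the time of its most recent heartbeat.
    The check has to run outside the monitored building, on independent power.
    """
    return sorted(
        name for name, seen in last_seen.items() if now - seen > max_silence
    )


now = datetime(2001, 1, 1, 0, 30)  # made-up timestamp for the example
last_seen = {
    "building-a": now - timedelta(seconds=10),  # still reporting
    "building-b": now - timedelta(minutes=30),  # went quiet half an hour ago
}

for name in silent_buildings(last_seen, now):
    print(f"ALERT: no heartbeat from {name}; treat silence as an alarm")
```

The point of the design is that the watcher, not the watched building, decides when something is wrong.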

Tech Failsafe: How Clusters Saved the Day

Panic at the Cluster Console

As the fire department does its thing, it’s up to IT to check the true heart of operations: the server room.

The first step? Plug that Nokia into the laptop and remote in to the fabled Windows 2000 Advanced Server clusters—yes, with the legendary blue-and-white admin consoles.

Clustered file servers? Check.
Clustered print servers? Check.
Databases and applications, all split between multiple buildings? DOUBLE CHECK.

Despite panic on the scene, the clusters have failed over like champs. No “split-brain,” no servers thinking they’re the only one left. Just a clean, almost-pristine cluster failover.

“I see the server room in that building is offline. Luckily, all these servers are in a panic mode, but they failed over to the active building.”
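
Split-brain is what happens when two halves of a cluster lose sight of each other and both decide to run the workload. A common guard is a majority-vote quorum: a partition may only take ownership if it can see more than half of all votes, often with a tie-breaking witness at a third location. The Python sketch below illustrates that general idea only. It is not a description of how the Windows 2000 clusters in the story made their decision, and the node names and the witness are assumptions.

```python
def surviving_owner(partitions, total_votes):
    """Return the one partition allowed to run the workload, or None.

    partitions is a list of disjoint sets of voters that can still see each
    other. Only a strict majority may claim ownership, so two isolated
    halves can never both win.
    """
    winners = [p for p in partitions if 2 * len(p) > total_votes]
    return winners[0] if winners else None


# Hypothetical voters: one node per building plus a tie-breaking witness.
TOTAL_VOTES = 3

# Fire scenario: the node in the burning building simply disappears.
after_fire = [{"node-b", "witness"}]
print(surviving_owner(after_fire, TOTAL_VOTES))

# Network split with the witness unreachable: nobody has a majority,
# so neither side claims the workload and split-brain is avoided.
after_split = [{"node-a"}, {"node-b"}]
print(surviving_owner(after_split, TOTAL_VOTES))
```

In the second case the cluster prefers a short outage over two nodes writing to the same data at once.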

Let the Calls Begin

Jack, the IT sanity anchor, starts phoning his fellow team members. Ninety percent are deep into the New Year’s party, so the call chain is full of, “Happy New Year!”… “No, seriously, get sober, we have a fire.”

Preparation point? Make sure the crew isn’t fully in cocktail mode. “Good move,” as Bob notes.

Recovery Mode: MacGyvering Our Way to Uptime

Arriving at Ground Zero

Jack makes his way to the office.
His philosophy: “If I turn the corner and see flames shooting out of the building, I’m heading home.”

Instead, he’s met by a Darth Vader-style scene: the weekend shift supervisor, gas mask on, rising from the smoky basement.

“From the smoke, like in a Darth Vader style, the weekend shift supervisor from back then came up with a gas mask on.”

Why Did Everything Break?

The culprit: the building’s no-break system (basically a fancy UPS) failed spectacularly, cutting power not just to servers, but to emergency lighting, the alarm system, and crucially, the fan ventilating the no-break room itself.

No power to fire alarm: No heads-up to anyone
No fan: No fresh air, so the fire couldn’t rage

In a cosmic twist, this lack of oxygen was a lucky break—it suffocated the fire before it could hit the enormous paper archive next door.

“Here’s where we got lucky: the fire was starved of oxygen because the fan died, too.”
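
The chain of failures above is a single-point-of-failure problem: servers, alarm, lighting, and fan all hung off the same no-break. One way to spot that ahead of time is to write the power feeds down as data and list everything that goes dark when one source dies. Here is a rough Python sketch. The feed map is reconstructed loosely from the story, while the "alarm battery" and the function name `loads_lost` are hypothetical additions for illustration.

```python
# Which power sources feed each load. Loosely based on the story; a real
# map would come from the building's electrical documentation.
FEEDS = {
    "servers": {"no-break"},
    "emergency lighting": {"no-break"},
    "fire alarm": {"no-break"},
    "no-break room fan": {"no-break"},
}


def loads_lost(feeds, failed_source):
    """Return the loads left with no live feed once failed_source is gone."""
    return sorted(
        load for load, sources in feeds.items()
        if not (sources - {failed_source})
    )


print(loads_lost(FEEDS, "no-break"))

# Hypothetical fix: give the fire alarm its own battery-backed feed.
with_battery = {**FEEDS, "fire alarm": {"no-break", "alarm battery"}}
print(loads_lost(with_battery, "no-break"))
```

With the second map the alarm drops out of the list, which is exactly the heads-up nobody got that night.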

Cluster Heroics and the “Wooden Board Solution”

With the all-clear, the MacGyver-ing begins. An impromptu wooden board, crammed with power plugs, gets the critical systems live again. It’s January 1st, 5:30am… and the servers power up.

Clusters reactivated ✔️
Business impact minimized ✔️
Management still home… for now ✔️

“On January 1, 5:30am, I was bringing up the system… MacGyver style!”

Company Culture: Change Management?

If you expect slick change management, detailed documentation, and carefully rehearsed disaster recovery runbooks… well, welcome to IT in the 2000s.

“Somebody came up to me: ‘Hey, we should do change management.’ My response? ‘Hell no. All this paperwork—we know what we’re doing. We got this.’”

No runbooks, no lessons-learned sessions. Just handwritten notes, tribal knowledge, and (the big one) having literally everyone who matters physically onsite and sober “just in case.”

The Honor of Being There

The upside of New Year’s Eve? Every single chief was on call: electrics, plumbing, logistics, and IT. If you ever want to schedule a disaster, this was the golden window.

The “Lessons” Learned

How do you learn from a datacentre fire? It’s complicated:

The company didn’t lose data.
They didn’t lose infrastructure.
The only thing lost was… sleep and a bit more gray hair for IT.

Was anything changed to stop a future fire?
Separate backup power for alarms? Not really.
Dedicated circuits for safety? Probably not.

“I don’t even think it was lessons learned because, hey, the system broke, we handled it, there was no business loss, so everyone’s happy.”

The Aftermath and Reflections

Could It Have Been Worse?

Yes. Much. The paper archive could’ve gone up. If the fire had had more oxygen, or the failover hadn’t worked, it would have been catastrophic, not just for the company but for the surrounding neighborhood.

Everyone remembers the post-mortems, right? Well, not really:

No evidence the building burned down later (“Just turned into a parking lot”).
Most equipment eventually got scrapped as the company moved on.
Years later, they still struggled to implement process and change management.

“They advanced to the phase where they could say they had change management… but the process was just for KPIs and not for actually doing stuff.”

Tales from the Operations Trenches

Senior managers using their mail server to store a 60GB movie collection
Telnetting into mainframes trying to find Shift+F13
Servers living dangerously above unsuspecting accounting staff

“Did they ever figure out why the no-break failed?”
Nope. Official cause? “It just went poof.”

Classic Quotes

Let’s relive some of the iconic lines from this story:

“If I turn the corner and see flames coming out of the roof, I’m turning around.”

—Jack, keeping his New Year’s priorities straight

“We ordered an entire storage rack for the first floor. The floor couldn’t hold it, so we put some metal beams underneath to spread the weight. Problem solved.”

—Classic 2000s IT risk management

“There are no procedures for that.”

—The unofficial company motto

“I never got any reports. Hallway talk was: it just went poof.”

—When documentation fails, gossip wins

Conclusion

So what do you get when you mix early-2000s infrastructure, New Year’s Eve, and perfect cosmic timing? The near-disaster that was their first datacentre fire.

The takeaways:

Set up stretched clusters, and they might just save you.
Sometimes, having the dream team available is pure luck.
When the no-break goes, cross your fingers the fan dies, too.
Change management is still a myth at most companies.

Appendix: The 2000s Survival IT Checklist

Want to know if you’re prepared for a classic disaster?

[x] Nokia phone (charged for 2 weeks)
[x] Laptop with a dial-up modem
[x] Windows 2000 Advanced Server Cluster
[x] Knowledge of the building layout (and escape routes!)
[x] A wooden board for emergency “innovations”
[x] Coffee. Lots of coffee.

Final Thoughts

No goats were sacrificed, but plenty of servers were scared.
The more things change (cloud!), the more the old tales remain hilarious, stressful, and deeply relatable—for anyone who’s ever been the only sober sysadmin at the party.

Episode 1: Our First Datacentre Fire