
It shouldn't have been a problem. After all, what could possibly go wrong helping a vacationing neighbor whose plants need watering?
But something did.
I got a panicked call from my wife: "Rob, I fell. I'm next door in the backyard."
Apparently, Rachel's ankle gave out as she was walking down the back steps from our neighbor's house.
At least, that's her story. I'm wondering if all this Olympic watching had her trying some kind of Yurchenko Double Pike (or whatever) out the back door.
In any case, I ran next door, lifted Rachel, stuck her in the car, and drove to the emergency room. Three short hours later, she hobbled out on crutches with an aircast for her badly sprained ankle.
Fortunately, things are steadily improving. Over the past couple of days, her ankle has progressed from looking like it had a baseball in it, to a golf ball, to "fat foot," to what we now refer to as "chubby foot." (I know what you're thinking: the medical industry could learn a lot from this cybersecurity guy.)
Not surprisingly, we have had several discussions about the event.
Rachel's point of view: "I can't believe I did that."
Rob's point of view: "It could have been much worse."
The way I see it, it's a good thing…
… she had her phone
… I was home
… she didn't hit her head
… it was the bottom stair, not the top one
… it was sprained, not broken
… we live near an emergency room
… she married an athletic powerhouse with the strength to pick her up and carry her to the car
Rach sees something that could have been avoided. I think about all the things that prevented a bad situation from becoming a terrible one.
Either way, and to improve future outcomes, we took some steps and made some mental notes:
- We ordered more elastic bandage wraps to replace the one we used
- We noted the importance of always having a cell phone with us
- We now know which door is the correct door to the hospital emergency room (oops)
- Rach probably shouldn't wear flip flops as much. (Not discussed, but she will read this and see my suggestion. I love you, honey!)
As it turns out, these types of "after action reviews" are useful for non-ankle-related things, too.
Reduce Your Risk Before the Next Incident
I'm guessing you heard about last month's CrowdStrike incident, in which a content update took more than eight million Windows machines offline, disrupting airline, banking, healthcare, government, and industry operations worldwide.
Most of our clients did not experience any impact. But I participated in an after-action session with a client that did, and it was a great discussion.
What impressed me most was my client's commitment to transparency and accountability. There were definitely some mistakes made, but they did not try to hide or sugarcoat them. They acknowledged what they did right and were not afraid to point out and discuss what went wrong. (Not surprisingly, they were back up and running within half a day.)
The output of the session had a lot of technical details that you probably don't care about (more here if you do). But the structure of their approach is worth noting and borrowing from. They created a written report in the following format:
- Summary
- Impact: what systems and processes were affected?
- Timeline: what happened, exactly, and when?
- Lessons Learned
  - What went well?
  - What went wrong?
  - Where did we get lucky? (Or, as I prefer to say, "Where did our hard work pay off?")
- Action items: these ranged from highly technical configuration changes to nontechnical but critical tasks, such as ensuring that everyone's telephone number is correctly stored in the company database.
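If you want to keep these reports consistent from one incident to the next, the format above maps onto a small data structure. Here is a minimal Python sketch, not the client's actual template: the class name `AfterActionReport`, its field names, and the `missing_sections` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass, field, fields
from typing import List, Tuple


@dataclass
class AfterActionReport:
    """One after-action report, mirroring the sections listed above."""

    summary: str = ""
    impact: List[str] = field(default_factory=list)  # affected systems and processes
    timeline: List[Tuple[str, str]] = field(default_factory=list)  # (when, what happened)
    went_well: List[str] = field(default_factory=list)
    went_wrong: List[str] = field(default_factory=list)
    got_lucky: List[str] = field(default_factory=list)
    action_items: List[str] = field(default_factory=list)

    def missing_sections(self) -> List[str]:
        """Return the names of sections that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Hypothetical draft with only two sections filled in so far.
draft = AfterActionReport(
    summary="Content update took Windows machines offline",
    went_wrong=["Phone list in the company database was out of date"],
)
print("Still to write:", ", ".join(draft.missing_sections()))
```

A check like `missing_sections` is one simple way to make sure nobody skips the uncomfortable parts, such as what went wrong.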
Here are some additional, generic takeaways for everyone, regardless of the size or type of business you operate…
- Agents are scary. They are on all your machines and typically have administrative access. Perform due diligence on all vendors with this capability.
- Shut down your laptops every night. This simple step protects against malicious updates and other bad-guy things that may occur while you are gone.
- Establish backup communication channels. It's fine to use network-based means for staying in touch with your team (Slack, Microsoft Teams, etc.). But if your network and connected computers are all down, you are going to regret not having your team members' phone numbers or other channels of communication available.
- "All-at-once" is super-risky. One thing that compounded the seriousness of the CrowdStrike incident was that the update was pushed out worldwide, at the same time. Taking a "waved" approach, in which you can investigate early failures and correct as needed, is a much better idea. If possible, do the first waves in close geographic proximity (in case you need to roll a truck to fix) or with low-risk computers.
- Hire quality humans. When things go wrong, you need people who will do whatever it takes, for however long it takes, to find the problem and fix it. The best companies were back up and running within a day or two … not because of their technology, but (mostly) because a few heroic employees did what needed to be done. Others were not so fortunate (I'm looking at you, Delta Air Lines).
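For readers who push updates to their own machines, the "waved" approach from the list above could be sketched roughly as follows in Python. This is a hypothetical illustration, not how CrowdStrike or any particular tool works: `rollout_in_waves`, the `apply_update` callback, and the machine names are all invented for the example.

```python
from typing import Callable, Dict, List, Sequence


def rollout_in_waves(
    waves: Sequence[Sequence[str]],
    apply_update: Callable[[str], bool],
    max_failure_rate: float = 0.0,
) -> Dict[str, object]:
    """Update one wave at a time and stop as soon as a wave fails too often.

    waves: groups of machine names, ordered from lowest risk to highest.
    apply_update: hypothetical callback that updates one machine and returns
        True if the machine is healthy afterwards.
    max_failure_rate: highest tolerated fraction of failures within a wave.
    """
    updated: List[str] = []
    failed: List[str] = []
    for index, wave in enumerate(waves):
        wave_failed = [machine for machine in wave if not apply_update(machine)]
        updated.extend(m for m in wave if m not in wave_failed)
        failed.extend(wave_failed)
        if wave and len(wave_failed) / len(wave) > max_failure_rate:
            # Halt: every later wave is left untouched until someone investigates.
            skipped = [m for later in waves[index + 1:] for m in later]
            return {"halted_at_wave": index, "updated": updated,
                    "failed": failed, "skipped": skipped}
    return {"halted_at_wave": None, "updated": updated,
            "failed": failed, "skipped": []}


# Made-up fleet: nearby, low-risk machines first; critical servers last.
fleet = [
    ["test-laptop-1", "test-laptop-2"],
    ["office-pc-1", "office-pc-2"],
    ["prod-server-1"],
]
# Pretend the update breaks exactly one office PC.
result = rollout_in_waves(fleet, lambda machine: machine != "office-pc-2")
print(result)
```

The point of the design is the early return: a failure in an early, low-risk wave stops the rollout before it ever reaches the machines you cannot afford to lose.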
A Cautionary Tale
If the CrowdStrike incident had no direct impact on your business, I'm glad to hear it. But consider yourself fortunate, not immune. This kind of thing could have happened with many other vendors on many other platforms.
To keep risk to a minimum, take precautions as best you can and be willing to have open and honest after-action discussions with your team when things go wrong. Oh, and maybe don't wear flip flops as often.