Incident response tabletop exercises are a great way to safely practice your cybersecurity Incident Response plan before a real emergency strikes. Just prepare a few scenarios, and then have your team role-play how they would use the plan to guide their response. The simple act of talking through a few incident response tabletop exercise scenarios is often enough to highlight where your Incident Response plan might need a little polish.
As you prepare for your tabletop, you’re bound to find plenty of example scenarios, but most are just, well … boring! “There’s malware being reported on employee laptops, what do you do?” Sure, it’s important that your Incident Response plan is able to handle that, but it’s not very engaging for your incident response team.
Sometimes it makes sense to throw a curveball at your team – a scenario that’s interesting, unique, and specifically designed to target areas of weakness or ambiguity in your Incident Response plan. Think of this as a “stress test” for your plan. You want to intentionally push it to the point where something fails, and then figure out what you can do to make it stronger. It’s ok to take a few liberties when designing these scenarios and incorporate unlikely conditions or just intensely bad luck, because sometimes extreme things really do happen. (See related article, “Craziest Business Continuity tabletop scenarios: worldwide flu-like pandemic!”, circa 2018)
Let’s take a look at some examples!
1. Did we really lose data?
This incident response tabletop exercise scenario is the cyber equivalent to the old “if a tree falls in the woods…” question.
The starting script (read this to everyone):
A Cloud Ops engineer casually informs you about an issue the team just fixed: unencrypted backup files for your production database have been accidentally replicating for months to a public cloud storage bucket that was open to the world, and anyone could have accessed your data.
He tells you the bucket was set up intentionally as a temporary way to test how backups might be replicated out of your new production architecture before it went live. Since only dummy data was in use at the time, the team figured a wide-open bucket was an acceptable shortcut to make testing easier. Unfortunately, the replication job was forgotten, no one deactivated it before the go-live cutover, and it started copying real client data. Since then, backups containing sensitive client records have been sitting exposed for several months.
The cloud team stopped the replication this morning and locked down the permissions on the bucket. You wonder how something this sloppy could ever have happened, and sensing your disapproval, the Cloud Ops engineer quickly says “we fixed the problem already, what else is there to do?” and rushes off to fight the next fire.
From here, your incident response team will likely ask questions about audit or access logs for the bucket, but that’s when you tell them that because this storage location was only supposed to be temporary, the Cloud Ops team never set up any event monitoring. If they keep trying to figure out whether anyone accessed the data, just keep sending them to dead ends.
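If you want a concrete prop for those dead-end conversations, here is a minimal sketch – assuming the exposed bucket lives in AWS S3 and your team uses the boto3 SDK, with a hypothetical bucket name – of the kind of check responders might run to confirm that no access logs exist and that the lockdown actually happened:

```python
# Minimal sketch (assumes AWS S3 and boto3; the bucket name is hypothetical).
# It checks two things the tabletop team will ask about: was access logging
# ever enabled, and is the bucket currently blocked from public access?
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")
bucket = "prod-db-backup-replica"  # hypothetical bucket from the scenario

# Was server access logging ever configured? If not, there is no record of reads.
logging_cfg = s3.get_bucket_logging(Bucket=bucket)
if "LoggingEnabled" in logging_cfg:
    print("Access logging was enabled; request logs may show who read the backups.")
else:
    print("No access logging configured -- no way to prove the data wasn't read.")

# Is the bucket now locked down? (This is what the Cloud Ops team says they fixed.)
try:
    pab = s3.get_public_access_block(Bucket=bucket)["PublicAccessBlockConfiguration"]
    print("Public access block:", pab)
except ClientError as err:
    if err.response["Error"]["Code"] == "NoSuchPublicAccessBlockConfiguration":
        print("No public access block set -- the bucket may still be exposed.")
    else:
        raise
```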
The uncertainty here will be frustrating, but hey, so are real incidents! Keep the team guessing about whether any data was actually lost. It certainly “feels” like a security incident, but is it? How would your Incident Response plan classify it? Many plans aren’t structured to handle this kind of ambiguity around data loss.
Also to consider – what does your plan say about notifying customers when their data may have been compromised? What would that notice look like? Who would write, approve, and send it?
If your company manages regulated data, like PHI under HIPAA or personal data under GDPR, then definitely also include that in the scenario. Does your plan make it clear whether you need to report potential data breaches to the appropriate regulators? Do you have instructions on how, where, and when to report?
As a follow-up, ask the team whether the length of time the data was exposed matters. The “several months” of exposure in the original incident response tabletop exercise scenario sounds like a long time, but what if it was only a day? An hour? 5 nanoseconds? Does your plan make any distinction or help the team make a risk-based decision?
Give your team bonus points if they discuss how to prevent future occurrences through better security architecture reviews, change management, or regular audits.
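If that discussion happens, you can point out that a “regular audit” doesn’t have to be heavyweight. As a minimal sketch – again assuming AWS S3 and boto3, and by no means a complete audit – a scheduled script could flag any bucket that isn’t fully blocked from public access:

```python
# Minimal audit sketch (assumes AWS S3 and boto3): flag any bucket whose
# public access block is missing or not fully enabled.
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")

for bucket in s3.list_buckets()["Buckets"]:
    name = bucket["Name"]
    try:
        cfg = s3.get_public_access_block(Bucket=name)["PublicAccessBlockConfiguration"]
        if not all(cfg.values()):
            print(f"REVIEW: {name} has a partial public access block: {cfg}")
    except ClientError as err:
        if err.response["Error"]["Code"] == "NoSuchPublicAccessBlockConfiguration":
            print(f"REVIEW: {name} has no public access block at all")
        else:
            raise
```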
2. Know when to ask for help.
Does your Incident Response plan have details on when to call outside help?
The starting script (read this to everyone):
A shadowy and radical activist group, LLAMA, has been publicly ratcheting up the pressure on one of your customers, AcmeCo, for the past several months. Not much is known about LLAMA, but they are seemingly well funded, profess an anti-globalist agenda that opposes your client’s rapid expansion into international markets, and are trying to disrupt AcmeCo’s operations.
You hear in the news that LLAMA has started intimidating AcmeCo’s vendors and customers rather than directly attacking AcmeCo. Soon after, several of your employees start reporting issues with their devices. Users report various annoying problems – random files being encrypted for an hour, keyboards being unresponsive for 15 minutes – and then pop-ups appear from LLAMA saying this is just a warning. The messages threaten to delete all data unless you publicly announce that you are terminating AcmeCo’s contract within 7 days. There is no request for money or other ransom.
AcmeCo is a small contract now, but with the global expansion it will soon be one of your biggest customers!
Your team may start debating the merits of paying the unusual “ransom” – terminating AcmeCo’s contract – but really that’s just a distraction.
What they should really be debating is calling the FBI (or other law enforcement) and their cybersecurity insurance provider. This is a major, targeted criminal threat with a clear potential for loss, and both can help in this sort of situation. If your team is going down a rabbit hole trying to investigate the malware, you may need to creatively steer them towards this more interesting debate.
Making a call for help is not a decision most organizations will take lightly, as there are reputational and financial risks to consider. Is there any written policy guidance on when it is appropriate? Does your Incident Response plan make it clear who has the authority to make that decision? Does the Incident Response team know how and when to engage your Legal team?
Let’s say your team gets to the point where they want to make that call for help – but does your Incident Response plan even say how to contact the FBI or your insurance carrier’s incident hotline? What information is needed to open a case? Who do you want as the point person? Probably not the junior tech who started last week!
In any situation where you’re engaging law enforcement for a potential criminal investigation, your Incident Response plan should be very clear about evidence collection and retention and, when applicable, chain of custody. For example, does your Incident Response plan make any mention of how to collect logs, where to store them, and how to ensure they’re not altered?
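Even a lightweight habit here beats nothing. The sketch below – assuming Python is available on the collection host, with hypothetical file paths, and standing in for (not replacing) a real chain-of-custody procedure – shows the basic idea of hashing everything you collect so the logs are defensible later:

```python
# Minimal sketch of the "don't alter the evidence" step: copy collected log
# files to a dedicated evidence location and record a SHA-256 manifest for
# each file. Paths are hypothetical; real chain-of-custody procedures cover
# far more (who collected what, when, and where it has been since).
import hashlib
import json
import pathlib
import shutil
from datetime import datetime, timezone

EVIDENCE_DIR = pathlib.Path("/secure/evidence/case-001")   # hypothetical
COLLECTED_LOGS = [pathlib.Path("/var/log/auth.log")]       # hypothetical

EVIDENCE_DIR.mkdir(parents=True, exist_ok=True)
manifest = []

for log in COLLECTED_LOGS:
    dest = EVIDENCE_DIR / log.name
    shutil.copy2(log, dest)  # preserve file timestamps
    digest = hashlib.sha256(dest.read_bytes()).hexdigest()
    manifest.append({
        "file": str(dest),
        "sha256": digest,
        "collected_at": datetime.now(timezone.utc).isoformat(),
    })

(EVIDENCE_DIR / "manifest.json").write_text(json.dumps(manifest, indent=2))
```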
Once the team makes those external contacts, you can let them off the hook and say that the FBI confirmed the malware was mostly just an annoyance, and the insurance company helped arrange the cleanup.
Or not! If you want to make them squirm a bit more, you can say the FBI reports that they believe the threat to be very real, but they also know that LLAMA has been honoring their promise to back off anyone who terminates AcmeCo’s contract. How does your team proceed? If they don’t capitulate, you can then play out how you might engage with an emergency digital forensics service. Time’s ticking on the 7-day deadline!
3. Gary
We’ve all known someone like Gary.
The starting script (read this to everyone):
Gary, a senior administrator, is one of your smartest employees and has been around longer than almost anyone else in the company. There are critical legacy systems that “only Gary knows” how to fix when they break, and he spends countless hours late at night and on weekends keeping everything running.
But, he’s also a grouch with an abrasive personality and a stubborn unwillingness to compromise on even minor matters. For a long time his behavior has been slowly getting more combative and his outbursts more frequent. He recently publicly berated a junior admin for a minor mistake, causing the employee to quit on the spot, and you fear the employee will file a hostile workplace complaint with the EEOC. Everyone knows Gary is becoming a major problem. Management decides that this is the final straw and it’s time to let Gary go.
So, did your Incident Response team immediately start rattling off what they’d do? Or do they have a puzzled look on their faces?
Well, this is a trick situation … as described, this isn’t a security incident (not yet, at least). On the surface, it seems like just a lead-in to a Human Resources employee termination workflow. Kudos to your team if anyone notices that! On the other hand, if they start rattling off all the system passwords they’ll change and how they’ll disable Gary’s email when he leaves, just ask them to explain exactly who convened the incident response team, and why. They haven’t even been notified about this yet!
If there are senior leaders in the exercise, you can let them discuss how they’d kick off Gary’s removal, or you can just skip ahead and say that Human Resources is informed by management that they need to initiate a sensitive involuntary termination.
Most companies have a termination or offboarding checklist that Human Resources and IT will follow to remove an employee’s access to company systems, data, and services. But Gary is no normal employee – he has access to far more than what any reasonable checklist would cover. For the purposes of this exercise, mention that Human Resources recognizes his extra access as a potential problem and wants to report it to you; they know that if it’s overlooked (for example, if Gary’s admin access were left in place on the legacy systems), it could lead to an actual incident.
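To make the “far more than the checklist covers” point concrete, here is a minimal per-host sketch – assuming Linux systems and a hypothetical account name – of the kind of access inventory a real offboarding of someone like Gary would need, run on each legacy system and compared against what the checklist actually removes:

```python
# Minimal per-host access inventory sketch (assumes Linux; username is
# hypothetical): does the account exist, which groups grant it privileges,
# does it hold SSH keys, and does it have a personal crontab?
import grp
import pwd
import subprocess
from pathlib import Path

USERNAME = "gary"  # hypothetical account name from the scenario

findings = []
try:
    entry = pwd.getpwnam(USERNAME)
    findings.append(f"local account exists (uid={entry.pw_uid}, shell={entry.pw_shell})")
    # Supplementary group memberships often carry the real privileges
    # (sudo, wheel, application-specific admin groups).
    groups = [g.gr_name for g in grp.getgrall() if USERNAME in g.gr_mem]
    if groups:
        findings.append("member of groups: " + ", ".join(groups))
    # SSH keys and personal cron jobs are easy to miss on legacy systems.
    keys = Path(entry.pw_dir) / ".ssh" / "authorized_keys"
    if keys.exists():
        findings.append(f"has SSH authorized_keys at {keys}")
    cron = subprocess.run(["crontab", "-l", "-u", USERNAME],
                          capture_output=True, text=True)
    if cron.returncode == 0 and cron.stdout.strip():
        findings.append("has a personal crontab")
except KeyError:
    findings.append("no local account on this host")

print("\n".join(findings))
```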
But where does Human Resources report this? Does your company have a way to report special situations that aren’t yet an incident? Does the incident response team have a way to triage and react to these potential threats? Also remember – this is all before Gary’s dismissal and has to be done discreetly and confidentially.
Let your team take off their incident response team hats for a moment and have a sidebar discussion about the intake process for things that may or may not yet be an incident. Is it widely known throughout the company how to report those items? In general, whose job is it to process them and to activate the incident response plan when needed?
If your Incident Response plan is entirely reactive – an incident happened, now fix it – then it may now dawn on your team that that’s not good enough. It also needs to be proactive – an incident is likely to happen, so prevent it! Most plans don’t account for this, but they should!
Have everyone put their team hats back on, and pick up the story here:
The situation with Gary is starting to spiral out of control. In anticipation of Gary’s dismissal later that day, IT was discreetly preparing to remove his access but performed a step too early, and Gary quickly figured out that the company was planning to fire him. In a rage, he tells his manager that he’s quitting, and that he won’t share any of the admin passwords that only he knows. The incident response team learns about this through the grapevine and recognizes this as a major security issue.
Let the team chew on this for a while and see what direction their thought process goes. This may spawn an interesting discussion about recovering admin access to those systems, or ways to prevent this problem with better documentation, cross-training, and job rotation. As long as the conversation has value, let them continue with it, and maybe that will feel like enough for you to wrap up this exercise.
If the conversation peters out, or if you really just want to push them harder, pick the story back up and mention that IT is now discovering timebomb scripts that delete data, as well as shadow accounts on some systems – it looks like Gary anticipated being fired and has sabotaged your systems!
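If your team wants something concrete to hunt with at this stage, a minimal triage sketch – assuming Linux hosts, and very much a tabletop prop rather than a real forensic tool – might flag UID 0 “shadow” accounts and recently modified scheduled jobs:

```python
# Minimal sabotage-triage sketch (assumes Linux hosts): flag any non-root
# account with UID 0 (a classic shadow account) and list system cron entries
# modified in the last 30 days for responders to review for timebomb scripts.
import pwd
import time
from pathlib import Path

# Shadow accounts: anything with UID 0 that isn't the real root account.
for entry in pwd.getpwall():
    if entry.pw_uid == 0 and entry.pw_name != "root":
        print(f"SUSPICIOUS: UID 0 account '{entry.pw_name}' ({entry.pw_shell})")

# Recently modified scheduled jobs: a common place to hide a timebomb.
cutoff = time.time() - 30 * 24 * 3600
for cron_dir in ("/etc/cron.d", "/etc/cron.daily", "/var/spool/cron/crontabs"):
    path = Path(cron_dir)
    if not path.is_dir():
        continue
    for job in path.iterdir():
        if job.is_file() and job.stat().st_mtime > cutoff:
            print(f"REVIEW: {job} modified in the last 30 days")
```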
That will definitely be a tough incident response tabletop exercise scenario for them to sort out! Have fun with it and see where it goes!
Conclusion
Whew! Those were some tough incident response tabletop exercise scenarios for your team to navigate! Figuring out how to use your Incident Response plan in unanticipated ways like this is hard work, so congratulate your team for getting through it. But don’t forget the most important and final step – use what you learned today to update and improve your plan, because the next time you need it may be for a real emergency!