Ryan McGeehan

Minimum viable probabilistic cyber risk quantification

This essay describes what the minimum viable probabilistic approach to cyber risk management looks like. That means I'd do this even if nobody asked for a compliance report. It's free, requires no platforms, and lets you start with no historical loss datasets. Any incident notes or internal metrics you already have will improve it, but they aren't a prerequisite. All it requires is effort and familiarity with probabilistic quant, which, I'll caveat, is not a widely held skill. This is admittedly a niche subject in cybersecurity and requires comfort with the sharp corners of quant risk methods, but I find it useful.

Here's what your outputs will be: an annual loss distribution, an attribution of tail losses to high-impact scenario classes, and a set of forecasts you can score and update over time.

You can do the initial setup in ~1-2 hours, then refresh quarterly (or after major incidents/architecture changes). With familiarity it becomes very quick. Some tooling suggestions are linked inline, too.

Annual Loss Distribution

I start with a very loosely built expected annual loss distribution based on a company's definition of an incident. I wouldn't need a historical loss dataset. This will feel unreasonably fast. I would rely solely on elicitation from a panel (myself, maybe one or two others) to develop the distribution. This prevents the exercise from ballooning into a data project. We'd order a pizza and be done in an hour with reasonable buy-in.

The forecast would be broadly cost-inclusive, covering usually ignored factors like customer churn, disruption of engineering and product delivery, new friction in sales pipelines, regulatory hangups, and additional spending to restore our expectations of good security. The panel itself would probably suggest a bunch of other company-specific FUBAR scenarios, too.

Sidebar: If we haven't already decided on some definition of an incident, we'd do that now. Define ‘incident’ as meeting any of: (1) pages on-call, (2) triggers IR plan, (3) declared SEV/P0–P1, or (4) ≥X eng-hours in response.

We elicit a few percentiles (e.g., P50 and P90) and fit a parametric distribution (often lognormal), producing a full severity (cost) curve we can sample from. I usually do this with a quick Python notebook (e.g., using my elicited library) to fit the distribution and sample from it.

“For a single incident that meets our definition, what are P50 and P90 total costs?”

“Over the next 12 months, how many incidents will meet our incident definition? P50 and P90?”
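As a sketch of the fitting step, using plain math rather than any particular library (the dollar figures here are invented for illustration): a lognormal can be pinned down from two elicited percentiles, because ln(X) is normal.

```python
import math

# Hypothetical elicited per-incident costs: P50 = $50k, P90 = $400k.
p50, p90 = 50_000, 400_000

# For a lognormal, ln(X) ~ Normal(mu, sigma):
#   the P50 gives the median, so mu = ln(p50);
#   the P90 pins sigma via the standard normal 90th percentile z90.
z90 = 1.2815515655446004
mu = math.log(p50)
sigma = (math.log(p90) - math.log(p50)) / z90

# Sanity check: the fitted distribution reproduces the elicited percentiles.
assert math.isclose(math.exp(mu), p50)
assert math.isclose(math.exp(mu + z90 * sigma), p90)
print(f"mu={mu:.3f}, sigma={sigma:.3f}")
```

The same two-percentile trick works for the frequency question, or you can use a count distribution like Poisson with the elicited P50 as its center.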

Combine the frequency distribution with the severity distribution (simple Monte Carlo is enough) and you get an annual loss distribution. Do it in a notebook, spreadsheet, CLI... doesn't matter.
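A minimal Monte Carlo combine might look like this, with frequency modeled as Poisson and severity as the lognormal fit above; every number here is an illustrative assumption, not a recommendation:

```python
import numpy as np

rng = np.random.default_rng(7)

# Illustrative assumptions: ~4 incidents/year on average (Poisson),
# per-incident cost lognormal with mu/sigma fit from elicited P50/P90.
lam = 4.0
mu, sigma = 10.820, 1.623

n_sims = 100_000
counts = rng.poisson(lam, n_sims)  # incidents in each simulated year
annual_loss = np.array(
    [rng.lognormal(mu, sigma, k).sum() for k in counts]
)

# The annual loss distribution, summarized by percentiles.
p50, p90, p99 = np.percentile(annual_loss, [50, 90, 99])
print(f"P50 ~ ${p50:,.0f}  P90 ~ ${p90:,.0f}  P99 ~ ${p99:,.0f}")
```

Years with zero incidents naturally contribute a loss of zero, which is why the P50 of the annual distribution can sit well below the per-incident P50.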

The panel would forecast anonymously, debate, then forecast openly to reduce bias. The anonymous first round reduces anchoring and dominance; we then discuss rationales and re-elicit percentiles. We also force tail-thinking by asking ‘what would have to be true for P90?’ in the discussion.

The minimum-viable draftiness of this is fine. The distribution is going to be wrong. For that matter, all risk assessments are wrong unless they come with the specific date, time, and description of the upcoming breach you're about to suffer. Treat risk assessments as spreadsheets of forecasts; they're going to be wrong in the way all forecasts are wrong. The point isn't truth; it's calibration and prioritization: a number you can update, argue with, and use to choose between mitigations.

Here's some of my previous work on the subject, which is a bit more involved than what I'm describing here.

High Impact Scenario Attribution

This is my favorite part. Force the panel to think big.

The panel's next job is to brainstorm scenario classes with big damage potential; each must exceed the 90th percentile of the severity distribution we just built.

“If we have an incident in the next 12 months and it’s >P90 cost severity, which scenario class will it be?”

Build a taxonomy out of these heavy hitters. Make them a Mutually Exclusive and Collectively Exhaustive (MECE) list of options. Then the panel would forecast / talk / forecast which scenario is most likely to happen next. Include an "other" category, but if it's large (>20%), consider investigating why and re-eliciting. Define what sort of incident would end up classified as "other".
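A sketch of what the elicited scenario-class forecast might look like as data, with a MECE check; the classes and probabilities are invented for illustration:

```python
# Hypothetical panel-averaged answer to "if we have a >P90 incident,
# which scenario class will it be?" Classes and numbers are illustrative.
scenario_probs = {
    "ransomware / destructive malware": 0.30,
    "cloud account takeover": 0.25,
    "third-party / supply-chain compromise": 0.20,
    "insider data theft": 0.15,
    "other": 0.10,
}

# MECE check: a mutually exclusive, collectively exhaustive taxonomy
# must have class probabilities that sum to 1.
assert abs(sum(scenario_probs.values()) - 1.0) < 1e-9

# Flag a bloated "other" bucket: a sign the taxonomy needs re-elicitation.
if scenario_probs["other"] > 0.20:
    print("'other' is >20%; investigate and re-elicit the taxonomy")
```

Keeping the forecast in this shape makes the later scoring step trivial.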

Every mitigation initiative must map to one or more scenario classes.

Given time and the size of the company, I would repeat the scenario-attribution step with a narrow focus on an area of particular interest, like a massive risk area (crypto cold storage, IP loss, etc.).

Some previous work on that in particular:

Keep Score

Scoring isn’t to shame people; it’s to learn where our intuitions are well-calibrated and where the model needs revision.

Now we have a breakdown of where we believe tail losses come from. We can start confronting assumptions with how a security program actually applies resources day to day. During this period, you will likely see actual incidents or industry rumblings that update your knowledge. If you want, you can score your forecasts.
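One way to score the scenario forecast is a multiclass Brier score; the forecast numbers and the observed class below are hypothetical:

```python
def brier_score(probs, outcome):
    """Multiclass Brier score: squared error between the forecast
    probabilities and the one-hot observed outcome. Lower is better;
    0 is a perfect forecast, 2 is maximally wrong."""
    return sum((p - (1.0 if c == outcome else 0.0)) ** 2
               for c, p in probs.items())

# Hypothetical: the panel put 35% on "cloud account takeover",
# and that's the class the >P90 incident actually fell into.
forecast = {"ransomware": 0.40, "cloud account takeover": 0.35, "other": 0.25}
print(brier_score(forecast, "cloud account takeover"))  # 0.645
```

Scores only become meaningful in aggregate, so track them across quarters and panelists rather than judging any single forecast.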

This is why I like to start risk measurement brief and drafty, but structured so it can be updated and improved with whatever newly acquired rigor you can afford to throw at it. We map every major security initiative to one or more scenario classes, then observe whether the panel updates probability/impact for those classes over time.