Self-Pruning Spanning Tree

Frank Lynam

This is a fairly introductory article about graph theory and how I used it to find all possible attack paths in an attack graph. I won't be talking about how we constructed the attack graphs, but you can read more about that in the papers we wrote here. I will be talking about the algorithm I developed to solve this and how it works. I have no idea if it's novel, but it was faster and gave more complete answers than anything else I could readily find.

Disclaimer: I'm a nuclear engineer, not a computer scientist. This is about gaining an intuitive understanding of concepts, not about formal proofs. Sorry! Also, this is a blog, not a formal technical paper.

The Problem

So, we had developed a conceptual model of adversarial movement through networks and systems. Basically, a state machine describing steps like "find an interface that talks to you" and "get it to let you run code" and "use that to break the computer." Like little Bobby tables. We applied that to a painstakingly-developed network architecture and glued that onto a good understanding of which computers were connected to critical system functions, all for some key chunks of fairly complicated nuclear weapons control system.

Now we had a big, messy graph that, in theory, contained all the possible routes a dedicated adversary could possibly take to disrupt your mission, whether those exploits existed in the wild yet or not. When you're talking about defending nuclear weapons systems, "maybe they'll develop novel exploits in our top secret software running on hardware that no one else has" is definitely "in scope."

Which leads us to the problem we had to solve, given the data we'd collected: when your directed, cyclic graph has a thousand nodes and a few tens of thousands of connections across different technology stacks and communication paths, how do you measure risk? Particularly, how do you measure it in a way that lets you figure out how to make that risk go down?

The short answer there was that we were able to look at historical vulnerability discovery data, leveraging MITRE's CVE repository, to make reasonable estimates for how long a peer nation state adversary would take to develop novel vulnerabilities. That let us estimate how long it would take to develop any given path using exclusively zero-day exploits: our baseline threat model given our system criticality.

Honestly, this should all feel, like, super "duh." And it is! Nonetheless, it feels much more proactive than "install the latest patches."

I'm not going to go into all that boring "how would China hack our nukes" noise here. No one cares about that, and I already linked to a bunch of papers I wrote about that anyways. No, I'm going to talk about the really cool, interesting, exciting part: graph theory!

Graph Theory

Graph theory is about relationships between things. Typically, we abstract "things" as nodes (circles) and "relationships" as edges (lines), like so:

Simple concept!

We call this a directed graph, because of the arrows. They denote directionality; Thing 2 chases Thing 1, but not the other way around. Similarly, "someone on the internet" "injects code" into your edge system or whatever, moving them to the state of "executing arbitrary code."

You get the idea, no need to belabor the point: graph theory is handy for capturing attack paths. Obvs.

Now, imagine this random set of nodes is your graph (refresh for a new one!):

Nodes:

🎲

Edges per node:

So, this is just a silly random graph. A real network / mission model has a lot of clear structure; networks, cross-domain solutions, cryptographic links, and of course the mission model, which is typically very hierarchical. Also, this random graph could very well have disjoint subgraphs, which would be weird (although very secure!) in a real system. Not to mention this can generate nodes that connect back to themselves, which is not terribly meaningful for this use case...

I've previously written clever little features that created more prototypic graphs. These were handy when I was doing optimization of the actual risk analysis. It's not super relevant here, so I'm going to use the incredible power of not caring to leave it the way it is.

You will also notice, if you're into that kind of thing, that this graph may contain cycles. Cycles in a graph are when you can follow the arrows and get back where you started. Since these graphs can contain cycles, we call them cyclic, vice acyclic. If your graph has cycles, there's a ton of math that just doesn't work for it. Unfortunately for math, reality often contains cycles. Sooooo... Yay for random directed cyclic graphs!

Sources and Sinks

Attack paths go from anywhere a threat actor could access the graph to any critical mission function. The terms I chose for these concepts were "sources" and "sinks" when developing the algorithm. While a great algorithm would allow an arbitrary number of sources and an arbitrary number of sinks, just like a real system, you can also cheat by simply connecting all the sources to one common upstream point, and similarly downstream of the sinks. In practice, I noticed a lot of algorithms that already existed needed either one source / sink or one of both, so this ended up being not that unusual to do.

So, pick some sources. These are where the bad guys are, so we'll make them red:

And some sinks. These are our important things, so we'll make them green:

Spanning Trees

I don't know why I call them spanning trees, they aren't trees, which are undirected acyclic graphs. These are just directed acyclic graphs. But "self-pruning spanning directed acyclic graphs" produces a less compelling visual, I guess.

Regardless, the "spanning" part is the idea of starting at a node in a graph and spanning out from it to see where you can go. For us, that means starting at all our sources.

The implementation approach I like for the spanning algorithm is to create a stack, add all the sources to it, then iterate through the stack adding everything each node in the stack can reach to the stack as you go, assuming it isn't already in the stack. Then you've got all the nodes in the spanning tree! Keep track of any novel edges along the way, and that's the part that makes it a graph. Easy peasy.

The extra trick for us to use later on here is to store the edges backwards. This produces a map of all the roads leading to Rome, as it were. I'm going to leave that out of the visuals here, because, again, blog.

Let's watch!

"Frank, sometimes it doesn't stop when it finds the sink! This algorithm is dumb and bad."

You're right! Remember what I said about threat models and risk? We don't want to only think about the shortest path between two points; we want to assess all routes. Often, the best way to compromise a system is a weird side path you don't think they'll be paying attention to, even if it's 10x harder. We want to see those, too, and account for them. We want all paths, and so we keep going until we run out of roads.

So, this is the useful part of the algorithm: we span the opposite way through that tree, from the sinks to the sources. Honestly, I don't really care if it's novel, but it's proven useful, fast and simple enough for me to toss together to solve a few different problems. And easy to implement in a variety of languages: I've used it in C, python and JavaScript.

Which leaves us with just the nice subgraph that we care about:

There's lot of things about this approach that rankled the traditional graph theorists and attack path types I've worked with.

I've already mentioned the cycles, which I've gotten a lot of inexplicable push back for. Also some more reasonable intuitive aspects, like sometimes we'll see paths that go through a target and into another. Why? Because it's a valid path. We could pretend, like, the adversary doesn't have perfect information, even about what systems are critical, or something, but at the end of the day, the goal is "all valid paths" and that is one, even if it's sub-optimal.

That's It

Just a convenient little algorithm, nothing exciting. It is easy to implement, and easy to understand, gives all the paths, and seemed to be faster than any of the pre-packaged alternatives I found at the time.

The approach can be thought of most simply as: spread out down every route you can from every starting point you're allowed to start from. Now, using all those paths you just found, start from every end point you wanted to reach and only walk backwards down those same paths. Now you're guaranteed to have a map that contains every route from all your starting points to all your end points, with no dead ends.

For the attack path math, we could apply our numerical model to this subgraph and get a correct answer without having to compute any irrelevant paths, which was useful to reduce computation time. I've also used a variant of it for cyclic checks when managing nested groups, and a few other random places where it was nice to be able to extract a meaningful subgroup.