Securing Software Supply Chains with Zero Knowledge Proofs

Software supply chain attacks are on the rise, increasing a staggering 742% per year on average since 2019. Sometimes called “third party attacks,” these cyberattacks infiltrate third party or open source software libraries with malicious code, infecting vendors and components along the software supply chain. 

These days, any given software artifact may depend on hundreds or even thousands of third party or open source libraries (often comprising up to 85% of their total code base), each of which is trusted to properly implement its expected functionality. When a supply chain attack compromises and inserts malicious functionality into any of those libraries, the final software artifact is compromised as well. To make matters worse, the same library may be used by multiple software artifacts, so a successful supply chain attack against one library could adversely impact all the software artifacts that depend on it.

Back in 2020, a particularly infamous attack struck American software company SolarWinds. Hackers compromised the machines that signed and distributed updates for SolarWinds’ network and traffic monitoring product, known as Orion, modifying the software to grant them illicit access to the machines running the modified Orion code. With malicious code lurking unseen in the shadows, tens of thousands of customers—including Fortune 500 companies and government agencies—were impacted after downloading what should have been a routine software update.

In another example, in early 2022 the developer of `node-ipc`, a popular Javascript library, modified the code to delete random files for users located in Russia—evidently an act of protest against the invasion of Ukraine. 

These types of software supply chain attacks now number in the tens of thousands every year, with the number of attacks growing at an exponential rate. To keep pace with the mounting threats, it is vital that techniques are developed to mitigate them. One of the most promising approaches to preventing supply chain attacks—and one in which Galois has particular expertise and experience—involves combining formal methods and zero-knowledge proofs. 

An Introduction to Zero-Knowledge Proofs

A zero-knowledge proof (ZKP) is a cryptographic technique that allows an individual (or computer system), called the “prover,” to prove to another individual (or computer system), called the “verifier,” that the prover knows a secret without revealing what the secret is. ZKPs can be enormously helpful in scenarios where both security and privacy are of high importance. To illustrate, consider the metaphor of Ali Baba’s cave.

In this scenario, a cave has two entrances and a padlocked door that is not visible from the entrance. The prover wants to convince the verifier that they know the password to open the door, without revealing the password. The prover can convince the verifier by running the following protocol:

1. When the verifier isn’t looking, the prover randomly enters one of the entrances to the cave.
2. The verifier walks to the cave entrance (where they cannot see the prover) and they tell the prover which exit to take.
3. The prover takes the requested exit, and uses the password to open the door if necessary.

The prover and verifier repeat this protocol `N` times, and the verifier is eventually convinced that the prover knows the password since the probability that the prover doesn't know the password is `1/2N`.

In practice, ZKPs generalize beyond this simple illustration. In the world of computer science and cybersecurity, they allow a prover to convince the verifier that they know a secret that satisfies an arbitrary predicate. As a result, ZKPs have significant potential to have a positive impact in areas like improving vulnerability disclosure processes, compressing ledger sizes for cryptocurrencies, and securing global software supply chains.

An Introduction to Formal Proofs about Software

The field of formal methods offers a promising approach to assuring trustworthiness of software. Formal methods are a suite of design techniques that enable engineers to mathematically prove correctness and security properties about code. The holy grail in this field, which is still very much a work in progress, is to eliminate software bugs and vulnerabilities completely. Even though achieving this goal is in the distant future, other goals with formal methods are practical today. For example, at Galois we developed tools like SAW and Cryptol and used them to prove the security of s2n, Amazon’s open source TLS cryptographic library. Today, this kind of provable assurance that software acts appropriately for all possible inputs is being integrated into the Continuous Integration (CI) infrastructures of many software development environments.

A Sum Greater Than Its Parts

These formal method tools are limited in that they typically require access to the software’s source code. This leaves an open question about how to guarantee that the compiled software that end users receive has actually been verified by formal method tools. Furthermore, when the software in question is proprietary, end users may not have access to the source code so they cannot apply formal method tools. 

ZKPs can address these limitations and, when used in tandem with formal methods, establish a secure software supply chain between software developers and end users. 

ZKPs would improve traditional formal method tools in two ways. First, formal proofs can show that compiled software matches the source code that was formally verified. Developers can provide such proofs alongside their compiled software to show to everyone that it was properly compiled from its source code. 

Second, developers can prove that their (private) source code satisfies a public specification such as a set of correctness or security properties, without revealing that source code. This proof, in conjunction with the above proof that the compiled code is equivalent to the source code, gives developers an end-to-end proof: my source code always guarantees these properties; and my delivered executables are compiled from that verified source code.  Verifying those twin proofs about a received executable is easy for consumers given appropriate tooling. 

The combined capabilities of Formal Methods and ZKPs could help mathematically ensure that there are no cracks in the armor—the sorts of cracks that increasingly leave software applications open to attack—by providing an easy way to verify the correctness and security of third party components before integrating it at the risk of your own end product. 

In the SolarWinds attack, hackers compromised the servers that distributed Orion, enabling them to digitally sign the modified software. ZKPs combined with formal methods would eliminate the need to trust distributors. Instead, the ZKPs would guarantee the proven correctness and security properties of the software. 

Similarly, the `node-ipc` attack could have been prevented by using formal method tools to prove that its source code does not write to the filesystem. 

At Galois, we have been working towards this future. In addition to our work in formal methods, we have been doing research on encoding properties in ZK and optimizing ZK protocols.

As part of the DARPA SIEVE program, we developed zkUnsat along with collaborators at Yale, Columbia, and Northwestern. This tool enables users to efficiently prove unsatisfiability of SAT boolean formulas in ZK, which is a technique underlying many formal method verification tools. 

We are currently working on zkUnsmt, which is a generalization of zkUnsat that reasons about SMT. SMT itself is a generalization of SAT that is used in formal method tools like SAW and Cryptol. One open research area is how to better integrate formal method tools with ZK. Techniques are needed to efficiently translate high level source code to low level SAT or SMT statements in ZK. In addition, the compilation from source code to binary software needs to be efficiently verifiable in ZK. 

Clearly, there is still a lot of work to be done in this area. Yet the actively growing threat demands a response. The remarkable potential and current capabilities already demonstrated by the combination of formal methods with ZKPs may hold the key to a future where software supply chain attacks are not just mitigated against, but eliminated completely. 

Acknowledgments

This post mentions work supported by the Defense Advanced Research Projects Agency

(DARPA) under Contract No. HR001120C0085. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Defense Advanced Research Projects Agency (DARPA). Distribution Statement A: Approved for Public Release, Distribution Unlimited.