Building a Concurrency Verifier Using Crucible

Many of the verification and static analysis tools we build at Galois are based on the same technology: a symbolic execution engine for a language called Crucible. There are a lot of advantages to doing this. It’s what makes it possible for SAW to reason about C, C++, Rust, and x86 assembly, all through the same interface, just by translating into Crucible. Improvements to Crucible improve all our tools at once, and the systems we’ve verified form a natural test set that helps us avoid bugs and performance regressions. 

Crucible also makes it much easier to build new tools. For instance, Galois recently announced Crux, a verification tool based on symbolic testing. A user of Crux can write a test harness with symbolic inputs and then check whether assertions in the test case could ever fail. Behind the scenes, Crux translates programs to Crucible. 

This works great for sequential programs, but it made us wonder whether we could use the same technology to build a simple verifier for multi-threaded Rust programs. It turns out that Crucible makes that pretty easy! Let’s take a look at what we did.  

Example

Let’s examine the following simple program:

fn inc(val: u32) -> u32
{
   if val == u32::MAX { val } else { val + 1 }
}

fn dec(val: u32) -> u32
{
   if val == 0 { val } else { val - 1 }
}

fn action(do_inc: bool, value: u32) -> u32
{
   if do_inc { inc(value) } else { dec(value) }
}

We might want to check that if we chain action three times, then the difference between the resulting value and the original value is no more than 3. We can devise the following test in crux-mir (the Rust version of Crux) to check this is the case for an arbitrary starting value and an arbitrary sequence of three actions:

#[cfg_attr(crux, crux_test)]
fn test() {
   let v0 = u32::symbolic("value");
   let mut v = v0;

   for i in 0..3 {
       let do_inc = bool::symbolic("do_inc");
       v = action(do_inc, v);
   }

   crucible_assert!(if v >= v0 { v - v0 } else { v0 - v } <= 3 );
}

Here, the declaration u32::symbolic("value") tells Crux to consider any possible value rather than a single one.  Crux-mir verifies that the assertion succeeds, meaning the assertion is true for any choice of v0 and do_inc. Pretty cool! 

The Crucible language (and, hence, the symbolic execution engine) is itself single-threaded, leaving verifying properties of multithreaded programs out of Crux’s reach. For example, we would like to be able to modify the above program so that each action is performed by a separate thread concurrently:

fn inc(val: u32) -> u32
{
   if val == u32::MAX { val } else { val + 1 }
}

fn dec(val: u32) -> u32
{
   if val == 0 { val } else { val - 1 }
}

fn action(do_inc: bool, value: &Arc<Mutex<u32>>)
{
   let mut pval = value.lock().unwrap();
   *pval = if do_inc { inc(*pval) } else { dec(*pval) };

}

#[cfg_attr(crux, crux_test)]
fn main() {
   let v0 = u32::symbolic("value");
   let value = Arc::new(Mutex::new(v0));

   for i in 0..NUM_THREADS {
       let do_inc = bool::symbolic("do_inc");
       let t_value = value.clone();
       thread::spawn(move || action(do_inc, &t_value));
   }

   let mut pval = value.lock().unwrap();
   let v = *pval;
   crucible_assert!(if v >= v0 { v - v0 } else { v0 - v } <= NUM_THREADS as u32 );
}

Fortunately, it turns out that we can support threads in Crucible without modifying the engine at all. This means we can benefit from the years of engineering that have been poured into Crucible but would benefit from any future improvements as well. Likewise, leaving threads out of the language supported by Crucible reduces the complexity of maintenance on, and improvements, to the symbolic executor. 

Adding Concurrency

We can model the behavior of the multithreaded program above as a collection of all of the different executions of the program. For example, we might have the following two execution orders: 

  • Thread 0: inc → Thread 1: dec → Thread 2: inc
  • Thread 1: dec → Thread 2: inc → Thread 0: inc 

Each of these executions corresponds to a particular thread interleaving, or ordering of the different atomic statements executed by all of the threads.

At a high level, then, we approach the problem by searching for new interleavings and then using Crucible to verify each one. We built out a prototype implementation of this idea as a new library and instantiated it for use in crux-mir. 

Conceptually, our approach works as follows:

  1. A Scheduler steps through a Crucible program (much like how you would use gdb to step through a C program). The scheduler adds threads to a pool of running threads as they are created. The Scheduler picks a thread and its next instruction at each step and asks Crucible to execute the instruction. In this way, the Scheduler uses Crucible to execute an interleaving of the programs concurrently executing threads. This interleaving is recorded for use in the next step.
  1. When there are no more active threads to run, the Scheduler reviews the interleaving and decides if it should restart the program in an attempt to produce a new interleaving.
  1. When the Scheduler determines that it has explored all interesting interleavings, it terminates execution. Any assertion failures discovered by Crucible in any interleaving can be reported to the user.

Rust Instantiation. We have extended crux-mir with the capability to use this new library for the exploration of multithreaded programs. The only additional work that was required was to teach the Scheduler which Rust function calls correspond to interesting primitives, such as thread creation, lock acquisition, etc. Each crux-based tool that wishes to use these new concurrency-aware features needs to provide this information. 

Our prototype currently recognizes Mutex operations (lock and unlock) and sequentially consistent operations on atomics (std::sync::atomic::Atomic*). To help guide the development of the tool, we ported some SV-COMP verification tasks from C to Rust: these benchmarks are accessible here.

Scalability 

This project aimed to show what can be done in Crucible, and our prototype explores thread interleavings in a naive way by enumeration. To be sure to discover all errors, we can simply explore all such interleavings. This means our prototype only scales to small programs and small numbers of threads. 

If we wanted to build a scalable concurrency verification tool, the next step would be to detect equivalences between classes of threads, meaning we could reduce the number of interleavings that need to be explored. We architected our current prototype to make it easy to experiment with different approaches. This is an active area of research, and there are many strategies that have been proposed (for example see this CAV 2020 paper, which includes a discussion of previous work in the space). 

Crucible as a Platform

We were very happy with the outcome of this project. Besides the new ability to verify properties of multithreaded programs, this project has demonstrated that Crucible can be an effective platform for experimenting with and building out different sorts of verification tools. For example, via What4, tools built on top of Crucible all share support for a variety of SMT solvers and theories. Many simulation options, such as optimizations, counterexample reporting, and search strategies, are implemented at the Crucible level. As such, tools built on top of Crucible and Crux inherit these capabilities with minimal additional work. 

Our concurrency verification library is open source and available as part of Crucible. More information on Crux and Crucible is available on the Crux website. We would love to hear from you if you are interested in building on Crux, Crucible, or any of our other tools.