UC-Crux is an open-source command-line tool and Haskell library for performing under-constrained symbolic execution on LLVM code for the sake of exposing bugs or verifying the absence of certain types of undefined behavior. It requires only LLVM bitcode as input. It is built on Galois’s Crucible library for symbolic execution.
Under-constrained symbolic execution (developed by Ramos and Engler) is a technique that can be used for many of the same purposes as more traditional symbolic execution, but sidesteps some scalability issues and requires little to no user intervention or setup. Its greatest strength and fundamental limitation is that it can start at any function in the target program.
If you’d like to try out UC-Crux before reading on, the README has a small demo.
A great way to understand the current capabilities and limitations of UC-Crux (and its particular approach to under-constrained symbolic execution) is to look at what it can do with a large, heavily used codebase like OpenSSH [1]:
Overall results:
Unclear result, errors are either false or true positives (or timeouts were hit): 116
Found likely bugs: 2
Function is safe if deduced preconditions are met: 95
Function is safe up to the specified bounds on loops/recursion: 3
Function is always safe: 17
This summary shows that:
This report is truncated, but the full output also indicates the source locations and nature of the purported “likely bugs,” descriptions of what kinds of contracts were inferred, an accounting of any unimplemented features that prevented analysis (in this case, 2 instances of variable-arity functions), and more.
UC-Crux is designed to support several related use cases:
Symbolic execution is a powerful technique for understanding the behavior of a program. Programs are normally executed on concrete inputs, such as the integer 7 or the string "hello". Symbolic execution can be thought of as interpreting a program on values that can be either concrete or symbolic, where symbolic values represent an unknown but possibly constrained value. If the symbolic executor reaches a branch with a symbolic condition, it follows both the true and false branches. Symbolic values and symbolic branching enable symbolic execution to explore all possible paths through a program. The result of symbolic execution is a description of how the function behaves for some class of inputs: we can ask whether the program can crash, or whether it computes the correct result from its inputs. Symbolic execution lies at the heart of Galois’s verification toolchain, and in particular, our industry-strength symbolic testing tool, Crux. For more information about Crux and symbolic execution see the Galois blog post introducing Crux and crux.galois.com.
Symbolic execution has to start somewhere. There are a few common approaches present in tools like KLEE or Crux-LLVM:
Each approach has its advantages and disadvantages. The first may have difficulty achieving adequate coverage due to path explosion, whereas the second requires more time and energy spent writing harnesses. Unit and property testing share many of the same disadvantages as the second approach. Under-constrained symbolic execution presents a third alternative, with its own set of trade-offs.
Roughly speaking, under-constrained symbolic execution uses heuristics to uncover what kind of inputs and environments are needed for the program to run without undefined behavior. (Or conversely, what kinds of inputs and environments can cause undefined behavior to occur.) Under-constrained symbolic execution can be thought of as a kind of contract (or precondition) inference: under-constrained symbolic execution tries to find out what preconditions are sufficient for successful execution of a given program (or procedure).
For a function like div that takes integers as arguments, it’s reasonably clear how this should work: run the function with symbolic integers as arguments and assert that no undefined behavior occurs in its execution.
Things get more interesting when considering functions that take pointer-rich data structures (e.g., linked lists) as arguments. Should the tool generate null pointers as input? Circular linked lists? Can the tool support heap structures of unbounded or symbolic size?
There’s another tricky issue: What constitutes a bug? Sure, perhaps your library function my_div
has undefined behavior when it’s second argument is zero, but that’s probably expected, it might even be well-documented! Similarly, lots of functions that take pointer arguments assume that they’ll be non-null and freely dereference them. This is where the name under-constrained comes from: inputs to arbitrary functions may have preconditions that are not evident from the function’s signature nor even implementation, nothing in the code describes all the constraints on such input.
To address both of these issues, UC-KLEE (the tool described by Dawson and Engler) uses a technique called lazy initialization. Essentially, as the program under test reads and writes from memory during symbolic execution, the tool lazily/dynamically allocates and initializes such memory as necessary to make the read or write operation safe. For a more precise explanation of the behavior of UC-KLEE, see the paper that introduces it. UC-Crux takes a somewhat different approach (described in the next section), which enables a clean separation of constraint instantiation heuristics from the implementation of the memory model.
UC-Crux takes an iterative approach to generating symbolic inputs. That is, the tool begins with a simple candidate input, and executes the procedure. If execution fails, this indicates the symbolic input must be refined. The tool examines the error, heuristically updates the input, and restarts the analysis. Of course, it may happen that the heuristics cannot understand and fix the failure: in this case, the program may contain a bug, or it may be that the heuristics are not sophisticated enough to compute appropriate constraints on the input. [8]
The following subsections explain this process in detail. They are pretty technical and can be safely skipped.
To understand how UC-Crux generates symbolic inputs, we first need to take a look at its memory model (inherited from Crucible-LLVM). Reading from the Crucible-LLVM documentation on the subject (permalink):
Each allocation in the memory model is completely isolated from all other allocations and has a unique block identifier. Pointers into the memory model have two components: the identifier of the block they reference and an offset into that block. Note that both the identifier and the offset can be symbolic.
Said another way, the memory model is two-dimensional: One “axis” of a pointer determines which allocation (possibly allocations in the case of a symbolic block identifier) it points to, and the other “axis” is the offset within that allocation. The crucial point for UC-Crux is that pointers consist of a pair of data: a block identifier and an offset. [9] For example, the null pointer is represented by the pair (0, 0x0)
.
Once a procedure in the target program has been selected [10] (this can be done manually by the user or every function in a target program can be explored), UC-Crux will generate the “smallest” fully symbolic input for that function based on its type signature, and then gradually “expand” it as necessary. For instance, if a function takes a single integer as an argument, UC-Crux generates a symbolic integer of the appropriate width. If it takes a pointer to an integer, UC-Crux generates a pointer with a block identifier of 0 and a symbolic offset. Call these initial pointers. Initial pointers don’t point to any valid memory, but they also don’t (concretely) compare equal to the null pointer.
UC-Crux will then symbolically execute the procedure on this input:
If the function reads from or writes to an initial pointer, such a read or write will always be undefined and cause that pointer to point to allocated memory on the next iteration. That is, the heuristic always thinks it's reasonable (not a bug) for a procedure to read from and write to pointers that are reachable from its arguments or global variables. To ensure termination, UC-Crux (just like UC-KLEE before it) uses k-bounding (also known as k-limiting), i.e., it will only expand data structures up to some (optionally user-specified) constant depth. [11]
Any two initial pointers may alias and may be null. However, when UC-Crux decides that a given pointer needs to be non-null and allocates memory to back it, each such pointer will have its own unique block of memory. Therefore, no valid pointer will alias any initial pointer, nor will two valid pointers alias. [12] In particular, UC-Crux won’t generate cyclic data structures.
Global variables are treated identically to arguments: They are initialized to the “smallest” symbolic value compatible with their type, and then “expanded” if/when the procedure under test accesses them. Therefore, the same caveats around aliasing apply to globals: No (valid) pointers that appear inside of the values of global variables will alias with one another, nor with any pointers that are reachable from the function’s arguments.
Symbolic execution can either be used for verification (as in Crux and KLEE), or for bug hunting (with tools like angr, Manticore, KLEE, and many others). During verification, [13] it’s crucial that the symbolic execution tool over-approximates the set of all possible program behaviors (or signals an error if this isn’t feasible). You don’t want to miss a bug and claim that your program has been proved correct! During bug-hunting, tools have more leeway to under-approximate possible behaviors. For instance, if the symbolic execution tool encounters a call to malloc with a symbolic argument, it can fork and explore just the minimum and maximum possible values (if they can be determined, and the tool designers think that this is a useful strategy for finding bugs).
UC-Crux is intended to be usable as both a (limited) verification tool and for bug hunting. Accordingly, it has several features that intentionally sacrifice soundness for code coverage.
Programs often call functions that are defined externally, which are eventually resolved in a later build step or at runtime by the linker. It's difficult to over-approximate the behavior of these functions (for example, they might have arbitrary effects on the values of global variables), but it's a lot easier to under-approximate them: Simply assume that the function doesn't mutate its arguments or any globals, and generate a fully symbolic return value (using the same strategy outlined above).
In a similar vein, it can be a tedious and error-prone process for tool designers to implement over-approximate versions of libc functions and system calls. It's much easier, for example, to implement a version of gethostname that simply always writes the constant string "galois" into its argument, rather than a version that chooses a string with symbolic length and content. Accordingly, UC-Crux ships with under-approximate versions of a few functions that Crux doesn't fully support.
strlen
will only work on concrete pointers. These operations are very difficult to handle in a performant way in full generality, as each additional character introduces a symbolic branch. Experiments with the SMT String theory may improve scalability, but it will likely remain a significant challenge.Recall that under-constrained symbolic execution can be considered a form of contract inference. There are some important limitations on the situations in which UC-Crux can infer adequate contracts.
The algorithm UC-Crux uses for generating inputs can produce contracts/preconditions that are too strong, i.e., they are more restrictive than necessary. For example, UC-Crux would conclude that the pointer p
in the following function must be non-null because it is dereferenced along one path through the function:
int f(int *p, bool x) {
if (x) {
return *p;
}
return x;
}
A more precise contract would state that p
must be non-null whenever x
is true
, but UC-Crux doesn’t (yet) have the ability to deduce contracts involving disjunctions.
UC-Crux can’t yet infer constraints involving two or more separate pieces of data (function arguments or global variables). For example, UC-Crux can’t deduce that in the following function, p
needs to point to an allocation of size greater than the value of i
:
int g(int *p, int i) {
return p[i];
}
UC-Crux is still young and more exciting features are on the way. This new open-source tool uses under-constrained symbolic execution and will hopefully be useful to programmers, security researchers, and proof engineers alike. Pull requests, issues, and questions are welcome! Please reach out to crux@galois.com to start a conversation.
[1]: The following script records the steps taken to collect these results (after building uc-crux-llvm from commit d4d8d846
in GaloisInc/crucible). The output has been lightly edited for clarity. The results may differ from run to run depending on the SMT solver version, how busy the CPU is, the price of eggs, and cosmic rays.
git clone https://github.com/langston-barrett/nixpkgs-blight
cd nixpkgs-blight
git checkout 7741d44b359ab255b09fe2e6fe944b5f1c38da5c
make out/bitcode/openssh
uc-crux-llvm -- --solver=z3 --path-sat-solver=z3 --verbosity=3 --path-sat --path-strategy=split-dfs --explore --explore-parallel --explore-budget=500 --explore-timeout=5 --fail-fast --re-explore --no-compile out/bitcode/openssh/sshd.bc
[2]: With caveats
[3]: Again, with caveats.
[4]: Or binaries lifted to LLVM with tools like reopt or Remill.
[5]: Again, with caveats.
[6]: This feature is not currently supported via the command-line interface, but is possible with the Haskell library API.
[7]: For examples of this approach, see the “Using Symbolic Environment” KLEE tutorial, or the Manticore README.
[8]: From a very technical point of view, there’s always a precondition that would make any function execute successfully: an always-false one. The logical “principle of explosion” applies in this case. However, such a precondition isn’t very helpful, and UC-Crux won’t generate one.
[9]: Non-pointer bitvectors (e.g. the 5 in int x = 5;)
are represented the same way as pointers, but with a block identifier that’s concretely 0.
[10]: While UC-Crux will start at a specific procedure, it will follow control flow into other functions as they’re called.
[11]: Technically, UC-Crux doesn’t specifically bound the depth/size of inputs, but rather has bounds on the number of loop iterations and recursive calls it will follow, which amounts to the same thing.
[12]: This isn’t a fundamental limitation, but rather an artifact of the current progress in developing UC-Crux. We hope to improve the input generation algorithm to consider aliasing in the future.
[13]: “Verification” can mean a lot of things - let’s focus on verifying the absence of undefined behavior for the moment, though some of these comments apply to functional specification as well.
[14]: As well as virtually every other language that compiles to LLVM, and pretty much every even halfway mainstream programming language with the sole, venerable exception of Standard ML.
[15]: In addition to these practical limitations, there’s also a more fundamental issue: SMT solvers are built to solve an NP-complete problem. Therefore, their performance can’t be uniformly good in all cases.