A Visual Guide to Pointer Analysis with cclyzer++: Part 2

This is the second part of a two-part blog post series that visually illustrates the essence of cclyzer++ by means of examples. You'll probably want to read part 1 before continuing.

The previous post explained that the goal of pointer analysis is to compute an overapproximation of the points-to relation of a program, which relates program variables and allocations to the sets of allocations they can point to at runtime. All of the examples in that post only showed analysis of a single function. In this installment, we'll examine how cclyzer++ performs inter-procedural analysis on whole programs. We'll also see two sources of imprecision for pointer analyses and how cclyzer++ ameliorates them.

Computing the Fixed Point

In part 1, we saw the core rules of a flow-insensitive pointer analysis for LLVM. To compute the points-to relation, pointer analyses apply these rules until they reach a fixed point, that is, until no rule can generate new points-to information. This analysis is guaranteed to reach such a fixed point and terminate, because each rule adds an edge/pair to the points-to relation and the allocation site abstraction guarantees that the points-to relation has a finite maximum size. In the limit, each variable could point to each allocation, but even such an exceedingly large points-to relation would be finite.

Interprocedural Analysis

Consider the following C program and its LLVM translation:

#include <stdlib.h>

void *snd(char *u, int *v) {
  void *w = (void *)v;
  return w;
}

int main() {
  char *x = malloc(sizeof(char));
  int *y = malloc(sizeof(int));
  void *z = snd(x, y);
  return 0;
}

define i8* @snd(i8* %u, i32* %v) {
entry:
    %w = bitcast i32* %v to i8*
    ret i8* %w
}

define i32 @main() {
entry:
    %x = call i8* @malloc(i64 1)
    %y1 = call i8* @malloc(i64 4)
    %y2 = bitcast i8* %y1 to i32*
    %z = call i8* @snd(i8* %x, i32* %y2)
    ret i32 0
}


(Note that LLVM doesn't distinguish between void pointers and byte pointers, translating both void* and char* as i8*.)

To see what %z might point to, cclyzer++ must reason about the call to @snd. When cclyzer++ analyzes this call, it looks at the arguments to the call and matches them up with the formal parameters of the callee. In this case, %x gets matched with %u and %y2 gets matched with %v. For each argument, the points-to information from the actual argument gets copied to the formal argument. So before the call, the points-to graph looks like this:

Right at the beginning of the execution of @snd, the points to information for %x gets copied to %u, and similarly for %y2 and %v, giving

Then the body of the function (i.e., the bitcast instruction) is analyzed, yielding

Finally, the points-to information for the return value is copied back into the variable that captures the return value of the call. In this example, points-to information is copied from %w to %z.

Context Sensitivity

The interprocedural analysis sketched above loses precision in some cases. Consider the following program:

#include <stdlib.h>

void *id(void *z) {
  return z;
}

int main() {
  char u;
  char v;
  char *x = id(&u);
  char *y = id(&v);
  return 0;
}
define i8* @id(i8* %z) {
entry:
    ret i8* %z
}

define i32 @main() {
entry:
    %u = alloca i8, align 1
    %v = alloca i8, align 1
    %x = call i8* @id(i8* %u)
    %y = call i8* @id(i8* %v)
    ret i32 0
}


After the first call to @id, the points-to graph would look like this:

However, after the second call, %z additionally points to stack@main[%v], which then gets propagated back to %x (and vice versa for stack@main[%u] and %y).

This result is imprecise. At runtime, %x really can only point to the allocation represented by stack@main[%u] and %y can only point to stack@main[%v], but the two get conflated when they both flow through @id. To avoid this issue, cclyzer++ can analyze functions context-sensitively. A context-sensitive analysis analyzes each function once per context. The most common choice of context is the callsite.

Let's look at this program again, but analyze @id once per context. Contexts are drawn as boxes and labeled with their callsite. Since @main has no callers, its context is just labeled @main. After the first call to @id, the points-to graph would look like this:

Since @id gets analyzed separately in a new context at the second callsite, the parameters and return values don't get conflated.

While this example program was fairly contrived, context sensitivity turns out to be a crucial technique for pointer analyses to retain precision on realistic programs. cclyzer++ provides several settings for tuning context-sensitivity, including types of contexts other than callsites, and increased context depth (this example showed a context depth of 1).

Field Sensitivity

The analysis described so far loses precision when applied to programs that use structs. Consider the following program:

typedef struct { int *x; int *y; } s;

int main() {
    int u;
    int v;
    s w;
    w.x = &u;
    w.y = &v;
    int *z = w.x;
    return 0;
}


At the LLVM level, a field assignment like w.x = &u (that is, a reference to a field on the left-hand side of an assignment) gets split into two instructions, a getelementptr instruction that calculates the address of the field and a store that stores the value to it. Likewise, a reference to a field on the right-hand side of an assignment turns into a getelementptr followed by a load. The getelementptr instruction looks a bit intimidating, but suffice to say that in this example, the three getelementptr instructions correspond exactly to the three field references in the source program. (See this blog post, some LLVM docs, and the official language reference for more details on getelementptr.)

%struct.s = type { i32*, i32* }

define i32 @main() {
entry:
    %u = alloca i32, align 4
    %v = alloca i32, align 4
    %w = alloca %struct.s, align 8
    %x = getelementptr %struct.s, %struct.s* %w, i32 0, i32 0
    store i32* %u, i32** %x, align 8
    %y = getelementptr %struct.s, %struct.s* %w, i32 0, i32 1
    store i32* %v, i32** %y, align 8
    %x1 = getelementptr %struct.s, %struct.s* %w, i32 0, i32 0
    %z = load i32*, i32** %x1, align 8
    ret i32 0
}


With the field-insensitive analysis described above, the points-to graph would look like this:

The results for %z are imprecise - at runtime, %z can actually only point to stack@main[%u]. To fix this issue, cclyzer++ creates field suballocations for different struct fields. getelementptr instructions return the appropriate field suballocation when applied to a struct-typed allocation. Otherwise, field suballocations act just like normal abstract allocations.

Graphically, a struct-typed allocation is linked to its suballocations with a dotted arrow. The first suballocation of a struct is also linked to its parent by a thick, undirected edge which indicates that they are must-aliases, i.e., everything that one points to is also pointed to by the other. For the sake of presentation, outgoing edges of must-aliases aren't actually duplicated.

Here's what the points-to graph would look like just after the stack allocation of %w:

After the rest of the function:

Now, %x1 points to the first field of %w (stack@main[%w].u) instead of all of it, and the result for %z is correspondingly more precise.

There's an analogous precision issue for programs with arrays, which can be solved in an analogous way with array sensitivity.

An Extended Example

We're now ready to examine the example that was shown at the beginning of Part 1 of this series. It's a program that constructs a linked list.

#include <stdlib.h>

typedef struct list list_t;
typedef struct list {
  void *data;
  list_t *next;
} list_t;

list_t *head;

list_t *mk(void *d) {
  list_t *node = malloc(sizeof(node));
  node->data = d;
  node->next = NULL;
  return node;
}

int main(int argc, char *argv[]) {
  int x = 0;
  head = mk(&x);
  list_t *last = head;
  for (int i = 0; i next = new;
    last = new;
  }
  return 0;
}


Now, the LLVM. This example has more conditional control flow using the icmp, br, and phi instructions, but it can be ignored (recall that cclyzer++ is flow-insensitive). It's big, but we'll walk through it step by step.

%struct.list = type { i8*, %struct.list* }

@head = global %struct.list* null, align 8

define %struct.list* @mk(i8* %d) {
entry:
  %node1 = call i8* @malloc(i64 8)
  %node2 = bitcast i8* %node1 to %struct.list*
  %data = getelementptr %struct.list, %struct.list* %node2, i32 0, i32 0
  store i8* %d, i8** %data, align 8
  %nx = getelementptr %struct.list, %struct.list* %node2, i32 0, i32 1
  store %struct.list* null, %struct.list** %nx, align 8
  ret %struct.list* %node2
}

define i32 @main(i32 %argc, i8** %argv) {
entry:
  %x1 = alloca i32, align 4
  %iptr1 = alloca i32, align 4
  store i32 0, i32* %x1, align 4
  %x2 = bitcast i32* %x1 to i8*
  %call = call %struct.list* @mk(i8* %x2)
  store %struct.list* %call, %struct.list** @head, align 8
  %head = load %struct.list*, %struct.list** @head, align 8
  store i32 0, i32* %iptr1, align 4
  br label %for.cond

for.cond:
  %last = phi %struct.list* [ %head, %entry ], [ %new, %for.inc ]
  %i1 = load i32, i32* %iptr1, align 4
  %cmp = icmp slt i32 %i1, %argc
  br i1 %cmp, label %for.body, label %for.end

for.body:
  %iptr2 = bitcast i32* %iptr1 to i8*
  %new = call %struct.list* @mk(i8* %iptr2)
  %next = getelementptr %struct.list, %struct.list* %last, i32 0, i32 1
  store %struct.list* %new, %struct.list** %next, align 8
  br label %for.inc

for.inc:
  %i2 = load i32, i32* %iptr1, align 4
  %inc = add i32 %i2, 1
  store i32 %inc, i32* %iptr1, align 4
  br label %for.cond

for.end:
  ret i32 0
}


Let's start with just the global variable @head, which initially points to null. For the sake of simplicity, the analysis is context-insensitive.

Then the program makes a few stack allocations before the first call to @mk:

At the first call to @mk, %x2 gets copied to %d:

Then the body of the function executes, creating a new struct-typed heap allocation (with corresponding field suballocations) and initializing it with some data.

The call returns, assigning %node2 to the return value %call. @main stores %call to the global variable @head, and loads to the local variable %head from the global @head, so that it aliases %call.

At the beginning of the for-loop, there's a phi instruction. In the first iteration of the loop it assigns %head to %last, and on the rest of the iterations it assigns %new to %last.

At the basic block label for.body, %iptr1 is cast to an i32* and then passed to @mk. Due to the allocation site abstraction and context insensitivity, this only results in minor updates to the points-to facts in the body of @mk. In particular, the writes to heap@mk[%node1].data are combined, even though at runtime they happen to different concrete allocations. This imprecision would be avoided with a 1-callsite-sensitive analysis with a 1-callsite-sensitive heap.

After @mk returns, the freshly-allocated link %new is stored to the next field of %last. This results in a sort of cycle in the points-to graph, where heap@mk[%node1].next points to heap@mk[%node1]. This demonstrates two ways that the context-insensitive allocation site abstraction loses precision on this program:

  1. It loses information about the length of the linked list, which must be of length >1 at this point in the program.
  2. It doesn't rule out this program constructing a circular list.

When returning to the head of the loop, no additional points-to facts are computed because %head and %new point to the same abstract allocation. The for.inc block has no effect on the points-to graph, so we've computed the fixed point and we're done! Phew!

Conclusion

Pointer analysis is a foundational static analysis. We hope you've enjoyed learning a bit more about it! More information about cclyzer++ can be found in the project documentation.