Cloning-Based Context-Sensitive Pointer Alias Analysis Using Binary Decision Diagrams


John Whaley, Monica S. Lam
Computer Science Department
Stanford University
Stanford, CA 94305
{jwhaley, lam}@stanford.edu

ABSTRACT

This paper presents the first scalable context-sensitive, inclusion-based pointer alias analysis for Java programs. Our approach to context sensitivity is to create a clone of a method for every context of interest, and run a context-insensitive algorithm over the expanded call graph to get context-sensitive results. For precision, we generate a clone for every acyclic path through a program’s call graph, treating methods in a strongly connected component as a single node. Normally, this formulation is hopelessly intractable as a call graph often has 10^14 acyclic paths or more. We show that these exponential relations can be computed efficiently using binary decision diagrams (BDDs). Key to the scalability of the technique is a context numbering scheme that exposes the commonalities across contexts. We applied our algorithm to the most popular applications available on Sourceforge, and found that the largest programs, with hundreds of thousands of Java bytecodes, can be analyzed in under 20 minutes.

This paper shows that pointer analysis, and many other queries and algorithms, can be described succinctly and declaratively using Datalog, a logic programming language. We have developed a system called bddbddb that automatically translates Datalog programs into highly efficient BDD implementations. We used this approach to develop a variety of context-sensitive algorithms including side effect analysis, type analysis, and escape analysis.

Categories and Subject Descriptors
D.3.4 [Programming Languages]: Processors - Compilers; E.2 [Data]: Data Storage Representations

General Terms
Algorithms, Performance, Design, Experimentation, Languages

Keywords
context-sensitive, inclusion-based, pointer analysis, Java, scalable, cloning, binary decision diagrams, program analysis, Datalog, logic programming

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
PLDI’04, June 9–11, 2004, Washington, DC, USA.
Copyright 2004 ACM 1-58113-807-5/04/0006 ...$5.00.

1. INTRODUCTION

Many applications of program analysis, such as program optimization, parallelization, error detection and program understanding, need pointer alias information. Scalable pointer analyses developed to date are imprecise because they are either context-insensitive[3, 17, 19, 33] or unification-based[15, 16]. A context-insensitive analysis does not distinguish between different calling contexts of a method and allows information from one caller to propagate erroneously to another caller of the same method. In unification-based approaches, pointers are assumed to be either unaliased or pointing to the same set of locations[28]. In contrast, inclusion-based approaches are more precise but also more expensive, as they allow two aliased pointers to point to overlapping but different sets of locations.

We have developed a context-sensitive and inclusion-based pointer alias analysis that scales to hundreds of thousands of Java bytecodes. The analysis is field-sensitive, meaning that it tracks the individual fields of individual pointers. Our analysis is mostly flow-insensitive, using flow sensitivity only in the analysis of local pointers in each function. The results of this analysis, as we show in this paper, can be easily used to answer users’ queries and to build more advanced analyses and programming tools.

1.1 Cloning to Achieve Context Sensitivity

Our approach to context sensitivity is based on the notion of cloning. Cloning conceptually generates multiple instances of a method such that every distinct calling context invokes a different instance, thus preventing information from one context from flowing to another. Cloning makes generating context-sensitive results algorithmically trivial: we can simply apply a context-insensitive algorithm to the cloned program to obtain context-sensitive results. Note that our analysis does not clone the code per se; it simply produces a separate answer for each clone.

The context of a method invocation is often distinguished by its call path, which is simply the call sites, or return addresses, on the invocation’s call stack. In the case of a recursive program, there are an unbounded number of calling contexts. To limit the number of calling contexts, Shivers proposed the concept of k-CFA (Control Flow Analysis) whereby one remembers only the last k call sites[26]. Emami et al. suggested distinguishing contexts by their full call paths if they are acyclic. For cyclic paths, they suggested including each call site in recursive cycles only once[14]. Our approach also uses entire call paths to distinguish between contexts in programs without recursion. To handle recursion, call paths are reduced by eliminating all invocations whose callers and callees belong to the same strongly connected component in the call graph. These reduced call paths are used to identify contexts.
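
To make the notion of reduced call paths concrete, the following Python sketch counts how many contexts each method would receive on a toy call graph. It is an illustration only, not the paper's BDD-based implementation: the function names `strongly_connected_components` and `count_contexts`, the edge-list representation (one `(caller, callee)` pair per invocation site), and the sample graph are all assumptions made for this example.

```python
from collections import defaultdict

def strongly_connected_components(nodes, edges):
    """Map each node to a representative of its SCC (Kosaraju's algorithm)."""
    succ, pred = defaultdict(list), defaultdict(list)
    for caller, callee in edges:
        succ[caller].append(callee)
        pred[callee].append(caller)

    # Pass 1: iterative DFS, recording nodes in order of completion.
    order, seen = [], set()
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(succ[start]))]
        while stack:
            node, it = stack[-1]
            nxt = next((m for m in it if m not in seen), None)
            if nxt is None:
                order.append(node)
                stack.pop()
            else:
                seen.add(nxt)
                stack.append((nxt, iter(succ[nxt])))

    # Pass 2: sweep the reversed graph in reverse completion order.
    comp = {}
    for root in reversed(order):
        if root in comp:
            continue
        comp[root] = root
        work = [root]
        while work:
            node = work.pop()
            for p in pred[node]:
                if p not in comp:
                    comp[p] = root
                    work.append(p)
    return comp

def count_contexts(nodes, edges, entry):
    """Count reduced call paths from `entry` to every method.

    Invocations whose caller and callee share an SCC are dropped, so the
    remaining edges form a DAG over SCCs and paths can be counted by
    memoized recursion.
    """
    comp = strongly_connected_components(nodes, edges)
    incoming = defaultdict(list)
    for caller, callee in edges:
        if comp[caller] != comp[callee]:
            incoming[comp[callee]].append(comp[caller])

    memo = {comp[entry]: 1}  # the entry method has a single context

    def paths(c):
        if c not in memo:
            memo[c] = sum(paths(p) for p in incoming[c])
        return memo[c]

    return {m: paths(comp[m]) for m in nodes}

# Hypothetical call graph: a and b are mutually recursive; repeated pairs
# stand for distinct invocation sites between the same two methods.
nodes = ["main", "a", "b", "c", "d"]
edges = [("main", "a"), ("main", "a"), ("a", "b"), ("b", "a"),
         ("a", "c"), ("b", "c"), ("c", "d"), ("c", "d")]
print(count_contexts(nodes, edges, "main"))
```

In this toy graph the count doubles at each level (1, 2, 4, 8), which hints at why the number of contexts can grow exponentially with the depth of the call graph.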
It was not obvious, at least to us at the beginning of this project, that a cloning-based approach would be feasible. The number of reduced call paths in a program grows exponentially with the number of methods, and a cloning-based approach must compute the result of every one of these contexts. Emami et al. have only reported context-sensitive points-to results on small programs[14]. Realistic programs have many contexts; for example, the megamek application has over 10^14 contexts (see Section 6.1). The size of the final results alone appears to be prohibitive.

We show that we can scale a cloning-based points-to analysis by representing the context-sensitive relations using ordered binary decision diagrams (BDDs)[6]. BDDs, originally designed for hardware verification, have previously been used in a number of program analyses[2, 23, 38], and more recently for points-to analysis[3, 39]. We show that it is possible to compute context-sensitive points-to results for over 10^14 contexts.

In contrast, most context-sensitive pointer alias analyses developed to date are summary-based[15, 34, 37]. Parameterized summaries are created for each method and used in creating the summaries of its callers. It is not necessary to represent the results for the exponentially many contexts explicitly with this approach, because the result of a context can be computed independently using the summaries. However, answering queries as simple as “which variables point to a certain object” would require all the results to be computed. Readers may be interested to know that, despite much effort, we did not succeed in creating a scalable summary-based algorithm using BDDs.

1.2 Contributions

The contributions of this paper are not limited to just an algorithm for computing context-sensitive and inclusion-based points-to information. The methodology, specification language, representation, and tools we used in deriving our pointer analysis are applicable to creating many other algorithms. We demonstrate this by using the approach to create a variety of queries and algorithms.

Scalable cloning-based context-sensitive points-to analysis using BDDs. The algorithm we have developed is remarkably simple. We first create a cloned call graph where a clone is created for every distinct calling context. We then run a simple context-insensitive algorithm over the cloned call graph to get context-sensitive results. We handle the large number of contexts by representing them in BDDs and using an encoding scheme that allows commonalities among similar contexts to be exploited. We improve the efficiency of the algorithm by using an automatic tool that searches for an effective variable ordering.

Datalog as a high-level language for BDD-based program analyses. Instead of writing our program analyses directly in terms of BDD operations, we store all program information and results as relations and express our analyses in Datalog, a logic programming language used in deductive databases[30]. Because Datalog is succinct and declarative, we can express points-to analyses and many other algorithms simply and intuitively in just a few Datalog rules.

We use Datalog because its set-based operation semantics matches the semantics of BDD operations well. To aid our algorithm research, we have developed a deductive database system called bddbddb (BDD Based Deductive DataBase) that automatically translates Datalog programs into BDD algorithms. We provide a high-level summary of the optimizations in this paper; the details are beyond the scope of this paper[35].

Our experience is that programs generated by bddbddb are faster than their manually optimized counterparts. More importantly, Datalog programs are orders-of-magnitude easier to write. They are so succinct and easy to understand that we use them to explain all our algorithms here directly. All the experimental results reported in this paper are obtained by running the BDD programs automatically generated by bddbddb.

Context-sensitive queries and other analyses. The context-sensitive points-to results, the simple cloning-based approach to context sensitivity, and the bddbddb system make it easy to write new analyses. We show some representative examples in each of the following categories:

  1. Simple queries. The results from our context-sensitive pointer analysis provide a wealth of information of interest to programmers. We show how a few lines of Datalog can help programmers debug a memory leak and find potential security vulnerabilities.

  2. Algorithms using context-sensitive points-to results. We show how context-sensitive points-to results can be used to create advanced analyses. We include examples of a context-sensitive analysis to compute side effects (mod-ref) and an analysis to refine declared types of variables.

  3. Other context-sensitive algorithms. Cloning can be used to trivially generate other kinds of context-sensitive results besides points-to relations. We illustrate this with a context-sensitive type analysis and a context-sensitive thread escape analysis. Whereas previous escape analyses require thousands of lines of code to implement[34], the algorithm here has only seven Datalog rules.

Experimental Results. We present the analysis time and memory usage of our analyses across 21 of the most popular Java applications on Sourceforge. Our context-sensitive pointer analysis can analyze even the largest of the programs in under 19 minutes. We also compare the precision of context-insensitive pointer analysis, context-sensitive pointer analysis and context-sensitive type analysis, and show the effects of merging versus cloning contexts.

1.3 Paper Organization

Here is an overview of the rest of the paper. Section 2 explains our methodology. Using Berndl’s context-insensitive points-to algorithm as an example, we explain how an analysis can be expressed in Datalog and, briefly, how bddbddb translates Datalog into efficient BDD implementations. Section 3 shows how we can easily extend the basic points-to algorithm to discover call graphs on the fly by adding a few Datalog rules. Section 4 presents our cloning-based approach and how we use it to compute context-sensitive points-to results. Section 5 shows the representative queries and algorithms built upon our points-to results and the cloning-based approach. Section 6 presents our experimental results. We report related work in Section 7 and conclude in Section 8.

2. FROM DATALOG TO BDDS

In this section, we start with a brief introduction to Datalog. We then show how Datalog can be used to describe the context-insensitive points-to analysis due to Berndl et al. at a high level. We then describe how our bddbddb system translates a Datalog program into an efficient implementation using BDDs.

2.1 Datalog

We represent a program and all its analysis results as relations. Conceptually, a relation is a two-dimensional table. The columns are the attributes, each of which has a domain defining the set of possible attribute values. The rows are the tuples of attributes that share the relation. If tuple (x, y, z) is in relation A, we say that predicate A(x, y, z) is true.
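
The table view of a relation can be mirrored with ordinary sets. The Python sketch below is illustrative only: the class `Relation`, its methods `add` and `holds`, and the sample domains are hypothetical names chosen for this example and are not part of bddbddb.

```python
class Relation:
    """A relation as a set of tuples over named attributes with finite domains."""

    def __init__(self, name, attributes):
        # attributes: list of (attribute_name, domain) pairs; each domain
        # is the set of values that attribute may take.
        self.name = name
        self.attributes = attributes
        self.tuples = set()

    def add(self, *values):
        """Insert one row after checking it against the attribute domains."""
        if len(values) != len(self.attributes):
            raise ValueError("wrong number of attribute values")
        for value, (attr, domain) in zip(values, self.attributes):
            if value not in domain:
                raise ValueError(f"{value!r} is outside the domain of {attr}")
        self.tuples.add(values)

    def holds(self, *values):
        """Predicate view: A(x, y, z) is true iff (x, y, z) is a row of A."""
        return values in self.tuples

# Hypothetical three-attribute relation A(x, y, z) over small domains.
small = {0, 1, 2}
A = Relation("A", [("x", small), ("y", small), ("z", small)])
A.add(0, 1, 2)
print(A.holds(0, 1, 2))  # the tuple is in the relation
print(A.holds(2, 1, 0))  # this tuple was never added
```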
A Datalog program consists of a set of rules, written in a Prolog-style notation, where a predicate is defined as a conjunction of other predicates. For example, the Datalog rule

        D(w, z)  :-  A(w, x), B(x, y), C(y, z).

says that “D(w, z) is true if A(w, x), B(x, y), and C(y, z) are all true.” Variables in the predicates can be replaced with constants, which are surrounded by double-quotes, or don’t-cares, which are signified by underscores. Predicates on the right side of the rules can be inverted.

Datalog is more powerful than SQL, which is based on relational calculus, because Datalog predicates can be recursively defined[30]. If none of the predicates in a Datalog program is inverted, then there is a guaranteed minimal solution consisting of relations with the least number of tuples. Conversely, programs with inverted predicates may not have a unique minimal solution. Our bddbddb system accepts a subclass of Datalog programs, known as stratified programs[7], for which minimal solutions always exist. Informally, rules in such programs can be grouped into strata, each with a unique minimal solution, that can be solved in sequence.

2.2 Context-Insensitive Points-to Analysis

We now review Berndl et al.’s context-insensitive points-to analysis[3], while also introducing the Datalog notation. This algorithm assumes that a call graph, computed using simple class hierarchy analysis[13], is available a priori. Heap objects are named by their allocation sites. The algorithm finds the objects possibly pointed to by each variable and field of heap objects in the program. Shown in Algorithm 1 is the exact Datalog program, as fed to bddbddb, that implements Berndl’s algorithm. To keep the first example simple, we defer the discussion of using types to improve precision until Section 2.3.

ALGORITHM 1. Context-insensitive points-to analysis with a precomputed call graph.

DOMAINS

        V        262144    variable.map
        H        65536     heap.map
        F        16384     field.map

RELATIONS

        input     vP0       (variable : V, heap : H)
        input     store     (base : V, field : F, source : V)
        input     load      (base : V, field : F, dest : V)
        input     assign    (dest : V, source : V)
        output    vP        (variable : V, heap : H)
        output    hP        (base : H, field : F, target : H)

RULES

        vP(v, h)          :-  vP0(v, h).                        (1)
        vP(v1, h)         :-  assign(v1, v2), vP(v2, h).        (2)
        hP(h1, f, h2)     :-  store(v1, f, v2),
                              vP(v1, h1), vP(v2, h2).           (3)
        vP(v2, h2)        :-  load(v1, f, v2),
                              vP(v1, h1), hP(h1, f, h2).        (4)

A Datalog program has three sections: domains, relations, and rules. A domain declaration has a name, a size n, and an optional file name that provides a name for each element in the domain, internally represented as an ordinal number from 0 to n − 1. The latter allows bddbddb to communicate with the users with meaningful names. A relation declaration has an optional keyword specifying whether it is an input or output relation, the name of the relation, and the name and domain of every attribute. A relation declared as neither input nor output is a temporary relation generated in the analysis but not written out. Finally, the rules follow the standard Datalog syntax. The rule numbers, introduced here for the sake of exposition, are not in the actual program.

We can express all information found in the intermediate representation of a program as relations. To avoid inundating readers with too many definitions all at once, we define the relations as they are used. The domains and relations used in Algorithm 1 are:

 V is the domain of variables. It represents all the allocation sites, formal parameters, return values, thrown exceptions, cast operations, and dereferences in the program. There is also a special global variable for use in accessing static variables.

 H is the domain of heap objects. Heap objects are named by the invocation sites of object creation methods. To increase precision, we also statically identify factory methods and treat them as object creation methods.

 F is the domain of field descriptors in the program. Field descriptors are used when loading from a field (v2 = v1.f;) or storing to a field (v1.f = v2;). There is a special field descriptor to denote an array access.

 vP0 : V × H is the initial variable points-to relation extracted from object allocation statements in the source program. vP0(v, h) means there is an invocation site h that assigns a newly allocated object to variable v.

 store : V × F × V represents store statements. store(v1, f, v2) says that there is a statement “v1.f = v2;” in the program.

 load : V × F × V represents load statements. load(v1, f, v2) says that there is a statement “v2 = v1.f;” in the program.

 assign : V × V is the assignments relation due to passing of arguments and return values. assign(v1, v2) means that variable v1 includes the points-to set of variable v2. Although we do not cover return values here, they work in an analogous manner.

 vP : V × H is the output variable points-to relation. vP(v, h) means that variable v can point to heap object h.

 hP : H × F × H is the output heap points-to relation. hP(h1, f, h2) means that field f of heap object h1 can point to heap object h2.

Note that local variables and their assignments are factored away using a flow-sensitive analysis[33]. The assign relation is derived by using a precomputed call graph. The sizes of the domains are determined by the number of variables, heap objects, and field descriptors in the input program.

Rule (1) incorporates the initial variable points-to relations into vP. Rule (2) finds the transitive closure over inclusion edges. If v1 includes v2 and variable v2 can point to object h, then v1 can also point to h. Rule (3) models the effect of store instructions on heap objects. Given a statement “v1.f = v2;”, if v1 can point to h1 and v2 can point to h2, then h1.f can point to h2. Rule (4) resolves load instructions. Given a statement “v2 = v1.f;”, if v1 can point to h1 and h1.f can point to h2, then v2 can point to h2. Applying these rules until the results converge finds all the possible context-insensitive points-to relations in the program.

2.3 Improving Points-to Analysis with Types

Because Java is type-safe, variables can only point to objects of assignable types. Assignability is similar to the subtype relation,
A LGORITHM 2.          Context-insensitive points-to analysis with          2.4     Translating Datalog into Efficient BDD
type filtering.
                                                                                       Implementations
D OMAINS                                                                          We first describe how Datalog rules can be translated into opera-
  Domains from Algorithm 1, plus:                                              tors from relational algebra such as “join” and “project”, then show
                                                                               how to translate these operations into BDD operations.
        T            4096        type.map
                                                                               2.4.1    Query Resolution
R ELATIONS
                                                                                  We can find the solution to an unstratified query, or a stratum of
  Relations from Algorithm 1, plus:
                                                                               a stratified query, simply by applying the inference rules repeatedly
         input       vT          (variable : V, type : T)                      until none of the output relations change. We can apply a Datalog
         input       hT          (heap : H, type : T)                          rule by performing a series of relational natural join, project and re-
         input       aT          (supertype : T, subtype : T)                  name operations. A natural join operation combines rows from two
                     vPfilter    (variable : V, heap : H)                      relations if the rows share the same value for a common attribute.
                                                                               A project operation removes an attribute from a relation. A rename
RULES
                                                                               operation changes the name of an attribute to another one.
   vPfilter (v, h)       :−     vT (v, tv ), hT (h, th ), aT (tv , th ). (5)      For example, the application of Rule (2) can be implemented as:
     vP (v, h)           :−     vP 0 (v, h).                             (6)
                                                                                       t1 = rename(vP , variable, source);
     vP (v1 , h)         :−     assign(v1 , v2 ), vP (v2 , h),                         t2 = project(join(assign, t1 ), source);
                                vPfilter (v1 , h).                       (7)           vP = vP ∪ rename(t2 , dest, variable);
     hP (h1 , f, h2 )    :−     store(v1 , f, v2 ),                            We first rename the attribute in relation vP from variable to source
                                vP (v1 , h1 ), vP (v2 , h2 ).            (8)   so that it can be joined with relation assign to create a new points-to
     vP (v2 , h2 )       :−     load (v1 , f, v2 ), vP (v1 , h1 ),             relation. The attribute dest of the resulting relation is changed to
                                hP (h1 , f, h2 ), vPfilter (v2 , h2 ).   (9)   variable so that the tuples can be added to the vP tuples accumu-
                                                                               lated thus far.
 2                                                                                The bddbddb system uses the three following optimizations to
                                                                               speed up query resolution.
with allowances for interfaces, null values, and arrays[22]. By drop-             Attributes naming. Since the names of the attributes must match
ping targets of unassignable types in assignments and load state-              when two relations are joined, the choice of attribute names can
ments, we can eliminate many impossible points-to relations that               affect the costs of rename operations. Since the renaming cost is
result from the imprecision of the analysis. 1                                 highly sensitive to how the relations are implemented, the bddbddb
   Adding type filtering to Algorithm 1 is simple in Datalog. We               system takes the representation into account when minimizing the
add a new domain to represent types and new relations to repre-                renaming cost.
sent assignability as well as type declarations of variables and heap             Rule application order. A rule needs to be applied only if the
objects. We compute the type filter and modify the rules in Algo-              input relations have changed. bddbddb optimizes the ordering of
rithm 1 to filter out unsafe assignments and load operations.

 T is the domain of type descriptors (i.e. classes) in the program.

 vT : V × T represents the declared types of variables. vT(v, t) means that
       variable v is declared with type t.

 hT : H × T represents the types of objects created at a particular creation site.
       In Java, the type created by a new instruction is usually known statically.²
       hT(h, t) means that the object created at h has type t.

 aT : T × T is the relation of assignable types. aT(t1, t2) means that type t2 is
       assignable to type t1.

 vPfilter : V × H is the type filter relation. vPfilter(v, h) means that it is
       type-safe to assign heap object h to variable v.

   Rule (5) in Algorithm 2 defines the vPfilter relation: It is type-safe to assign heap
object h of type th to variable v of type tv if tv is assignable from th. Rules (6) and
(8) are the same as Rules (1) and (3) in Algorithm 1. Rules (7) and (9) are analogous
to Rules (2) and (4), with the additional constraint that only points-to relations that
match the type filter are inserted.

the rules by analyzing the dependences between the rules. For example, Rule 1 in
Algorithm 1 does not depend on any of the other rules and can be applied only once at
the beginning of the query resolution.

   Incrementalization. We only need to re-apply a rule on those combinations of tuples
that we have not seen before. Such a technique is known as incrementalization in the
BDD literature and semi-naïve fixpoint evaluation in the database literature[1]. Our
system also identifies loop-invariant relations to avoid unnecessary difference and
rename operations. Shown below is the result of incrementalizing the repeated
application of Rule (2):

        d = vP;
        repeat
            t1 = rename(d, variable, source);
            t2 = project(join(assign, t1), source);
            d′ = rename(t2, dest, variable);
            d = d′ − vP;
            vP = vP ∪ d;
        until d == ∅;
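
   As a rough illustration of the incrementalized loop for Rule (2), the following
Python sketch substitutes ordinary sets of tuples for BDDs. The function name and the
sample relations are hypothetical and are not part of bddbddb; the rename, join and
project steps are collapsed into a single set comprehension, so this only mirrors the
control flow, not the BDD operations themselves.

```python
def propagate_assignments(vP0, assign):
    """Semi-naive fixpoint of  vP(v1, h) :- assign(v1, v2), vP(v2, h).

    vP0    -- set of (variable, heap) tuples: the initial points-to facts
    assign -- set of (dest, source) tuples
    (Both parameter names are illustrative assumptions.)
    """
    vP = set(vP0)
    d = set(vP)  # delta: tuples discovered in the previous iteration
    while d:
        # Join assign with the delta on source == variable, project away
        # the source, and treat dest as the new variable attribute.
        d_new = {(dest, h) for (dest, src) in assign for (v, h) in d if src == v}
        d = d_new - vP  # keep only tuples that have not been seen before
        vP |= d
    return vP


# Hypothetical input: a cycle a -> c -> d -> a of assignments, plus e = b.
vP0 = {("a", "h1"), ("b", "h2")}
assign = {("c", "a"), ("d", "c"), ("a", "d"), ("e", "b")}
result = propagate_assignments(vP0, assign)
```

   Each iteration joins only the newly found tuples against assign, which is the point
of the technique: tuples already in vP are never re-derived from scratch.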
¹ We could similarly perform type filtering on stores into heap objects. However,
because all stores must go through variables, such a type filter would only catch one
extra case: when the base object is a null constant.
² The type of a created object may not be known precisely if, for example, the object
is returned by a native method or reflection is used. Such types are modeled
conservatively as all possible types.

2.4.2    Relational Algebra in BDD
   We now explain how BDDs work and how they can be used to implement relations and
relational operations. BDDs (Binary Decision Diagrams) were originally invented for
hardware verification to efficiently store a large number of states that share many
commonalities[6]. They are an efficient representation of boolean functions.
A BDD is a directed acyclic graph (DAG) with a single root node and two terminal nodes,
representing the constants one and zero. Each non-terminal node in the DAG represents
an input variable and has exactly two outgoing edges: a high edge and a low edge. The
high edge represents the case where the input variable for the node is true, and the
low edge represents the case where the input variable is false. On any path in the DAG
from the root to a terminal node, the value of the function for the truth values of the
input variables on the path is given by the value of the terminal node. To evaluate a
BDD for a specific input, one simply starts at the root node and, for each node,
follows the high edge if the input variable is true, and the low edge if the input
variable is false. The value of the terminal node that we reach is the value of the BDD
for that input.

   The variant of BDDs that we use is called ordered binary decision diagrams, or
OBDDs[6]. "Ordered" refers to the constraint that on all paths through the graph the
variables respect a given linear order. In addition, OBDDs are maximally reduced,
meaning that nodes with the same variable name and the same low and high successors are
collapsed into one, and nodes with identical low and high successors are bypassed.
Thus, the more commonalities there are in the paths leading to the terminals, the more
compact the OBDDs are. Accordingly, the amount of sharing and the size of the
representation depend greatly on the ordering of the variables.

   We can use BDDs to represent relations as follows. Each element d in an n-element
domain D is represented as an integer between 0 and n − 1 using log2(n) bits. A
relation R : D1 × ... × Dn is represented as a boolean function
f : D1 × ... × Dn → {0, 1} such that (d1, ..., dn) ∈ R iff f(d1, ..., dn) = 1, and
(d1, ..., dn) ∉ R iff f(d1, ..., dn) = 0.

   A number of highly-optimized BDD packages are available[21, 27]; the operations
they provide can be used directly to implement relational operations efficiently. For
example, the "replace" operation in BDD has the same semantics as the "rename"
operation; the "relprod" operation in BDD finds the natural join between two relations
and projects away the common domains.

   Let us now use a concrete example to illustrate the significance of variable
ordering. Suppose relation R1 contains tuples (1, 1), (2, 1), ..., (100, 1) and
relation R2 contains tuples (1, 2), (2, 2), ..., (100, 2). If in the variable order the
bits for the first attribute come before the bits for the second, the BDD will need to
represent the sequence 1, ..., 100 separately for each relation. However, if instead
the bits for the second attribute come first, the BDD can share the representation for
the sequence 1, ..., 100 between R1 and R2. Unfortunately, the problem of finding the
best variable ordering is NP-complete[5]. Our bddbddb system automatically explores
different alternatives empirically to find an effective ordering[35].

3.     CALL GRAPH DISCOVERY
   The call graph generated using class hierarchy analysis can have many spurious call
targets, which can lead to many spurious points-to relations[19]. We can get more
precise results by creating the call graph on the fly using points-to relations. As the
algorithm generates points-to results, they are used to identify the receiver types of
the methods invoked and to bind calls to target methods; and as call graph edges are
discovered, we use them to find more points-to relations. The algorithm converges when
no new call targets and no new pointer relations are found.

   Modifying Algorithm 2 to discover call graphs on the fly is simple. Instead of an
input assign relation computed from a given call graph, we derive it from method
invocation statements and points-to relations.

   ALGORITHM 3. Context-insensitive points-to analysis that computes call graph on
the fly.

DOMAINS
  Domains from Algorithm 2, plus:

        I          32768     invoke.map
        N          4096      name.map
        M          16384     method.map
        Z          256

RELATIONS
  Relations from Algorithm 2, with the modification that assign is now a computed
  relation, plus:

        input      cha       (type : T, name : N, target : M)
        input      actual    (invoke : I, param : Z, var : V)
        input      formal    (method : M, param : Z, var : V)
        input      IE_0      (invoke : I, target : M)
        input      mI        (method : M, invoke : I, name : N)
        output     IE        (invoke : I, target : M)

RULES
  Rules from Algorithm 2, plus:

        IE(i, m)          :−    IE_0(i, m).                            (10)
        IE(i, m2)         :−    mI(m1, i, n), actual(i, 0, v),
                                vP(v, h), hT(h, t), cha(t, n, m2).     (11)
        assign(v1, v2)    :−    IE(i, m), formal(m, z, v1),
                                actual(i, z, v2).                      (12)
                                                                        □

 I is the domain of invocation sites in the program. An invocation site is a method
       invocation of the form r = p0.m(p1 ... pk). Note that H ⊆ I.

 N is the domain of method names used in invocations. In an invocation
       r = p0.n(p1 ... pk), n is the method name.

 M is the domain of implemented methods in the program. It does not include abstract
       or interface methods.

 Z is the domain used for numbering parameters.

 cha : T × N × M encodes virtual method dispatch information from the class
       hierarchy. cha(t, n, m) means that m is the target of dispatching the method
       name n on type t.

 actual : I × Z × V encodes the actual parameters for invocation sites.
       actual(i, z, v) means that v is passed as parameter number z at invocation
       site i.

 formal : M × Z × V encodes formal parameters for methods. formal(m, z, v) means
       that formal parameter z of method m is represented by variable v.

 IE_0 : I × M are the initial invocation edges. They record the invocation edges whose
       targets are statically bound. In Java, some calls are static or non-virtual.
       Additionally, local type analysis combined with analysis of the class hierarchy
       allows us to determine that some calls have a single target[13]. IE_0(i, m)
       means that invocation site i can be analyzed statically to call method m.

 mI : M × I × N represents invocation sites. mI(m, i, n) means that method m contains
       an invocation site i with virtual method name n. Non-virtual invocation sites
       are given a special null method name, which does not appear in the cha relation.

 IE : I × M is an output relation encoding all invocation edges. IE(i, m) means that
       invocation site i calls method m.
   The rules in Algorithm 3 compute the assign relation used in Algorithm 2. Rules (10)
and (11) find the invocation edges, with the former handling statically bound targets
and the latter handling virtual calls. Rule (11) matches invocation sites with the type
of the "this" pointer and the class hierarchy information to find the possible target
methods. If an invocation site i with method name n is invoked on variable v, and v can
point to h and h has type t, and invoking n on type t leads to method m, then m is a
possible target of invocation i.

   Rule (12) handles parameter passing.³ If invocation site i has a target method m,
variable v2 is passed as argument number z, and the formal parameter z of method m is
v1, then the points-to set of v1 includes the points-to set of v2. Return values are
handled likewise, except that the inclusion relation is in the opposite direction. We
see that the discovery of more variable points-to relations (vP) can create more
invocation edges (IE), which in turn can create more assignments (assign) and more
points-to relations. The algorithm converges when all the relations stabilize.

Figure 1: Example of path numbering. The graph on the left (a) is the original graph.
Nodes M2 and M3 are in a cycle and therefore are placed in one equivalence class. Each
edge is marked with path numbers at the source and target of the edge. The graph on the
right (b) is the graph with all of the paths expanded.
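
   To make the interplay between Rules (10) to (12) of Algorithm 3 and the points-to
propagation concrete, here is a small Python sketch that evaluates them as a naive
fixpoint over sets of tuples. It is a simplification under stated assumptions: type
filtering, loads and stores are omitted, no BDDs are involved, and the function name
and the toy program (classes A and B, site i1, variables such as "B.f.p") are
hypothetical.

```python
def discover_call_graph(vP0, hT, cha, mI, actual, formal, IE0):
    """Naive fixpoint over Rules (10)-(12) plus assignment propagation.

    All arguments are sets of tuples shaped like the relations of the same
    name in Algorithm 3; type filtering, loads and stores are left out.
    """
    vP, assign = set(vP0), set()
    IE = set(IE0)  # Rule (10): statically bound edges
    while True:
        sizes = (len(vP), len(IE), len(assign))
        # Rule (11): dispatch on the type of the object the receiver points to.
        IE |= {(i, m2)
               for (_m1, i, n) in mI
               for (i2, z, v) in actual if i2 == i and z == 0
               for (v2, h) in vP if v2 == v
               for (h2, t) in hT if h2 == h
               for (t2, n2, m2) in cha if t2 == t and n2 == n}
        # Rule (12): bind each formal parameter to the matching actual.
        assign |= {(v1, v2)
                   for (i, m) in IE
                   for (m2, z, v1) in formal if m2 == m
                   for (i2, z2, v2) in actual if i2 == i and z2 == z}
        # Propagate points-to sets along the derived assignments.
        vP |= {(v1, h) for (v1, v2) in assign for (v, h) in vP if v == v2}
        # The relations only grow, so unchanged sizes mean a fixpoint.
        if sizes == (len(vP), len(IE), len(assign)):
            return vP, IE, assign


# Hypothetical program: B extends A, both implement f(p); main does
#   x = new B(); y = new Object(); x.f(y);   // invocation site i1
vP0 = {("x", "h1"), ("y", "h2")}
hT = {("h1", "B"), ("h2", "Object")}
cha = {("A", "f", "A.f"), ("B", "f", "B.f")}
mI = {("main", "i1", "f")}
actual = {("i1", 0, "x"), ("i1", 1, "y")}
formal = {("A.f", 0, "A.f.this"), ("A.f", 1, "A.f.p"),
          ("B.f", 0, "B.f.this"), ("B.f", 1, "B.f.p")}
vP, IE, assign = discover_call_graph(vP0, hT, cha, mI, actual, formal, set())
```

   In this toy input only B.f becomes a target of i1, because x can point only to an
object of type B; a call graph built from class hierarchy analysis alone would also
have to include A.f.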
4.    CONTEXT SENSITIVE POINTS-TO
   A context-insensitive or monomorphic analysis produces just one set of results for
each method regardless of how many ways a method may be invoked. This leads to
imprecision because information from different calling contexts must be merged, so
information along one calling context can propagate to other calling contexts. A
context-sensitive or polymorphic analysis avoids this imprecision by allowing different
contexts to have different results.

   We can make a context-sensitive version of a context-insensitive analysis as
follows. We make a clone of a method for each path through the call graph, linking each
call site to its own unique clone. We then run the original context-insensitive
analysis over the exploded call graph. However, this technique can require an
exponential (and in the presence of cycles, potentially unbounded) number of clones to
be created.

   It has been observed that different contexts of the same method often have many
similarities. For example, parameters to the same method often have the same types or
similar aliases. This observation led to the concept of partial transfer functions
(PTF), where summaries for each input pattern are created on the fly as they are
discovered[36, 37]. However, PTFs are notoriously difficult to implement and get
correct, as the programmer must explicitly calculate the input patterns and manage the
summaries. Furthermore, the technique has not been shown to scale to very large
programs.

   Our approach is to allow the exponential explosion to occur and rely on the
underlying BDD representation to find and exploit the commonalities across contexts.
BDDs can express large sets of redundant data in an efficient manner. Contexts with
identical information will automatically be shared at the data structure level.
Furthermore, because BDDs operate down at the bit level, they can even exploit
commonalities between contexts with different information. BDD operations operate on
entire relations at a time, rather than one tuple at a time. Thus, the cost of BDD
operations depends on the size and shape of the BDD relations, which depends greatly on
the variable ordering, rather than the number of tuples in a relation. Also, due to
caching in BDD packages, identical subproblems only have to be computed once. Thus,
with the right variable ordering, the results for all contexts can be computed very
efficiently.

4.1    Numbering Call Paths
   A call path is a sequence of invocation edges (i1, m1), (i2, m2), ..., such that i1
is an invocation site in an entry method, typically main⁴, and ik is an invocation site
in method mk−1 for all k > 1.

   For programs without recursion, every call path to a method defines a context for
that method. To handle recursive programs, which have an unbounded number of call
paths, we first find the strongly connected components (SCCs) in a call graph. By
eliminating all method invocations whose caller and callee belong to the same SCC from
the call paths, we get a finite set of reduced call paths. Each reduced call path to an
SCC defines a context for the methods in the SCC. Thus, information from different
paths leading to the SCCs is kept separate, but the methods within the SCC invoked with
the same incoming call path are analyzed context-insensitively.

   EXAMPLE 1. Figure 1(a) shows a small call graph with just six methods and a set of
invocation edges. Each invocation edge has a name, being one of a through i; its source
is labeled by the context number of the caller and its sink by the context number of
the callee. The numbers will be explained in Example 2. Methods M2 and M3 belong to a
strongly connected component, so invocations along edges c and d are eliminated in the
computation of reduced call graphs. While there are infinitely many call paths reaching
method M6, there are only six reduced call paths reaching M6, as shown in Figure 2.
Thus M6 has six clones, one for each reduced call path.

                 Call paths           Reduced call paths
                 reaching M6          reaching M6
                 a(cd)*eh             aeh
                 b(dc)*deh            beh
                 a(cd)*cfh            afh
                 b(dc)*fh             bfh
                 a(cd)*cgi            agi
                 b(dc)*gi             bgi

        Figure 2: The six contexts of function M6 in Example 1

³ We also match thread objects to their corresponding run() methods, even though the
edges do not explicitly appear in the call graph.
⁴ Other "entry" methods in typical programs are static class initializers, object
finalizers, and thread run methods.

   Under this definition of context sensitivity, large programs can have many contexts.
For example, pmd from our test programs has 1971 methods and 10^23 contexts! In the
BDD representation, we give each reduced call path reaching a method a distinct context
number. It is important to find a context numbering scheme that allows the BDDs to
share commonalities across contexts. Algorithm 4 shows one such scheme.

   ALGORITHM 4. Generating context-sensitive invocation edges from a call graph.

INPUT: A call multigraph.

OUTPUT: Context-sensitive invocation edges IE_C : C × I × C × M, where C is the domain
of context numbers. IE_C(c, i, cm, m) means that invocation site i in context c calls
method m in context cm.

METHOD:

    1. A method with n clones will be given numbers 1, ..., n. Nodes with no
       predecessors are given a singleton context numbered 1.

    2. Find strongly connected components in the input call graph. The ith clone of a
       method always calls the ith clone of another method belonging to the same
       component.

    3. Collapse all methods in a strongly connected component to a single node to get
       an acyclic reduced graph.

    4. For each node n in the reduced graph in topological order,
         Set the count of contexts created, c, to 0.
         For each incoming edge,
              If the predecessor of the edge p has k contexts,
              create k clones of node n,
              Add tuple (i, p, i + c, n) to IE_C, for 1 ≤ i ≤ k,
              c = c + k.
                                                                        □

   EXAMPLE 2. We now show the results of applying Algorithm 4 to Example 1. M1, the
root node, is given context number 1. We shall visit the invocation edges from left to
right. Nodes M2 and M3, being members of a strongly connected component, are
represented as one node. The strongly connected component is reached by two edges from
M1. Since M1 has only one context, we create two clones, one reached by each edge. For
method M4, the predecessor on each of the two incoming edges has two contexts, thus M4
has four clones. Method M5 has two clones, one for each clone that invokes M5. Finally,
method M6 has six clones: clones 1-4 of method M4 invoke clones 1-4, and clones 1-2 of
method M5 call clones 5-6, respectively. The cloned graph is shown in Figure 1(b).

   The numbering scheme used in Algorithm 4 plays up the strengths of BDDs. Each method
is assigned a contiguous range of contexts, which can be represented efficiently in
BDDs. The contexts of callees can be computed simply by adding a constant to the
contexts of the callers; this operation is also cheap in BDDs. Because the information
for contexts that share common tail sequences is likely to be similar, this numbering
allows the BDD data structure to share effectively across common contexts. For example,
the sequentially-numbered clones 1 and 2 of M6 both have a common tail sequence eh.
Because of this, the contexts are likely to be similar and therefore the BDD can take
advantage of the redundancies.

   To optimize the creation of the cloned invocation graph, we have defined a new
primitive that creates a BDD representation of contiguous ranges of numbers in O(k)
operations, where k is the number of bits in the domain. In essence, the algorithm
creates one BDD to represent numbers below the upper bound, and one to represent
numbers above the lower bound, and computes the conjunction of these two BDDs.

4.2    Context-Sensitive Pointer Analysis with a Pre-computed Call Graph
   We are now ready to present our context-sensitive pointer analysis. We assume the
presence of a pre-computed call graph created, for example, by using a
context-insensitive points-to analysis (Algorithm 3). We apply Algorithm 4 to the call
graph to generate the context-sensitive invocation edges IE_C. Once that is created, we
can simply apply a context-insensitive points-to analysis on the exploded call graph to
get context-sensitive results. We keep the results separate for each clone by adding a
context number to methods, variables, invocation sites, points-to relations, etc.

   ALGORITHM 5. Context-sensitive points-to analysis with a pre-computed call graph.

DOMAINS
  Domains from Algorithm 2, plus:

        C          9223372036854775808

RELATIONS
  Relations from Algorithm 2, plus:

        input      IE_C        (caller : C, invoke : I, callee : C, tgt : M)
                   assign_C    (destc : C, dest : V, srcc : C, src : V)
        output     vP_C        (context : C, variable : V, heap : H)

RULES

        vPfilter(v, h)             :−  vT(v, tv), hT(h, th), aT(tv, th).         (13)
        vP_C(c, v, h)              :−  vP_0(v, h), IE_C(c, h, _, _).             (14)
        vP_C(c1, v1, h)            :−  assign_C(c1, v1, c2, v2),
                                       vP_C(c2, v2, h), vPfilter(v1, h).         (15)
        hP(h1, f, h2)              :−  store(v1, f, v2),
                                       vP_C(c, v1, h1), vP_C(c, v2, h2).         (16)
        vP_C(cv, v2, h2)           :−  load(v1, f, v2), vP_C(cv, v1, h1),
                                       hP(h1, f, h2), vPfilter(v2, h2).          (17)
        assign_C(c1, v1, c2, v2)   :−  IE_C(c1, i, c2, m), formal(m, z, v1),
                                       actual(i, z, v2).                         (18)
                                                                        □

 C is the domain of context numbers. Our BDD library uses signed 64-bit integers to
       represent domains, so the size is limited to 2^63.

 IE_C : C × I × C × M is the set of context-sensitive invocation edges.
       IE_C(c, i, cm, m) means that invocation site i in context c calls method m in
       context cm. This relation is computed using Algorithm 4.

 assign_C : C × V × C × V is the context-sensitive version of the assign relation.
       assign_C(c1, v1, c2, v2) means variable v1 in context c1 includes the points-to
       set of variable v2 in context c2 due to parameter passing. Again, return values
       are handled analogously.

 vP_C : C × V × H is the context-sensitive version of the variable points-to relation
       (vP). vP_C(c, v, h) means variable v in context c can point to heap object h.

   Rule (18) interprets the context-sensitive invocation edges to find the bindings
between actual and formal parameters. The rest of the rules are the context-sensitive
counterparts to those found in Algorithm 2.

   Algorithm 5 takes advantage of a pre-computed call graph to create an efficient
context numbering scheme for the contexts. We can
compute the call graph on the fly while enjoying the benefit of the numbering scheme by
numbering all the possible contexts with a conservative call graph, and delaying the
generation of the invocation edges until warranted by the points-to results. We can
reduce the iterations necessary by exploiting the fact that many of the invocation
sites of a call graph created by a context-insensitive analysis have single targets.
Such an algorithm has an execution time similar to Algorithm 5, but is of primarily
academic interest as the call graph rarely improves due to the extra precision from
context-sensitive points-to information.

5.     QUERIES AND OTHER ANALYSES
   The algorithms in sections 2, 3 and 4 generate vast amounts of results in the form
of relations. Using the same declarative programming interface, we can conveniently
query the results and extract exactly the information we are interested in. This
section shows a variety of queries and analyses that make use of pointer information
and context sensitivity.

5.1      Debugging a Memory Leak
   Memory leaks can occur in Java when a reference to an object remains even after it
will no longer be used. One common approach to debugging memory leaks is to use a
dynamic tool that locates the allocation sites of memory-consuming objects. Suppose
that, upon reviewing the information, the programmer thinks objects allocated in line
57 in file a.java should have been freed. He may wish to know which objects may be
holding pointers to the leaked objects, and which operations may have stored the
pointers. He can consult the static analysis results by supplying the queries:

     whoPointsTo57(h, f)        :−   hP(h, f, "a.java:57").
     whoDunnit(c, v1, f, v2)    :−   store(v1, f, v2),
                                     vP_C(c, v2, "a.java:57").

   The first query finds the objects and their fields that may point to objects
allocated at "a.java:57"; the second finds the store instructions, and the contexts
under which they are executed, that create the references.

5.2      Finding a Security Vulnerability
   The Java Cryptography Extension (JCE) is a library of cryptographic algorithms[29].
Misuse of the JCE API can lead to security vulnerabilities and a false sense of
security. For example, many operations in the JCE use a secret key that must be
supplied by the programmer. It is important that secret keys be cleared after they are
used so they cannot be recovered by attackers with access to memory. Since String
objects are immutable and cannot be cleared,

     vuln(c, i)                 :−   IE(i, "PBEKeySpec.init()"),
                                     actual(i, 1, v), vP_C(c, v, h),
                                     fromString(h).

   Notice that this query does not only find cases where the object derived from a
String is immediately supplied to PBEKeySpec.init(). This query will also identify
cases where the object has passed through many variables and heap objects.

5.3     Type Refinement
   Libraries are written to handle the most general types of objects possible, and
their full generality is typically not used in many applications. By analyzing the
actual types of objects used in an application, we can refine the types of the
variables and object fields. Type refinement can be used to reduce overheads in cast
operations, resolve virtual method calls, and gain better understanding of the program.

   We say that variable v can be legally declared as t, written varSuperTypes(v, t), if
t is a supertype of the types of all the objects v can point to. The type of a variable
is refinable if the variable can be declared to have a more precise type. To compute
the supertypes of v, we first find varExactTypes(v, t), the types of objects pointed to
by v. We then intersect the supertypes of all the exact types to get the desired
solution; we do so in Datalog by finding the complement of the union of the complement
of the exact types.

     varExactTypes(v, t)        :−   vP_C(_, v, h), hT(h, t).
     notVarType(v, t)           :−   varExactTypes(v, tv), ¬aT(t, tv).
     varSuperTypes(v, t)        :−   ¬notVarType(v, t).
     refinable(v, tc)           :−   vT(v, td), varSuperTypes(v, tc),
                                     aT(td, tc), td ≠ tc.

   The above shows a context-insensitive type refinement query. We find, for each
variable, the type to which it can be refined regardless of the context. Even if the
end result is context-insensitive, it is more precise to take advantage of the
context-sensitive points-to results available to determine the exact types, as shown in
the first rule. In Section 6.3, we compare the accuracy of this context-insensitive
query with a context-sensitive version.

5.4     Context-Sensitive Mod-Ref Analysis
   Mod-ref analysis is used to determine what fields of what objects may be modified or
referenced by a statement or call site[18]. We can use the context-sensitive points-to
results to solve a context-sensitive version of this query. We define mV(m, v) to mean
that v is a local variable in m. The mV*_C relation specifies the set of vari-
secret keys should not be stored in String objects but in an array       ables and contexts of methods that are transitively reachable from
of characters or bytes instead.                                          a method. mV ∗C (c1 , m, c2 , v) means that calling method m with
   To guard against misuse, the function that accepts the secret key,    context c1 can transitively call a method with local variable v under
PBEKeySpec.init(), only allows arrays of characters or bytes as          context c2 .
input. However, a programmer not versed in security issues may             mV ∗C (c, m, c, v)          :−    mV (m, v).
have stored the key in a String object and then use a routine                   ∗
                                                                           mV C (c1 , m1 , c3 , v3 )   :−    mI (m1 , i), IE C (c1 , i, c2 , m2 ),
in the String class to convert it to an array of characters. We
                                                                                                             mV ∗C (c2 , m2 , c3 , v3 ).
can write a query to audit programs for the presence of such id-
ioms. Let Mret(m, v) be an input relation specifying that vari-             The first rule simply says that a method m in context c can reach
able v is the return value of method m. We define a relation             its local variable. The second rule says that if method m1 in context
fromString(h) which indicates if the object h was directly derived       c1 calls method m2 in context c2 , then m1 in context c1 can also
from a String. Specifically, it records the objects that are re-         reach all variables reached by method m2 in context c2 .
turned by a call to a method in the String class. An invocation             We can now define the mod and ref set of a method as follows:
i to method PBEKeySpec.init() is a vulnerability if the first argu-            mod (c, m, h, f )       :−   mV ∗C (c, m, cv , v),
ment points to an object derived from a String.                                                             store(v, f, ), vP C (cv , v, h).
     fromString(h)     :−    cha(“String”, , m), Mret(m, v),                   ref (c, m, h, f )       :−   mV ∗C (c, m, cv , v),
                             vP C ( , v, h).                                                                load (v, f, ), vP C (cv , v, h).
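
   As an illustration only, the mV*_C, mod and ref rules of Section 5.4 can
be evaluated with a naive fixpoint loop over plain sets. The sketch below is
not the bddbddb implementation, which represents relations as BDDs; the
function names mv_star and access_set and the two-method toy program are
hypothetical, and the first mV*_C rule is read literally as ranging over
every context number in C.

```python
# Toy evaluation of the mV*_C, mod and ref rules with plain Python sets.
# All facts below describe a made-up two-method program (an assumption).

def mv_star(C, mV, mI, IE_C):
    """Least fixpoint of the two mV*_C rules; tuples are (c1, m, c2, v)."""
    # Rule 1, read literally: for every context c, m reaches its own locals.
    reach = {(c, m, c, v) for c in C for (m, v) in mV}
    changed = True
    while changed:
        changed = False
        for (m1, i) in mI:
            for (c1, i2, c2, m2) in IE_C:
                if i2 != i:
                    continue
                # Rule 2: m1 in c1 inherits everything m2 in c2 reaches.
                for (cb, mb, c3, v3) in list(reach):
                    fact = (c1, m1, c3, v3)
                    if (cb, mb) == (c2, m2) and fact not in reach:
                        reach.add(fact)
                        changed = True
    return reach


def access_set(reach, access, vP_C):
    """mod (access = store) or ref (access = load); result is (c, m, h, f)."""
    return {(c, m, h, f)
            for (c, m, cv, v) in reach
            for (base, f, _) in access if base == v
            for (cp, vp, h) in vP_C if (cp, vp) == (cv, v)}


C = {0, 1}                                   # context numbers
mV = {("main", "a"), ("main", "b"), ("helper", "p"), ("helper", "q")}
mI = {("main", "i1")}                        # main contains call site i1
IE_C = {(0, "i1", 1, "helper")}              # i1 in context 0 calls helper in context 1
store = {("p", "f", "q")}                    # helper: p.f = q
load = {("a", "g", "b")}                     # main:   b = a.g
vP_C = {(0, "a", "h1"), (1, "p", "h1"), (1, "q", "h2")}

reach = mv_star(C, mV, mI, IE_C)
mod = access_set(reach, store, vP_C)
ref = access_set(reach, load, vP_C)
print(sorted(mod))
print(sorted(ref))
```

   In this toy program, main never stores to a field itself, yet main in
context 0 appears in mod for field f of h1 because it transitively reaches
the variable p of helper in context 1.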
   The rule for mod says that if method m in context c can reach a variable v
in context cv, and if there is a store through that variable to field f of
object h, then m in context c can modify field f of object h. The second
rule, which defines the ref relation, is analogous.

5.5      Context-Sensitive Type Analysis
   Our cloning technique can be applied to add context sensitivity to other
context-insensitive algorithms. The example we show here is the type
inference of variables and fields. By not distinguishing between instances of
heap objects, this analysis does not generate results as precise as those
extracted from running the complete context-sensitive pointer analysis as
discussed in Section 5.3, but is much faster.
   The basic type analysis is similar to 0-CFA[26]. Each variable and field
in the program has a set of concrete types that it can refer to. The sets are
propagated through calls, returns, loads, and stores. By using the path
numbering scheme in Algorithm 4, we can convert this basic analysis into one
which is context-sensitive; in essence, this makes the analysis a k-CFA
analysis where k is the depth of the call graph and recursive cycles are
collapsed.

   ALGORITHM 6. Context-sensitive type analysis.

DOMAINS
  Domains from Algorithm 5

RELATIONS
  Relations from Algorithm 5, plus:
     output vT_C        (context : C, variable : V, type : T)
     output fT          (field : F, target : T)
            vTfilter    (variable : V, type : T)

RULES
  vTfilter(v, t)            :−  vT(v, tv), aT(tv, t).                     (19)
  vT_C(c, v, t)             :−  vP_0(v, h), IE_C(c, h, _, _), hT(h, t).   (20)
  vT_C(cv1, v1, t)          :−  assign_C(cv1, v1, cv2, v2),
                                vT_C(cv2, v2, t), vTfilter(v1, t).        (21)
  fT(f, t)                  :−  store(_, f, v2), vT_C(_, v2, t).          (22)
  vT_C(_, v, t)             :−  load(_, f, v), fT(f, t),
                                vTfilter(v, t).                           (23)
  assign_C(c1, v1, c2, v2)  :−  IE_C(c1, i, c2, m), formal(m, z, v1),
                                actual(i, z, v2).                         (24)
                                                                            □

 vT_C : C × V × T is the context-sensitive variable type relation.
       vT_C(c, v, t) means that variable v in context c can refer to
       an object of type t. This is the analogue of vP_C in the points-
       to analysis.
 fT : F × T is the field type relation. fT(f, t) means that field f
       can point to an object of type t.
 vTfilter : V × T is the type filter relation. vTfilter(v, t) means
       that it is type-safe to assign an object of type t to variable v.

   Rule (20) initializes the vT_C relation based on the initial local
points-to information contained in vP_0, combining it with hT to get the type
and IE_C to get the context numbers. Rule (21) computes the transitive
closure of the vT_C relation, filtering with vTfilter to enforce type safety.
Rules (22) and (23) handle stores and loads, respectively. They differ from
their counterparts in the pointer analysis in that they do not use the base
object, only the field. Rule (24) models the effects of parameter passing in
a context-sensitive manner.

5.6     Thread Escape Analysis
   Our last example is a thread escape analysis, which determines if objects
created by one thread may be used by another. The results of the analysis can
be used for optimizations such as synchronization elimination and allocating
objects in thread-local heaps, as well as for understanding programs and
checking for possible race conditions due to missing synchronizations[8, 34].
This example illustrates how we can vary context sensitivity to fit the needs
of the analysis.
   We say that an object allocated by a thread has escaped if it may be
accessed by another thread. This notion is stronger than most other
formulations where an object is said to escape if it can be reached by
another thread[8, 34].
   Java threads, being instances of java.lang.Thread or its subclasses, are
identified by their creation sites. In the special case where a thread
creation can execute only once, a thread can simply be named by the creation
site. The thread that exists at virtual machine startup is an example of a
thread that can only be created once. A creation site reached via different
call paths or embedded in loops or recursive cycles may generate multiple
threads. To distinguish between thread instances created at the same site, we
create two thread contexts to represent two separate thread instances. If an
object created by one instance is not accessed by its clone, then it is not
accessed by any other instances created by the same call site. This scheme
creates at most twice as many contexts as there are thread creation sites.
   We clone the thread run() method, one for each thread context, and place
these clones on the list of entry methods to be analyzed. Methods
(transitively) invoked by a context’s run() method all inherit the same
context. A clone of a method not only has its own cloned variables, but also
its own cloned object creation sites. In this way, objects created by
separate threads are distinct from each other. We run a points-to analysis
over this slightly expanded call graph; an object created in a thread context
escapes if it is accessed by variables in another thread context.

   ALGORITHM 7. Thread-sensitive pointer analysis.

DOMAINS
  Domains from Algorithm 5

RELATIONS
  Relations from Algorithm 2, plus:
     input    H_T      (c : C, heap : H)
     input    vP_0T    (cv : C, variable : V, ch : C, heap : H)
     output   vP_T     (cv : C, variable : V, ch : C, heap : H)
     output   hP_T     (cb : C, base : H, field : F, ct : C, target : H)

RULES
  vPfilter(v, h)            :−  vT(v, tv), hT(h, th), aT(tv, th).         (25)
  vP_T(c1, v, c2, h)        :−  vP_0T(c1, v, c2, h).                      (26)
  vP_T(c, v, c, h)          :−  vP_0(v, h), H_T(c, h).                    (27)
  vP_T(c2, v1, ch, h)       :−  assign(v1, v2), vP_T(c2, v2, ch, h),
                                vPfilter(v1, h).                          (28)
  hP_T(c1, h1, f, c2, h2)   :−  store(v1, f, v2), vP_T(c, v1, c1, h1),
                                vP_T(c, v2, c2, h2).                      (29)
  vP_T(c, v2, c2, h2)       :−  load(v1, f, v2), vP_T(c, v1, c1, h1),
                                hP_T(c1, h1, f, c2, h2),
                                vPfilter(v2, h2).                         (30)
                                                                            □
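
   To make the thread-sensitive rules (26) to (30) of Algorithm 7 concrete,
the following sketch evaluates them by naive iteration over plain sets. It is
an illustration under stated assumptions, not the bddbddb implementation: the
type filter of rule (25) is omitted (every assignment is assumed type-safe),
and the function name and the toy facts (two thread contexts 1 and 2 cloned
from one run() method, plus a global object in context 0) are hypothetical. A
tuple (c, v, ch, h) in vP_T means that variable v in context c may point to
object h created under context ch.

```python
def thread_sensitive_points_to(vP0, H_T, vP0T, assign, store, load):
    """Naive fixpoint of rules (26)-(30); returns (vP_T, hP_T)."""
    vP_T = set(vP0T)                                          # rule (26)
    vP_T |= {(c, v, c, h) for (v, h) in vP0
             for (c, site) in H_T if site == h}               # rule (27)
    hP_T = set()
    while True:
        new_v, new_h = set(), set()
        for (v1, v2) in assign:                               # rule (28): v1 = v2
            new_v |= {(c, v1, ch, h) for (c, w, ch, h) in vP_T if w == v2}
        for (v1, f, v2) in store:                             # rule (29): v1.f = v2
            new_h |= {(c1, h1, f, c2, h2)
                      for (c, a, c1, h1) in vP_T if a == v1
                      for (cb, b, c2, h2) in vP_T if (cb, b) == (c, v2)}
        for (v1, f, v2) in load:                              # rule (30): v2 = v1.f
            new_v |= {(c, v2, c2, h2)
                      for (c, a, c1, h1) in vP_T if a == v1
                      for (cb, hb, g, c2, h2) in hP_T
                      if (cb, hb, g) == (c1, h1, f)}
        if new_v <= vP_T and new_h <= hP_T:
            return vP_T, hP_T                                 # nothing new: fixpoint
        vP_T |= new_v
        hP_T |= new_h


# Hypothetical run() body: x = new Obj(); z = new Tmp(); w = z; g.f = x; y = g.f
vP0 = {("x", "h_obj"), ("z", "h_tmp")}
H_T = {(1, "h_obj"), (2, "h_obj"), (1, "h_tmp"), (2, "h_tmp")}
vP0T = {(1, "g", 0, "h_glob"), (2, "g", 0, "h_glob")}  # g names the global object
assign = {("w", "z")}
store = {("g", "f", "x")}
load = {("g", "f", "y")}

vP_T, hP_T = thread_sensitive_points_to(vP0, H_T, vP0T, assign, store, load)
# An object escapes if a variable in a different context can point to it.
escaped = {(ch, h) for (cv, _, ch, h) in vP_T if cv != ch}
print(sorted(escaped))
```

   In this toy input, the object stored into the global field is reachable
from the other thread context and is reported as escaped, while the
temporary object referenced only by z and w stays within its creating
context.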
 H_T : C × H encodes the non-thread objects created by a thread.
       H_T(c, h) means that a thread with context c may execute non-
       thread allocation site h; in other words, there is a call path
       from the run() method in context c to allocation site h.
 vP_0T : C × V × C × H is the set of initial inter-thread points-to
       relations. This includes the points-to relations for thread
       creation sites and for the global object. vP_0T(c1, v, c2, h) means
       that thread c1 has a thread allocation site h, and v points to
       the newly created thread context c2. (There are usually two
       contexts assigned to each allocation site.) All global objects
       across all contexts are given the same context.
 vP_T : C × V × C × H is the thread-sensitive version of the variable
       points-to relation vP_C. vP_T(c1, v, c2, h) means variable v in
       context c1 can point to heap object h created under context c2.
 hP_T : C × H × F × C × H is the thread-sensitive version of the
       heap points-to relation hP. hP_T(c1, h1, f, c2, h2) means that
       field f of heap object h1 created under context c1 can point
       to heap object h2 created under context c2.

   Rule (26) incorporates the initial points-to relations for thread creation
sites. Rule (27) incorporates the points-to information for non-thread
creation sites, which have the context numbers of threads that can reach the
method. The other rules are analogous to those of the context-sensitive
pointer analysis, with an additional context attribute for the heap objects.
   From the analysis results, we can easily determine which objects have
escaped. An object h created by thread context c has escaped, written
escaped(c, h), if it is accessed by a different context cv. Complications
involving unknown code, such as native methods, could also be handled using
this technique.

     escaped(c, h)        :−    vP_T(cv, _, c, h), cv ≠ c.

   Conversely, an object h created by context c is captured, written
captured(c, h), if it has not escaped. Any captured object can be allocated
on a thread-local heap.

     captured(c, h)       :−    vP_T(c, v, c, h), ¬escaped(c, h).

   We can also use escape analysis to eliminate unnecessary synchronizations.
We define a relation syncs(v) indicating whether the program contains a
synchronization operation performed on variable v. A synchronization for
variable v under context c is necessary, written neededSyncs(c, v), if
syncs(v) holds and v can point to an escaped object.

     neededSyncs(c, v)    :−    syncs(v), vP_T(c, v, ch, h),
                                escaped(ch, h).

   Notice that neededSyncs is context-sensitive. Thus, we can distinguish
when a synchronization is necessary only for certain threads, and generate
specialized versions of methods for those threads.

6.    EXPERIMENTAL RESULTS
   In this section, we present some experimental results of using bddbddb on
the Datalog algorithms presented in this paper. We describe our testing
methodology and benchmarks, present the analysis times, evaluate the results
of the analyses, and provide some insight on our experience of developing
these analyses and the bddbddb tool.

6.1     Methodology
   The input to bddbddb is more or less the Datalog programs exactly as they
are presented in this paper. (We added a few rules to handle return values
and threads, and added annotations for the physical domain assignments of
input relations.) The input relations were generated with the Joeq compiler
infrastructure[32]. The entire bddbddb implementation is only 2500 lines of
code. bddbddb uses the JavaBDD library[31], an open-source library based on
the BuDDy library[21]. The entire system is available as open source[35], and
we hope that others will find it useful.
   All experiments were performed on a 2.2GHz Pentium 4 with Sun JDK 1.4.2_04
running on Fedora Linux Core 1. For the context-insensitive and
context-sensitive experiments, respectively: we used initial BDD table sizes
of 4M and 12M; the tables could grow by 1M and 3M after each garbage
collection; the BDD operation cache sizes were 1M and 3M.
   To test the scalability and applicability of the algorithm, we applied our
technique to 21 of the most popular Java projects on Sourceforge as of
November 2003. We simply walked down the list of 100% Java projects sorted by
activity, selecting the ones that would compile directly as standalone
applications. They are all real applications with tens of thousands of users
each. As far as we know, these are the largest benchmarks ever reported for
any context-sensitive Java pointer analysis. As a point of comparison, the
largest benchmark in the specjvm suite, javac, would rank only 13th in our
list.
   For each application, we chose an applicable main() method as the entry
point to the application. We included all class initializers, thread run
methods, and finalizers. We ignored null constants in the analysis: every
points-to set is automatically assumed to include null. Exception objects of
the same type were merged. We treated reflection and native methods as
returning unknown objects. Some native methods and special fields were
modeled explicitly.
   A short description of each of the benchmarks is included in Figure 3,
along with their vital statistics. The number of classes, methods, and
bytecodes were those discovered by the context-insensitive on-the-fly call
graph construction algorithm, so they include only the reachable parts of the
program and the class library.
   The number of context-sensitive (C.S.) paths is for the most part
correlated to the number of methods in the program, with the exception of
pmd. pmd has an astounding 5×10^23 paths in the call graph, which requires 79
bits to represent. pmd has different characteristics because it contains code
generated by the parser generator JavaCC. Many machine-generated methods call
the same class library routines, leading to a particularly egregious
exponential blowup. The JavaBDD library only supports physical domains up to
63 bits; contexts numbered beyond 2^63 were merged into a single context. The
large number of paths also caused the algorithm to require many more rule
applications to reach a fixpoint solution.

6.2    Analysis Times
   We measured the analysis times and memory usage for each of the algorithms
presented in this paper (Figure 4). The algorithm with call graph discovery,
in each iteration, computes a call graph based on the points-to relations
from the previous iteration. The number of iterations taken for that
algorithm is also included here.
   All timings reported are wall-clock times from a cold start, and include
the various overheads for Java garbage collection, BDD garbage collection,
growing the node table, etc. The memory numbers reported correspond to the
peak number of live BDD nodes during the course of the algorithm. We measured
peak BDD memory usage by setting the initial table size and maximum table
size increase to 1MB, and only allowed the table to grow if the node table
was more than 99% full after a garbage collection.⁵

⁵ To avoid garbage collections, it is recommended to use more memory. Our
timing runs use the default setting of 80%.