# Cloning-Based Context-Sensitive Pointer Alias Analysis Using Binary Decision Diagrams

←

**Page content transcription**

If your browser does not render page correctly, please read the page content below

Cloning-Based Context-Sensitive Pointer Alias Analysis Using Binary Decision Diagrams John Whaley Monica S. Lam Computer Science Department Stanford University Stanford, CA 94305 {jwhaley, lam}@stanford.edu ABSTRACT 1. INTRODUCTION This paper presents the first scalable context-sensitive, inclusion- Many applications of program analysis, such as program opti- based pointer alias analysis for Java programs. Our approach to mization, parallelization, error detection and program understand- context sensitivity is to create a clone of a method for every con- ing, need pointer alias information. Scalable pointer analyses text of interest, and run a context-insensitive algorithm over the ex- developed to date are imprecise because they are either context- panded call graph to get context-sensitive results. For precision, insensitive[3, 17, 19, 33] or unification-based[15, 16]. A context- we generate a clone for every acyclic path through a program’s call insensitive analysis does not distinguish between different calling graph, treating methods in a strongly connected component as a sin- contexts of a method and allows information from one caller to gle node. Normally, this formulation is hopelessly intractable as a propagate erroneously to another caller of the same method. In call graph often has 1014 acyclic paths or more. We show that these unification-based approaches, pointers are assumed to be either un- exponential relations can be computed efficiently using binary de- aliased or are pointing to the same set of locations[28]. In contrast, cision diagrams (BDDs). Key to the scalability of the technique is inclusion-based approaches are more efficient but also more expen- a context numbering scheme that exposes the commonalities across sive, as they allow two aliased pointers to point to overlapping but contexts. We applied our algorithm to the most popular applications different sets of locations. available on Sourceforge, and found that the largest programs, with We have developed a context-sensitive and inclusion-based hundreds of thousands of Java bytecodes, can be analyzed in under pointer alias analysis that scales to hundreds of thousands of Java 20 minutes. bytecodes. The analysis is field-sensitive, meaning that it tracks This paper shows that pointer analysis, and many other queries the individual fields of individual pointers. Our analysis is mostly and algorithms, can be described succinctly and declaratively using flow-insensitive, using flow sensitivity only in the analysis of local Datalog, a logic programming language. We have developed a sys- pointers in each function. The results of this analysis, as we show in tem called bddbddb that automatically translates Datalog programs this paper, can be easily used to answer users’ queries and to build into highly efficient BDD implementations. We used this approach more advanced analyses and programming tools. to develop a variety of context-sensitive algorithms including side effect analysis, type analysis, and escape analysis. 1.1 Cloning to Achieve Context Sensitivity Our approach to context sensitivity is based on the notion of Categories and Subject Descriptors cloning. Cloning conceptually generates multiple instances of a D.3.4 [Programming Languages]: Processors—Compilers; E.2 method such that every distinct calling context invokes a different [Data]: Data Storage Representations instance, thus preventing information from one context to flow to another. Cloning makes generating context-sensitive results algo- rithmically trivial: We can simply apply a context-insensitive algo- General Terms rithm to the cloned program to obtain context-sensitive results. Note Algorithms, Performance, Design, Experimentation, Languages that our analysis does not clone the code per se; it simply produces a separate answer for each clone. Keywords The context of a method invocation is often distinguished by its call path, which is simply the call sites, or return addresses, on the context-sensitive, inclusion-based, pointer analysis, Java, scalable, invocation’s call stack. In the case of a recursive program, there cloning, binary decision diagrams, program analysis, Datalog, logic are an unbounded number of calling contexts. To limit the number programming of calling contexts, Shivers proposed the concept of k-CFA (Con- trol Flow Analysis) whereby one remembers only the last k call sites[26]. Emami et al. suggested distinguishing contexts by their Permission to make digital or hard copies of all or part of this work for full call paths if they are acyclic. For cyclic paths, they suggested personal or classroom use is granted without fee provided that copies are including each call site in recursive cycles only once[14]. Our ap- not made or distributed for profit or commercial advantage and that copies proach also uses entire call paths to distinguish between contexts bear this notice and the full citation on the first page. To copy otherwise, to in programs without recursion. To handle recursion, call paths are republish, to post on servers or to redistribute to lists, requires prior specific reduced by eliminating all invocations whose callers and callees be- permission and/or a fee. PLDI’04, June 9–11, 2004, Washington, DC, USA. long to the same strongly connected component in the call graph. Copyright 2004 ACM 1-58113-807-5/04/0006 ...$5.00. These reduced call paths are used to identify contexts.

It was not obvious, at least to us at the beginning of this project, algorithms here directly. All the experimental results reported in that a cloning-based approach would be feasible. The number of re- this paper are obtained by running the BDD programs automatically duced call paths in a program grows exponentially with the number generated by bddbddb. of methods, and a cloning-based approach must compute the result Context-sensitive queries and other analyses. The context- of every one of these contexts. Emami et al. have only reported sensitive points-to results, the simple cloning-based approach to context-sensitive points-to results on small programs[14]. Realistic context sensitivity, and the bddbddb system make it easy to write programs have many contexts; for example, the megamek applica- new analyses. We show some representative examples in each of tion has over 1014 contexts (see Section 6.1). The size of the final the following categories: results alone appears to be prohibitive. 1. Simple queries. The results from our context-sensitive pointer We show that we can scale a cloning-based points-to analysis analysis provide a wealth of information of interest to pro- by representing the context-sensitive relations using ordered binary grammers. We show how a few lines of Datalog can help decision diagrams (BDDs)[6]. BDDs, originally designed for hard- programmers debug a memory leak and find potential secu- ware verification, have previously been used in a number of pro- rity vulnerabilities. gram analyses[2, 23, 38], and more recently for points-to analy- 2. Algorithms using context-sensitive points-to results. We show sis[3, 39]. We show that it is possible to compute context-sensitive how context-sensitive points-to results can be used to cre- points-to results for over 1014 contexts. ate advanced analyses. We include examples of a context- In contrast, most context-sensitive pointer alias analyses devel- sensitive analysis to compute side effects (mod-ref) and an oped to date are summary-based[15, 34, 37]. Parameterized sum- analysis to refine declared types of variables. maries are created for each method and used in creating the sum- maries of its callers. It is not necessary to represent the results for 3. Other context-sensitive algorithms. Cloning can be used to the exponentially many contexts explicitly with this approach, be- trivially generate other kinds of context-sensitive results be- cause the result of a context can be computed independently using sides points-to relations. We illustrate this with a context- the summaries. However, to answer queries as simple as “which sensitive type analysis and a context-sensitive thread escape variables point to a certain object” would require all the results analysis. Whereas previous escape analyses require thou- to be computed. The readers may be interested to know that, de- sands of lines of code to implement[34], the algorithm here spite much effort, we tried but did not succeed in creating a scalable has only seven Datalog rules. summary-based algorithm using BDDs. Experimental Results. We present the analysis time and mem- ory usage of our analyses across 21 of the most popular Java appli- 1.2 Contributions cations on Sourceforge. Our context-sensitive pointer analysis can The contributions of this paper are not limited to just an algorithm analyze even the largest of the programs in under 19 minutes. We for computing context-sensitive and inclusion-based points-to infor- also compare the precision of context-insensitive pointer analysis, mation. The methodology, specification language, representation, context-sensitive pointer analysis and context-sensitive type analy- and tools we used in deriving our pointer analysis are applicable to sis, and show the effects of merging versus cloning contexts. creating many other algorithms. We demonstrate this by using the 1.3 Paper Organization approach to create a variety of queries and algorithms. Scalable cloning-based context-sensitive points-to analysis Here is an overview of the rest of the paper. Section 2 ex- using BDDs. The algorithm we have developed is remarkably sim- plains our methodology. Using Berndl’s context-insensitive points- ple. We first create a cloned call graph where a clone is created to algorithm as an example, we explain how an analysis can be for every distinct calling context. We then run a simple context- expressed in Datalog and, briefly, how bddbddb translates Dat- insensitive algorithm over the cloned call graph to get context- alog into efficient BDD implementations. Section 3 shows how sensitive results. We handle the large number of contexts by rep- we can easily extend the basic points-to algorithm to discover resenting them in BDDs and using an encoding scheme that al- call graphs on the fly by adding a few Datalog rules. Section 4 lows commonalities among similar contexts to be exploited. We presents our cloning-based approach and how we use it to compute improve the efficiency of the algorithm by using an automatic tool context-sensitive points-to results. Section 5 shows the represen- that searches for an effective variable ordering. tative queries and algorithms built upon our points-to results and Datalog as a high-level language for BDD-based program the cloning-based approach. Section 6 presents our experimental analyses. Instead of writing our program analyses directly in terms results. We report related work in Section 7 and conclude in Sec- of BDD operations, we store all program information and results as tion 8. relations and express our analyses in Datalog, a logic programming language used in deductive databases[30]. Because Datalog is suc- 2. FROM DATALOG TO BDDS cinct and declarative, we can express points-to analyses and many In this section, we start with a brief introduction to Datalog. other algorithms simply and intuitively in just a few Datalog rules. We then show how Datalog can be used to describe the context- We use Datalog because its set-based operation semantics insensitive points-to analysis due to Berndl et al. at a high level. We matches the semantics of BDD operations well. To aid our al- then describe how our bddbddb system translates a Datalog pro- gorithm research, we have developed a deductive database system gram into an efficient implementation using BDDs. called bddbddb (BDD Based Deductive DataBase) that automati- cally translates Datalog programs into BDD algorithms. We provide 2.1 Datalog a high-level summary of the optimizations in this paper; the details We represent a program and all its analysis results as relations. are beyond the scope of this paper[35]. Conceptually, a relation is a two-dimensional table. The columns Our experience is that programs generated by bddbddb are faster are the attributes, each of which has a domain defining the set of than their manually optimized counterparts. More importantly, Dat- possible attribute values. The rows are the tuples of attributes that alog programs are orders-of-magnitude easier to write. They are so share the relation. If tuple (x, y, z) is in relation A, we say that succinct and easy to understand that we use them to explain all our predicate A(x, y, z) is true.

A Datalog program consists of a set of rules, written in a Prolog- allows bddbddb to communicate with the users with meaningful style notation, where a predicate is defined as a conjunction of other names. A relation declaration has an optional keyword specifying predicates. For example, the Datalog rule whether it is an input or output relation, the name of the relation, and the name and domain of every attribute. A relation declared D(w, z) :− A(w, x), B(x, y), C(y, z). as neither input nor output is a temporary relation generated in the says that “D(w, z) is true if A(w, x), B(x, y), and C(y, z) are all analysis but not written out. Finally, the rules follow the standard true.” Variables in the predicates can be replaced with constants, Datalog syntax. The rule numbers, introduced here for the sake of which are surrounded by double-quotes, or don’t-cares, which are exposition, are not in the actual program. signified by underscores. Predicates on the right side of the rules We can express all information found in the intermediate repre- can be inverted. sentation of a program as relations. To avoid inundating readers Datalog is more powerful than SQL, which is based on relational with too many definitions all at once, we define the relations as they calculus, because Datalog predicates can be recursively defined[30]. are used. The domains and relations used in Algorithm 1 are: If none of the predicates in a Datalog program is inverted, then there is a guaranteed minimal solution consisting of relations with the V is the domain of variables. It represents all the allocation sites, least number of tuples. Conversely, programs with inverted pred- formal parameters, return values, thrown exceptions, cast op- icates may not have a unique minimal solution. Our bddbddb erations, and dereferences in the program. There is also a system accepts a subclass of Datalog programs, known as strati- special global variable for use in accessing static variables. fied programs[7], for which minimal solutions always exist. Infor- H is the domain of heap objects. Heap objects are named by the mally, rules in such programs can be grouped into strata, each with invocation sites of object creation methods. To increase pre- a unique minimal solution, that can be solved in sequence. cision, we also statically identify factory methods and treat them as object creation methods. 2.2 Context-Insensitive Points-to Analysis F is the domain of field descriptors in the program. Field descrip- We now review Berndl et al.’s context-insensitive points-to anal- tors are used when loading from a field (v2 = v1 .f;) or ysis[3], while also introducing the Datalog notation. This algorithm storing to a field (v1 .f = v2 ;). There is a special field de- assumes that a call graph, computed using simple class hierarchy scriptor to denote an array access. analysis[13], is available a priori. Heap objects are named by their vP 0 : V × H is the initial variable points-to relation extracted allocation sites. The algorithm finds the objects possibly pointed to from object allocation statements in the source program. by each variable and field of heap objects in the program. Shown in vP 0 (v, h) means there is an invocation site h that assigns Algorithm 1 is the exact Datalog program, as fed to bddbddb, that a newly allocate object to variable v. implements Berndl’s algorithm. To keep the first example simple, store: V × F × V represents store statements. store(v1 , f, v2 ) we defer the discussion of using types to improve precision until says that there is a statement “v1 .f = v2 ;” in the program. Section 2.3. load : V × F × V represents load statements. load (v1 , f, v2 ) says A LGORITHM 1. Context-insensitive points-to analysis with a that there is a statement “v2 = v1 .f;” in the program. precomputed call graph. assign: V × V is the assignments relation due to passing of argu- ments and return values. assign(v1 , v2 ) means that variable D OMAINS v1 includes the points-to set of variable v2 . Although we do not cover return values here, they work in an analogous man- V 262144 variable.map ner. H 65536 heap.map vP : V × H is the output variable points-to relation. vP (v, h) F 16384 field.map means that variable v can point to heap object h. R ELATIONS hP : H × F × H is the output heap points-to relation. hP (h1 , f, h2 ) means that field f of heap object h1 can point input vP 0 (variable : V, heap : H) to heap object h2 . input store (base : V, field : F, source : V) input load (base : V, field : F, dest : V) Note that local variables and their assignments are factored away input assign (dest : V, source : V) using a flow-sensitive analysis[33]. The assign relation is derived output vP (variable : V, heap : H) by using a precomputed call graph. The sizes of the domains are output hP (base : H, field : F, target : H) determined by the number of variables, heap objects, and field de- RULES scriptors in the input program. vP (v, h) :− vP 0 (v, h). (1) Rule (1) incorporates the initial variable points-to relations into vP . Rule (2) finds the transitive closure over inclusion edges. If vP (v1 , h) :− assign(v1 , v2 ), vP (v2 , h). (2) v1 includes v2 and variable v2 can point to object h, then v1 can hP (h1 , f, h2 ) :− store(v1 , f, v2 ), also point to h. Rule (3) models the effect of store instructions on vP (v1 , h1 ), vP (v2 , h2 ). (3) heap objects. Given a statement “v1 .f = v2 ;”, if v1 can point to h1 and v2 can point to h2 , then h1 .f can point to h2 . Rule (4) vP (v2 , h2 ) :− load (v1 , f, v2 ), resolves load instructions. Given a statement “v2 = v1 .f;”, if v1 vP (v1 , h1 ), hP (h1 , f, h2 ). (4) can point to h1 and h1 .f can point to h2 , then v2 can point to h2 . 2 Applying these rules until the results converge finds all the possible context-insensitive points-to relations in the program. A Datalog program has three sections: domains, relations, and rules. A domain declaration has a name, a size n, and an optional 2.3 Improving Points-to Analysis with Types file name that provides a name for each element in the domain, inter- Because Java is type-safe, variables can only point to objects of nally represented as an ordinal number from 0 to n − 1. The latter assignable types. Assignability is similar to the subtype relation,

A LGORITHM 2. Context-insensitive points-to analysis with 2.4 Translating Datalog into Efficient BDD type filtering. Implementations D OMAINS We first describe how Datalog rules can be translated into opera- Domains from Algorithm 1, plus: tors from relational algebra such as “join” and “project”, then show how to translate these operations into BDD operations. T 4096 type.map 2.4.1 Query Resolution R ELATIONS We can find the solution to an unstratified query, or a stratum of Relations from Algorithm 1, plus: a stratified query, simply by applying the inference rules repeatedly input vT (variable : V, type : T) until none of the output relations change. We can apply a Datalog input hT (heap : H, type : T) rule by performing a series of relational natural join, project and re- input aT (supertype : T, subtype : T) name operations. A natural join operation combines rows from two vPfilter (variable : V, heap : H) relations if the rows share the same value for a common attribute. A project operation removes an attribute from a relation. A rename RULES operation changes the name of an attribute to another one. vPfilter (v, h) :− vT (v, tv ), hT (h, th ), aT (tv , th ). (5) For example, the application of Rule (2) can be implemented as: vP (v, h) :− vP 0 (v, h). (6) t1 = rename(vP , variable, source); vP (v1 , h) :− assign(v1 , v2 ), vP (v2 , h), t2 = project(join(assign, t1 ), source); vPfilter (v1 , h). (7) vP = vP ∪ rename(t2 , dest, variable); hP (h1 , f, h2 ) :− store(v1 , f, v2 ), We first rename the attribute in relation vP from variable to source vP (v1 , h1 ), vP (v2 , h2 ). (8) so that it can be joined with relation assign to create a new points-to vP (v2 , h2 ) :− load (v1 , f, v2 ), vP (v1 , h1 ), relation. The attribute dest of the resulting relation is changed to hP (h1 , f, h2 ), vPfilter (v2 , h2 ). (9) variable so that the tuples can be added to the vP tuples accumu- lated thus far. 2 The bddbddb system uses the three following optimizations to speed up query resolution. with allowances for interfaces, null values, and arrays[22]. By drop- Attributes naming. Since the names of the attributes must match ping targets of unassignable types in assignments and load state- when two relations are joined, the choice of attribute names can ments, we can eliminate many impossible points-to relations that affect the costs of rename operations. Since the renaming cost is result from the imprecision of the analysis. 1 highly sensitive to how the relations are implemented, the bddbddb Adding type filtering to Algorithm 1 is simple in Datalog. We system takes the representation into account when minimizing the add a new domain to represent types and new relations to repre- renaming cost. sent assignability as well as type declarations of variables and heap Rule application order. A rule needs to be applied only if the objects. We compute the type filter and modify the rules in Algo- input relations have changed. bddbddb optimizes the ordering of rithm 1 to filter out unsafe assignments and load operations. the rules by analyzing the dependences between the rules. For ex- ample, Rule 1 in Algorithm 1 does not depend on any of the other T is the domain of type descriptors (i.e. classes) in the program. rules and can be applied only once at the beginning of the query resolution. vT : V × T represents the declared types of variables. vT (v, t) Incrementalization. We only need to re-apply a rule on those means that variable v is declared with type t. combinations of tuples that we have not seen before. Such a tech- hT : H × T represents the types of objects created at a particular nique is known as incrementalization in the BDD literature and creation site. In Java, the type created by a new instruction semi-naı̈ve fixpoint evaluation in the database literature[1]. Our is usually known statically.2 hT (h, t) means that the object system also identifies loop-invariant relations to avoid unnecessary created at h has type t. difference and rename operations. Shown below is the result of in- aT : T × T is the relation of assignable types. aT (t1 , t2 ) means crementalizing the repeated application of Rule (2): that type t2 is assignable to type t1 . vPfilter : V × H is the type filter relation. vPfilter (v, h) means d = vP ; that it is type-safe to assign heap object h to variable v. repeat t1 = rename(d, variable, source); Rule (5) in Algorithm 2 defines the vPfilter relation: It is type- t2 = project(join(assign, t1 ), source); safe to assign heap object h of type th to variable v of type tv if tv d0 = rename(t2 , dest, variable); is assignable from th . Rules (6) and (8) are the same as Rules (1) d = d0 − vP ; and (3) in Algorithm 1. Rules (7) and (9) are analogous to Rules (2) vP = vP ∪ d; and (4), with the additional constraint that only points-to relations until d == ∅; that match the type filter are inserted. 1 We could similarly perform type filtering on stores into heap ob- 2.4.2 Relational Algebra in BDD jects. However, because all stores must go through variables, such a We now explain how BDDs work and how they can be used to type filter would only catch one extra case — when the base object is a null constant. implement relations and relational operations. BDDs (Binary Deci- 2 The type of a created object may not be known precisely if, for sion Diagrams) were originally invented for hardware verification to example, the object is returned by a native method or reflection is efficiently store a large number of states that share many common- used. Such types are modeled conservatively as all possible types. alities[6]. They are an efficient representation of boolean functions.

A BDD is a directed acyclic graph (DAG) with a single root node A LGORITHM 3. Context-insensitive points-to analysis that and two terminal nodes, representing the constants one and zero. computes call graph on the fly. Each non-terminal node in the DAG represents an input variable D OMAINS and has exactly two outgoing edges: a high edge and a low edge. Domains from Algorithm 2, plus: The high edge represents the case where the input variable for the node is true, and the low outgoing edge represents the case where I 32768 invoke.map the input variable is false. On any path in the DAG from the root N 4096 name.map to a terminal node, the value of the function on the truth values on M 16384 method.map the input variables in the path is given by the value of the terminal Z 256 node. To evaluate a BDD for a specific input, one simply starts at the root node and, for each node, follows the high edge if the input R ELATIONS variable is true, and the low edge if the input variable is false. The Relations from Algorithm 2, with the modification that assign is value of the terminal node that we reach is the value of the BDD for now a computed relation, plus: that input. input cha (type : T, name : N, target : M) The variant of BDDs that we use are called ordered binary de- input actual (invoke : I, param : Z, var : V) cision diagrams, or OBDDs[6]. “Ordered” refers to the constraint input formal (method : M, param : Z, var : V) that on all paths through the graph the variables respect a given lin- input IE 0 (invoke : I, target : M) ear order. In addition, OBDDs are maximally reduced meaning that input mI (method : M, invoke : I) nodes with the same variable name and low and high successors are output IE (invoke : I, target : M) collapsed as one, and nodes with identical low and high successors are bypassed. Thus, the more commonalities there are in the paths RULES leading to the terminals, the more compact the OBDDs are. Accord- Rules from Algorithm 2, plus: ingly, the amount of the sharing and the size of the representation IE (i, m) :− IE 0 (i, m). (10) depends greatly on the ordering of the variables. We can use BDDs to represent relations as follows. Each element IE (i, m2 ) :− mI (m1 , i, n), actual (i, 0, v), d in an n-element domain D is represented as an integer between vP (v, h), hT (h, t), cha(t, n, m2 ). (11) 0 and n − 1 using log2 (n) bits. A relation R : D1 × . . . × Dn is assign(v1 , v2 ) :− IE (i, m), formal (m, z, v1 ), represented as a boolean function f : D1 ×. . .×Dn → {0, 1} such actual (i, z, v2 ). (12) that (d1 , . . . , dn ) ∈ R iff f (d1 , . . . , dn ) = 1, and (d1 , . . . , dn ) ∈ / R iff f (d1 , . . . , dn ) = 0. 2 A number of highly-optimized BDD packages are available[21, 27]; the operations they provide can be used directly to implement I is the domain of invocation sites in the program. An invocation relational operations efficiently. For example, the “replace” oper- site is a method invocation of the form r = p0 .m(p1 . . . ation in BDD has the same semantics as the “rename” operation; pk ). Note that H ⊆ I. the “relprod” operation in BDD finds the natural join between two N is the domain of method names used in invocations. In an invo- relations and projects away the common domains. cation r = p0 .n(p1 . . . pk ), n is the method name. Let us now use a concrete example to illustrate the signif- icance of variable ordering. Suppose relation R1 contains tu- M is the domain of implemented methods in the program. It does ples (1, 1), (2, 1), . . . , (100, 1) and relation R2 contains tuples not include abstract or interface methods. (1, 2), (2, 2), . . . , (100, 2). If in the variable order the bits for the Z is the domain used for numbering parameters. first attribute come before the bits for the second, the BDD will cha: T × N × M encodes virtual method dispatch information need to represent the sequence 1, . . . , 100 separately for each rela- from the class hierarchy. cha(t, n, m) means that m is the tion. However, if instead the bits for the second attribute come first, target of dispatching the method name n on type t. the BDD can share the representation for the sequence 1, . . . , 100 actual : I × Z × V encodes the actual parameters for invocation between R1 and R2 . Unfortunately, the problem of finding the best sites. actual (i, z, v) means that v is passed as parameter variable ordering is NP-complete[5]. Our bddbddb system auto- number z at invocation site i. matically explores different alternatives empirically to find an ef- formal : M × Z × V encodes formal parameters for methods. fective ordering[35]. formal (m, z, v) means that formal parameter z of method m is represented by variable v. 3. CALL GRAPH DISCOVERY IE 0 : I × M are the initial invocation edges. They record the invo- The call graph generated using class hierarchy analysis can have cation edges whose targets are statically bound. In Java, some many spurious call targets, which can lead to many spurious points- calls are static or non-virtual. Additionally, local type analy- to relations[19]. We can get more precise results by creating the call sis combined with analysis of the class hierarchy allows us to graph on the fly using points-to relations. As the algorithm gener- determine that some calls have a single target[13]. IE 0 (i, m) ates points-to results, they are used to identify the receiver types of means that invocation site i can be analyzed statically to call the methods invoked and to bind calls to target methods; and as call method m. graph edges are discovered, we use them to find more points-to re- mI : M × I × N represents invocation sites. mI (m, i, n) means lations. The algorithm converges when no new call targets and no that method m contains an invocation site i with virtual new pointer relations are found. method name n. Non-virtual invocation sites are given a spe- Modifying Algorithm 2 to discover call graphs on the fly is sim- cial null method name, which does not appear in the cha re- ple. Instead of an input assign relation computed from a given call lation. graph, we derive it from method invocation statements and points-to IE : I × M is an output relation encoding all invocation edges. relations. IE (i, m) means that invocation site i calls method m.

The rules in Algorithm 3 compute the assign relation used in (a) M1 (b) M1 Algorithm 2. Rules (10) and (11) find the invocation edges, with 1 1 a b a b the former handling statically bound targets and the latter handling 1 2 1 2 virtual calls. Rule (11) matches invocation sites with the type of d the “this” pointer and the class hierarchy information to find the M2 c M3 M2 M3 M2 M3 possible target methods. If an invocation site i with method name n 1−2 e 1−2 f 1−2 g e f g e f g is invoked on variable v, and v can point to h and h has type t, and 1−2 3−4 1−2 1 3 1 2 4 2 invoking n on type t leads to method m, then m is a possible target M4 M5 M4 M4 M5 M4 M4 M5 of invocation i. 1−4 h i 1−2 Rule (12) handles parameter passing.3 If invocation site i has a h h i h h i target method m, variable v2 is passed as argument number z, and 1−4 5−6 1 3 5 2 4 6 the formal parameter z of method m is v1 , then the points-to set M6 M6 M6 M6 M6 M6 M6 of v1 includes the points-to set of v2 . Return values are handled in a likewise manner, only the inclusion relation is in the opposite Figure 1: Example of path numbering. The graph on the left is direction. We see that as the discovery of more variable points- the original graph. Nodes M2 and M3 are in a cycle and there- to (vP ) can create more invocation edges (IE ), which in turn can fore are placed in one equivalence class. Each edge is marked create more assignments (assign) and more points-to relations. The with path numbers at the source and target of the edge. The algorithm converges when all the relations stabilize. graph on the right is the graph with all of the paths expanded. 4. CONTEXT SENSITIVE POINTS-TO Call paths Reduced call paths A context-insensitive or monomorphic analysis produces just one reaching M6 reaching M6 set of results for each method regardless how many ways a method a(cd)∗ eh aeh may be invoked. This leads to imprecision because information b(dc)∗ deh beh from different calling contexts must be merged, so information a(cd)∗ cf h af h along one calling context can propagate to other calling contexts. b(dc)∗ f h bf h A context-sensitive or polymorphic analysis avoids this imprecision a(cd)∗ cgi agi by allowing different contexts to have different results. b(dc)∗ gi bgi We can make a context-sensitive version of a context-insensitive analysis as follows. We make a clone of a method for each path through the call graph, linking each call site to its own unique clone. Figure 2: The six contexts of function M6 in Example 1 We then run the original context-insensitive analysis over the ex- ploded call graph. However, this technique can require an exponen- tial (and in the presence of cycles, potentially unbounded) number an entry method, typically main4 , and ik is an invocation site in of clones to be created. method mk−1 for all k > 1. It has been observed that different contexts of the same method For programs without recursion, every call path to a method de- often have many similarities. For example, parameters to the same fines a context for that method. To handle recursive programs, method often have the same types or similar aliases. This obser- which have an unbounded number of call paths, we first find the vation led to the concept of partial transfer functions (PTF), where strongly connected components (SCCs) in a call graph. By elimi- summaries for each input pattern are created on the fly as they are nating all method invocations whose caller and callee belong to the discovered[36, 37]. However, PTFs are notoriously difficult to im- same SCC from the call paths, we get a finite set of reduced call plement and get correct, as the programmer must explicitly calcu- paths. Each reduced call path to an SCC defines a context for the late the input patterns and manage the summaries. Furthermore, the methods in the SCC. Thus, information from different paths lead- technique has not been shown to scale to very large programs. ing to the SCCs are kept separate, but the methods within the SCC Our approach is to allow the exponential explosion to occur and invoked with the same incoming call path are analyzed context- rely on the underlying BDD representation to find and exploit the insensitively. commonalities across contexts. BDDs can express large sets of re- dundant data in an efficient manner. Contexts with identical infor- E XAMPLE 1. Figure 1(a) shows a small call graph with just six mation will automatically be shared at the data structure level. Fur- methods and a set of invocation edges. Each invocation edge has a thermore, because BDDs operate down at the bit level, it can even name, being one of a through i; its source is labeled by the context exploit commonalities between contexts with different information. number of the caller and its sink by the context number of the callee. BDD operations operate on entire relations at a time, rather than The numbers will be explained in Example 2. Methods M2 and one tuple at a time. Thus, the cost of BDD operations depends on M3 belong to a strongly connected component, so invocations along the size and shape of the BDD relations, which depends greatly on edges c and d are eliminated in the computation of reduced call the variable ordering, rather than the number of tuples in a relation. graphs. While there are infinitely many call paths reaching method Also, due to caching in BDD packages, identical subproblems only M6 , there are only six reduced call paths reaching M6 , as shown in have to be computed once. Thus, with the right variable ordering, Figure 2. Thus M6 has six clones, one for each reduced call path. the results for all contexts can be computed very efficiently. Under this definition of context sensitivity, large programs can 4.1 Numbering Call Paths have many contexts. For example, pmd from our test programs has A call path is a sequence of invocation edges 1971 methods and 1023 contexts! In the BDD representation, we (i1 , m1 ), (i2 , m2 ), . . . , such that i1 is an invocation site in give each reduced call path reaching a method a distinct context 3 4 We also match thread objects to their corresponding run() meth- Other “entry” methods in typical programs are static class initial- ods, even though the edges do not explicitly appear in the call graph. izers, object finalizers, and thread run methods.

number. It is important to find a context numbering scheme that 4.2 Context-Sensitive Pointer Analysis with a allows the BDDs to share commonalities across contexts. Algo- Pre-computed Call Graph rithm 4 shows one such scheme. We are now ready to present our context-sensitive pointer analy- sis. We assume the presence of a pre-computed call graph created, A LGORITHM 4. Generating context-sensitive invocation edges for example, by using a context-insensitive points-to analysis (Al- from a call graph. gorithm 3). We apply Algorithm 4 to the call graph to generate I NPUT: A call multigraph. the context-sensitive invocation edges IE C . Once that is created, we can simply apply a context-insensitive points-to analysis on the O UTPUT: Context-sensitive invocation edges IE C : C × I × C × M, exploded call graph to get context-sensitive results. We keep the where C is the domain of context numbers. IE C (c, i, cm , m) means results separate for each clone by adding a context number to meth- that invocation site i in context c calls method m in context cm . ods, variables, invocation sites, points-to relations, etc. M ETHOD: A LGORITHM 5. Context-sensitive points-to analysis with a pre- computed call graph. 1. A method with n clones will be given numbers 1, . . . , n. Nodes with no predecessors are given a singleton context D OMAINS numbered 1. Domains from Algorithm 2, plus: 2. Find strongly connected components in the input call graph. C 9223372036854775808 The ith clone of a method always calls the ith clone of another method belonging to the same component. R ELATIONS Relations from Algorithm 2, plus: 3. Collapse all methods in a strongly connected component to a single node to get an acyclic reduced graph. input IE C (caller : C, invoke : I, callee : C, tgt : M) assignC (destc : C, dest : V, srcc : C, src : V) 4. For each node n in the reduced graph in topological order, output vP C (context : C, variable : V, heap : H) Set the counts of contexts created, c, to 0. RULES For each incoming edge, If the predecessor of the edge p has k contexts, vPfilter (v, h) : − vT (v, tv ), hT (h, th ), aT (tv , th ). (13) create k clones of node n, vP C (c, v, h) : − vP 0 (v, h), IE C (c, h, , ). (14) Add tuple (i, p, i + c, n) to IE C , for 1 ≤ i ≤ k, vP C (c1 , v1 , h) : − assignC (c1 , v1 , c2 , v2 ), c = c + k. 2 vP C (c2 , v2 , h), vPfilter (v1 , h). (15) hP (h1 , f, h2 ) : − store(v1 , f, v2 ), E XAMPLE 2. We now show the results of applying Algorithm 4 vP C (c, v1 , h1 ), vP C (c, v2 , h2 ). (16) to Example 1. M1 , the root node, is given context number 1. We vP C (cv , v2 , h2 ) : − load (v1 , f, v2 ), vP C (cv , v1 , h1 ), shall visit the invocation edges from left to right. Nodes M2 and hP (h1 , f, h2 ), vPfilter (v2 , h2 ). (17) M3 , being members of a strongly connected component, are repre- sented as one node. The strongly connected component is reached assignC (c1 , v1 , c2 , v2 ) by two edges from M1 . Since M1 has only one context, we create : − IE C (c1 , i, c2 , m), formal (m, z, v1 ), two clones, one reached by each edge. For method M4 , the pre- actual (i, z, v2 ). (18) decessor on each of the two incoming edges has two contexts, thus 2 M4 has four clones. Method M5 has two clones, one for each clone that invokes M5 . Finally, method M6 has six clones: Clones 1-4 C is the domain of context numbers. Our BDD library uses signed of method M4 invoke clones 1-4 and clones 1-2 of method M5 call 64-bit integers to represent domains, so the size is limited to clones 5-6, respectively. The cloned graph is shown in Figure 1(b). 263 . IE C : C × I × C × M is the set of context-sensitive invocation The numbering scheme used in Algorithm 4 plays up the edges. IE C (c, i, cm , m) means that invocation site i in con- strengths of BDDs. Each method is assigned a contiguous range of text c calls method m in context cm . This relation is com- contexts, which can be represented efficiently in BDDs. The con- puted using Algorithm 4. texts of callees can be computed simply by adding a constant to the assignC : C × V × C × V is the context-sensitive version of the contexts of the callers; this operation is also cheap in BDDs. Be- assign relation. assignC (c1 , v1 , c2 , v2 ) means variable v1 in cause the information for contexts that share common tail sequences context c1 includes the points-to set of variable v2 in context are likely to be similar, this numbering allows the BDD data struc- v2 due to parameter passing. Again, return values are handled ture to share effectively across common contexts. For example, the analogously. sequentially-numbered clones 1 and 2 of M6 both have a common vP C : C × V × H is the context-sensitive version of the variable tail sequence eh. Because of this, the contexts are likely to be sim- points-to relation (vP ). vP C (c, v, h) means variable v in con- ilar and therefore the BDD can take advantage of the redundancies. text c can point to heap object h. To optimize the creation of the cloned invocation graph, we have defined a new primitive that creates a BDD representation of con- Rule (18) interprets the context-sensitive invocation edges to find tiguous ranges of numbers in O(k) operations, where k is the num- the bindings between actual and formal parameters. The rest of ber of bits in the domain. In essence, the algorithm creates one BDD the rules are the context-sensitive counterparts to those found in to represent numbers below the upper bound, and one to represent Algorithm 2. numbers above the lower bound, and computes the conjunction of Algorithm 5 takes advantage of a pre-computed call graph to cre- these two BDDs. ate an efficient context numbering scheme for the contexts. We can

compute the call graph on the fly while enjoying the benefit of the vuln(c, i) :− IE (i, “PBEKeySpec.init()”), numbering scheme by numbering all the possible contexts with a actual (i, 1, v), vP C (c, v, h), conservative call graph, and delaying the generation of the invoca- fromString(h). tion edges only if warranted by the points-to results. We can reduce Notice that this query does not only find cases where the the iterations necessary by exploiting the fact that many of the in- object derived from a String is immediately supplied to vocation sites of a call graph created by a context-insensitive anal- PBEKeySpec.init(). This query will also identify cases where the ysis have single targets. Such an algorithm has an execution time object has passed through many variables and heap objects. similar to Algorithm 5, but is of primarily academic interest as the call graph rarely improves due to the extra precision from context- 5.3 Type Refinement sensitive points-to information. Libraries are written to handle the most general types of objects 5. QUERIES AND OTHER ANALYSES possible, and their full generality is typically not used in many ap- plications. By analyzing the actual types of objects used in an ap- The algorithms in sections 2, 3 and 4 generate vast amounts of re- plication, we can refine the types of the variables and object fields. sults in the form of relations. Using the same declarative program- Type refinement can be used to reduce overheads in cast operations, ming interface, we can conveniently query the results and extract resolve virtual method calls, and gain better understanding of the exactly the information we are interested in. This section shows a program. variety of queries and analyses that make use of pointer information We say that variable v can be legally declared as t, written and context sensitivity. varSuperTypes(v, t), if t is a supertype of the types of all the ob- 5.1 Debugging a Memory Leak jects v can point to. The type of a variable is refinable if the variable can be declared to have a more precise type. To compute the super Memory leaks can occur in Java when a reference to an object types of v, we first find varExactTypes(v, t), the types of objects remains even after it will no longer be used. One common approach pointed to by v. We then intersect the supertypes of all the exact of debugging memory leaks is to use a dynamic tool that locates the types to get the desired solution; we do so in Datalog by finding the allocation sites of memory-consuming objects. Suppose that, upon complement of the union of the complement of the exact types. reviewing the information, the programmer thinks objects allocated in line 57 in file a.java should have been freed. He may wish to varExactTypes(v, t) :− vP C ( , v, h), hT (h, t). know which objects may be holding pointers to the leaked objects, notVarType(v, t) :− varExactTypes(v, tv ), ¬aT (t, tv ). and which operations may have stored the pointers. He can consult the static analysis results by supplying the queries: varSuperTypes(v, t) :− ¬notVarType(v, t). whoPointsTo57 (h, f ) :− hP (h, f, “a.java : 57”). refinable(v, tc ) :− vT (v, td ), varSuperTypes(v, tc ), aT (td , tc ), td 6= tc . whoDunnit(c, v1 , f, v2 ) :− store(v1 , f, v2 ), vP C (c, v2 , “a.java : 57”). The above shows a context-insensitive type refinement query. We find, for each variable, the type to which it can be refined regardless The first query finds the objects and their fields that may point of the context. Even if the end result is context-insensitive, it is more to objects allocated at ”a.java:57”; the second finds the store precise to take advantage of the context-sensitive points-to results instructions, and the contexts under which they are executed, that available to determine the exact types, as shown in the first rule. create the references. In Section 6.3, we compare the accuracy of this context-insensitive 5.2 Finding a Security Vulnerability query with a context-sensitive version. The Java Cryptography Extension (JCE) is a library of crypto- graphic algorithms[29]. Misuse of the JCE API can lead to security 5.4 Context-Sensitive Mod-Ref Analysis vulnerabilities and a false sense of security. For example, many Mod-ref analysis is used to determine what fields of what objects operations in the JCE use a secret key that must be supplied by the may be modified or referenced by a statement or call site[18]. We programmer. It is important that secret keys be cleared after they are can use the context-sensitive points-to results to solve a context- used so they cannot be recovered by attackers with access to mem- sensitive version of this query. We define mV (m, v) to mean that v ory. Since String objects are immutable and cannot be cleared, is a local variable in m. The mV ∗C relation specifies the set of vari- secret keys should not be stored in String objects but in an array ables and contexts of methods that are transitively reachable from of characters or bytes instead. a method. mV ∗C (c1 , m, c2 , v) means that calling method m with To guard against misuse, the function that accepts the secret key, context c1 can transitively call a method with local variable v under PBEKeySpec.init(), only allows arrays of characters or bytes as context c2 . input. However, a programmer not versed in security issues may mV ∗C (c, m, c, v) :− mV (m, v). have stored the key in a String object and then use a routine ∗ mV C (c1 , m1 , c3 , v3 ) :− mI (m1 , i), IE C (c1 , i, c2 , m2 ), in the String class to convert it to an array of characters. We mV ∗C (c2 , m2 , c3 , v3 ). can write a query to audit programs for the presence of such id- ioms. Let Mret(m, v) be an input relation specifying that vari- The first rule simply says that a method m in context c can reach able v is the return value of method m. We define a relation its local variable. The second rule says that if method m1 in context fromString(h) which indicates if the object h was directly derived c1 calls method m2 in context c2 , then m1 in context c1 can also from a String. Specifically, it records the objects that are re- reach all variables reached by method m2 in context c2 . turned by a call to a method in the String class. An invocation We can now define the mod and ref set of a method as follows: i to method PBEKeySpec.init() is a vulnerability if the first argu- mod (c, m, h, f ) :− mV ∗C (c, m, cv , v), ment points to an object derived from a String. store(v, f, ), vP C (cv , v, h). fromString(h) :− cha(“String”, , m), Mret(m, v), ref (c, m, h, f ) :− mV ∗C (c, m, cv , v), vP C ( , v, h). load (v, f, ), vP C (cv , v, h).

The first rule says that if method m in context c can reach a vari- 5.6 Thread Escape Analysis able v in context cv , and if there is a store through that variable to Our last example is a thread escape analysis, which determines field f of object h, then m in context c can modify field f of object if objects created by one thread may be used by another. The re- h. The second rule for defining the ref relations is analogous. sults of the analysis can be used for optimizations such as synchro- 5.5 Context-Sensitive Type Analysis nization elimination and allocating objects in thread-local heaps, as well as for understanding programs and checking for possible race Our cloning technique can be applied to add context sensitivity conditions due to missing synchronizations[8, 34]. This example to other context-insensitive algorithms. The example we show here illustrates how we can vary context sensitivity to fit the needs of the is the type inference of variables and fields. By not distinguish- analysis. ing between instances of heap objects, this analysis does not gener- We say that an object allocated by a thread has escaped if it may ate results as precise as those extracted from running the complete be accessed by another thread. This notion is stronger than most context-sensitive pointer analysis as discussed in Section 5.3, but is other formulations where an object is said to escape if it can be much faster. reached by another thread[8, 34]. The basic type analysis is similar to 0-CFA[26]. Each variable Java threads, being subclasses of java.lang.Thread, are and field in the program has a set of concrete types that it can re- identified by their creation sites. In the special case where a thread fer to. The sets are propagated through calls, returns, loads, and creation can execute only once, a thread can simply be named by stores. By using the path numbering scheme in Algorithm 4, we the creation site. The thread that exists at virtual machine startup can convert this basic analysis into one which is context-sensitive— is an example of a thread that can only be created once. A creation in essence, making the analysis into a k-CFA analysis where k is site reached via different call paths or embedded in loops or recur- the depth of the call graph and recursive cycles are collapsed. sive cycles may generate multiple threads. To distinguish between A LGORITHM 6. Context-sensitive type analysis. thread instances created at the same site, we create two thread con- texts to represent two separate thread instances. If an object created D OMAINS by one instance is not accessed by its clone, then it is not accessed Domains from Algorithm 5 by any other instances created by the same call site. This scheme creates at most twice as many contexts as there are thread creation R ELATIONS sites. Relations from Algorithm 5, plus: We clone the thread run() method, one for each thread context, output vT C (context : C, variable : V, type : T) and place these clones on the list of entry methods to be analyzed. output fT (field : F, target : T) Methods (transitively) invoked by a context’s run() method all vTfilter (variable : V, type : T) inherit the same context. A clone of a method not only has its own cloned variables, but also its own cloned object creation sites. In RULES this way, objects created by separate threads are distinct from each vTfilter (v, t) : − vT (v, tv ), aT (tv , t). (19) other. We run a points-to analysis over this slightly expanded call vT C (c, v, t) : − vP 0 (v, h), IE C (c, h, , ), hT (h, t).(20) graph; an object created in a thread context escapes if it is accessed by variables in another thread context. vT C (cv1 , v1 , t) : − assignC (cv1 , v1 , cv2 , v2 ), vT C (cv2 , v2 , t), vTfilter (v1 , t). (21) A LGORITHM 7. Thread-sensitive pointer analysis. fT (f, t) : − store( , f, v2 ), vT C ( , v2 , t). (22) D OMAINS vT C ( , v, t) : − load ( , f, v), fT (f, t), Domains from Algorithm 5 vTfilter (v, t). (23) R ELATIONS assignC (c1 , v1 , c2 , v2 ) Relations from Algorithm 2, plus: : − IE C (c1 , i, c2 , m), formal (m, z, v1 ), actual (i, z, v2 ). (24) input HT (c : C, heap : H) input vP 0 T (cv : C, variable : V, ch : C, heap : H) 2 output vP T (cv : C, variable : V, ch : C, heap : H) vT C : C × V × T is the context-sensitive variable type relation. output hP T (cb : C, base : H, field : F, ct : C, target : H) vT C (c, v, t) means that variable v in context cv can refer to RULES an object of type t. This is the analogue of vP C in the points- to analysis. vPfilter (v, h) : −vT (v, tv ), hT (h, th ), aT (tv , th ). (25) fT : F × T is the field type relation. fT (f, t) means that field f vP T (c1 , v, c2 , h) : −vP 0 T (c1 , v, c2 , h). (26) can point to an object of type t. vP T (c, v, c, h) : −vP 0 (v, h), HT (c, h). (27) vTfilter : V × T is the type filter relation. vTfilter (v, t) means that it is type-safe to assign an object of type t to variable v. vP T (c2 , v1 , ch , h) : −assign(v1 , v2 ), vP T (c2 , v2 , ch , h), vPfilter (v1 , h). (28) Rule (20) initializes the vT C relation based on the initial local hP T (c1 , h1 , f, c2 , h2 ): −store(v1 , f, v2 ), vP T (c, v1 , c1 , h1 ), points-to information contained in vP 0 , combining it with hT to get vP T (c, v2 , c2 , h2 ). (29) the type and IE C to get the context numbers. Rule (21) does tran- vP T (c, v2 , c2 , h2 ) : −load (v1 , f, v2 ), vP T (c, v1 , c1 , h1 ), sitive closure on the vT C relation, filtering with vTfilter to enforce hP T (c1 , h1 , f, c2 , h2 ), type safety. Rules (22) and (23) handle stores and loads, respec- vPfilter (v2 , h2 ). (30) tively. They differ from their counterparts in the pointer analysis in that they do not use the base object, only the field. Rule (24) models 2 the effects of parameter passing in a context-sensitive manner.

HT : C × H encodes the non-thread objects created by a thread. to handle return values and threads, and added annotations for the HT (c, h) means that a thread with context c may execute non- physical domain assignments of input relations.) The input rela- thread allocation site h; in other words, there is a call path tions were generated with the Joeq compiler infrastructure[32]. The from the run() method in context c to allocation site h. entire bddbddb implementation is only 2500 lines of code. bd- vP 0 T : C × V × C × H is the set of initial inter-thread points-to dbddb uses the JavaBDD library[31], an open-source library based relations. This includes the points-to relations for thread cre- on the BuDDy library[21]. The entire system is available as open- ation sites and for the global object. vP 0 T (c1 , v, c2 , h) means source[35], and we hope that others will find it useful. that thread c1 has an thread allocation site h, and v points to All experiments were performed on a 2.2GHz Pentium 4 with the newly created thread context c2 . (There are usually two Sun JDK 1.4.2 04 running on Fedora Linux Core 1. For the context- contexts assigned to each allocation site). All global objects insensitive and context-sensitive experiments, respectively: we used across all contexts are given the same context. initial BDD table sizes of 4M and 12M; the tables could grow by vP T : C × V × C × H is the thread-sensitive version of the variable 1M and 3M after each garbage collection; the BDD operation cache points-to relation vP C . vP T (c1 , v, c2 , h) means variable v in sizes were 1M and 3M. context c1 can point to heap object h created under context To test the scalability and applicability of the algorithm, we ap- c2 . plied our technique to 21 of the most popular Java projects on hP T : C × H × F × C × H is the thread-sensitive version of the Sourceforge as of November 2003. We simply walked down the heap points-to relation hP . hP T (c1 , h1 , f, c2 , h2 ) means that list of 100% Java projects sorted by activity, selecting the ones that field f of heap object h1 created under context c1 can point would compile directly as standalone applications. They are all to heap object h2 created under context c2 . real applications with tens of thousands of users each. As far as we know, these are the largest benchmarks ever reported for any Rule (26) incorporates the initial points-to relations for thread context-sensitive Java pointer analysis. As a point of comparison, creation sites. Rule (27) incorporates the points-to information the largest benchmark in the specjvm suite, javac, would rank only for non-thread creation sites, which have the context numbers of 13th in our list. threads that can reach the method. The other rules are analogous For each application, we chose an applicable main() method as to those of the context-sensitive pointer analysis, with an additional the entry point to the application. We included all class initializers, context attribute for the heap objects. thread run methods, and finalizers. We ignored null constants in the From the analysis results, we can easily determine which objects analysis—every points-to set is automatically assumed to include have escaped. An object h created by thread context c has escaped, null. Exception objects of the same type were merged. We treated written escaped (c, h), if it is accessed by a different context cv . reflection and native methods as returning unknown objects. Some Complications involving unknown code, such as native methods, native methods and special fields were modeled explicitly. could also be handled using this technique. A short description of each of the benchmarks is included in Fig- ure 3, along with their vital statistics. The number of classes, meth- escaped (c, h) :− vP T (cv , , c, h), cv 6= c. ods, and bytecodes were those discovered by the context-insensitive Conversely, an object h created by context c is captured, written on-the-fly call graph construction algorithm, so they include only captured (c, h), if it has not escaped. Any captured object can be the reachable parts of the program and the class library. allocated on a thread-local heap. The number of context-sensitive (C.S.) paths is for the most part correlated to the number of methods in the program, with the excep- captured (c, h) :− vP T (c, v, c, h), ¬escaped (c, h). tion of pmd. pmd has an astounding 5×1023 paths in the call graph, We can also use escape analysis to eliminate unnecessary syn- which requires 79 bits to represent. pmd has different characteristics chronizations. We define a relation syncs(v ) indicating if the pro- because it contains code generated by the parser generator JavaCC. gram contains a synchronization operation performed on variable v. Many machine-generated methods call the same class library rou- A synchronization for variable v under context c is necessary, writ- tines, leading to a particularly egregious exponential blowup. The ten neededSyncs(c, v), if syncs(v ) and v can point to an escaped JavaBDD library only supports physical domains up to 63 bits; con- object. texts numbered beyond 263 were merged into a single context. The large number of paths also caused the algorithm to require many neededSyncs(c, v) :− syncs(v), vP T (c, v, ch , h), more rule applications to reach a fixpoint solution. escaped (ch , h) 6.2 Analysis Times Notice that neededSyncs is context-sensitive. Thus, we can distinguish when a synchronization is necessary only for certain We measured the analysis times and memory usage for each of threads, and generate specialized versions of methods for those the algorithms presented in this paper (Figure 4). The algorithm threads. with call graph discovery, in each iteration, computes a call graph based on the points-to relations from the previous iteration. The 6. EXPERIMENTAL RESULTS number of iterations taken for that algorithm is also included here. All timings reported are wall-clock times from a cold start, and In this section, we present some experimental results of using bd- include the various overheads for Java garbage collection, BDD dbddb on the Datalog algorithms presented in this paper. We de- garbage collection, growing the node table, etc. The memory num- scribe our testing methodology and benchmarks, present the anal- bers reported are the sizes of the peak number of live BDD nodes ysis times, evaluate the results of the analyses, and provide some during the course of the algorithm. We measured peak BDD mem- insight on our experience of developing these analyses and the bd- ory usage by setting the initial table size and maximum table size dbddb tool. increase to 1MB, and only allowed the table to grow if the node 6.1 Methodology table was more than 99% full after a garbage collection.5 The input to bddbddb is more or less the Datalog programs ex- 5 To avoid garbage collections, it is recommended to use more mem- actly as they are presented in this paper. (We added a few rules ory. Our timing runs use the default setting of 80%.

You can also read