SCanDroid: Automated Security Certification of Android Applications


                              Adam P. Fuchs, Avik Chaudhuri, and Jeffrey S. Foster
                                     University of Maryland, College Park

Abstract

Android is a popular mobile-device platform developed by Google. Android’s application model is designed to encourage applications to share their code and data with other applications. While such sharing can be tightly controlled with permissions, in general users cannot determine what applications will do with their data, and thereby cannot decide what permissions such applications should run with. In this paper we present SCanDroid, a tool for reasoning automatically about the security of Android applications. SCanDroid’s analysis is modular to allow incremental checking of applications as they are installed on an Android device. It extracts security specifications from manifests that accompany such applications, and checks whether data flows through those applications are consistent with those specifications. To our knowledge, SCanDroid is the first program analysis tool for Android, and we expect it to be useful for automated security certification of Android applications.

1   Introduction

Android [3] is Google’s open-source platform for mobile devices, which has recently seen wide adoption in industry. Designed to be a complete software stack, Android includes an operating system, middleware, and core applications. Furthermore, it comes with an SDK that provides the tools and APIs necessary to develop new applications for the platform in Java [3]. Developers of new applications have full access to the same framework APIs used by the core applications.

Android’s application model has several interesting features. First, applications must follow a specific structure, i.e., they must be composed of some basic kinds of components understood by Android. This design encourages sharing of code and data across applications. Next, interactions between components can be tightly controlled. By default, components within an application are sandboxed by Android, and other applications may access such components only if they have the required permissions to do so. This design promises some measure of protection from malicious applications.

However, enforcing permissions is not sufficient to prevent security violations, since permissions may be misused, intentionally or unintentionally, to introduce insecure data flows. Indeed, suppose that Alice downloads and installs a new application, developed by Bob, on her Android-based phone. Say this application, wikinotes, interacts with a core application, notes, to publish some notes from the phone to a wiki, and to sync edits back from the wiki to the phone. Of course, Alice would not like all her notes to be published, and would not like all her published notes to be edited; for instance, her notes may include details of her ongoing research. How can she know whether it is safe to run the application? Can she trust the application to safely access her data? Conversely, Bob may want to be able to convince Alice that his application can be run safely on her phone.

In this paper, we present SCanDroid,¹ a tool for automated security certification of Android applications. SCanDroid statically analyzes data flows through Android applications, and can make security-relevant decisions automatically, based on such flows. In particular, it can decide whether it is safe for an application to run with certain permissions, based on the permissions enforced by other applications. Alternatively, it can provide enough context to the user to make informed security-relevant decisions. SCanDroid can also be useful in various proof-carrying code (PCC) [11] settings. For example, applications can be reviewed offline with SCanDroid by an application store [2], and Android devices can check certificates of security issued by the application store at install time. Alternatively, the developer can construct a safety proof for the application by using our analysis, and the device can verify that proof before installing the application.

At the heart of SCanDroid is a modular data flow analysis for Android applications, designed to allow incremental checking of applications as they are installed on an Android device. Our analysis tracks data flows through and across components, while relying on an underlying abstract semantics for Android applications. The data flows can be fairly complicated, due to sophisticated protocols run by Android to route control between components. Our abstract semantics for Android applications exposes these control routes to our analysis.

¹ The name is intended to abbreviate “Security Certifier for anDroid”, although various puns might be intended as well.

We formalize the basic elements of our data flow analysis as a constraint system, based on an existing core calculus to describe and reason about Android applications [7]. We show how end-to-end security can be enforced with our data flow analysis. In our formalism, we focus only on constructs that are unique to Android, while ignoring the other usual Java constructs that may appear in Android applications. This simplification allows us to study Android-specific features in isolation. Our system relies on the access control mechanisms already provided by Android, and enforces “best practices” for developing secure applications with these mechanisms. The resulting guarantees include standard data-flow security properties for well-constrained applications described in the calculus.

Next, we extend and implement this core analysis to reason about actual Android applications. For this purpose, we must consider the usual Java constructs in combination with Android-specific constructs. This poses some significant challenges: for instance, we need a string analysis to recover addresses of components; we need a pointer analysis to track flows through the heap; we need to handle inter-procedural flows through JVML bytecode, and so on. Our implementation is built over WALA, an open-source collection of libraries for Java program analysis. Although we have not proved so, we expect that our formal guarantees should carry over to our implementation.

In sum, our contributions in this paper are as follows.

  • We design a modular data flow analysis for Android applications, which tracks data flows through and across components, while relying on an abstract semantics for Android applications.

  • We formalize the basic elements of our analysis in a core calculus to describe and reason about Android applications. We prove that applications verified by our analysis have standard data-flow security properties.

  • We present a tool, SCanDroid, that extends and implements this analysis to reason about actual Android applications. Although we have not proved so, we expect that our formal guarantees can be carried over to our implementation.

  • We indicate how SCanDroid can be useful for automated security certification of Android applications.

To our knowledge, SCanDroid is the first program analysis tool for Android. While we focus on security in this paper, we believe that SCanDroid’s data flow analysis will be useful in other contexts as well. For instance, it may be used to prove the safety of type casts, which are common for Android’s data structures; it may also be used to infer minimal sets of permissions required for functionality. At the very least, we believe that this setting provides an ideal opportunity to bring language-based security to the mainstream: the idea of certified installation should certainly be attractive to a growing and diverse Android community.

The remainder of the paper is organized as follows. In Section 2 we present an overview of Android, focusing on its application model, data structures, and security mechanisms. In Section 3, we provide a series of examples to illustrate data flows through Android applications. In Section 4, we formalize our analysis and outline its security guarantees. In Section 5, we describe the details of our implementation. In Section 6, we discuss limitations and future extensions. In Section 7, we discuss some closely related work. Finally, in Section 8, we conclude.

2     Overview of Android

Android implements a complete software stack for running mobile device applications. At the lowest layer is a Linux kernel that provides device drivers, memory management, power management, and networking. Next, Android provides some native libraries for graphics, database management, and web browser functionality, which can be called through Java interfaces. Furthermore, Android includes core Java libraries, and a virtual machine for running customized bytecode (called dex) derived from JVML bytecode. Above this layer is the application framework, which serves as an abstract machine for applications. Finally, the top layer contains application code, developed in Java with the APIs provided by an SDK.

Next, we describe Android’s application model, focusing on how applications are developed and run in detail.

2.1    Application components

An Android application is a package of components, each of which can be instantiated and run as necessary (possibly even by other applications) [3]. Components are of the following types:

Activity components form the basis of the user interface. Usually, each window of the application is controlled by some activity.

Service components run in the background, and remain active even if windows are switched. Services can expose interfaces for communication with other applications.

BroadcastReceiver components react asynchronously to messages from other applications.

ContentProvider components store data relevant to the application, usually in a database. Such data can be shared across applications.

Consider, for example, a music-player application for an Android-based phone. This application may include several components. There may be activities for viewing the songs on the phone and for editing the details of a particular song. There may be a service for playing a song in the background. There may be broadcast receivers for pausing a song when a call comes in, and for restarting the song when the call ends. Finally, there may be a content provider for sharing the songs on the phone.

The Android SDK provides a base class for each type of component, with callback methods that are run at various points in the lifecycle of the associated component. Each component of an application is defined by extending one of the base classes and overriding the methods in that class. (The methods have default implementations, so it is not necessary to override all such methods in the class.) The following is a simplified version of the interfaces defined for each of the four component types. (The types Bundle, Intent, Cursor, ContentValues, Uri, and IBinder describe data structures that are used to pass data through and between components; we will explain these data structures later.)

 1   class Activity ... {
 2      public void onCreate(Bundle ...) {...}
 3      public void onActivityResult(..., Intent data) {...}
 4      ...
 5   }
 6   class Service ... {
 7      public IBinder onBind(Intent data) {...}
 8      ...
 9   }
10   class BroadcastReceiver ... {
11      public void onReceive(..., Intent data) {...}
12      ...
13   }
14   class ContentProvider ... {
15      public Cursor query(Uri queryUri, ...) {...}
16      public int update(Uri updateUri, ContentValues data, ...) {...}
17      ...
18   }

Effectively, each of these methods is an entry point (i.e., a main method) that can be called at various points in the lifecycle of the component. Our implementation carries out a modular analysis over all such entry points. However, in this paper we focus only on the methods shown above for simplicity.

2.2     Environment

Each Android application publishes its contents to the framework via a manifest. The manifest contains a list of components that are hosted by an application, some information used to link components at run time, and information about the permissions enforced and requested by the application. The manifests for all the applications installed on a device together form what we will call the environment. The environment affects Android’s run-time decisions about control flow and data flow between application components, and is therefore a critical consideration for any flow analysis.

2.3    Data structures

We now explain some of the data structures that appear in the signatures above. In Android, data is usually passed around as hashes of key/value pairs. Containers for such data include Bundles, Cursors, and ContentValues (and there are others, e.g., to share preferences across components in an application). Sometimes, it is convenient to tag data with an address, especially to construct messages for inter-component communication. Containers for such messages, i.e., data/address pairs, include Intents. Addresses for data can be specified as Uris. Finally, IBinder is a higher-order data structure that may contain Java interfaces for inter-process communication on Android.

2.4    Inter-component communication

Besides managing the lifecycles of individual components, Android’s application framework is also responsible for calling components from other components and for passing messages between components. In application code, these interactions occur through various method calls and callback methods. In particular, inflow methods are used to import data to the application, and outflow methods are used to export data from the application. Some of the inflow methods are shown in the interfaces above, e.g., ContentProvider.query and Activity.onActivityResult; there are others that are not shown, such as Activity.getIntent. In general, each inflow method can be matched with an outflow method. For instance, Activity.getIntent() is matched by Activity.startActivityForResult(), Activity.onActivityResult() is matched by Activity.setResult(), and ContentProvider.query is matched by ContentProvider.update().

Much of the complexity of orchestrating inter-component interactions lies in linking outflow methods with components that provide the corresponding inflow methods. As explained above, addresses can be specified either as Uris to address ContentProviders or inside Intents to address Activitys, Services, or BroadcastReceivers. Furthermore, additional linking information is provided in the manifest.

Ultimately, addresses influence how control flows across components, and permissions (described later) can restrict such flows based on those addresses. Thus, our analysis needs to track Uris and Intents very precisely. Next we discuss the addressing mechanisms through Uris and Intents in more detail.
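
Before turning to addressing, the pairing of inflow and outflow methods described in Section 2.4 can be written down as a small lookup table. The following is only an illustrative sketch in plain Java, not part of SCanDroid: the class name FlowPairs and the method outflowFor are hypothetical, and the Android method names are stored as strings so that the sketch compiles without the Android SDK.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper (an assumption for illustration, not SCanDroid code):
// records the three inflow/outflow pairings named in the text.
final class FlowPairs {
    // Key: inflow method (imports data); value: the matching outflow method
    // (exports the data that the inflow method later receives).
    private static final Map<String, String> OUTFLOW_FOR_INFLOW = new LinkedHashMap<>();

    static {
        OUTFLOW_FOR_INFLOW.put("Activity.getIntent", "Activity.startActivityForResult");
        OUTFLOW_FOR_INFLOW.put("Activity.onActivityResult", "Activity.setResult");
        OUTFLOW_FOR_INFLOW.put("ContentProvider.query", "ContentProvider.update");
    }

    // Returns the outflow method paired with the given inflow method,
    // or an empty Optional if the text names no pairing for it.
    static Optional<String> outflowFor(String inflow) {
        return Optional.ofNullable(OUTFLOW_FOR_INFLOW.get(inflow));
    }

    public static void main(String[] args) {
        // Print each pairing in the direction the data travels.
        for (Map.Entry<String, String> pair : OUTFLOW_FOR_INFLOW.entrySet()) {
            System.out.println(pair.getValue() + " -> " + pair.getKey());
        }
    }
}
```

A flow analysis would use such a table in the outflow-to-inflow direction: data passed to an outflow method becomes visible at the paired inflow method of whichever component the address (a Uri or an Intent) resolves to. The table covers only the pairs named above; the full set of entry points is larger.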

Uri-based addressing. In application code, query and update calls to ContentProviders must be addressed through Uris. A ContentProvider is specified in the manifest along with the address of its authority. For example, a ContentProvider that shares code certificates from some domain might have domain.certificates as its authority. A query or update on a Uri of the form content://domain.certificates/... would then be directed to that ContentProvider.

Intent-based addressing. On the other hand, components such as Activitys, Services, or BroadcastReceivers must be addressed through Intents. For example, Intents are passed as parameters in inter-component calls for launching activities (via startActivityForResult), binding to services (via bindService), and sending messages to broadcast receivers (via sendBroadcast). An Intent can specify the component to call either by name (e.g., “”), or more abstractly by action (e.g., “POSTBLOG”). As explained above, Intents may also carry data, usually in a Bundle.

If the address in an Intent is an action rather than a name, the Intent is matched against intent filters that are specified alongside component descriptions in the manifest. The action specified in the Intent is checked against the action specified in the intent filter to find a suitable component for processing the call.

Finally, we should point out that some components, such as BroadcastReceivers and Services, may be constructed dynamically by application code. As such, linking information about such components is provided at run time in their constructors, rather than statically in the manifest. More generally, components that handle events in Android, such as notification managers and user interfaces, follow a similar pattern. In our formal analysis, we conveniently generalize such components as listeners.

2.5    Security mechanisms

An application can share its data and functionality with other applications by letting such applications access its components. Clearly, these accesses must be carefully controlled for security. Android provides the following key access control mechanisms [3].

Permissions. Any application needs explicit permissions to access the components of other applications. Crucially, such permissions are set at install time, not at run time. An application’s manifest must declare any permissions that the application may require. The package installer sets these permissions via dialog with the user. No further decisions are made at run time; if the application’s code needs a permission at run time that is not set at install time, it blocks, and its resources are reclaimed by Android.

Conversely, an application’s manifest declares any permissions required by other applications to access its components. For dynamic components, this information is provided in their constructors. In particular, an application can enforce permissions for launching its activities, binding to its services, sending messages to its broadcast receivers or receiving messages from any of its components, and querying and updating data stored by its content providers.

Isolation and Signatures. Going further, Android derives several protection mechanisms from the Linux kernel. Every application runs in its own Linux process, on its own VM. Android starts the application’s process when any of the application’s code needs to be run, and stops the process when another application’s code needs to be run. The application’s code runs in isolation from the code of all other applications. Finally, each application is assigned a unique Linux UID, so the application’s files are not visible to other applications.

That said, it is possible for several applications to share, through signatures, the same UID (see below), in which case their files become visible to each other. Such applications can also run in the same process, sharing the same VM. In particular, any Android application must be signed with a certificate whose private key is held by the developer. The certificate does not need to be signed by a certificate authority; it is used only to establish trust between applications by the same developer. Thus, such applications may share the same UID or the same permissions.

3      Examples

In this section we show a series of examples to illustrate how data may flow through Android applications. In particular we are concerned about flows between Uris, which address content providers, since ultimately they are the data sources and sinks across applications.

Intra-component flows. We begin with an example of a simple flow from one Uri to another within an activity.

Example (i)
19   public class Util {
20    private static String authorityA = "some.authority";
21    private static String authorityB = "some.other.authority";
22
23       public static final Uri uriA = Uri.parse("content://"+authorityA);
24       public static final Uri uriB = Uri.parse("content://"+authorityB);

26       public static String readA(ContentResolver handle) {
27         Uri queryUri = Uri.withAppendedPath(uriA, "param");
28         Cursor result = handle.query(queryUri, ...);
29         return result.toString();
30       }

31
32       public static void writeB(ContentResolver handle, String s) {
33         Uri updateUri = uriB;
34         ContentValues values = new ContentValues();
35         values.put("param", s);
36         handle.update(updateUri, values, ...);
37       }
38   }

40   public class Activity1 extends Activity {
41     public void onCreate(...) {
42       ...
43       String s = Util.readA(getContentResolver());
44       Util.writeB(getContentResolver(), s);
45     }
46   }

[Figure 1. Data flows from Uri A to Uri B. The graph has Uri A at the top and Uri B at the bottom; the intermediate nodes are Activity1.onCreate, Activity2.onCreate, Activity2.onActivityResult, Activity3.onCreate, Service5.onBind, connection.onServiceConnected, IService5.Stub.rpc, Activity6.onCreate, and receiver.onReceive. Each edge is labeled (i), (ii), (iii), or (iv) after the example that gives rise to it.]

Suppose that we launch an instance of Activity1. This passes control to the onCreate method of that activity (line 42). Next, readA is called to query Uri A (line 44), and writeB is called with the result to update Uri B (line 45). Thus, we have a flow of data from Uri A to Uri B; this flow is shown via the edges marked (i) in the graph in Figure 1.

Such a flow may have unintended security consequences.

  • Suppose that the data at Uri A is intended to be secret, i.e., reading data at Uri A requires permission ⊤. At the same time, suppose that the data at Uri B is public, i.e., reading data at Uri B requires permission ⊥. Clearly this admits a secrecy violation, because anybody who has ⊥ permission can effectively read data from Uri A.

  • Dually, suppose that the data at Uri B is intended to be trusted, i.e., writing data at Uri B requires permission ⊤. At the same time, suppose that the data at Uri A is tainted, i.e., writing data at Uri A requires permission ⊥. Clearly this admits an integrity violation, because anybody who has ⊥ permission can effectively write data to Uri B.

To prevent such violations, our analysis enforces that the permission required to read Uri B should be at least as high as the permission required to read Uri A; and conversely, the permission required to write Uri A should be at least as high as the permission required to write Uri B.

Inter-component flows. Our next example involves a similar overall flow from Uri A to Uri B, but split across multiple activities, i.e., the overall flow is a transitive composition of simpler flows within and through these activities.

Example (ii)
47   public class Util {
48    ...
49    public static Intent pack(String s) {
50        Intent intent = new Intent();
51        intent.putExtra("param", s);
52        return intent;
53    }
         The class Util defines some methods that will be used
                                                                                                        55     public static String unpack(Intent data) {
     throughout this section. One of these methods, readA                                               56       return data.getStringExtra(‘‘param’’);
     (lines 27–31), is used to issue a query on the Uri con-                                            57     }
     tent://some.authority/param (line 29), which we will call                                          58   }
     Uri A; this returns a Cursor object over some data at that                                         59   public class Activity2 extends Activity {
     Uri (line 30). The other method, writeB (lines 33–38), is                                          60     public void onCreate(...) {
     used to package such data into a ContentValues object (line                                        61       ...
                                                                                                        62       String s = Util.readA(getContentResolver());
     36) and send it off to update the contents at the Uri con-
                                                                                                        63       Intent intent = Util.pack(s);
     tent://some.other.authority (line 37), which we will call Uri
                                                                                                        64       intent.setClass(this, Activity3.class);
     B. (Both of these methods take a ContentResolver object as                                         65       startActivityForResult(intent, ...);
     parameter, which can be obtained simply by calling a built-                                        66     }
     in method getContentResolver.)                                                                     67

68     public void onActivityResult(..., Intent data) {
69       String s = Util.unpack(data);
70       Util.writeB(getContentResolver(), s);
71     }
72   }
73
74   public class Activity3 extends Activity {
75     public void onCreate(...) {
76       ...
77       Bundle input = getIntent().getExtras();
78       Intent i = new Intent();
79       i.putExtras(input);
80       setResult(..., i);
81     }
82   }

   The class Util defines the following new methods: pack, which bundles data in an intent
(line 52) and returns it, and unpack, which extracts data from such an intent (line 57)
and returns it.

   Suppose that we launch an instance of Activity2. As usual, control passes to the
onCreate method (line 61), where readA is called to issue a query on Uri A (line 63).
Next, pack is called to bundle the returned data into an Intent object (line 64), and an
instance of Activity3 is launched with it (lines 65–66). This passes control to the
onCreate method of that activity (line 76), where the data in the intent is extracted and
bundled back into another intent (lines 78–80), and that intent is set as the result of
this activity (line 81). Control now passes back to the onActivityResult method of the
calling activity (line 69), where unpack is called to extract the data in the returned
intent (line 70), and finally writeB is called to, as usual, package the data and use it
to update the contents at Uri B (line 71).

   Thus, effectively, we have another flow of data from Uri A to Uri B; this flow is shown
via the edges marked (ii) in the graph on the left in Figure 1. This flow may have the
same unintended security consequences as in the previous example, so our analysis enforces
the same constraints on the permissions required for reading from and writing to Uri A and
Uri B, to account for this flow.

Flows through dynamic components. Of course, transitive flows may not only involve
activities, but also other components. Tracking data flow across components is already
nontrivial since control flow across such components is orchestrated implicitly, via
reflection, by Android’s application framework. Further complications arise when the
components involved in such flows are dynamic, rather than static. Our next example
involves data flow through a service that dynamically exposes an RPC interface.

Example (iii)
83   public class Activity4 extends Activity {
84     public void onCreate(...) {
85       ...
86       String s = Util.readA(getContentResolver());
87       Intent intent = Util.pack(s);
88       intent.setClass(this, Service5.class);
89       ServiceConnection connection = new ServiceConnection() {
90         public void onServiceConnected(..., IBinder service) {
91           IService5.Stub.asInterface(service).rpc();
92         }
93         ...
94       };
95       bindService(intent, connection, ...);
96     }
97   }
98
99   public interface IService5 extends IInterface {
100    // autogenerated from IService5.aidl
101    ...
102    public void rpc();
103  }

105  public class Service5 extends Service {
106    public IBinder onBind(final Intent data) {
107      return new IService5.Stub() {
108        public void rpc() {
109          String s = Util.unpack(data);
110          Util.writeB(getContentResolver(), s);
111        }
112      };
113    }
114  }

   Let us fast-forward to line 88, where data has been read from Uri A and bundled into an
intent. This intent is used to bind to an instance of Service5 (lines 89–96). Binding to a
service actually involves a more complicated protocol than simply connecting to the
service; in particular, it allows arbitrary RPC calls into the service. This requires
passing a dynamic ServiceConnection object (lines 90–95) together with the intent at
binding time. Control then passes to the onBind method of the service (line 107), which
must return an implementation of some RPC interface with which the connection can
interact. Such an interface is typically autogenerated from an AIDL (Android Interface
Definition Language) specification, as shown in lines 100–104; in this example, the
interface just exposes a method named rpc. The implementation of this method (lines
109–112) causes the data bundled in the intent of onBind to flow to Uri B. This
implementation is implicitly passed to the onServiceConnected method (line 91) of the
connection received at binding time, where it is called (line 92) to finally trigger the
flow. The overall flow is shown via the edges marked (iii) in the graph on the left in
Figure 1.

   Such complicated flows may also involve other dynamic components, including
notification managers and user interfaces. In the following example, a dynamically
registered broadcast receiver is involved in a similar flow.

Example (iv)
115  public class Activity6 extends Activity {

116    public void onCreate(...) {
117      ...
118      BroadcastReceiver listener = new BroadcastReceiver() {
119        public void onReceive(..., Intent intent) {
120          String s = Util.unpack(intent);
121          Util.writeB(getContentResolver(), s);
122        }
123      };
124      registerReceiver(listener, new IntentFilter("ACTION"));
125      String s = Util.readA(getContentResolver());
126      Intent intent = Util.pack(s);
127      intent.setAction("ACTION");
128      sendBroadcast(intent);
129    }
130  }

   A BroadcastReceiver object (lines 119–124) is registered to listen to intents directed
at “ACTION” (line 125). Then, data is read from Uri A, bundled into an intent, and
broadcast to any receiver listening to “ACTION”. In particular, this passes control to the
onReceive method of the registered receiver (line 120), whence the data flows to Uri B.
The overall flow is shown via the edges marked (iv) in the graph on the left in Figure 1.

   These examples illustrate some possible ways for data to flow through application code.
The goal of our analysis is to track all such flows between Uris, so as to generate
suitable constraints on the permissions enforced by those Uris. In particular, we want to
guarantee that those permissions soundly enforce end-to-end security.

4      Formalization

   In this section, we formalize our flow analysis in a core calculus that was developed
specifically to describe and reason about Android applications. For simplicity, the
calculus does not model general classes and methods; instead, it treats component classes
and methods as idealized, primitive constructs. Furthermore, it considers permissions as
the only mechanisms to control cross-component interactions.

4.1       Syntax and informal semantics

   Let there be a partial order ≥ over permissions (which are written in uppercase
typewriter font, as in PERM). Components are identified by addresses n, which model class
names or actions. Values include variables x and a constant any (which stands for any
concrete data); we do not consider addresses as values for simplicity. These values
suffice for our purposes, since we consider only (explicit) data flows in our analysis; by
definition, such flows do not depend on the values of the data involved. The syntax of
programs is as follows:

Program syntax
v ::= x | any                     value
t ::=                             code
     begin(n, v)                  launch activity
     end(v)                       finish activity
     listen(SEND, n, λx.t)        register listener
     send(LISTEN, n, v)           send to listener
     query(n)                     read from provider
     update(n, v)                 write to provider
     let x = t in t′              evaluate
     fork t                       fork
     t + t′                       choice
     v                            result

   The meanings of programs are described informally below, after introducing some
additional concepts; a formal operational semantics appears elsewhere [7].

   A program runs in an environment that maps addresses to components. The calculus
considers three kinds of components: activities, listeners, and stores. (As argued
previously, listeners generalize services, broadcast receivers, notification managers, and
various user interfaces; and stores generalize content providers, databases, files, URIs,
and other data containers [7].) An environment is derived from the set of applications
installed on the system. The syntax of environments is as follows:

Environment syntax
κ ::=                                          component
    activity(LAUNCH, PERM, λx.t, λx.t′)        activity
    listener(SEND, PERM, λx.t)                 listener
    store(READ, WRITE, v)                      store
E ::= ∅ | E, n ↦ κ                             environment

  • The component activity(LAUNCH, PERM, λx.t, λx.t′) describes an activity that enforces
    permission LAUNCH on its callers, that runs with permission PERM, and whose onCreate
    and onActivityResult methods are modeled by the functions λx.t and λx.t′; the
    parameter of λx.t is the value used to launch this activity, and the parameter of
    λx.t′ is the value returned to this activity by any other activity launched by it.
  • The component listener(SEND, PERM, λx.t) describes a listener (e.g., a service or a
    broadcast receiver) that enforces permission SEND on its callers, that runs with
    permission PERM, and whose listener method (e.g., onBind or onReceive) is modeled by
    the function λx.t; the parameter of λx.t is the value sent to this listener.
  • The component store(READ, WRITE, v) describes a store (such as a content provider)
    that enforces permissions READ and WRITE for accessing its contents, and whose
    (current) content is v.
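As a rough illustration of the environment syntax, the following Java sketch models an environment restricted to stores and shows a store enforcing its READ and WRITE permissions on the current context. It is only a sketch under stated assumptions: the names EnvSketch, Store, perm, geq, read, and write are hypothetical, permissions are assumed to be sets of string identifiers ordered by the superset relation, and "the context has permission P" is assumed to mean that the context permission is at least P. Activities and listeners are omitted.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of an environment E restricted to stores.
final class EnvSketch {
    // store(READ, WRITE, v): content v guarded by two permissions.
    static final class Store {
        final Set<String> read;
        final Set<String> write;
        String content;

        Store(Set<String> read, Set<String> write, String content) {
            this.read = read;
            this.write = write;
            this.content = content;
        }
    }

    // E: addresses n mapped to components (stores only in this sketch).
    final Map<String, Store> env = new HashMap<>();

    // Assumption: a permission is a set of string identifiers.
    static Set<String> perm(String... ids) {
        return new HashSet<>(Arrays.asList(ids));
    }

    // Assumption: p >= q holds when p is a superset of q.
    static boolean geq(Set<String> p, Set<String> q) {
        return p.containsAll(q);
    }

    // Reading the store at address n requires context >= READ.
    String read(String n, Set<String> context) {
        Store s = env.get(n);
        if (s == null || !geq(context, s.read)) {
            throw new SecurityException("read denied: " + n);
        }
        return s.content;
    }

    // Writing v to the store at address n requires context >= WRITE.
    void write(String n, String v, Set<String> context) {
        Store s = env.get(n);
        if (s == null || !geq(context, s.write)) {
            throw new SecurityException("write denied: " + n);
        }
        s.content = v;
    }
}
```

In this sketch a context holding only the read permission of a store can read its content but is rejected when it tries to write, which is the kind of check the calculus attaches to each store.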

Furthermore, a program runs with a permission, which is the permission of the context,
i.e., the permission of the application that the code belongs to. The program itself may
be run on a stack of windows (produced by launching activities) or in a pool of threads
(produced by forking).

  • The program begin(n, v) checks that n is mapped to a component
    activity(LAUNCH, PERM, λx.t, λx.t′), and that the current context has permission
    LAUNCH. A new window is pushed on the stack, and the program t is run with permission
    PERM and with x bound to v.
  • Conversely, end(v) pops the current window off the stack and returns control to the
    previous activity; if that activity is activity(LAUNCH, PERM, λx.t, λx.t′), the
    program t′ is run with permission PERM and with x bound to v.
  • The program listen(SEND, n, λx.t) maps n to the component listener(SEND, PERM, λx.t)
    in the environment; here, PERM is the permission of the current context.
  • Conversely, send(LISTEN, n, v) checks that n is mapped to a component
    listener(SEND, PERM, λx.t), that the current context has permission SEND, and that
    PERM includes LISTEN. The program t is run with permission PERM, and with x bound to v.
  • The program query(n) checks that n is mapped to a component store(READ, WRITE, v),
    and that the current context has permission READ; the value v is returned.
  • Conversely, update(n, v) checks that n is mapped to a component
    store(READ, WRITE, v′), and that the current context has permission WRITE; n is then
    mapped to store(READ, WRITE, v) in the environment.
  • The program let x = t in t′ evaluates t, and then evaluates t′ with x bound to the
    result; x is a local variable.
  • The program fork t forks a thread that runs t. Forking allows connection to services
    in the background.
  • The program t + t′ evaluates either t or t′. Choice allows nondeterministic
    interaction with user interfaces.

4.2    Some encoded examples

   Below we encode some of the examples of Section 3 in this calculus. Our intention is to
illustrate that these encodings can serve as convenient intermediate representations for
analysis of actual bytecode. In particular, the concise syntax makes the flows in these
examples easy to identify.

   Recall example (ii), which involved an inter-component flow across two activities. For
simplicity, we assume that both activities run with and enforce ⊤ permissions.

   E(Activity2) = activity(⊤, ⊤,
        λ_. let x = query(uriA) in begin(Activity3, x),
        λy. update(uriB, y))

   E(Activity3) = activity(⊤, ⊤,
        λz. end(z),
        λ_. )

   Next, recall example (iv), which involved a similar flow through a dynamic receiver.
Once again, for simplicity we assume that the activity as well as the receiver involved in
this flow run with and enforce ⊤ permissions.

   E(Activity6) = activity(⊤, ⊤,
        λ_. listen(⊤, ACTION, λw. update(uriB, w));
            let x = query(uriA) in send(⊤, ACTION, x),
        λ_. )

4.3    Constraint system

   We analyze programs by tracking data flows and then checking whether these flows are
secure. Recall that we are ultimately interested in flows from stores to stores. However,
since flows may be transitively composed, we need to track for each value in a program the
set of stores from which it may flow. Thus, our analysis is best described as a constraint
system that derives judgments of the form Δ ⊢ t ↦ F, where Δ is a set of hypotheses, t is
a program, and F identifies a set of stores from which data may flow to the result of t.
If such a judgment can be derived, we say that F collects the sources of t under Δ.

   Our constraint system is modular, in the following sense. For any function λx.t, we
assume some set of stores F from which x may flow in order to track flows through t.
Conversely, for any call to such a function with value v, we guarantee that the set of
stores from which v flows is a subset of F. In the calculus, such functions appear in
definitions of components such as activities and listeners, so we require appropriate
hypotheses for these definitions. Furthermore, we need to approximate the activity stack,
to track flows of values returned from activities to activities that may have launched
them.

   In sum, hypotheses in our flow judgments may be of the following forms:

  • n ↦ F, where n identifies a store and F identifies a set of stores from which
    contents may flow into n;
  • n ↦ ⟨Send: F⟩, where n identifies a listener and F collects the sources of values
    that may be sent to this listener (via send);
  • n ↦ ⟨↓: N, Begin: F, End: F′⟩, where n identifies an activity, N identifies a set of
    activities that may launch

    this activity, F collects the sources of values that may be used to launch this
    activity (via begin), and F′ collects the sources of values that may be returned to
    this activity from other activities launched by it (via end);
  • x ↦ F, where F identifies a set of stores from which contents may flow into local
    variable x;
  • self ↦ N, where N identifies a set of activities that may be running the current code.

   Given a set of hypotheses Δ, let Δ(α) retrieve the entry mapped to identifier α in Δ
(where α is one of x, n, or self); and let Δ(α).φ look up the field φ of that entry, if
possible (where φ is one of Send, ↓, Begin, or End).

   The rules for deriving constraint judgments for programs are shown in Figure 2. By
(Con query), the set of stores from which data may flow to a query on n is {n}. On the
other hand, by (Con update), the set of stores from which values v may flow into n via an
update must be contained in Δ(n). Recall that for each store n, Δ(n) should identify the
set of stores from which contents may flow into n; this will ultimately be the basis of
our security analysis.

   Moving on, by (Con listen), the body of a listener function (such as onReceive) must be
well-constrained assuming that the values sent to it flow from Δ(n).Send. Conversely, by
(Con send), values v that may be sent to a listener n must flow from stores in Δ(n).Send.
(Con let), (Con fork), (Con choice), (Con var), and (Con any) are straightforward.
Finally, by (Con begin), values v that may be used to launch an activity n must flow from
stores in Δ(n).Begin; furthermore, the current activities (collected in N) must be
contained in Δ(n).↓. Conversely, by (Con end), values v that may be returned by the
current activity (collected in N) must flow from stores in Δ(m).End for every m that may
have launched the current activity.

   Based on constraint judgments for programs, we derive constraint judgments for
environments, which are of the form Δ ⊢ E. Such a judgment is defined pointwise, i.e., we
have Δ ⊢ E if and only if for each mapping n ↦ κ in E, we have Δ ⊢ n ↦ κ. The rules for
deriving the latter judgments are also shown in Figure 2. By (Con activity), the bodies of
the onCreate and onActivityResult functions of an activity n must be well-constrained
assuming that the set of activities that may run these functions is {n}, and the values
passed to these functions flow from Δ(n).Begin and Δ(n).End. (Con listener) is similar to
(Con listen), except that the set of activities that may run the listener function is
unknown, and is conservatively taken to be the set of all activities N. In particular,
this means that for any activity m launched by a listener, Δ(m).↓ must be unknown, and our
analysis must consider values returned by m to flow to the onActivityResult methods of all
activities. While in general this may introduce imprecision, in our experience it
typically does not since activities launched by listeners (such as user interfaces) seldom
return interesting values, i.e., the flow sets for such values are often {}.

Well-constrained programs Δ ⊢ t ↦ F

\[
\begin{array}{ll}
\text{(Con query)} & \Delta \vdash \mathsf{query}(n) \mapsto \{n\} \\[1ex]
\text{(Con update)} & \dfrac{\Delta \vdash v \mapsto F \qquad F \subseteq \Delta(n)}{\Delta \vdash \mathsf{update}(n, v) \mapsto \{\}} \\[2ex]
\text{(Con listen)} & \dfrac{\Delta,\ x \mapsto \Delta(n).\mathsf{Send} \vdash t \mapsto \{\}}{\Delta \vdash \mathsf{listen}(\mathtt{SEND}, n, \lambda x.t) \mapsto \{\}} \\[2ex]
\text{(Con send)} & \dfrac{\Delta \vdash v \mapsto F \qquad F \subseteq \Delta(n).\mathsf{Send}}{\Delta \vdash \mathsf{send}(\mathtt{LISTEN}, n, v) \mapsto \{\}} \\[2ex]
\text{(Con let)} & \dfrac{\Delta \vdash t \mapsto F \qquad \Delta,\ x \mapsto F \vdash t' \mapsto F'}{\Delta \vdash \mathsf{let}\ x = t\ \mathsf{in}\ t' \mapsto F'} \\[2ex]
\text{(Con fork)} & \dfrac{\Delta \vdash t \mapsto \{\}}{\Delta \vdash \mathsf{fork}\ t \mapsto \{\}} \\[2ex]
\text{(Con choice)} & \dfrac{\Delta \vdash t \mapsto F \qquad \Delta \vdash t' \mapsto F}{\Delta \vdash t + t' \mapsto F} \\[2ex]
\text{(Con var)} & \Delta \vdash x \mapsto \Delta(x) \\[1ex]
\text{(Con any)} & \Delta \vdash \mathsf{any} \mapsto \{\} \\[1ex]
\text{(Con begin)} & \dfrac{\Delta \vdash v \mapsto F \qquad \Delta(\mathsf{self}) = N \qquad F \subseteq \Delta(n).\mathsf{Begin} \qquad N \subseteq \Delta(n).{\downarrow}}{\Delta \vdash \mathsf{begin}(n, v) \mapsto \{\}} \\[2ex]
\text{(Con end)} & \dfrac{\Delta \vdash v \mapsto F \qquad \Delta(\mathsf{self}) = N \qquad \forall n \in N.\ \forall m \in \Delta(n).{\downarrow}:\ F \subseteq \Delta(m).\mathsf{End}}{\Delta \vdash \mathsf{end}(v) \mapsto \{\}}
\end{array}
\]

Well-constrained environment Δ ⊢ E, defined pointwise

\[
\begin{array}{ll}
\text{(Con activity)} & \dfrac{\Delta,\ \mathsf{self} \mapsto \{n\},\ x \mapsto \Delta(n).\mathsf{Begin} \vdash t \mapsto \{\} \qquad \Delta,\ \mathsf{self} \mapsto \{n\},\ x \mapsto \Delta(n).\mathsf{End} \vdash t' \mapsto \{\}}{\Delta \vdash n \mapsto \mathsf{activity}(\mathtt{LAUNCH}, \mathtt{PERM}, \lambda x.t, \lambda x.t')} \\[2ex]
\text{(Con listener)} & \dfrac{\Delta,\ \mathsf{self} \mapsto N,\ x \mapsto \Delta(n).\mathsf{Send} \vdash t \mapsto \{\}}{\Delta \vdash n \mapsto \mathsf{listener}(\mathtt{SEND}, \mathtt{PERM}, \lambda x.t)} \\[2ex]
\text{(Con provider)} & \dfrac{\Delta \vdash v \mapsto F \qquad F \subseteq \Delta(n)}{\Delta \vdash n \mapsto \mathsf{store}(\mathtt{READ}, \mathtt{WRITE}, v)}
\end{array}
\]

                         Figure 2. Constraint system

   In our implementation (described in the next section), we essentially derive these flow
judgments by inference, by initializing each of the flow sets in the hypotheses to ∅ and
then computing a fixpoint of these sets by applying the derivation rules in reverse. The
solution of the constraint system is then the least set of consistent hypotheses Δ.

   Finally, we analyze these hypotheses in conjunction with the environment for security.
(Recall that an environment essentially describes the state of an Android-based device.)
Formally, we consider environment E to be data flow secure if, for some set of hypotheses
Δ, we have Δ ⊢ E, such that whenever m ∈ Δ(n), E(m) = store(READ, WRITE, _), and
E(n) = store(READ′, WRITE′, _), we have READ′ ≥ READ

and WRITE ≥ WRITE′. In other words, E is data flow secure if the permissions enforced in E
are consistent with the flows through E, as inferred by solving Δ ⊢ E for Δ.

READ. Such types are related by a subtyping rule (where Γ is a set of type hypotheses,
analogous to Δ):

\[
\text{(Sub any)} \quad \dfrac{\mathtt{READ}' \geq \mathtt{READ} \qquad \mathtt{WRITE} \geq \mathtt{WRITE}'}{\Gamma \vdash \mathsf{Any}(\mathtt{READ}, \mathtt{WRITE})}
\]
supplies fixpoint solvers for intraprocedural data flow analysis and for interprocedural
data flow analysis based on the RHS (Reps-Horwitz-Sagiv) algorithm [13]. Such problems can
be set up by specifying the appropriate transfer functions across nodes and edges.

   Finally, WALA implements a flexible pointer analysis framework. A pointer analysis maps
local variables, fields, and other pointers to sets of objects they may point to; these
sets of objects are represented by instance keys. The precision of a pointer analysis
depends on the level of context sensitivity used to disambiguate objects created at the
same program location, and WALA provides interfaces to customize this level of context
sensitivity as necessary.

5.2    Elements of our analysis

   Our analysis is composed of several elements, as shown in Figure 3. By separating each
element of our analysis, we leave open the future possibility of replacing those elements
with improved ones, or reusing those elements for other analyses.

[Figure 3 diagram; box labels: Bytecode (Application + Android Model), Bytecode Loader,
Inflow Filter, String/Data Analysis, Flow Analysis, Outflow Filter, Flows for Application,
Manifest 1, Manifest 2, Manifest Loader, Flows for Application 1, Flows for Application 2,
Additional Information, Constraints, Further Analysis]

   Overall, our inputs include the classes and manifests of a set of applications, and our
output is a set of permission constraints ≥ that are induced by data flows through those
applications, which must be satisfied for end-to-end secrecy and integrity guarantees. We
assume that a policy is specified as a partial order over permissions. The default policy
considers permissions to be sets of string identifiers, interpreting ≥ as the superset
relation over such sets. The verification condition is simply that the constraints
generated by the checker are always admitted by such a policy.
   To keep our analysis tractable, we hide Android’s appli-                          Figure 3. Architecture of analysis
cation framework code from WALA. Thus, control flows
across components are not immediately visible to WALA.                calls. Similar implementations are required for code deal-
Our solution is to implement a modular analysis of flows              ing with other Android data structures (such as Intent in an-
within each component, followed by a final phase that con-            droid.content.Intent).
nects and analyzes flows across all components. This al-
lows us to analyze applications incrementally as they are                Furthermore, we customize the pointer analysis to add
installed, and better reflects the reusability of components          some necessary context-sensitivity for specific method
across applications.                                                  calls. In particular, we force disambiguation of strings and
                                                                      other data structures returned by methods, based on infor-
                                                                      mation about their call sites, receivers, and so on. This pre-
Bytecode loader Our bytecode loader parses and gener-                 cision becomes crucial in subsequent phases of our analysis.
ates a call graph from a given set of application classes and
a fixed set of APIs exposed by Android. We retain only                   Finally, for a modular analysis we consider multiple root
the code of the application, and prune away the APIs dur-             methods, or entry points, for call graph generation. In
ing analysis. Furthermore, instead of passing in the con-             particular, such entry points include callback methods for
crete implementations of the Android APIs, we pass in                 each component (e.g., Activity.onCreate), callback methods
simple, abstract stubs—most of which come with the An-                that may be invoked by the UI thread (e.g., Button.onClick),
droid SDK, and some of which we extend with our own                   and all methods executed by threads that are
implementations where necessary. For example, we need                 launched within the application.
to implement the APIs for manipulating Uri objects in an-                The outputs of the bytecode loader are a pruned call so that WALA’s pointer analysis is able to ac-          graph, an exploded interprocedural supergraph, and a
count for objects created in and returned by such method              pointer analysis.
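
As a minimal sketch of the default policy described earlier in this section, the following Python fragment models permissions as sets of string identifiers and interprets ≥ as the superset relation. The function names and the permission strings are hypothetical and are not part of SCanDroid; constraints are assumed to arrive as (higher, lower) pairs.

```python
from typing import FrozenSet, Iterable, List, Tuple

Permission = FrozenSet[str]
Constraint = Tuple[Permission, Permission]  # (p, q) stands for p >= q


def admits(higher: Permission, lower: Permission) -> bool:
    """Default policy: p >= q holds exactly when p is a superset of q."""
    return higher >= lower  # frozenset >= is the superset test


def violated(constraints: Iterable[Constraint]) -> List[Constraint]:
    """Return the constraints that the default policy does not admit."""
    return [(p, q) for (p, q) in constraints if not admits(p, q)]


# Hypothetical permission identifiers, used only for illustration.
READ_NOTES = frozenset({"example.permission.READ_NOTES"})
READ_ALL = frozenset({"example.permission.READ_NOTES",
                      "example.permission.READ_WIKI"})

constraints = [(READ_ALL, READ_NOTES), (READ_NOTES, READ_ALL)]
failed = violated(constraints)
```

The verification condition then amounts to `violated` returning an empty list. The superset relation is reflexive, transitive, and anti-symmetric, so it satisfies the requirement that a policy be a partial order.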

String/data analysis. Strings are used pervasively as identifiers in Android applications. For example, Uris are constructed from strings, and permissions are associated with strings in manifests. Strings are also used as query and update parameters, as keys into various data structures such as Bundles and ContentValues, and as class names or actions in Intents. Thus, precise tracking of data flows in Android requires a precise approximation of the values of various strings that appear in application code.

   Technically, this is quite challenging. As we have seen in our examples, such strings may be stored in fields and variables and subjected to stateful operations (such as append). Thus, we must rely on a precise pointer analysis for strings, as well as partially evaluate any method calls that manipulate strings. In bytecode, strings are typically constructed and manipulated through StringBuilder objects, and coerced to and from such objects. We approximate the values of these objects as follows: first we construct a subgraph that focuses on operations over such objects, and next we design an analysis built on top of WALA’s intraprocedural flow solver that uses this subgraph to compute approximate prefixes of the strings these objects may evaluate to at run time. Approximation is necessary because the pointer analysis may not be fully precise, and strings with imprecise values may be, for example, appended to StringBuilder objects.

   This information is then propagated to other data structures (such as Uris) that may use those strings. Typically the strings we care about behave as constants, and it is possible to recover them fully (i.e., the prefixes we compute are total). Otherwise, even incomplete prefixes can provide useful information: even if we cannot recover the parameter string in the suffix of a Uri, recovering the base Uri is sufficient to look up permissions for that Uri in a manifest.

   The output of the string/data analysis is a map from instance keys for Strings and Uris (derived by the pointer analysis) to strings.

Inflow filter. As described in Section 2.4, data can only flow into an application through inflow methods. We treat each of the inflow methods in our application as a separate source. Each such source introduces an instance key in WALA, which we map to a unique tag. These tags contain enough information to solve inter-component flows, and are passed around by reference during the flow analysis. For example, we tag the Cursor returned by a call to query with an object that contains the Uri instance key that was passed to that query. A set of such tags corresponds to a flow set in our formal analysis (Section 4.1).

   The output of the inflow filter is a map from instance keys (that identify sources) to tags. This map is used to seed our flow analysis.

Flow analysis. Our flow analysis is best specified as an interprocedural flow problem, and we use WALA’s built-in RHS solver in our implementation. Our specification includes a supergraph, a domain, and transfer functions that define how data flows between domain elements across edges in the supergraph.

   In general, data may flow from instance keys and local variables to other instance keys and local variables. We include such instance keys and local variables in a set of flow identifiers. Flow identifiers correspond to values in our formal analysis (Section 4.1). Recall that unlike usual “taint” analysis, our analysis needs to simultaneously track flows of several tags for these flow identifiers. These tags are obtained from the output of the inflow filter. We define the domain used for our analysis to be the cross product of flow identifiers and tags. Thus, a set of domain elements with flow identifier v and some tag in F corresponds to a mapping v ↦ F in our formal analysis (Section 4.3).

   The transfer functions are identical across tags, so we can define them simply over flow identifiers. For any instruction that is not a method call, let D be the set of local variables and instance keys that are defined and U be the set of the instance keys that are used in that instruction (these sets can be readily obtained through WALA’s representation of the instruction with the general pointer analysis). Let B be the basic block containing this instruction, and Pred(B) be the predecessors of B in the supergraph. We specify a flow from each u ∈ U in each B′ ∈ Pred(B) to each d ∈ D in B.

   Invoke instructions (i.e., method calls) require special handling. A method call is either explicitly represented in the supergraph as a set of interprocedural edges, or it is an Android API call for which we have pruned the interprocedural edges. If the call is represented in the call graph we can simply specify flows of parameters across call edges and of results across return edges. However, when the call is not represented in the call graph we must consider all possible side effects of the invoke instruction at the location of the call. Android restricts side effects in its API calls to the objects that were passed to the call, so to be sound we must add flows from each flow identifier used by the invoke instruction to each other flow identifier used or defined by that instruction. Note that for non-static methods, the uses of the instruction will also include a flow identifier representing the receiver of that call. For example, consider an instance v of some Android object, a non-static method p of the object, and a parameter v′. An instruction x = v.p(v′) would result in flows from (the flow identifiers for) v to (the flow identifiers for) x and v′, and from v′ to x and v.

   There are a few points to note about our transfer functions. First, tracking flows through instance keys (derived by the pointer analysis) is crucial for an interprocedural analysis, since methods can have side effects through the
heap that cannot be tracked by flows through local variables alone. Next, we focus only on explicit flows in our analysis, i.e., we ignore implicit flows. Finally, our transfer functions specify a flow-sensitive analysis of local variables but a flow-insensitive analysis of heap variables (instance keys). This is important when considering multi-threaded applications, as we do not need to account for interleaving of multiple threads to achieve a sound analysis using this technique.

   Given the supergraph constructed by the bytecode loader, the transfer functions as defined above, and the initial seeds provided by the inflow analysis, WALA’s solver can calculate the domain elements that are active at each basic block. The result of the flow analysis is a map from flow identifiers to the tags they might have at each basic block. Thus, each entry in the map corresponds to a mapping v ↦ F in our formal analysis (Section 4).

Outflow filter. Next, we identify sinks to which data may flow out of the application. Those sinks are the outflow methods described in Section 2.4. The analysis in the outflow filter combines the set of all sinks in the application with the result of the flow analysis to derive all of the end-to-end, intra-component flows within the application. The output of the outflow filter is a map from (inflow) tags to sets of (outflow) tags. For example, the output for Activity1 would map the query inflow characterized by the instance key for uriA to the update outflow characterized by the instance key for uriB.

Manifest loader. The manifest loader is responsible for characterizing the environment in which applications run, as described in Section 2.2. Since our analysis must account for many different environments that are not fully defined at the time of analysis, we can only load a partial representation. In our manifest loader we typically load the manifests for the applications we are analyzing, coupled with manifests for core Android applications. Some of the characteristics of the manifests that our analysis requires include links between Uris and permissions, links between Uris and the ContentProviders that host those Uris, and the set of permissions that each application requests at install time. All such information is passed to the checker.

Checker. The results of the outflow filter encode, in particular, flows between query and update calls that are represented by instance keys for Uri objects. We rely on our string analysis to recover approximate prefixes of these Uris, and we then look up the results of the manifest loader for any permissions known to be associated with these prefixes. Based on this information, we generate permission constraints ≥ that are induced by the flows.

   We force the relation ≥ to be a partial order, i.e., ≥ must remain reflexive, transitive, and anti-symmetric. This allows incremental checking as new applications are installed on an Android device; the system gets monotonically more constrained until we have a contradiction, which indicates a possible security violation. For example, suppose that a pre-installed application generates the constraint px ≥ py and a new application generates the constraint py ≥ pz. If as policy we know that px ≱ pz (i.e., pz denotes an unrelated or strictly higher privilege than px), then we have a contradiction, which means that it is dangerous to let these applications coexist on the same Android device: either the former application should be uninstalled, or the latter application should not be installed.

   Note that not all flows detected by our flow analysis may be permissible at run time. In particular, inter-component flows require that the relevant components have sufficient permissions to access each other. Thus, any constraints generated by the checker due to those flows are in fact conditional on such facts. For example, suppose that an overall flow from uriA to uriB is a composition of the following (intra-component) flows: within an activity in App1, there is a flow from uriA to an intent used to launch another activity in App2; and within that activity in App2, there is a flow from the intent used to launch it to uriB.

   Suppose that uriA requires permissions readA and writeA, and uriB requires permissions readB and writeB. Suppose further that (any activity in) App1 has permission perm1 and App2 enforces, via its manifest, that calling any of its activities requires permission perm2. Then the constraints that are generated by our checker are in fact:

  • perm1 ≥ perm2 ⇒ readB ≥ readA for data secrecy

  • perm1 ≥ perm2 ⇒ writeA ≥ writeB for data integrity

   We expect these constraints to be useful for further analysis (which we do not implement). For instance, some interesting scenarios arise if either App1 or App2 is a new (not yet installed) application.

  • Suppose that App1 is a new application requesting some permission in its manifest. Rather than generate constraints with perm1 as that permission, we could assume perm1 to be unknown and solve for an optimal solution. For example, an optimal solution may be to request the least privilege necessary to ensure functionality without compromising security.

  • Similarly, suppose that App2 is a new application enforcing some permission in its manifest. Again, rather than generate constraints with perm2 as that permission, we could assume perm2 to be unknown and solve for an optimal solution. For example, an optimal solution may be to enforce the greatest privilege sufficient to ensure security without compromising functionality.
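
The incremental check and the conditional constraints described for the checker can be sketched as follows. This is a hedged illustration in Python, not the authors' implementation: permissions are plain strings, the policy is represented only by the set of pairs it is known to reject, and the function names (transitive_closure, contradictions, conditional_constraints) are hypothetical.

```python
from itertools import product
from typing import List, Set, Tuple

Pair = Tuple[str, str]  # (p, q) stands for the constraint p >= q


def transitive_closure(pairs: Set[Pair]) -> Set[Pair]:
    """Return the smallest transitive relation containing `pairs`."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow during the pass.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def contradictions(constraints: Set[Pair], known_false: Set[Pair]) -> Set[Pair]:
    """Constraints implied by transitivity that the policy is known to reject."""
    return transitive_closure(constraints) & known_false


def conditional_constraints(perm1: str, perm2: str,
                            read_a: str, write_a: str,
                            read_b: str, write_b: str) -> List[Tuple[Pair, Pair]]:
    """Constraints for a flow from uriA (within App1) to uriB (within App2).

    Each entry is (premise, conclusion): the conclusion is required only
    if the premise perm1 >= perm2 holds, i.e. if App1 may call App2.
    """
    premise = (perm1, perm2)
    return [
        (premise, (read_b, read_a)),    # data secrecy
        (premise, (write_a, write_b)),  # data integrity
    ]


# Example from the text: px >= py comes from a pre-installed application,
# py >= pz from a new one, and the policy states that px >= pz does not hold.
installed = {("px", "py")}
policy_rejects = {("px", "pz")}

before = contradictions(installed, policy_rejects)
after = contradictions(installed | {("py", "pz")}, policy_rejects)
```

Before the new application is considered, no contradiction is found; once its constraint is added, transitivity yields px ≥ pz, which the policy rejects. Recomputing the closure from scratch is chosen here for brevity; an incremental implementation would update it as constraints arrive.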
