Epidemic Spreading in Real Networks: An Eigenvalue Viewpoint


              Yang Wang, Deepayan Chakrabarti, Chenxi Wang∗, Christos Faloutsos†
                                Carnegie Mellon University
                         5000 Forbes Avenue, Pittsburgh, PA, 15213
                    yangwang, deepayan, chenxi, christos@andrew.cmu.edu

∗ This work is partially supported by the National Science Foundation under Grant No. CCR-0208853 and a grant from NIST.
† This work is partially supported by the National Science Foundation under Grants No. IIS-9817496, IIS-9988876, IIS-0083148, IIS-0113089, IIS-0209107, IIS-0205224, by the Pennsylvania Infrastructure Technology Alliance (PITA) Grant No. 22-901-0001, and by the Defense Advanced Research Projects Agency under Contract No. N66001-00-1-8936.

Abstract

How will a virus propagate in a real network? Does an epidemic threshold exist for a finite power-law graph, or any finite graph? How long does it take to disinfect a network given particular values of infection rate and virus death rate?

We answer the first question by providing equations that accurately model virus propagation in any network including real and synthesized network graphs. We propose a general epidemic threshold condition that applies to arbitrary graphs: we prove that, under reasonable approximations, the epidemic threshold for a network is closely related to the largest eigenvalue of its adjacency matrix. Finally, for the last question, we show that infections tend to zero exponentially below the epidemic threshold.

We show that our epidemic threshold model subsumes many known thresholds for special-case graphs (e.g., Erdős-Rényi, BA power-law, homogeneous); we show that the threshold tends to zero for infinite power-law graphs. Finally, we illustrate the predictive power of our model with extensive experiments on real and synthesized graphs. We show that our threshold condition holds for arbitrary graphs.

1. Introduction

Computer viruses remain a significant threat to today’s networks and systems. Existing defense mechanisms typically focus on local scanning of virus signatures. While these mechanisms can detect and prevent the spreading of known viruses, they do little for globally optimal defenses. The recent proliferation of malicious code that spreads with virus code exacerbates the problem [10, 24, 25]. From a network dependability standpoint, the propagation of malicious code represents a particular form of fault propagation, which may lead to the ultimate demise of the network (consider distributed denial-of-service attacks). With the exception of a few specialized modeling studies [7, 8, 16, 19, 26], much still remains unknown about the propagation characteristics of computer viruses and the factors that influence them.

In this paper, we investigate epidemiological modeling techniques to reason about computer viral propagation. Kephart and White [7, 8] are among the first to propose epidemiology-based analytic models. Their studies, however, are based on topologies that do not represent modern networks. Staniford et al. [23] reported a study of the Code Red worm propagation, but did not attempt to create an analytic model. The more recent studies by Pastor-Satorras et al. [16, 17, 18, 19, 20] and Barabási et al. [2, 4] focused on epidemic models for power-law networks.

This work aims to develop a general analytic model of virus propagation. Specifically, we are interested in models that capture the impact of the underlying topology but are not limited by it. We found that, contrary to prior beliefs, viral propagation is largely determined by intrinsic characteristics of the network. Our model holds for arbitrary graphs and renders surprisingly simple but accurate predictions.

The layout of this paper is as follows: section 2 gives a background review of previous models. In section 3, we describe our proposed model. We show that our model conforms better to simulation results than previous models over real networks. In section 4, we revisit the issue of epidemic threshold and present a new theory for arbitrary graphs: the epidemic threshold of a given network is related intrinsically to the first eigenvalue of its adjacency matrix. In section 5, we discuss the generality of our threshold condition. We summarize in section 6.

2. Earlier models and their limitations

The most widely used class of epidemiological models is the so-called homogeneous models [1, 11]. A homogeneous model assumes that every individual has equal contact with everyone else in the population, and that the rate of infection is largely determined by the density of the infected population. Kephart and White adopted a modified homogeneous model in which the communication among individuals is modeled as a directed graph [7]: a directed edge from node i to node j denotes that i can directly infect j. A rate of infection, called the birth rate, β, is associated with each edge. A virus curing rate, δ, is associated with each infected node.

If we denote the infected population at time t as $\eta_t$, a deterministic time evolution of $\eta_t$ in the Kephart-White model (hereafter referred to as the KW model) can be represented as

$$\frac{d\eta_t}{dt} = \beta \langle k \rangle \eta_t (1 - \eta_t) - \delta \eta_t \tag{1}$$

where $\langle k \rangle$ is the average connectivity. The steady state solution for Equation 1 is $\eta = 1 - \delta/(\beta \langle k \rangle)$ when $\eta_t$ is read as the infected fraction of the population, which corresponds to $(1 - \delta/(\beta \langle k \rangle)) N$ infected nodes, where N is the total number of nodes.

An important prediction of Equation 1 is the notion of epidemic threshold. An epidemic threshold, τ, is the critical β/δ ratio beyond which epidemics ensue. In a homogeneous or Erdős-Rényi network, the epidemic threshold is

$$\tau_{hom} = \frac{1}{\langle k \rangle} \tag{2}$$

where $\langle k \rangle$ is the average connectivity [7].

These earlier models provide a good approximation of virus propagation in networks where the contact among individuals is sufficiently homogeneous. However, there is overwhelming evidence that real networks (including social networks [21], router and AS networks [6], and Gnutella overlay graphs [22]) deviate from such homogeneity: they follow a power-law structure instead. Computer viruses, therefore, are likely to propagate among nodes with a high variance in connectivity.

Pastor-Satorras and Vespignani studied epidemic spread for power-law networks where the connectivity distribution is characterized as $P(k) \sim k^{-\gamma}$ ($P(k)$ is the probability that a node has k links) [14, 16, 18, 19]. Power-law networks have a highly skewed connectivity distribution and for certain values of γ resemble the Internet topology [6]. Pastor-Satorras et al. developed an analytic model (we refer to their model as the SV model) for the Barabási-Albert (BA) power-law topology (γ = 3). Their steady state prediction is

$$\eta = 2 e^{-\delta/(m\beta)} \tag{3}$$

where m is the minimum connectivity in the network. The SV model, however, depends critically on the assumption γ = 3, which does not hold for real networks [9, 6]. This model yields less accurate predictions for networks that deviate from the BA topology, as we will show later in the paper. Pastor-Satorras et al. [18] also proposed an epidemic threshold condition

$$\tau_{SV} = \frac{\langle k \rangle}{\langle k^2 \rangle} \tag{4}$$

where $\langle k \rangle$ is the expected connectivity and $\langle k^2 \rangle$, the second moment of the connectivity distribution, signifies the connectivity divergence.

Following [19], Boguñá and Pastor-Satorras studied epidemic spreading in correlated networks where the connectivity of a node is related to the connectivity of its neighbors [3]. These correlated networks include Markovian networks where, in addition to $P(k)$, a function $P(k|k')$ determines the probability that a node of degree k is connected to a node of degree $k'$.

While some degree of correlation may exist in real networks, it is often difficult to characterize connectivity correlations with a simple $P(k|k')$ function. Indeed, prior studies on real networks [6, 15] have not found any conclusive evidence to support the type of correlation defined in [3]. Hence, we will not discuss models for correlated networks further in this paper.

We present a new analytic model that does not assume any particular propagation topology. We will show later that our model subsumes previous models that are tailored to fit special-case graphs (homogeneous, BA power-law, etc.).

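To make the earlier-model formulas concrete, the following minimal sketch evaluates the KW steady state (Equation 1), the homogeneous threshold (Equation 2), and the SV threshold (Equation 4) from a plain list of node degrees. The function names and the two example degree sequences are illustrative assumptions, not part of the original models.

```python
def kw_steady_state(beta, delta, mean_degree):
    """Steady-state infected fraction of Equation 1, floored at 0 (no epidemic)."""
    return max(0.0, 1.0 - delta / (beta * mean_degree))


def homogeneous_threshold(degrees):
    """tau_hom = 1 / <k> (Equation 2)."""
    mean_k = sum(degrees) / len(degrees)
    return 1.0 / mean_k


def sv_threshold(degrees):
    """tau_SV = <k> / <k^2> (Equation 4)."""
    n = len(degrees)
    mean_k = sum(degrees) / n
    mean_k2 = sum(k * k for k in degrees) / n
    return mean_k / mean_k2


# Hypothetical degree sequences, chosen only for illustration.
regular = [4] * 100            # every node has degree 4
star = [99] + [1] * 99         # one hub connected to 99 satellites

print(homogeneous_threshold(regular), sv_threshold(regular))
print(homogeneous_threshold(star), sv_threshold(star))
print(kw_steady_state(beta=0.2, delta=0.4, mean_degree=4))
```

On the regular sequence the two thresholds coincide, because $\langle k^2 \rangle = \langle k \rangle^2$ when all degrees are equal; on the skewed star sequence they differ widely, which is the kind of heterogeneity that motivates a topology-independent threshold.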
3. The proposed model

In this section, we describe a model that does not assume homogeneous connectivity or any particular topology. We assume a connected network G = (N, E), where N is the number of nodes in the network and E is the set of edges. We assume a universal infection rate β for each edge connected to an infected node, and a virus death rate δ for each infected node. Table 1 lists the symbols used.

  Symbol                  Meaning
  β                       Virus birth rate on a link connected to an infected node
  δ                       Virus curing rate on an infected node
  t                       Time stamp
  $p_{i,t}$               Probability that node i is infected at t
  $\zeta_{i,t}$           Probability that node i does not receive infections from its neighbors at t
  $\eta_t$                Infected population at time t
  $\langle k \rangle$     Average degree of nodes in a network
  $\langle k^2 \rangle$   Connectivity divergence

  Table 1. Table of Symbols

3.1. Model

Our model assumes discrete time. During each time interval, an infected node i tries to infect its neighbors with probability β. At the same time, i may be cured with probability δ. We denote the probability that a node i is infected at time t as $p_{i,t}$. We define $\zeta_{i,t}$, the probability that a node i will not receive infections from its neighbors at time t, as

$$\begin{aligned}
\zeta_{i,t} &= \prod_{j:\,\text{neighbor of } i} \left(p_{j,t-1}(1-\beta) + (1 - p_{j,t-1})\right) \\
&= \prod_{j:\,\text{neighbor of } i} \left(1 - \beta\, p_{j,t-1}\right)
\end{aligned} \tag{5}$$

We assume that a node i is healthy at time t if

  • i was healthy before t and did not receive infections from its neighbors at t (defined by $\zeta_{i,t}$) OR
  • i was infected before t, cured at t and did not receive infections from its neighbors (defined by $\zeta_{i,t}$) OR
  • i was infected before t, received and ignored infections from its neighbors, and was subsequently cured at t

Note that the third bullet above is due to potentially concurrent curing and infection events.

We subsequently define the healthy probability of a node i at time t, $1 - p_{i,t}$, to be

$$1 - p_{i,t} = (1 - p_{i,t-1})\zeta_{i,t} + \delta p_{i,t-1}\zeta_{i,t} + \tfrac{1}{2}\delta p_{i,t-1}(1 - \zeta_{i,t}), \quad i = 1 \ldots N \tag{6}$$

Note that for the last term on the right hand side of Equation 6 we assume that the probability that a curing event at node i takes place after infection from neighbors is roughly 50%.

Given a network topology and particular values of β and δ, we can solve Equation 6 numerically and obtain the time evolution of the infected population, $\eta_t$, where $\eta_t = \sum_{i=1}^{N} p_{i,t}$.

3.2. Experiments

In this section, we present a set of simulation results. The simulations are conducted to answer one question: how does our model perform in real, BA power-law, and homogeneous graphs? We use a real network graph collected at the Oregon router views [footnote 1]. This dataset contains 31180 links among 10900 AS peers. All synthesized power-law graphs used in this study are generated using BRITE [12]. Unless otherwise specified, each simulation plot is averaged over 15 individual runs.

We begin each simulation with a set of randomly chosen infected nodes on a given network topology [footnote 2]. Simulation proceeds in steps of one time unit. During each step, an infected node attempts to infect each of its neighbors with probability β. In addition, every infected node is cured with probability δ. An infection attempt on an already infected node has no effect.

  Footnote 1: http://topology.eecs.umich.edu/data.html
  Footnote 2: The number of initially-infected nodes does not affect the equilibrium of the propagation. It is chosen based on the particular values of β and δ for each run.

Figure 1 shows the time evolution of η as predicted by our model (see Equation 6) on the 10900-node Oregon AS graph, plotted against simulation results and the steady state prediction of the SV model in Equation 3. (Since the SV model does not estimate the transients, we plot the steady state only.) As shown, our model yields a more precise result than the SV model.

  Figure 1. Experiments show the time evolution of infection in a 10900-node power-law network. Both simulations, panels (a) and (b), were performed on an Oregon network graph, with $\langle k \rangle$ = 5.72 and β = 0.14. In both cases, our model conforms much more closely to the simulation results than the SV model.

Figure 2 compares the predictions of our model against the SV model for Barabási-Albert networks (see Equation 3). The topology used in Figure 2 is a synthesized 1000-node BA network. Since the SV model (see Equation 3) is specifically tailored for BA networks, we expect the comparison to be simply a sanity check. As shown, both models conform nicely to the simulation results, though our model appears to be slightly more precise.

  Figure 2. Experiments on BA topology: shows time evolution of infected population in a 1000-node power-law network. Our model outperforms the SV model in its steady state prediction.

Figure 3 shows simulation results of epidemic spreading on a synthesized 1000-node random network, plotted against the KW model [7] and our model. The network is constructed according to the Erdős-Rényi model [5]. Since an Erdős-Rényi network is sufficiently close to being homogeneous as far as epidemiological models are concerned, the results in Figure 3 suggest that our model is as precise as the KW model, which is designed specifically for homogeneous networks. In one case where β is 0.2 and δ is 0.72, the simulated spreading appears to follow our prediction more closely than that of the KW model.

  Figure 3. Experiments on ER topology: shows time evolution of infected population in a 1000-node random Erdős-Rényi network. Our model generally yields similar predictions to the KW model, but outperforms it when δ is high.

The experiments we show here, conducted on a real network, a synthesized BA power-law network, and an Erdős-Rényi network, illustrate the predictive power of our model: as a general model, it subsumes prior models and produces predictions that equal or outperform predictions that target specific topologies.

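The model of Section 3.1 and the simulation procedure of Section 3.2 can be sketched in a few lines. The code below is a minimal illustration, assuming the graph is given as an adjacency list (a list of neighbor lists); the function names, the 20-node ring used as an example, and the order in which infection and curing are resolved inside one simulated step are assumptions of this sketch, since the text does not fix them.

```python
import random


def model_step(p, neighbors, beta, delta):
    """Apply Equation 6 once; p[i] is the probability that node i is infected."""
    new_p = []
    for i, nbrs in enumerate(neighbors):
        zeta = 1.0
        for j in nbrs:
            zeta *= 1.0 - beta * p[j]              # Equation 5
        healthy = ((1.0 - p[i]) * zeta             # healthy, not infected
                   + delta * p[i] * zeta           # infected, cured, not re-infected
                   + 0.5 * delta * p[i] * (1.0 - zeta))  # cured after an ignored infection
        new_p.append(1.0 - healthy)
    return new_p


def run_model(neighbors, beta, delta, p0, steps):
    """Return eta_t = sum_i p_{i,t} for t = 0 .. steps."""
    p = list(p0)
    eta = [sum(p)]
    for _ in range(steps):
        p = model_step(p, neighbors, beta, delta)
        eta.append(sum(p))
    return eta


def simulate(neighbors, beta, delta, infected0, steps, rng):
    """One stochastic run; returns the number of infected nodes per step."""
    infected = set(infected0)
    counts = [len(infected)]
    for _ in range(steps):
        newly = set()
        for i in infected:
            for j in neighbors[i]:
                # An attempt on an already infected node has no effect.
                if j not in infected and rng.random() < beta:
                    newly.add(j)
        cured = {i for i in infected if rng.random() < delta}
        infected = (infected - cured) | newly
        counts.append(len(infected))
    return counts


# Hypothetical example: a 20-node ring with two initially infected nodes.
ring = [[(i - 1) % 20, (i + 1) % 20] for i in range(20)]
p0 = [1.0 if i in (0, 10) else 0.0 for i in range(20)]
print(run_model(ring, beta=0.3, delta=0.2, p0=p0, steps=5))
print(simulate(ring, 0.3, 0.2, {0, 10}, 5, random.Random(1)))
```

Averaging `simulate` over several seeds and plotting the result against `run_model` gives a small-scale analogue of the comparisons in Figures 1 to 3.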
4. Epidemic threshold and eigenvalues

As described previously, an epidemic threshold is a critical state beyond which infections become endemic. Predicting the epidemic threshold is an important part of an epidemiological model. The epidemic threshold of a graph depends fundamentally on the graph itself. The challenge therefore is to capture the essence of the graph in as few parameters as possible. We present one such model here that predicts the epidemic threshold for arbitrary graphs with a single parameter: the largest eigenvalue of the adjacency matrix of the graph.

We note that previous models have derived threshold conditions for special-case graphs. For instance, the epidemic threshold for a homogeneous network is the inverse of the average connectivity, $\langle k \rangle$. Similarly, the threshold for infinite power-law networks is zero. However, a unifying model for arbitrary, real graphs has not appeared in the literature. The closest model thus far is the one put forth by Pastor-Satorras et al. (see Equation 4). We show later that their model is not accurate for arbitrary graphs.

In this section, we describe a general theory for epidemic threshold that holds for arbitrary graphs. We observe that the epidemic threshold is a condition linking the virus’ birth and curing rate to the adjacency matrix of the graph, such that an infection becomes an epidemic if the condition holds, and dies out if it does not. Our theory is surprisingly simple yet accurate at the same time. We show later that this new threshold condition subsumes prior models for special-case graphs. Table 2 lists the symbols used in this section.

  Symbol              Meaning
  A                   Adjacency matrix of the network
  tr(A)               The transpose of matrix A
  $\lambda_{i,A}$     The i-th largest eigenvalue of A
  $u_{i,A}$           Eigenvector of A corresponding to $\lambda_{i,A}$
  S                   The ‘system’ matrix describing the equations of infection
  $\lambda_{i,S}$     The i-th largest eigenvalue of S

  Table 2. Symbols for eigenvalue analysis

Next, we will show that our estimate for the epidemic threshold τ is

$$\tau = \frac{1}{\lambda_{1,A}} \tag{7}$$

where $\lambda_{1,A}$ is the largest eigenvalue of the adjacency matrix A of the network.

Theorem 1 (Epidemic Threshold) If an epidemic dies out, then it is necessarily true that $\beta/\delta < \tau = 1/\lambda_{1,A}$, where β is the birth rate, δ is the curing rate and $\lambda_{1,A}$ is the largest eigenvalue of the adjacency matrix A.

Proof: Restating Equation 6,

$$1 - p_{i,t} = (1 - p_{i,t-1})\zeta_{i,t} + \delta p_{i,t-1}\zeta_{i,t} + \tfrac{1}{2}\delta p_{i,t-1}(1 - \zeta_{i,t}), \quad i = 1 \ldots N$$

Rearranging the terms,

$$\begin{aligned}
1 - p_{i,t} &= \tfrac{1}{2}\delta p_{i,t-1} + \left(1 + \left(\tfrac{1}{2}\delta - 1\right) p_{i,t-1}\right) \zeta_{i,t} \\
&\approx \tfrac{1}{2}\delta p_{i,t-1} + \left(1 + \left(\tfrac{1}{2}\delta - 1\right) p_{i,t-1}\right) \left(1 - \beta \sum_j p_{j,t-1}\right) \\
&\approx 1 + \delta p_{i,t-1} - p_{i,t-1} - \beta \sum_j p_{j,t-1}
\end{aligned} \tag{8}$$

where the sums run over the neighbors j of node i. This uses the approximation

$$(1 - a)(1 - b) \approx 1 - a - b \tag{9}$$

when $a \ll 1$, $b \ll 1$, applied first to the product in Equation 5 and then to drop the remaining second-order terms.

We thus have

$$p_{i,t} = (1 - \delta)\, p_{i,t-1} + \beta \sum_j p_{j,t-1} \tag{10}$$

Converting Equation 10 to matrix notation ($P_t$ is the column vector $(p_{1,t}, p_{2,t}, \ldots, p_{N,t})$),

$$P_t = \left((1 - \delta) I + \beta A\right) P_{t-1} \tag{11}$$

Thus, $P_t$ is of the form

$$P_t = S P_{t-1} \tag{12}$$
$$\phantom{P_t} = S^t P_0 \tag{13}$$

where $S = (1 - \delta) I + \beta A$. We call S the system matrix.

As we show in Lemma 1 in the Appendix, the matrices A and S have the same eigenvectors $u_{i,S}$, and their eigenvalues, $\lambda_{i,A}$ and $\lambda_{i,S}$, are closely related:

$$\lambda_{i,S} = 1 - \delta + \beta \lambda_{i,A} \quad \forall i \tag{14}$$

Using the spectral decomposition, we can say

$$S = \sum_i \lambda_{i,S}\, u_{i,S}\, \mathrm{tr}(u_{i,S})$$

and

$$S^t = \sum_i \lambda_{i,S}^t\, u_{i,S}\, \mathrm{tr}(u_{i,S}) \tag{15}$$

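As a numerical sanity check on Equation 14, the sketch below builds the system matrix S for a small graph and compares its largest eigenvalue with $1 - \delta + \beta \lambda_{1,A}$. The helper names, the 4-node example graph, and the use of shifted power iteration are assumptions of this sketch; any symmetric eigenvalue routine would serve equally well.

```python
def mat_vec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in m]


def largest_eigenvalue(m, iters=500):
    """Largest eigenvalue of a symmetric matrix via power iteration on m + I.

    The unit shift is an assumption of this sketch: it keeps the iteration
    from oscillating on bipartite graphs, whose spectra are symmetric about 0.
    """
    v = [1.0] * len(m)
    lam = 0.0
    for _ in range(iters):
        w = [y + x for y, x in zip(mat_vec(m, v), v)]       # (m + I) v
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
        lam = sum(a * b for a, b in zip(v, mat_vec(m, v)))  # Rayleigh quotient
    return lam


def system_matrix(adj, beta, delta):
    """S = (1 - delta) I + beta A, as in Equation 11."""
    n = len(adj)
    return [[(1.0 - delta if i == j else 0.0) + beta * adj[i][j]
             for j in range(n)] for i in range(n)]


# Hypothetical 4-node example: a triangle (0, 1, 2) with a pendant node 3.
A = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]
beta, delta = 0.1, 0.3
lam_A = largest_eigenvalue(A)
lam_S = largest_eigenvalue(system_matrix(A, beta, delta))
print(lam_S, 1 - delta + beta * lam_A)  # the two values should agree (Equation 14)
```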
Using this in Equation 13,

$$P_t = \sum_i \lambda_{i,S}^t\, u_{i,S}\, \mathrm{tr}(u_{i,S})\, P_0 \tag{16}$$

Without loss of generality, order the eigenvalues such that $\lambda_{1,A} \geq \lambda_{2,A} \geq \ldots$. For an infection to die off and not become an epidemic, the vector $P_t$ should go to zero for large t, which happens when $\lambda_{i,S}^t$ tends to 0 for all i. That implies $\lambda_{1,S} < 1$. So,

$$1 - \delta + \beta \lambda_{1,A} < 1 \tag{17}$$

which means that $\beta/\delta < 1/\lambda_{1,A}$, that is, $\tau = 1/\lambda_{1,A}$. □

Theorem 2 (Exponential Decay) When an epidemic is diminishing (therefore $\beta/\delta < 1/\lambda_{1,A}$), the probability of infection decays exponentially over time.

Proof: We have:

$$\begin{aligned}
P_t &= S^t P_0 && \text{(from Equation 13)} \\
&= \sum_i \lambda_{i,S}^t\, u_{i,S}\, \mathrm{tr}(u_{i,S})\, P_0 && \text{(from Equation 15)} \\
&\approx \lambda_{1,S}^t \, C
\end{aligned} \tag{18}$$

where C is a constant vector. Since the value of $\lambda_{1,S}$ is less than 1 (because of the no-epidemic condition), the values of $p_{i,t}$ are decreasing exponentially over time. □

Corollary 1 When the network is below the epidemic threshold, the number of infected nodes decays exponentially over time.

Proof: Let $n_t$ denote the number of infected nodes at time t.

$$n_t = \sum_{i=1}^{N} p_{i,t} \approx \sum_i \lambda_{1,S}^t\, C_i = \lambda_{1,S}^t \sum_i C_i$$

where $C_i$ are the individual elements of the vector C in Equation 18 above. Because $\sum_i C_i$ is a constant and $\lambda_{1,S} < 1$ (from Theorem 1), we see that $n_t$ decays exponentially with time. □

The exponential decay in the number of infected nodes is shown clearly in Figure 4, where we plot the logarithm of the number of infected nodes, $\eta_t$, versus t. Two plots are shown: one for the star topology, and one for the Oregon dataset. In both cases, we observe that for large values of time t, the plots become linear, implying that the number of infected nodes decays exponentially.

5. Discussion: generality of our threshold condition

We now turn to show that our threshold condition is general and holds for other graphs. In particular, we show that the threshold condition holds for a) homogeneous, b) star, c) infinite power-law, and d) finite power-law graphs. We do that with the following corollaries.

Corollary 2 The new threshold model holds for homogeneous or random Erdős-Rényi graphs.

Proof: As reported previously, the epidemic threshold in a homogeneous network or a random Erdős-Rényi graph is $\tau_{hom} = 1/\langle k \rangle$, where $\langle k \rangle$ is the average connectivity [7]. It is easily shown that, in a homogeneous or random network, the largest eigenvalue of the adjacency matrix is $\langle k \rangle$. Therefore, our model yields the same threshold condition as the homogeneous models [11]. □

Corollary 3 The epidemic threshold, τ (as defined in section 2), for a star topology is exactly $1/\sqrt{d}$, where d is the degree of the central node.

Proof: In a star topology, we have two types of nodes, the center node and the satellite nodes. Suppose that we have d satellites; the first eigenvalue of the adjacency matrix, $\lambda_{1,A}$, is then $\sqrt{d}$. The stability condition then becomes

$$\lambda_{1,S} = 1 - \delta + \beta \sqrt{d} = 1 \tag{19}$$

which means that $\delta = \beta \sqrt{d}$ to achieve stability, thus rendering $\tau = 1/\sqrt{d}$. □

Figure 5 shows an infection spread over time in a 100-node star graph with β = 0.016. Given $\tau = 1/\sqrt{99}$, the critical δ on the threshold is 0.16. We plotted our propagation model as given by Equation 6 in Figure 5(b). As shown, the propagation model confirms our prediction for the critical δ. More specifically, the theoretical results rendered by the propagation model closely reflect the simulation: when δ > 0.16, the infection dies out, and for δ < 0.16, it persists. For δ = 0.16, a very interesting setting appears.

For the case of δ = 0.16, our propagation model seems to show that the expected number of infected nodes $\eta_t$ drops approximately at the rate of $t^{-1}$, which is qualitatively different from the other two cases: for δ > 0.16, $\eta_t \approx \lambda_{1,S}^t$; for δ < 0.16, $\eta_t$ stabilizes. This suggests a phase transition phenomenon.

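The star-graph numbers quoted above can be reproduced with a short sketch. It computes the threshold of Corollary 3 and the critical δ for the 100-node star, then iterates the linearized update of Equation 10 to show that the step-to-step decay ratio settles at $\lambda_{1,S} = 1 - \delta + \beta\sqrt{d}$ (Theorem 2). The function names and the initial infection probability of 0.01 per node are assumptions made for illustration.

```python
import math


def star_threshold(d):
    """Epidemic threshold 1 / sqrt(d) for a star with d satellites (Corollary 3)."""
    return 1.0 / math.sqrt(d)


def critical_delta(beta, d):
    """Curing rate at which beta / delta equals the star threshold."""
    return beta * math.sqrt(d)


def linearized_counts(d, beta, delta, steps, p0=0.01):
    """Iterate Equation 10 on a star graph; returns sum_i p_{i,t} per step.

    By symmetry all satellites share one probability, so two numbers suffice.
    """
    hub, sat = p0, p0
    counts = [hub + d * sat]
    for _ in range(steps):
        hub, sat = ((1 - delta) * hub + beta * d * sat,   # hub sees d satellites
                    (1 - delta) * sat + beta * hub)       # satellite sees the hub
        counts.append(hub + d * sat)
    return counts


d, beta = 99, 0.016
print(star_threshold(d))            # close to 0.1
print(critical_delta(beta, d))      # about 0.159, quoted as 0.16 in the text

counts = linearized_counts(d, beta, delta=0.2, steps=200)
print(counts[200] / counts[199], 1 - 0.2 + beta * math.sqrt(d))
```

Because the ratio of successive counts converges to a constant below 1, the logarithm of the count is asymptotically linear in t, which is the straight-line behavior described for Figure 4.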
  [Figure 4 plots: number of infected nodes (logarithmic axis) versus time. (a) Star topology: our model and simulation for δ = 0.2 and δ = 0.24, time 0 to 100. (b) Oregon topology: our model and simulation for δ = 0.09 and δ = 0.1, time 0 to 1000.]

  Figure 4. This figure shows the exponential decay in the number of infected nodes over time,
  when we are under the epidemic threshold. Plot (a) compares the logarithm of the number
  of infected nodes over time for a 100-node star topology; plot (b) shows the same for the
  Oregon topology. In both cases, the plot becomes linear for large t, meaning that the decay is
  exponential.
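
The exponential decay described in the Figure 4 caption can be checked numerically with a small Python sketch. It iterates the linearized system p_t = S p_{t-1} with S = (1 − δ)I + βA (the form of S used in the proof of Lemma 1) on a 100-node star. Several things here are assumptions rather than the authors' setup: the linearization stands in for the full model of Equation 6, β = 0.016 is borrowed from the Figure 5 caption because the value used for Figure 4 is not stated, and all function names are hypothetical.

```python
import math

def star_neighbors(n):
    """Adjacency lists of an n-node star; node 0 is the hub (assumed layout)."""
    return [list(range(1, n))] + [[0] for _ in range(n - 1)]

def step(p, nbrs, beta, delta):
    """One step of the linearized system p_t = (1 - delta) p + beta A p."""
    return [(1 - delta) * p[i] + beta * sum(p[j] for j in nbrs[i])
            for i in range(len(p))]

def decay_ratios(n=100, beta=0.016, delta=0.2, steps=100):
    """Return eta_t / eta_{t-1} for each step, where eta_t = sum of p_t."""
    nbrs = star_neighbors(n)
    # The linear system is scale-invariant, so only ratios matter here.
    p = [1.0] * n
    eta_prev = sum(p)
    ratios = []
    for _ in range(steps):
        p = step(p, nbrs, beta, delta)
        eta = sum(p)
        ratios.append(eta / eta_prev)
        eta_prev = eta
    return ratios

ratios = decay_ratios()
# Largest eigenvalue of S for a star: 1 - delta + beta * sqrt(n - 1).
lambda_1_S = 1 - 0.2 + 0.016 * math.sqrt(99)
print(ratios[-1], lambda_1_S)
```

A constant ratio below 1 between successive values of η_t is exactly a straight line with negative slope on a logarithmic axis, which is what the caption describes for large t.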

  [Figure 5 plots. (a) Simulation. (b) Our propagation model.]

  Figure 5. Critical δ for a 100-node star topology: number of infected nodes versus time on
  log-log scales, given β = 0.016. Our threshold prediction places the critical δ at 0.16. (Triangles
  in the left plot and crosses in the right plot.)
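
The critical δ quoted in the Figure 5 caption can be reproduced with a few lines of arithmetic. The Python sketch below assumes the standard closed form for the largest adjacency eigenvalue of an n-node star, sqrt(n − 1), and checks it against the eigenvector (sqrt(n − 1), 1, ..., 1). The function names are hypothetical and only illustrate the threshold condition δ_c = β λ_{1,A}.

```python
import math

def star_lambda1(n):
    """Largest adjacency eigenvalue of an n-node star (closed form)."""
    return math.sqrt(n - 1)

def check_eigenpair(n):
    """Verify A v = lambda v for v = (lambda, 1, ..., 1), hub entry first."""
    lam = star_lambda1(n)
    v = [lam] + [1.0] * (n - 1)
    av_hub = sum(v[1:])   # the hub is adjacent to every leaf
    av_leaf = v[0]        # each leaf is adjacent only to the hub
    return (math.isclose(av_hub, lam * v[0])
            and math.isclose(av_leaf, lam * v[1]))

n, beta = 100, 0.016
lam = star_lambda1(n)
tau = 1 / lam            # threshold on beta / delta
delta_c = beta * lam     # critical virus death rate for this beta
print(round(lam, 4), round(tau, 4), round(delta_c, 4))
```

For n = 100 and β = 0.016 this gives δ_c just under 0.16, consistent with the value stated in the caption.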

   Figure 6(d) depicts a further example for the star topology, plotting the number of infected nodes $\eta_{200}$ at time t=200 for several values of the β/δ ratio. We plot both the theoretical (see Equation 6) and the simulation results. We also show the two epidemic thresholds with vertical lines: our threshold with “crosses” at β/δ = $1/\lambda_{1,A}$ = 0.1 and the SV threshold with “squares” at β/δ = 0.02. The simulation results indicate that our threshold is clearly in the correct region, while the SV threshold prediction is not accurate.

Corollary 4 The epidemic threshold for an infinite power-law network is zero.

Proof: In a power-law network, the first eigenvalue of the adjacency matrix, $\lambda_{1,A}$, is $\sqrt{d_{max}}$ (according
to [13]). Since $d_{max} \propto \ln(N)$ and N is infinite, $\lambda_{1,A}$ is infinite. Our epidemic threshold condition states that δ must be greater than $\beta \lambda_{1,A}$ for there to be no epidemic. Therefore, the epidemic threshold is effectively zero for infinite power-law networks. This result concurs with previous work, which finds that infinite power-law networks lack epidemic thresholds. □

Corollary 5 The epidemic threshold, τ, for finite power-law networks is more precisely indicated by $1/\lambda_{1,A}$, where $\lambda_{1,A}$ is the first eigenvalue of A.

Proof: This follows from Theorem 1 above. □

   We compare our threshold prediction with the threshold model by Pastor-Satorras et al. in Equation 4. Their model, $\tau_{SV} = \langle k \rangle / \langle k^2 \rangle$, where $\langle k \rangle$ is the average connectivity, is put forth as a general model. Figures 6(a) and (b) show simulated epidemic spreading on the Oregon network. The largest eigenvalue $\lambda_{1,A}$ of the adjacency matrix for this network is approximately 58.7211.

   We structured the experiment such that 5000 nodes are infected initially. Simulations proceed with β = 0.001 and δ ranging from 0.05 to 0.14. For these values of β and $\lambda_{1,A}$, our epidemic threshold model predicts a critical δ at 0.0587211, while the SV threshold prediction puts the critical δ at 0.2078. As shown in Figure 6(a), the simulation with δ = 0.05 reaches equilibrium, while the run with δ = 0.07 approaches zero at approximately time-tick 600. The run with δ = 0.06 approaches zero steadily, but has yet to reach it at time-tick 1000. These results closely mirror our threshold prediction, which shows a critical δ at approximately 0.06.

   Figure 6(b) shows an alternate view of the experiment result, plotting the number of infected nodes η at time t=500 for several values of the β/δ ratio. We plot both the theoretical (see Equation 6) and the simulation results. We also show the two epidemic thresholds with vertical lines: our threshold with “crosses” at β/δ = $1/\lambda_{1,A}$ = 0.0167 and the SV threshold with “squares” at β/δ = 0.0048. Notice that our threshold is clearly in the correct region, while the SV threshold prediction is less precise.

   It was brought to our attention that Boguñá et al. derived an epidemic threshold condition for correlated networks based on the largest eigenvalue of a specialized connectivity matrix, C [3]. Each entry $C_{k,k'}$ of C is defined by $k P(k \mid k')$, where $P(k \mid k')$ indicates the probability that a k-linked node is connected to a k'-linked node. In [3], they used a continuous-time model and arrived at the eigenvalue-based threshold condition following a different line of reasoning. While the two results are similar for correlated networks, our threshold condition is more general.

6. Conclusions - contributions

   How will a virus propagate in a real computer network? What is the epidemic threshold for a finite graph, if any? How long does it take for a viral outbreak to reach steady state? These questions have intrigued researchers for decades. In this paper we attempt to answer them by providing a new analytic model that accurately models the propagation of viruses on arbitrary graphs. The primary contributions of this paper are:

    • We propose a new model for virus propagation in networks (Equation 6), and show that our model is more precise and general than previous models. We demonstrate the accuracy of our model in both real and synthetic networks.

    • We show that we can capture the virus-propagation properties of an arbitrary graph in a single parameter, namely the eigenvalue $\lambda_{1,A}$. We propose a precise epidemic threshold, $\tau = 1/\lambda_{1,A}$, which holds irrespective of the network topology; an epidemic is prevented when $\delta > \delta_c = \beta \lambda_{1,A}$. We show that our epidemic threshold is more general and more precise than previous models for special-case graphs (e.g., Erdös-Rényi, homogeneous, BA power-law); we show that it tends to zero for infinite power-law graphs.

    • We show that, below the epidemic threshold, the number of infected nodes in the network decays exponentially.

   Future research directions abound, for both theoretical and experimental work. One could examine phase transition phenomena when we are exactly on the epidemic threshold. Another promising direction is to enhance the model with a “vigilance” parameter to model environmental factors that affect viral propagation.

7. Acknowledgments

   The authors wish to thank Dr. Benoit Morel, Dr. Anthony Brockwell, and Dr. Deborah Brandon for many insightful discussions on the subject. We
  [Figure 6 plots. (a) Infected population vs. time for Oregon. (b) Infection at time-tick 500 vs. β/δ for Oregon. (c) Infected population vs. time for Star. (d) Infection at time-tick 200 vs. β/δ for Star. Panels (b) and (d) plot the number of infected nodes against beta / delta for the simulation and our model, with the SV threshold and our threshold marked.]

  Figure 6. Epidemic threshold on the Oregon and Star topologies. Plot (a) shows that the observed
  critical δ of about 0.06 is very close to our predicted critical δ ≈ 0.0587211, whereas the SV model
  predicts a critical δ ≈ 0.207796. Plot (b) shows that our predicted τ at 0.0167 approximates the
  behavior of the infection at time-tick 500, where the system state has stabilized. As shown, the
  threshold predicted by the SV model does not accurately reflect reality. Plots (c) and (d) show
  the same information for the Star topology, except at time-tick 200. Again, our estimate of the
  threshold is better than that of the SV model.
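
The Oregon numbers in this caption follow from simple arithmetic, which the Python sketch below works through. The constants are the values quoted in the text (β = 0.001, λ_{1,A} ≈ 58.7211, and an SV threshold of β/δ = 0.0048). The helper names are hypothetical, and the SV critical δ computed here differs slightly from the quoted 0.2078 because it starts from the rounded threshold 0.0048.

```python
BETA = 0.001          # infection rate used in the Oregon experiment
LAMBDA_1A = 58.7211   # largest adjacency eigenvalue quoted for Oregon
TAU_SV = 0.0048       # SV threshold on beta / delta, as quoted (rounded)

def critical_delta(beta, tau):
    """Critical death rate implied by a threshold tau on beta / delta."""
    return beta / tau

def dies_out(delta, delta_c):
    """A threshold predicts that the infection dies out when delta > delta_c."""
    return delta > delta_c

delta_c_ours = critical_delta(BETA, 1 / LAMBDA_1A)  # equals beta * lambda_1A
delta_c_sv = critical_delta(BETA, TAU_SV)

for delta in (0.05, 0.06, 0.07):
    print(delta,
          "ours: dies out" if dies_out(delta, delta_c_ours) else "ours: persists",
          "| SV: dies out" if dies_out(delta, delta_c_sv) else "| SV: persists")
```

Under these numbers, our threshold separates δ = 0.05 (persists) from δ = 0.06 and δ = 0.07 (die out), matching the simulated runs, while the SV threshold would predict persistence for all three.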

would also like to thank the anonymous reviewers for their helpful comments.

8. Appendix

Lemma 1 (Eigenvalues of the system matrix) The i-th eigenvalue of S is of the form $\lambda_{i,S} = 1 - \delta + \beta \lambda_{i,A}$, and the eigenvectors of S are the same as those of A.

Proof: Let $u_{i,A}$ be the eigenvector of A corresponding to eigenvalue $\lambda_{i,A}$. Then, by definition, $A u_{i,A} = \lambda_{i,A} u_{i,A}$ (because A is symmetric in our case). Now,

$$
\begin{aligned}
S u_{i,A} &= (1 - \delta) u_{i,A} + \beta A u_{i,A} \\
          &= (1 - \delta) u_{i,A} + \beta \lambda_{i,A} u_{i,A} \\
          &= (1 - \delta + \beta \lambda_{i,A}) u_{i,A} \qquad (20)
\end{aligned}
$$

Thus, $u_{i,A}$ is also an eigenvector of S, and the corresponding eigenvalue is $(1 - \delta + \beta \lambda_{i,A})$. □

References

 [1] N. Bailey. The Mathematical Theory of Infectious Diseases and its Applications. Griffin, London, 1975.
 [2] A.-L. Barabási and R. Albert. Emergence of scaling in random networks. Science, 286:509–512, 15 October 1999.
 [3] M. Boguñá and R. Pastor-Satorras. Epidemic spreading in correlated complex networks. Physical Review E, 66:047104, 2002.
 [4] Z. Dezsö and A.-L. Barabási. Halting viruses in scale-free networks. Physical Review E, 65:055103(R), 21 May 2002.
 [5] P. Erdös and A. Rényi. On the evolution of random graphs. In Publication 5, pages 17–61. Institute of Mathematics, Hungarian Academy of Sciences, Hungary, 1960.
 [6] M. Faloutsos, P. Faloutsos, and C. Faloutsos. On power-law relationship of the internet topology. In Proceedings of ACM Sigcomm 1999, September 1999.
 [7] J. O. Kephart and S. R. White. Directed-graph epidemiological models of computer viruses. In Proceedings of the 1991 IEEE Computer Society Symposium on Research in Security and Privacy, pages 343–359, May 1991.
 [8] J. O. Kephart and S. R. White. Measuring and modeling computer virus prevalence. In Proceedings of the 1993 IEEE Computer Society Symposium on Research in Security and Privacy, pages 2–15, May 1993.
 [9] S. R. Kumar, P. Raghavan, S. Rajagopalan, and A. Tomkins. Trawling the web for emerging cyber-communities. Computer Networks, 31(11-16):1481–1493, 1999.
[10] H. Martin, editor. The Virus Bulletin: Independent Anti-Virus Advice. World Wide Web, http://www.virusbtn.com, 2002. Ongoing.
[11] A. G. McKendrick. Applications of mathematics to medical problems. In Proceedings of Edin. Math. Society, volume 14, pages 98–130, 1926.
[12] A. Medina, A. Lakhina, I. Matta, and J. Byers. Brite: Universal topology generation from a user’s perspective. Technical Report BUCS-TR2001-003, Boston University, 2001. World Wide Web, http://www.cs.bu.edu/brite/publications/.
[13] M. Mihail and C. H. Papadimitriou. On the eigenvalue power law. In RANDOM 2002, Harvard University, Cambridge, MA, 15 September 2002.
[14] Y. Moreno, R. Pastor-Satorras, and A. Vespignani. Epidemic outbreaks in complex heterogeneous networks. The European Physical Journal B, 26:521–529, 4 February 2002.
[15] M. E. J. Newman, S. Forrest, and J. Balthrop. Email networks and the spread of computer viruses. Physical Review E, 66:035101(R), 10 September 2002.
[16] R. Pastor-Satorras and A. Vespignani. Epidemic dynamics and endemic states in complex networks. Physical Review E, 63:066117, 2001.
[17] R. Pastor-Satorras and A. Vespignani. Epidemic spreading in scale-free networks. Physical Review Letters, 86(14):3200–3203, 2 April 2001.
[18] R. Pastor-Satorras and A. Vespignani. Epidemic dynamics in finite size scale-free networks. Physical Review E, 65:035108, 2002.
[19] R. Pastor-Satorras and A. Vespignani. Epidemics and immunization in scale-free networks. In S. Bornholdt and H. G. Schuster, editors, Handbook of Graphs and Networks: From the Genome to the Internet. Wiley-VCH, Berlin, May 2002.
[20] R. Pastor-Satorras and A. Vespignani. Immunization of complex networks. Physical Review E, 65:036104, 2002.
[21] M. Richardson and P. Domingos. Mining the network value of customers. In Proceedings of the Seventh International Conference on Knowledge Discovery and Data Mining, pages 57–66, San Francisco, CA, 2001.
[22] M. Ripeanu, I. Foster, and A. Iamnitchi. Mapping the gnutella network: Properties of large-scale peer-to-peer systems and implications for system design. IEEE Internet Computing Journal, 6(1), 2002.
[23] S. Staniford, V. Paxson, and N. Weaver. How to 0wn the internet in your spare time. In Proceedings of the 11th USENIX Security Symposium, August 2002.
[24] CERT Advisory CA-1999-04. Melissa macro virus. World Wide Web, http://www.cert.org/advisories/CA-1999-04.html, 1999.
[25] CERT Advisory CA-2001-23. Continued threat of the "code red" worm. World Wide Web, http://www.cert.org/advisories/CA-2001-23.html, 2001.
[26] C. Wang, J. C. Knight, and M. C. Elder. On computer viral infection and the effect of immunization. In Proceedings of the 16th ACM Annual Computer Security Applications Conference, December 2000.