Cicada: Introducing Predictive Guarantees for Cloud Networks


Katrina LaCurts*, Jeffrey C. Mogul†, Hari Balakrishnan*, Yoshio Turner‡
*MIT CSAIL, †Google, Inc., ‡HP Labs

Abstract

In cloud-computing systems, network-bandwidth guarantees have been shown to improve predictability of application performance and cost [1, 28]. Most previous work on cloud-bandwidth guarantees has assumed that cloud tenants know what bandwidth guarantees they want [1, 17]. However, as we show in this work, application bandwidth demands can be complex and time-varying, and many tenants might lack sufficient information to request a guarantee that is well-matched to their needs, which can lead to over-provisioning (and thus reduced cost-efficiency) or under-provisioning (and thus poor user experience).

We analyze traffic traces gathered over six months from an HP Cloud Services datacenter, finding that application bandwidth consumption is both time-varying and spatially inhomogeneous. This variability makes it hard to predict requirements. To solve this problem, we develop a prediction algorithm usable by a cloud provider to suggest an appropriate bandwidth guarantee to a tenant. With tenant VM placement using these predictive guarantees, we find that the inter-rack network utilization in certain datacenter topologies can be more than doubled.

1 INTRODUCTION

This paper introduces predictive guarantees, a new abstraction for bandwidth guarantees in cloud networks. A provider of predictive guarantees observes traffic along the network paths between virtual machines (VMs), and uses those observations to predict the data rate requirements over a future time interval. The cloud provider can offer a tenant a guarantee based on this prediction.

Why should clouds offer bandwidth guarantees, and why should they use predictive guarantees in particular? Cloud computing infrastructures, especially public "Infrastructure-as-a-Service" (IaaS) clouds, such as those offered by Amazon, HP, Google, Microsoft, and others, are being used not just by small companies, but also by large enterprises. For distributed applications involving significant inter-node communication, such as in [1], [23], and [24], current cloud systems fail to offer even basic network performance guarantees; this inhibits cloud use by enterprises that must provide service-level agreements.

Others have proposed mechanisms to support cloud bandwidth guarantees (see §2); with few exceptions, these works assume that tenants know what guarantee they want, and require the tenants to explicitly specify bandwidth requirements, or to request a specific set of network resources [1, 17]. But do cloud customers really know their network requirements? Application bandwidth demands can be complex and time-varying, and not all application owners accurately know their bandwidth demands. A tenant's lack of accurate knowledge about its future bandwidth demands can lead to over- or under-provisioning.

For many (but not all) cloud applications, future bandwidth requirements are in fact predictable. In this paper, we use network traces from HP Cloud Services [11] to demonstrate that tenant bandwidth demands can be time-varying and spatially inhomogeneous, but can also be predicted, based on automated inference from their previous history.

We argue that predictive guarantees provide a better abstraction than prior approaches, for three reasons. First, the predictive guarantee abstraction is simpler for the tenant, because the provider automatically predicts a suitable guarantee and presents it to the tenant.

Second, the predictive guarantee abstraction supports time- and space-varying demands. Prior approaches typically offer bandwidth guarantees that are static in at least one of those respects, but these approaches do not capture general cloud applications. For example, we expect temporal variation for user-facing applications with diurnally-varying workloads, and spatial variation in VM-to-VM traffic for applications such as three-tier services.

Third, the predictive guarantee abstraction easily supports fine-grained guarantees. By generating guarantees automatically, rather than requiring the tenant to specify them, we can feasibly support a different guarantee on each VM-to-VM directed path, and for relatively short time intervals.
Fine-grained guarantees are potentially more efficient than coarser-grained guarantees, because they allow the provider to pack more tenants into the same infrastructure.

Recent research has addressed these issues in part. For example, Oktopus [1] supports a limited form of spatial variation; Proteus [28] supports a limited form of temporal-variation prediction. But no prior work, to our knowledge, has offered a comprehensive framework for cloud customers and providers to agree on efficient network-bandwidth guarantees, for applications with time-varying and space-varying traffic demands.

We place predictive guarantees in the concrete context of Cicada, a system that implements predictive guarantees. (Thanks to Dave Levin, Cicada is an acronym: Cicada Infers Cloud Application Demands Automatically.) Cicada observes a tenant's past network usage to predict its future network usage, and acts on its predictions by offering predictive guarantees to the customer, as well as by placing (or migrating) VMs to increase utilization and improve load balancing in the provider's network. Cicada is therefore beneficial to both the cloud provider and cloud tenants. More details about Cicada are available in [15].

2 RELATED WORK

Throughout, we use the following definitions. A provider refers to an entity offering a public cloud (IaaS) service. A customer is an entity paying for use of the public cloud. We distinguish between customer and tenant, as tenant has a specific meaning in OpenStack [20]: operationally, a collection of VMs that communicate with each other. A single customer may be responsible for multiple tenants. Finally, application refers to software run by a tenant; a tenant may run multiple applications at once.

Recent research has proposed various forms of cloud network guarantees. Oktopus supports a two-stage "virtual oversubscribed cluster" (VOC) model [1], intended to match a typical application pattern in which clusters of VMs require high intra-cluster bandwidth and lower inter-cluster bandwidth. VOC is a hierarchical generalization of the hose model [5]; the standard hose model, as used in MPLS, specifies for each node its total ingress and egress bandwidths. The finer-grained pipe model specifies bandwidth values between each pair of VMs. CloudMirror allows tenants to specify a tenant abstraction graph, which reflects the structure of the application itself [17]. Cicada supports any of these models, and unlike these systems, does not require the tenant to specify network demands up front.

The Proteus system [28] profiles specific MapReduce jobs at a fine time scale, to exploit the predictable phased behavior of these jobs. It supports a "temporally interleaved virtual cluster" model, in which multiple MapReduce jobs are scheduled so that their network-intensive phases do not interfere with each other. Proteus assumes uniform all-to-all hose-model bandwidth requirements during network-intensive phases, although each such phase can run at a different predicted bandwidth. Proteus (and other related work [13, 19]) does not generalize to a broad range of enterprise applications as Cicada does.

Hajjat et al. [9] describe a technique to decide which application components to place in a cloud datacenter, for hybrid enterprises where some components remain in a private datacenter. Their technique tries to minimize the traffic between the private and cloud datacenters, and hence recognizes that inter-component traffic demands are spatially non-uniform. In contrast to Cicada, they do not consider time-varying traffic nor how to predict it, and they focus primarily on the consequences of wide-area traffic, rather than intra-datacenter traffic.

Cicada does not focus on the problem of enforcing guarantees [8, 16, 22, 25] nor the tradeoff between guarantees and fairness [21]. Though these issues would arise for a provider using Cicada, they can be addressed with any of the above techniques.

3 THE DESIGN OF CICADA

The goal of Cicada is to free tenants from choosing between under-provisioning for peak periods, or over-paying for unused bandwidth. Predictive guarantees permit a provider and tenant to agree on a guarantee that varies in time and/or space. The tenant can get the network service it needs at a good price, while the provider can avoid allocating unneeded bandwidth and can amortize its infrastructure across more tenants.

3.1 An Overview of Cicada

Cicada has several components, corresponding to the steps in Figure 1. After determining whether to admit a tenant—taking CPU, memory, and network resources into account—and making an initial placement (steps 1 and 2), Cicada measures the tenant's traffic (step 3), and delivers a time series of traffic matrices to a logically centralized controller. The controller uses these measurements to predict future bandwidth requirements (step 4). In most cases, Cicada converts a bandwidth prediction into an offered guarantee for some future interval. Customers may choose to accept or reject Cicada's predictive guarantees (step 5). Because Cicada collects measurement data continually, it can make new predictions and offer new guarantees throughout the lifetime of the tenant.

[Figure 1: Cicada's architecture. (1) Customer makes initial request; (2) provider places tenant VMs and sets initial rate limits; (3) hypervisors enforce rate limits, collect measurements, and continually send measurements to the Cicada controller; (4) Cicada predicts bandwidth for each tenant; (5a) provider may offer the customer a guarantee based on Cicada's prediction; (5b) provider may update placement based on Cicada's prediction.]

Cicada interacts with other aspects of the provider's infrastructure and control system. The provider needs to rate-limit the tenant's traffic to ensure that no tenant undermines the guarantees sold to other customers. We distinguish between guarantees and limits. If the provider's limit is larger than the corresponding guarantee, tenants can exploit best-effort bandwidth beyond their guarantees.
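To make the workflow of Figure 1 and §3.1 concrete, the sketch below shows one way the per-tenant measure-predict-offer loop might look in code. It is an illustrative sketch, not Cicada's implementation; every name here (collect_traffic_matrix, prediction_is_reliable, install_rate_limits, and so on) is a hypothetical placeholder.

import time

EPOCH_SECONDS = 3600  # assumed epoch length; Cicada's actual interval may differ

def cicada_control_loop(tenant, controller, provider):
    """Sketch of the per-tenant loop implied by Figure 1 (steps 3-5b).

    `tenant`, `controller`, and `provider` are hypothetical objects standing in
    for the hypervisor agents, the logically centralized controller, and the
    provider's placement and rate-limiting machinery.
    """
    history = []  # time series of per-epoch VM-to-VM traffic matrices
    while tenant.is_active():
        # Step 3: hypervisors report measurements to the controller.
        matrix = controller.collect_traffic_matrix(tenant, interval=EPOCH_SECONDS)
        history.append(matrix)

        # Step 4: predict the next epoch's bandwidth needs from history.
        prediction = controller.predict(history)

        if controller.prediction_is_reliable(prediction, history):
            # Step 5a: offer a predictive guarantee, capped by provider policy.
            guarantee = provider.cap(prediction)
            if tenant.accepts(guarantee):
                provider.install_rate_limits(tenant, guarantee)
            # Step 5b: optionally re-place or migrate VMs using the prediction.
            provider.maybe_update_placement(tenant, prediction)

        time.sleep(EPOCH_SECONDS)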

A provider may wish to place VMs based on their associated bandwidth guarantees, to improve network utilization. We describe a VM placement method in §6. The provider could also migrate VMs to increase the number of guarantees that the network can support [6].

3.2 Assumptions

In order to make predictions, Cicada needs to gather sufficient data about a tenant's behavior. Based on our evaluation, Cicada may need at least an hour or two of data before it can offer useful predictions.

Any shared resource that provides guarantees must include an admission control mechanism, to avoid making infeasible guarantees. We assume that Cicada will incorporate network admission control using an existing mechanism [1, 12, 14]. We also assume that the cloud provider has a method to enforce guarantees, such as [21].

3.3 Measurement Collection

Cicada collects a time series of traffic matrices for each tenant, either passively, by collecting NetFlow or sFlow data at switches within the network, or by using an agent that runs in the hypervisor of each machine. We would like to base predictions on offered load, but when the provider imposes rate limits, we risk underestimating peak loads that exceed those limits, and thus under-predicting future traffic. We can detect when a VM's traffic is rate-limited via packet drops, so these underestimates can be detected, too, although not precisely quantified. Currently, Cicada does not account for this potential error.

3.4 Prediction Model

Some applications, such as backup or database ingestion, require bulk bandwidth, and need guarantees that the average bandwidth over a period of H hours will meet their needs (e.g., a guarantee that two VMs will be able to transfer 3.6 Gbit in an hour, but not necessarily at a constant rate of 1 Mbit/s). Other applications, such as user-facing systems, require guarantees for peak bandwidth over much shorter intervals (e.g., a guarantee that two VMs will be able to transfer at a rate of 1 Mbit/s for the next hour).

Cicada's predictions describe the maximum bandwidth expected during any averaging interval δ during a given time interval H. If δ = H, the prediction is for the bandwidth requirement averaged over H hours, but for an interactive application, δ might be just a few milliseconds.

Note that the predictive guarantees offered by Cicada are limited by any caps set by the provider; thus, a proposed guarantee might be lower than suggested by the prediction algorithm.

A predictive guarantee entails some risk of either under-provisioning or over-provisioning, and different tenants will have different tolerances for these risks, typically expressed as a percentile (e.g., the tenant wants sufficient bandwidth for 99.99% of the 10-second intervals). Cicada's prediction algorithm also recognizes conditions under which the prediction is unreliable, in which case Cicada does not propose a guarantee for a tenant.

3.5 Recovering from Faulty Predictions

Cicada's prediction algorithm may make faulty predictions because of inherent limitations or insufficient prior information. Because Cicada continually collects measurements, it can detect when its current guarantee is inappropriate for the tenant's current network load.

When Cicada detects a faulty prediction, it can take one of many actions: stick to the existing guarantee, propose a new guarantee, upgrade the tenant to a higher, more expensive guarantee, etc. How and whether to upgrade guarantees, as well as what to do if Cicada over-predicts, is a pricing-policy decision, and outside our scope. We note, however, that typical Service Level Agreements (SLAs) include penalty clauses, in which the provider agrees to remit some or all of the customer's payment if the SLA is not met.

4 MEASUREMENT RESULTS

We designed Cicada under the assumption that the traffic from cloud tenants is predictable, but in ways that are not captured by existing models. Before building Cicada, we collected data from HP Cloud Services, to analyze the spatial and temporal variability of its tenants' traffic.

4.1 Data

We have collected sFlow [27] data from HP Cloud Services, which we refer to as the HPCS dataset. Our dataset consists of about six months of samples from 71 Top-of-Rack (ToR) switches. Each ToR switch connects to either 48 servers via 1GbE NICs, or 16 servers via 10GbE NICs. In total, the data represents about 1360 servers, spanning several availability zones. This dataset differs from those in previous work—such as [2, 3, 7]—in that it captures VM-to-VM traffic patterns.

We aggregate the data over five-minute intervals, to produce datapoints of the form ⟨timestamp, source, destination, number of bytes transferred from source to destination⟩. We keep only the datapoints where both the source and destination are private IPs of virtual machines, and thus all our datapoints represent traffic within the datacenter. Making predictions about traffic traveling outside the datacenter or between datacenters is a challenging task, which we do not address in this work.

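As a concrete illustration of this preprocessing step, the sketch below aggregates raw flow samples into five-minute, per-VM-pair byte counts and drops traffic that leaves the datacenter. The record layout and the is_private_vm_ip helper are illustrative assumptions, not the exact format of the HPCS pipeline.

from collections import defaultdict

FIVE_MINUTES = 300  # seconds

def is_private_vm_ip(addr):
    # Hypothetical check; a real deployment would consult the VM inventory.
    return addr.startswith("10.")

def aggregate_samples(samples):
    """Turn (timestamp, src_ip, dst_ip, n_bytes) samples into five-minute
    datapoints of the form (interval_start, source, destination, bytes)."""
    buckets = defaultdict(int)
    for ts, src, dst, n_bytes in samples:
        if not (is_private_vm_ip(src) and is_private_vm_ip(dst)):
            continue  # keep only intra-datacenter VM-to-VM traffic
        interval_start = int(ts) - (int(ts) % FIVE_MINUTES)
        buckets[(interval_start, src, dst)] += n_bytes
    return [(t, s, d, b) for (t, s, d), b in sorted(buckets.items())]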
Under the agreement by which we obtained this data, we are unable to reveal information such as the total number of tenants, the number of VMs per tenant, or the growth rates for these values.

4.2 Spatial and Temporal Variability

To quantify spatial variability, we compare the observed tenants to a static, all-to-all tenant. This tenant is the easiest to make predictions for: every intra-tenant connection sends the same amount of data at a constant rate. Let Fij be the fraction of this tenant's traffic sent from VMi to VMj. For this "ideal" all-to-all tenant, all of these values are equal; every VM sends the same amount of data to every other VM. As another example, consider a bimodal tenant with some pairs that never converse and others that converse equally. Let k = n/2. Each VM communicates with k VMs, and sends 1/k of its traffic; hence half of the F values are 1/k, and the other half are zero.

For each tenant in the HPCS dataset, we calculate the distribution of its F values, and plot the coefficient of variation in Figure 2. The median cov value is 1.732, which suggests nontrivial overall spatial variation (for contrast, the cov of our ideal tenant is zero, and the cov of our bimodal tenant is one: for the bimodal tenant, µ = 1/(2k) and σ = sqrt((1/n) ∑_{i=1}^{n} (1/(2k))²) = 1/(2k), so cov = σ/µ = 1).

[Figure 2: Variation in the HPCS dataset. CDFs of spatial variation (left) and temporal variation (right), the latter for time intervals H of 1, 4, and 24 hours.]

To quantify temporal variability, we first pick a time interval H. For each consecutive H-hour interval, we calculate the sum of the total number of bytes sent between each pair p of VMs, which gives us a distribution Tp of bandwidth totals. We then compute the coefficient of variation for this distribution, covp. The temporal variation for a tenant is the weighted sum of these values. The weight for covp is the bandwidth used by pair p, to reflect the notion that tenants where only one small flow changes over time are less temporally-variable than those where one large flow changes over time.

For each tenant in the HPCS dataset, we calculate its temporal variation value, and plot the CDF in Figure 2. As with spatial variation, most tenants have high temporal variability. This variability decreases as we increase the time interval H, but we see variability at all time scales.

These results indicate that a rigid model is insufficient for making traffic predictions. Given the high spatial cov values in Figure 2, a less strict but not entirely general model such as VOC [1] may not be sufficient either. Furthermore, the high temporal variation values indicate that static models cannot accurately represent the traffic patterns of the tenants in the HPCS dataset.

5 CICADA'S TRAFFIC PREDICTION METHOD

Much work has been done on predicting traffic matrices from noisy measurements such as link-load data [26, 29]. In these scenarios, prediction approaches such as Kalman filters and Hidden Markov Models—which try to estimate true values from noisy samples—are appropriate. Cicada, however, knows the exact traffic matrices observed in past epochs; its problem is to predict a future traffic matrix.

Cicada's prediction algorithm adapts Herbster and Warmuth's "tracking the best expert" idea [10], which has been successfully adapted before in wireless power-saving and energy reduction contexts [4, 18]. To predict the traffic matrix for epoch n+1, we use all previously observed traffic matrices, M1, ..., Mn (below, we show that data from the distant past can be pruned away without affecting accuracy). In such a matrix, the entry in row i and column j specifies the number of bytes that VM i sent to VM j in the corresponding epoch (for predicting average bandwidth) or the maximum observed over a δ-length interval (for peak bandwidth).

The algorithm assigns each of the previous matrices a weight as part of a linear combination. The result of this combination is the prediction for Mn+1. Over time, these weights get updated; the higher the weight of a previous matrix, the more important that matrix is for prediction. In the HPCS dataset, we found that the 12 most recent hours of traffic are weighted heavily, and there is also a spike at 24 hours earlier. Weights are vanishingly small prior to this. Thus, at least in our dataset, one does not need weeks' worth of data to make reliable predictions.

We refer the reader to [15] for a full description of our prediction algorithm, including the method for updating the weights, as well as an evaluation against a VOC-style prediction algorithm [1]. To summarize those results, Cicada's prediction method decreases the median relative error by 90% compared to VOC when predicting either average or peak bandwidth. Cicada also makes predictions quickly. In the HPCS dataset, the mean prediction speed is under 10 msec in all but one case (under 25 msec in all cases).

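The following sketch illustrates the flavor of this predictor: the next matrix is a normalized weighted combination of past matrices, and weights are updated multiplicatively according to how well each past matrix matched the newly observed one. The update rule, the learning-rate parameter eta, and the peak-over-δ helper are simplifications assumed here for illustration; Cicada's actual algorithm is described in [10, 15].

import numpy as np

def predict_next(history, weights):
    """Prediction for epoch n+1: a weighted linear combination of the
    previously observed traffic matrices M1, ..., Mn."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so the prediction stays in byte units
    return sum(wi * M for wi, M in zip(w, history))

def update_weights(history, weights, observed, eta=1.0):
    """Multiplicative update in the spirit of tracking-the-best-expert [10]:
    past matrices that resembled the newly observed matrix gain weight.
    This is an illustrative rule, not Cicada's exact one (see [15])."""
    scale = max(observed.sum(), 1.0)
    new_weights = []
    for wi, M in zip(weights, history):
        loss = np.abs(M - observed).sum() / scale  # relative absolute error
        new_weights.append(wi * np.exp(-eta * loss))
    return new_weights

def peak_over_delta(bytes_per_sample, delta_samples):
    """Peak bandwidth over any delta-length window of a per-pair time series,
    used when predicting peak rather than average bandwidth (cf. §3.4)."""
    sums = np.convolve(bytes_per_sample, np.ones(delta_samples), mode="valid")
    return sums.max() / delta_samples

In use, one would call predict_next at the start of each epoch and, once the epoch's true matrix has been observed, append it to history, give it an initial weight, and call update_weights before the next prediction.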
Algorithm 1 Cicada's VM placement algorithm
 1: P = set of VM pairs, in descending order of their bandwidth predictions
 2: for (src, dst) ∈ P do
 3:   if resources aren't available to place (src, dst) then
 4:     revert reservation
 5:     return False
 6:   A = All available paths
 7:   if src or dst has already been placed then
 8:     restrict A to paths including the appropriate endpoint
 9:   Place (src, dst) on the path in A with the most available bandwidth

[Figure 3: Inter-rack bandwidth available after placement. Available inter-rack bandwidth (GByte/hour) vs. the oversubscription factor, for bandwidth factors of 1x, 25x, and 250x.]
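For concreteness, here is one way the greedy strategy of Algorithm 1 might be rendered in Python. The topology object and its methods (can_place, available_paths, reserve, and so on) are hypothetical placement bookkeeping, so this is a sketch of the algorithm's structure, not Cicada's implementation.

def place_tenant(pairs, topology):
    """Greedy placement sketch following Algorithm 1.

    `pairs` is a list of (src_vm, dst_vm, predicted_bw) tuples, already sorted
    in descending order of predicted bandwidth (line 1)."""
    placed = {}  # VM -> host chosen so far
    for src, dst, bw in pairs:                                   # line 2
        if not topology.can_place(src, dst, bw):                 # line 3
            topology.revert_reservations()                       # line 4
            return False                                         # line 5
        paths = topology.available_paths()                       # line 6
        for vm in (src, dst):                                    # lines 7-8
            if vm in placed:
                paths = [p for p in paths if placed[vm] in p.hosts]
        best = max(paths, key=lambda p: p.available_bandwidth)   # line 9
        placed.setdefault(src, best.hosts[0])
        placed.setdefault(dst, best.hosts[-1])
        topology.reserve(best, src, dst, bw)
    return True

Because pairs are processed in descending order of predicted bandwidth, the heaviest pairs are given the highest-bandwidth (typically intra-rack) paths first, mirroring the description of the algorithm in §6.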
6 EVALUATION

Although Cicada's predictive guarantees allow cloud providers to decrease wasted bandwidth, it is not clear that these providers treat wasted intra-rack and inter-rack bandwidth equally. Inter-rack bandwidth may cost more, and even if intra-rack bandwidth is free, over-allocating network resources on one rack can prevent other tenants from being placed on the same rack.

To determine whether wasted bandwidth is intra- or inter-rack, we need a VM-placement algorithm. For VOC, we use the placement algorithm detailed in [1], which tries to place clusters on the smallest subtree that will contain them. For Cicada, we developed a greedy placement algorithm, detailed in Algorithm 1. This algorithm is similar in spirit to VOC's placement algorithm; Cicada's algorithm tries to place the most-used VM pairs on the highest-bandwidth paths, which in a typical datacenter corresponds to placing them on the same rack, and then the same subtree. However, since Cicada uses fine-grained, pipe-model predictions, it has the potential to allocate more flexibly; VMs that do not transfer data to one another need not be placed on the same subtree, even if they belong to the same tenant.

We compare Cicada's placement algorithm against VOC's on a simulated physical infrastructure with 71 racks of 16 servers each, 10 VM slots per server, and (10/Op) Gbps inter-rack links, where Op is the physical oversubscription factor. For each algorithm, we select a random tenant, use the ground-truth data to determine this tenant's bandwidth needs for a random hour of its activity, and place its VMs. We repeat this process until 99% of the VM slots are filled. Using the ground-truth data allows us to compare the placement algorithms explicitly, without conflating this comparison with prediction errors. To get a sense of what would happen with more network-intensive tenants, we also evaluated scenarios where each VM-pair's relative bandwidth use was multiplied by a constant bandwidth factor (1×, 25×, or 250×).

Figure 3 shows how the available inter-rack bandwidth—what remains unallocated after the VMs are placed—varies with Op, the physical oversubscription factor. In all cases, Cicada's placement algorithm leaves more inter-rack bandwidth available. When Op is greater than two, both algorithms perform comparably, since this is a constrained environment with little bandwidth available overall. However, with lower over-subscription factors, Cicada's algorithm leaves more than twice as much bandwidth available, suggesting that it uses network resources more efficiently in this setting.

Over-provisioning reduces the value of our improved placement, but it does not remove the need for better guarantees. Even on an over-provisioned network, a tenant whose guarantee is too low for its needs may suffer if its VM placement is unlucky. A "full bisection bandwidth" network is only that under optimal routing; bad routing decisions or bad placement can still waste bandwidth.

7 CONCLUSION

This paper described the rationale for predictive guarantees in cloud networks, and the design of Cicada, a system that provides this abstraction to tenants. Cicada provides fine-grained, temporally- and spatially-varying guarantees without requiring the clients to specify their demands explicitly. We outlined a prediction algorithm where the prediction for a future epoch is a weighted linear combination of past observations, with the weights updated and learned online in an automatic way. Using traces from HP Cloud Services, we showed that the fine-grained structure of predictive guarantees can be used by a cloud provider to improve network utilization in certain datacenter topologies.

ACKNOWLEDGMENTS

We are indebted to a variety of people at HP Labs and HP Cloud for help with data collection and for discussions of this work. In particular, we thank Sujata Banerjee, Ken Burden, Phil Day, Eric Hagedorn, Richard Kaufmann, Jack McCann, Gavin Pratt, Henrique Rodrigues, and Renato Santos. This work was supported in part by the National Science Foundation under grant IIS-1065219 as well as an NSF Graduate Research Fellowship.

REFERENCES

[1] Ballani, H., Costa, P., Karagiannis, T., and Rowstron, A. Towards Predictable Datacenter Networks. In SIGCOMM (2011).
[2] Benson, T., Akella, A., and Maltz, D. A. Network Traffic Characteristics of Data Centers in the Wild. In IMC (2010).
[3] Benson, T., Anand, A., Akella, A., and Zhang, M. Understanding Data Center Traffic Characteristics. In WREN (2009).
[4] Deng, S., and Balakrishnan, H. Traffic-aware Techniques to Reduce 3G/LTE Wireless Energy Consumption. In CoNEXT (2012).
[5] Duffield, N. G., Goyal, P., Greenberg, A., Mishra, P., Ramakrishnan, K. K., and van der Merwe, J. E. A Flexible Model for Resource Management in Virtual Private Networks. In SIGCOMM (1999).
[6] Erickson, D., Heller, B., Yang, S., Chu, J., Ellithorpe, J., Whyte, S., Stuart, S., McKeown, N., Parulkar, G., and Rosenblum, M. Optimizing a Virtualized Data Center. In SIGCOMM Demos (2011).
[7] Greenberg, A., Hamilton, J. R., Jain, N., Kandula, S., Kim, C., Lahiri, P., Maltz, D. A., Patel, P., and Sengupta, S. VL2: A Scalable and Flexible Data Center Network. In SIGCOMM (2009).
[8] Guo, C., Lu, G., Wang, H. J., Yang, S., Kong, C., Sun, P., Wu, W., and Zhang, Y. SecondNet: A Data Center Network Virtualization Architecture with Bandwidth Guarantees. In CoNEXT (2010).
[9] Hajjat, M., Sun, X., Sung, Y.-W. E., Maltz, D., Rao, S., Sripanidkulchai, K., and Tawarmalani, M. Cloudward Bound: Planning for Beneficial Migration of Enterprise Applications to the Cloud. In SIGCOMM (2010).
[10] Herbster, M., and Warmuth, M. K. Tracking the Best Expert. Machine Learning 32 (1998).
[11] HP Cloud. http://hpcloud.com.
[12] Jamin, S., Shenker, S. J., and Danzig, P. B. Comparison of Measurement-based Admission Control Algorithms for Controlled-load Service. In Infocom (1997).
[13] Kim, G., Park, H., Yu, J., and Lee, W. Virtual Machines Placement for Network Isolation in Clouds. In RACS (2012).
[14] Knightly, E. W., and Shroff, N. B. Admission Control for Statistical QoS: Theory and Practice. IEEE Network 13, 2 (1999).
[15] LaCurts, K., Mogul, J. C., Balakrishnan, H., and Turner, Y. Cicada: Predictive Guarantees for Cloud Network Bandwidth. Tech. Rep. MIT-CSAIL-TR-004, Massachusetts Institute of Technology, 2014.
[16] Lam, T. V., Radhakrishnan, S., Pan, R., Vahdat, A., and Varghese, G. NetShare and Stochastic NetShare: Predictable Bandwidth Allocation for Data Centers. SIGCOMM CCR 42, 3 (2012).
[17] Lee, J., Lee, M., Popa, L., Turner, Y., Banerjee, S., Sharma, P., and Stephenson, B. CloudMirror: Application-Aware Bandwidth Reservations in the Cloud. In HotCloud (2013).
[18] Monteleoni, C., Balakrishnan, H., Feamster, N., and Jaakkola, T. Managing the 802.11 Energy/Performance Tradeoff with Machine Learning. Tech. Rep. MIT-LCS-TR-971, Massachusetts Institute of Technology, 2004.
[19] Niu, D., Feng, C., and Li, B. Pricing Cloud Bandwidth Reservations Under Demand Uncertainty. In Sigmetrics (2012).
[20] OpenStack Operations Guide - Managing Projects and Users. http://docs.openstack.org/trunk/openstack-ops/content/projects_users.html.
[21] Popa, L., Kumar, G., Chowdhury, M., Krishnamurthy, A., Ratnasamy, S., and Stoica, I. FairCloud: Sharing the Network in Cloud Computing. In SIGCOMM (2012).
[22] Raghavan, B., Vishwanath, K., Ramabhadran, S., Yocum, K., and Snoeren, A. C. Cloud Control with Distributed Rate Limiting. In SIGCOMM (2007).
[23] Rasmussen, A., Conley, M., Kapoor, R., Lam, V. T., Porter, G., and Vahdat, A. Themis: An I/O-Efficient MapReduce. In SOCC (2012).
[24] Rasmussen, A., Porter, G., Conley, M., Madhyastha, H., Mysore, R. N., Pucher, A., and Vahdat, A. TritonSort: A Balanced Large-Scale Sorting System. In NSDI (2011).
[25] Rodrigues, H., Santos, J. R., Turner, Y., Soares, P., and Guedes, D. Gatekeeper: Supporting Bandwidth Guarantees for Multi-tenant Datacenter Networks. In WIOV (2011).
[26] Roughan, M., Thorup, M., and Zhang, Y. Traffic Engineering with Estimated Traffic Matrices. In IMC (2003).
[27] sFlow. http://sflow.org.
[28] Xie, D., Ding, N., Hu, Y. C., and Kompella, R. The Only Constant is Change: Incorporating Time-Varying Network Reservations in Data Centers. In SIGCOMM (2012).
[29] Zhang, Y., Roughan, M., Duffield, N., and Greenberg, A. Fast Accurate Computation of Large-Scale IP Traffic Matrices from Link Loads. In Sigmetrics (2003).
