sustainability

Article
Optimized Energy Cost and Carbon Emission-Aware
Virtual Machine Allocation in Sustainable
Data Centers
T. Renugadevi 1, *, K. Geetha 1 , K. Muthukumar 2 and Zong Woo Geem 3, *
 1    School of Computing, SASTRA Deemed University, Thanjavur 613401, India; geetha@cse.sastra.edu
 2    School of Electrical and Electronics Engineering, SASTRA Deemed University, Thanjavur 613401, India;
      kmuthukumar@eee.sastra.edu
 3    Department of Energy IT, Gachon University, Seongnam 13120, Korea
 *    Correspondence: renugadevi@cse.sastra.edu (T.R.); geem@gachon.ac.kr (Z.W.G.);
      Tel.: +91-975-0887-871 (T.R.); +82-317-505-586 (Z.W.G.)
                                                                                                      
 Received: 29 May 2020; Accepted: 4 August 2020; Published: 7 August 2020                             

 Abstract: A cloud data center’s total operating cost is dominated by the electricity cost and carbon tax
 incurred due to energy consumption from the grid and its associated carbon emission. In this work,
 we consider geo-distributed sustainable data centers with varying on-site green energy generation,
 electricity prices, carbon intensity and carbon tax. The objective function is devised to reduce the
 operating cost including electricity cost and carbon cost incurred on the power consumption of servers
 and cooling devices. We propose renewable-aware algorithms to schedule the workload to the data
centers with an aim to maximize the green energy usage. Due to the uncertainty and time-variant
 nature of renewable energy availability, an investigation is performed to identify the impact of carbon
 footprint, carbon tax and electricity cost in data center selection on total operating cost reduction.
 In addition, on-demand dynamic optimal frequency-based load distribution within the cluster nodes
 is performed to eliminate hot spots due to high processor utilization. The work suggests optimal
virtual machine placement decisions to maximize green energy usage with reduced operating cost and
 carbon emission.

 Keywords: cloud computing; virtual machine placement; sustainable data center; energy efficiency;
 renewable energy; carbon footprint

1. Introduction
      Large data centers are nowadays an integral part of the information technology (IT) industry.
Cloud-based services are highly preferred by organizations and individuals. Organizations consolidate
multiple clusters into large data centers. Power consumption has been a significant economic and
environmental issue in data centers due to growing demand. The growth of the data center’s energy
consumption is approximately 10–12% per year [1]. The geo-distributed data centers enable providers
to establish different renewable energy sources based on the environment. The energy cost associated
with data centers is approximately 42% of the overall operating cost of the data centers [2]. The service
providers are compelled to improve the infrastructure related to server power consumption, cooling
provisioning and heat dissipation while maintaining service level agreement (SLA). Data centers
contribute to 2% of the world’s total carbon dioxide (CO2 ) emission due to high energy consumption.
The cost involved with cooling infrastructure can be 50% or more in a poorly designed data center [3].
Due to increasing power density, heat and thermal management are crucial for data centers to increase
the lifetime of the servers and to reduce economic loss in the form of electricity bills. The two possible
ways to overcome the problem of CO2 emission are: (1) the grid power source to be replaced with renewable
energy sources; (2) Improve the Power Usage Effectiveness (PUE) of the data centers. The Green Grid
consortium [4] defines the PUE metric as the ratio between the total power consumed by the data
center (IT power + overhead power) and the power consumed by the servers executing the IT load (IT power).
The overhead power includes the power consumed by data center infrastructure other than server
power. The overhead power is mainly dominated by the power consumed by Computer Room Air
Conditioning (CRAC) devices. The increase in temperature inside the data center is due to two factors:
(1) Utilization of CPU in higher frequencies; (2) Increase in outside temperature. Thermal management
of CRAC units is performed based on rack-level IT loads [5,6]. Two temperature-aware algorithms were
proposed to prevent hot spots and to minimize the rise of operating temperature [7]. A game-based
thermal-aware resource allocation was proposed in [8]. It uses a cooperative Nash-bargaining solution
to reduce the thermal imbalance in data centers. Threshold-based thermal management was introduced
in [9] to handle hot spots effectively but failed to treat the thermal imbalance. Thermal management is
proposed to distribute the load at the rack level to handle temperature drop effectively but fails to
handle hotspots [10].
      A lower PUE indicates a more efficient data center, with less overhead power and more IT
power. Cloud providers’ PUEs range from 1.1 to 1.2 [11,12]. Collocated small data centers still
provide PUE up to 2 [13]. Mixed-integer linear programming was used to minimize operating cost,
energy cost and reliability cost by minimizing active PMs in data centers [14]. Stochastic search based
on a genetic algorithm was used to reduce IT power consumption and migration cost by considering
energy-aware virtual machine migration [15]. Facebook, Amazon, Microsoft, Apple and Google have
built suitable clean energy sources based on their locations [16–18]. Since clean energy is not
consistent, it carries more challenges in its efficient usage. Data centers also draw on off-site grid
energy to power the infrastructure and to balance the inconsistent nature of renewable energy. The nature
of variable workloads in data centers and prediction algorithms contribute to power and resource
management to use clean energy more effectively in data centers. The two popular on-site energy
sources considered are solar and wind. Solar energy follows a pattern; it increases gradually from
the morning, reaches its peak at noon, and progressively slows down. Wind energy does not have a
pattern of generation. Renewable energy availability varies based on the location of the data center.
It paves a way to target the load to the data center with the maximum renewable source to use clean
energy effectively.
      In the current state of the art, works have been carried out from different perspectives, considering
traditional energy management techniques to act on energy reduction within data centers. This work
highlights the factors, namely, server energy consumption reduction and service providers’ operating
cost and carbon emission reduction. For server energy consumption reduction, it considers the variation
of the core parameters of DVFS (Dynamic Voltage Frequency Scaling), namely, frequency, utilization and
power consumption. Concerning workload, the on-demand dynamic optimal frequency for the nodes
in the cluster is identified and load balancing is performed to eliminate hot spots due to high processor
utilization. Secondly, as many providers own geo-distributed data centers powered by a mixed supply
of both grid and renewable sources, this work aims to efficiently utilize the renewable source to
reduce the total operating cost and carbon emission. The impact of electricity price, carbon footprint,
carbon cost on server and cooling device power consumption are taken into consideration while
formulating the proposed objective function. In our previous work [19], VM placement considering
dynamic optimal frequency-based allocation and standard power efficient algorithm (C-PE) were
compared. This work is the extension of our previous work with both brown and green energy sources
and related energy cost parameters towards the realization of the proposed objective.
      In this work, we pose the following questions: (1) When the renewable energy source is not
stable, how can its usage be maximized? (2) How can the power consumed by CRAC
devices and IT devices be reduced to lower the total electricity cost? (3) How can carbon emission be reduced?
To address them, an energy source and DVFS-aware VM placement algorithm is proposed to minimize total
cost, carbon footprint and cooling device power consumption for geo-distributed data centers with
a mixed supply of grid and clean energy. Container technology along with virtualization is used to
provide the necessary environment and isolation for task execution [20].
     To achieve the above said objective, the following measures are carried out in this work as
key contributions.

•     Optimal DVFS-based VM scheduling is performed to distribute the load among the servers to
      minimize the operating temperature.
•     Formulation of an objective function for data center selection with the consideration of varying
      carbon tax, electricity cost and carbon intensity.
•     Investigation on the effect of renewable energy source-based data center selection on total cost,
      carbon cost and CO2 emission.
•     The efficient utilization of VMs is carried out by appropriate VM sizing and mapping of containers
      to available VM types.
•     K-medoids algorithm is used to identify container types.
•     Examined the upshot of workload-based tuning of cooling load on total power consumption.

     The remaining sections of the paper are structured as follows: In Section 1, data centers’ power
consumption information is delineated. In Section 2, existing research works in the literature related
to virtual machine placement and containers are discussed. The architecture of the sustainable data
center system model and the problem formulation of stochastic virtual machine placement are given in
Sections 3 and 4. Sections 5 and 6 briefly explain the task classifications of the Google cluster workload
and the proposed algorithms. In Section 7, the experimental environment and evaluations of proposed
algorithms are detailed. Section 8 concludes the findings of this research work.

2. Related Works
        Extensive research has been carried out to deal with energy efficiency in data centers. The focus
is on optimal QoS, efficient utilization of resources and operating cost reduction. However,
it is still a challenging task to satisfy the necessities of users and service providers with efficient energy
management. From an energy efficiency perspective, the focus may be at the software level, hardware level
or intermediate level [21].

2.1. DVFS and Energy-Aware VM Scheduling
     The growth of data centers in terms of size and quantity leads to a significant increase in energy
consumption, resulting in more challenges in its management. In the DVFS-based energy efficient power
management approach, the working frequency and voltage of CPU are adjusted dynamically to alter
the energy utilization of the servers. For effective energy savings in data centers, the task scheduling
is carried out based on DVFS. The authors in [22] have proposed an energy-aware VM allocation
algorithm intending to solve a multi-objective problem considering the optimization of job and power
consumption along with its associated constraints. DVFS-based energy management and scheduling
on heterogeneous systems is performed in [23]. Web server’s performance control issues were handled
using DVFS as a control variable to reduce the server’s energy consumption [24].
     DVFS-based approach has been proposed with an objective to enhance the utilization of
resources and minimize the energy consumption without compromising the performance of the
system. The workloads are prioritized based on available resource demand and explicit service
level agreement requisite [25]. DVFS-based technique has been utilized for constrained parallel tasks
in [26]. The authors claim that the proposed method can minimize the energy consumption with
minimum task execution time. DVFS-based approach was applied for optimizing the energy efficiency
of the data centers in [27]. To enhance the trade-offs among application performance and energy
savings, an integrated approach of DVFS and VM consolidation has been addressed and it has been
authenticated using real test bed [28]. The results implicate that there is a trade-off between energy and
migration time while performing energy efficient VM consolidation among geographically distributed
data centers.
      A task model has been proposed in [29] which depicts the QoS of the tasks at the lowest frequency.
Energy consumption ratio (ECR) has been utilized to estimate the efficiency of diverse frequencies
in task execution. To reduce energy consumption of the servers, the incoming tasks are dispatched
to the active servers and then the execution frequencies are adjusted. Migration algorithm has been
utilized on individual servers to balance the workload dynamically to minimize the ECR of the server.
In [30], a power-aware extension of WorkflowSim has been used to integrate a power model for the
optimization of pre-eminent energy saving management, considering computing, reconfiguration and
network costs; host energy saving is achieved through DVFS. The above-mentioned
approaches aim to minimize the energy consumption of the data centers as much as possible with
performance trade-off.
      Comparatively, in our approach, we consider the renewable energy source along with brown
energy for sharing the energy consumption while formulating the optimization problem which would
lead to different scenarios to support performance improvement of the data centers.

2.2. Regional Diversity of Electricity Price and Carbon Footprint-Aware VM Scheduling in Multi-Cloud Green
Data Centers
     A few authors have formulated the VM allocation problem by merging the energy consumption of
data centers with its carbon footprint. Carbon-aware resource allocation considering a single data
center was proposed in [31] for provisioning on-demand resources on servers powered by renewable
energy. Load distribution among different data centers was proposed in [32] considering brown
energy consumption cost. A Min Brown VM placement algorithm was introduced in [33] to minimize
brown energy consumption considering the task deadline. VM migration between federated data
centers was performed to minimize brown energy cost by considering dynamic electricity pricing [34].
The migration of VMs was considered with an aim to minimize carbon footprint in the federated
cloud [35]. A combination of wind and solar energy sources was considered with an aim to distribute
the load with zero brown energy cost [36]. Delay constraint applications were considered with an aim
to reduce electricity pricing [37].
     The authors in [38] have addressed the VM placement problem with an aim to minimize energy
and the cost associated with the carbon footprint in geologically distributed data centers, located within
the same country. A dynamic workload scheduling technique has been proposed in [39] for the servers
powered by renewable energy source. To use the renewable energy in an efficient manner, workload
migration has been addressed in [40]. The authors in [41] proposed a middleware system called
GreenWare with an aim to increase the renewable energy usage by the geo-distributed data centers
powered by wind and solar power. The focus of the study was to minimize the carbon footprint of
certain requests within a predetermined budget cost by the service provider. An adjustable workload
allocation approach within the geographically distributed data centers based on the renewable energy
availability has been proposed in [42]. Few researchers focused their research on resource management
strategies in the multi-cloud environments. To balance the workload optimally among the geographical
distributed data centers, an algorithm has been proposed in [43] to increase the green energy usage
and minimize brown energy.
     With an aim to minimize the brown energy utilization, a load balancing approach has been proposed
by utilizing the available green energy [44]. A framework has been introduced in [45] with an aim
to minimize the total electricity price of data centers. Based on the renewable energy availability,
load balancing has been done among multiple data centers. A workload and energy management scheme
has been introduced to decrease the operational cost of the network and energy costs [46]. A dynamic
workloads deferral algorithm has been introduced in [47] for multi-cloud environment. Based on the
diverged location of the data centers, the dynamic electricity prices are taken into account while ensuring
the workloads deadline. To allocate the workloads in the sustainable data centers located at different
locations, Markov Chain-based workload scheduling algorithm has been proposed in [48].
      In the above-mentioned approaches, the authors focused their problem formulation on
minimizing the total electricity costs of data centers without considering the carbon cost. The data
center partially fed by green energy helps the cloud provider to minimize the coal-based energy sources
dependency. Comparatively in our approach, we consider the renewable energy source along with
brown energy for sharing the energy consumption of the data centers with an aim to reduce the total
electricity costs and carbon cost in the geo-distributed data centers.
      The amount of renewable energy availability and carbon intensity depends on the location of the
data centers. Compared to existing approaches summarized in Table 1, to enhance the renewable energy
utilization, we consider the workload shifting approach within the geographically distributed data
centers with variation in the carbon intensities and its green energy availability. Based on the availability
of green energy, carbon emission in tons/MWh, electricity price and carbon cost, the preference has
been given for the selection of data center for workload shifting. However, due to the intermittent
nature of green energy generation, it is still essential to exploit the aforementioned parameters on
operating cost incurred due to brown energy support.

                Table 1. Comparison summary of existing work for Virtual Machine (VM) placement.

      Ref. No.           | DVFS | Green Energy | Workload Shifting | Multi-Cloud | Energy | Cost of Electricity | SLA | Carbon Footprint
      [25]               | -    | -            | -                 | -           | Yes    | -                   | Yes | -
      [26]               | Yes  | -            | -                 | -           | Yes    | -                   | Yes | -
      [27]               | Yes  | -            | -                 | -           | Yes    | -                   | Yes | -
      [28]               | Yes  | -            | -                 | -           | Yes    | -                   | Yes | -
      [44]               | -    | Yes          | -                 | Yes         | Yes    | Yes                 | -   | -
      [46]               | -    | Yes          | -                 | Yes         | Yes    | Yes                 | -   | -
      [45]               | -    | Yes          | -                 | Yes         | Yes    | Yes                 | -   | -
      [47]               | -    | -            | Yes               | Yes         | Yes    | Yes                 | Yes | -
      [48]               | -    | Yes          | Yes               | Yes         | Yes    | Yes                 | -   | -
      [38]               | -    | Yes          | -                 | Yes         | Yes    | Yes                 | -   | Yes
      [39]               | -    | Yes          | Yes               | -           | Yes    | -                   | Yes | -
      Proposed Approach  | Yes  | Yes          | Yes               | Yes         | Yes    | Yes                 | Yes | Yes

2.3. Containers
     Containers are a lightweight alternative to virtual machines, with lower startup time and
communication overhead. They provide the virtual platform and task isolation at the operating system level.
The containers are more prevalent in providing a platform as a service in a cloud environment [49].
The container technology Docker was compared with the kernel-based virtual machine (KVM)
in terms of processing, memory and storage, and the performance of containers was close to bare
metal, without the virtualization overhead seen in VMs. Containers allow horizontally scalable
systems for hosting microservices. There is a constraint of resource exploitation under process
groups in container-based virtualization techniques [50]. A container as a service lays a bridge
between infrastructure-as-a-service (IaaS) and platform-as-a-service (PaaS). Containers offer a portable
application environment, freeing application services from platform-as-a-service-specific
environments [51]. Docker is an open platform for launching application containers.
Docker swarm scheduler places containers on available VMs in round-robin fashion without considering
resource usage of VMs [52]. The queuing algorithm is proposed for the placement of containers on VMs
to reduce response time and efficient utilization of VMs [53]. Constraint satisfaction programming-based
container placement algorithm is proposed to decrease billing cost and energy consumption by reducing
the number of instantiated VMs [54]. A metaheuristic approach-based container placement is addressed
to reduce migrations and energy consumption and to increase SLA compliance and VM and PM utilization.
Figure 1 provides different ways of container placement. The containers C1, C2 and C3 run directly on
the host operating system, as in Figure 1a. Containers provide increased performance as they do not
emulate the hardware as virtual machines do. The container engine provides isolation, security and
resource allocation to containers. A hybrid container architecture, in which the container engine and
containers execute on top of a virtual machine, is shown in Figure 1b.

                 Figure 1. Containers: (a) placement on the host operating system; (b) placement on a VM.

3. The Architecture of the Proposed System

3.1. Sustainable Data Center Model
      In the data centers, energy consumption plays a critical role which decides the carbon emission
of the conventional power generating sources. The data centers ought to be aware of the energy
efficiency of IT equipment, cooling subsystems, and carbon footprint with the help of appropriate
metrics. Data center ecosystems offer additional flexibility to incorporate the usage of on-site renewable
power generation to minimize the carbon footprint. The integration of solar and wind energy imposes
new challenges on the data center’s energy management. Based on the availability of green energy,
workloads are assigned to sustainable data centers located in diverged geographical locations with
different local weather conditions.
      This paper proposes a comprehensive management strategy for sustainable data centers to reduce
the IT load and cooling supply system’s energy consumption. In such situations, the management
techniques must regulate the IT workload based on the available solar and grid energy sources. It can
be realized by allocating the workload based on the time-varying nature of renewable power. A data
center powered with hybrid power infrastructure integrating grid utility and solar-based renewable
energy is shown in Figure 2. Each rack contains M number of servers powered by both grid and
solar-based renewable energy.

                                   Figure 2. Sustainable data center model.

3.2. Proposed Structure of Management System Model
      The utility of each management system component presented in Figure 3 is detailed below;
a minimal data-structure sketch follows the list:

•     Energy-Aware Manager (EAM): The data centers of a cloud provider are located in geo-distributed
      sites. In addition to physical servers, data centers have additional energy-related parameters PUE,
      carbon footprint rate with different energy sources, varying electricity prices and proportional
      power. The EAM is the centralized node responsible to coordinate the input request distribution.
      It is responsible to direct the request to the data centers to attain minimum operating cost,
      carbon footprint rate and energy consumption. Each data center registers the cloud information
      service to EAM and updates it frequently. The energy-aware manager maintains information about
      the list of clusters, carbon footprint rate (CFR), data center PUE, total cooling load, server load,
      carbon tax, carbon cost, and the carbon intensity of the data centers.
•     Management Node (MN): Each data center holds several clusters with heterogeneous servers.
      The cluster manager of each cluster updates the cluster’s current utilization, power consumption,
      number of servers on/off to MN. The MN receives user requests from the EAM and based on the
      cluster utilization, distributes the load to the clusters through cluster manager. The main scheduling
      algorithm responsible for the allocation of VM to PM and the de-allocation of resources after VM
      termination is the ARM algorithm (Algorithm 1). It is implemented in the management node.
•     Cluster Manager (CM): Each cluster contains heterogeneous servers with different CPU and
      memory configurations. The power model of the systems in the cluster is considered homogeneous.
      Each node in the cluster updates information about its power consumption, resource utilization,
      number of running VMs, resource availability, and its current temperature to the CM. The cluster
      manager is the head node in the cluster that maintains cluster details concerning total utilization,
      server power consumption, resource availability, power model, type of energy consumed (grid or
      green) and temperature of the cluster nodes.
•     Physical Machine Manager (PMM): The PMM is a daemon responsible for maintaining the host
      CPU utilization percentage, resource allocation for VMs, power consumption, current server
      temperature, status of VM requests, number of VM request received, and so on. The PMM shares
       its resources to the virtual machines and increases its utilization through virtual machine manager
       (VMM). It is responsible to update the aforementioned details to the cluster manager.
•      Virtual Machine Manager (VMM): The VMM utilizes the virtualization technology to share the
       physical machine resources to the virtual machines with process isolation. It decides on the
       number of VMs to be hosted, provisioning of resources to VMs and monitors each hosted VM
       utilization of physical machine resources. It maintains information about CPU utilization, memory
       utilization, power consumption, arrival time, execution time and remaining execution time of all
       active VMs, number of tasks under execution in each VM, current state of the VMs, and other
       resource and process information.
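
     The hierarchy above can be summarized with a few data structures. The following Python sketch is
illustrative only; the field names are assumptions and not the authors' implementation.

from dataclasses import dataclass, field
from typing import List

@dataclass
class PhysicalMachineManager:
    """PMM: per-host state reported upward to the cluster manager (illustrative fields)."""
    cpu_utilization: float = 0.0      # host CPU utilization percentage
    power_w: float = 0.0              # current power consumption
    temperature_c: float = 23.0       # current server temperature
    running_vms: int = 0              # number of hosted VMs

@dataclass
class ClusterManager:
    """CM: aggregates node state and reports it to the management node (MN)."""
    nodes: List[PhysicalMachineManager] = field(default_factory=list)
    energy_source: str = "grid"       # "grid" or "green"

    def total_power(self) -> float:
        return sum(n.power_w for n in self.nodes)

@dataclass
class DataCenter:
    """Registered with the Energy-Aware Manager (EAM) together with its energy parameters."""
    pue: float                        # power usage effectiveness
    carbon_footprint_rate: float      # tons/MWh
    carbon_tax: float                 # dollars/ton
    energy_price: float               # cents/kWh
    clusters: List[ClusterManager] = field(default_factory=list)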

                         Figure 3. Schematic representation of the management system model.

    Algorithm 1: ARM Algorithm Approach
    Input: DCList, VMinstancelist
    Output: TargetVMQ
    1 For each interval do
    2      ReqQ← Obtain VM request based on VMinstancelist;
    3      DCQ← Obtain data centers from DCList;
    4      TargetVMQ← Activate placement algorithm;
    5      If interval >min-exe-time then
    6            Compl-list← Collect executed VMs from TargetVMQ;
    7         For each VM in Compl-list do
    8            Recover the resources related to the VM;
    9 Return TargetVMQ.

4. Problem Formulation
      In this work, each physical machine (PM) is characterized by its resource capacity (processor and
memory) and processor power model. The power consumption is linearly correlated with its processor
utilization [30]. Each PM has fixed k discrete utilization levels in the execution state. When there
is no workload assigned, the processor is set to be in an idle state. The power consumption of the
processor at different utilization level is determined by its power model. The VM request is assumed
to have three parameters: arrival time, resource requirement and execution time. The VM request is
accepted and placed by the placement algorithms, if the required resource requirement is fulfilled by
the available PM resource capacity.

4.1. Energy Consumption in Data Centers
      The power consumption by all the servers (SP) and cooling equipment (overhead power (OP)),
plays a major role while modeling the data center energy consumption. The amount of energy
utilization by the data centers has a direct impact on carbon footprint.

4.1. Power Model of Server
   The total facility power (TFP) consumption includes the overhead power consumption (OP) and
power consumption of all the servers (SP). It is formulated as (Equation (1)):

                                  TFP_d = OP_d + Σ_{c=1}^{tc} ( ET_c × Σ_{j=1}^{M} P_j(l) )                                   (1)

where tc, d and M denote the number of clusters, the data center and the number of machines, respectively.
     P_j(l) is the power consumed by the jth physical machine at utilization level l. It is derived as [55] (Equation (2)):

                                  P_j(l) = [ (S_j(l) − U_j(l)) / (U_j(l+1) − U_j(l)) ] × ( P_j(l+1) − P_j(l) ) + P_j(l)                                   (2)

where U_j(l) < S_j(l) < U_j(l+1) and 0 ≤ l < k.
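
     The server power model of Equations (1) and (2) can be illustrated with a short Python sketch.
The utilization levels, the per-level power values and the use of the scheduling interval length in
place of ET_c are assumptions made only for illustration.

# Discrete utilization levels U_j(l) and measured power P_j(l) at each level (illustrative values).
util_levels = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]                 # k = 6 levels
power_levels = [95.0, 140.0, 170.0, 200.0, 230.0, 260.0]     # watts

def server_power(s):
    """Equation (2): linear interpolation of power for an actual utilization s between two levels."""
    if s <= util_levels[0]:
        return power_levels[0]
    for l in range(len(util_levels) - 1):
        if util_levels[l] <= s <= util_levels[l + 1]:
            frac = (s - util_levels[l]) / (util_levels[l + 1] - util_levels[l])
            return frac * (power_levels[l + 1] - power_levels[l]) + power_levels[l]
    return power_levels[-1]

def total_facility_power(cluster_utils, overhead_power, interval_h=1.0):
    """Equation (1): overhead power plus server power summed over all clusters.
    cluster_utils is a list of clusters, each a list of per-server utilizations;
    interval_h stands in for ET_c."""
    server_part = sum(interval_h * sum(server_power(u) for u in servers)
                      for servers in cluster_utils)
    return overhead_power + server_part

# Example: two clusters of three servers each, 30 kW of overhead (cooling, UPS, lighting).
print(total_facility_power([[0.3, 0.5, 0.7], [0.1, 0.9, 0.6]], overhead_power=30_000.0))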

4.3. Green Energy
     The availability of green energy is dependent on environmental weather conditions and different
time zones in which the data centers are located geographically. We aim to minimize the carbon
footprint by coordinating the green energy availability of distributed data centers while handling the
user’s demand. In this work, solar energy is assumed as on-site renewable energy used along with
brown energy. Solar energy is given higher priority than grid energy whenever it is available.

4.4. Carbon Cost (CC) and Electricity Cost (EC)
    Carbon cost (CC) and electricity cost (EC) of the data center depends upon the carbon tax (CT),
carbon footprint rate (CFR) and energy price (EP). These factors are based on the green or brown
energy sources utilized by the data center. In addition, the carbon footprint rate (tons/MWh) and
carbon tax (dollars/ton), energy price (cents/kWh) are location-specific. We aim to reduce the cost
associated with the data center based on optimal selection of data center considering the nature of
energy source, carbon emission, carbon tax and energy price while satisfying the user requests.

4.5. Objective Function
      We aim to minimize the data center’s overall operating energy cost (TC). An objective function
is formulated to calculate the cost considering power consumption and carbon footprint emission.
The total cost (TC) for handling the workload in a data center d is the sum of carbon cost (CC) and
electricity cost (EC) formulated as (Equation (5)):

                                               TCd = CCd + ECd                                           (5)

     The first part of the Equation (5) represents the carbon cost (CC). It is dependent on carbon tax
(CT), carbon footprint rate (CFR) and total facility power (TFP) consumed by data centers calculated as
(Equation (6)):
                                       CCd = CTd × CFRd × TFPd                                      (6)

     The second part of the Equation (5) calculates the data center electricity cost (EC). It is the product
of electricity price (EP) with total facility power (TFP) calculated as (Equation (7)):

                                               ECd = EPd × TFPd                                          (7)
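
     A minimal Python sketch of the cost model in Equations (5)–(7) follows. The unit conventions
mirror Section 4.4 (carbon tax in dollars/ton, carbon footprint rate in tons/MWh, electricity price in
cents/kWh); the numbers in the example are purely illustrative.

def carbon_cost(carbon_tax, cfr, tfp_mwh):
    """Equation (6): CC_d = CT_d x CFR_d x TFP_d, with TFP expressed in MWh."""
    return carbon_tax * cfr * tfp_mwh

def electricity_cost(energy_price_cents_kwh, tfp_kwh):
    """Equation (7): EC_d = EP_d x TFP_d, returned in dollars."""
    return energy_price_cents_kwh * tfp_kwh / 100.0

def total_cost(carbon_tax, cfr, energy_price_cents_kwh, tfp_kwh):
    """Equation (5): TC_d = CC_d + EC_d."""
    return (carbon_cost(carbon_tax, cfr, tfp_kwh / 1000.0)
            + electricity_cost(energy_price_cents_kwh, tfp_kwh))

# Example data center: $25/ton carbon tax, 0.5 tons/MWh, 8 cents/kWh, 2000 kWh consumed.
print(total_cost(carbon_tax=25.0, cfr=0.5, energy_price_cents_kwh=8.0, tfp_kwh=2000.0))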

Constraints Associated with the Objective Function
    The objective function in Equation (5) is subjected to the following constraints:
     The sum of the processor requirements R_{j,i}(c) and memory requirements R_{j,i}(m) of the n VMs
placed on the physical machine PM_i must not exceed the processing limit PM_i^{cpu.max} and the memory
limit PM_i^{mem.max} of the physical machine (Equations (8) and (9)):

                                  Σ_{j=1}^{n} R_{j,i}(c) ≤ PM_i^{cpu.max}                                   (8)

                                  Σ_{j=1}^{n} R_{j,i}(m) ≤ PM_i^{mem.max}                                   (9)

     The relation R between VMs and PMs is many-to-one. More than one VM can be placed on one PM,
but a VM should be placed on only one physical machine, i.e., R ⊆ N × M, where

                                  ∀ l ∈ N, ∀ m, n ∈ M : (l, m) ∈ R ∧ (l, n) ∈ R ⇒ m = n.

     The total brown energy (B) and green energy consumed by physical machines should be within
the service provider’s approved grid electricity consumption (B) and generated green energy (G)
(Equations (10) and (11)):
                             TFPd ≤ Total assigned brown energy (B)                        (10)

                                         SPd ≤ Total generated green energy (G)                        (11)
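
     The constraints can be checked with a small Python sketch; the dictionary-based PM, VM and
energy-budget structures are assumptions used only for illustration.

def pm_can_host(pm, placed_vms, new_vm):
    """Equations (8) and (9): total CPU and memory of the VMs on a PM must stay within its capacity."""
    cpu_used = sum(vm["cpu"] for vm in placed_vms) + new_vm["cpu"]
    mem_used = sum(vm["mem"] for vm in placed_vms) + new_vm["mem"]
    return cpu_used <= pm["cpu_max"] and mem_used <= pm["mem_max"]

def energy_budget_ok(tfp, sp, brown_budget, green_available):
    """Equations (10) and (11): facility power within the assigned brown energy and
    server power within the generated green energy."""
    return tfp <= brown_budget and sp <= green_available

# Example: a 32-core / 64 GB PM already hosting one 12-core / 24 GB VM.
pm = {"cpu_max": 32, "mem_max": 65536}
placed = [{"cpu": 12, "mem": 24576}]
print(pm_can_host(pm, placed, {"cpu": 8, "mem": 16384}))   # True: 20 cores, 40 GB used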

4.6. Performance Metrics
     To check the efficiency of VM to PM mapping, instruction to total energy ratio (IER), instruction to
cost ratio (ICR) and instruction to carbon footprint ratio (ICFR) are calculated as (Equations (12)–(14)):
                                  ICR_d = [ Σ_{d=1}^{td} Σ_{c=1}^{tc} Σ_{j=1}^{M} Σ_{i=1}^{N} R_{d,c,j,i} × R_{d,c,j,i}(c) × VM_i^{ex} ] / [ Σ_{d=1}^{td} TC_d ]                    (12)

                                  ICFR_d = [ Σ_{d=1}^{td} Σ_{c=1}^{tc} Σ_{j=1}^{M} Σ_{i=1}^{N} R_{d,c,j,i} × R_{d,c,j,i}(c) × VM_i^{ex} ] / [ Σ_{d=1}^{td} CFR_d × TFP_d ]                    (13)

                                  IER_d = [ Σ_{d=1}^{td} Σ_{c=1}^{tc} Σ_{j=1}^{M} Σ_{i=1}^{N} R_{d,c,j,i} × R_{d,c,j,i}(c) × VM_i^{ex} ] / [ Σ_{d=1}^{td} TFP_d ]                    (14)

where R_{d,c,j,i}(c) and VM_i^{ex} are the processor requirement and execution time of the ith VM, and td
represents the total number of data centers.
       The value of R_{d,c,j,i} denotes the mapping of VM to PM; it is set to 1 if VM_i is allocated to PM_j
belonging to cluster c in data center d, and 0 otherwise.
      The SLA is calculated by the ratio of VM acceptance (RVA) as (Equation (15)):
                                  RVA(V)_d = ( Σ_{c=1}^{tc} Σ_{j=1}^{M} Σ_{i=1}^{N} R_{d,c,j,i} ) / N                                   (15)

where N signifies the total number of received VM requests and M is the number of machines.
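
     The metrics can be computed from a list of accepted placements, as in the Python sketch below
(structure and field names are illustrative).

def metrics(placements, total_cost, total_cfr_energy, total_energy, requests_received):
    """Return (ICR, ICFR, IER, RVA) following Equations (12)-(15).
    Each accepted placement carries its CPU requirement and execution time,
    so R_{d,c,j,i} = 1 is implicit for every entry in `placements`."""
    instructions = sum(p["cpu_req"] * p["exec_time"] for p in placements)
    icr = instructions / total_cost            # Equation (12)
    icfr = instructions / total_cfr_energy     # Equation (13), denominator = sum of CFR_d x TFP_d
    ier = instructions / total_energy          # Equation (14)
    rva = len(placements) / requests_received  # Equation (15)
    return icr, icfr, ier, rva

# Example: two accepted VMs out of three received requests.
accepted = [{"cpu_req": 4, "exec_time": 3600}, {"cpu_req": 2, "exec_time": 1800}]
print(metrics(accepted, total_cost=185.0, total_cfr_energy=1.0,
              total_energy=2000.0, requests_received=3))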

5. VM Placement Policies
      The VM allocation problem can be considered as a multitier bin-packing problem. In the first-tier,
containers are mapped to VMs with an objective of efficient VM utilization and in the second-tier,
VMs are mapped to PMs to reduce energy consumption and carbon emission. The arrival of a VM
request has different choices for its placement with multiple data centers in different locations each with
its carbon footprint rate, PUE, carbon tax and electricity price. In this section, different VM placement
methods are presented to investigate the impact of different parameters with independent data center
selection policies towards energy consumption, RVA acceptance percentage, carbon footprint rate and
total cost.

5.1. ARM Algorithm
     The allocation and reallocation management (ARM) algorithm is discussed in Algorithm 1.
The utility of the ARM algorithm can be categorized into two parts. Part 1: lines 2–4 perform VM to
PM allocation. Part 2: lines 5–8 perform resource deallocation in every interval.
     The input to the algorithm is DCList and VMinstancelist. DCList holds the list of data centers.
VMinstancelist holds the set of VM instances as detailed in Section 6.3. The output of the algorithm is
the TargetVMQ which holds the VM to PM allocation.

5.2. Renewable and Total Cost-Aware First-Fit Optimal Frequency VM Placement (RC-RFFF)
      The proposed RC-RFFF algorithm plans the allocation of VMs on feasible servers, ensuring data
center selection based on the minimum total cost obtained from Equation (5), including the carbon tax,
carbon footprint rate and energy price for both brown and green energy. The physical machine
choice is based on the server’s optimal first fit frequency. For data center selection, first preference will
be given for renewable source availability followed by the data centers with less total cost.
      The RC-RFFF algorithmic approach is presented in Algorithm 2. DCQ contains the data center
list, ReqQ holds the input VM request, TargetVMQ holds VM to PM mapping information. RC-RFFF
performs data center selection in lines 2–19 of Algorithm 2 based on carbon tax, energy price,
carbon footprint rate and available renewable energy. In line 5, the total dynamic power consumption
of the servers in the cluster is calculated using Equation (1) eliminating OPd . In line 6, the power
consumption of the VM is estimated by considering the power model of the cluster. The Gd in line 8 is
set to the available green energy. Line 9–16, considers the green energy availability while calculating
the power consumption of clusters. The data center selection is based on the sorted order of TCd in line
18. The clusters inside the data center are ordered in increasing order of Spc and ∆tot-uti in line 17.

 Algorithm 2: ARM RC-RFFF Virtual Machine Placement Algorithm

     The host choice is based on the first-fit optimal frequency with renewable-aware cost calculation.
The host selection procedure starts from line 22 of Algorithm 2. The VM is allocated on the first-fit
feasible host with the minimum utilization level. For n VM requests, d data centers, c clusters and
h available hosts, the complexity of the algorithm is O(ndch). To identify the data center with the
largest green energy availability, the complexity is O(dc log c). To identify the host with the optimal
frequency, the complexity is O(ch). The pseudo codes for the remaining algorithms discussed in
subsequent sections are not presented, as they are derived from the base Algorithm 2.
     The steps of Algorithm 2 carried out in each time interval for new VM allocation are
summarized below, followed by an illustrative sketch.

Step 1: Lines 2–18 identifies the data center to schedule the VM based on renewable energy availability.
Step 2: Line 17 sorts the clusters within the data centers in increasing order of its energy consumption.
Step 3: Line 19 sorts the data centers, first in increasing order of total cost (renewable energy electricity
      cost and carbon tax are set to 0) and then in non-increasing order of green energy availability.
Step 4: Lines 22–28 performs on-demand dynamic optimal frequency-based node selection within the
      cluster and is carried out to decide the placement of VM.
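
Since the full listing of Algorithm 2 is not reproduced here, the Python sketch below illustrates the
data center ordering of Step 3 and the first-fit optimal-frequency host selection of Step 4; the field
names and the exact tie-breaking rule are assumptions.

def order_data_centers(data_centers):
    """Step 3: sort by total cost (ascending), then by green energy availability (descending).
    Renewable energy is treated as having zero electricity price and zero carbon tax."""
    def total_cost(dc):
        brown_power = max(dc["power_demand"] - dc["green_energy"], 0.0)
        cc = dc["carbon_tax"] * dc["cfr"] * brown_power        # carbon cost on brown energy only
        ec = dc["energy_price"] * brown_power                   # electricity cost on brown energy only
        return cc + ec
    return sorted(data_centers, key=lambda dc: (total_cost(dc), -dc["green_energy"]))

def first_fit_optimal_frequency(hosts, vm):
    """Step 4: place the VM on the first feasible host at the lowest utilization level."""
    for host in sorted(hosts, key=lambda h: h["utilization"]):
        if host["free_cpu"] >= vm["cpu"] and host["free_mem"] >= vm["mem"]:
            return host
    return None                                                  # no feasible host in this cluster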

5.3. Cost-Aware First-Fit Optimal Frequency VM Placement (C-FFF)
      The C-FFF algorithm assumes that all data centers are powered only by a brown energy source.
It performs data center selection based on the carbon tax, carbon footprint rate and energy price of the only
available brown energy. The C-FFF algorithm’s data center selection is the same as RC-RFFF except
after calculating ∆tot-uti in line 7 of Algorithm 2; the available green energy Gd in line 8 is set to zero.
The first-fit optimal frequency-based host selection of C-FFF is the same as RC-RFFF.

5.4. Renewable and Energy Cost-Aware First-Fit Optimal Frequency VM Placement (REC-RFFF)
       REC-RFFF differs from RC-RFFF in calculating the total cost by eliminating the carbon tax and
carbon footprint rate parameters in data center selection. In this case, when there is insufficient renewable energy
available, the data center selection is based on the energy cost of brown energy. The brown energy
cost is estimated based on the power consumption and electricity price of corresponding data centers.
Renewable energy electricity price is set to 0. The REC-RFFF differs from RC-RFFF in calculating the
total cost in Line 18 of Algorithm 2. The CCd of Equation (5) is set to 0 while calculating total-cost TCd .
The first-fit optimal frequency-based host selection of REC-RFFF is the same as RC-RFFF.

5.5. Energy Cost with First-Fit Optimal Frequency VM Placement (EC-FFF)
      The proposed EC-FFF algorithm assumes that all data centers are powered only by a brown energy source.
The EC-FFF data center selection is the same as REC-RFFF in considering only the energy cost of brown
energy for total cost and eliminating carbon emission parameters. The total cost TCd in line 18 of
Algorithm 2 concerning Equation (5) is modified with CCd set to zero and the available green energy
Gd in line 8 is set to zero. The host selection of EC-FFF is same as REC-RFFF.

5.6. Renewable and Carbon Footprint-Aware First-Fit Optimal Frequency VM Placement (RCF-RFFF)
      The proposed RCF-RFFF algorithm ensures data center selection based only on carbon footprint
rate including renewable energy availability. The carbon footprint rate of the renewable source is set to
0. The RCF-RFFF differs from RC-RFFF in data center selection, in calculating the total cost in line 18 of
Algorithm 2: CTd is set to 1 in Equation (6) to calculate CCd, and the total cost equation in line 18 of
Algorithm 2 is replaced with Equation (6). The rest of the algorithm is the same as Algorithm 2. The host
selection of RCF-RFFF is same as RC-RFFF.

5.7. Carbon Footprint Rate-Aware First-Fit Optimal Frequency VM Placement (CF-FFF)
     The CF-FFF algorithm assumes data centers with only brown energy. The CF-FFF data center selection is
the same as RCF-RFFF except that Gd is set to zero in line 8 of Algorithm 2. The host selection of CF-FFF is
same as RCF-RFFF.

5.8. Renewable and Carbon Cost-Aware First-Fit Optimal Frequency VM Placement (RCC-RFFF)
     The RCC-RFFF data center selection is based on carbon cost obtained from Equation (6) including
the carbon tax, and carbon footprint rate excluding electricity cost. It is an extension of RCF-RFFF
and varies in calculating the total cost in line 18 of Algorithm 2. The total cost equation in line 18 of
Algorithm 2 is replaced with Equation (6) with CTd set to data center’s carbon tax. The host selection
of RCC-RFFF is the same as RCF-RFFF.

5.9. Carbon Cost-Aware First-Fit Optimal Frequency VM Placement (CC-FFF)
     The CC-FFF algorithm assumes data centers with only brown energy. It is the same as RCC-RFFF
except in data center selection, where Gd in line 8 is set to 0. The host selection of CC-FFF is the same
as RCC-RFFF.
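
     The policies in Sections 5.2–5.9 differ mainly in the quantity used to rank data centers. The Python
sketch below contrasts them; the field names are assumptions, and the renewable-aware variants are
recognized here simply by their leading "R".

def ranking_cost(dc, policy):
    """Return the quantity that each placement policy minimizes when ordering data centers."""
    green = dc["green_energy"] if policy.startswith("R") else 0.0   # *-FFF variants ignore green energy
    brown_power = max(dc["power_demand"] - green, 0.0)
    cc = dc["carbon_tax"] * dc["cfr"] * brown_power                 # carbon cost, Equation (6)
    ec = dc["energy_price"] * brown_power                           # electricity cost, Equation (7)
    if policy in ("RC-RFFF", "C-FFF"):       # total cost = carbon cost + electricity cost
        return cc + ec
    if policy in ("REC-RFFF", "EC-FFF"):     # energy cost only (CC_d set to 0)
        return ec
    if policy in ("RCF-RFFF", "CF-FFF"):     # carbon footprint only (CT_d set to 1)
        return dc["cfr"] * brown_power
    if policy in ("RCC-RFFF", "CC-FFF"):     # carbon cost only (electricity cost excluded)
        return cc
    raise ValueError("unknown policy: " + policy)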

6. Google Cluster Workload Overview
      Three versions of the cloud dataset [58], traced from Google compute nodes, are publicly
available and expose the job types, resource usage and scheduling constraints of a real workload.
The node receives the work in the form of a job. A job contains one or more tasks with individual
resource requirements. Linux containers are used to run each task. In this work, the second version
is used. The second version [59] holds 29 days of workload information of 11K machines from May
2011. In the second version, two tables, namely, task event table and resource usage table provide
information about resource request and resource usage of each task. The task events table provides the
timestamp, job-id, task index, resource request for CPU cores, memory and local disk space with other
related information. In the task event table each task is considered as container request. In this work,
the CPU and memory requirement for each task from the task event table is utilized for container
task categorization.
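
     A Python sketch of extracting per-task CPU and memory requests from the task events table is
given below. The column positions follow the published schema of the version 2 trace, and the shard
file name and maximum machine capacities are assumptions to be adjusted to the actual data.

import pandas as pd

def load_task_requests(path):
    """Read one task_events shard and keep the columns needed for container sizing."""
    df = pd.read_csv(path, header=None)
    # Assumed schema positions: 0 timestamp, 2 job ID, 3 task index, 9 CPU request, 10 memory request.
    df = df[[0, 2, 3, 9, 10]]
    df.columns = ["time", "job_id", "task_index", "cpu_request", "mem_request"]
    return df.dropna(subset=["cpu_request", "mem_request"])

tasks = load_task_requests("part-00000-of-00500.csv.gz")   # hypothetical shard name
# Requests are normalized to the largest machine in the trace; de-normalize with an
# assumed maximum machine capacity before categorizing container tasks.
MAX_CORES, MAX_MEM_MB = 32, 65536                           # illustrative capacities
tasks["cpu_cores"] = tasks["cpu_request"] * MAX_CORES
tasks["mem_mb"] = tasks["mem_request"] * MAX_MEM_MB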

6.1. K-Medoids Clustering
      K-medoids is an unsupervised partitional clustering algorithm that minimizes the sum of
dissimilarities between objects in a cluster. It is more robust to noise and outliers than k-means.
For each cluster, one object is identified as the representative of the cluster. The algorithmic
procedure is as follows (a code sketch is given after the lists below):

Step 1: K-values from the dataset are identified as medoids.
Step 2: Calculate Euclidean distance and associate every data point to the closest medoid.
Step 3: Swapping of a selected object and the new object is done based on the objective.
Step 4: Steps 2 and 3 are repeated until there is no change in medoids.

      The repetition of steps 2 and 3 will lead to four situations as given below:

1.     The current cluster member may be shifted out to another cluster.
2.     Other cluster members may be assigned to the current cluster with a new medoid.
3.     The current medoid may be replaced by a new medoid.
4.     The redistribution does not change the objects in the cluster, resulting in a smaller squared error criterion.
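
      A minimal, self-contained Python sketch of the K-medoids procedure described in the steps above
is given below (not the authors' implementation; the random sample stands in for the normalized
(CPU, memory) requests of the trace).

import numpy as np

def k_medoids(points, k, max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    n = len(points)
    medoids = rng.choice(n, size=k, replace=False)              # Step 1: pick k initial medoids
    labels = np.zeros(n, dtype=int)
    for _ in range(max_iter):
        # Step 2: assign every point to its closest medoid (Euclidean distance).
        dists = np.linalg.norm(points[:, None, :] - points[medoids][None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: within each cluster, swap the medoid for the member with the lowest total distance.
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.where(labels == c)[0]
            if len(members) == 0:
                continue
            intra = np.linalg.norm(points[members][:, None, :] - points[members][None, :, :], axis=2)
            new_medoids[c] = members[intra.sum(axis=1).argmin()]
        if np.array_equal(new_medoids, medoids):                # Step 4: stop when the medoids are stable
            break
        medoids = new_medoids
    return medoids, labels

# Example: cluster 2000 random (CPU, memory) points into 10 container types.
sample = np.random.default_rng(1).random((2000, 2))
centers, assignment = k_medoids(sample, k=10)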

6.2. Characteristics of Task Clusters
    The random sample of 15,000 records of the first-day trace of Google workload version 2 [59] is
considered in this work to identify the container types. The resource requests (processor cores and
memory) of the tasks in the trace are normalized based on the maximum resource capacity of the
machines [59]. The resource request details are de-normalized based on the machine characteristics given
in the physical machine configuration table. The containers are executed inside the VMs. The containers
placed inside the VM share the VM resources.
     Figures 4 and 5 display the percentage of task distribution among the 10 clusters identified using
K-medoids algorithm presented in Section 6.1. The data pattern represents the container resource
requirement. The first four clusters contribute to 67.47% of the overall tasks and the remaining 32.53%
is shared between clusters 5 to 10. The tasks under clusters 1 to 4 can be categorized as tasks with
minimum resource requirements. The tasks under clusters 3, 4, 5, 7 and 9 can be categorized as
tasks with medium resource requirements. Tasks under clusters 6 and 10 can be categorized as having the highest
resource requirements. Cluster 2 has the highest contribution of 23.8% of tasks, with a request for
2.5 CPU cores and 2 GB of memory. Task clusters 5 to 10 contain tasks with CPU requirements of more than
6 cores and memory requirements of more than 7 GB. Task cluster 6 has a 1.5% contribution, with the
highest memory request of 27 GB and a CPU request of 22 cores. Task cluster 10 holds 1.5%, with the highest CPU
requirement of 30 cores and a memory requirement of 9 GB. The statistics show that tasks with higher
resource requirements occur less frequently than tasks with medium and minimum
requirements. The medoids identified under each cluster are considered as the representative of the
cluster to determine the appropriate container size for the task within the cluster, as given in Table 2.

                                Figure 4. Clusters based on resource requests of the task.

                                     Figure 5. Clusters based on resource requests.

                 Table 2. Cluster types with container configuration based on the resource request.

                           Cluster Type                vCPU                  Memory (MB)
                                1                       0.5                     186.496
                                2                       2.5                     1889.28
                                3                        6                      4890.88
                                4                       6.25                    2234.88
                                5                       12.5                    9781.76
                                6                      22.19                    27,686.4
                                7                       8.5                     9781.76
                                8                       6.25                   10,968.32
                                9                      18.75                    7304.96
                                10                      30                      9781.76

6.3. Resource Request-Based Optimal VM Sizing for Container Services (CaaS)
     After identifying the cluster types for the tasks from the selected dataset, the virtual machine sizing
to execute the tasks of each cluster type has to be identified. The containers are executed on the virtual
machines. The virtual machine resources are shared between the containers. The physical machines
are partitioned into virtual machines. VM utilizes the virtualization technology to enable the sharing
of physical resources with resource isolation and increases the utilization of the physical resource.
     To estimate the effective VM size for hosting the identified cluster types, the frequency of occurrence
of tasks and their resource usage in each cluster are estimated on an hourly basis over a 24 h duration.
The resource requirement per hour (CPU-reqh-C1) for the tasks in cluster C1 is calculated based
on the average number of tasks (Num_taskh-C1) and the average resource usage (CPU_Usageh-C1) of the
tasks belonging to C1 executed in the system on an hourly basis (h). The CPU-reqh-C1 is approximated
based on the frequency of occurrence within the 24 h period.
     The number of CPUs that a virtual machine can hold depends on the capacity of, and the number of
virtual machines hosted on, a particular physical machine. The number of vCPUs a virtual machine can
hold also depends on the infrastructure and the limit set by the provider. The virtual machine CPU (vCPU)
for a VM is decided by dividing CPU-reqh-C1 obtained on an hourly basis by an integer m. The integer
variable m holds a value between 2 and 9. The set of values obtained by dividing CPU-reqh-C1 by m with
modulus zero is considered for vCPU sizing.
     The virtual machine vCPU configuration for a specific cluster C1 is estimated on hourly basis
(h) as
                            CPU-reqh-C1 = (Num_taskh-C1 × CPU_Usageh-C1 )/m

      The virtual machine memory configuration for a specific cluster C1 is estimated as

                                mem-reqh-C1 = (Num_taskh-C1 × mem_Usageh-C1 )/m

     The final virtual machine vCPU and memory configurations are identified for each cluster based on
the best match with the number of physical machines and their available capacities.
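
     A small Python sketch of the sizing rule described above; the task counts and usage figures in the
example are illustrative.

def vm_size_candidates(num_tasks_per_hour, cpu_usage, mem_usage):
    """Candidate (vCPU, memory) sizes for one container cluster type.
    The hourly requirement is divided by m = 2..9, and only divisions with
    modulus zero are kept, as described above."""
    cpu_req = num_tasks_per_hour * cpu_usage       # CPU requirement per hour for this cluster
    mem_req = num_tasks_per_hour * mem_usage       # memory requirement per hour
    candidates = []
    for m in range(2, 10):
        if cpu_req % m == 0:                       # keep only divisors leaving no remainder
            candidates.append((cpu_req / m, mem_req / m))
    return candidates

# Example: 24 tasks/hour, each using 2.5 cores and 1889.28 MB (cluster type 2 in Table 2).
print(vm_size_candidates(24, 2.5, 1889.28))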

6.4. Determine Optimum Number of Tasks for VM Types
      The optimum number of tasks for each virtual machine type is estimated using Algorithm 3 so that
the virtual machines are utilized efficiently. The aim of this mapping is to avoid underutilization of
virtual machines. Algorithm 3 determines the minimum number of tasks of a cluster type that maximizes
the utilization of each VM type's resources. Each cluster type is mapped to the VM types identified in
Section 6.3, and the list of feasible VM types is identified as given in Table 3. The minimum number of
tasks Nt that maximizes the utilization of the feasible VMs is considered for each cluster. Table 3
presents the resulting container-to-VM mapping based on Algorithm 3; the tasks are mapped to the VMs
according to this table.

 Algorithm 3: Identify the optimum number of tasks from each cluster for a VM type
 Input: Task-List, VM-instancelist
 Output: NT (tasktype, VMtype)
 For each tasktype in Task-List
       For each VMtype in VM-instancelist
             Nt = the minimum number of tasks of tasktype that causes maximum utilization of
             VMtype resources, i.e., Min(Nt_max-CPU, Nt_max-Mem)
             NT(tasktype, VMtype).add(Nt)
       End
 End
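
     A minimal Python sketch of the idea behind Algorithm 3 is given below. The per-task demands and VM
capacities are hypothetical values chosen for illustration; maximum utilization is taken as the largest
number of tasks that still fits within both the vCPU and the memory capacity of the VM type.

     import math

     def tasks_per_vm(task_types, vm_types):
         """For each (task type, VM type) pair, find the number of tasks that fills
         the VM, i.e. Min(Nt_max-CPU, Nt_max-Mem) as in Algorithm 3."""
         nt = {}
         for t_name, (task_cpu, task_mem) in task_types.items():
             for v_name, (vm_cpu, vm_mem) in vm_types.items():
                 nt_cpu = math.floor(vm_cpu / task_cpu)   # tasks that fit by vCPU
                 nt_mem = math.floor(vm_mem / task_mem)   # tasks that fit by memory
                 n = min(nt_cpu, nt_mem)
                 if n > 0:                                # VM type is feasible for this task type
                     nt[(t_name, v_name)] = n
         return nt

     # hypothetical per-task demand (vCPU, memory GB) and VM capacity (vCPU, memory GB)
     task_types = {"cluster-A": (0.25, 0.6)}
     vm_types = {"VM-small": (2, 14.4)}
     print(tasks_per_vm(task_types, vm_types))   # {('cluster-A', 'VM-small'): 8}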

                                 Table 3. Optimal number of containers for VM types.

               Task Type        VM Type-1    VM Type-2      VM Type-3     VM Type-4    VM Type-5
                     1             12             24            48            36          60
                     2              2             5              7             -          12
                     3              1             2              3             -           5
                     4              -             2              4             3           5
                     5              -              -             -             1           2
                     6              -              -             -             -           1
                     7              -             1              -             2           3
                     8              -              -             -             1           3
                     9              -              -             1             -           2
                    10              -              -             -             -           1

7. Performance Evaluation
     The experimental setup and the results obtained from the aforementioned VM placement algorithms
are discussed in this section. In view of the expenditure and time involved in comprehensive real-time
experimentation, the environment is simulated using MATLAB.

7.1. Experimental Environment for Investigation of Resource Allocation Policies

7.1.1. Data Center Power Requirement
      The power consumption of a task is measured based on the processor power consumption incurred
due to its utilization. All servers are considered to be in the off state when not in use, consuming no
power. A temperature of 23 °C is considered the data center's safe operating temperature. The peak server
load (IT load) of the data center is estimated to be approximately 52 kW for the server specifications
given in Table 4. The floor space of the data center is approximately 500 square feet. The total electrical
power requirement is approximately 124 kW (including cooling load, UPS and lighting). The total processor
power consumption of the servers is assumed to be within 17.30 kW, and the cooling load due to processor
utilization is restricted to 12.11 kW [60]. The renewable-aware algorithms assume clusters powered by
both grid and renewable energy in all the data centers; the clusters are powered by either one of the
energy sources at a time. The cooling devices are powered only by the grid energy source in all the
data centers.
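
     The figures above imply simple facility-level ratios that help in interpreting the later results.
The short sketch below only restates this arithmetic; the ratio-based cooling estimate is a crude
illustration and not the CoP model of Equation (4) that the paper actually uses.

     # Power budget figures stated for one data center in Section 7.1.1 (all in kW).
     PEAK_IT_LOAD = 52.0        # peak server (IT) load
     TOTAL_FACILITY = 124.0     # IT load + cooling + UPS + lighting
     PROC_POWER_CAP = 17.30     # total processor power budget
     COOLING_CAP = 12.11        # cooling load allowed for processor utilization

     # PUE-like overhead factor implied by the stated budgets (about 2.38)
     overhead_factor = TOTAL_FACILITY / PEAK_IT_LOAD

     # cooling power per unit of processor power implied by the two caps (about 0.70);
     # this linear scaling is illustrative only, the paper uses the CoP of Equation (4)
     cooling_per_proc_kw = COOLING_CAP / PROC_POWER_CAP

     def cooling_estimate(processor_power_kw):
         """Crude cooling power estimate, capped at the stated cooling budget."""
         return min(processor_power_kw * cooling_per_proc_kw, COOLING_CAP)

     print(round(overhead_factor, 2), round(cooling_per_proc_kw, 2), round(cooling_estimate(10.0), 2))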

                                          Table 4. Physical machine configurations.

              Machines           Core Speed (GHz)           No. of Cores               Power Model           Memory (GB)
                   M1                     1.7                         2                      1                       16
                   M2                     1.7                         4                      1                       32
                   M3                     1.7                         8                      2                       32
                   M4                     2.4                         8                      2                       64
                   M5                     2.4                         8                      2                      128

7.1.2. Data Center Physical Machine Configuration
     Tables 4 and 5 describe the heterogeneous physical machines used in this simulation, with varying
power models based on the SPEC power benchmark [61]. In order to evaluate the algorithms presented in
Section 5, an IaaS is modeled using four small-scale data centers with 100 heterogeneous servers located
in four cities, namely Jacksonville, Miami, Orlando and Tampa. Each data center has two clusters of
heterogeneous machines powered by both renewable and grid power. The machines in each cluster follow a
particular power model. All data centers are assumed to have a cooling device, with CoP as in Equation (4),
powered only by grid power. VM reservations are modeled as in Table 6 based on Section 6.3. Each data
center holds two clusters with unique carbon footprint rates. The data center clusters' carbon footprint
rates, energy prices and carbon taxes are taken from [62,63] and given in Table 7 [38].

                           Table 5. Utilization (%) and server power consumptions in watts.

             Power Model    Idle     10%      20%      30%      40%      50%      60%      70%      80%      90%      100%
                  1          60       63      66.8     71.3     76.8     83.2     90.7     100     111.5    125.4    140.7
                  2         41.6     46.7     52.3     57.9     65.4      73      80.7     89.5     99.6     105      113
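
     The utilization-to-power mapping in Table 5 can be applied to intermediate utilization values by
linear interpolation, as is common with SPEC-power-style models. The sketch below is an illustrative
reading of the table, not code from the paper.

     import numpy as np

     # Table 5: power (W) at idle and at 10%..100% utilization for the two power models
     UTIL_POINTS = np.array([0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100])
     POWER_MODELS = {
         1: np.array([60, 63, 66.8, 71.3, 76.8, 83.2, 90.7, 100, 111.5, 125.4, 140.7]),
         2: np.array([41.6, 46.7, 52.3, 57.9, 65.4, 73, 80.7, 89.5, 99.6, 105, 113]),
     }

     def server_power(model, utilization):
         """Linearly interpolate server power (W) for a utilization in [0, 100]."""
         return float(np.interp(utilization, UTIL_POINTS, POWER_MODELS[model]))

     print(server_power(1, 45))   # between the 40% and 50% entries, about 80.0 W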

                                                   Table 6. VM request types.

                                VM Type                      vCPU                          Memory (GB)
                                 Type-1                           1                                7.2
                                 Type-2                           2                                14.4
                                 Type-3                           4                               15.360
                                 Type-4                           3                               17.510
                                 Type-5                           5                               35.020

                                                Table 7. Features of data center.

                 Data Center    Carbon Footprint Rate (tons/MWh)    Carbon Tax (dollars/ton)    Energy Price (cents/kWh)
                     DC1                    0.124                              24                          6.1
                     DC2                    0.350                              22                          6.54
                     DC3                    0.466                              11                          10
                     DC4                    0.678                              48                          5.77
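
     Using the per-data-center features in Table 7, the electricity cost and carbon cost of a given grid
energy consumption can be computed as sketched below. The function and the 1000 kWh example are
illustrative, not taken from the paper.

     # Table 7 features: (carbon footprint rate tons/MWh, carbon tax $/ton, energy price cents/kWh)
     DATA_CENTERS = {
         "DC1": (0.124, 24, 6.1),
         "DC2": (0.350, 22, 6.54),
         "DC3": (0.466, 11, 10),
         "DC4": (0.678, 48, 5.77),
     }

     def operating_cost(dc, grid_energy_kwh):
         """Electricity cost and carbon cost (both in dollars) for grid energy drawn at dc."""
         rate, tax, price = DATA_CENTERS[dc]
         electricity_cost = grid_energy_kwh * price / 100.0      # cents -> dollars
         carbon_footprint = grid_energy_kwh / 1000.0 * rate      # kWh -> MWh -> tons of CO2
         carbon_cost = carbon_footprint * tax
         return electricity_cost, carbon_cost

     for dc in DATA_CENTERS:
         ec, cc = operating_cost(dc, 1000)   # 1000 kWh of grid energy, illustrative
         print(dc, round(ec, 2), round(cc, 2), round(ec + cc, 2))

     The example shows why the selection criteria differ: DC4 has the cheapest electricity but the highest
carbon cost per unit of grid energy, while DC1 has the lowest carbon cost.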

7.1.3. Solar Energy
      The hourly solar irradiance and temperature data were obtained for the entire year of 2018 [64].
The solar output power (P) based on Equation (16) was used to generate the solar energy (kWh/m²/day) for
the four data centers. With the Solarbayer configuration of flat-plate collectors covering 2684 m² at a
fixed angle [65], the solar power output (P) for mean solar irradiance β (kW/m²) and ambient temperature
T is calculated as [66] (Equation (16)):

                                      P = λ × A × β (1 − 0.005(T − 25))                                    (16)

       A (m²) is the area of the solar unit and λ is the solar conversion coefficient. The solar energy
trace is assumed to be 0 before 6 a.m. and after 6 p.m. Figure 6 displays the solar power generated at
the different locations.
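
     A direct implementation of Equation (16) is straightforward. The sketch below uses the 2684 m²
collector area mentioned above and an assumed conversion coefficient λ = 0.2, since its exact value is
not restated here.

     def solar_power_kw(irradiance_kw_m2, ambient_temp_c, hour,
                        area_m2=2684.0, conversion=0.2):
         """Solar output power (kW) from Equation (16): P = lambda * A * beta * (1 - 0.005(T - 25)).

         conversion (lambda) is an assumed value; the trace is forced to 0 outside 6 a.m.-6 p.m.
         """
         if hour < 6 or hour >= 18:
             return 0.0
         return conversion * area_m2 * irradiance_kw_m2 * (1 - 0.005 * (ambient_temp_c - 25))

     # e.g. 0.6 kW/m² irradiance at 30 °C at noon, roughly 314 kW
     print(round(solar_power_kw(0.6, 30, 12), 1))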

                                     Figure 6. Solar power generations.

7.2. Experimental Results
     The Google workload is studied and the tasks are clustered according to their resource request
pattern using the clustering presented in Section 6.1. The VM sizes listed in Table 6 are based on the
procedure defined in Section 6.3. In our experiment, the identified task containers are hosted in the
corresponding virtual machine types in each processing window. Each processing window is considered to
have a duration of 300 s. At the start of each processing window, the input requests are received.
Based on the Lublin-Feitelson model [67], the arrival pattern of the identified task containers, along
with the number of tasks and their runtimes, is generated. The Gamma and hyper-Gamma Lublin parameters
are used to generate tasks with varying holding times under a standard arrival-time model. The task
containers are mapped to appropriate VM types. Figure 7 displays the CPU demand of the VM types for the
task containers in the generated workload. Only the active execution time of a VM is considered. Each VM
is assigned a minimum of a single physical core of the host. All containers get the same portion of CPU
cycles. The CPU limit and CPU request are considered the same. This work considers only the CPU
utilization of the VM and does not consider communication between VMs and containers. The memory limit
and memory request are considered the same for the guaranteed quality of service class. A local disk
space of 10 GB is assumed to be allotted to each virtual machine to provide enough space for operating
system installation. The experimental setup is used to evaluate the proposed VM placement model in terms
of carbon cost, consumption of green energy, consumption of brown energy, carbon footprint and total
operating cost.
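
     The Lublin-Feitelson model itself has many parameters; the sketch below is a much-simplified stand-in
that only illustrates the idea of drawing, for each 300 s processing window, a random number of task
containers with gamma-distributed runtimes. The Poisson arrivals and the parameter values are assumptions
for illustration, not the Lublin parameters used in the paper.

     import numpy as np

     rng = np.random.default_rng(42)
     WINDOW_S = 300  # duration of one processing window

     def generate_window(mean_arrivals=20, shape=2.0, scale=600.0):
         """One processing window: a random number of task containers with
         gamma-distributed runtimes (seconds). Values are illustrative only."""
         n_tasks = rng.poisson(mean_arrivals)
         runtimes = rng.gamma(shape, scale, size=n_tasks)
         return runtimes

     for w in range(3):  # three consecutive windows
         runtimes = generate_window()
         print(f"window {w}: {len(runtimes)} containers, mean runtime {runtimes.mean():.0f} s")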

                                Figure 7. CPU demand for VM requests.

7.2.1. Energy and Cost Efficiency of the Proposed Algorithms
     We evaluate the proposed VM placement algorithms to explore their impact on grid energy consumption,
solar energy consumption, carbon emission and total cost for the CPU demand presented in Figure 7.
The renewable-based algorithms, namely RC-RFFF, REC-RFFF, RCF-RFFF and RCC-RFFF, give high priority to
renewable sources, during their availability, to power the servers. When the renewable source is
insufficient, each proposed algorithm applies its own data center selection policy based on total cost
(TC), carbon cost (CC) or electricity cost (EC). The grid energy-based algorithms, namely C-FFF, EC-FFF,
CF-FFF and CC-FFF, consider only the grid source with an independent data center selection policy based
on the aforementioned parameters.
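
     The following sketch captures this selection logic in simplified form: renewable-aware variants first
prefer data centers whose green energy covers the demand and then fall back to ranking by the chosen
metric, while grid-based variants rank by the metric alone. The data structures, metric keys and example
values are illustrative assumptions, not the paper's implementation.

     def select_data_center(data_centers, metric, renewable_aware, demand_kwh):
         """Pick a data center for a VM request.

         data_centers   : list of dicts with 'name', 'green_avail_kwh' and per-kWh metric values
         metric         : ranking key, e.g. 'total_cost', 'carbon_cost' or 'electricity_cost'
         renewable_aware: if True, prefer data centers whose green energy covers the demand
         """
         if renewable_aware:
             green = [dc for dc in data_centers if dc["green_avail_kwh"] >= demand_kwh]
             if green:
                 # among green-sufficient data centers, pick the one with the most surplus
                 return max(green, key=lambda dc: dc["green_avail_kwh"])
         # otherwise fall back to the cheapest data center under the chosen metric
         return min(data_centers, key=lambda dc: dc[metric])

     dcs = [
         {"name": "DC1", "green_avail_kwh": 5.0, "total_cost": 0.064, "carbon_cost": 0.003, "electricity_cost": 0.061},
         {"name": "DC4", "green_avail_kwh": 0.0, "total_cost": 0.090, "carbon_cost": 0.033, "electricity_cost": 0.058},
     ]
     print(select_data_center(dcs, "electricity_cost", renewable_aware=True, demand_kwh=2.0)["name"])  # DC1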

7.2.2. Discussion on Grid Energy Consumption and Carbon Footprint Emission
      The quantity of brown energy consumed by the different VM placement algorithms is depicted in
Figure 8. In C-FFF, which ignores renewable energy availability and targets total cost reduction while
considering varying electricity prices and carbon taxes, the brown energy usage is 11,222.78 kWh with a
95% confidence interval (CI) of (1007.74, 14,875.94). In RC-RFFF, which also targets total cost reduction,
the brown energy usage is 7220.28 kWh with a 95% CI of (218.44, 14,869.16). It is noticed that the
RC-RFFF brown energy usage is 35.6% lower than that of C-FFF due to the consideration of renewable energy.
In EC-FFF, with electricity cost reduction as the objective and without the consideration of green energy,
the brown energy consumption is 11,128.31 kWh with a 95% CI of (958.84, 14,881.43). In REC-RFFF, the brown
energy usage is 6913.23 kWh with a 95% CI of (277.13, 14,878.51). The obtained results reveal that the
REC-RFFF brown energy usage is 37.8% less than that of EC-FFF due to the consideration of renewable energy.
Similarly, in CF-FFF the brown energy usage is 12,131.7 kWh with a 95% CI of (975.20, 14,875.44), while in
RCF-RFFF it is 7903.63 kWh with a CI of (272.06, 14,871.14), which is 34.85% lower. In CC-FFF, the brown
energy consumption is 12,029.22 kWh with a 95% CI of (1028.02, 14,870.66), while in RCC-RFFF it is
7869.22 kWh with a CI of (269.13, 14,867.92), which is 34.58% lower. It can be inferred from these results
that the renewable-based algorithms consume less brown energy than their grid-based counterparts, because
they schedule the workload to the data centers based on green energy availability in order to maximize
its usage.
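
     The quoted reductions follow directly from the mean values above; a short helper makes the arithmetic
explicit (minor differences from the in-text percentages stem from rounding of the reported means).

     def reduction_pct(grid_kwh, renewable_kwh):
         """Relative brown energy reduction of a renewable-aware algorithm vs its grid counterpart."""
         return 100.0 * (grid_kwh - renewable_kwh) / grid_kwh

     print(round(reduction_pct(11222.78, 7220.28), 1))   # C-FFF vs RC-RFFF
     print(round(reduction_pct(11128.31, 6913.23), 1))   # EC-FFF vs REC-RFFF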

                                Figure 8. Grid power consumption of servers.

     In Figure 9, the carbon emission of the proposed algorithms is compared. The renewable-based
algorithms produce less carbon emission than the grid-based ones. C-FFF emits 0.44441 tons of carbon with
a 95% CI of (0.03716, 0.59748), while RC-RFFF emits 0.29734 tons with a CI of (0.01794, 0.59738), which is
33.09% less than the former. EC-FFF emits 0.45197 tons with a CI of (0.03796, 0.59842) and CF-FFF emits
0.46218 tons with a CI of (0.02619, 0.59758). Similarly, REC-RFFF emits 0.30034 tons with a CI of
(0.02234, 0.59792) and RCF-RFFF emits 0.30121 tons with a CI of (0.01084, 0.59745). Both approaches lead
to approximately 34% less carbon emission than their grid counterparts.

                                         Figure 9. Carbon emission.

     It is noteworthy that the energy consumption and carbon emission of the renewable-based algorithms
in the early intervals are significantly lower than those of the grid-based algorithms, while the power
consumption becomes similar in the later intervals, which reveals the uncertainty of renewable energy
availability across the intervals of a day.

7.2.3. Discussion on Total Cost
     Figure 10 portrays the total operating cost of the proposed algorithms. The C-FFF approach results
in a total operating cost of $92.29 with a 95% CI of (7.99, 122.79). The RC-RFFF yields $65.35 with a CI of