Where Do You “Tube”? Uncovering YouTube Server Selection Strategy

                              Vijay Kumar Adhikari, Sourabh Jain, Zhi-Li Zhang
                                     University of Minnesota - Twin Cities

Abstract

YouTube is one of the most popular video sharing websites in the world. In order to serve its globally distributed users, it requires a massive-scale video delivery system.

We analyze DNS resolutions and video playback traces collected by playing half a million YouTube videos from geographically distributed PlanetLab nodes to uncover the load-balancing and server selection strategies used by YouTube. Our preliminary results indicate that YouTube is aggressively deploying cache servers at many different locations around the world, and several of them are co-located with ISP PoPs to reduce cost and improve end-user performance. We also find that YouTube tries local "per-cache" load-sharing before resorting to redirecting a user to bigger/central cache locations.

1 Introduction

YouTube, which started as a "garage project" to share videos online in 2005, has seen explosive growth in its popularity. Today it is indisputably the world's largest video sharing site, serving millions of users across the world every day. Due to its ever-increasing popularity, it is subject to continual expansion to accommodate the growing demand. As shown in a recent study [2], in 2008 YouTube used six large data centers located within the United States to serve videos to users (while the LimeLight CDN was used to push the most popular videos to users). However, these data centers were not enough to meet the increasing global demand, and some time after it was bought by Google, YouTube started expanding its video distribution network using Google's infrastructure.

In a recent work [1], Adhikari et al. used a reverse-engineering based methodology to uncover the basic principles behind the design of the YouTube video delivery cloud. It showed that the YouTube video delivery cloud consists of three components: the video-id space, hierarchical logical video server DNS namespaces, and physical video cache servers. YouTube then uses a combination of static and dynamic load-balancing approaches to distribute demand to its physical resources. As shown in that work, the video-id space consists of fixed-length unique identifiers, one per YouTube video, which are mapped to hierarchical video server namespaces using static hashing. In addition, YouTube DNS servers map these server hostnames to IP addresses corresponding to physical hosts in a client-location aware manner.

In this paper, we use the same active measurement infrastructure as [1] to provide further insights into the YouTube video delivery network. In particular, we provide additional details on the various strategies YouTube uses to distribute video requests to its geographically distributed global cache servers, how these strategies interact with each other, and how this interaction impacts the performance observed by the user. Our study shows that YouTube uses three different approaches to distribute load among its servers:

(a) Static load sharing using a hash-based mechanism: As noted in [1], YouTube maps each video-id to a unique hostname in each of the namespaces in the hierarchical DNS-based host namespaces.

(b) Semi-dynamic load sharing using location-aware DNS resolution: YouTube maps each DNS hostname to an IP address, which represents a physical video cache, based upon the user location and current demand. As seen in our experiments, YouTube directs a user to a geographically close cache location during "normal" hours. During "busy" hours, however, it uses DNS resolution to direct the user to a slightly farther location, which helps avoid geographical hot-spots.

(c) Dynamic load sharing using HTTP redirection: Finally, to further balance the load on different physical servers, YouTube caches use HTTP redirection to direct a user from a busy server to a less busy video server. This helps smooth the skewed load distribution caused by the combination of video popularity and spontaneous video demand.

Our findings also show that YouTube caches are present in more than 45 cities in 25 different countries around the world. These trends suggest that Google is aggressively pushing its content close to users by placing a large number of "cache servers" at various geographical locations around the world. Moreover, several of these caches are co-located with ISP PoPs, which not only helps reduce the bandwidth cost for both the ISP and YouTube, but also improves the performance for the ISP's users.

The remainder of the paper is organized as follows. We present background and related work in Section 2. Section 3 describes and characterizes various YouTube cache locations. We discuss YouTube's server selection and load balancing strategies in Section 4 and conclude the paper in Section 5.

2 Background & Related Work

In this section we first summarize the related work. Next, we provide an architectural overview of the YouTube video delivery cloud based on the findings from a recent study [1].

2.1 Related Work

The most relevant to our work is the recent study carried out in [1]. In that study the authors used an active measurement testbed comprising several PlanetLab nodes and open recursive DNS servers. Using the testbed, the authors played a large number of videos and analyzed the detailed video playback logs to distill the basic principles behind the YouTube video delivery cloud. However, that work mostly focused on the organization of video servers and provides very limited analysis of the mechanisms used to perform load-balancing and how they impact video delivery performance. In this work, we use the same testbed to extract the key load-sharing mechanisms used by YouTube. In addition, we provide a detailed charting of the various YouTube cache servers, their locations, and other characteristics.

In another recent study [2], the authors utilize Netflow traffic data passively collected at various locations within a tier-1 ISP to uncover the locations of YouTube data centers and infer the load-balancing strategy employed by YouTube at that time. The focus of that study was the impact of YouTube load-balancing on ISP traffic dynamics, from the perspective of the tier-1 ISP. As the data used in the study is from spring 2008, the results reflect the YouTube delivery infrastructure before the Google restructuring.

There are several other existing studies of YouTube, which mainly focus on user behavior or system performance. For instance, the authors in [3] examined the YouTube video popularity distribution, popularity evolution, and key elements that shape the popularity distribution using data-driven analysis. The authors in [4] investigate the (top 100 most viewed) YouTube video file characteristics and usage patterns, such as the number of users and requests, as seen from the perspective of an edge network. Another study [6] analyzed network traces for YouTube traffic at a campus network to understand the benefits of alternative content distribution strategies. A more recent work [5] studies the impact of YouTube video recommendations on the popularity of videos. Our study complements and advances these works by shedding light on the multi-step load-balancing and server selection strategy employed by YouTube to serve video content from geographically diverse cache locations.

2.2 YouTube Video Delivery Cloud

The YouTube video delivery cloud consists of three major components: the video id space, hierarchical logical video servers represented using multiple anycast (or logical) DNS namespaces, and a 3-tier physical server cache hierarchy. In the following we present a brief overview of these components; a detailed description can be found in [1].

(Footnote 1: Here by an anycast (DNS) namespace we mean that each DNS name is by design mapped to multiple IP addresses, i.e., "physical" video servers.)

YouTube Video Id Space: Each YouTube video is uniquely identified using a fixed-length flat identifier. These identifiers constitute the video id space.

Hierarchical Cache Server DNS Namespaces: YouTube defines multiple (anycast) DNS namespaces, each representing a collection of logical video servers with certain roles. Together, these (anycast) DNS namespaces form a hierarchical organization of logical video servers. Logical video servers at each level of this organization are mapped to IP addresses (of "physical" video servers residing at various locations) within a particular tier of the physical cache hierarchy. There are a total of three sets of anycast namespaces, which we refer to as the primary, secondary, and tertiary namespaces; each namespace has a specific format. For example, hosts in the primary namespace use the following format: v[1-24].lscache[1-8].c.youtube.com. As seen in this example, there are a total of 24 x 8 or 192 such hostnames in the primary namespace. Similarly, there are 192 hostnames in the secondary namespace and 64 hostnames in the tertiary namespace. These layered logical DNS namespaces play an important role in the dynamic HTTP request redirection mechanism employed by YouTube. In general, only the DNS names belonging to the primary namespace are visible in the URLs or HTML pages referencing videos.
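The namespace structure and the static video-id-to-hostname mapping described above can be sketched as follows. The hostname pattern and namespace size come from the paper; the particular hash function (MD5) is purely illustrative, since YouTube's actual hashing scheme is not disclosed.

```python
# Sketch of the primary namespace and a hash-based video-id mapping.
# The hostname format v[1-24].lscache[1-8].c.youtube.com is from the
# paper; the use of MD5 as the static hash is our assumption.
import hashlib

# Enumerate the 24 x 8 = 192 hostnames of the primary namespace.
PRIMARY = [f"v{v}.lscache{c}.c.youtube.com"
           for v in range(1, 25) for c in range(1, 9)]

def primary_host(video_id: str) -> str:
    """Statically map a video id to one primary logical hostname.

    Every request for the same video id lands on the same logical
    server, which is what makes the mapping 'static'.
    """
    digest = hashlib.md5(video_id.encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(PRIMARY)
    return PRIMARY[index]
```

Because the mapping is deterministic, any client (or any cache performing a redirection) can compute the logical server for a video id without coordination, which is the property the paper's load-sharing discussion relies on.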

DNS names belonging to the other two namespaces occur mostly only in the URLs used in dynamic HTTP request redirections during video playback.

Physical Cache Servers: Although there are three unique logical namespaces used to represent the DNS names for hosts in the various logical cache servers, each anycast DNS hostname may map to a large set of IP addresses. For example, the hostname v1.lscache1.c.youtube.com in the primary namespace maps to 75 different IP addresses depending on the time and the location of the user.

Datasets: For this study we used several datasets that were collected using our active measurement platform, comprising 500 PlanetLab nodes and 1000 open recursive DNS servers. We obtained video playback traces by playing approximately half a million videos used in [1]. We also obtained IP geolocation data from the same study. Additionally, we collected DNS resolution data for the YouTube hostnames from all the PlanetLab nodes continuously for a month. During that period we also monitored the change in view counts for all 500K videos.

3 Charting YouTube Global Caches

In this section we provide a detailed charting of YouTube physical video servers in terms of the IP addresses used and their geographical locations. For these analyses we extract the YouTube server IP addresses by resolving YouTube video server hostnames from various PlanetLab nodes and open recursive DNS servers continuously for more than a month. Next, we utilize the IP-to-city mapping data used in [1] to chart the geographic distribution of YouTube caches.

3.1 IP addresses for YouTube servers

Using the DNS resolutions of YouTube servers' logical DNS namespaces we extracted a total of 6000 unique IP addresses. In terms of the namespace hierarchy, there are 5000 IP addresses corresponding to the primary logical server namespace, and 636 and 320 IP addresses for the secondary and tertiary namespaces respectively.

Next, we performed WHOIS queries for all the extracted IP prefixes to find their owner organizations. Our results show that although most of the IP prefixes (/24s) belong either to Google or YouTube, approximately 20% of the prefixes belong to several other ISPs. The distribution of the organizations that own these prefixes is shown in Figure 1. In this figure the "others" category includes several regional ISPs, such as Bell Canada and Comcast, and some Internet Exchange Points. In addition, our results show that YouTube caches corresponding to IP addresses from other ISPs are mostly co-located with ISP PoPs, and only serve the client IP addresses belonging to those ISPs or their customers. We verified this by trying to access a number of videos from host IP addresses within those ISPs and outside those ISPs. In all cases, we found that only the client IP addresses within the corresponding ISP were able to successfully access the video. Requests coming from outside hosts were not served.

[Figure 1: Organizations where the YouTube video-serving IP prefixes come from (Google 69.5%, YouTube 11%, others 19.5%).]

3.2 YouTube Cache Locations and Sizes

Using the IP-address-to-location mapping data we identified a total of 45 cities where YouTube video caches are present. Out of these 45 locations, 43 distinct locations correspond to primary caches, while secondary and tertiary caches mapped to only 8 and 5 distinct locations respectively. The geographical span of these locations is shown in Figure 2. The primary, secondary, and tertiary cache locations are indicated by the letters P, S, and T respectively in that figure. We note that at some locations, more than one tier of the cache hierarchy is present.

[Figure 2: Three tiers of YouTube video cache locations (P=primary, S=secondary, T=tertiary).]

Although there are 45 unique locations where YouTube caches are present, not all locations are the same in terms of available resources. To understand the diversity in the size of different cache locations, we plot the distribution of YouTube cache sizes in Figure 3. We use the number of IP addresses present at each location as a proxy for the size of that cache location. In this figure the x-axis represents the YouTube locations, and the y-axis represents the size in terms of the number of IP addresses. As seen in this figure, the sizes of the cache locations vary widely. In particular, there are locations with a large number of IP addresses, e.g., the Frankfurt cache location has more than 600 IP addresses, while the cache located in Athens has merely 9 IP addresses.

[Figure 3: Distribution of the size of YouTube cache locations (cities) in terms of the number of IP addresses seen.]

4 Physical Server Selection & Load Balancing

We saw that YouTube aggressively pushes its content closer to users using a large number of geographically distributed video cache servers. In this section, we use that information to uncover the several mechanisms YouTube uses to serve each user request, so as to maximize user performance and to evenly distribute load on its cache servers.

When a user wants to watch a video, the video ID gets mapped to a unique hostname. The hostname then maps to different IP addresses based upon multiple factors. YouTube's server selection strategy can therefore be understood by examining this hostname-to-IP mapping. In this section we investigate the spatial and temporal factors that influence which IP address a hostname gets mapped to.

4.1 Location Awareness in Server Selection

In order to understand how the user's location plays a role in the mapping of YouTube cache server hostnames to IP addresses, we resolved the hostnames corresponding to primary logical servers from all the vantage points. We find that each primary logical server hostname maps to approximately 75 unique IP addresses. We also computed latencies between the vantage points and all the mapped IP addresses. These round-trip delay measurements show that the hostname-to-IP-address mapping provided by YouTube DNS servers tries to direct the user to a nearby physical cache server.

To demonstrate this, we plot in Figure 4 the percentile of the latency between each vantage point and its mapped IP address; the x-axis represents the percentile corresponding to the latency. This indicates that although the hostnames could have been mapped to any of the 75 IP addresses, in almost all cases YouTube maps them to a physical cache server (or IP address) that is close to the client requesting a video.

[Figure 4: CDF plot showing which decile the mapped IPs belong to in terms of ICMP ping latency.]

We also visualize the location awareness in DNS resolution using a geographic map, shown in Figure 5. In this figure we tag each PlanetLab node's location with the one of the 5 YouTube tertiary cache locations that it is mapped to. Each balloon represents a PlanetLab node, and the letter indicates the tertiary cache location that node gets directed to via DNS-based resolution. We can clearly see that in almost all cases, the nodes pick the closest tertiary cache location.

[Figure 5: Locations of PlanetLab nodes indicating their tertiary-cache location choices (B=BRU, C=CBF, F=FRA, I=IAD, N=NUQ).]

These results show that YouTube tries to send users to one of the closest cache locations.

4.2 Temporal Patterns in DNS Mappings

To understand temporal changes in DNS resolutions, we continuously resolved the logical server hostnames from all the vantage points for a month-long period.
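The month-long resolution measurement just described can be sketched as a simple recording loop. To keep the sketch self-contained, the resolver is injectable; a real run would wrap the system resolver (e.g., socket.gethostbyname), and the rotating toy resolver below merely mimics a vantage point whose mapping cycles among a handful of physical servers.

```python
# Sketch of the DNS measurement loop: resolve a logical server
# hostname once per five-minute slot and record which IP address it
# mapped to in each slot. The resolver function is a parameter so the
# sketch needs no network access.
from collections import defaultdict

def record_mappings(hostname, resolve, slots):
    """Return {ip: [slot indices]} over `slots` five-minute intervals."""
    seen = defaultdict(list)
    for slot in range(slots):
        ip = resolve(hostname, slot)
        seen[ip].append(slot)
    return seen

def rotating_resolver(hostname, slot):
    """Toy resolver: the mapping rotates among four physical servers,
    as observed at some vantage points."""
    pool = ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"]
    return pool[slot % len(pool)]
```

Plotting the recorded slot indices per IP address yields exactly the kind of time-series view used in the figures of this section.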
Our analysis of these DNS resolutions revealed that we can group the vantage points into two distinct groups based on how frequently the mappings change. In the first group of vantage points the mappings change during a certain time of the day, and the pattern repeats every day. For the vantage points in the second group, the pattern is more continuous.

To illustrate this, we plot the IP address mappings for one primary logical server hostname. Figure 6 shows an example plot for the second group. In this figure, the x-axis represents time, divided into intervals of 5 minutes each, and the y-axis represents the mapped IP address. As seen for this group, the mapped IP address changes almost every interval. This suggests that YouTube is using 4 distinct physical servers to represent the instance of one particular primary logical server, and changes the mapped IP address to divide client requests among these physical servers.

[Figure 6: Example plot showing the hostname-to-IP mapping changing continuously.]

On the other hand, there are other locations where new IP addresses only show up at specific times of the day. Our analysis of the DNS resolution pattern for these locations shows that each logical server hostname maps to a fixed IP address most of the time during the day; however, during certain hours of the day we see a large number of distinct IP addresses for the same hostname. We show an example of such a location in Figure 7. In this figure, we see that the DNS servers primarily map the hostname to IP 3. However, at "busy" hours of the day, the DNS server also starts mapping the hostname to other IP addresses. In this week-long data, we can clearly see 7 specific periods (one every day) in which additional IP addresses appear besides IP 3.

[Figure 7: Example plot showing the hostname-to-IP mapping changing only during peak hours.]

4.3 Need for Dynamic Load Balancing

We have seen that YouTube uses a static hash-based scheme to map videos to hostnames. However, the static mapping of videos to hostnames can lead to uneven load on the logical servers due to the differing popularity of each video. For instance, if a video becomes popular, the logical server corresponding to that video will experience more traffic than other servers. These uneven loads on logical servers then translate into uneven load distribution across the physical servers corresponding to them.

To understand this uneven load among different servers, we tracked the global video view counts for approximately half a million videos over a month. We see that even when we look at aggregate view counts over a month, including views from all over the world, different logical servers receive widely differing numbers of video requests. To demonstrate this, we plot the distribution of video requests on each logical server during a month for our sample of half a million videos in Figure 8. As seen in this figure, some logical servers are responsible for approximately 4 times more video requests than others. In fact, we can expect that during smaller time windows at different locations this distribution is even more likely to be skewed, which can result in a highly uneven load distribution on different physical servers.

[Figure 8: Distribution of video requests per primary cache hostname.]

We can see that in this setting, it is very hard to achieve even load sharing among different physical servers at a given cache location. In addition, static hash-based load distribution and DNS-based dynamic mappings, by themselves, cannot ensure that a location will be able to handle a sudden increase in video demand. To deal with these problems, YouTube uses a more dynamic approach based on HTTP redirection.

To understand this HTTP-redirection-based dynamic load-sharing approach, we extract all the instances where a user is redirected to a different physical server using HTTP redirections. Our analysis of these redirect logs reveals two distinct patterns. In the first set of instances, a logical server in one namespace redirects to another logical server in the same namespace. We refer to these cases as "intra-tier" redirections. In the rest of the cases, a server in one namespace redirects to a server in a higher tier of the cache hierarchy (such as a server from the primary namespace redirecting to a server in the secondary namespace). These cases are referred to as "inter-tier" redirections.

In the case of intra-tier redirection, we looked at the locations of the servers involved in the redirection. A careful inspection of these redirections showed that the IP addresses of the two servers (the one that redirects the client and the one that receives and handles the redirected requests) are in the same location (city). This means that in the case of intra-tier redirections, YouTube is trying to redirect a user to another physical server at the same location.

This suggests that YouTube is trying to distribute the load locally, i.e., a busy server in one location redirecting a client to a less busy server in the same cache location. To see if this is indeed the case, we examine the individual IP addresses involved in these redirections.
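Classifying redirect log entries into the two patterns described above can be sketched as follows. Only the primary namespace label ("lscache") is documented in the paper; the "tccache" and "cache" labels stand in for the secondary and tertiary namespaces and should be read as hypothetical placeholders, as should the assumption that all three namespaces share the v*.label*.c.youtube.com shape.

```python
# Sketch: classify a redirection as intra-tier or inter-tier from the
# hostnames involved. "lscache" (primary) is from the paper; the
# secondary/tertiary labels below are assumed placeholders.
TIER = {"lscache": 1, "tccache": 2, "cache": 3}

def tier_of(hostname: str) -> int:
    """Extract the cache tier from a logical server hostname."""
    label = hostname.split(".")[1]       # e.g. "lscache1"
    name = label.rstrip("0123456789")    # drop the server index
    return TIER[name]

def classify(src: str, dst: str) -> str:
    """Intra-tier: same namespace; inter-tier: different namespaces."""
    return "intra-tier" if tier_of(src) == tier_of(dst) else "inter-tier"
```

Run over a redirect log of (source, destination) hostname pairs, this split is what lets the analysis treat local load sharing and escalation to higher-tier caches separately.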

We see that at each location and at any given time period, there are servers that redirect to a different set of servers at the same location.

We see that the use of HTTP redirects helps YouTube achieve a more fine-grained load distribution among servers. The use of intra-tier and inter-tier redirections requires no global coordination among resources. Local load sharing only requires sharing load information among nodes at the same location. Redirecting to a higher tier also does not need any global coordination, since the target hostname can easily be obtained by the same static hashing mechanism.

However, the use of redirections also has its own drawbacks. First of all, since YouTube first tries to load-balance locally, a client may be redirected multiple times if the host that receives the redirection cannot serve the video. In addition, because each redirection requires the client to make an additional HTTP request, it also leads to higher delays before the video starts playing back. Moreover, inter-tier redirections generally lead a client to a distant cache location, because the higher-tier caches are present at only a small number of locations.

5 Conclusion

We examined YouTube's video distribution architecture to uncover key aspects of how video servers and their locations are determined when users try to watch YouTube videos. We found that YouTube uses a large number of data centers and caches that vary in size, geographic location, and other characteristics. We also uncovered key ideas behind the application-level redirection mechanism that YouTube uses and how it impacts the performance observed by its users. These results demonstrate how one of the best content distribution networks works, with implications for how similar future systems can be developed and deployed.

In the future, we plan to build upon these preliminary findings to investigate other aspects of the YouTube video delivery system, such as the effects of the popularity of videos and the size of cache locations on server selection.

References

[1] Adhikari, V. K., Jain, S., Chen, Y., and Zhang, Z.-L. How Do You "Tube"? Tech report, http://www-users.cs.umn.edu/~viadhi/resources/youtube-tech-report.pdf (2010).

[2] Adhikari, V. K., Jain, S., and Zhang, Z. YouTube Traffic Dynamics and Its Interplay with a Tier-1 ISP: An ISP Perspective. In IMC '10 (2010), ACM.

[3] Cha, M., Kwak, H., Rodriguez, P., Ahn, Y.-Y., and Moon, S. I tube, you tube, everybody tubes: analyzing the world's largest user generated content video system. In IMC '07 (2007), ACM.

[4] Gill, P., Arlitt, M., Li, Z., and Mahanti, A. YouTube traffic characterization: a view from the edge. In IMC '07 (2007), ACM.

[5] Zhou, R., Khemmarat, S., and Gao, L. The Impact of YouTube Recommendation System on Video Views. In IMC '10 (2010), ACM.

[6] Zink, M., Suh, K., Gu, Y., and Kurose, J. Characteristics of YouTube network traffic at a campus network: measurements, models, and implications. Comput. Netw. (2009).