Achieving Near-Optimal Traffic Engineering Solutions for Current OSPF/IS-IS Networks

Page created by Juan Hamilton
 
CONTINUE READING
1

             Achieving Near-Optimal Traffic Engineering
             Solutions for Current OSPF/IS-IS Networks
                                                                                                       
                   Ashwin Sridharan , Roch Guérin , Christophe Diot
          ashwin,guerin @ee.upenn.edu , christophe.diot@intel.com

       Abstract— Traffic engineering aims to distribute traffic           and (optimally) assigning traffic to them, typically
    so as to “optimize” some performance criterion. This                  calls for changes to both the routing protocols
    optimal distribution of traffic depends on both the routing           and the forwarding mechanism they rely on, e.g.,
    protocol and the forwarding mechanisms in use in the
                                                                          through the introduction of new technology such as
    network. In IP networks running the OSPF or IS-IS
    protocols, routing is over shortest paths, and forwarding             MPLS [15].
    mechanisms distribute traffic “uniformly” over equal cost                Currently, two of the most widely used Interior
    shortest paths. These constraints often make achieving an             Gateway routing protocols are OSPF [14] and IS-
    optimal distribution of traffic impossible. In this paper,            IS [4], and devising solutions that allow these
    we propose and evaluate an approach that can realize                  protocols to emulate “optimal routing,” is therefore
    near optimal traffic distribution without changes to routing
                                                                          desirable. There are two main difficulties in doing
    protocols and forwarding mechanisms. In addition, we
    explore the trade-off that exists between performance and             so. The first is that these protocols use shortest
    the configuration overhead that our solution requires. The            path routing with destination based forwarding. The
    paper’s contributions are in formulating and evaluating               second is that when the protocols generate multiple
    an approach to traffic engineering in IP networks that                equal cost paths or next hops for a given desti-
    achieves near-optimal performance while preserving the                nation routing prefix, the forwarding mechanism
    existing infrastructure.                                              equally splits the corresponding traffic across them
      Index Terms— Routing, Networks, Traffic Engineering,                to balance the load. This “equal” splitting is an
    Aggregation.                                                          approximation that depends on the selection of
                                                                          which next hop to use for a given packet. This
                    I. I NTRODUCTION                                      selection is based either on the value (one for each
       As the amount and criticality of data being carried                possible next hop) of a hash function applied to
    on IP networks grows, managing network resources                      the packet header (source and destination addresses
    to ensure reliable and acceptable performance be-                     and port numbers), or on the state of a simple
    comes increasingly important. Furthermore, this                       round-robin scheme that cycles through the possible
    should be accomplished while minimizing or defer-                     next hops. Although the latter option is occasionally
    ring costly upgrades. One of the techniques that is                   used and provides for a more even distribution of
    being evaluated by many Internet Service Providers                    load across possible next hops, the former option
    to achieve this goal is traffic engineering. Traf-                    is the more commonly used in practice. This is
    fic engineering uses information about the traffic                    because it preserves packet ordering within a flow,
    entering and leaving the network to generate a                        and because the large number of flows that modern
    routing scheme that optimizes network performance.                    routers handle results in a good approximation of
    Most often the output of traffic engineering is an                    equal splitting ([5]). For simplicity, in the paper we
    “optimal” set of paths and link loads that produce                    assume that traffic is split in exactly equal fractions
    the best possible performance given the available                     across equal cost next hops.
    resources. However, explicitly setting up such paths                     The constraints imposed by the use of both short-
                                                                          est path routing and equal splitting across equal cost
                                                                         paths make it difficult or even impossible to achieve
 Dept. of Elec. Eng., Univ. of Pennsylvania, Philadelphia, PA 19104.   optimal traffic engineering link loads. One of the
    Intel Research, Cambridge, UK
 The work of A. Sridharan and R. Guérin was supported in part by         first works to explore this issue was [7], where a
 NSF Grant ANI-9902943                                                    local search heuristic was proposed for optimizing
2

OSPF weights assuming knowledge of the traffic                    other words, for each prefix we define a (sub)set
matrix. [7] showed that in spite of these constraints,            of allowable next hops by carefully selecting this
properly selecting OSPF weights could yield sig-                  subset from the set of next hops corresponding to the
nificant performance improvements. However, the                   shortest paths computed by the routing algorithm.
paper also showed that for some topologies, per-                  This allows us to control how traffic is distributed
formance could still differ substantially from the                without modifying routing protocols such as OSPF
optimal solution. Subsequently, a result from linear              or IS-IS, and without requiring changes to the
programming ([2][Chap. 17, Sec. 17.2]) was used                   data path of current routers, i.e., their forwarding
in [21] to prove that any set of routes can be                    mechanism. It does, however, require some changes
converted into a set of shortest paths based on                   to the control path of routers in order to allow the
some link weights that matches or improves upon                   selective installment of next hops in the forwarding
the performance of the original set of routes. This               table.
establishes that the shortest path limitation is in                  Our initial focus is on gaining a better understand-
itself not a major hurdle. However, the result of                 ing of how well the selective installment of next
[21] assumes forwarding decisions that are specific               hops for different routing prefixes can approximate
to each ingress-egress pair, and more importantly,                an optimal traffic allocation (set of link loads).
the ability to split traffic in an arbitrary ratio over           We show that this problem is NP-complete and
different shortest paths. Both of these assumptions               hence one must resort to heuristics. Toward this
are at odds with current IP forwarding mechanisms.                end, we propose a simple local search heuristic for
   Compatibility with destination based forwarding                implementing the solution. We obtain a bound on
can be achieved through a simple extension to the                 the gap between the heuristic and optimal allocation,
result of [21], simply by taking advantage of a                   and also demonstrate its efficacy through several
property of shortest paths ([17]), or by readjusting              experiments. Even though we study the heuristic in
the program formulation itself ([1])1 . Accommo-                  the context of a routing problem, we believe that
dating the constraint of uneven splitting of traffic              it is generic enough to be of potential use in other
across multiple shortest paths is a more challeng-                load balancing scenarios. The main finding from our
ing task. It is typically not supported with the                  investigation is that the performance achieved by
forwarding path implementations that accompany                    this approach is essentially indistinguishable from
most routing protocols (some limited support is                   the optimal.
available with Cisco’s proprietary routing protocol                  This being said, an obvious drawback of “hand-
EIGRP2 ). A new protocol, OSPF-OMP, was pro-                      crafting” the set of next hops that are to be installed
posed in [19] that circumvents this constraint by                 for each routing prefix in a router’s forwarding
modifying the hash function in the router to allow                table, is the configuration overhead it introduces.
unequal load balancing. However, this will require                The potential magnitude (proportional to the size
significant changes to the forwarding mechanisms in               of the routing/forwarding table) of this overhead
the router’s data path, which is typically not some-              could make this approach impractical. As a result,
thing that can be accomplished through a simple                   our next step is to investigate a solution that can
software upgrade.                                                 help mitigate this overhead, albeit at the cost of
   The solution we propose leverages the fact that                a possible degradation in performance. Specifically,
present day routers have thousands of route entries               we limit the number of routing prefixes for which
(destination routing prefixes) in their routing table.            we perform the proposed selective installment of
Instead of changing the forwarding mechanism re-                  next hops. Our results indicate that a significant
sponsible for distributing traffic across equal cost              reduction in configuration overhead can be achieved
paths, we plan to control the actual (sub)set of                  without a major loss of performance.
shortest paths (next hops) assigned to routing prefix                The rest of the paper is structured as follows.
entries in the forwarding table(s) of a router. In                Section II introduces the linear program formulation
                                                                  used in [21] to generate an ”optimal” set of shortest
  1
   This is explained in detail in Section II-A
                                                                  paths, and introduces the proposed modifications
  2
   Cisco’s specification of EIGRP allows unequal load balancing   to make it compatible with existing IP routers.
across shortest paths and non-shortest paths.                     Section III presents a heuristic for approximating
3

                                                                                                                                              
                                                                                                                                   >;                   )*J+          M#%&N
an optimal traffic distribution by manipulating the                                                                                         *J+LK        (                        
                                                                                                                 C
                                                                                                                  7 I
set of next hops assigned to each routing prefix. A                                                                                        
                                                                                                                 D K           ;                     G           M#=&'          O
                                                                                                                                         4* 3 +LK                                     PH   (1)
performance bound is also derived (see Appendix).
                                                                                                                          
Section IV presents several experiments that first                                                                 ;
                                                                             where                      is the fraction ,* +                of traffic for commodity 
establish the efficacy of the heuristic of Section III,                                                                             M#=&'
                                                                             that flows through link                                               . Solving            the linear pro-
and then explore the impact on performance of low-                                                                                                        ;
                                                                             gram gives a traffic allocation Q                                               ( 4
                                                                                                                                                               *   3 
                                                                                                                                                                     + R that consumes
ering configuration overhead. Section V provides                                                               )*43 +
                                                                             no      more            than       (        amount               of     bandwidth               on any link
a brief summary of the paper’s contributions and                               M#=&'
outline directions for future work.                                                     . Note that the formulation is not dependent
                                                                             on the original network                   "  R
                                                                                                                                            cost function used to ob-
                                                                             tain the paths Q                                    . Instead it achieves the same
     II. F ROM O PTIMAL ROUTING TO S HORTEST                                 performance by matching the desired bandwidths
                        PATH ROUTING                                         )*,+
                                                                              ( . In order to obtain link weights for shortest path
     In this section, we first briefly review the clas- routing, the dual of the linear program as formulated
 sic result from linear programming [2][Chap. 17, in [21] needs to be solved:
                                                                                                                             
 Sec. 17.2] that was cast in the context of routing                                                                   Y
                                                                                    -TSCU                                  [
                                                                                                                           W   Z\A 2                 )*,+C]^*,+
 in communication networks in [21] to show how                                                  7IV3$W47CX                          *43 +65879
                                                                                                                                                        (
 optimal routing can be achieved using only shortest
 paths. We then discuss why this result is not directly                             subject to 
                                                                                          Y               Y                                           `                   aM#=&'
                                                                                                 + A          * K ]H*J+\_ G                                 T^                    
 usable in current IP networks, and finally propose
 solutions that allow us to implement the result under                                      ]^*,+Lb D
                                                                                                   
 the existing paradigm.                                                                      Y              Ded
                                                                                                  c Z                                                                                      (2)
     The network is modeled
                         
                                            as a directed graph 
                                                                                                                                                                               G
           with                routers and                              The dual gives a set of link weights Q _ ]^                                                       ( *43 + R ,
 directed links. We assume the                  existence
                                                     
                                                                of a traffic from which a set of shortest paths can be con-
 matrix  where entry                                       denotes the structed.
 average intensity
                        of     traffic       entering      the  network
                                                                     
                                                                          at        It is however important to understand that al-
 ingress router         and exiting at egress router for though routing can now be done over shortest paths,
 commodity ! . A good reference on how to this is still quite different from the forwarding
 construct such a traffic matrix can be found in [6]. paradigm currently deployed in existing OSPF and
 Assume that an optimal allocation based on some IS-IS networks. The reasons are two-fold and both
 network wide cost function yields a set of paths can be traced to the output of the primal
"                                                                                                                                                                                       LP,
       for each commodity  (ingress-egress router Eqn. (1), namely, the traffic allocation Q ; ( *?3 +CR . They
 pair), so that the $total
                      #%&'
                                bandwidth consumed by these govern the fraction of traffic forwarded on each
        3
 paths on link                   is )( *,+ . It can be shown that next-hop.
 the same performance, in terms of the bandwidth                                    Firstly, observe that the traffic allocation is for
 consumed on each link, can be achieved with a set each commodity or ingress-egress router pair. This
 of shortest paths by formulating and solving a linear means that the routing protocol could possibly gen-
 program and its dual. The linear program can be erate different set of next hops for each ingress-
 formulated as ([21]):                                                       egress pair on which traffic is to be forwarded. This
                                                                                                          would impact the forwarding mechanism on the data
  -/.10    2                                ;
                                                  *,+                                                     path, as the router would need to make decisions
               *43 +65879  7:                                                                           on the basis of both ingress and egress routers.
   subject to                                                                                              Clearly, this is at odds with the current paradigm
                                                                                 D #FE             of destination-based forwarding.
                                                                                      
                2            ;                           2          ;             G   #                    The second problem relates to the fact that current
                                   @+ *BA                               ,* +               
          +=< +>3 *?5179                         +=< *43 +=517C9                                          forwarding mechanisms support only equal splitting
                                                                                        /H
                                                                                                           of traffic on the set of equal cost next hops. The
   3
     The bandwidth consumed on a link is assumed to be the sum of                                          linear program yields a traffic allocation that is not
the traffic on all paths that use the link                                                                 guaranteed to obey this constraint. Modifying the
4

                                                                                                                                                                                           `                 #                 k                             
forwarding engine to support unequal splitting (see,                                                                                                                                                                                                      
                                                                                                                                    W
for example, [19]) of traffic would involve changes                                                                               *,+ K           )*,+ L`m$#%&'
                                                                                                                                                   (                
                                                       W47Clmf
to the data path, namely modification of the hash
function to allow for controllable hash boundaries. where
                                                            W                                                                                                             d
In the next two sub-sections we suggest methods         i                                                                          <  7%:n3 W[ZopW 
that overcome both these problems.
                                                                                                                                                                                                                                                                                        (3)
                                                                                                                                                                                                                                                                            W
A. Destination Based Aggregation of Traffic                                                                                   Note that the variables  of the Primal, f , repre-                                                                                         4* 3 +
                                                                                                                                                #=&
   The first problem of translating a traffic alloca-                                                               
                                                                                                                        sent traffic on link         headed toward egress router
tion that distinguishes between ingress-egress router                                                                     . Also, the flow conservation constraints have been
pairs into one that only depends on egress point is                                                                     modified
                                                                                                                            W        to accommodate the aggregation, where
                                                                                                                        i      represents all traffic headed toward destination
relatively straightforward. It can be achieved simply                                                                 
by transforming the individual splitting ratios of                                                                        .
ingress-egress pairs that share a common egress                                                                               The dual is given by :
router into a possibly different splitting ratio for the
                                                                                                                                                          W@s             W
aggregate traffic associated with the common egress                                                                           -/SjU                                                A           2                               )*,+]H*,+
                                                                                                                                                                                                                                 (
router. The reason this is possible is because all                                                                                              W47lrq                                            *43 +=517C9
chosen routes are shortest paths. Shortest paths have
                                                                                                                        subjectW to
the property that segments of shortest paths are also                                                                         Y                  Y W            ]^*,+\_                              G           `u                                x$#%&'
                                                                                                                                  *nA              +tK                                                                              wv                                           
shortest paths, so that once two flows headed to the                                                                                                            `yM#=&'
                                                                                                                              ]H*J+Lb D
same egress point meet at a common node they will                                                                                                                                                        
                                                                                                                               Y W                        `B
subsequently follow the same set of shortest paths.                                                                              W  D                                        zv
This means that we need not distinguish between
                                                                                                                                                                                                                                                                                      (4)
these packets based on their source address, and                                                                                        W
can make splitting and forwarding decisions simply                                                                    where q represents the right hand side array in
based on their destination address.                                                                                 the flow conservation constraints of the primal, Eqn.
   The aggregation of flows headed to a common                                                                      (3), that is :
egress could either be done after the optimal traffic                                                                                                                                  W                                        #            
                                                                                                                                                            Ai                                                                      
allocation for each commodity has been computed                                                                            { W M#6                                          j                                      #                                       
                                                                                                                                                                                                  /|                                               
(for example by the LP in Eqn. (1)) or in the                                                                                                                        D
                                                                                                                                                                                                                          otherwise
formulation of the linear program itself. The first
                                                                                                                                                      G
method is presented in [17]. The second scheme                                                                      and as before, Q _ ]H( *43 + R still represent link weights
was presented in [1] and illustrated below. Since                                                                   for shortest path computation. The above formu-
traffic allocation is done based only on the egress                                                                 lation is not only conformal to the paradigm of
router, we can aggregate all commodity flow vari-                                                                   destination based forwarding, but also reduces the
ables headed toward a common egress point and                                                                       number of variables by a factor of                through
then suitably modify the conservation constraints.                                                                  removal of information regarding ingress-egress
A re-formulation of Eqn. (1) and Eqn. (2) in the                                                                    router pairs. This greatly improves the computa-
destination aggregated form as presented in [1] is                                                                  tion complexity involved in obtaining an optimal
reproduced below.                                                                                                   solution and will be used henceforth throughout
   The primal Linear Program is given by :                                                                          the remainder of the paper to compute the optimal
                                                                                                                    traffic allocation and link weights.
                                                W
    -/.10       2                             *,+
                    *43 +65879 W47Xgf
                                                                                                                    B. Approximating Unequal Split of Traffic
subject to                                                                   W                                        In the previous sub-section, we saw that solving
                                                                                         #           
                                                                      Ai                                          the problem of source-destination based forwarding
                              W                             W           C       #               j       
       2                    +@*hA         2               *,+                                                   decisions was relatively straightforward. Unfortu-
    +6< +>3 *[5879     f           +=< *43 +=5179   f           D
                                                                                 otherwise                          nately, the same does not hold for the uneven
5

 splitting issue, and as mentioned earlier providing                         to yield information at the granularity of a routing
 such a capability is a significant departure from                           prefix.
 current operations. Our proposal to overcome this                              The second issue concerns the configuration over-
 problem is to take advantage of the fact that today’s                       head involved in communicating to each router the
 routing tables are relatively large, with multiple                          subset of next hops to be used for each routing pre-
 routing prefixes associated with the same egress                            fix. This can clearly represent a substantial amount
 router. By controlling the (sub)set of next hops                            of configuration data, as routing tables are large
 that each routing prefix is allowed to use, we can                          and the information that needs to be conveyed is
 control the traffic headed toward a particular egress                       typically different for each router. The approach
 router(destination). In other words, instead of the                         we propose and study is to identify a small set
 current operation that has all routing prefixes use                         of prefixes for which careful allocation of next
 the full set of next hops, we propose to selectively                        hops is done and rely on default behavior for the
 control this choice based on the amount of traffic                          remaining prefixes. The trade-off will then be in
 associated with each routing prefix and the desired                         terms of how close one can get to an optimal
 link loads for an optimal traffic allocation.                               traffic distribution, while configuring the smallest
      The following example illustrates the idea behind                      possible number of routing prefixes. We investigate
 the approach. Assume that at some node, there are                           this trade-off in Section IV, where we find that
 four routing prefixes, '} ~ € and  that map to                      near optimum performance can often be achieved by
 a common        destination      and have traffic intensities              configuring only a small number of routes (routing
‚          ‚         ‚               ‚     
    '}   „ƒ      ~   „       €   „† and        „‡ . Let          prefixes).
 there be three shortest paths associated with destina-                         The third and last challenge is to actually for-
 tion  , so that the routing table has 3 possible next                      mulate a method for determining which subset of
 hops to  , and assume that the optimal distribution                        next hops to choose for each routing prefix in
 of traffic to the three next hops is ˆp}‰‹Š ˆj~r„‡                        order to approximate an optimal allocation. The goal
 and ˆj€ŒŽ . We can then intuitively match this traffic                    of any solution will be to minimize some metric
 distribution by the following next hop assignment:                          that measures discrepancy between the optimal traf-
                                                                             fic allocation and the one achieved under equal-
                    '}        Hops 1, 3                                   splitting constraints on any hop. In this work, we
                    ~         Hop 1                                       use the maximum load on any hop as a measure of
                    €         Hop 3                                       performance, where the load on a hop is the ratio
                             Hop 2                                       of the allocated traffic to the optimal traffic. Details
                                                               
                                                                             regarding the use of another metric, the gap between
                                                                       _
   The resulting  traffic distribution is        ˆ’} ‘        ƒ “”ƒ
                                                                  e          optimal traffic and allocated traffic can be found in
   G                  G                           _        G 
p“     •Š ˆ’~ ‘  ‡e“   „‡ ˆ’€ ‘          ƒe“”ƒ     , †e“         –     [17].
which matches the optimal allocation.
   The advantage of the above approach is that the                                III. H EURISTIC FOR T RAFFIC S PLITTING
forwarding mechanism on the data path remains                                   Ideally, one should consider the problem of se-
unchanged, as packets are still distributed evenly                           lective next-hop allocation at the global level, that
over the set of next hops assigned to a routing prefix.                      is, do a concurrent optimal assignment of next hops
This means that a close approximation of an optimal                          for each routing prefix at each node. However, even
traffic engineering solution might be feasible even                          the single node allocation problem is NP-complete4 ,
in the context of existing routing and forwarding                            and hence may not be computationally tractable.
technologies. There are, however, a number of chal-                          Consequently, we propose heuristics that perform
lenges that first need to be addressed. The first is the                     independent computations for each routing prefix
need for traffic information at the granularity of a                         at each node. These computations are based only
routing prefix entry instead of a destination (egress                        on the incoming traffic at the node and the desired
router). This in itself is not an insurmountable task                        outgoing traffic profile. A potential problem with
as most of the techniques currently used to gather                           this approach is that the traffic arriving at a node
traffic data, e.g., router mechanisms’ like Cisco’s
Netflow or Juniper’s cflowd, can be readily adapted                           4
                                                                                  See Appendix for proof.
6

may not match the optimal profile due to the heuris-               of assigned traffic to the optimal traffic load over all
tic decisions at some upstream node. Consequently,                 hops. The difference now is that each task (stream)
the profile of the outgoing traffic from the node                  can be split equally among multiple processors
in question, could further deviate from the desired                (next hops) and the processors (next hops) can have
one. However, in our experiments we observe that                   different speeds (optimal traffic loads).
usually the heuristic is able to track the optimal                    Algorithm MIN-MAX LOAD:
load profile and hence incoming traffic seen at any                   1) Sort the set of prefixes £xš 3 œ in descending
node and the resultant outgoing traffic have a near-                      order of traffic intensity.
                                                                                            #       3œ
optimal profile. We have proposed three heuristics                    2) For each prefix      §£xš    choose a subset  of
                                                                                     ¨                               ¨
that are greedy in nature and try to minimize one                         next hops ( w™›š 3 œ , with cardinality  ( 
of the two metrics mentioned in Section II-B. Since                       which minimizes
the heuristics are similar in nature, we shall limit                                                                  *?®
                                                                                                                            } _                  ¯° ±
                                                                                                                  ¦                           ±²³
our discussion to only one of them, namely the                                            -TSCU       7©«ª¬ ­  ¡                                    
                                                                                                  ¡
MIN-MAX LOAD heuristic, because in all our ex-                                                                                        ˆ ¡
periments, this heuristic performed the best. Details              Step 2) can be G achieved         in two stages. First, for
regarding the other two heuristics can be obtained                                      ddd ž        3œ
                                                                   each index ´m     ƒ
                                                                                              #
                                                                                                      š    , do a virtual assign-
from [17].                                                         ment of routing prefix to a set of ´ hops which
   The heuristic we propose works broadly in the                   yields the smallest maximum.  *?®        This can be done by
following fashion. When performing computation at                                                               } _               *
                                                                                                           ¦                ¢           “C´
an arbitrary node                                                  simply sorting the set Q ¡        ˆ ¡
                                                                                                               R    µ–™Œš
                                                                                                                             3œ

   1) Sort routing prefixes destined to a particular               in increasing
                                                                              #
                                                                                 order, re-indexing them and virtually
        egress router in decreasing order of traffic               assigning only to the first     ´ hops.
                                                                                             ž
        intensity,                                                    Second, from all the š 3 œ such possible assign-
   2) Sequentially assign each routing prefix to a                 ments, choose the one with the smallest maximum
        subset of next hops so as to minimize the                  for an actual assignment. In case of a tie, choose a
        given metric.                                              lexicographically smaller assignment.
                                                                                               
                                                                                                                 The complex-
                                                                                                                  ž ~ 
For clarity we use the following notation in our                   ity of the algorithm is ¶ ž vŽ·1¸pž ¹xv _ v          , where
                                                                                         3œ                3œ
subsequent discussion:                                             we have set v‹vtš                  š    for simplicity.
At an arbitrary node — , when assigning routing                       We outline our implementation of the heuristics
prefixes associated with an arbitrary egress router                in the pseudo-code below :
(destination) ˜ to next hops:
                                                                     procedure     
                                                                                          Selective HopG Allocation
   1) Denote the set of        next     hops to egress router
                   3œ     G     ddd ž     œ                       Input º          Link    Weights Q _ ]H    ( *J+ R , optimal traffic
      ˜    by ™›š     Q    ƒ            špŸ R .                                       W                              
                                                                     allocation Q f ( *,+R , Traffic Matrix 
   2) Denote the desired (optimal) traffic load (for
                                                                     For each destination node ˜ do
        egress router ˜ ) on hop /z™›š 3 œ by ˆj¡ . #                                                                         G
                                                                        Run Dijkstra’s algorithm    E
                                                                                                            with weights Q _ ]^     ( *,+ R
   3) Denote the traffic intensity of routing prefix
                                                                        For each node — Ž˜ in order of decreasing
        by ¢ * . Denote the collective set of the routing
                                                                     distance from ˜ do
        prefixes (at — for ˜ ) that need to be assigned
                                                                          Apply the heuristic to the set of routing
        to next hops by £aš 3 œ . Let vuš 3 œ ¤£aš 3 œ  .                      3œ
                                                                     prefixes  £aš        to determine,
                                                                                                      *     for each routing prefix
   4) Denote the traffic load          on hop after heuris-   *    #
                                 #                                    , the set of next hops ™ š        3 ¼
                                                                                                          œ  » ™›š 3 œ
        tic ¥ has* assigned          routing    prefixes by ¦¡ .                                                  #
                        D            #
                                        K D .                             For each routing prefix ½£xš 3 œ do
        Assume ¦ ¡         for
                                                                             Update the intensity& of the            *    corresponding
                                                                     routing prefix at each node w™ š 3 œ
A. The MIN-MAX LOAD heuristic                                             done
   The Min-Max Load heuristic is similar to a                           done
work conserving scheduling algorithm which tries                     done
to minimize the maximum load on any processor                           For the sake of continuity, analysis of the heuristic
(the maximum makespan, [10]). Heuristic MIN-                         is postponed till the appendix. There we show that
MAX LOAD tries to minimize the maximum ratio                         the MIN-MAX LOAD heuristic achieves a load
7

                                          G       ž                                                                                                                                                                                                                            W        #                    
                                 _   0                                                                                                                                                                                                                           A‰i                               
ratio that is within a factor of   ·    “‚ƒ of the                                                                                   W                                                                                      W                                                            #                        
                                                                                        2                                         >+ 3 *hA                     2                                                         4* 3 + 
ratio achieved by an optimum allocation (under the                                                                                                                                                                                                                                                                      (5)
                                                                           +=< +>3 *?517C9                              f                               +6< *?3 +=517C9                               f                                                      D
equal splitting constraint).                                                                                                                                                                                                                                                     otherwise
                                                                                               `Œ#                            h                              
                                                                                                                                                  
                                                                                                                                            W                              `nM#%&N
                                                                                       *43 +
                                                                               ˆ                                                        ¾ *43 +                                                                               
                            IV. E XPERIMENTS                                                                         W47X
                                                                           ÁÂ*43 +                                          *43 +                  *43 +                         `n$#%&'
                                                                                                            ˆ                          “p¿                                                                                            
   In order to evaluate the effectiveness of our                                            *43 +                            *43 +                 ÁÂ*43 + K                            G
approach, we conducted two sets of experiments on                          À                                    ˆ                                                                             “”à                                                                                                                       (6)
artificially generated topologies as well as on an                                          *43 +                                    *43 + A               ƒ                    *43 +                                G                        K           ÁÂ*43 + K
                                                                           À                                ŽÃpˆ                                                  ¿                                                          “‚à                                                      ƒe“”à                              (7)
actual ISP topology, namely the Sprint Backbone.                                                                                                           Ã G
In the first set we studied the performance of our                                          *43 +                      GCD                *43 + A                          Š                    *?3 +                                                 K               ÁÂ*43 + K                               GjD
                                                                           À                                                        ˆ                                             ¿                                               ƒe“”à                                                          e“                     (8)
heuristic when compared against optimal routing.                                                                                                                   GÃ
In the second set of experiments we studied the                                             *43 +                             D           *43 + A                          Äp†                            *43 +                                      GCD              K         ÁÂ*43 + K                 G
                                                                           À                                ŽÄ                      ˆ                                                      ¿                                               e“                                                                           (9)
trade-off between performance and configuration                                                                                                                            ÃG                                                                                                                           G”G
overhead by varying the number of routing prefixes                                          *43 +                             DpD              *?3 + A                            ‡”Šp†                                        *43 +                 G K                   ÁÂ*43 + K
                                                                           À                                Ž                             ˆ                                                                     ¿                                                                                      GCD              (10)
for which we controlled the set of next hops they                                                                                                                                  GÃ                        G
were assigned.                                                                              *43 +                             DpDpD                    *43 + A                              Špà                           †                  *43 +                   GpG         GjD          K         ÁÅ*?3 +
                                                                           À                                Ž                                 ˆ                                                                                    ¿                                       “                                            (11)
   For purposes of comparison, we solved a linear                                                                                                                                                 Ã
multi-commodity flow routing problem with the                         where,W as before
                                                                                                                                                                       
                                                                                   i                                                   
same piecewise linear cost function used in [7].                                                                     
8

                                      4

                              6
                                  x 10
                                                                                                                  number of routing prefixes and distribution7
                                                                                                                  from which the ingress traffic intensity of each
                              5
                                                                                                                  routing prefix was picked.
                                                                                                             2) Hot spots were introduced in the traffic matrix
                                                                                                                  by randomly selecting elements from the traf-
                              4
                                                                                                                  fic matrix and scaling them to create several
       Cost Function Φ(f,C)

                                                                                                                  instances of the traffic matrix. We tested cases
                              3                                                                                   where only some of the traffic elements were
                                                                                                                  chosen and also cases where all entries were
                              2                                                                                   chosen. In the latter case, this involves scaling
                                                                                                                  the entire traffic matrix.
                              1
                                                                                                             3) The “optimal routing problem”, Eqn. (12),
                                                                                                                  was then solved for each such instance (topol-
                                                                                                                  ogy and traffic matrix).
                              0
                                  0       20   40       60             80         100         120            4) The linear program, Eqn. (3), with the optimal
                                                    Link Load f
                                                                                                                  link bandwidths from the “optimal routing
                                                                  ¬ ÇÉÈ4Ê ¬ ÇËÍÌ ¬ ÇÉÎ
Fig. 1. Evolution
         Ê ¬Ç     of cost function Æ °                                    °       °     as a function of          solution” as input, was solved to obtain the
link load °                                                                                                       traffic allocation (which was aggregated based
                                                                                                                  on destination, ref. Section II-A) and the set
                                                                                                                  of link weights.
generated by picking the traffic intensity of each                                                           5) Finally, our heuristic was run over the network
routing prefix from a pareto or uniform distribution.                                                             with the link weights and traffic flows from
The choice of a pareto distribution was motivated                                                                 the previous step (please refer to pseudo-
by measurements taken from several routers on the                                                                 code) in order to approximate the optimal load
Sprint backbone. We also experimented with other                                                                  profile of Eqn. (3).
distributions, eg. uniform, bimodal, exponential and                                                       We used ILOG CPLEX to solve the optimal routing
gaussian. Of these, we present results for the uni-                                                        problem and the linear program. On a Dell 1500
form distribution since it is representative of the                                                        1 Ghz machine, it took about ƒ Gjhours         to solve
others. The other parameter of importance is the                                                                                                     D
                                                                                                           the optimal routing problem and             minutes for
number of routing prefix associated with each egress                                                       the linear program and our heuristic on the largest
router, that is, the granularity of the matrix. For this,                                                  networks.
we again used both uniform and pareto distributions,
as it gives a reasonable coverage for the possible dif-                                                    B. Performance Comparison against Optimal Rout-
ference in the number of available routing prefixes                                                        ing
to a given egress router. The Sprint traffic matrix
was based on actual traffic traces downloaded from                                                            We now present and discuss the results of our
access links to two of the Sprint backbone routers.                                                        experiments. In Figure 2 we plot Cost vs Total
The traces was measured at the granularity of the                                                          Traffic for the MIN-MAX LOAD heuristic and the
routing table entries6 and gives us two rows of the                                                        optimal routing solution for the Sprint topology.
traffic matrix. The routing prefix intensities in the                                                      The horizontal lines represent various levels of
remaining rows were generated artificially using a                                                         maximum average link utilization over all links for
pareto distribution for both intensity and granularity.                                                    optimal routing. The entire traffic matrix was scaled
Details of how the entire Sprint traffic matrix was                                                        for the experiments involving the Sprint backbone.
constructed can be found in [18].                                                                          From the figure, we see that in all the cases, the
                                                                                                           heuristic is very near the optimal solution indicating
   Each experiment was conducted in the following
fashion :                                                                                                  that it is able to match the optimal traffic split very
                                                                                                           closely.
   1) For each network topology, we generated ran-                                                            For comparison, we have also shown the per-
       dom traffic matrices, varying both the total                                                        formance of standard OSPF routing with weights
  6                                                                                                         7
      The routing prefix intensities are averages over 10 hrs.                                                  Except in the case of the Sprint traffic matrix.
9

                                          6                                  ISP Backbone                                                                                        5                                  50 Nodes 200 Edges Graph
                                      x 10                                                                                                                               x 10
                                 2
                                                                                                                                                                     8                 Optimal
                                              Optimal
                                                                                                                                                                                       Min−Max Load
                                              Min−Max Load
                                1.8                                                                                                                                                    F&T Heuristic
                                              F&T Heuristic
                                                                                                                                                                                       Max Link Util=1.00
                                              Max Link Util=1.00
                                                                                                                                                                     7                 Max Link Util=0.75
                                              Max Link Util=0.75
                                1.6                                                                                                                                                    Max Link Util=0.25
                                              Max Link Util=0.25

                                1.4                                                                                                                                  6

                                1.2                                                                                                                                  5
     Cost Function

                                                                                                                                          Cost Function
                                 1
                                                                                                                                                                     4

                                0.8
                                                                                                                                                                     3

                                0.6

                                                                                                                                                                     2
                                0.4

                                                                                                                                                                     1
                                0.2

                                 0                                                                                                                                   0
                                      0          2           4           6         8          10         12         14          16                                               0.5                        1                1.5                    2           2.5
                                                                              Total Traffic                               x 10
                                                                                                                                 9                                                                                        Total Traffic                           x 10
                                                                                                                                                                                                                                                                        9

Fig. 2.                                      Sprint Backbone: Performance of heuristic vs. optimal                                    Fig. 4. 50 Node 200 Edge graph (BRITE Generated): Performance
routing.                                                                                                                              of heuristic vs. optimal routing

                                                                             ISP Backbone                                                                                                                            50 Nodes 200 Edges Graph
                                0.3                                                                                                                                  0.6
                                                                                                               Min−Max Load                                                                                                                              Min−Max Load

                                0.2                                                                                                                                  0.4
     % Deviation from Optimal

                                                                                                                                          % Deviation from Optimal

                                0.1                                                                                                                                  0.2

                                 0                                                                                                                                       0
                                 0.4                 0.6           0.8             1               1.2        1.4               1.6                                          0                 0.5              1              1.5              2       2.5              3
                                                                              Total Traffic                              x 10
                                                                                                                              10                                                                                           Total Traffic                          x 10
                                                                                                                                                                                                                                                                         9

Fig. 3. Sprint Backbone:% Deviation of the heuristic from optimal                                                                     Fig. 5. 50 Node 200 Edge graph (BRITE Generated):% Deviation
routing.                                                                                                                              of the heuristic from optimal routing.

computed using our implementation of the heuristic                                                                                    routings under different weight settings. In order
proposed in [7] (denoted by “F&T Heuristic” in                                                                                        to avoid getting trapped in local minima valleys,
the graph). The heuristic uses local search tech-                                                                                     the heuristic performs diversification by randomly
niques to determine the best OSPF weight set-                                                                                         selecting a new neighbourhood, again by weight
ting. Specifically, it searches the neighbourhood                                                                                     perturbation. Similar to [7] we run the heuristic for a
of a given weight setting by performing random                                                                                        fixed number of iterations (5000) and retain the best
link weight changes in order to determine a better                                                                                    weight setting. In our experiments, we found the
weight setting. It then repeats the search in the                                                                                     heuristic to perform quite well, closely tracking the
neighbourhood of the new weight setting. A unique                                                                                     optimal cost for utilizations below ĔpÏ . However,
feature of the heuristic is its use of hash tables to                                                                                 instances in [7] show that the heuristic can deviate
avoid cycling as well as re-computation of same                                                                                       significantly from the optimal routing even at low
10

                                         6                          60 Nodes 250 Edges Graph                                                                                   6                              50 Nodes 154 Edges Graph
                                   x 10                                                                                                                                     x 10
                               7
                                             Optimal                                                                                                                                 Optimal
                                             Min−Max Load                                                                                                           4.5              Min−Max Load
                                             F&T Heuristic                                                                                                                           F&T Heuristic
                               6             Max Link Util=1.00                                                                                                                      Max Link Util=1.00
                                             Max Link Util=0.75                                                                                                         4            Max Link Util=0.75
                                             Max Link Util=0.25                                                                                                                      Max Link Util=0.25

                               5                                                                                                                                    3.5

                                                                                                                                                                        3
    Cost Function

                                                                                                                                          Cost Function
                               4
                                                                                                                                                                    2.5

                               3
                                                                                                                                                                        2

                                                                                                                                                                    1.5
                               2

                                                                                                                                                                        1

                               1
                                                                                                                                                                    0.5

                               0                                                                                                                                        0
                                                  1               1.5                     2              2.5              3                                                        0.8           0.9           1                1.1       1.2    1.3
                                                                             Total Traffic                              x 10
                                                                                                                               10                                                                                       Total Traffic                    x 10
                                                                                                                                                                                                                                                                10

Fig. 6.   60 Node 250 Edge graph (GT Topology Generator):                                                                            Fig. 8.   50 Node 154 Edge Graph (GT Topology Generator):
Performance of the heuristic vs. optimal routing                                                                                     Performance of the heuristic vs optimal routing.

                                                                        60 Nodes 250 Edges Graph                                                                                                             50 Nodes 154 Edges Graph
                                0.06                                                                                                                                6
                                                                                                               Min−Max Load                                                                                                                     Min−Max Load

                                0.05
                                                                                                                                                                    5

                                0.04
                                                                                                                                                                    4
    % Deviation from Optimal

                                                                                                                                         % Deviation from Optimal

                                0.03

                                                                                                                                                                    3

                                0.02

                                                                                                                                                                    2
                                0.01

                                                                                                                                                                    1
                                     0

                               −0.01                                                                                                                                0
                                   0.5                1           1.5               2              2.5         3               3.5                                  0.7                  0.8           0.9         1                1.1   1.2   1.3             1.4
                                                                               Total Traffic                            x 10
                                                                                                                              10                                                                                       Total Traffic                           10
                                                                                                                                                                                                                                                         x 10

Fig. 7.   60 Node 250 Edge graph (GT Topology Generator):%                                                                           Fig. 9. 50 Node 154 Edge Graph (GT Topology Generator):%
Deviation of the heuristic from optimal routing                                                                                      Deviation of the heuristic from optimal routing.

utilizations. Furthermore, we note our heuristics can                                                                                plot Cost vs Total Traffic for the heuristic, optimal
be used to improve the performance of the Fortz and                                                                                  routing as well as standard OSPF routing, on a
Thorup heuristic (see Section IV-D).                                                                                                 50 Node 200 Edge topology with a granularity
    In Figure 3, we plot the % deviation of the                                                                                      of 26500 routing prefixes per node. This number
MIN-MAX LOAD heuristic from the optimal. The                                                                                         was chosen simply as an approximation of the
maximum
G
            deviation of the heuristic is well within                                                                                number of routing prefixes in a backbone router.
  Ï   of the optimal, highlighting the ability of the                                                                                We have conducted experiments with up to 100,000
heuristic to approximate the desired (optimum) load                                                                                  routing prefixes and as few as 500 routing prefixes
very closely.                                                                                                                        without any significant change in performance of
    We now look at 4 artificially generated topologies                                                                               the heuristic. The topology was generated using the
that we used in our experiments. In Figure 4, we                                                                                     BRITE generator and all links were set to a capacity
11

                                       6                            50 Nodes 206 Edges Graph
                                   x 10
                               9
                                           Optimal
                                                                                                                            the heuristic.
                               8
                                           Min−Max Load
                                           F&T Heuristic                                                                       For all the 3 remaining topologies, the traffic
                                           Max Link Util=1.00
                                           Max Link Util=0.75                                                               matrix granularity as well as intensities were chosen
                                           Max Link Util=0.25
                               7                                                                                            from a uniform distribution. Hot spots were created
                               6
                                                                                                                            by scaling 50% of the node-pairs in the traffic
                                                                                                                            matrix. Cost and % deviation curves for a 60 node
    Cost Function

                               5                                                                                            250 edge topology are shown in Figures 6 and
                               4
                                                                                                                            7 respectively. Results for a 50 node 154 edge
                                                                                                                            random topology are presented in Figures 8 and 9.
                               3                                                                                            Both topologies were generated by the Georgia Tech
                               2
                                                                                                                            topology generator using the uniform distribution
                                                                                                                            and their link capacities were set to 500 Mbps.
                               1                                                                                            Finally, the cost and % deviation graphs for a 50
                               0
                                                                                                                            node 206 edge random topology generated using
                                   0          1         2       3         4           5
                                                                          Total Traffic
                                                                                                6   7      8
                                                                                                                x 10
                                                                                                                        9
                                                                                                                       10
                                                                                                                            the waxman-1 distribution are shown in Figures
                                                                                                                            10 and 11. The link capacities for this topology
Fig. 10.     50 Node 206 Edge Waxman Graph (GT Topology                                                                     were randomly chosen from a uniform distribution.
Generator): Performance of the heuristic vs. optimal routing.                                                               We note that for all three topologies, the heuristic
                                                                                                                            performs very well, closely tracking the optimal
                                                                     50 Nodes 206 Edges Graph
                               0.03
                                                                                                        Min−Max Load
                                                                                                                            cost.

                                                                                                                            C. Non-Integer Link Weights
                               0.02
                                                                              We should point out that the dual of the linear
    % Deviation from Optimal

                                                                           program, Eqn. (4), is not guaranteed to yield exact
                                                                           integer solutions for link weights. In practice, rout-
                                                                           ing protocols like OSPF and IS-IS have a finite field
      0.01
                                                                           width  for link weight information ([11],[16]). If the
                                                                           link weights obtained from the linear program are
                                                                           not exact integers, fitting them within the provided
                                                                           field length can affect performance in the form of
                                                                           routings that are different from the optimal routing.
         0
           0  1    2     3     4            5       6    7      8       9  Hence it is important to study how errors in link
                                Total Traffic                                                                          10
                                                                   x 10
                                                                           weights influence performance.
Fig. 11.     50 Node 206 Edge Waxman Graph (GT Topology                       There are two factors which can introduce inac-
Generator): % Deviation of the heuristic from optimal routing.             curacies in link weights. First of course is the loss
                                                                           of precision in rounding off link weights, especially
                                                                           in the presence of recurring fractions, e.g. 2/3. The
of 500 Mbps. The number of prefixes for each second factor is the non-zero tolerance required by
node-pair were chosen from a uniform distribution optimizers to converge to feasible solutions. This
and the intensity of the entries in the traffic matrix implies that the optimal link weights (and path
for this experiment were generated from a pareto costs) are accurate only within a certain tolerance.
distribution. Hot-spots were generated by scaling For example, two path costs are treated to be equal
70% of the traffic elements. We note from Figure by the optimizer if the difference in costs is less
4, that the heuristic closely tracks the optimal. An than the specified tolerance. InGCD ®”our                   Ð    experiments,
alternate view is provided in Figure 5 where we plot the tolerance limit was set to                                . The presence
the % deviation of the heuristic               Ded
                                                    from  G
                                                            the
                                                              
                                                                  optimal. of  recurring fractions and  non-zero   tolerance limits
The low percentage deviation ƒpÏ                       A    Ï    from the means that even with large precision integer repre-
optimal value again highlights the effectiveness of sentations, these errors still persist and in fact are
12

                                                                                                                            50 Nodes 206 Edges Graph
exacerbated as the integer precision decreases8                                                    3
                                                                                                  10
                                                                                                            Mask − 16
   Figure 12 shows the impact of the length of the                                                          Mask − 24
                                                                                                            Mask − 8
link weight field on % deviation of the heuristic                                                  2
                                                                                                  10

from optimal cost for the 50 Node 206 Waxman
                                                                                                   1
Graph. The link weights are treated as exact integers                                             10

                                                                       % Deviation from Optimal
with precision governed by the field length. We
                                                                                                   0

plot curves with field lengths of 8, 16 and 24 bits.                                              10

Observe that the field-length has little impact on                                                 −1
                                                                                                  10
performance till link utilizations start increasing.
This is because, in all our experiments, at low                                                    −2
                                                                                                  10
utilizations ( ÑÒÄppÏ ), the link weights are almost
always exact integers9. At higher utilizations, errors                                             −3
                                                                                                  10
due to non-zero tolerances and non-integer link
weights come into play so that the performance                                                     −4
                                                                                                  10
starts to deviate significantly. Note that even with                                                    0         1     2              3
                                                                                                                                  Total Traffic
                                                                                                                                                       4   5
                                                                                                                                                                  10
                                                                                                                                                                      6
                                                                                                                                                               x 10
a 24-bit field length, performance is off by at least
50%, indicating that precision is not the dominant                Fig. 12. 50 Node 206 Waxman Topology: Impact of Integer weights
issue. Instead the errors are caused by the inherent              on performance
inaccuracy in link weights. When treated as exact
integers, regardless of the precision used, this pro-
duces different routings.
   In order to avoid problems due to such errors,
we use an approach similar to that used by the op-                         improvement may be obtained with our heuristic.
timizer. We treat path costs be to be equal, whenever                      As an example, one could compute shortest paths
they differ by less than the specified tolerance thus                      using the heuristic proposed by Fortz et. al. [7].
allowing for inaccuracies. This explains the much                          However, instead of splitting traffic equally over
better performance (within ÃpÏ ) seen for the same                         all equal cost next hops, it could be distributed
instance in Figure 11.                                                     in an “optimal fashion” over the computed paths
                                                                           using the next hop allocation technique presented
                                                                           in this paper. In order to achieve this, we first solve
D. Applications of the Next Hop Allocation Tech- the optimal routing problem but with traffic now
nique                                                                      constrained to flow only on paths computed using
     Although we have presented our technique for the link weights obtained from the ’F & T’ heuristic.
next hop allocation within the framework of shortest The MIN-MAX LOAD heuristic is then run to
path routing as computed by the Linear Program, shape the allocated traffic to match the optimal load.
Eq. (3), it is also applicable to paths computed We present an example of this procedure for the
through other methods. Given a set of paths com- 60 Node 250 Graph in Figure 13, were we show
puted using any other technique, we can find the the performance improvement of traffic allocation
optimal distribution of traffic over the set of pre- with the MIN-MAX LOAD heuristic over equal
computed paths and then use our next hop allocation splitting. Note that for both the curves, the same
technique to match this profile. If the constraining set of paths were used, but the MIN-MAX Load
factor is a poor set of paths, then modifying the heuristic allows for finer load balancing. In other
traffic profile may not offer much improvement. words, the improvement afforded over the standard
However, if the deciding factor is coarse granu- ’F & T’ heuristic by the use of the MIN-MAX
larity of load balancing over the paths, significant Load heuristic can be solely attributed to its ability
                                                                           to better match optimal traffic loads by distributing
   8
     Note that since the computer is a finite precision machine, these traffic unevenly. We revisit this issue in the next
errors are already present in the link weights obtained from the
solution of Eq. (4).
                                                                           section, where we explore the impact of the number
   9
     At low utilizations, the optimizer can easily find feasible solutions of equal cost paths (next hops) generated by the
and hence the tolerance limits are not enforced.                           routing algorithm.
13

                           6                           60 Nodes 250 Edges Graph                                                           6
                        x 10                                                                                                        x 10                               60 Nodes 250 Edges Graph
                    7                                                                                                           8
                               Optimal
                               F&T With Min−Max Load Balancing                                                                                    Optimal
                               F&T Heuristic                                                                                                      Min−Max Load
                               Max Link Util=1.00                                                                               7                 Single Hop
                    6          Max Link Util=0.75
                               Max Link Util=0.25

                                                                                                                                6
                    5

                                                                                                                                5
    Cost Function

                                                                                                                Cost Function
                    4

                                                                                                                                4

                    3
                                                                                                                                3

                    2
                                                                                                                                2

                    1
                                                                                                                                1

                    0                                                                                                           0
                                   1                 1.5                      2      2.5          3                             0.5                    1         1.5              2                  2.5          3               3.5
                                                                 Total Traffic                  x 10
                                                                                                      10                                                                     Total Traffic                                    10
                                                                                                                                                                                                                           x 10

Fig. 13. 60 Node 250 Edge Topology: Impact of MIN-MAX LOAD                                                 Fig. 14. Impact of Equal Cost Paths on performance for the 60
Heuristic on load balancing with weights computed using Fortz and                                          Node 250 Edge Graph.
Thorup Heuristic
                                                                                                                                              6                              ISP Backbone
                                                                                                                                      x 10
                                                                                                                                 3
                                       Network                                 Number of Hops
                                                                                                                                                                                              Optimal
                                                                             Average Maximum                                                                                                  Min−Max Load
                                                                                                                                                                                              Single Hop
                                  Sprint N/W                                  1.69        5
                                                                                                                                2.5
                         50    Node, 200 Edge              Graph              1.31        4
                         50    Node, 154 Edge              Graph              1.20        4
                         50    Node, 206 Edge              Graph              1.26        4
                                                                                                                                 2
                         60    Node, 250 Edge              Graph              1.05        5
                                                                                                               Cost Function

                                                           TABLE I
                                                                                                                                1.5
                         N UMBER OF E QUAL C OST N EXT H OPS FOR EACH
                                              I NGRESS -E GRESS PAIR
                                                                                                                                 1

                                                                                                                                0.5

E. Equal Cost Paths
   One of the key features of OSPF/ISIS discussed                                                                                0
                                                                                                                                      0            2         4          6          8            10           12       14          16
                                                                                                                                                                              Total Traffic                                        9

in this paper is the ability to balance traffic across                                                                                                                                                                      x 10

multiple equal cost paths. In this section, we ex-                                                         Fig. 15. Impact of Equal Cost Paths on performance for the Sprint
amine the effectiveness of this feature in improving                                                       Backbone.
performance. In other words, how is routing per-
formance dependent on its ability to send traffic on
more than one path. Intuitively, optimal routing is                                                           We observe that the number of equal cost next
more likely to use multiple paths to balance traffic at                                                    hops used by the optimal formulation in order to
higher loads rather than when capacity is plentiful.                                                       balance load (although the distribution of traffic
Hence, we focus on reasonably
                            GCDpD
                                         high utilization                                                  across them need not be uniform) varies with the
scenarios, albeit less that       Ï link loads, for which                                                  topology. In other words, the benefit of routing over
we compute statistics regarding the number of equal                                                        multiple equal cost paths is a function of how the
cost next hops used by optimal routing. Table I                                                            logical topology is designed and also how well the
shows the average and maximum number of equal                                                              traffic is matched to the topology. To emphasize the
cost next hops computed by the optimal formulation                                                         dependence of benefits of using equal cost paths
across all nodes and for all destinations for several                                                      on the topology, we evaluate performance when no
network configurations,                                                                                    splitting of traffic is allowed. The “SINGLE HOP”
14

heuristic chooses exactly one next hop, the one with  the routing prefixes, confirming the results reported
largest “optimal traffic, ˆj¡ ”, for each destination at
                                                      in [3]. Figure 16 highlights this observation, where
each node. In all other aspects it functions exactly  we have plotted the cumulative traffic intensity as a
like the MIN-MAX Load heuristic.                      function of the number of routing prefixes sorted in
   Figure 14 shows performance curves for the         decreasing order of their traffic intensities. We can
“SINGLE HOP” heuristic the MIN-MAX LOAD               potentially exploit such a phenomenon by config-
heuristic on the 60 Node, 250 edge graph. Ob-         uring the set of next hops for only a few selective
serve the near-optimal performance of the “SIN-       routing prefixes that carry most of the traffic and
GLE HOP” heuristic, indicating that multiple equal    allowing the default assignment of all next hops
cost paths have little impact on performance. The     for the remaining routing prefixes. This has the
average number of next hops for the graph in Ta-      advantage of lowering configuration overhead, but
ble I is close to 1, which confirms our observation.  raises the question of how it impacts performance.
Note that this configuration is the same as that of      We carried out a systematic study of such a
Figure 13 which illustrated the improvement that      trade-off on all the previous topologies. In each
could be achieved by the MIN-MAX Load heuristic       instance, we configured the set of next hops at each
over the standard ’F & T’ heuristic. This is because  node for only a certain set of routing prefixes that
whenever the ’F & T’ heuristic computes multiple      were selected based on the amount of traffic they
paths, it ends-up allocating traffic equally across   carried. The remaining routing prefixes were split
them, even when an optimal solution calls for a       equally over the entire set of next hops as would
very uneven traffic allocation. The MIN-MAX Load      happen with default OSPF/IS-IS behavior. The set
heuristic, even if it is constrained to using the sameof configured routing prefixes was then progres-
paths, is able to generate the uneven loads the the   sively increased in each experiment to determine
optimal routing solution calls for.                   the evolution of the impact on performance. In all
   In contrast, performance on the Sprint network     cases, the MIN-MAX LOAD heuristic was used
(Figure 15) is significantly enhanced by distributing when configuring the set of next hops.
traffic across multiple next hops, and as a result       The resulting performance curves for the 50 Node
the “SINGLE HOP” heuristic performs quite poorly.     200 Edge topology are shown in Figure 17 and the
The average number of equal cost next hops for the    number of configured routing prefixes are shown
Sprint network from Table I is close to 2, reflecting in Table II. Each curve on the plot is referenced
this behaviour. Indeed, some major ISPs specifi-      by the amount of traffic that was accounted for by
cally design their logical topologies to increase the the configured routing prefixes. This can be cross-
number of equal cost paths, especially between geo-   referenced from the table against the number of
graphical hubs, for purposes of both load balancing   routing prefixes that were configured. We observe
(as explained in Section I) as well as robustness.    that on an average, by configuring about 165 routing
                                                                                                        D
We expect load balancing over equal cost multiple     prefixes per router (which accounts for about ƒ Ï
paths to markedly improve performance in these        of the traffic), we get good performance till about
networks.                                             50% maximum link utilization. If we configure next
                                                      hops for about 17 % of all routing prefixes, or 4500
                                                      entries, at a router, we account for approximately
F. Lowering Configuration Overhead                    75% of the traffic and the resulting performance is
   Our other goal was to investigate the trade-off quite close to that of optimal routing.
between configuration overhead and performance.          Experiments conducted on the Sprint Backbone
Recall that in the original approach the heuristic (Figure 18, Table III) yield similarly encouraging re-
decides the subset of next hops assigned to every sults. We get good performance up to approximately
routing prefix. However, it has been observed that 50-60% maximum link utilization, by configuring
in practice ([3]), a large fraction of the traffic is only 200 routing prefixes per router and up to more
distributed over a relatively small number of rout- than 70% link utilization if we configure 600 routing
ing prefixes. Our analysis of the backbone traces prefixes per router.
obtained from the Sprint router show that 95% of         The 60 node 250 edge topology (Figure 19, Table
the total traffic was accounted for by only 10% of IV) and the 50 node 154 edge topology (Figure
15

20, Table V) present an interesting case in that by                                                                                           x 10
                                                                                                                                                     5                  50 Nodes 200 Edges Graph

carefully configuring only à A
                                                                                                                                          8
                                 % of the prefixes                                                                                                                            Optimal
                                                                                                                                                                              10%
we get close to optimal performance over a very                                                                                           7
                                                                                                                                                                              20%
                                                                                                                                                                              50%
                                                                                                                                                                              75%
wide range of link utilizations. We do not get                                                                                                                                90%
                                                                                                                                                                              Max Link util=100%
quite such dramatic results for the 50 node 206                                                                                           6                                   Max Link Util = 50%
                                                                                                                                                                              Max Link Util=25%
edge Waxman topology, (Table VI, Figure 21), but
                                                                                                                                          5
even in this case, we get good performance D by

                                                                                                                          Cost Function
configuring around a quarter of the prefixes ( Ï                                                                                          4

of the traffic).
                                                                                                                                          3
                                             Cumulative Contribution of Routes at an ISP Router
                              1
                                                                                                                                          2

                             0.9

                                                                                                                                          1
                             0.8

                             0.7                                                                                                          0
                                                                                                                                              0              0.5          1                    1.5        2               2.5
                                                                                                                                                                               Total Traffic                              9
                                                                                                                                                                                                                       x 10
    Fraction of Total Rate

                             0.6

                             0.5
                                                                                                                   Fig. 17. BRITE 50 Node 200 Edge topology : Performance as a
                                                                                                                   function of configuration overhead
                             0.4

                             0.3                                                                                                                   Prefixes        Total No. of           % Allocated           % of
                                                                                                                                                  configured         Prefixes              Fraction            Traffic
                             0.2
                                                                                                                                                     160              30700                 0.5%               10 %
                             0.1
                                                                                                                                                     200              30700                 0.6%               20 %
                                                                                                                                                     620              30700                  2%                50 %
                              0                                                                                                                     1750              30700                  6%                75 %
                                   0   1         2             3            4               5      6           7
                                                 Route Index in Decreasing Order of Intensity           x 10
                                                                                                            5                                       4180              30700                 14 %               90 %
                                                                                                                                                                         TABLE III
Fig. 16. Cumulative contribution of routing prefixes at a Sprint
                                                                                                                   C ONFIGURATION OVERHEAD : S PRINT BACKBONE , ALL ENTRIES
Router sorted in decreasing order of intensity.
                                                                                                                                                                        ARE PER NODE

                                Prefixes       Total No. of                % Allocated             % of
                               Configured        Prefixes                   Fraction              Traffic
                                   75             26500                       0.3%                10 %                                        x 10
                                                                                                                                                   5                          ISP Backbone
                                                                                                                                          7
                                   165            26500                      0.62%                20 %                                                                         Optimal
                                                                                                                                                                               10%
                                  1252            26500                       4.7 %               50 %                                                                         20%
                                                                                                                                                                               50%
                                  4500            26500                      17.0 %               75 %                                    6                                    75%
                                 11747            26500                     44.32 %               90 %                                                                         90%
                                                                                                                                                                               Max Link util=90%
                                                                                                                                                                               Max Link Util=70%
                                                           TABLE II                                                                       5
                                                                                                                                                                               Max Link Util = 50%
                                                                                                                                                                               Max Link Util=25%
 C ONFIGURATION OVERHEAD : 50 N ODE 200 E DGE T OPOLOGY,
                                            A LL ENTRIES ARE PER NODE
                                                                                                                        Cost Function

                                                                                                                                          4

                                                                                                                                          3

                 V. C ONCLUSION                                                                                                           2

  In this paper, we have described and evaluated
                                                                                                                                          1
an approach that offers the benefits of traffic engi-
neering to IP networks without requiring changes
                                                                                                                                              4          5   6      7     8         9          10    11   12      13      14
to either the routing protocols or the forwarding                                                                                                                              Total Traffic                           x 10
                                                                                                                                                                                                                              9

mechanisms.
  Our contribution is three-fold. First, we devised                                                                Fig. 18. Sprint Backbone: Performance as a function of configuration
                                                                                                                   overhead
a solution for closely approximating optimal link
16

                                                                                                                                                     6                        50 Nodes 154 Edges Graph
                                                                                                                                               x 10

                        Avg No. of                   Total No. of           % Allocated          % of                                  4.5
                                                                                                                                                                                      Optimal
                          Routes                       Routes                Fraction           Traffic                                                                               10%
                        configured                                                                                                         4
                                                                                                                                                                                      20%
                                                                                                                                                                                      50%
                           5125                        117890                     4.3%           10   %                                                                               75%
                                                                                                                                                                                      90%
                          14183                        117890                    12.0%           20   %                                3.5                                            Max Link util=90%
                                                                                                                                                                                      Max Link Util = 75%
                          32754                        117890                    27.5 %          50   %                                                                               Max Link Util = 50%
                                                                                                                                           3
                          57940                        117890                    49.1 %          75   %                                                                               Max Link Util=25%

                                                                                                                       Cost Function
                          80500                        117890                    68.2%           90   %
                                                                                                                                       2.5

                                                              TABLE IV
                                                                                                                                           2
 C ONFIGURATION OVERHEAD : 60 N ODE 250 E DGE T OPOLOGY
  (GT T OPOLOGY G ENERATOR ), ALL ENTRIES ARE PER NODE                                                                                 1.5

                                                                                                                                           1

                                                                                                                                       0.5

                                                                                                                                           0
                                                                                                                                                         0.8       0.9            1             1.1         1.2       1.3
                                                                                                                                                                                        Total Traffic                              10
                                                                                                                                                                                                                                x 10

                           6                              60 Nodes 250 Edges Graph
                        x 10
                    7
                               Optimal
                                                                                                                   Fig. 20. 50 Node 154 Edge Topology : Performance as a function
                               10%                                                                                 of configuration overhead
                               20%
                               50%
                    6          75%
                               90%
                               Max Link util=90%                                                                                                Prefixes                 Total No. of             % Allocated          % of
                               Max Link Util = 75%
                    5          Max Link Util = 50%                                                                                             configured                  Prefixes                Fraction           Traffic
                               Max Link Util=25%
                                                                                                                                                  4221                      97772                    4.3%             10 %
                                                                                                                                                 11706                      97772                   11.9%             20 %
    Cost Function

                    4
                                                                                                                                                 27131                      97772                   27.5 %            50 %
                                                                                                                                                 48098                      97772                   49.8 %            75 %
                    3                                                                                                                            55754                      97772                   68.5 %            90 %
                                                                                                                                                                               TABLE VI
                    2
                                                                                                                     C ONFIGURATION OVERHEAD : 50 N ODE 206 E DGE WAXMAN
                                                                                                                                                           T OPOLOGY, ALL ENTRIES ARE                       PER NODE
                    1

                    0
                                      1                 1.5                  2            2.5             3
                                                                Total Traffic                         x 10
                                                                                                              10
                                                                                                                                                 6                            50 Nodes 206 Edges Graph
                                                                                                                                           x 10
                                                                                                                                       9

Fig. 19. GT-ITM 60 Node 250 Edge Topology: Performance as a
                                                                                                                                       8                                              Optimal
function of configuration overhead                                                                                                                                                    10%
                                                                                                                                                                                      20%
                                                                                                                                                                                      50%
                                                                                                                                       7                                              75%
                                                                                                                                                                                      90%
                                                                                                                                                                                      Max Link util=90%
                                                                                                                                       6                                              Max Link Util = 75%
                                                                                                                                                                                      Max Link Util = 50%
                                                                                                                                                                                      Max Link Util=25%
                                                                                                                       Cost Function

                                                                                                                                       5
                          Prefixes               Total No. of              % Allocated           % of
                         configured                Prefixes                 Fraction            Traffic                                4
                            5640                   145827                     3.8%              10 %
                           15805                   145827                    10.8%              20 %                                   3
                           37108                   145827                    25.4 %             50 %
                           67132                   145827                     46 %              75 %                                   2
                           95420                   145827                    65.4 %             90 %
                                                                                                                                       1
                                                              TABLE V
 C ONFIGURATION OVERHEAD : 50 N ODE 154 E DGE T OPOLOGY,                                                                               0
                                                                                                                                           0                   1     2        3             4           5         6         7           8
                                             ALL ENTRIES ARE PER NODE                                                                                                                  Total Traffic                            x 10
                                                                                                                                                                                                                                       10

                                                                                                                   Fig. 21. 50 Node 206 Edge Waxman Topology: Performance as
                                                                                                                   function of configuration Overhead
17

loads using routing protocols and packet forward-                        [3] S. Bhattacharya, C. Diot, J. Jetcheva, and N. Taft. Pop-level and
ing mechanisms, as they exist today. Second, we                              access-link level traffic dynamics in a tier-1 pop. Proceedings of
                                                                             ACM SIGCOMM Internet Measurement Workshop (IMW 2001),
proposed a simple heuristic with a provable perfor-                          November 2001.
mance bound (see Appendix) to implement our so-                          [4] R. Callon. Use of OSI IS-IS for routing in TCP/IP and dual
lution. We performed several experiments in which                            environments. Request For Comments (Standard) RFC 1195,
                                                                             Internet Engineering Task Force, December 1990.
the heuristic consistently matched the optimal load                      [5] Z. Cao, Z. Wang, and E. Zegura. Performance of hashing-
profile. We believe the heuristic and those presented                        based schemes for internet load balancing. In Proceedings of
in [17] to be general enough to be potentially useful                        INFOCOM, Tel Aviv, Israel, March 2000.
                                                                         [6] A. Feldmann, A. Greenberg, C. Lund, N. Reingold, J. Rexford,
in their own right. Finally, we showed, using actual                         and F. True. Deriving traffic demands for operational IP net-
traffic traces, that configuration overhead can be                           works: Methodology and experience. IEEE/ACM Transactions
vastly reduced without significant loss of perfor-                           on Networking, 9, 2001.
mance. Specifically, by only configuring next hops                       [7] B. Fortz and M. Thorup. Internet traffic engineering by
                                                                             optimizing OSPF weights. In Proceedings of INFOCOM’2000,
for a small set of prefixes, we were able to obtain                          Tel Aviv, Israel, March 2000.
near-optimal performance for link loads of up to                         [8] B. Fortz and M. Thorup. OSPF/IS-IS weights in a changing
70%. This is obviously an important aspect for                               world. Journal on Selected Areas in Communication, February
                                                                             2002.
the practical deployment of our traffic engineering                      [9] M. R. Garey and D. S. Johnson. Computers and Intractability:
solution.                                                                    A Guide to the Theory of NP-Completeness. W. H. Freeman
   Overall, we believe that the paper provides initial                       and Company, 1979.
                                                                        [10] R. L. Graham. Bounds on multiprocessing timing anamolies.
arguments in favor of evolving the current infras-                           SIAM Journal on Applied Mathematics, 17, March 1969.
tructure to support traffic engineering, if and when                    [11] D. Katz, D. Yeung, and K. Kompella. Traffic Engineer-
needed, rather than embark on a migration to a                               ing Extensions to OSPF Version 2. INTERNET-DRAFT,
                                                                             draft-katz-yeung-ospf-traffic-09.txt, October
rather different technology. There may be justifica-                         2002. work in progress.
tions for such a migration, e.g., better support for                    [12] M. Kodialam and T.V. Lakshman. Dynamic routing of band-
policies or VPNs, but traffic engineering does not                           width guaranteed tunnels with restoration. In PROCEEDINGS
                                                                             of INFOCOM, Tel Aviv, Israel, March 2000.
appear to be one of them, and we hope that the
                                                                        [13] A. Medina, A. Lakhina, I. Matta, and J. Byers. BRITE: Boston
results of this paper can help clarify this issue.                           university representative internet topology generator. http://cs-
   There are several directions in which this work                           www.bu.edu/brite, Boston University, April 2001.
can be extended and further improved. One of                            [14] J. Moy. OSPF Version 2. Request For Comments (Standard)
                                                                             RFC 2328, Internet Engineering Task Force, April 1998.
them is in dealing with non-integer link weights as                     [15] E. Rosen, A. Viswanathan, and R. Callon. Multiprotocol label
discussed in Section IV-C. Although we have been                             switching architecture. Request For Comments (Standards
able to successfully deal with this issue by coupling                        Track) RFC 3031, Internet Engineering Task Force, January
                                                                             2001.
the computational tolerance of the optimizer with                       [16] H. Smit. IS-IS Extensions for Traffic Engineering. INTERNET-
the available precision of link weights, we are                              DRAFT, draft-ietf-isis-traffic-04.txt, Decem-
investigating more general solutions to the problem.                         ber 2002. work in progress.
                                                                        [17] A. Sridharan, R. Guérin, and C. Diot. Achieving Near-Optimal
Another direction we are currently pursuing is that                          Traffic Engineering Solutions for Current OSPF/IS-IS Net-
of making our traffic engineering solution robust to                         works. Technical report, University of Pennsylvania, May 2002.
unexpected changes in network topology, e.g., link                           Available at http://einstein.seas.upenn.edu/publications.html.
                                                                        [18] Ashwin Sridharan, J. Jetcheva, S. Bhattacharyya, C. Diot,
or router failures. This is obviously an important                           R. Guérin, and N. Taft. On the impact of traffic aggregation
aspect. One that has been considered by both traffic                         on traffic aware routing. Proceedings of the 18th International,
engineering solutions that rely on new forwarding                            COM-31(10), October 1983.
technologies such as MPLS, e.g., [12], and solutions                    [19] C. Villamizar. OSPF optimized multipath (OSPF-OMP). Avail-
                                                                             able at http://www.fictitious.org/omp, 1997. Un-
targeting current IP networks, e.g., [8].                                    published Work.
                                                                        [20] C. Villamizar.       OSPF optimized multipath OSPF-OMP.
                                                                             INTERNET-DRAFT, draft-villamizar-ospf-omp-
                        R EFERENCES                                          01.txt, February 1999. (work in progress).
[1] H. Abrahamsson, B. Ahlgren, J. Alonso, A. Andersson, and            [21] Z. Wang, Y. Wang, and L. Zhang. Internet traffic engineer-
    P. Kreuger. A multi path routing algorithm for IP networks               ing without full mesh overlaying. In Proceedings of INFO-
    based on flow optimization. In International Workshop on                 COM’2001, Anchorage, Alaska, April 2001.
    Quality of future Internet Services, Zurich, Switzerland, October   [22] E.     W.     Zegura.              GT-ITM:       Georgia     Tech
    2002.                                                                    Internetwork          Topology          Models         (software).
[2] R. K. Ahuja, T. L. Magnanti, and J. B. Orlin. Network Flows,             http://www.cc.gatech.edu/fac/ellen.zegura/gt-itm/gt-itm/tar.gz,
    Chapter 17, Section 17.2. Prentice-Hall Inc., 1990.                      Georgia Tech, 1996.
You can also read