Revised Default Values for the BGP 'Minimum Route Advertisement Interval'
Sun Microsystems
Springfield
Linlithgow
West Lothian
EH49 7LR
Scotland
+44 1506 673150
paul.jakma@sun.com
Routing
Inter-Domain Routing
BGP
IDR
MRAI
Convergence
This document briefly examines what is known about the effects of the
BGP MRAI timer, particularly on convergence. It highlights published
work which suggests the MRAI interval as deployed has an adverse
effect on the convergence time of BGP.
It then recommends revised, lower default values for the MRAI timer,
thought to be more suited to today's Internet environment.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC2119.
The proper functioning of the BGP routing protocol is
of great importance to the Internet. Issues regarding its stability
and convergence have been widely documented.
One such issue is the effect of the 'Minimum Route Advertisement
Interval' (MRAI).
The Minimum Route Advertisement Interval (MRAI) timer is specified in
RFC4271. This timer acts to rate-limit
updates on a per-destination basis. RFC4271 suggests
values of 30s and 5s for this interval for eBGP and iBGP respectively.
The MRAI must also be applied to withdrawals according to RFC4271, a change from the earlier RFC1771.
Some implementations apply this rate-limiting on a per-peer basis,
presumably as an adequate approximation. Some implementations apply it
to withdrawals (often called "WRATE" in the literature). Some
implementations do not apply MRAI at all.
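
The per-destination behaviour described above can be illustrated with a
minimal sketch. It is not taken from RFC4271 or from any implementation;
the class name MraiLimiter, its fields, and the choice to let an exempted
withdrawal bypass the timer without resetting it are all assumptions made
for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MraiLimiter:
    """Hypothetical per-destination MRAI limiter for a single peer."""
    interval: float                    # MRAI in seconds, e.g. 30.0 for eBGP
    apply_to_withdrawals: bool = True  # True models the "WRATE" behaviour
    last_sent: dict = field(default_factory=dict)  # prefix -> time last sent
    pending: dict = field(default_factory=dict)    # prefix -> newest held update

    def submit(self, prefix, update, now, is_withdrawal=False):
        """Return the update if it may be sent now, else hold it and return None."""
        if is_withdrawal and not self.apply_to_withdrawals:
            # Assumption: an exempted withdrawal is sent at once, replaces any
            # held update, and leaves the advertisement timer untouched.
            self.pending.pop(prefix, None)
            return update
        last = self.last_sent.get(prefix)
        if last is None or now - last >= self.interval:
            self.pending.pop(prefix, None)
            self.last_sent[prefix] = now
            return update
        # Still inside the interval: keep only the newest state, so that
        # transitory states are never sent.
        self.pending[prefix] = update
        return None

    def flush(self, now):
        """Send every held update whose interval has expired."""
        due = [p for p in self.pending
               if now - self.last_sent[p] >= self.interval]
        sent = []
        for prefix in due:
            sent.append((prefix, self.pending.pop(prefix)))
            self.last_sent[prefix] = now
        return sent
```

In this sketch, two updates held for the same prefix within one interval
collapse into a single message when flush() is next called, which is the
message-suppressing effect of the timer. A per-peer approximation would
keep one timestamp for the peer instead of one per prefix.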
The MRAI timer serves to suppress messages which BGP would otherwise
send out to describe transitory states, and so allow BGP to converge
with significantly fewer messages sent. This beneficial effect of the
MRAI timer, in terms of the number of messages, increases as the timer is
increased until an optimum value is reached, after which the
beneficial effect stabilises.
In terms of convergence time, a similar beneficial effect is seen as
the MRAI increases to near the same optimum value. However, as the
timer value is increased past this point, the convergence time
increases again linearly. The scale of this increase is significantly
worse with WRATE, i.e. applying the MRAI to withdrawals has an adverse
effect on BGP convergence time.
The optimum MRAI timer value is dependent on several factors, most
particularly the topology in its layout and propagation times. The
optimum value will differ between different subsets of the Internet.
It is believed to be infeasible to calculate this value directly.
However, a useful approximation can be made from the diameter of the
topology if it is known, along with some assumptions about the
topology, such as the latency between nodes.
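
As a rough illustration of the inputs to such an approximation, the
sketch below computes the diameter of a small, connected, undirected
topology by breadth-first search and scales it by an assumed uniform
per-hop delay. The function names and the simple product of diameter and
delay are assumptions made for illustration only; the text above does not
give a formula, and this should not be read as the method of the cited
work.

```python
from collections import deque


def eccentricity(graph, source):
    """Longest shortest-path distance, in hops, from source (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return max(dist.values())


def diameter(graph):
    """Diameter in hops of a connected, undirected adjacency-list graph."""
    return max(eccentricity(graph, node) for node in graph)


def rough_mrai_estimate(graph, per_hop_delay):
    """Illustrative assumption: diameter times a uniform per-hop delay (s)."""
    return diameter(graph) * per_hop_delay


# Hypothetical four-node chain of ASes: A - B - C - D.
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
estimate = rough_mrai_estimate(chain, per_hop_delay=0.5)
```

The per-hop delay here stands in for the combined propagation and
processing time between neighbouring speakers, which in practice varies
from link to link.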
The interaction between extensions to BGP designed to improve
convergence, such as those that allow propagation of additional and/or
backup paths, and the MRAI timer is as yet unknown. However, it seems
reasonable to speculate these extensions might have the effect of
leading to a lower optimum MRAI than would be indicated by an
approximation based on the diameter of the BGP topology. Further work
on these questions would be useful.
As the MRAI helps eliminate some updates, it interacts with flap-damping. The lower the MRAI timer, the
greater the risk of crossing below the threshold of the optimum value.
If that threshold is crossed, there will be an increased number of
updates somewhere within the BGP system, and hence an increased risk of
paths being dampened which otherwise would not.
So, in the presence of significant flap-damping deployment and given
the uncertainty of what the optimum is, it is reasonable to err
towards selecting a value of the MRAI timer significantly higher
than the optimum.
However, given that flap-damping increasingly is discouraged in Internet routing, this particular need to be
conservative in the choice of MRAI timer value may be less important.
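
The interaction with flap-damping can be sketched with a simple penalty
model: each update adds a fixed penalty, which then decays exponentially
with a configured half-life, and a path is dampened once the penalty
exceeds a suppress threshold. The parameter values below (1000 per
update, a 900s half-life, a suppress threshold of 2000) are illustrative
assumptions and are not taken from this document.

```python
import math


def peak_penalty(update_times, penalty_per_update=1000.0, half_life=900.0):
    """Peak damping penalty reached for updates at the given times (seconds)."""
    decay = math.log(2) / half_life
    penalty, last, peak = 0.0, None, 0.0
    for t in sorted(update_times):
        if last is not None:
            # Exponential decay since the previous update.
            penalty *= math.exp(-decay * (t - last))
        penalty += penalty_per_update
        last = t
        peak = max(peak, penalty)
    return peak


SUPPRESS_THRESHOLD = 2000.0  # assumed value, for illustration only

# One update stays well below the threshold ...
single = peak_penalty([0.0])
# ... while three updates ten seconds apart, such as transitory states
# that a larger MRAI might have suppressed, cross it.
burst = peak_penalty([0.0, 10.0, 20.0])
would_be_dampened = burst > SUPPRESS_THRESHOLD
```

Under these assumed parameters, the extra updates released by an MRAI set
below the optimum are enough to dampen a path that a single update would
not.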
The current recommended value of 30s may be far higher than is optimal
for the Internet, based on observations of certain parameters related to
its topology. In it is suggested that the
optimal value may be between 5s ('semi-safe') and 15s ('safe'). The
estimation of the 'safe' value here is of no relevance if WRATE is
universally deployed, as in such a case the 'semi-safe' value will be
the same as the 'safe' value. Further empirical work by the same authors
suggests that the optimal MRAI for the Internet
may be below 5s.
Further, and
argue that operational conditions (e.g. different routers using
different MRAI values) mean the MRAI is having an adverse effect even on
the number of messages sent, and so further exacerbating convergence
problems in the global BGP system, such as path hunting. The document goes further still and argues that MRAI be
deprecated in favour of some better way of damping BGP UPDATEs; however,
there are no clear proposals before the IDR as of this writing for such
changes to BGP.
Though there is an optimum value for the MRAI, it is unlikely that it can
be determined empirically or otherwise for the general Internet. It may
not even be possible, as the optimum MRAI will differ for different
subsets of the Internet. Some degree of estimation of a reasonable
value for the MRAI is required, which is an exercise in risk: whether to
err towards fast convergence at the risk of a disproportionate increase
in BGP messaging, or to err towards an optimal number of messages
at the expense of convergence.
Arguably, compared to times past, economising on bandwidth and
control-plane processing power is now of less importance than the
convergence time of BGP. Presuming this, any new recommendations for the
MRAI should seek to err slightly towards convergence, rather than
towards minimising BGP traffic.
Further, if we assume most implementations apply the MRAI to
withdrawals, then the Internet BGP topology effectively is
WRATE-enabled, and suggests there is even
less benefit to erring toward a higher MRAI.
There may be risks in mismatched MRAI values between speakers in an AS
as revised MRAI values are deployed. The MRAI values in RFC4271 were deliberately specified to be lower for
iBGP than for eBGP, so as to allow interior routing to converge while
minimising the effect on eBGP state. So with mismatched values there is
an increased risk that the external stability of an AS's routes would
be affected by transient, internal states.
This last risk suggests that the existing iBGP/VPN values should be the
lower-bound for any conservative revision of the eBGP MRAI value.
The most definite risk of lowering the MRAI is the increased risk of
flap-damping, if the value is set too far below the optimum. Estimates
of that optimum must therefore be taken into account. That said,
at least one BGP implementation by default does not apply any MRAI at
all.
The suggested default values for the MinRouteAdvertisementIntervalTimer
given in RFC4271 are revised to be 5s or less
for eBGP connections, and 1s or less for iBGP connections, for use on
Internet topologies.
These values may not be suitable for topologies which differ from the
Internet, be that in scale, arrangement or otherwise. Such non-Internet
BGP topologies would likely have lower optimum values, assuming they are
always significantly smaller in scale than the Internet BGP topology.
Hence, implementations SHOULD allow the MRAI value to be configured
administratively on a per-AFI/SAFI basis, as well as a per-peer basis.
Given the beneficial effects on convergence time, implementations MAY
exempt withdrawals from the MRAI timer.
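
One way an implementation might expose these recommendations is sketched
below. The class MraiConfig, its field names, and the precedence order
(a per-peer override wins over a per-AFI/SAFI override, which wins over
the session-type default) are assumptions made for illustration; this
document does not specify them. Only the 5s and 1s defaults and the
option to exempt withdrawals come from the text above.

```python
from dataclasses import dataclass, field


@dataclass
class MraiConfig:
    """Hypothetical MRAI configuration with the revised defaults."""
    ebgp_default: float = 5.0   # seconds, upper end of "5s or less"
    ibgp_default: float = 1.0   # seconds, upper end of "1s or less"
    exempt_withdrawals: bool = False
    per_afi_safi: dict = field(default_factory=dict)  # (afi, safi) -> seconds
    per_peer: dict = field(default_factory=dict)      # peer address -> seconds

    def interval(self, peer, afi, safi, is_ebgp):
        """Resolve the MRAI for one peer and address family."""
        # Assumed precedence: per-peer, then per-AFI/SAFI, then default.
        if peer in self.per_peer:
            return self.per_peer[peer]
        if (afi, safi) in self.per_afi_safi:
            return self.per_afi_safi[(afi, safi)]
        return self.ebgp_default if is_ebgp else self.ibgp_default


config = MraiConfig(
    exempt_withdrawals=True,
    per_afi_safi={(2, 1): 3.0},       # AFI 2 / SAFI 1: IPv6 unicast
    per_peer={"192.0.2.1": 0.0},      # documentation address, MRAI disabled
)
```

With this configuration, an eBGP peer with no overrides resolves to 5s
for IPv4 unicast and 3s for IPv6 unicast, while the peer 192.0.2.1
resolves to 0s for every address family.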
There are no requests made to IANA in this document.
This document raises no new security considerations.
The author would like to thank Manav Bhatia for his helpful review and
comments, as well as Robert Raszuk, Samita Chakrabarti, Danny McPherson
and Jeffrey Haas for their useful comments, dissenting or otherwise.
The authors of the cited documents are thanked for their contributions
to the understanding of BGP, of which this document is a simple summary.
A Border Gateway Protocol 4 (BGP-4)
Unknown
Unknown
Unknown
Key words for use in RFCs to Indicate Requirement Levels
Harvard University
BGP Stability Improvements
Cisco Systems, Inc.
APNIC
BGP Route Flap Damping
Cisco Systems, Inc.
Cisco Systems, Inc.
Cisco Systems, Inc.
Damping BGP
APNIC
RIPE RRG: Recommendations on Route-flap Damping
An Experimental Analysis of BGP Convergence Time
AT&T Research
Dartmouth College
An Experimental Study of the BGP Rate-limiting Timer
Tsinghua University
Bell Labs Research China
Tsinghua University
The Optimal Rate-Limiting Timer of BGP for Routing Convergence
University of Massachusetts
Bell Labs Research China
Tsinghua University