Kuro5hin.org: technology and culture, from the trenches

Elguapo's Guide to Routing - Part 3, BGP

By el_guapo in Internet
Wed Aug 20, 2003 at 03:14:41 AM EST
Tags: Technology

Chapter 1 was an introduction. Chapter 2 was RIP. Chapter 4 will be Open Shortest Path First Protocol (OSPF). Chapter 5 will be the Interior Gateway Routing Protocol (IGRP). Chapter 6 will be the Enhanced Interior Gateway Routing Protocol (EIGRP). Chapter 7 will be the Intermediate System to Intermediate System protocol.


First, another aside: BGP is a BEAST. If I went very in depth here, this article would be huge. Thus, please don't expect to be able to configure an ISP connected full BGP internet peer after reading this. :-)

BGP - The Border Gateway Protocol

For the uninitiated, BGP is "what the internet runs". Since the internet USED to be called the ARPANET, and since ARPANET was seen as a huge asset for the DoD, BGP was built fundamentally for survivability. IE: if the Russkies nuked a major ARPANET hub (say San Francisco), they wanted BGP to quickly and efficiently route around that outage. As ARPANET exploded like wildfire after it turned into what we know as the internet today, scalability was added to BGP's fundamental design tenets. As wonky as BGP can be, it lives up to these two disparate design requirements quite well.

BGP v1 was "born" in RFC 1105 in 1989, and the current version is BGP v4, released in RFC 1654 in 1994. It is notable that the internet basically runs on a routing protocol that is pushing one decade in age. Because it is fairly old, you will see some serious similarities between BGP and that other fairly old protocol, RIP. The need for BGP came about when, in the early 1980s, ARPANET admins/designers saw that the protocol they were running, Gateway-to-Gateway Protocol, didn't scale. GGP required every gateway to know about every other gateway, and its networks (routers were called "gateways" back then). It was pretty obvious this thing would just eventually fall over if it kept growing, and they knew it would keep growing. So, the admins floated the idea of an "Autonomous System", wherein only those devices within the AS would know all of that AS's routes. Also, the admins of an AS would be free to run their network how they pleased. To keep track of AS's, they were assigned an AS number (a 16 bit integer, with 64512-65535 reserved a la 10.x.x.x in the IP world) by the same authority that handed out and tracked IP addresses. [A quick aside: I find it ironic that they chose only a 16 bit integer to track each unique AS number. When the whole reason for this new protocol was scalability, why only allow 64511 unique AS's worldwide?] Damnit, elguapo! Enough history, tell us about this BGP!

OK, first, some more concepts:

Autonomous System (AS) - this is simply all of the routers configured and run by a single entity. Take the aforementioned RIP: all of the routers that a company might configure to run RIP could be considered an AS. A company could break their network up into divisions by AS, or not. Totally determined by their local conditions.

Interior Gateway Protocol (IGP) - This is a protocol designed to run within an AS. RIP, for instance, is an IGP.

Exterior Gateway Protocol (EGP) - This is a protocol that is designed to exchange information between differing AS's. BGP is an EGP. Note: EGP's can run within the AS as well, and in fact this is what creates the two "flavors" of BGP. Internal BGP (iBGP) is when two BGP peers are within the same AS; External BGP (eBGP) is when two peers are in differing AS's.

OK, first, BGP is a Distance Vector Protocol. (See the first article if that term is foreign to you!) It is also, oddly enough, not really a routing protocol at all! You'll see why in a bit. As you'll recall from the prior articles, Distance Vector Protocols use the concept of "Hop Count" to make routing decisions (anal-retentive types will point out that BGP is actually a path-vector protocol. OK). BGP's "Hop Count" is a concept called "AS Path". AS Path is just that: "Which AS's did I have to go through to get to that network?" "Out of the box", BGP simply grabs the shortest AS path, and shoots the packet in question that way. It doesn't take a rocket scientist to see that BGP could easily bite you in the ass. If a network is one AS away over a 56kbps link, and two AS's away over 45Mbps DS3s, BGP's sending that puppy over the 56k link every time. I know, make a scrinchy face. But that's BGP. That obviously blows chunks from a logical routing decision standpoint, so BGP gives you lots and lots o' metrics and other tricks to keep that kind of stupid shit from happening. But note: you have to do it! BGP very muchly "gives you enough rope to hang yourself", as it were.

BGP operates on TCP port 179. It uses TCP so that BGP doesn't spend too many resources on communications reliability; TCP will handle acks, retransmissions, etc. Here's the rub: since BGP only exchanges AS_PATH and network information, it really doesn't have a mechanism for "finding" its neighbors, unless it has a directly connected route to each and every one of them. Having a directly connected route to each neighbor is feasible if your AS has, say, 10 or fewer BGP peers. What if you have 600? A full mesh would require 179,700 connections!! (n*(n-1)/2, FYI) Hey, elguapo, I thought you said this thing scaled? Well, it does. It is a very common practice to run an IGP (like OSPF) inside your AS for the sole purpose of having your BGP peers be able to "find" each other, thus eliminating the need for directly connected BGP peers.
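
To put numbers on the full-mesh problem, here is a quick sketch in Python (the function name is made up for illustration). Each session is shared by two routers, which is where the division by two comes from:

```python
def full_mesh_sessions(n: int) -> int:
    """Sessions needed when each of n routers peers directly with every other."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # Each router has n-1 neighbors, and each session is counted from both ends.
    return n * (n - 1) // 2


for peers in (10, 100, 600):
    print(peers, full_mesh_sessions(peers))
```

Ten peers need 45 sessions, which is manageable. Six hundred need 179,700, which is not.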

So, how's it route? A BGP route update will look something like this: 172.18.0.0/16 (325, 127, 1256) That is, the class B network 172.18.x.x is reachable via AS's 325, 127 and 1256 - in that order. The first thing BGP will do with this update is search the AS_PATH for its own AS. If its own AS is in the update, then adding this route to its routing table would cause a loop. So if it is there, it just ignores that update. At this stage, BGP sort of takes note of whether it learned the route from an iBGP peer or an eBGP peer. If it gets two of the same routes, with identical AS_PATHs, it will choose the eBGP route over the iBGP route. The next thing it does is compare that update to any other BGP routes to that network (it keeps a separate "BGP table" for all the updates). If this one is the shortest AS_PATH, it drops whatever was there first, and adds this one. At any one time, there may be countless route entries for the same route in the BGP table, but it'll only populate the "live" routing table with the best one. Maybe now you can see why I said earlier that BGP wasn't really a routing protocol: It's really just a "prefix exchanger". (to borrow a term from a buddy at my previous employer - thanks Bill!!)
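
The steps above can be sketched in a few lines of Python. This is a toy under stated assumptions, not how any real router is written: the `Speaker` and `Route` names, the local AS number 64512 and the second and third example paths are all invented for illustration, and only the checks described above are modeled (loop detection, shortest AS_PATH, eBGP over iBGP on a tie).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str       # e.g. "172.18.0.0/16"
    as_path: tuple    # AS numbers the update has passed through
    from_ebgp: bool   # True if learned from an eBGP peer, False for iBGP


class Speaker:
    def __init__(self, my_as):
        self.my_as = my_as
        self.bgp_table = {}  # prefix -> every acceptable route heard so far
        self.rib = {}        # prefix -> the single best route ("live" table)

    @staticmethod
    def preference(route):
        # Shorter AS_PATH wins; on a tie eBGP beats iBGP (False sorts first).
        return (len(route.as_path), not route.from_ebgp)

    def receive(self, route):
        """Process one update. Returns False if it was ignored as a loop."""
        if self.my_as in route.as_path:
            return False
        self.bgp_table.setdefault(route.prefix, []).append(route)
        self.rib[route.prefix] = min(self.bgp_table[route.prefix],
                                     key=self.preference)
        return True


speaker = Speaker(my_as=64512)  # hypothetical AS from the private range
speaker.receive(Route("172.18.0.0/16", (325, 127, 1256), from_ebgp=True))
speaker.receive(Route("172.18.0.0/16", (701, 1256), from_ebgp=False))
speaker.receive(Route("172.18.0.0/16", (9, 64512, 1256), from_ebgp=True))  # loop
print(speaker.rib["172.18.0.0/16"].as_path)     # (701, 1256)
print(len(speaker.bgp_table["172.18.0.0/16"]))  # 2
```

Note that the BGP table keeps both usable routes while the routing table only ever holds the winner, which is the "prefix exchanger" idea in miniature.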

This brings us to those metrics I mentioned earlier (BGP's own name for them is "path attributes"). (This is totally Cisco-BGP centric. It's what I know, sorry). There are various types of these metrics, basically resulting from BGP engineers getting continuously bit in the ass, and therefore tacking on one more metric to fix whatever was the "problem du jour". (A full blown assumption on my part there) Those types are:

Well Known Mandatory (WKM) - Well known means all BGP vendors need to support it, and Mandatory means just that, it has to be present in every update.

Well Known Discretionary (WKD) - Again, all vendors need to support it, but Discretionary means it doesn't need to be present in each and every update.

Optional Transitive (OT) - Optional means a vendor can support it, or not. If it chooses not to, then it just ignores that part of the update. Transitive means that if it does choose to ignore it, it should still leave it in the update when it passes that update on to its peers.

Optional Non-Transitive (ONT) - Again, Optional means the same, but Non-transitive means that if you choose to ignore it, you can drop that metric from the updates you forward to your peers.
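
As a rough illustration of those four categories, here is a Python sketch of what a speaker might do with an attribute it doesn't implement. The enum, the function names and the "error"/"forward"/"drop" labels are assumptions of mine; a real implementation works on flag bits in the update, not on strings.

```python
from enum import Enum


class Category(Enum):
    WKM = "well-known mandatory"
    WKD = "well-known discretionary"
    OT = "optional transitive"
    ONT = "optional non-transitive"


def handle_unrecognized(category: Category) -> str:
    """What to do with an attribute this speaker does not implement."""
    if category in (Category.WKM, Category.WKD):
        # Every vendor has to support well-known attributes, so not
        # recognizing one is a fault rather than a choice.
        return "error"
    if category is Category.OT:
        return "forward"  # ignore it locally, but leave it in the update
    return "drop"         # ignore it locally and strip it when forwarding


# The three attributes marked WKM in this article.
REQUIRED_IN_EVERY_UPDATE = {"ORIGIN", "AS_PATH", "NEXT_HOP"}


def missing_mandatory(attributes: dict) -> set:
    """Return the mandatory attribute names absent from an update."""
    return REQUIRED_IN_EVERY_UPDATE - set(attributes)


print(handle_unrecognized(Category.OT))                     # forward
print(handle_unrecognized(Category.ONT))                    # drop
print(sorted(missing_mandatory({"AS_PATH": (325, 127)})))   # ['NEXT_HOP', 'ORIGIN']
```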

Now for the metrics:

  • ORIGIN (WKM) - Just that, where the route originated
  • AS_PATH (WKM) - Just that, the list of AS's you have to go through to get to that network. If you're sending an update to an eBGP peer, you will prepend your own AS to the AS_PATH.
  • NEXT_HOP (WKM) - this is the address of the next-hop router. "Well, WTF? Isn't this always going to be the address of the router that sent the update?" Nope. If you're advertising an eBGP route to an iBGP peer, NEXT_HOP will be the eBGP peer, not the advertising router. Otherwise, yes, it's the advertising router. You can force it to be the advertising router with "next-hop-self" when configuring the router.
  • LOCAL_PREF (WKD) - This is an attribute that doesn't leave your AS, as it is just what it says. It is your AS's "Local Preference" on how to deal with that route. You could use this attribute to fix the 56k vs DS3 example I used above. Just set the LOCAL_PREF of the DS3 to 200, and the 56k to 100, and your AS will use that DS3 unless it goes away.
  • ATOMIC_AGGREGATE (WKD) - This is how a BGP speaker tells its peers that it's summarized numerous smaller routes into one big route. Why does this matter? Because when it does this, it whacks those smaller routes' AS_PATH, and substitutes its own AS.
  • AGGREGATOR (OT) - If a BGP peer does aggregate routes, this is a method for letting its peers know who did the aggregating.
  • COMMUNITY (OT) - this lets you create BGP "Communities", thereby letting you apply huge swaths of consistent BGP metrics to numerous BGP peers, without having to apply those metrics for each and every peer. Having an "iBGP" community and an "eBGP" community is a pretty obvious example.
  • MULTI_EXIT_DISC (ONT) - MED is sort of a LOCAL_PREF, but for the outside world. It's a way of telling an eBGP peer how you'd like them to send traffic into your AS.
  • ORIGINATOR_ID (ONT) - This prevents route loops when using "route reflectors". Think of a route reflector as a "route distributor". It kind of hands out routes to its clients, thereby allowing net admins to minimize the number of full BGP peers. If I get a route with the ORIGINATOR_ID set to my router ID, we had a loop.
  • CLUSTER_LIST (ONT) - This is a list of RR ID's through which that route has passed. Kind of like an AS_PATH of RR clusters. If a RR sees its cluster ID in a received route, it knows there's been a loop.
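
To make the LOCAL_PREF fix for the 56k vs DS3 example concrete, here is a small Python sketch. The `Path` class and the AS numbers are invented for illustration, and only two of the comparison steps are modeled: LOCAL_PREF first, AS_PATH length second.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    via: str
    as_path: tuple
    local_pref: int = 100  # 100 is the usual Cisco default


def best(paths):
    # Higher LOCAL_PREF wins outright; AS_PATH length only breaks ties.
    return min(paths, key=lambda p: (-p.local_pref, len(p.as_path)))


slow = Path("56k link", as_path=(65001,))           # one AS away
fast = Path("DS3 links", as_path=(65002, 65003))    # two AS's away

# Out of the box: the shorter AS_PATH wins, so traffic takes the 56k.
print(best([slow, fast]).via)             # 56k link

# With LOCAL_PREF 200 on the DS3 routes, the DS3 wins despite the longer path.
fast_preferred = Path("DS3 links", as_path=(65002, 65003), local_pref=200)
print(best([slow, fast_preferred]).via)   # DS3 links

# If the DS3 route goes away, the 56k is still there as a fallback.
print(best([slow]).via)                   # 56k link
```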

Well, that's it. Using and manipulating those metrics is how a net admin can "steer" traffic within his network, and how he can try and steer traffic from the outside world into his AS. Why "try"? Well, MED, for instance, is optional. Your eBGP peers may just ignore the damn thing. If you're multihomed to numerous eBGP peers, and your managers throw out the term "load balancing", run for cover. In my experience, load balancing via BGP blows chunks. Almost always, one of your eBGP ISP peers is going to be "better connected", and thus that link is going to get a majority of your traffic. Note in my design goals above, scalability and rerouting around major outages were why BGP was made in the first place. "Load balancing" was never in the mix, so I can't fault BGP for doing what it was designed to do, I guess.



Elguapo's Guide to Routing - Part 3, BGP | 35 comments (28 topical, 7 editorial, 0 hidden)
Well, there it be (4.66 / 3) (#1)
by el_guapo on Tue Aug 19, 2003 at 06:40:04 PM EST

Part 3. Sorry for the long delay/hiatus between parts 2 and 3 for those who like them. my previous employer basically threatened to fire me if i finished the series, and then 2 months later turned around and fired me anyways. (bitter? no - **i** would have fired me. seriously) :-/ lack of a paycheck meant lack of an internet connection for the last 7 months...
mas cerveza, por favor mirrors, manifestos, etc.
I have to ask (5.00 / 1) (#3)
by ZorbaTHut on Tue Aug 19, 2003 at 07:42:23 PM EST

why would he fire you for finishing the series? Unless you were doing it on company time, of course, but then wouldn't he fire you for not getting your work done?

[ Parent ]
(shrugs) (5.00 / 2) (#5)
by el_guapo on Tue Aug 19, 2003 at 07:47:51 PM EST

i think that since the article was posted during working hours, he thought i wrote the whole thing during working hours. or something. i didn't ask. wanting to keep my job and all. i admittedly DID write portions of it during working hours, but i didn't exactly have a lot to do. my new manager was obviously shifting work away from me to minimize the impact of my up and coming layoff. when your boss sends out an email begging for people with "Extra cycles", and you reply to said email saying "yes! yes! send me stuff", and then he doesn't send you stuff, one should assume SOMETHING was up :-/
mas cerveza, por favor mirrors, manifestos, etc.
[ Parent ]
Sorry (none / 0) (#28)
by The Amazing Idiot on Wed Aug 20, 2003 at 03:22:13 PM EST

Sorry about your job loss. All the best with you for finding another job ;-)

Still, I love these kinds of articles. This is the kind of stuff that slashdot should have been (not said sarcastically). Instead they turned out to be a sort of "Your Rights" tripefest.

Also, something you may want to look into: http://www.phenoelit.de/irpas/

Quite a powerful suite of routing tools. It may be possible to explain what these tools' purposes are. VIPPR is also a nasty tool to virtualize computers on your own linux machine. You can create a virtual routing nightmare that most admins wouldn't begin to understand.

[ Parent ]

cool, thanks :) (5.00 / 1) (#29)
by el_guapo on Wed Aug 20, 2003 at 03:37:42 PM EST

i have a new job now. thus access to the internet. but thanks. it was a ROUGH 7 months. lost the house, yada yada. but we're back on our feet now and ready to rejoin the middle class, thank god :P
mas cerveza, por favor mirrors, manifestos, etc.
[ Parent ]
Routing :::peniz:::Q (2.60 / 10) (#2)
by peniz Q on Tue Aug 19, 2003 at 06:40:15 PM EST

Routing is like "war driving". It's wrong, plain wrong. But that doesn't stop me from reading about war driving, and it didn't stop me from reading this article and liking it. +1 FP

Dangerous content.

trivia tidbit (none / 0) (#11)
by el_guapo on Tue Aug 19, 2003 at 08:50:46 PM EST

"is there an AS 1"? Yup, it was owned by BBNPlanet, now Genuity...
mas cerveza, por favor mirrors, manifestos, etc.
Level3 (none / 0) (#21)
by mct on Wed Aug 20, 2003 at 12:30:26 PM EST

...and now Level3:

http://www.genuityestate.com/pr11-27-02.html
http://www.genuityestate.com/pr02-04-03.html
http://www.level3.com/genuity/

-mct

[ Parent ]
Question/clarification (none / 0) (#13)
by Ta bu shi da yu on Wed Aug 20, 2003 at 04:36:24 AM EST

I was studying this for my CCNP so I suppose I should check this out myself, but what do you mean by: "Because when it does this, it whacks those smaller routes AS_PATH, and substitutes it's own AS."?

A mite confused here.

Yours humbly,
Ta bù shì dà yú

---
AdTI - "the think tank that didn't".

BGP route aggregation (5.00 / 2) (#14)
by Will242 on Wed Aug 20, 2003 at 05:16:15 AM EST

When a BGP router aggregates a set of routes, it is reducing the amount of routing information. Typically, it will lop off the AS-path of the smaller routes it's aggregating and advertise the summary route under its own AS number.

To indicate to others who might receive this aggregate announcement that the data has been munged in this way, the aggregator sets the ATOMIC_AGGREGATE attribute as a flag. It also has the option of keeping the "hidden" ASes from which the summary is built in the AS-path, which it can do with an AS-SET (the aggregated ASes contributing routes to the aggregate space become an unordered set in the path).

So yeah, route aggregation in BGP gets messy; given IP's longest-match routing paradigm, routes need to be aggregated consistently by all who might see them. Otherwise, you could end up with say, a bunch of /24s and an aggregate /20 out in the global table, and nobody will ever route on the /20 because the /24s are more specific. For this reason, most aggregation is done internally to an AS, or in specific cases on a customer-by-customer basis.
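
A small Python sketch of longest-match, using the standard `ipaddress` module, shows why the /24s shadow the /20. The prefixes and the `longest_match` helper are made up for illustration:

```python
import ipaddress


def longest_match(destination, table):
    """Return the most specific prefix in table that contains destination."""
    addr = ipaddress.ip_address(destination)
    networks = [ipaddress.ip_network(prefix) for prefix in table]
    candidates = [net for net in networks if addr in net]
    if not candidates:
        return None
    # The longest prefix length is the most specific match.
    return str(max(candidates, key=lambda net: net.prefixlen))


# Hypothetical table: an aggregate /20 plus two leaked /24s inside it.
table = ["10.1.0.0/20", "10.1.0.0/24", "10.1.1.0/24"]

print(longest_match("10.1.1.7", table))    # 10.1.1.0/24
print(longest_match("10.1.9.7", table))    # 10.1.0.0/20
print(longest_match("192.0.2.1", table))   # None
```

The /20 only gets used for addresses that no /24 covers. If every /24 inside it is also announced, the aggregate never carries any traffic.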

[ Parent ]

Routing Protocol Security (4.00 / 1) (#15)
by jsonic on Wed Aug 20, 2003 at 10:28:38 AM EST

Thanks for the series, el_guapo.  Can anyone give a general explanation of how routers authenticate the information they receive from others?  For instance, with simple switches, mischievous users can redirect traffic by poisoning arp tables with false information.  This happens because the switches simply accept any arp table updates sent to them.

I assume/hope that routers are more picky about the information they accept, but hoping someone could provide more info.

depends on the protocol (5.00 / 1) (#16)
by el_guapo on Wed Aug 20, 2003 at 11:32:23 AM EST

Generally, they support MD5 password authentication between peers. RIP v1 does not, so any bozo running routed can totally fuck up your network if you're running RIP v1....
mas cerveza, por favor mirrors, manifestos, etc.
[ Parent ]
I understand the question but..... (none / 0) (#32)
by Yaroslav The Wise on Thu Aug 21, 2003 at 04:31:58 PM EST

...I just wanted to clarify that you meant to use "router" in place of "switch". You could not poison the arp table of a switch as it really does not have one (ignoring any mgmt ports). You could flood the CAM table of a switch (that is where MAC addresses are stored) and essentially turn a switch into a hub (makes sniffing easier for the hackers). As for protecting routers, you could use MD5 passwords, as el guapo stated in his reply, or you could use other tools like access-lists or prefix-lists, or even regular expressions using AS numbers that specify what and from whom you are willing to accept an update. On a global scale, you really rely on the ISPs to be careful, efficient and smart about routing updates. There is a large concern among many about the ease with which one could inject false routing info into the BGP routing tables. If you stick with the larger players, they tend to have old timers who know their stuff.

[ Parent ]
eek! (5.00 / 2) (#17)
by el_guapo on Wed Aug 20, 2003 at 11:38:09 AM EST

Slight error, my "BGP connections" formula was wrong. it's SUPPOSED to be: n*(n-1)/2. Still grows exponentially. so 600 router full-mesh would be 179,700 connections. sorry :-/
mas cerveza, por favor mirrors, manifestos, etc.
Quadratic growth, not exponential (n/t) (none / 0) (#22)
by niom on Wed Aug 20, 2003 at 01:27:20 PM EST



[ Parent ]
ya, my bad n/t (5.00 / 1) (#23)
by el_guapo on Wed Aug 20, 2003 at 01:44:08 PM EST


mas cerveza, por favor mirrors, manifestos, etc.
[ Parent ]
Load Balancing with BGP (4.00 / 1) (#18)
by encore on Wed Aug 20, 2003 at 11:38:58 AM EST

In your article you state that "load balancing via BGP blows chunks," but how do you do it?

I have two Ts for redundancy that both have the same weighting. One has around 75% utilization, and the other 25%. The vast majority of traffic is http being served to the public.

Is there any way of load balancing short of taking specific AS numbers and weighting them differently? Example take AS numbers for AOL or MSN and weight them so their traffic uses the under-utilized connection.

You said it (none / 0) (#19)
by jungleboogie on Wed Aug 20, 2003 at 12:14:07 PM EST

Sounds like you are doing it. Now you see why it blows chunks? Your 75% utilization link has better routes to the parts of the internet that you communicate with than your 25% utilization link. You just described how to assist it, weight various ASes differently. You could even try to weight everything coming from your 75% link down lower by 1 or 2 count and see if that helps to balance things out more.

[ Parent ]
to get 50/50 load balancing (none / 0) (#20)
by el_guapo on Wed Aug 20, 2003 at 12:29:17 PM EST

decouple bgp from your network. basically, have 2 "layers". the "top" layer connects to your ISPs, runs eBGP to the ISPs and iBGP between the 2 top layer boxes. the "bottom" layer is where all of your infrastructure connects. (total of 4 routers). THIS layer runs OSPF between all 4 boxes (the 2 bottom, and the 2 top), with OSPF equal-cost multipath doing "round-robin" between the 2 top level boxes. when one of those boxes gets that packet, it will have 1)an iBGP route to one ISP, and 2)an eBGP route to the other. it will choose the eBGP route every time, and you'll get damn near 50/50 load balancing. the delta will be the rare case where only 1 of your ISPs has a route to a destination. got it? it works quite well....
mas cerveza, por favor mirrors, manifestos, etc.
[ Parent ]
AS path length is evaluated before IBGP/EBGP (none / 0) (#30)
by troydavis on Thu Aug 21, 2003 at 02:32:43 AM EST

> it will have 1)an iBGP route to one ISP, and 2)an
> eBGP route to the other. it will choose the eBGP
> route every time

AS path length is used earlier in BGP's best-route decision algorithm than is IBGP vs. EBGP, so that statement is a bit off.

For a multihomed transit customer with the 4-router setup you describe, any routes with different AS path lengths will be evaluated on path length without considering whether the route came from an IBGP or EBGP peer.  Exterior vs. interior source will be evaluated when all other preceding decision factors -- including AS path length -- are equal.

In that scenario, the percentage of routes that have equal AS path lengths depends totally on how different your transit providers are.  When one is the largest provider and one is a small local provider, almost every route will have different path lengths.  When both are transit-free and have global presences, the path lengths will be similar a lot of the time.

Someone who is already seeing 75/25 balance probably falls more into the former category.

As El Guapo says, AS path length is not indicative of real world performance in terms of latency or packet loss.  In the case above, the local provider could easily have longer AS paths across the board yet provide faster, more reliable connectivity.  But I digress..

If you want both border routers to always blindly prefer its own EBGP-learned routes ahead of IBGP-learned routes regardless of path length, manually set the route weight (which is evaluated earlier than AS path length).  The other posts have more elegant ways of solving the same problem, though. =)
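
A sketch of that ordering in Python, covering only the four factors mentioned in this thread (weight, local preference, AS path length, then EBGP over IBGP). The real decision process has more steps than this, and the `Candidate` class, names and path lengths are invented for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    as_path_len: int
    from_ebgp: bool
    weight: int = 0        # Cisco-specific and local to one router
    local_pref: int = 100


def rank(c):
    # Earlier tuple fields are evaluated first. Higher weight and higher
    # local preference win, then shorter AS path, then EBGP over IBGP.
    return (-c.weight, -c.local_pref, c.as_path_len, not c.from_ebgp)


def best(candidates):
    return min(candidates, key=rank)


own_ebgp = Candidate("own ISP (EBGP)", as_path_len=4, from_ebgp=True)
via_ibgp = Candidate("other ISP (IBGP)", as_path_len=2, from_ebgp=False)

# Path lengths differ, so EBGP vs. IBGP is never consulted.
print(best([own_ebgp, via_ibgp]).name)   # other ISP (IBGP)

# Raising the weight on EBGP-learned routes overrides path length.
weighted = Candidate("own ISP (EBGP)", as_path_len=4, from_ebgp=True, weight=200)
print(best([weighted, via_ibgp]).name)   # own ISP (EBGP)
```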

[ Parent ]

BGP peers in two different ASs (none / 0) (#31)
by m a r c on Thu Aug 21, 2003 at 02:59:44 AM EST

That will only work if you are connected with two links to a single AS. If you are multihomed (i.e. connected to two diff AS's via different service providers), then you can't use a protocol like OSPF to load balance because it won't be being run by yourself and both providers. Basically BGP is a reachability protocol and as such was not designed for load balancing. Due to BGP's composite metric you do have some control about which path is preferred for outgoing traffic (Local pref, weight - on cisco). You also have some control over inbound traffic with the MED option, but this does not have to be adhered to by the neighbour AS and its influence is only to the direct neighbour.
I got a dog and named him "Stay". Now, I go "Come here, Stay!". After a while, the dog went insane and wouldn't move at all.
[ Parent ]
AS Prepending (none / 0) (#24)
by Kebinu on Wed Aug 20, 2003 at 02:06:19 PM EST

Other options include AS path prepending; here is a document from Cisco's website on Load Sharing with BGP.

[ Parent ]
re: prepending (5.00 / 1) (#26)
by el_guapo on Wed Aug 20, 2003 at 02:13:57 PM EST

all i could ever accomplish with as prepending was to make one route "turn completely off", ie: completely go away. it was a meat cleaver approach to traffic shifting, for sure. (IE: all or nothing) the best way is just to add a layer between your internal net and your isp connected boxes, and run OSPF equal-cost multi-path between the 2.
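
A toy Python model of why prepending tends to be all or nothing. It assumes a remote AS that picks its way in purely on AS_PATH length and sees both upstreams at equal distance; the AS numbers and function names are hypothetical.

```python
def advertised_path(my_as, prepend=0):
    """AS_PATH handed to an eBGP peer: our AS once, plus any extra copies."""
    return (my_as,) * (1 + prepend)


def inbound_choice(path_via_a, path_via_b):
    """Which upstream a remote AS uses to reach us, on AS_PATH length alone."""
    if len(path_via_a) == len(path_via_b):
        return "tie"
    return "A" if len(path_via_a) < len(path_via_b) else "B"


MY_AS = 64512                   # hypothetical private AS number
ISP_A, ISP_B = 65001, 65002     # hypothetical upstreams

for prepend in (0, 1, 2):
    # Each ISP puts its own AS in front of whatever we advertised to it.
    via_a = (ISP_A,) + advertised_path(MY_AS, prepend)
    via_b = (ISP_B,) + advertised_path(MY_AS)
    print(prepend, inbound_choice(via_a, via_b))
```

With no prepending the two paths tie. A single prepend toward ISP A flips this remote AS entirely to ISP B, and more prepends change nothing further. There is no setting in between.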
mas cerveza, por favor mirrors, manifestos, etc.
[ Parent ]
Load Balancing with BGP (5.00 / 2) (#27)
by Will242 on Wed Aug 20, 2003 at 02:55:15 PM EST

Other than the big stick of AS-path prepending, your provider may have some finer-grained controls. Check out Sprint's setup for example -- down at the bottom of the page there's a 'what you can control' section. They let you set communities (basically numeric tags in your BGP routes that you send them) that tell Sprint's routers how to distribute your routes to others.

In a *huge* ISP like Sprint all under one AS (1239), this kind of control can be very important, because even prepending your AS once in your outbound announcements can have a huge effect on traffic patterns if that prepending gets propagated to *everyone*. By setting the communities they outline, you can control your announcements at the next AS up, telling Sprint to prepend your routes once to AS X, Y, Z, and 0 times to A,B,C. It can be time consuming to set up, but is fun to play with (if you're into this sort of thing).

If your upstreams don't support stuff like this, bug them! This is the kind of neat feature you should get when you buy high-quality IP transit.

Of course, the balancing act gets easier as your network grows. The more prefixes you announce, the more you can spread your balancing tweaks among them to get the traffic patterns you want.

[ Parent ]

Further reading (5.00 / 1) (#25)
by Kebinu on Wed Aug 20, 2003 at 02:11:05 PM EST

I just thought I would chime in and recommend a nice source for further reading on BGP. Internet Routing Architectures by Sam Halabi [Bookpool.com] is sometimes referred to as the BGP Bible for its comprehensive coverage of the protocol and its workings, and it is an all-around good read and reference book. Routing TCP/IP Volumes I & II are also very good books for anyone interested in more in-depth reading on routing protocols... These books have a slight Cisco slant to them, but are general enough to be useful in other environments as well.

fun article, but what about ARP? (none / 0) (#33)
by mveloso on Wed Aug 27, 2003 at 12:27:09 AM EST

You forgot the blackest of the protocols, ARP. Does anyone actually know how it works anymore?

quite easy (none / 0) (#34)
by alejos1 on Fri Aug 29, 2003 at 10:58:32 PM EST

ARP is a simple protocol for finding the MAC address that goes with an IP address on a segment of a local network. You didn't even google it. RTFM!

[ Parent ]
How ARP works (none / 1) (#35)
by jtk on Sun Dec 28, 2003 at 01:06:11 AM EST

Yes, I know how it works and so do a lot of people. The first place to look is at IETF RFC 826. ARP has a few different modes of operation in addition to what is usually referred to as just "ARP". There is also Proxy ARP, Reverse ARP, DHCP ARP and others. ARP has often been called a "hack". In some other networking protocols (e.g. IPX) something akin to ARP is not required, and IPv6 replaces it with Neighbor Discovery carried over ICMPv6. The reason it was required in IPv4 was to be able to associate the smaller IPv4 address (32 bits) with the larger, commonly used Ethernet MAC address (48 bits).

It is often referred to as a layer 2 1/2 protocol, because it does not use a network layer packet format such as that provided by IPv4 underneath.

In its simplest form of use, a host wishing to discover the destination MAC address of a station on the local IP subnet, sends an ARP request. The destination MAC address in this discovery phase is the broadcast (all 1's) address. Inside the message is the IP address that the sender is looking for. All MAC stations on the local segment will receive this ARP message and inspect it. If a host finds that the destination IP address inside the ARP message is one it is responsible for, it sends an ARP reply containing the needed MAC address to the requestor.

Stations typically build up an "ARP cache" so future transmissions do not have to go through the ARP discovery process again. That way stations can send layer 2 frames as unicast messages rather than having to always broadcast them. Routers typically timeout ARP entries in the cache after a few hours. End stations typically keep cache entries for only a few minutes.
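
The cache-then-broadcast behavior can be sketched like this in Python. Everything here is illustrative: the `ArpCache` class, the `resolve` helper and the `broadcast_request` callback (a stand-in for actually putting an ARP request on the wire) are assumptions, not any real stack's API.

```python
import time


class ArpCache:
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock     # injectable so the timeout can be tested
        self.entries = {}      # ip -> (mac, time the entry was learned)

    def learn(self, ip, mac):
        self.entries[ip] = (mac, self.clock())

    def lookup(self, ip):
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, learned_at = entry
        if self.clock() - learned_at > self.ttl:
            del self.entries[ip]   # entry timed out; force a fresh request
            return None
        return mac


def resolve(cache, ip, broadcast_request):
    """Return the MAC for ip, asking the segment only on a cache miss."""
    mac = cache.lookup(ip)
    if mac is None:
        mac = broadcast_request(ip)   # would go to the all-1's MAC address
        if mac is not None:
            cache.learn(ip, mac)
    return mac
```

A short TTL of a few minutes would model an end station, and a TTL of hours a router.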

If you monitor network traffic from your local host, you will likely see most ARP traffic from your host to your local default gateway (also known as your local router). Although if you are on a large subnet and talk to many hosts on the same subnet you will certainly see plenty of ARPs between your host and those local subnet hosts as well.

Also note, ARP is inherently insecure. There is no authentication mechanism and it is easy for a rogue station to send bogus ARP information to a host.

I'm leaving out a lot, but hopefully you get the idea. Perhaps I should do a more complete write-up in a story of my own?

John

[ Parent ]
All trademarks and copyrights on this page are owned by their respective companies. The Rest 2000 - Present Kuro5hin.org Inc.
See our legalese page for copyright policies. Please also read our Privacy Policy.