Kuro5hin.org: technology and culture, from the trenches

Let Bogons Be Bogons: A Nightmare from ISP Hell

By lamppter in Meta
Tue Jul 04, 2006 at 02:05:07 PM EST
Tags: TEH WIERDNESS, TEH INTARWEB, TEH STUFF, TEH NOC, TEH K5 ADMIN (all tags)

I work for a tier 2 ISP as a WAN and Systems Administrator. We have approximately 160,000 users and provide Internet Services to 13 large organizations.

A couple of years ago our backbone was on a tier 2 provider with its hub located in Austin, Texas. We decided to provide our customers with greater bandwidth, lower cost and better service by changing to a major tier 1 backbone and getting two 45 Mbps (DS3) pipes to the Internet. We also had the ability to add more bandwidth later with a newly purchased second Cisco 7513 router.

We carefully planned the move for 6 months and brought our customers into the planning process. We sent out an RFP, awarded the bid and began the process. What followed was to become a disaster. It was the "perfect storm" for an ISP failure.

"What's the problem? Why can't I go to my favorite website?" my Pointy-Headed Boss asked.

"So, what's your favorite website?" I asked her. I knew how I was going to answer.

"AOL.COM," she told me, quite disturbed and very angry.

"We're on the bogons IP list," I told her. She was furious and didn't have the foggiest idea what I was talking about. To top it off, the Executive Director was also furious; it's his favorite website too.

The glitch was this. The problem was caused by a multi-million dollar tier one Internet Service Provider and ended up costing us a great deal of money in the short run.

Here's the story of what happened.


The Preparation
The reason it is such a concern to change ISPs is that your customers have to change their IP address space. For our largest customers that is a big issue. They all have to look at their infrastructure very carefully, even devices they have forgotten about. For example, every router, layer-3 switch, mail server, DNS server and file server needs to be looked at and reconfigured. Also, they all had to change their DNS entries at their Internet registrar, so that the world could find them.

So I prepared an extensive checklist for each customer to go through and check. Some customers had considerable technical expertise and didn't need this; they used it as a reference. Other customers really needed hand holding and relied on us for technical support to get them through it. Our preparation included a seminar for the less technical customers so that they would at least understand what was going on. Many of them had workstations with hard-coded DNS entries instead of using DHCP. We went so far as to show them how to set up a DHCP server and how to set up DHCP on the individual workstations.

We also provided primary and secondary DNS services for at least half of our customers. To make things easier we purchased two new blade servers for DNS and had them reconfigured to the new CIDR block. We then requested and received a /20 CIDR block from our ISP. This provided us with essentially 16 Class C networks for our customers' needs. With the smaller customers we had to slice up our /20 into a couple of /26s. It all seemed simple, well planned and prepared. I had documented everything necessary and provided this to our customers.

Classless Inter-Domain Routing (CIDR)
Why is CIDR important? Without it, the Internet would probably have run out of IP addresses long before now. CIDR allows ISPs to more efficiently use the IP address space that we are quickly using up. IP addresses are a finite resource, like oil. We then sliced up the /20 that we received into /26 CIDR blocks to give to our customers. This gave them 62 usable IP addresses.
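
The arithmetic above can be checked with Python's standard ipaddress module. This is only a minimal sketch: 10.0.0.0/20 is a stand-in block chosen for illustration, not the /20 we were actually given.

```python
import ipaddress

# Hypothetical stand-in for the provider-assigned block (an assumption,
# not the real allocation from the story).
block = ipaddress.ip_network("10.0.0.0/20")

# A /20 splits into 16 /24s, the old "Class C" size.
class_c_sized = list(block.subnets(new_prefix=24))

# The same /20 splits into 64 /26s for the smaller customers.
small_customer = list(block.subnets(new_prefix=26))

# Each /26 holds 64 addresses; the network and broadcast addresses
# are not assignable to hosts, which leaves 62.
usable_per_26 = small_customer[0].num_addresses - 2

print(len(class_c_sized), len(small_customer), usable_per_26)
```

In practice a /20 would be carved into a mix of sizes rather than all /24s or all /26s; the two lists here just show the counts at each size.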

Our customers, in turn, would put their customers on private networks and then use various forms of Network Address Translation (NAT). For example, most people reading this probably have a private 192.168.x.x IP address. Private IP addresses are in the following ranges:
10.0.0.0-10.255.255.255
172.16.0.0-172.31.255.255
192.168.0.0-192.168.255.255
This is probably the case if you have a router/firewall at home. Your ISP might then assign your router/firewall another private IP address before it finally receives a public IP address. This brings us back to the public CIDR blocks of IP addresses.
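
As a rough illustration, the three private ranges listed above (the ones defined in RFC 1918) can be written as CIDR blocks and tested in Python. The function name is_rfc1918 is made up for this sketch.

```python
import ipaddress

# The three private ranges listed above, written as CIDR blocks.
PRIVATE_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),       # 10.0.0.0-10.255.255.255
    ipaddress.ip_network("172.16.0.0/12"),    # 172.16.0.0-172.31.255.255
    ipaddress.ip_network("192.168.0.0/16"),   # 192.168.0.0-192.168.255.255
]


def is_rfc1918(address: str) -> bool:
    """Return True if the address falls in one of the three private ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in PRIVATE_BLOCKS)


for candidate in ("192.168.1.10", "172.31.255.255", "172.32.0.1"):
    print(candidate, is_rfc1918(candidate))
```

The explicit list is used instead of the module's built-in is_private attribute because that attribute also covers other reserved ranges beyond these three.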

It was now early June, I had been working on this since the previous Christmas and was sure things would go smoothly. We had a date for the cut over from our old backbone provider and then notified our customers well ahead of time of the date of the cut over to the new ISP.

Gathering Storm On the Horizon
The big day finally arrived. I had double checked procedures with our customers and they were all prepared. Part of our plan was to do one-a-day with our largest customers and then two-a-day with our smaller customers. We decided to do our two largest customers on the weekend. They both had thousands of their own customers. Downtime would be minimal, DNS issues notwithstanding; DNS changes would take a while to update throughout the Internet.

A couple of weeks earlier our company switched over to the new backbone and we saw no problems. We are quite small but we tested connectivity and were quite pleased. We decided that our customers would be quite happy with the larger pipes, more bandwidth and lower costs.

The changeover for the first large customer went flawlessly; they had planned well and were ready to go. I kept the old DNS servers running for a couple of weeks, and some of our customers pointed to them for a while. The whole changeover occurred in less than 30 minutes. Our small team hung around, helped them through a few glitches and called it a day. We told them they could call us anytime, day or night, and they received my cell number in case they wanted to call me with any questions. Pleased and satisfied, we went for drinks at a nearby bar.

The next day was a Sunday and we switched over our next largest customer. They had some concerns about the changeover, so while we did it they sent some of their technical specialists over to take a look and monitor their new network. We had a switch available for them to plug into to monitor their nets. All went as it had with the previous customer and the changeover was flawless. They were pleased. It continued this way for the rest of the week with little hand holding on our part.

Bogon IPs
Lurking in the background all this time, but unknown to us, was the bogons IP listing. Our tier 1 provider assured us several times over the phone, via email and on their bulletin board that the /20 CIDR block was good, fresh and not previously used. The following week, as we were changing our smaller customers, we began getting tech calls from our larger customers. The problem: there were certain web sites they couldn't reach. I remember calling up the Provisioning Manager and asking if some other ISP had previously used the /20 we now had. We were beginning to have sporadic website and email issues.

What are bogon IPs, you ask? Well, I didn't know either. There is normally no reason to. Here is what they are.

Officially, bogons are IP blocks that have not been allocated by IANA or the RIRs, and they should not be routable. This is no big deal unless you are an organization that puts these lists in your routers, so that when packets from those blocks show up on your WAN circuit they are immediately dropped. This is a common practice with some organizations, and there is nothing wrong with it. However, there is a very big problem that happens very rarely, and it happened to us.

The /20 CIDR block that our tier 1 backbone provider had allocated to us came out of a /16 CIDR block that the bogons list showed as unallocated by IANA! The result was thousands of ISPs dropping our packets. This was happening on DNS lookups, MX lookups and for websites. For example, we were getting calls from customers saying that they could not get to an airline's website or that some email was not getting to recipients.
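
To make the failure concrete, here is a small Python sketch of what a router with a stale bogon filter effectively does. The prefixes are invented stand-ins (private addresses, not our real block), and is_dropped is a hypothetical helper, not any router's actual ACL logic.

```python
import ipaddress

# Hypothetical stale bogon list: one /16 that was once marked unallocated.
STALE_BOGONS = [ipaddress.ip_network("10.50.0.0/16")]

# Hypothetical /20 handed to a customer out of that same /16.
our_block = ipaddress.ip_network("10.50.16.0/20")


def is_dropped(source: str, bogons) -> bool:
    """Return True if a packet from this source would match the filter."""
    ip = ipaddress.ip_address(source)
    return any(ip in net for net in bogons)


# The whole /20 is covered because it sits inside the listed /16.
covered = any(our_block.subnet_of(net) for net in STALE_BOGONS)

print(covered)                                  # whole block affected?
print(is_dropped("10.50.17.25", STALE_BOGONS))  # a host in our /20
print(is_dropped("10.51.0.1", STALE_BOGONS))    # a host outside the /16
```

The point of the sketch is that the filter never needs to mention the /20 itself: any longer prefix inside a listed block is caught along with it.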

"Have you double checked your firewall settings?" I would ask them scratching my head.

"Yeah we have double checked a hundred times! Something is not right." The network admin advised me.

"Will you check some DNSRBL lists?"

"ARRRGGHHH!" would come the response.

The first 10 of these I answered this way. I couldn't figure out what was going on. Finally, I called our upstream provider.

"I need to speak with our Provisioning Manager," I said after slamming through the lame voice menu crap.

"Our customers cannot get to certain websites, was the CIDR block you gave us clean and never used before?"

"Yes, they are clean and have never been used." That was the answer I would get in a somewhat condescending tone.

"Would you check for me please?"

"Let me put you on hold while I talk to the IP Address Team."

This happened a couple of times until I got disgusted and hung up. It was difficult diagnosing this problem from work because I could only see what I could see from there. So, on the second day of this nonsense I took a FreeBSD box home, put it on my network and started looking at our CIDR block from the outside.

First I tried to ping and traceroute to our router. What I saw were packets being dropped long before they even reached our backbone. In fact, I tried doing DNS lookups off my DNS server and nothing, email bounced also ... not a good thing. I only received non-existent domain responses.

Deciding that something somewhere was dropping ICMP packets I decided to traceroute using mtr (Matt's Traceroute). It is a nice tool that combines ping and traceroute. Also it gives good statistics that you can copy and paste. I was consistently getting dropped packets to my network at work. I ran mtr going to all our interfaces on our two Cisco 7513s from home. Then I used an online traceroute and ping at DNSStuff.com and started gathering data from there. None of it was looking good but I had come to the conclusion that there was some vague firewall blocking me going to various sites.
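
For anyone who wants to collect the same kind of copy-and-paste statistics, a report-mode mtr run can be wrapped in a few lines of Python. This is a sketch, not what I ran at the time: the helper names are made up, 192.0.2.1 is a documentation address standing in for a router interface, and depending on the system mtr may need elevated privileges.

```python
import shutil
import subprocess
from typing import List


def build_mtr_command(target: str, cycles: int = 10) -> List[str]:
    """Build an mtr invocation that prints a text report and exits."""
    return [
        "mtr",
        "--report",                      # non-interactive summary output
        "--report-cycles", str(cycles),  # number of probes per hop
        "--no-dns",                      # show addresses, skip reverse lookups
        target,
    ]


def run_report(target: str, cycles: int = 10) -> str:
    """Run mtr against the target and return its report text."""
    if shutil.which("mtr") is None:
        raise RuntimeError("mtr is not installed on this machine")
    result = subprocess.run(
        build_mtr_command(target, cycles),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Placeholder target; substitute the interface being tested.
print(" ".join(build_mtr_command("192.0.2.1")))
```

Running the same report from several outside vantage points is what makes the pattern visible: the loss shows up before the packets ever reach the backbone.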

Then I decided to google the first Class C slice out of the /20 CIDR block. BINGO! A ton of information sprang up. At the top of the list was a bulletin board entry buried deep inside our tier 1 backbone's website. The problem first occurred in February and no one noticed for a month.

The bogons list is called the Bogon IP List. What some network administrators do is put this list in their router's ACL and other infrastructure devices. It is a good idea unless you forget to update the list, which is exactly what we were now confronted with.

The /20 CIDR block was only on the list for a short time: one month, in fact, from February to the end of March. When I found the bogons list and did not see the entries referenced on the tier 1 ISP's BBS, I figured out that the block had been quickly removed by whoever made the mistake. But during that short period thousands of network admins had stuffed it into their routers and hadn't bothered updating the list. Our /20 did not show up on the June and July lists.
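
The stale-list problem is easy to show by comparing two snapshots of a bogon list. The sketch below uses invented prefixes and a made-up parse helper; the real lists are longer and their exact file format is not reproduced here.

```python
import ipaddress


def parse(lines):
    """Turn text lines into a set of networks, skipping blanks and comments."""
    networks = set()
    for line in lines:
        entry = line.strip()
        if entry and not entry.startswith("#"):
            networks.add(ipaddress.ip_network(entry))
    return networks


# Hypothetical snapshots: what an admin loaded in February versus
# what the list looked like by June.
february = parse(["# bogons, February", "10.50.0.0/16", "10.60.0.0/16"])
june = parse(["# bogons, June", "10.60.0.0/16"])

# Prefixes still being filtered by anyone who never refreshed the list.
removed_since_february = sorted(february - june)

for net in removed_since_february:
    print("stale entry still blocked:", net)
```

Anything in the set difference is address space that has since been allocated but is still being dropped by every router running the old copy.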

Help Desk Nightmare
Our phones are now ringing off the hook with help desk calls, angry customers and I don't blame them. I go to work every morning and stay on the phone all day, literally.

I email all my evidence to the ISP's Project/Provisioning Manager. Then I call him up. They tell me they are working on the problem. My bosses are fuming.

"Did you read my email?"

"We are going over it now. Can I call you back?"

"NO, I want some answers now and so does my boss! I have a bunch of angry customers. Do you want me to transfer their calls to you?"

Once again I am put on hold and in anger I hang up and report the results to the boss. I call the Provisioning Manager back and he directs me immediately to a network specialist.

"Did you know about the complete /16 being on the bogons list in February?"

"Well um...yes but it is not on there now."

The fact is that once a CIDR block is on the list, it takes years for it to get clean. This is because thousands of router admins use the list, and busy admins do not update it very often.

We demanded a new /20 and it provided us with a work around. The new /20 was clean and we started routing everyone through it. The IPs that blocked us we entered in the router. Eventually, this ate up all the router resources.

As it turns out, the Executive Director knows the VP at the ISP. He calls him up. Our ISP wants us to call all the main admins together for a big egg-on-the-face meeting with lots of swag and a free breakfast. I sense it will be fun to watch these fat cats weasel out of this. But I was worried that our customers would start leaving us.

Over the weekend, I had written a simple PHP weblog for our customers to log sites they could not reach. The log was beginning to number hundreds of sites and the MySQL database was growing.
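
The original was PHP on MySQL and is not reproduced here. As a rough idea of how little it takes, the same kind of log can be sketched in Python with the standard sqlite3 module; the table layout and function names below are invented for illustration.

```python
import sqlite3


def open_log(path=":memory:"):
    """Open (or create) the log database. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS unreachable (
               id INTEGER PRIMARY KEY,
               customer TEXT NOT NULL,
               site TEXT NOT NULL,
               reported_at TEXT DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    return conn


def report(conn, customer, site):
    """Record one unreachable site, normalised to lower case."""
    conn.execute(
        "INSERT INTO unreachable (customer, site) VALUES (?, ?)",
        (customer, site.strip().lower()),
    )
    conn.commit()


def distinct_sites(conn):
    """Return each reported site once, sorted, for handing to the upstream."""
    rows = conn.execute("SELECT DISTINCT site FROM unreachable ORDER BY site")
    return [row[0] for row in rows]


conn = open_log()
report(conn, "customer-a", "Example.com")
report(conn, "customer-b", "example.com")
report(conn, "customer-b", "example.net")
print(distinct_sites(conn))
```

A deduplicated list like this is the kind of evidence that is hard for an upstream provider to wave away.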

So the morning breakfast came and it was filled with Bosses and network specialists. The tier 1 ISP had sent VPs, PR people and a Technical Staff. They gave their canned speech then the techs started in on them.

"How is this gonna be fixed?"

"Are you gonna charge us? Will we get rebates until this is fixed?"

This went on for about an hour. Finally, our Executive Director rose and said,

"There will be no charge and they will fix this today or tomorrow."

My boss, who at this point knew all about the Bogons list, said,

"And those that want to switch will receive new Class C IP addresses. Those that can't switch immediately will have their Class C addresses reserved until they are ready."

Over the next month everyone switched to the clean /20s and we lost no customers. We eventually straightened out all the billing problems with the upstream provider. Even with all that mess billing became the next problem.

Eventually the storm calmed. I returned to sleeping at nights once again.

NOTES:
As suggested in the comments by nasty1, a better bogons listing: Team Cymru

Tier 1 ISP: A Tier 1 ISP is a telco or Internet service provider IP network which connects to the rest of the Internet only via a practice known as peering.

Tier 2 ISP: A Tier 2 carrier (or Tier 2 ISP) is an Internet service provider who peers with other networks, but still pays for IP transit to reach some portion of the Internet.

Peering: Peering is the practice of voluntarily interconnecting distinctly separate data networks on the Internet, for the purposes of exchanging traffic between the customers of the peered networks. Peering is also known as settlement-free interconnection, which indicates that neither party pays the other for the traffic being exchanged. It is my understanding that depending on how Net Neutrality turns out, this may change the way you use the Internet, peering as it exists now may no longer exist.

Let Bogons Be Bogons: A Nightmare from ISP Hell | 78 comments (63 topical, 15 editorial, 0 hidden)
i am so glad to see this back where it belongs.. (1.91 / 12) (#2)
by dakini on Mon Jul 03, 2006 at 09:20:57 PM EST

i will certainly vote +FP when in voting..

" May your vision be clear, your heart strong, and may you always follow your dreams."
+1, bomb the system. $ (1.57 / 7) (#5)
by akostic on Mon Jul 03, 2006 at 09:31:15 PM EST


--
"After an indeterminate amount of time trading insane laughter with the retards, I grew curious and tapped on the window." - osm
HERE ARE THE ORIGINAL COMMENTS... (2.88 / 9) (#10)
by lamppter on Mon Jul 03, 2006 at 10:36:26 PM EST

It is nearly impossible to get them back to their original state. SO HERE IS THE LINK TO THE ORIGINAL COMMENTS BEFORE IT WAS SHIT-CANNED BY SOMEONE WHO-KNOWS-WHO.

If you hate me or the story so much then get your nullo/trolls/dupes to work. When it goes to voting again.

TROLL AWAY...

Naive Bayes Classification and K5 Dupes

Didn't you learn (1.04 / 25) (#11)
by AlwaysAnonyminated on Mon Jul 03, 2006 at 11:24:26 PM EST

the first time I dropped this story?
---------------------------------------------
Posted from my Droid 2.
It's raining mud, it's about to be (1.04 / 22) (#12)
by lamppter on Mon Jul 03, 2006 at 11:38:14 PM EST

the 4th of July, I will be eating brisket that is going to be dripping with fat, no I won't choke, I am not overweight, I have fun, I have a social life, I won't mindpixel or klerck myself, I will drink rye whiskey tomorrow, shoot off some fireworks (ala T1ber), let this story die, it has errors, my last edit disappeared, I am too disgusted to fix any more of it, it wasn't worth the effort, ALL comments were good even the trolls and  obvious lozer comments, thanks to those that helped me with the edits, good night.

I DON'T GIVE A SHIT EITHER.

KTHX HAND

Naive Bayes Classification and K5 Dupes

I pushed it to voting .... (1.09 / 11) (#18)
by lamppter on Tue Jul 04, 2006 at 06:12:13 AM EST

so get rid of it.

Naive Bayes Classification and K5 Dupes
+1 FP (2.11 / 9) (#20)
by tetsuwan on Tue Jul 04, 2006 at 06:28:36 AM EST

Where is aphrael's vote?

Njal's Saga: Just like Romeo & Juliet without the romance

cast, +1fp. (none / 1) (#31)
by aphrael on Tue Jul 04, 2006 at 01:19:19 PM EST

i'm in west coast usia, and today is a holiday, so i didn't see it had gone to vote until just now. :)

[ Parent ]
APPARENTLY YOU DIDN'T LEARN (1.02 / 39) (#23)
by AlwaysAnonyminated on Tue Jul 04, 2006 at 09:20:38 AM EST

THE FIRST TIME I BANISHED YOU FROM THE VOTING QUEUE. I SUPPOSE YOU ARE IN NEED OF ANOTHER LESSON. DON'T GO TO SLEEP.
---------------------------------------------
Posted from my Droid 2.
Glad to see this is back to voting +FP (1.55 / 9) (#24)
by free2delude on Tue Jul 04, 2006 at 09:27:48 AM EST


$$Any fool can criticize, condemn, complain. And most do.$$
spigot would be pleased (1.36 / 11) (#26)
by indubitable on Tue Jul 04, 2006 at 11:17:47 AM EST


What kind of sick fuck doesn't want to roger some dude wearing a bear suit?

spigot (2.25 / 4) (#33)
by indubitable on Tue Jul 04, 2006 at 02:52:58 PM EST

spigot

What kind of sick fuck doesn't want to roger some dude wearing a bear suit?
[ Parent ]

my favorite thing about this ascii (2.90 / 10) (#29)
by Tex Bigballs on Tue Jul 04, 2006 at 12:48:42 PM EST

is that the poster didn't even bother taking out the anti-slashdot/taco stuff at the bottom

[ Parent ]
that often happens, (2.33 / 3) (#37)
by livus on Tue Jul 04, 2006 at 11:26:30 PM EST

it's a feature.

---
HIREZ substitute.
be concrete asshole, or shut up. - CTS
I guess I skipped school or something to drink on the internet? - lonelyhobo
I'd like to hope that any impression you got about us from internet forums was incorrect. - debillitatus
I consider myself trolled more or less just by visiting the site. HollyHopDrive

[ Parent ]
you are a disgusting piece of shit, two year (1.04 / 21) (#27)
by dakini on Tue Jul 04, 2006 at 11:21:36 AM EST

olds have more sense then you and more morals..

" May your vision be clear, your heart strong, and may you always follow your dreams."
0 - unoriginal, unfunny. (1.13 / 15) (#30)
by tetsuwan on Tue Jul 04, 2006 at 01:16:06 PM EST


Njal's Saga: Just like Romeo & Juliet without the romance

About time an excellent tech write up (2.50 / 6) (#32)
by Sandwormrum on Tue Jul 04, 2006 at 01:50:23 PM EST

showed up. Well Done.
**Any technology distinguishable from magic is insufficiently advanced.**
Attn Kurons: This is exactly what happened... (1.80 / 15) (#34)
by lamppter on Tue Jul 04, 2006 at 05:54:54 PM EST

A tip of the hat to aphrael, he owns his stuff. That is cool coz few people do.
Below is the email exchange I had with the admin that caused all this...aphrael. It took some balls for him to write the email to me I am sure. He is allowing me to reprint it here.

Email #1 My complaint:
Date: Mon, 3 Jul 2006 15:18:25 -0700 (PDT)
From: "lampp ter" <lamppter>
Subject: Is there an explanation for why
To: rusty@kuro5hin.org
CC: help@kuro5hin.org
my story was taken from voting with a score of 51 and total votes of 63? Also, the timestamp was at the time it went to voting was not even reset.

For what it's worth, I spent a lot of time on that story.

Can someone explain?

Email #2 Response from aphrael:
Date: Mon, 3 Jul 2006 23:24:39 -0700 (PDT)
Subject: Re: Is there an explanation for why
From: aphrael@discontent.com
To: "lampp ter"
CC: help@kuro5hin.org

> my story was taken from voting with a score of 51 and total votes of 63?
> Also, the timestamp was at the time it went to voting was not even reset.

yes.

the explanation is that i'm an idiot: i read a comment you had made saying that you were going to pull and resubmit, saw that the story was in voting, and concluded (incorrectly) that the story had been moved out of editing *while you were still actively editing it* by misuse of 'move to voting'.

(this happens a couple of times a year; normally it's signaled by an author complaining in a comment that his story went to voting before he was ready.)

so i put it back in editing, figuring this would give you time to make the changes you'd wanted to make.

of course, i was very badly wrong about the whole thing.

my apologies for the horrible fuckup.

TEH SPECIAL RECOGNITION GOES TO Girls Dont Like You without his comment the story would have never have hit Front Page.

Thanks to the people in the edit queue that helped me edit the story. Some of it got lost though.

I want to thank the following users and their diaries, in the order they appeared.

Stories Taken out of voting dakini
zOMG K5 ISN'T A MERITOCRACY creative dissonance
dear k5 admin desudesudesu
(L)user revolt tetsuwan
Dear Admins GhostOfTiber
Adding to the whiny indignant protest Shower of Gold
K5 is not dying crazy canuck
Why don't you people just shut up? norm
Let's to be having a Crusade! Roger Mexico
Tinfoil hat Kurobots... ktakki
You useless wetback admins, I'm calling you out. johndaego
Where is the Admin Accountability? AlwaysAnonyminated
How many kurons does it take to post a story regeya
pre-emptive ending of gay diary flood desudesudesu
Corrupt admin finally outed by yours truly: desudesudesu
Dear Admins t1tber


Naive Bayes Classification and K5 Dupes

Glad to have helped. (1.00 / 2) (#35)
by Emacs Or Pico on Tue Jul 04, 2006 at 06:39:04 PM EST

Buttsechs?
--
   .-' &   '-.
  /           \
 :   o    o    ;
(      (_       )
 :             ;
  \    __     /
   `-._____.-'
     /`"""`\
    /    ,  \
   /|/\/\/\ _\ 
  (_|/\/\/\\__)
    |_______|
   __)_ |_ (__
  (_____|_____)

[ Parent ]
Thanks (2.33 / 3) (#36)
by CAIMLAS on Tue Jul 04, 2006 at 07:33:16 PM EST

Ya (re)learn something every day... I've heard of bogons - way back in high school when I had closet dealings with such things. Haven't been around that stuff for a while, good for the refresher...
--

Socialism and communism better explained by a psychologist than a political theorist.

BGP? (2.90 / 10) (#38)
by jammib on Wed Jul 05, 2006 at 08:57:06 AM EST

Sorry if this seems a bit trollish.

How come you don't use BGP and what we in Europe call PI IP space? PI IP space is basically your own range of IP addresses, i.e. your own /19, which you then advertise out to your upstream transits via BGP. That way you can chop and change transits to your heart's content without ever changing IP addresses.

It makes life a lot easier.

Jammib

I don't think it would have helped (1.50 / 2) (#64)
by Perpetual Newbie on Wed Jul 12, 2006 at 01:29:25 AM EST

Bogons are added into routers as explicit null-route statements to toss anything coming to or going from certain IP blocks. I'd expect the explicit ip route statements to take precedence over BGP.

[ Parent ]
Portable address ranges? (2.84 / 13) (#39)
by vectro on Wed Jul 05, 2006 at 05:06:30 PM EST

Why in gods name are you using an IP address range allocated from a single upstream ISP? The right thing for an organization of this size to do is get a portable /16 directly from ARIN, allocated to your company, and then tell your upstream ISP to announce it to the world.

Also, then you can have connections to multiple upstream ISPs, so you're not dependent on a single supplier for your entire product.

“The problem with that definition is just that it's bullshit.” -- localroger

I like articles of a technical nature (1.38 / 13) (#40)
by LOSEWEIGHTORDIE on Wed Jul 05, 2006 at 06:57:10 PM EST

I mean, well written articles of a technical nature. Rewrite.

i can't believe this got voted up (1.36 / 11) (#41)
by Love Child of Baldrson and HollyHopDrive on Wed Jul 05, 2006 at 09:49:36 PM EST

"Officially they are not IP blocks officially allocated by IANA or RIRs. Additionally, they should not be routable. This is no big deal unless you are an organization that puts these lists in your routers so that if they show up on your WAN circuit the bad packets are immediately dropped. Still, this is a common practice with some organization, no big deal nothing wrong with that. However, there is a very big problem that happens very rarely and it happened to us.

The /20 CIDR block that had been allocated to us by our tier 1 backbone provider was allocating an IANA unallocated /16 CIDR block!"

what the hell does that mean? the whole article revolves around bogons and i have no more idea of what they are after reading it than before.

trane: Eventually the human race will realize that scientific progress is (almost always) slowed down by lies, and promote truth, justice and the American way over lying, discrimination, and the lesser American way.

There are two links in the article that explain (2.66 / 6) (#42)
by bushmanburn on Thu Jul 06, 2006 at 01:01:52 AM EST

what bogons are. If you click on both links there is a short explanation of what they are. I think that is why the links are there YMMV though.

[ Parent ]
do they? (1.50 / 6) (#43)
by Love Child of Baldrson and HollyHopDrive on Thu Jul 06, 2006 at 01:12:03 AM EST

it's not clear in the article that they explain it. face it, his english is just fucking terrible.

trane: Eventually the human race will realize that scientific progress is (almost always) slowed down by lies, and promote truth, justice and the American way over lying, discrimination, and the lesser American way.
[ Parent ]
Well they explained it to me anyway... (2.75 / 8) (#44)
by bushmanburn on Thu Jul 06, 2006 at 01:26:20 AM EST

this link and

this link as well were both in the story.

For me it was straightforward and made sense and wasn't too hard to figure out. But like I said, YMMV.

[ Parent ]

Still, (3.00 / 3) (#62)
by werner on Mon Jul 10, 2006 at 07:35:16 PM EST

he knows about capital letters. If you must insist on criticising other folks' English, at least bother to make sure your critique makes use of it, too.

Also, neither the author nor the other posters are responsible for your ignorance.

[ Parent ]

"technology and culture,from the trenches (1.77 / 9) (#51)
by circletimessquare on Thu Jul 06, 2006 at 08:23:49 PM EST

see that on the masthead?

good!

now shut the fuck up

the entirety of the problem you have with this story is that it is about "technology and culture, from the trenches"

therefore, you're the problem, not the story

so we're sorry you're too fucking retarded to catch up

but maybe you're on the wrong fucking website

try this one, it better matches your mental faculties

The tigers of wrath are wiser than the horses of instruction.

[ Parent ]

you always nail it CTS /nt (2.00 / 4) (#52)
by bushmanburn on Thu Jul 06, 2006 at 09:35:31 PM EST



[ Parent ]
This is weird, I'm actually agree completely [n/t] (2.75 / 4) (#56)
by vadim on Sat Jul 08, 2006 at 10:54:45 PM EST


--
<@chani> I *cannot* remember names. but I did memorize 214 digits of pi once.
[ Parent ]
ooh vitriol (1.50 / 2) (#69)
by Love Child of Baldrson and HollyHopDrive on Wed Jul 12, 2006 at 06:07:38 PM EST

to be precise, the entirety of the problem i have with this article is that it is poorly written, yet got voted FP anyway exactly because it is "technology and culture, from the trenches" which, imo, is not a sufficient reason to vote anything up, no matter how cool the topic.

trane: Eventually the human race will realize that scientific progress is (almost always) slowed down by lies, and promote truth, justice and the American way over lying, discrimination, and the lesser American way.
[ Parent ]
I'll speak slowly (none / 0) (#78)
by sgp on Fri Aug 04, 2006 at 06:51:02 PM EST

The /20 block is a block in which the first 20 bits of the 32-bit IP address relate to the network (the other 12 bits identify the host), so:

11111111 11111111 11110000 00000000

The 1's represent network address, the 0's represent host addresses. That's the /20 block. However, that's a subset of the /16 block which was on a list of "never-allocated IPs" aka Bogons:

11111111 11111111 00000000 00000000

So - from this huge (half-network, half-host) address range, they got allocated a smaller range, which was also (naturally) marked as "never-allocated IPs".

It's not terribly difficult to understand, *given that you understand how IP works in the first place*. If you don't understand the basics of CIDR, http://public.pacbell.net/dedicated/cidr.html looks like a decent summary of the problem and how CIDR solves the problem.

This isn't the place to explain every last bit of detail to you - if it sounds interesting, then investigate the details, you'll learn lots. Some people think that how data is routed around and between networks, to create "the internet", is really interesting and relevant; others think that it's boring geeky stuff. I don't know (nor do I care) which side of the fence you are on, the point is that this isn't a site which claims to take anyone from a point of knowing nothing about a subject to the point where they can fully understand all the issues. It's a site for people to discuss things, which means that a certain amount of knowledge has to be taken as a baseline.

There are 10 types of people in the world:
Those who understand binary, and those who don't.

[ Parent ]

Shouldn't hardcode bogons (2.50 / 4) (#45)
by bithead on Thu Jul 06, 2006 at 12:23:57 PM EST

The best way is to set up a BGP peering session to someone else (like the cymru route server project) who updates the portion of the bogon list comprised of 'public' IP addresses not yet in use.  That way you don't inadvertently block newly allocated network addresses, and still stop undesirable routes.  If you believe cymru, about half of all attacks use bogon IP addresses as the source addresses, so its probably worthwhile to block them.  Still, there are probably lots of people who hard code the list into their routers, since they might perceive it as easier to understand than the black art of BGP.

Stop making shit up (1.55 / 9) (#46)
by jungleboogie on Thu Jul 06, 2006 at 01:53:42 PM EST

You have 2 DS3s yet you serve "160,000 users" and "13 large organizations" ??

And you do this with an ancient, buggy, piece of shit Cisco 7513?

Uhh, yeah, right.......

Even better (1.40 / 5) (#47)
by jungleboogie on Thu Jul 06, 2006 at 01:57:03 PM EST

You serve 160,000 users out of a /20 !?!?!
You work at an ISP yet your bosses' favorite web site is AOL.COM !?!?!
This is so wrong, I don't even know where to begin.

[ Parent ]
Don't begin because you sound like (2.00 / 3) (#53)
by bushmanburn on Thu Jul 06, 2006 at 10:13:12 PM EST

you dont know what the hell you are talking about. Honest...you are completely clueless...

[ Parent ]
No offense but you sound like (2.33 / 3) (#54)
by bushmanburn on Thu Jul 06, 2006 at 10:16:52 PM EST

you are totally stupid.

I work as a consultant ... and there are plenty of 7513s in service doing very well. If it's not broke it doesn't need fixed sometimes.

The "latest and greatest" isn't always the best. Besides, "buggy" has more to do with IOS than hardware you dipshit.

[ Parent ]

Thanks (none / 1) (#59)
by jungleboogie on Sun Jul 09, 2006 at 06:15:05 PM EST

But I've used 7505s and 7507s for the past 10 years.  I'm sick and tired of the SOFTWARE BUGS that cisco hasn't fixed in that time.  Except for RAM issues, the hardware has been relatively solid.  I have hardware that has been EOL for 5 years that cisco has still never fixed properly.  It's mostly software issues, or hardware shortcomings that were part of the design.  I moved on to Juniper M20s and I will never look back.  Hope your consulting is going well with those 7500s.

[ Parent ]
totally stupid? (none / 1) (#60)
by jungleboogie on Sun Jul 09, 2006 at 06:21:39 PM EST

I'm totally stupid because I'm questioning anyone on the modern internet who serves 160,000 users with a 7513 and a /20?? Maybe 160,000 dialup users behind NAT.  Or, make that 160,000 dialup users on a 10:1 user to modem ratio, maybe.
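For anyone who wants to check the arithmetic, a quick sketch in Python. The 10.0.0.0/20 prefix is only a placeholder for "some /20"; the user count and the 10:1 ratio are the figures quoted above.

```python
import ipaddress

# Placeholder /20; only its size matters here, not the actual prefix.
block = ipaddress.ip_network("10.0.0.0/20")

addresses = block.num_addresses        # total addresses in a /20
users = 160_000                        # figure quoted from the story
modems_at_10_to_1 = users // 10        # concurrent lines at a 10:1 user-to-modem ratio
users_per_address = users / addresses  # ratio needed to fit everyone in the block

print(addresses)          # 4096
print(modems_at_10_to_1)  # 16000
print(users_per_address)  # 39.0625
```

So even at 10:1, one address per modem would need roughly four times the space of a /20; fitting in the block would take about 39 users per address, or NAT.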

[ Parent ]
perhaps a silly question (2.75 / 4) (#48)
by qu1j0t3 on Thu Jul 06, 2006 at 02:57:06 PM EST

Wouldn't it have helped to immediately offer proxy servers at least while the mess gets sorted out?

great story (2.66 / 3) (#49)
by rjnagle on Thu Jul 06, 2006 at 03:50:00 PM EST

What an interesting (and educational) story. Yet another aspect of networking I am totally ignorant of.

I would be curious about the extent of the losses to your company in terms of time and money. Did your agreement with the upstream provider protect you against these kinds of mishaps?

what would you have done differently? (2.75 / 4) (#50)
by rjnagle on Thu Jul 06, 2006 at 03:57:42 PM EST

I'm curious: what would you have done differently to prevent this occurrence?

The moral of the story (3.00 / 2) (#55)
by Maserati on Sat Jul 08, 2006 at 02:28:35 AM EST

The moral of the story is:

Research your newly-assigned address blocks carefully. Discovering ahead of time that you might be on some bogon lists with the new address space would have let you insist on a clean block.

--

For the wise a hint, for the fool a stick.
[ Parent ]

portable addresses duh (none / 1) (#76)
by lodc on Sat Jul 15, 2006 at 01:40:05 PM EST

any real isp would have had a portable block from arin anyways, so they could switch or add upstream providers any time they wanted without having to renumber anything on the client side.  these guys are a joke.
if my isp called and told me I had to renumber because they were getting a different provider, I would indeed have to renumber but it would be because I would be switching to an ISP that knew what they were doing.
this story is just a ridiculous example of how smaller ISPs generally have crap technical knowledge.

[ Parent ]
good story BUT intro-box too long!! (2.25 / 4) (#57)
by Roger Mexico on Sun Jul 09, 2006 at 10:25:16 AM EST

We're not supposed to put all that much text in the introduction box, I think. Man.

You're probably correct but... (3.00 / 2) (#58)
by lamppter on Sun Jul 09, 2006 at 12:07:24 PM EST

after all the shit this story went through, it's questionable if I should ever post another one, know what I mean?

Naive Bayes Classification and K5 Dupes
[ Parent ]
Bogons ... (2.75 / 4) (#61)
by ejf on Mon Jul 10, 2006 at 04:00:37 AM EST

In your story, you say that the netblock was mistakenly on a bogon list for a short time. I just thought I'd add that similar problems crop up every day with blocks that were previously bogons but got assigned to a registry by IANA at some point (such as 89/8, which happened in mid-2005). Many routers STILL do not route these addresses, even though they have been off the bogon list for (in net-terms) ages. It's surprising what dimwits get enable at many ISPs ...

Tier 1, 2, 3-declarations, btw, are a heavily contested terminology (read nanog some time :->).

--- men are reasoning, not reasonable animals.

Information about inaccuracies in this article (2.33 / 3) (#63)
by willix on Wed Jul 12, 2006 at 12:40:39 AM EST

While this is a good story to raise awareness of the issues with bogons and the problems that may arise when filters are not updated regularly, there are some inaccuracies. In particular, it may not have been appropriate for the author to put the blame on a particular list-generating party without investigating further and at least finding out which lists the sites that blocked his network actually used. Hopefully such an oversight can still be corrected by the article's author, if that is indeed what happened.

In particular, the completewhois bogon lists are quite accurate, and it is very, very unlikely that an error would persist for a month (errors do happen on occasion with some legacy IP space, but these are fixed within hours). What is described is also filtering by outside networks in their core routers, but the completewhois bogon lists are regenerated daily to account for all new RIR->ISP allocations and are very detailed and quite large. Because of their size, they really cannot be used on core routers and were not designed for them in the first place. The lists are primarily used on a per-query basis (mainly through DNSBL-like verification) for spam filtering, and as a filter on some firewalls and Unix servers that rsync (or ftp, etc.) the new lists on a daily, weekly or (which is very discouraged but does happen) monthly basis. Hopefully nobody is foolish enough to update less often than that, and the website and list data certainly say it should not be done.

Because this is a daily, automated generation system, there are a lot of additional checks to prevent possible errors. One of these post-generation checks removes a /8 block from the pre-published list if it is a /8 that is not held in reserve by IANA (i.e. a block not allocated to an RIR and listed on their site). As a result, when IANA newly allocates a /8 to an RIR, the block that was previously listed as a bogon may for a time not be listed at all (which, as some here may point out, makes the list less specific/accurate than normal, but that is better than a large number of users being blocked, even for a short time, when RIR data was not properly retrieved and processed). When the RIR does begin to make allocations from the new /8, most of the block, except the allocated space, shows up in the bogon list again. The one-month period described may be the time between when the RIR started making its first allocations from the new /8 and when the ISP used by the article's author actually got its IP block (at least that is my guess from how it is described).

So it is most likely that the problems experienced by `lamppter' and described in the article are exactly those pointed out in the comment by `ejf', and have to do with ISPs and organizations that previously entered the list of IANA-unallocated /8 blocks into their routers. While this is exactly the bogon list distributed by Cymru, it would be extremely inappropriate to blame Cymru for what happened: they go to great effort to make everyone aware when changes to their list and to IANA allocations occur, and they provide a BGP feed for those who do not want to rely on updating their routers regularly by hand.

Some others also pointed out in their comments that Cymru makes a "better" bogon list. Such a general comment is also quite inappropriate: neither completewhois nor Cymru makes better or worse bogon lists, nor is there some kind of competition going on. Cymru's list, based entirely on IANA data, is better for use in core network routers because it is small and can prevent some spoofed traffic (more about that later) and lame errors (such as when your customer or peer starts to announce the 10/8 used on their local net). A list based entirely on IANA data, however, leaves a lot of holes for blocks that IANA has allocated to an RIR but that the RIR has not yet allocated to an end user or ISP. These are very large holes that have been exploited by spammers and other miscreants, who announce such unallocated space in BGP and send traffic with no way to identify the responsible party afterwards based only on the IP address. This is where the more detailed completewhois bogon lists (based on both IANA and RIR data) come in; as mentioned, these lists are used in firewalls, directly on Unix servers (as iptables filters) and in anti-spam software. So each type of bogon list is better where the other is not; there is no general better or worse.

One last note for this long post: I want to explain the reasons for installing bogon filters in core routers in the first place, since the explanation and the link to completewhois for more information are not entirely appropriate here (it is a different type of bogon list). The reasons have to do with preventing spoofed IP packets. For those who do not know: when a computer system creates an Internet Protocol packet it puts in source and destination addresses, but generally only the destination is used to actually deliver the packet across the Internet. The source is used when a reply to the packet needs to be sent (which is required for protocols using TCP transport such as HTTP or SMTP, but is not required for ICMP and UDP-based protocols such as TFTP, DHCP or DNS) and to log where the connection appears to have come from. The bad guys use this to launch spoofed DoS attacks, putting in random forged source addresses so as to remain anonymous and make it difficult to identify where the attack is launched from (differing source addresses also make it difficult to filter the attack on a per-flow basis). The theory is that if you randomly generate an IPv4 source address this way, there is about a 50% chance the generated address falls into one of the IANA-unallocated ranges, so network providers have been encouraged to install filters in their routers to get rid of at least these 50% of spoofed packets at the point where they enter the network. This is the typical operations view: get rid of what you can if it seems easy to do. The security guys' perspective is that going halfway is not good enough and that what you really need to do is get rid of all spoofed packets. This is not easy: it requires every ISP and network in the world to filter data leaving their network and make sure all IP traffic leaving it carries only source addresses within their own IP blocks.
However, even though it's not easy, it's the right approach for the future and the one recommended by the IETF in BCP 38 (RFC 2827). It is also the only way to prevent abuse of UDP-based protocols to launch reflected DDoS attacks, which is the latest trend among the bad guys. Adoption of BCP 38 would make putting bogon filters in routers unnecessary (and avoid the associated problems for those getting new IP blocks when some do not follow directions to keep their ACLs updated), and it is also needed to safely deploy DNSSEC, DKIM and other technologies that make use of very large DNS records.
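As a rough sketch of the two ideas in Python: the egress rule that BCP 38 asks for, and where a figure like "about 50%" comes from. The prefixes in OWN_PREFIXES are documentation ranges standing in for a network's own blocks, and the list of 128 unallocated /8s is purely hypothetical, chosen so the arithmetic is easy to follow; neither is real allocation data.

```python
import ipaddress

# Hypothetical prefixes owned by the network doing the filtering.
OWN_PREFIXES = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/24"),
]

def may_leave_network(source):
    """BCP 38 style egress rule: forward only packets whose source is ours."""
    ip = ipaddress.ip_address(source)
    return any(ip in net for net in OWN_PREFIXES)

def fraction_of_ipv4(prefixes):
    """Share of the 2**32 IPv4 addresses covered by non-overlapping prefixes."""
    covered = sum(ipaddress.ip_network(p).num_addresses for p in prefixes)
    return covered / 2**32

# Purely illustrative: pretend 128 of the 256 /8 blocks are unallocated.
hypothetical_unallocated = [f"{n}.0.0.0/8" for n in range(128)]

print(may_leave_network("198.51.100.7"))           # True, source is one of ours
print(may_leave_network("10.1.2.3"))               # False, spoofed or leaked source
print(fraction_of_ipv4(hypothetical_unallocated))  # 0.5
```

A bogon filter only catches the spoofed packets that happen to land in the unallocated share; the egress rule catches every forged source, but only if the network where the packet originates applies it.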

You sir are simply WRONG... (none / 1) (#65)
by lamppter on Wed Jul 12, 2006 at 04:05:31 AM EST

period. It happened to your precious list. Why it happened I am not able to tell you because I was not the one that made the mistake in the first place.

I am sure that it was human error that caused it to happen, too. How, I am not sure, as I am not responsible for the list.

This has happened to a couple of other people as well, as you would know had you taken as much time to read the comments as you did to prepare this one.

Mistakes happen and one did this time and others had posted their experiences here that were the same thing.

We still own the IP space and to this day it is still unusable because it is still stuck in some routers.

I cannot prove to you that it happened at this point because I still like my job and want to have it for a few more years. The original posts by the tier 1 ISP on their BBS/Wiki is still there although I will not reveal the URL here.

Naive Bayes Classification and K5 Dupes
[ Parent ]

How to verify if there was an error (none / 1) (#66)
by willix on Wed Jul 12, 2006 at 08:59:27 AM EST

Mistakes in the list do not last for a month, and the way you explained what happened points to static filters on ISP routers being responsible. That is not where completewhois lists are used; it is more likely to be a static bogon filter built from IANA data. Most likely, in fact, these were all Cisco routers where people ran autosecure, in which case the router would install a bogon filter automatically, but one based on the IANA data known at the time the router was sold. It was pretty stupid of Cisco to do this and not allow for automatic updates of such filters; they no longer include a bogon filter with their autosecure feature.

You can check yourself which is the case by doing one (or all) of the following:
 1. Best is to check with at least one party that you had to contact to get unblocked: which bogon list did they use, and where did they get it? Even better is to check with multiple parties.
 2. You can do a whois check on the IP block of your upstream ISP. If the block was assigned by the RIR around March (or whenever you claim the problem was "corrected"), that means that prior to that the block would have been in the bogon list, and quite appropriately so. If this is a block from which the RIR had just started making allocations (i.e. one of the last blocks assigned by IANA to an RIR, per the IANA website), then you can find or at least estimate that date as well.
 3. You can check the completewhois bogon data archives for when you think the block was not listed prior to the period you specified. You'd need to dig deeper and check both the full bogons-cidr-all file and the file for the specific /8 block. If the /8 block file shows the entire /8 as being bogon but it was not carried into bogons-cidr-all, then the post-processing script filter got rid of it.
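The comparison in step 3 can be sketched in a few lines of Python. The file format assumed here (one CIDR per line, with '#' starting a comment) is an assumption for illustration, not a description of the actual archive files, and the sample contents and the 89.16.0.0/20 block are invented.

```python
import ipaddress

def parse_cidr_lines(text):
    """Parse one-CIDR-per-line text, ignoring blanks and '#' comments (assumed format)."""
    nets = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            nets.append(ipaddress.ip_network(line))
    return nets

def listed_entries(block, nets):
    """Return every list entry that overlaps the block being investigated."""
    target = ipaddress.ip_network(block)
    return [n for n in nets if n.overlaps(target)]

# Invented sample data standing in for two archived files from the same day.
full_list_text = "# sample full list\n10.0.0.0/8\n172.16.0.0/12\n"
per_slash8_text = "89.0.0.0/8\n"

block = "89.16.0.0/20"  # made-up block under investigation
print(listed_entries(block, parse_cidr_lines(full_list_text)))   # no entries
print(listed_entries(block, parse_cidr_lines(per_slash8_text)))  # the whole /8
```

With this sample data the block shows up in the per-/8 file but not in the full list, which is the case where the post-processing filter dropped the /8 before publication.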

[ Parent ]

I have been there, done that numerous times and (none / 1) (#67)
by lamppter on Wed Jul 12, 2006 at 09:45:34 AM EST

I am no idiot, the ip address space is there in the archives during the period I describe, plain and simple. I didn't make this story up dude. It happens even if rarely. But if you do not choose to believe it, that's your problem not mine.

I have been an IT professional long enough (since 1989) to know these issues, for whatever reasons, happen from time to time. I and several others saw them on the listing, the tier 1 provider admitted to it being there finally! I detail this in the article.

For you to tell me it isn't so, that I am (in so many words) full of shit and the story is inaccurate, does NOT make it so (OH MY!). I had no idea what a bogons listing was until this event happened, I never used them and don't now.

It's over and done with anyway and happened several years ago. I don't go around changing backbone providers as a habit.

Naive Bayes Classification and K5 Dupes
[ Parent ]

Inaccurate does not mean most of the story is bad (none / 1) (#68)
by willix on Wed Jul 12, 2006 at 12:02:43 PM EST

I did not say you were "full of shit" - that would have been very inappropriate to say the least. In fact I don't doubt that bogons were responsible for your problems with that "new ip block", nor is the problem you describe such a rare event - it happens every time a new /8 ip block begins to be used and there are a whole bunch of routers still using an old static list (it is just not the completewhois list). Each year it happens there is chatter about it on nanog and other operations forums; you're not alone and your story is a good description of such an event.

What I did say is that your story was not accurate. For example, you say that completewhois lists are used in router ACLs (that is directly in your text) - that is just not so; anybody working with a router will tell you that the list is too large to be used in an ACL. You also said that completewhois listed this ip block as a bogon for a month and that you can verify it in the archive. I agreed with you that this is quite possibly the case, but you may well be misinterpreting it as an incorrect listing, and I explained how you can check on that. Also, previously you said this all happened recently and that incorrect data was listed in February [this year, 2006, I assumed] but several months later many routers were still using the old list - is that so, or did the events happen a few years back as you now say in the comments?

I do want to note that when you start to point a finger at a particular party in public you should have data (to show in public) that can verify what you said. You basically refused to provide this data, and in fact you said you're using hearsay data yourself. That is inappropriate for a public story, and if you cannot share data in public, then while the story can tell in general terms what the problem was (i.e. bogons and router filters that were not updated, as your story does say), you should not be pointing to any one party as the cause.

Also regarding "tier1 ISP", I can tell you that their clue decreases with size. Actually, to be more accurate, they still have very clueful people working there, but their initiative can no longer be easily felt due to the sheer size of the company and the associated bureaucracy. The larger the isp, the less likely it is that a customer would ever be able to get in touch with a truly clueful network guru (and not some L2 support person), and that such a clueful person would post in some wiki, even an internal one. So if you're basing it on what they say, you should question its validity (they do need somebody to blame) and at the very least verify it before using it in a public story.

In any case if you want to take it private - fine, email me with an actual ip block and particular timeframe to check and I'll tell if it was listed as bogon and at what time and why (do note that unlike commercial organizations, completewhois makes its entire archive for each day from when bogon project started in 2003 available in public and in fact makes it available in such a way that web engines and caches like google could see and preserve it so that completewhois could not change it later and say it did not happen).

[ Parent ]

heh. i was expecting (none / 1) (#70)
by kpaul on Fri Jul 14, 2006 at 09:17:47 AM EST

vogons to show up and start reading poetry. ;)

good read, though. just enough 'technical'... good job.


2014 Halloween Costumes

Cymru? (none / 0) (#71)
by The Diary Section on Fri Jul 14, 2006 at 03:56:42 PM EST

Figures, the Welsh ARE usually to blame for most things if you dig deep enough.

Spend 10 minutes in the company of an American and you end up feeling like a Keats or a Shelley: Thin, brilliant, suave, and desperate for industrial-scale quantities of opium.
what the hell are you thinking? (3.00 / 3) (#72)
by signal15 on Sat Jul 15, 2006 at 04:45:35 AM EST

You're an ISP, and you rely upon getting your IP space from your upstream providers?  That's just insane.  If you had gotten your own portable IP space, you wouldn't have to change addresses when you switch providers.  ARIN will hand out something as small as a /20, I know because I've obtained several.

Then you wouldn't have had to go through this big mess, you could have just peered with your new provider and announced your routes.  Would have been seamless.

It still could have been (none / 0) (#74)
by GrandWazoo on Sat Jul 15, 2006 at 10:11:57 AM EST

a big mess if they were given a CIDR block that was incorrectly placed on the bogons listings.

[ Parent ]
Yes! This problem should never have happened! (none / 0) (#75)
by lodc on Sat Jul 15, 2006 at 12:25:32 PM EST

Signal15, you are exactly correct. The whole time I was reading this article, a voice in my head is screaming "What the hell kind of "ISP" doesn't have their own (portable) IP space?"

Why would you take address space from your upstream network?  And why the hell would you have only one upstream provider... oh wait I know... because your network is designed so completely wrong that you CAN'T have multiple providers!!!  What crap!

Whoever is at the wheel there should be fired immediately.   All of this would have been avoided if the tech staff there knew what they were doing.

Bottom line:  This happened because the ISP's network was not implemented correctly from the beginning.  13 large customers and 160,000 people were inconvenienced because of the ignorance of the people running the ISP the author works for.  The transition SHOULD and COULD have been completely transparent to the customers.

Blaming the bogon list or their provider is weak, really weak.

-LodC

[ Parent ]

He didn't blame them (none / 0) (#77)
by curien on Sun Jul 16, 2006 at 05:46:19 AM EST

He blamed lazy router admins who block IP ranges when they show up on the listing but then don't unblock them when they come off.

--
I'm directly under the Earth's sun ... ... now!
[ Parent ]
Excellent article but you should have (3.00 / 4) (#73)
by GrandWazoo on Sat Jul 15, 2006 at 10:09:54 AM EST

picked up portables. Although, you could have had the same problem.

The problem is ISP apathy (none / 1) (#79)
by sarahlanephotography on Wed Nov 29, 2006 at 03:25:22 AM EST

Seriously!

Let Bogons Be Bogons: A Nightmare from ISP Hell | 78 comments (63 topical, 15 editorial, 0 hidden)