K5 status for 27-09-2001

By Inoshiro in Site News
Thu Sep 27, 2001 at 07:49:49 PM EST
K5 had some problems today. Out of memory, kernel upgrades, and MySQL all conspired to keep us from you :)

I suspect the mod_perl server wedged itself by using up all the memory sometime this morning (hurstdog is working on a perl script I designed to let me track system stats so we'll see these sooner). To handle things better, I turned down a few settings so K5 would not use up all its memory so quickly.
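A stats tracker of the kind mentioned above could start out as something like the following. This is only a rough sketch in Python, not hurstdog's actual perl script: the function names are made up for illustration, and it assumes the `Name: value kB` layout of /proc/meminfo.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    stats = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            stats[name.strip()] = int(fields[0])
    return stats


def memory_pressure(stats):
    """Return (ram_fraction_used, swap_fraction_used).

    Buffers and page cache are counted as free, since the kernel
    can reclaim them when applications need the memory.
    """
    total = stats.get("MemTotal", 0)
    reclaimable = (stats.get("MemFree", 0) + stats.get("Buffers", 0)
                   + stats.get("Cached", 0))
    swap_total = stats.get("SwapTotal", 0)
    swap_used = swap_total - stats.get("SwapFree", 0)
    ram_fraction = (total - reclaimable) / total if total else 0.0
    swap_fraction = swap_used / swap_total if swap_total else 0.0
    return ram_fraction, swap_fraction


if __name__ == "__main__":
    # Hypothetical usage: print one sample; a cron job could log this.
    with open("/proc/meminfo") as f:
        ram, swap = memory_pressure(parse_meminfo(f.read()))
    print(f"ram used: {ram:.0%}  swap used: {swap:.0%}")
```

Logging these two numbers every minute or so would show memory climbing well before the box wedges.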

With fewer mod_perl servers behind the caching proxy, K5 will hopefully just get slower under load, rather than grind to a halt.

Also in the config update: a kernel upgrade (to 2.4.10). We'll see if 2.4.10 wedges under load in EEPro100, but it seems stable. I had compiled it with the wrong RAID controller turned on, so it was hung at a panic for a while until the people at the colo facility could tell me what the console said and reboot it into a safe kernel.

The previous kernel was 2.2.19 with a few patches. 2.4.10 has already given us a bit of a boost, since its SMP support is much better than in 2.2.19 (more finely grained spinlocks, and threading of some subsystems).

Once the new kernel came up, we had to deal with a corrupt MySQL table (which led to another 10-20 minutes of downtime this afternoon). Thankfully hurstdog was feeding me the syntax of the repair table command while I hurriedly prepared for a doctor's appointment I was late for. The fun never stops here!

Everything is working fine now.


K5 status for 27-09-2001 | 11 comments (11 topical, 0 editorial, 0 hidden)
Swapd, never worry about out of memory conditions (3.00 / 1) (#1)
by QuoteMstr on Thu Sep 27, 2001 at 07:55:30 PM EST

Adds swap as needed. http://www.linux.org/apps/AppId_6506.html

The problem there (none / 0) (#3)
by Inoshiro on Thu Sep 27, 2001 at 09:16:02 PM EST

Is that, as the swap is used up, the system slows down. As more and more backed-up MySQLs and mod_perl servers load up, memory usage is no longer a sustainable line. Because of the latency, you see a spike in the curve of memory usage. Whether you have 2 MB, 2 GB, or 2 TB of swap, your system will either go OOM or end up so slow and lagged that it might as well be OOM.

[ イノシロ ]
[ Parent ]
Minor glitch (3.00 / 1) (#2)
by wiredog on Thu Sep 27, 2001 at 08:26:36 PM EST

The "Site News" section shows up on the front page, but this article doesn't.

Is "Site News" new or have I just not seen it before? And why not meta? (Not bitching, just curious.)

If there's a choice between performance and ease of use, Linux will go for performance every time. -- Jerry Pournelle

because... (4.00 / 1) (#5)
by hurstdog on Thu Sep 27, 2001 at 09:56:20 PM EST

of an oversight by me. Driph asked me to put the Site News section link just above technology, on the left, under "All Stories". Well, I misunderstood: I thought he wanted it just like "All Stories", so I didn't make it list the stories there as well. Oops. Well, it's fixed now, and as soon as the cache clears (it refreshes the stories every 100 front page loads) it will show the latest sitenews stories.
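For anyone wondering what "refreshes every 100 front page loads" could look like, here is a rough sketch in Python. It is not Scoop's actual code (Scoop is perl); the class and parameter names are made up, and it only illustrates a cache that rebuilds after a fixed number of reads instead of after a timeout.

```python
class CountedCache:
    """Cache one value and rebuild it after a fixed number of reads."""

    def __init__(self, build, refresh_every=100):
        self._build = build              # callable producing a fresh value
        self._refresh_every = refresh_every
        self._reads = 0                  # reads served since the last build
        self._value = None
        self._filled = False

    def get(self):
        # Rebuild on the first read, or once the current value has been
        # served refresh_every times.
        if not self._filled or self._reads >= self._refresh_every:
            self._value = self._build()
            self._reads = 0
            self._filled = True
        self._reads += 1
        return self._value
```

With a scheme like this, a change to the story list only shows up once the read counter rolls over, which would explain the short delay mentioned above.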

Site News uses some new stuff that I put in a while back, so it won't post to the queue :) This is how it works: there are permissions to post/read comments and stories in each section (you can access them via the section editor). The permissions are allow, deny, or hide. All sections but sitenews are allow. If they are deny, Scoop says "You don't have permission to do <action> in this section". If they are hide, scoop pretends the section doesn't exist.

For posting stories, there are 2 extra permissions, "Auto-post to section" and "Auto-post to Front Page". Editors, Admins, and Superusers on Kuro5hin have Auto-post to Section privileges for the SiteNews section.
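The allow/deny/hide rules and the auto-post permissions could be modelled roughly as below. This is a sketch in Python under stated assumptions, not Scoop's real implementation: the data layout, the function names, the flag names, and the "No such section" wording are all hypothetical. Only the three settings and the deny message come from the description above.

```python
ALLOW, DENY, HIDE = "allow", "deny", "hide"


def check_permission(section_perms, group, section, action):
    """Return (ok, message) for one group acting in one section.

    section_perms maps (group, section, action) -> ALLOW, DENY or HIDE.
    Missing entries default to ALLOW (an assumption of this sketch).
    """
    setting = section_perms.get((group, section, action), ALLOW)
    if setting == HIDE:
        # Pretend the section does not exist; this wording is made up.
        return False, "No such section"
    if setting == DENY:
        return False, f"You don't have permission to do {action} in this section"
    return True, ""


def story_destination(autopost, group, section):
    """Decide where a new story lands: 'front page', 'section' or 'queue'.

    autopost maps (group, section) -> set of flags; the flag names
    'front_page' and 'section' are hypothetical.
    """
    flags = autopost.get((group, section), set())
    if "front_page" in flags:
        return "front page"
    if "section" in flags:
        return "section"
    return "queue"
```

Under this model, giving Editors the `section` flag for sitenews sends their stories straight to the section, while everyone else's submissions elsewhere still go to the queue.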

Site News was set up so that you don't have to read rusty's diary to see what's happening on kuro5hin anymore :) We'll post stories as we learn about what's happening, and they will be easy to find.

[ Parent ]

Abuse of power!!! (4.00 / 4) (#4)
by jabber on Thu Sep 27, 2001 at 09:45:13 PM EST

Hey!! I don't remember voting on this story!!!

(Just kidding Rusty - I missed K5 today. Glad all's well)

[TINK5C] |"Is K5 my kapusta intellectual teddy bear?"| "Yes"

Maybe it rushed through? (none / 0) (#7)
by Nick Ives on Fri Sep 28, 2001 at 03:38:04 PM EST

Stories like this tend to go through pretty fast. Maybe it zipped through with an editorial notice stating to put it in the new section if poss?

Now, what I would have liked would have been a notice about the "site news" section. When did that appear.....?

I seem to recall Rusty saying that even for misc site news crap they would use the submission queue. It's just that stuff like this tends to be in agreement with most people....

[ Parent ]

Well (none / 0) (#8)
by Inoshiro on Fri Sep 28, 2001 at 06:30:09 PM EST

This is more like the diaries, except for the site admins. We post directly to this one section about all site important news (such as why there was an outage). Hurstdog added it, and I think it's a good idea :)

Just think of it as the diary of the server.

[ イノシロ ]
[ Parent ]
Suggestion (4.66 / 3) (#6)
by slaytanic killer on Fri Sep 28, 2001 at 06:45:51 AM EST

I noticed scoop.kuro5hin.org was up (downloaded the source tarball too, which makes me think twice about my prejudices against perl), and wonder if there can be a page like status.kuro5hin.org where people can go if they wonder what's up. It's really disturbing when you try to get your fix, but the reload button's broke.

Only now can I unfilter irc... Just a sign like "Come back in 20 hours" would suffice.

I second this (none / 0) (#9)
by sigwinch on Sat Sep 29, 2001 at 05:49:27 AM EST

Just a basic status page, no promises, accuracy optional. It makes us junkies nervous if the dealer won't even talk about when the next hit will be available...

I don't want the world, I just want your half.
[ Parent ]

You see... (none / 0) (#10)
by acb on Thu Oct 04, 2001 at 11:03:30 AM EST

You should be running something like Open UNIX or Solaris/Intel for that enterprise level performance!
--- acb #kuro5hin
2.4.10 isn't very stable! (none / 0) (#11)
by nYxxie on Fri Oct 05, 2001 at 09:30:56 AM EST

AFAIK, 2.4.10 is quite unstable due to changes in VM support... if there are any problems, I recommend downgrading to 2.4.9 or applying Alan Cox's patch for 2.4.10:

- Merge with Linux 2.4.10 tree
- Drop VM changes
- Drop raw/block I/O changes
- Drop out O_DIRECT
- Basically remove the seriously unsafe stuff and keep the -ac VM
- I've not applied the obvious fixes so ACPI and joysticks are still icky - that is for ac2
- Fix the noncompile of SMP OOSTORE kernels
