Neuron Basher
Apr 23 2005, 04:44 AM
As some of you no doubt noticed, there was an outage this evening. This is the second time we've had an outage with the same symptoms: everything other than IRC simply stops responding and we're unable to connect to the server to see what's going on. I've implemented a change tonight that I hope will let me log in to the server if this happens again, but I'm not at all certain it will work, since I don't yet know what's going on.
It's one of those frustrating cases where there are no incriminating (or even interesting) log messages to point me toward a possible problem, and the fact that I'm a 45-minute drive away from the physical machine keeps me from checking things out from the console. I'm hopeful that if the problem happens again I will either be able to connect to the machine through my workaround or will have time to make the pilgrimage to the colocation facility. Whether that works out is just a matter of luck on the timing, but we'll see how it goes.
Just thought I'd let folks know what's going on while we try to iron out the bugs with the new server.
Nikoli
Apr 23 2005, 05:44 AM
Does the facility management offer a remote-in capable KVM setup?
Neuron Basher
Apr 23 2005, 12:24 PM
Nope, the most I could get out of them would be hooking up a monitor and reading what's on the screen to me or having them power cycle it. It's hard to be demanding when you're getting free hosting, so I can't complain.
Neuron Basher
Apr 27 2005, 01:05 PM
UPDATE: Bad things happened again last night -- I'm still just as baffled as before. This morning I compiled and installed the latest Linux 2.6 series kernel in hopes that it will solve the problem. Keep your fingers crossed.
Nikoli
Apr 27 2005, 05:55 PM
Fair enough.
Sucks though; we use a similar system here for our servers. It's nice to have hardware-level access via remote. Next best thing to an anthroform drone there for your needs.
hahnsoo
Apr 27 2005, 06:01 PM
QUOTE (Nikoli)
Next best thing to an anthroform drone there for your needs.
*blink* Ewww.
Nikoli
Apr 27 2005, 06:41 PM
not those needs...
Neuron Basher
Apr 27 2005, 07:37 PM
Yeah, it would certainly be nice not having to rely on the NOC staff at the colo facility to power cycle the machine when it goes senile on me. Still, I've done enough system administration over the last dozen years that it's not a major headache -- if it weren't for other, more pressing matters at my day job, I'd have just driven out and hooked a console up to it, but I've been busy. We'll see what happens.
Neuron Basher
Apr 30 2005, 04:39 AM
By way of a quick update, things seem to be working much better with the kernel I compiled myself. I should have known better than to try to be lazy and use the stock kernel that shipped with the distribution.
In any event, provided things run smoothly for the rest of the weekend my plan is to resume moving services to the new server next week. Thanks to everyone for their patience.
Neuron Basher
May 2 2005, 09:55 PM
I'm now considering turning the server into an improvised explosive device. Or at least using it for target practice.
Sedna
May 3 2005, 12:43 AM
*** empathy ***
Thank you for all your hard work on this. In passing, I'll mention that when the old equipment was upgraded at the local campus newspaper, the old servers went out the three-storey window. Literally.
[/aka "I'll be good. Just answer me one question: what did the chicken do?"]