Neuron Basher
Jun 28 2005, 04:50 PM
I'm going to use this thread for a running update of how things are going with the reinstalled server. Feel free to post comments of your own throughout, and definitely let me know if anything is not working correctly.
As of now, the following is running on Nexus:
1. Dumpshock Forums (web front-end only, DB on old server)
Neuron Basher
Jun 28 2005, 04:51 PM
Sorry for the brief outage a few moments ago. I just realized that the server was running a uniprocessor kernel, so I had to reboot with the SMP one to take advantage of both processors.
Fortune
Jun 29 2005, 06:20 AM
I assume you are aware of the time synch problem.
Neuron Basher
Jun 30 2005, 04:12 PM
QUOTE (Fortune)
I assume you are aware of the time synch problem.
Yes, that should hopefully be resolved now.
Additionally, I have moved the live database over to Nexus and the forums are completely running on the new server again. I'm going to give it a few days of this before I start the mass migration of the rest of the services again, but I'm encouraged so far by the stability.
Neuron Basher
Jun 30 2005, 10:20 PM
Spoke too soon. Server went boom, DB moved back to old server. I'm pulling my hair out again trying to figure out what the hell is going on with this piece of junk. It's rather likely I'll move the web front end back to the old server shortly and retrieve the server from the colocation facility yet again. I'm more or less convinced that it's a hardware issue at this point and I'll get Dell involved as soon as I can.
Neuron Basher
Jun 30 2005, 10:24 PM
Update: I changed the DNS entries to start draining traffic from the new server over to the old.
Fortune
Jul 1 2005, 06:48 AM
That's one of the best things about Dell. If it does turn out to be a hardware problem, they usually give great customer support. Hell, I even called them when I had a software hassle and they spent a bunch of time talking me through everything.
Nikoli
Jul 1 2005, 05:29 PM
Well, give 'em hell. Remember, some of the more advanced servers respond to threats of physical violence, so save some of the illegal fireworks just in case.
Sedna
Jul 3 2005, 11:26 PM
Being absolutely useless in terms of tangible suggestions, I'm just trying to stay out of the way and give you as little grief as possible. (I suspect those two servers are doing enough of that for the lot of us.) All I can offer is moral support and empathy ... not least because I've been pulling my own hair out, trying to keep LitS flowing smoothly while having to log back in five or six times per post.
Ecclesiastes
Jul 14 2005, 06:48 PM
Any word on how things are going with Nexus?
Neuron Basher
Jul 15 2005, 04:56 AM
Sure, here's an update:
After running fine for 3 days with just the webserver, the machine hard locked when I enabled the database on it. At this point it seems clear that the problem is disk I/O related, and there are a few things it could possibly be.
At this point, I have the following plan:
1. Visit the colocation facility with my external DVD-RW and a Linux install disc.
2. Reinstall the OS, splitting the mirrored drive configuration we have been running and using the disks individually. I plan to run them separately and do scheduled backups from the "live" drive to the backup drive.
3. Install Xen virtualization software, and run all real services in virtual machines.
4. Hopefully everything runs smoothly and we all live happily ever after.
5. In the rather more likely event that things go to hell again, I'm hopeful that the trip to the netherworld will at least result in some error messages this time -- at the very least, I hope that the entire machine does not become hard locked and that the problem ends up affecting only the virtual machine. I may be able to get some useful debugging information then.
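For anyone curious what the scheduled backup in step 2 might look like, here is a minimal sketch in Python. It is only an illustration, not the script actually used on Nexus: the function name and the directory arguments are hypothetical, and it assumes Python 3.8 or later for `dirs_exist_ok`.

```python
import shutil
from pathlib import Path


def backup_live_to_backup(live_dir: str, backup_dir: str) -> int:
    """Copy everything under live_dir into backup_dir.

    Returns the number of files present in backup_dir afterwards.
    Both paths are hypothetical placeholders for the "live" drive
    and the backup drive mount points.
    """
    src = Path(live_dir)
    dst = Path(backup_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"live directory not found: {src}")

    # Copy the whole tree, overwriting files that already exist in the
    # backup. Files deleted from the live drive are NOT removed here.
    shutil.copytree(src, dst, dirs_exist_ok=True)

    return sum(1 for p in dst.rglob("*") if p.is_file())
```

A scheduler such as cron could call this on whatever interval suits. Because it never deletes anything from the backup drive, the backup can accumulate stale files over time; a real setup would probably want a tool that handles that.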
That's the plan right now. I was hoping to make it out to the colocation facility tomorrow to do this, but I'm rather doubtful that that will happen. I am behind on a project for work and will basically have to devote half a day to the colocation visit, at the very least (it's a 45 minute drive each way).
I'll post again when I have more information.
Nikoli
Jul 15 2005, 12:09 PM
Ouch. Best of luck, time to break out the good dice for this computer b/r check.
Neuron Basher
Jul 15 2005, 07:22 PM
Update: I squeezed in 4.5 hours today to go out to the colocation facility and do the install there, instead of bringing the server home and doing it like I have in the past. It's a 45 minute drive each way, so it makes more sense to do the install on the spot if I can find the time. I didn't really have time, but things don't look to improve in my schedule soon, so I just forced it a little.
Anyway, the machine is running and I'm going to work on configuration this weekend. Hopefully we can give it another go soon. Stay tuned.