Thursday, February 26, 2009

Cluster still troubling

It turns out the cluster frontend was having memory errors. Reseating the DIMMs (Crucial Ballistix) seemed to help, as the machine then passed Memtest86, but it failed again when put back in service. I tried to swap the disk into one of the cluster nodes to make that node the frontend, but bloody Linux thwarted me completely, mainly because the install referred to its partitions by disk label rather than by partition number, and somehow those labels could not be found in the new machine. The disk wasn't corrupt, but the kernel panicked on boot each time, no matter what I did.

Eventually, I had to reformat, reinstall ROCKS, restore the users, and put the new frontend in service. It seems to be working now, though ROCKS is misbehaving a bit. The scripts that normally handle all the user-adding tasks aren't adding the auto.home entries, which is a surprise. Anyway, I can use emacs and fix it myself. Bloody computers.