Wednesday, January 14, 2009

ROCKS not Rolling with CUDA

I finally have enough parts together to start building the new cluster. I have a frontend machine running a 4-core Phenom and a compute node with an Intel Core2Quad. I went to download ROCKS to put on the system and ran into a couple of problems.

First, the ROCKS FTP site is down (as of this writing), which means I can't download the DVD image. But I can still get the CD images from the HTTP site, so no great loss.

Then, I went looking for the CUDA Roll (Rolls are extensions to ROCKS functionality so ROCKS knows how to install the software on the nodes). Well, the NVIDIA-supported roll is for ROCKS 4.3, which is way out of date. (And apparently NVIDIA must be embarrassed about it, because it is really hard to find the link on the NVIDIA website.) Someone has built a test version for ROCKS 5.0 and put it on Google Code, but even that is out of date.

So now I have to decide between two options: make my own roll (or at least integrate CUDA into ROCKS so it gets installed automatically), or install CUDA by hand on each node and keep ROCKS from reinstalling a node every time it crashes. I've been down the first road before, so I'm leaning towards the second, since I have so few nodes. If anyone has a modern CUDA roll, please let me know.
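For the by-hand route, something like the following could save some typing once there is more than one node. This is only a rough sketch, not how ROCKS itself does anything: the node list, the installer path, and the helper name `build_install_commands` are all made up for illustration, and it assumes passwordless ssh from the frontend to the nodes and an installer sitting on a filesystem the nodes can see. It only prints the commands (a dry run) rather than executing them.

```python
import shlex

# Hypothetical values: adjust to your own cluster.
NODES = ["compute-0-0"]  # assumed node name; the post mentions a single compute node
INSTALLER = "/share/apps/cuda/cuda-installer.run"  # hypothetical path on a shared filesystem


def build_install_commands(nodes, installer):
    """Return one ssh argument list per node that would run the installer there."""
    # Quote the installer path so it survives the remote shell intact.
    remote = "sh " + shlex.quote(installer)
    return [["ssh", node, remote] for node in nodes]


if __name__ == "__main__":
    for argv in build_install_commands(NODES, INSTALLER):
        # Dry run: print each command instead of executing it.
        print(" ".join(shlex.quote(arg) for arg in argv))
```

To actually run the commands, each list could be handed to `subprocess.run(argv, check=True)`, but I would look at the printed output first. Keeping ROCKS from reinstalling a node after a crash is a separate step that this sketch does not cover.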