Sunday, January 11, 2009

CUDA Cluster Computing

I'm in the process of putting together a new cluster to support both parallel computing education and research. I have an old, decrepit AMD Athlon cluster that has mostly failed (power supplies, fans, and disk drives are the culprits), so I will be replacing some of those nodes with fancy new ones. This is needed because UCI no longer has a student-accessible cluster for coursework. The graduate students used to have access to a fairly nice cluster as their email machine (they didn't realize it could run MPI and had 44 Xeons driving it). That's gone and has not been replaced.

The course I am teaching this quarter is primarily on parallel computer architecture, but in my opinion, the best way to understand a system's behavior is to use it, so I believe in having the students do parallel programming assignments. I will teach them OpenMP for cache-coherent shared memory architectures and MPI for distributed memory architectures. I will also spend a fair bit of time on CUDA, because of its performance and accessibility.
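To give a flavor of the CUDA programming model, here is a minimal vector-add sketch of the sort that shows up in most introductions (an illustrative example, not one of my actual assignments): each thread computes one element, and the kernel launch takes the place of the loop.

    #include <cstdio>
    #include <cuda_runtime.h>

    // Each thread handles one element; the grid of threads replaces the loop.
    __global__ void vecAdd(const float *a, const float *b, float *c, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            c[i] = a[i] + b[i];
    }

    int main()
    {
        const int n = 1 << 20;
        size_t bytes = n * sizeof(float);

        // Allocate and initialize host arrays.
        float *ha = (float *)malloc(bytes);
        float *hb = (float *)malloc(bytes);
        float *hc = (float *)malloc(bytes);
        for (int i = 0; i < n; ++i) { ha[i] = (float)i; hb[i] = 2.0f * i; }

        // Allocate device arrays and copy the inputs over.
        float *da, *db, *dc;
        cudaMalloc((void **)&da, bytes);
        cudaMalloc((void **)&db, bytes);
        cudaMalloc((void **)&dc, bytes);
        cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

        // Launch enough 256-thread blocks to cover all n elements.
        int threads = 256;
        int blocks = (n + threads - 1) / threads;
        vecAdd<<<blocks, threads>>>(da, db, dc, n);

        // Copy the result back and spot-check one element.
        cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost);
        printf("c[100] = %f (expected %f)\n", hc[100], ha[100] + hb[100]);

        cudaFree(da); cudaFree(db); cudaFree(dc);
        free(ha); free(hb); free(hc);
        return 0;
    }

Compiled with nvcc, this runs on any CUDA-capable card, including the Tesla C1060s; the interesting discussion with students starts once they measure where the time actually goes (hint: the memory copies).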

I received some very generous support for my new cluster from NVIDIA. Because they want to make sure my grad students have the best possible CUDA experience, NVIDIA provided six Tesla C1060s and six 790i motherboards. This will make for a really nice setup.

Unfortunately, my old power supplies, RAM, and CPUs won't work with this new equipment, so I need to supply new ones. I had hoped to get Intel to donate CPUs, so I asked our industry relations folks to help make contact with Intel. However, word got out that I will be teaching CUDA, and the Intel contact refused to help, saying that Larrabee is way better. Of course, Larrabee isn't available yet, so I can't exactly assign projects with it, and the OpenMP and MPI assignments would have made the students familiar with technologies useful on Intel processors, possibly including Larrabee. Darn.

Anyway, I've now bought components for one system and will borrow a machine from home as the cluster front end, so I will post updates as the system comes together. I plan to put ROCKS on it, having been a long-time ROCKS user and fan.