This section is still being developed as usage of our cluster evolves. So far we have tended to write our own message-passing routines to communicate between processes on different machines.
Many applications, particularly in computational genomics, are massively and trivially parallelisable: the tasks are independent of one another, so an even distribution can be achieved simply by spreading them equally across the machines. For example, when analysing a whole genome using a technique that operates on a single gene/protein, each processor can work on one gene/protein at a time, independently of all the other processors.
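As a rough illustration of this kind of even split, here is a minimal sketch in Python. It is not our actual code: the function name partition_tasks, the gene identifiers, and the worker count are all hypothetical, and it assumes every task costs about the same and every machine is equally fast.

```python
def partition_tasks(tasks, n_workers):
    """Deal tasks out round-robin so share sizes differ by at most one."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    # Worker i receives tasks i, i + n_workers, i + 2 * n_workers, ...
    return [tasks[i::n_workers] for i in range(n_workers)]


if __name__ == "__main__":
    # Hypothetical gene identifiers; in practice these would come from a genome file.
    genes = [f"gene_{i:04d}" for i in range(10)]
    for rank, share in enumerate(partition_tasks(genes, 4)):
        # Each worker processes its share with no communication with the others.
        print(f"worker {rank}: {share}")
```

Because no task depends on another, the only coordination needed is agreeing on who gets which share up front. On a cluster of mixed processor speeds, a fixed split like this would leave the faster machines idle at the end, so handing out one task at a time may balance better.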
So far we have not found the need to use a professional queueing system, but obviously that is highly dependent on the type of applications you wish to run.
For the single most important program we run (our ab initio protein folding simulation program), using the Pentium 3 1 GHz processor machine as a frame of reference, on average:
Xeon 1.7 GHz processor is about 22% slower
Athlon 1.2 GHz processor is about 36% faster
Athlon 1.5 GHz processor is about 50% faster
Athlon 1.7 GHz processor is about 63% faster
Xeon 2.4 GHz processor is about 45% faster
Xeon 2.7 GHz processor is about 80% faster
Opteron 1.4 GHz processor is about 70% faster
Opteron 1.6 GHz processor is about 88% faster
Yes, the Athlon 1.5 GHz is faster than the Xeon 1.7 GHz, since the Xeon executes only six instructions per clock (IPC) whereas the Athlon executes nine IPC (you do the math!). This is, however, a highly non-rigorous comparison, since each executable was compiled on the machine it ran on (so the quality of the math libraries, for example, will have an impact) and the supporting hardware is different.
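To make the clock-speed point concrete, the figures above can be normalised by clock rate. The sketch below is only a back-of-the-envelope calculation under one assumption: "N% faster" is read as a relative speed of 1 + N/100 and "N% slower" as 1 - N/100, with the Pentium 3 1 GHz as 1.00. The helper name speed_per_ghz is ours for illustration.

```python
# (clock in GHz, relative speed with Pentium 3 1 GHz = 1.00)
# Assumption: "N% faster" -> 1 + N/100 and "N% slower" -> 1 - N/100.
benchmarks = {
    "Pentium 3 1 GHz": (1.0, 1.00),
    "Xeon 1.7 GHz": (1.7, 0.78),
    "Athlon 1.2 GHz": (1.2, 1.36),
    "Athlon 1.5 GHz": (1.5, 1.50),
    "Athlon 1.7 GHz": (1.7, 1.63),
    "Xeon 2.4 GHz": (2.4, 1.45),
    "Xeon 2.7 GHz": (2.7, 1.80),
    "Opteron 1.4 GHz": (1.4, 1.70),
    "Opteron 1.6 GHz": (1.6, 1.88),
}


def speed_per_ghz(clock_ghz, relative_speed):
    """Relative speed delivered per GHz of clock rate."""
    return relative_speed / clock_ghz


if __name__ == "__main__":
    for name, (clock, speed) in benchmarks.items():
        print(f"{name:16s} speed {speed:.2f}  per GHz {speed_per_ghz(clock, speed):.2f}")
```

Under that reading, the Athlons and Opterons deliver roughly as much work per GHz as the Pentium 3 or more, while the Xeons deliver noticeably less, which is why clock rate alone is a poor guide for this program.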
These machines are incredibly stable in terms of both hardware and software once they have been debugged (usually a few machines in each new batch have hardware problems), and they run constantly under very heavy loads. Reboots have generally occurred only when a circuit breaker has tripped. A typical uptime report is given below.
2:29pm up 495 days, 1:04, 2 users, load average: 4.85, 7.15, 7.72