pHPC Hardware
The need for HPC within the organization for research and operational purposes has expanded at a breathtaking rate. Power, speed, and capacity, as well as storage and I/O, are in high demand. To meet these needs, the pHPC clusters now include over 600 processing cores totalling 5 TFLOPS of computing power, 1.3 TB of memory, and 32 TB of extremely high-performance storage. As one of the largest computing facilities in the Boston area, the clusters have completed nearly 4 million jobs in the last 2 years.
Main systems
The rcclu.partners.org cluster
This system is composed of 88 of the latest c-Class blade servers from HP. Each node has either two dual-core or two quad-core Xeon processors, for a total capacity of 448 simultaneous processes.
Each node has a SAS drive of about 72 GB and 8 GB to 16 GB of memory. The network is gigabit Ethernet. In addition to the increased CPU density and speed, this cluster has been designed to relieve the I/O limitations apparent in any HPC cluster through a very high-performance Storage Array Clustered Gateway with an initial raw capacity of 34 TB. This clustered gateway serves data through HP's proprietary Cluster Gateway File Servers, allowing for a high-throughput and scalable I/O solution.
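The node and process counts above are enough to estimate how the 88 blades split between the two processor configurations. The following is a minimal sketch, assuming every node has two sockets (so 4 or 8 cores per node) and that the 448 simultaneous processes correspond to one process per core. The function name `node_mix` is hypothetical and used only for illustration.

```python
def node_mix(total_nodes, total_cores, small_cores=4, large_cores=8):
    """Solve  small + large = total_nodes
              small_cores*small + large_cores*large = total_cores
    for whole-number node counts (small, large)."""
    numerator = total_cores - small_cores * total_nodes
    denominator = large_cores - small_cores
    if denominator <= 0 or numerator < 0 or numerator % denominator:
        raise ValueError("no whole-number node mix matches these totals")
    large = numerator // denominator
    small = total_nodes - large
    if small < 0:
        raise ValueError("no whole-number node mix matches these totals")
    return small, large


# Assumption: 4 cores per dual-core node, 8 cores per quad-core node.
small, large = node_mix(total_nodes=88, total_cores=448)
print(f"{small} nodes with 4 cores, {large} nodes with 8 cores")
```

Under these assumptions the totals are consistent with 64 dual-core nodes and 24 quad-core nodes (64 x 4 + 24 x 8 = 448). The actual split is not stated here, so treat this as an estimate only.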
Management tools consist of a mixture of open-source and proprietary HP applications using LSF for scheduling and distributing jobs.
The hpres.partners.org cluster
The hpres.partners.org cluster (formerly hpres.mgh.harvard.edu) is a 64-node system using HP DL145 servers as computing nodes and HP DL585 servers as the head node and login nodes. Each node has dual 64-bit AMD Opteron processors at 2.1 GHz, a 160 GB disk drive, and 4 GB of memory. The nodes communicate via gigabit Ethernet and are connected via NFS mounts to the same high-performance Storage Array Clustered Gateway shared with the rcclu cluster. Management tools consist of a mixture of open-source and proprietary HP applications, with LSF used for scheduling and distributing jobs.
Two older and smaller systems
The rescluster.mgh.harvard.edu cluster
Rescluster was designed and built entirely by in-house staff, with the first 15 nodes purchased by the Steele Lab at MGH. These compute nodes consist of 15 HP DL145 servers, each with two 64-bit AMD Opteron 2.1 GHz CPUs, a 160 GB hard drive, and 4 GB of memory. LSF is the scheduler, and networked storage is just under 1 TB, with gigabit Ethernet and NFS for communication. This system also uses open-source solutions for cluster management.
The rescluster2.mgh.harvard.edu math cluster
This system is a specialized cluster for Matlab, Biopara, and other programs and applications requiring 32-bit processors. Rescluster2 is a 25-node system composed of Dell servers. Each node has a single 32-bit Intel 3 GHz CPU with 2 GB of memory. Communication is over gigabit Ethernet, with NFS access to approximately 500 GB of storage. Scheduling is via Sun Grid Engine. Biopara is a parallelized version of R developed by users of this system. The cluster was designed and built by in-house Linux support staff and uses entirely open-source cluster management solutions.
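The per-node memory figures quoted for the four clusters can be checked against the 1.3 TB aggregate given in the overview. The following is a rough sketch, assuming the quoted node counts cover compute nodes only (head and login nodes are ignored) and that each rcclu node has between 8 GB and 16 GB. The names `CLUSTERS` and `memory_bounds_gb` are hypothetical.

```python
# (node count, minimum GB per node, maximum GB per node), taken from the
# cluster descriptions above. Head and login nodes are not counted (assumption).
CLUSTERS = {
    "rcclu": (88, 8, 16),
    "hpres": (64, 4, 4),
    "rescluster": (15, 4, 4),
    "rescluster2": (25, 2, 2),
}


def memory_bounds_gb(clusters):
    """Return (lowest, highest) possible aggregate memory in GB."""
    low = sum(nodes * lo for nodes, lo, _ in clusters.values())
    high = sum(nodes * hi for nodes, _, hi in clusters.values())
    return low, high


low, high = memory_bounds_gb(CLUSTERS)
print(f"aggregate memory is between {low} GB and {high} GB")
```

Under these assumptions the bounds are 1070 GB and 1774 GB, so the quoted 1.3 TB total is plausible for a mix of 8 GB and 16 GB nodes in the rcclu cluster.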
New hardware and additional services
We anticipate that this level of performance will provide sufficient HPC resources to meet the current needs of the Partners research community. As our client base increases, we plan to stay ahead of the curve through the dedicated resource offering and by designing and implementing additional offerings, such as large-scale shared-memory systems, Windows-based clusters, and scalable storage and archival services, following the same service model.