Resources in HPC Clusters
The HPRES and RCCLU clusters share the same file system and have nearly identical OS configurations, so the applications listed on this page are available on both systems.
Note:
1. HPRES uses AMD64 processors and RCCLU uses Intel Xeon processors. Most applications take the same arguments on both systems, but a few may need different ones.
2. In the Linux environment, users can install additional software into their own home directories without special permissions or privileges. Our staff is happy to assist if you run into difficulties.
News: A 10-node, 40-core Windows 2008 HPC test cluster is currently deployed for Windows users. Please contact yxu11@partners.org for more information about the applications installed on it. More information will be posted on the website later.
__________________________________________________________________________
System, OS, Scheduler and Resource Management
| Names |
Description and References |
| SSH address |
RCCLU cluster: rcclu.partners.org
HPRES cluster: hpres.partners.org
* How to request your account?
* How to login to the cluster
|
| Login Node |
On RCCLU cluster: n136 or n137
On HPRES cluster: n62 or n63
* How to set up your login
|
| Linux OS |
2.6.9-42.9hp.2sp.XCsmp x86_64 GNU/Linux |
| Default Shell |
Bash |
| User Home |
/shr/home/your_username |
| Cluster Software |
HP XC V3.2
* XC User Manual from HP
|
| Scheduler, Queue and Resource Management |
HP LSF-6.2
* Quickstart for job submission
* Complete Platform LSF 6.2 User Guide
|
| User Environment Control |
HP Environment Modules
* How to set up your environment |
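As a rough illustration of how the entries in the table above fit together, the Python sketch below assembles an `ssh` login command for either cluster and a `bsub` submission line for LSF. The helper names (`ssh_command`, `home_directory`, `bsub_command`), the username `jdoe`, the job name, and the output file pattern are hypothetical; only the host names and the home directory layout come from the table. `-n`, `-J`, and `-o` are standard `bsub` options, but queues and limits are site-specific, so treat this as a sketch and check the LSF guide listed above.

```python
import shlex

# SSH addresses and home directory layout as listed in the table above.
SSH_HOSTS = {
    "RCCLU": "rcclu.partners.org",
    "HPRES": "hpres.partners.org",
}
HOME_ROOT = "/shr/home"


def ssh_command(cluster, username):
    """Return the argv list for logging in to the named cluster."""
    host = SSH_HOSTS[cluster.upper()]
    return ["ssh", f"{username}@{host}"]


def home_directory(username):
    """Return the shared home directory path for a user."""
    return f"{HOME_ROOT}/{username}"


def bsub_command(program_argv, cores=1, job_name=None, output_file=None):
    """Return the argv list for an LSF batch submission.

    -n requests job slots, -J names the job, -o sets the output file.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    argv = ["bsub", "-n", str(cores)]
    if job_name:
        argv += ["-J", job_name]
    if output_file:
        argv += ["-o", output_file]
    return argv + list(program_argv)


if __name__ == "__main__":
    # Hypothetical user and job, for illustration only.
    print(shlex.join(ssh_command("rcclu", "jdoe")))
    print(home_directory("jdoe"))
    print(shlex.join(bsub_command(["./my_program", "input.dat"], cores=4,
                                  job_name="demo", output_file="demo.%J.out")))
```

Because both clusters share one file system, the same home directory path applies whichever SSH address is used.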
__________________________________________________________________________
General and High Performance Compilers, Math Libraries, and Modules on the Cluster(s)
Compilers
| Names |
Path |
Description and References |
| GNU gcc version 3.4.6 (gcc, g++, g77) |
/usr/bin |
Default system GNU compiler |
| GNU gcc version 4.2 (gcc, g++, gfortran) |
/source/gnu_4.2/bin |
GNU Compiler (with openMP support)
* How to launch a multi-threaded OpenMP job
|
| Intel (C/C++/Fortran) compilers and the debugging tool |
/source/intel/cce
/source/intel/fce
/source/intel/idbe
|
Intel 10.1 Compiler (Recommended)
* Quick example of how to use the Intel compiler
* Complete Intel 10.1 C compiler user guide (PHS network)
|
| Intel Threading Building Blocks (TBB) |
/source/intel/tbb |
Intel TBB 2.0
* Intel TBB user guide (PHS network) |
| HP MPI 2.01.00-08 (mpicc, mpiCC, mpif77, mpif90) |
/opt/hpmpi/bin |
HP MPI C/C++ compiler, fully compatible with MPICH 1.2 (recommended for MPI jobs)
* How to use MPI on the cluster |
| MPICH 1.2 |
/source/mpich |
MPICH Compiler |
| MPICH 2 |
/source/mpich2 |
MPICH2 Compiler |
| JDK5 |
/source/jdk1.5.0 |
Java Compiler |
| JDK6 |
/source/jdk1.6.0 |
Java Compiler |
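For the OpenMP-capable GNU 4.2 compilers listed above, a minimal sketch of how a compile command and thread count might be prepared from Python follows. The helper names and the file names `hello.c` and `hello` are hypothetical, and the sketch assumes the compiler binaries sit directly under the listed path. `-fopenmp` is the GCC flag that enables OpenMP, and `OMP_NUM_THREADS` is the standard OpenMP environment variable for the thread count.

```python
import os

# Location of the OpenMP-capable GNU 4.2 compilers, from the table above.
GNU42_BIN = "/source/gnu_4.2/bin"


def openmp_compile_command(source, output, compiler="gcc"):
    """Return the argv list that compiles `source` with OpenMP enabled."""
    if compiler not in ("gcc", "g++", "gfortran"):
        raise ValueError(f"unsupported compiler: {compiler}")
    return [f"{GNU42_BIN}/{compiler}", "-fopenmp", source, "-o", output]


def openmp_environment(threads, base_env=None):
    """Return a copy of the environment with OMP_NUM_THREADS set."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = str(threads)
    return env


if __name__ == "__main__":
    # Hypothetical source and output names, for illustration only.
    print(" ".join(openmp_compile_command("hello.c", "hello")))
    print(openmp_environment(4)["OMP_NUM_THREADS"])
```

On the cluster, the returned list and environment could be passed to `subprocess.run(cmd, env=env, check=True)`; the sketch only builds them so that it can be inspected anywhere.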
Extra Math Libraries
| Names |
Path |
Description and References |
| Numeric C |
/source/numeric_c |
From the Numeric C publisher; includes a large collection of source code for mathematical simulation and modeling
* External resource Numerical Recipes in C
|
| The GNU Scientific Library |
/source/gsl |
Mathematical libraries
* How to use GSL on the cluster
|
| FFTW Library |
/source/fftw |
Discrete Fourier transform |
| Intel MKL 10.1 |
/source/intel/cmkl/ |
Intel Math Kernel Library
* How to use Intel MKL in the cluster
|
__________________________________________________________________________
Scientific Computing Software and Applications
Note: Most users can install applications in their own home directory. Please email us if you think an application needs to be installed in a public directory (for example, because it needs a large amount of storage, requires group access, or has license restrictions).
Matlab 2007
| Names |
Path |
Description |
| Matlab2007b |
/source/matlab2007b |
Matlab
Current Toolboxes available
| MATLAB | Version 7.5 (R2007b) |
| Distributed Computing Toolbox/Engine | (R2007b) |
| Optimization Toolbox | (R2007b) |
| Statistics Toolbox | (R2007b) |
| Signal Processing Toolbox | (R2007b) |
| Image Processing Toolbox | (R2007b) |
| Wavelet Toolbox | (R2007b) |
| Bioinformatics Toolbox | (R2007b) |
The number of licenses for the toolboxes is limited.
* How to use MATLAB
* How to use MATLAB DCE to launch parallel jobs
|
(MATLAB R2008b is scheduled to be deployed later this year)
General biostatistics software
| Names |
Path |
Description |
| BioPerl Module |
/usr/bin/perl |
BioPerl 1.5, including core packages and run packages |
| R version 2.5 and Bioconductor |
/source/R_2.5 |
Open-source statistics software (including Bioconductor modules for microarray analysis) |
Sequencing alignment and analysis
| Names |
Path |
Description and References |
| Solexa Pipeline |
/source/Solexa/GAPipeline-0.3.0 |
A "next-generation sequencing" pipeline. Multiple versions compiled with different libraries are available. Running Solexa requires knowledge of some basic configuration of the pipeline and the system; please consult scientific computing support.
|
| NCBI Toolbox, 2007 March release |
/source/ncbi/bin |
NCBI Toolbox, including blastall, megablast, wblast, etc.
* How to run BLAST on the cluster
* External resources NCBI BLAST
|
| UCSC BLAT |
/source/blat |
BLAT on DNA is designed to quickly find sequences of 95% and greater similarity of length 40 bases or more.
* How to run BLAT on the cluster
* External resources UCSC BLAT
|
| UMBL EMBOSS |
/source/emboss_5.0 |
Open-source software analysis package from EBI, specially developed for the needs of the molecular biology (e.g. EMBnet) user community.
* External resources UMBL EMBOSS
|
| Clustalw |
/source/clustalw2.0 |
Open-source software for sequence alignment
|
| HMMER |
/source/hmmer_2.3 |
HMMER: biosequence analysis using profile hidden Markov models
|
| MrBayes |
/source/mrbayes_3.1 |
Bayesian analysis of phylogeny
|
| pbat, p2bat |
Belongs to specific users; contact Chris Lange (HSPH) |
An interactive software package for the design of genetic family-based association studies
* Resource from pbat manual
|
| plink |
/source/plink_1.0 |
A free, open-source whole-genome association analysis toolset developed at MGH
* External resource from plink website
|
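To make the BLAST entry above more concrete, here is a hedged Python sketch that builds a legacy `blastall` command line against the shared databases. It assumes that the `blastall` binary sits directly in `/source/ncbi/bin` and that each database (for example `nt`) is addressed by its name inside the BLAST database directory listed under "Public genomic/genetic databases" on this page. The function name and the query and output file names are hypothetical. The options used (`-p`, `-d`, `-i`, `-o`, `-e`) are standard `blastall` options.

```python
# Paths taken from the tables on this page; the exact file layout
# underneath them is an assumption.
BLASTALL = "/source/ncbi/bin/blastall"
BLAST_DB_DIR = "/pub1/ftp-ncbi-blast/blast/db"


def blastall_command(program, database, query, output, evalue=None):
    """Return the argv list for a legacy NCBI blastall search.

    -p selects the program, -d the database, -i the query file,
    -o the output file, and -e the expectation value cutoff.
    """
    allowed = ("blastn", "blastp", "blastx", "tblastn", "tblastx")
    if program not in allowed:
        raise ValueError(f"unknown blastall program: {program}")
    argv = [BLASTALL, "-p", program,
            "-d", f"{BLAST_DB_DIR}/{database}",
            "-i", query, "-o", output]
    if evalue is not None:
        argv += ["-e", str(evalue)]
    return argv


if __name__ == "__main__":
    # Hypothetical query and output files, for illustration only.
    print(" ".join(blastall_command("blastn", "nt", "query.fa", "hits.txt")))
```

On the cluster, a command like this would normally be submitted through the LSF scheduler rather than run on a login node.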
Protein structural dynamics
| Names |
Path |
Description and References |
| eHits |
/source/eHits_6.2/ehits.sh |
Protein-ligand docking program for fast screening of large-scale ligand databases
* How to use ehits in the cluster
* External resources SimBioSys eHits
|
| Dock 6 |
/source/dock6 |
Protein-ligand docking simulation (MPI-enabled; can run across multiple nodes).
|
| AutoDock |
/source/autodock |
Protein-ligand docking simulation (single-threaded).
|
| NAMD |
/source/namd |
Molecular dynamics simulation.
* How to use NAMD in the cluster
* External resources UIUC namd
|
| CHARMM |
/source/charmm |
Molecular dynamics simulation (compiled with Intel Fortran Compiler)
|
Public genomic/genetic databases
| Names |
Path |
Description and References |
| NCBI Blast DB |
/pub1/ftp-ncbi-blast/blast/db |
BLAST databases, including the latest versions of nt, nr, human_genomic, mouse_genomic, est_others, and est_human (last updated Dec 2007) |
| The NCBI genome data |
/pub1/ftp-ncbi-blast/blast/genome |
Genome data, last updated Jan 2008
|
| UCSC genome |
/pub1/ucsc-genome |
.2bit format, for use with BLAT |
(Note: Other large-scale databases can be uploaded to the system on request.)
Text-Mining applications
| Names |
Path |
Description |
| MMTx |
/pub1/nlp/ |
Biomedical text mining software from NLM
* How to run MMTx in the cluster
|
| UIMA |
/source/uima |
General text mining software
|
3D Graphics Library
| Names |
Path |
Description |
| Mesa |
/source/mesa |
Open Source 3D computer graphics library that provides a generic OpenGL implementation for rendering three-dimensional graphics on multiple platforms.
|
Tomographic image analysis
| Names |
Path |
Description |
| GATE/GEANT |
/source/geant/geant4.9.1.p01/work/bin/Linux++ |
This application is used for the development and evaluation of tomographic reconstruction algorithms and other numerical observer studies.
Running GATE/GEANT requires a complicated environment setup and configuration; please consult scientific computing support for the environment variables.
More information can be retrieved from the nationwide health grid.
|
Cell image analysis application
| Names |
Path |
Description and References |
| CProfiler |
/source/CProfiler/ |
Open-source cell image analysis software from the Broad Institute
We have both the MATLAB source version and the executable version on the cluster; please consult scientific computing support about how to build the cluster version of the pipeline.
* CellProfiler Manual
|
__________________________________________________________________________
Services Integrated with the Clusters
(The following services can only be accessed from within the PHS network)
Database Service
| Names |
Address |
Description and References |
| MySQL 5.2 |
hpcdb.research.partners.org |
Currently stores data for text mining and sequence analysis. Please consult scientific computing support if you want to upload your data |
| PostgreSQL 8.2 |
hpcdb.research.partners.org |
Currently used for proteomics analysis. Please consult the system admin to use it |
GRID Web Service
| Names |
Address |
Description and References |
| Engineframe BioGrid access |
hpcweb.research.partners.org:8081/engineframe/ |
Provides web access to the cluster and allows users to control and manage their own web interface for interacting with the cluster.
|
| LabKey 2.1 |
hpcweb.research.partners.org |
A web-based, high-throughput proteomics mass spectroscopy analysis application integrated with the cluster. Please consult the system admin.
Visit the HPCGG Proteomics Web Portal (if you do not have a LabKey password, you will see a demo project as an anonymous user).
|
Windows GUI access
| Names |
Address |
Description and References |
| Windows-Based Applications |
hpcwin.research.partners.org |
Provides a Windows GUI for users to retrieve data and results from the cluster remotely. Applications can be installed on request.
|