Submit a MPICH2 MPI program on cluster


MPICH2 is an all-new implementation of MPI, designed to support research into high-performance implementations of MPI-1 and MPI-2 functionality. Please check the MPICH2 website for details. To run an MPICH2 MPI program, we need to start the MPICH2 mpd ring on the nodes allocated by the LSF scheduler, and then use mpiexec to launch the MPI job on the cluster.



Step 1: You must load the MPICH2 module. By default, the system uses HP-MPI, which is compatible with MPICH 1.x. To use MPICH2, add "module load mpi/mpich2/default" to the module-load section of your .bashrc file. For example, the section will look like this:


# modules definitions
if [ -n "$MODULESHOME" ]; then
  module use /shr/modules
  module load mpi/mpich2/default
  module load matlab/default
  module load intel/cce/10.1/default
  module load java/1.6/default
fi

  
Type ". .bashrc" in the terminal, from your home directory, to apply the new environment configuration.

[testy@n137 ~]$ . .bashrc
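
One quick way to confirm that the new environment is active is to check where mpiexec resolves. The following is only a sketch: the helper name is_mpich2_path is hypothetical, and it assumes, based on the paths shown in Step 3, that the MPICH2 install directory contains the string "mpich2".

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the given path looks like an MPICH2 install.
# Assumption: the MPICH2 directory name contains "mpich2" (e.g. /source/mpich2/bin).
is_mpich2_path() {
  case "$1" in
    */mpich2/*) return 0 ;;
    *)          return 1 ;;
  esac
}

# Report which mpiexec (if any) comes first on the PATH.
mpiexec_path=$(command -v mpiexec || true)
if [ -z "$mpiexec_path" ]; then
  echo "mpiexec not found; check the module-load section of .bashrc"
elif is_mpich2_path "$mpiexec_path"; then
  echo "MPICH2 mpiexec found at $mpiexec_path"
else
  echo "mpiexec at $mpiexec_path does not look like MPICH2"
fi
```

If the last message appears, the HP-MPI default is probably still first on the PATH, and the module-load section is worth rechecking.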
Step 2: Download the example archive MPICH2Example.zip. It also contains an example Makefile that compiles multiple C++ source files into a single executable file. Unzip it to /shr/home/$USER/TestMPICH2/

Step 3: Compile it with the MPICH2 compiler
[testy@n137 ~]$ which mpicxx
/source/mpich2/bin/mpicxx
[testy@n137 ~]$ which mpiexec
/source/mpich2/bin/mpiexec
[testy@n137 ~]$ cd TestMPICH2
[testy@n137 TestMPICH2]$ make
mpicxx -o main.o -c main.cpp
mpicxx -o funkywork.o -c funkywork.cpp
mpicxx -o myexe main.o funkywork.o  -lm -O2 -Wall
Step 4: Create your job script called "mpich2.lsf"
#BSUB -L /bin/bash

# the name of your job on the queue system
#BSUB -J mpich2_test

# the queue that you will use; this example uses the queue called "normal"
# use the bqueues command to check the available queues
#BSUB -q normal


# files for standard output and error messages; %J is replaced by your job ID
#BSUB -o %J.out
#BSUB -e %J.err

#the number of processors that you will use
#BSUB -n 10

# send an email notification when the job finishes
#BSUB -u testy@partners.org
#BSUB -N


############ enter your working directory, change to your own dir ###
work_dir="/shr/home/$USER/TestMPICH2"
cd $work_dir

############ create mpd ring, DO NOT modify this section unless you really know what you are doing ####
# remove stale host files from an earlier run, since mpd.procs is appended to below
rm -f mpd.procs mpd.nodes
nproc=0
# write one line per allocated processor slot and count the slots
for proc in $LSB_HOSTS ; do
  echo $proc >> mpd.procs
  nproc=`expr $nproc + 1`
done
echo $LSB_HOSTS
echo $nproc
# one line per unique host; mpdboot starts one mpd on each of them
sort -u mpd.procs > mpd.nodes
nhosts=`wc -l < mpd.nodes`
mpdboot -n $nhosts -v -f mpd.nodes

############ ONLY change myexe to your own application and necessary arguments #####

mpiexec -machinefile mpd.procs -np $nproc ./myexe

############ exit the mpd ring and clean up the nodes ###################

mpdallexit
mpdcleanup

rm -f mpd.nodes mpd.procs
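
To see what the mpd ring section computes, the sketch below runs the same host-list processing outside of LSF. It is illustrative only: build_host_files is a hypothetical helper, and the sample host string is copied from the example job output rather than taken from a live $LSB_HOSTS.

```shell
#!/bin/bash
# Hypothetical helper: derive the per-process and per-host lists
# from an LSF-style host string (one entry per allocated slot).
build_host_files() {
  hosts="$1"      # space-separated host list
  outdir="$2"     # directory in which to write mpd.procs and mpd.nodes
  : > "$outdir/mpd.procs"                             # start from an empty file
  for h in $hosts; do
    echo "$h" >> "$outdir/mpd.procs"                  # one line per process slot
  done
  sort -u "$outdir/mpd.procs" > "$outdir/mpd.nodes"   # one line per unique host
}

# Sample value: 4 slots on n7, 3 on n8, 3 on n18.
sample_hosts="n7 n7 n7 n7 n8 n8 n8 n18 n18 n18"
tmpdir=$(mktemp -d)
build_host_files "$sample_hosts" "$tmpdir"

# $((...)) strips the padding that some wc implementations add.
nproc=$(wc -l < "$tmpdir/mpd.procs")
nhosts=$(wc -l < "$tmpdir/mpd.nodes")
echo "nproc=$((nproc)) nhosts=$((nhosts))"   # prints "nproc=10 nhosts=3"

rm -rf "$tmpdir"
```

The two counts play different roles: the number of unique hosts is what mpdboot needs, because one mpd daemon runs per node, while the number of slots is what mpiexec needs for -np.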



Step 5: Submit your job:
[testy@n137 TestMPICH2]$ bsub < mpich2.lsf
Job <496886> is submitted to queue 
You can check the job at any time by typing "bjobs", or "bjobs -l" for full details. Once the job is dispatched, the output looks like the following. The job ID in this example is 496886.
[testy@n137 TestMPICH2]$ bjobs -l

Job <496886>, Job Name , User , Project , Mail , Status , Queue , Command <#B
                     SUB -L /bin/bash; # the name of your job on the queue syst
                     em;#BSUB -J mpich2_test; # the queue that you will use, th
                     e example here use the queue called "normal";# please use
                     bqueus command to check the available queues;#BSUB -q norm
                     al;  # the system output and error message output, %J will
                     show as your jobID;#BSUB -o %J.out;#BSUB -e %J.err; #the n
                     umber of processors that you will use;#BSUB -n 10; #when j
                     ob finish that you will get email notification;#BSUB -u yx
                     u11@partners.org;#BSU>
Wed Apr  8 19:17:47: Submitted from host , CWD <$HOME/TestMPICH2>, Output
                     File <%J.out>, Error File <%J.err>, Notify when job ends,
                     10 Processors Requested, Login Shell ;
Wed Apr  8 19:17:52: Started on 10 Hosts/Processors <4*n7> <3*n8> <3*n18>, Exec
                     ution Home , Execution CWD ;
Wed Apr  8 19:17:52: Resource usage collected.
                     MEM: 1 Mbytes;  SWAP: 12 Mbytes;  NTHREAD: 1
                     PGID: 7728;  PIDs: 7728
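
When submission is scripted, it can be handy to capture the job ID from the confirmation line that bsub prints. The sketch below shows one way to do this with sed. The helper name extract_job_id is hypothetical, and the queue name in the sample line is an assumption based on the "#BSUB -q normal" setting in the job script.

```shell
#!/bin/bash
# Hypothetical helper: read bsub's confirmation text on stdin and
# print the numeric job ID, or nothing if no such line is found.
extract_job_id() {
  sed -n 's/^Job <\([0-9][0-9]*\)>.*/\1/p'
}

# Sample line modeled on the transcript above; the queue name is assumed.
sample='Job <496886> is submitted to queue <normal>.'
job_id=$(printf '%s\n' "$sample" | extract_job_id)
echo "$job_id"   # prints "496886"
```

On the cluster, something like job_id=$(bsub < mpich2.lsf | extract_job_id) followed by bjobs -l "$job_id" should then show only the job that was just submitted.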


Step 6: Check your output results: When the job finishes, the system writes its output to the *.out file.
[jxu@n137 TestMPICH2]$ vi 496886.out
n7 n7 n7 n7 n8 n8 n8 n18 n18 n18
running mpdallexit on n7
LAUNCHED mpd on n7  via
RUNNING: mpd on n7
LAUNCHED mpd on n18  via  n7
LAUNCHED mpd on n8  via  n7
RUNNING: mpd on n8
RUNNING: mpd on n18
Hello, 1 and n7 say hi in a  C++ statement
Hello, 0 and n7 say hi in a  C++ statement
Hello, 2 and n7 say hi in a  C++ statement
Hello, 4 and n8 say hi in a  C++ statement
Hello, 9 and n18 say hi in a  C++ statement
Hello, 3 and n7 say hi in a  C++ statement
Hello, 8 and n18 say hi in a  C++ statement
Hello, 7 and n18 say hi in a  C++ statement
Hello, 5 and n8 say hi in a  C++ statement
Hello, 6 and n8 say hi in a  C++ statement
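
For larger runs it is easier to summarize the output than to read it line by line. The sketch below counts the greeting lines per host with awk. The helper names count_ranks and count_ranks_per_host are hypothetical, and the pattern assumes the exact "Hello, <rank> and <host> ..." format printed by the example program.

```shell
#!/bin/bash
# Hypothetical helper: print the total number of greeting lines.
# Reads the named file, or stdin when no file is given.
count_ranks() {
  awk '$1 == "Hello," && $3 == "and" { n++ } END { print n + 0 }' "$@"
}

# Hypothetical helper: print "<host> <count>" for each host, sorted by host name.
count_ranks_per_host() {
  awk '$1 == "Hello," && $3 == "and" { count[$4]++ }
       END { for (h in count) print h, count[h] }' "$@" | LC_ALL=C sort
}

# Sample input copied from the example output above.
sample_output='n7 n7 n7 n7 n8 n8 n8 n18 n18 n18
RUNNING: mpd on n18
Hello, 1 and n7 say hi in a  C++ statement
Hello, 0 and n7 say hi in a  C++ statement
Hello, 2 and n7 say hi in a  C++ statement
Hello, 4 and n8 say hi in a  C++ statement
Hello, 9 and n18 say hi in a  C++ statement
Hello, 3 and n7 say hi in a  C++ statement
Hello, 8 and n18 say hi in a  C++ statement
Hello, 7 and n18 say hi in a  C++ statement
Hello, 5 and n8 say hi in a  C++ statement
Hello, 6 and n8 say hi in a  C++ statement'

printf '%s\n' "$sample_output" | count_ranks            # prints "10"
printf '%s\n' "$sample_output" | count_ranks_per_host   # prints "n18 3", "n7 4", "n8 3"
```

On the cluster the same helpers can be pointed at the real file, for example count_ranks 496886.out. A total that matches the -n value in the job script suggests that every rank started and reported.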