Partners High Performance Computing Cluster


Submit an MPI program on the cluster


The key is to run mpirun with the -lsb_hosts option, so that an MPI job submitted on the hpres cluster starts on the hosts that LSF allocated to it.

Attention: You must use HP-MPI to test the following script. HP-MPI is largely compatible with MPICH 1.x.

Step 1: Download the example program mpiintegral.c

Step 2: Compile it with the HP-MPI compiler
If you have loaded the right module (see "setup your working environment"), HP-MPI should be available.

[testy@n137 ~]$ which mpicc
/opt/hpmpi/bin/mpicc

[testy@n137 ~]$ mpicc -o mpiintegral mpiintegral.c

Step 3: Create your job script mympijob.lsf
[testy@n137 ]$ vi mympijob.lsf

#!/bin/bash

# enable your environment, which uses the .bashrc configuration in your home directory
#BSUB -L /bin/bash

# the name of your job showing on the queue system
#BSUB -J mpitest

# the queue that you will use; this example uses the queue called "normal"
# use the bqueues command to check the available queues
#BSUB -q normal


# files for standard output and error messages; %J is replaced by your job ID
#BSUB -o %J.out
#BSUB -e %J.err

# the number of cores to request (note: each node has 4 to 8 cores)
#BSUB -n 20



# send an email notification when the job finishes
#BSUB -u youremail@partners.org
#BSUB -N


# change to your working directory; replace this with your own directory
cd /shr/home/$USER/

# Finally, start the MPI program. You MUST make sure the argument for -np (here "20") is the same as the
# number given for "#BSUB -n"

mpirun -np 20 -lsb_hosts ./mpiintegral
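
Keeping the -np value and the "#BSUB -n" value in sync by hand is easy to get wrong. As a hedged sketch, the count can instead be derived inside the job script from LSB_HOSTS, the environment variable in which LSF lists one host name per allocated slot. The helper name count_slots is hypothetical, and the sketch assumes LSB_HOSTS is set, which LSF may skip for very large allocations:

```shell
# count_slots (hypothetical helper): print the number of slots LSF
# allocated to the current job, or 0 when LSB_HOSTS is unset or empty.
count_slots() {
    # LSB_HOSTS holds one host name per slot, separated by spaces,
    # so counting the words gives the slot count.
    set -- ${LSB_HOSTS:-}
    echo "$#"
}

# Possible use inside mympijob.lsf, replacing the hard-coded 20
# (left commented out here because it only works inside an LSF job):
# mpirun -np "$(count_slots)" -lsb_hosts ./mpiintegral
```

With this in place, only the "#BSUB -n" line needs to change when you request a different number of cores.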



Step 4: Submit your job:
[testy@n137 ~]$ bsub < mympijob.lsf
You can check the job at any time by typing "bjobs". Once submitted, your job appears in the listing as shown below; here it is still pending (PEND). The job ID is 14894.
[testy@n137 ]$ bjobs
JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
14894   testy     PEND  normal     n137                  mpitest    Dec 17 14:35
[jxu@n137 examples]$ bjobs -l

Job <14894>, Job Name , User , Project , Mail , Status , Queue , Command <#!/bin
                     /bash; # enable your environment, which will use .bashrc c
                     onfiguration in your home directory;#BSUB -L /bin/bash; #
                     the name of your job showing on the queue system;#BSUB -J
                     gethosts; # the queue that you will use, the example here
                     use the queue called "normal";# please use bqueus command
                     to check the available queues;#BSUB -q normal;  # the syst
                     em output and error message output, %J will show as your j
                     obID;#BSUB -o %J.out;#BSUB -e %J.err; #the CPU number that
                     you will collect >
Thu Dec 17 14:35:40: Submitted from host , CWD <$HOME/TestMPI/examples>,
                     Output File <%J.out>, Error File <%J.err>, Notify when job
                     ends, 20 Processors Requested, Login Shell ;
Thu Dec 17 14:35:44: Started on 20 Hosts/Processors <3*n25> <2*n3> <2*n8> <2*n2
                     7> <2*n5> <2*n4> <2*n23> <2*n14> <2*n15> <1*n10>, Executio
                     n Home , Execution CWD ;
Thu Dec 17 14:35:44: Resource usage collected.
                     MEM: 1 Mbytes;  SWAP: 10 Mbytes;  NTHREAD: 1
                     PGID: 12476;  PIDs: 12476


 SCHEDULING PARAMETERS:
           r15s   r1m  r15m   ut      pg    io   ls    it    tmp    swp    mem
 loadSched   -     -     -     -       -     -    -     -     -      -      -
 loadStop    -     -     -     -       -     -    -     -     -      -      -


Step 5: Check your output: when the job finishes, the system writes its output to the *.out file. Lines from different processes can appear in any order.
[testy@n137 ~]$ vi 14894.out
n57 n57 n57 n57 n59 n59 n59 n59 n58 n58 n58 n58 n61 n61 n61 n60 n60 n60 n63 n63
Process 0 has the partial integral of 0.078459
Process 1 has the partial integral of 0.077975
Process 3 has the partial integral of 0.075572
Process 2 has the partial integral of 0.077011
Process 4 has the partial integral of 0.073666
Process 9 has the partial integral of 0.057659
Process 13 has the partial integral of 0.038366
Process 16 has the partial integral of 0.021313
Process 19 has the partial integral of 0.003083
The Integral =1.000000
Process 5 has the partial integral of 0.071307
Process 6 has the partial integral of 0.068508
Process 7 has the partial integral of 0.065287
Process 8 has the partial integral of 0.061663
Process 11 has the partial integral of 0.048611
Process 10 has the partial integral of 0.053299
Process 12 has the partial integral of 0.043623
Process 14 has the partial integral of 0.032873
Process 15 has the partial integral of 0.027177
Process 17 has the partial integral of 0.015318
Process 18 has the partial integral of 0.009229
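
Because the processes write their lines in arbitrary order, it can help to sort the output by process number and to re-add the partial results as a sanity check. The following is a minimal sketch that assumes the line format shown above; the function names sorted_partials and sum_partials are hypothetical.

```shell
# sorted_partials FILE: print the per-process lines ordered by process number.
sorted_partials() {
    grep '^Process ' "$1" | sort -k2,2n
}

# sum_partials FILE: add up the last field of every
# "Process N has the partial integral of X" line and print the total.
sum_partials() {
    awk '/^Process [0-9]+ has the partial integral of/ { s += $NF }
         END { printf "%.6f\n", s }' "$1"
}

# Possible use on the output file from this example:
# sorted_partials 14894.out
# sum_partials 14894.out
```

The total should agree with the reported integral up to rounding of the printed values.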