Partners High Performance Computing Cluster

 




How to run R on clusters




Notes:

1. Two versions of R are installed on the system: R 2.6 and R 2.8. Load the correct module before using either one.

2. R 2.6 has Bioconductor installed by default. R 2.8 has the Rmpi module installed.

Step 1: Please read "how to setup your working environment" and do the following:

a. Include the line "module load r/2.6/default" in the module-loading block of your .bashrc file. The file should contain the following block:

# modules definitions
if [ -n "$MODULESHOME" ]; then
  module use /shr/modules
  module load r/2.6/default
fi
b. Type ". .bashrc" in your terminal, then type "module list" or "which R". You should see:
[testy@n137 ~]$ . .bashrc
[testy@n137 ~]$ module list
Currently Loaded Modulefiles:
   1) r/2.6/default
[testy@n137 ~]$ which R
/source/R_2.6/bin/R
If you see the above output in your terminal, your environment has been set up correctly.

(Note: If you change 2.6 to 2.8, you will be using version 2.8, which has the Rmpi module installed. Please refer to "How to submit R MPI jobs".)
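The check in step b can also be scripted. The following is a minimal sketch, not part of the cluster tooling: r_matches_version is a hypothetical helper, and it assumes the install path follows the /source/R_<version>/bin/R pattern shown in the sample output above.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the given R path belongs to the given version.
# Assumes the layout /source/R_<version>/bin/R seen in the sample output.
r_matches_version() {
  local r_path="$1"   # e.g. the output of `which R`
  local version="$2"  # e.g. 2.6
  case "$r_path" in
    */R_"$version"/bin/R) return 0 ;;
    *) return 1 ;;
  esac
}

# Look up R on the PATH; empty if no R module is loaded.
r_path="$(command -v R 2>/dev/null || true)"
if r_matches_version "$r_path" "2.6"; then
  echo "R 2.6 is active: $r_path"
else
  echo "R 2.6 is not active (found: ${r_path:-nothing}); check the module block in .bashrc"
fi
```

To check for version 2.8 instead, pass "2.8" as the second argument, provided that installation follows the same path pattern.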

Step 2: Download the Sample R script and save it as "example.R" in your r_example directory (the folder created in Step 3, part a).



Step 3: Submit the R job

Example 1: You need to submit an R job and let it run on the cluster

a: Create a folder called "r_example" in your home directory, then create the following rjob.lsf file inside the folder
#!/bin/bash

# enable your environment, which will use .bashrc configuration in your home directory
#BSUB -L /bin/bash

# the name of your job showing on the queue system
#BSUB -J rtest

# the queue that you will use; this example uses the queue called "normal", where a job ends after 3 hours
# use the long queue if your job will run for more than 3 hours
# use the bqueues command to check the available queues, or consult the system administrator about other queues
#BSUB -q normal


# the system output and error message output, %J will show as your jobID
#BSUB -o %J.out
#BSUB -e %J.err

# the number of CPUs requested for each job
#BSUB -n 1

# The following option allocates one node exclusively so that your job
# can use all the memory of that computing node. It is commented out here with two "#";
# to enable it, delete one "#"
##BSUB -x

# send an email notification when the job finishes; change the address to your own
#BSUB -u yourID@partners.org
#BSUB -N


# change to your working directory; edit the path if yours differs
cd /shr/home/$USER/r_example


# finally, start the R program
R --no-save < example.R > r_example.out
b. Submit your R job from inside the r_example folder.

[testy@n137 r_example]$ bsub < rjob.lsf
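After submission, LSF prints a confirmation line of the form "Job <12345> is submitted to queue <normal>." and the number in it is the %J that names the .out and .err files. The sketch below shows one way to capture that ID and check on the job. extract_job_id is a hypothetical helper, and the submission itself only runs where bsub and rjob.lsf are present.

```shell
#!/bin/bash
# Hypothetical helper: pull the numeric job ID out of bsub's confirmation line,
# e.g. "Job <12345> is submitted to queue <normal>."
extract_job_id() {
  sed -n 's/^Job <\([0-9][0-9]*\)> is submitted to .*/\1/p'
}

# Only attempt a real submission where LSF and the job script are available.
if command -v bsub >/dev/null 2>&1 && [ -f rjob.lsf ]; then
  job_id="$(bsub < rjob.lsf | extract_job_id)"
  echo "Submitted job $job_id"
  bjobs "$job_id"   # show the job's current state (for example PEND, RUN, DONE, EXIT)
  echo "When it finishes, see $job_id.out, $job_id.err and r_example.out"
fi
```

Running bjobs with no arguments lists all of your unfinished jobs.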
 
If you have many different input parameters and need to run a number of R jobs concurrently, each taking one parameter, we suggest using a job array. Please refer to "How to submit job array".
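As a rough illustration of the idea, the sketch below shows how each element of a job array could pick its own parameter. It is not taken from the job array guide. It assumes a hypothetical file params.txt with one parameter per line, a hypothetical helper param_for_index, and a directive such as #BSUB -J "rtest[1-3]" in the job script. LSF sets the LSB_JOBINDEX variable to the index of each array element.

```shell
#!/bin/bash
# Hypothetical helper: print line N of a parameter file (one parameter per line).
param_for_index() {
  local index="$1" file="$2"
  sed -n "${index}p" "$file"
}

# Run only inside an LSF array task (LSB_JOBINDEX is 1, 2, 3, ... there).
# params.txt is an assumed file name, not something the cluster provides.
if [ -n "${LSB_JOBINDEX:-}" ] && [ "$LSB_JOBINDEX" -gt 0 ] && [ -f params.txt ]; then
  param="$(param_for_index "$LSB_JOBINDEX" params.txt)"
  # example.R can read the value with commandArgs(trailingOnly = TRUE)
  R --no-save --args "$param" < example.R > "r_example.$LSB_JOBINDEX.out"
fi
```

Writing each task's output to a file named after its index keeps concurrent tasks from overwriting one another's results.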