How to run R on clusters
Notes:
1. Two R versions are installed on the system: R 2.6 and R 2.8. Please load the correct module before using either one.
2. R 2.6 has Bioconductor installed by default. R 2.8 has the Rmpi package installed.
Step 1: Please read "how to setup your working environment" and do the following:
a. Add the line "module load r/2.6/default" to the module-loading block in your .bashrc file. The block should look like this:
# modules definitions
if [ -n "$MODULESHOME" ]; then
module use /shr/modules
module load r/2.6/default
fi
b. Type ". .bashrc" in your terminal, then type "module list" or "which R". You should see:
[testy@n137 ~]$ . .bashrc
[testy@n137 ~]$ module list
Currently Loaded Modulefiles:
1) r/2.6/default
[testy@n137 ~]$ which R
/source/R_2.6/bin/R
If you see the above output in your terminal, your environment has been set up correctly.
(Note: If you change 2.6 to 2.8 in the module name, you will be using R 2.8, which has the Rmpi package installed. Please refer to "How to submit R MPI jobs".)
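After editing .bashrc for either version, the check in step b can be wrapped in a small function. The following is a minimal sketch, not a cluster-provided tool: report_r is a hypothetical helper name, and the sketch only assumes that the loaded module puts an executable called R on your PATH.

```shell
# report_r is a hypothetical helper name, not a command provided by the cluster.
# It reports which R executable, if any, is first on the PATH.
report_r() {
    if r_path=$(command -v R); then
        echo "R found at: $r_path"
        # Show only the first line of the version banner.
        R --version | head -n 1
    else
        echo "R not found: check the module load line in .bashrc"
        return 1
    fi
}

# Run the check; "|| true" keeps a calling script going if R is missing.
report_r || true
```

If the reported path or version is not the one you expected, re-check which "module load" line is active in your .bashrc and source the file again.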
Step 2: Download the Sample R script and save it as "example.R" in your r_example directory.
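If you cannot download the sample script right away, a stand-in is enough to test the submission steps. The sketch below creates the r_example directory under your home directory and writes a tiny placeholder example.R. The R code in it is a hypothetical placeholder, not the site's Sample R script.

```shell
# Create the working directory used in this guide (no error if it already exists).
mkdir -p "$HOME/r_example"

# Write a tiny stand-in example.R.
# NOTE: this R code is a hypothetical placeholder, not the site's sample script.
cat > "$HOME/r_example/example.R" <<'EOF'
# Summarize 100 random draws from a normal distribution
set.seed(1)
x <- rnorm(100)
print(summary(x))
EOF

# Confirm the file is in place.
ls -l "$HOME/r_example/example.R"
```

Replace the placeholder with the real sample script, or your own analysis, before running production jobs.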
Step 3: Submit the R job
Example 1: You need to submit an R job and let it run on the cluster
a. Create a folder called "r_example" in your home directory, then create the following rjob.lsf file inside the folder
#!/bin/bash
# enable your environment, which will use .bashrc configuration in your home directory
#BSUB -L /bin/bash
# the name of your job showing on the queue system
#BSUB -J rtest
# the queue to use; this example uses the queue called "normal", where a job is ended after 3 hours.
# use the long queue if your job will run for more than 3 hours
# use the bqueues command to check the available queues, or consult the system administrator about other queues
#BSUB -q normal
# the system output and error message output, %J will show as your jobID
#BSUB -o %J.out
#BSUB -e %J.err
# the number of CPUs to request for each job
#BSUB -n 1
# The following option exclusively allocates one node to your job so that
# you can use the entire memory of the allocated compute node. Here it is commented out with two "#";
# if you want to enable it, delete one "#"
##BSUB -x
# send an email notification when the job finishes; change this to your own email address
#BSUB -u yourID@partners.org
#BSUB -N
# enter your working directory; change this to your own directory
cd /shr/home/$USER/r_example
# finally, start the R program
R --no-save < example.R > r_example.out
b. From the r_example directory, submit your R job.
[testy@n137 r_example]$ bsub < rjob.lsf
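After submission you will usually want the job ID, since the %J in the -o and -e options becomes part of the output file names. The sketch below extracts the ID from the confirmation line that bsub prints. It assumes the usual LSF form "Job <ID> is submitted to queue <name>.", which is not shown on this page, and job_id_from_bsub is a hypothetical helper name.

```shell
# job_id_from_bsub is a hypothetical helper name.
# It reads bsub's confirmation line on stdin, assumed to look like
#   Job <12345> is submitted to queue <normal>.
# and prints only the numeric job ID.
job_id_from_bsub() {
    sed -n 's/^Job <\([0-9][0-9]*\)>.*/\1/p'
}

# Demonstration on a canned line, so this runs without LSF installed:
echo 'Job <12345> is submitted to queue <normal>.' | job_id_from_bsub
```

On the cluster you could capture the ID with job_id=$(bsub < rjob.lsf | job_id_from_bsub) and then check the job with "bjobs $job_id". Once the job finishes, the LSF messages are in the files named after the job ID with the .out and .err suffixes, and the R output is in r_example.out.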
If you have many different input parameters and need to run a number of R jobs concurrently, each taking one parameter, we suggest you use a job array. Please refer to "How to submit job array".
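To give a rough idea of how a job array maps onto parameters, here is a minimal sketch. It is not taken from the job array page. It assumes one parameter per line in a text file and relies on the LSB_JOBINDEX variable that LSF sets for each array element. The param_for_index helper, the temporary parameter file, and the values in it are hypothetical.

```shell
# param_for_index is a hypothetical helper: print line N of a parameter file.
# Usage: param_for_index FILE N
param_for_index() {
    sed -n "${2}p" "$1"
}

# Demonstration with a small hypothetical parameter file (one value per line).
params_file=$(mktemp)
printf '%s\n' 0.1 0.5 0.9 > "$params_file"

# Inside a real array job (for example one submitted with: #BSUB -J "rtest[1-3]"),
# LSF sets LSB_JOBINDEX to the element's index. Outside LSF we default it to 2.
index=${LSB_JOBINDEX:-2}
param=$(param_for_index "$params_file" "$index")
echo "element $index uses parameter $param"

rm -f "$params_file"
```

In the job script, the selected value could then be passed to R after the "--args" option, for example R --no-save --args "$param" < example.R, and read inside the script with commandArgs(). Writing each element's output to a separate file keeps the array elements from overwriting each other.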