
Installing Rmpi Package for Linux

Here are some rough instructions to get Rmpi and R working on SHARCNET systems.

In order to use Rmpi properly on hound.sharcnet.ca and various other systems, one must ensure that Rmpi and its dependencies are installed.

Installing Rmpi from Downloaded Tar File

You can download the Rmpi tar file, named Rmpi_x.y-r.tar.gz, from CRAN. You may also check the developer's web site for the latest version.

To install Rmpi from the downloaded tar file, e.g. Rmpi_0.5-8.tar.gz, first ensure you are using a cluster that has OpenMPI installed/available. Then install Rmpi from the command line

R CMD INSTALL Rmpi_0.5-8.tar.gz 

The installation should be very straightforward. If MPI header or library files are reported missing, ensure that the following environment variables are set and contain the proper MPI installation paths

LD_LIBRARY_PATH
MPI_ROOT

If they are not set, then set them properly, e.g.

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/sharcnet/openmpi/current/intel/lib
export MPI_ROOT=/opt/sharcnet/openmpi/current/intel

where current corresponds to the version of OpenMPI that is loaded, which can be checked with the following module command

   module list

For instance, you may see the following output

Currently Loaded Modulefiles:
 1) torque/2.5.4             8) gromacs/4.0.5
 2) sq-tm/2.4                9) ddt/2.5.1
 3) intel/11.0.083          10) namd/2.7b3
 4) openmpi/intel/1.4.2     11) r/2.10.0
 5) fftw/intel/2.1.5        12) vmd/1.8.7
 6) gaussian/g03_E.01       13) compile/1.3
 7) gaussian/g09_B.01       14) user-environment/1.0.0

You can see the loaded OpenMPI version is 1.4.2. Normally, if the module OpenMPI is loaded, the environment variables LD_LIBRARY_PATH and MPI_ROOT should contain the right path.
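As a quick sanity check that the module has populated these variables (the exact paths will vary by system and module version):

```shell
# Print the MPI-related variables; empty values mean the module is not loaded
echo "MPI_ROOT=$MPI_ROOT"
echo "$LD_LIBRARY_PATH" | tr ':' '\n'
```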

Installing Rmpi from within R

Alternatively, you may install Rmpi from within R. Again, before you proceed, ensure that the environment variables LD_LIBRARY_PATH and MPI_ROOT contain the right paths. At the command line, run R to enter the R environment. Then at the R prompt, type the command

> install.packages("Rmpi")

When prompted for a mirror site, choose one that is near you. The installation should be as straightforward as before.
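If you want to skip the interactive mirror prompt (for example, in a batch session), you can name a repository explicitly; any CRAN mirror URL works, the main CRAN site is used here as an example:

```r
# Non-interactive install: specify the mirror instead of being prompted
install.packages("Rmpi", repos = "https://cran.r-project.org")
```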


Running An Rmpi Job

To run an R script named test.R on 21 cores (20 slaves, 1 master) for 3 hours in the mpi queue, submit it as follows.

#!/bin/bash
sqsub -o diag.out -r 3h -q mpi -n 21 R CMD BATCH test.R test.txt

Alternatively, to run on a single node (say, one of hound's 20+ processor nodes), add the option -N x, which specifies how many nodes to run on.

#!/bin/bash
sqsub -o diag.out -r 3h -q mpi -n 21 -N 1 R CMD BATCH test.R test.txt

Within SHARCNET clusters, process spawning is *not available*. To work around this, the special .Rprofile file described in the Rmpi documentation must be present in the directory from which R runs. It allows MPI jobs to run where process spawning is not available.

The directory structure should look like the following before execution.

/work/<usrname>/<clustername>/statfoo/.Rprofile
/work/<usrname>/<clustername>/statfoo/script.sh
/work/<usrname>/<clustername>/statfoo/test.R

You would cd to /work/<usrname>/<clustername>/statfoo/ and run sh script.sh or ./script.sh.


Sample Script

 #Tell all slaves to return a message identifying themselves. 
 mpi.remote.exec(paste("I am",mpi.comm.rank(),"of",mpi.comm.size())) 
 mpi.remote.exec(paste(mpi.comm.get.parent()))
 
 #Send execution commands to the slaves
 x<-5
 #Without explicit seed control, these draws may be correlated across slaves
 x<-mpi.remote.exec(rnorm,x) 
 length(x)
 x
 #Use mpi.apply instead to generate increasing batches of rnorms (40 batches in total). 
 x<-mpi.apply(seq(20,800,20),rnorm)
 length(x)
 x

To do anything more complex, look at mpi.apply in the Rmpi documentation (http://cran.r-project.org/web/packages/Rmpi/Rmpi.pdf). For random numbers, predesignate seeds on the master node using a known seed, then dispatch them to the slaves. This example is very limited and does not exhibit any speedup at all; jobs need to be longer for the benefits to overcome the MPI overhead.
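A rough sketch of that seeding idea, assuming the slaves are already running on communicator 1 (mpi.bcast.Robj2slave, mpi.remote.exec, and mpi.comm.rank are standard Rmpi calls; the master seed 123 is arbitrary):

```r
# Sketch: reproducible per-slave random streams (assumes slaves exist on comm 1)
library(Rmpi)

set.seed(123)                                      # known master seed
nslaves <- mpi.comm.size(1) - 1                    # master is rank 0
seeds <- sample.int(.Machine$integer.max, nslaves)
mpi.bcast.Robj2slave(seeds)                        # every slave gets the seed vector
mpi.remote.exec(set.seed(seeds[mpi.comm.rank()]))  # slave i uses seeds[i]
x <- mpi.remote.exec(rnorm, 5)                     # draws now differ per slave, reproducibly
```

Rmpi also ships mpi.setup.rngstream(), which installs L'Ecuyer parallel streams on the slaves; for serious work that is likely the safer route.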


Caveats

I haven't gotten the snow package to properly retrieve the cluster object using any of

cl<-getMPIcluster()
cl<-getcluster()
cl<-makeCluster()

Using the latter two options, a reference to the current cluster is obtained, but the comm number is different, so most R packages that use snow will not work at all. I am in contact with a few people to try to figure this out; anyone else is welcome to help or to solve the issue.

R will also throw errors occasionally, typically intermittent timeout issues or start-up failures; the cause is not clear.


.Rprofile

# This R profile can be used when a cluster does not allow spawning, or when a
# job scheduler is required to launch parallel jobs. Save this file as
# .Rprofile in the working directory or your home directory. On Unix, run
# mpiexec -n [number of cpus] R --no-save -q
# On Windows with MPICH2, use the mpiexec wrapper and specify a working
# directory that contains .Rprofile.
# It cannot be used as Rprofile.site, because that will not work.
# The following system libraries are not loaded automatically, so manual loads
# are needed.
# The file is typically located at .../Rmpi/inst/.Rprofile; you may have to
# turn on 'View Hidden Files' to see it.
library(utils)
library(stats)
library(datasets)
library(grDevices)
library(graphics)
library(methods)  
if (!invisible(library(Rmpi, logical.return = TRUE))){
    warning("Rmpi cannot be loaded")
    q(save = "no")
}
options(error=quote(assign(".mpi.err", FALSE, env = .GlobalEnv)))
if (mpi.comm.size(0) > 1)
    invisible(mpi.comm.dup(0,1))
if (mpi.comm.rank(0) > 0){
    options(echo=FALSE)
    .comm <- 1
    mpi.barrier(0)
    repeat
        try(eval(mpi.bcast.cmd(rank=0, comm=.comm)), TRUE)
    if (is.loaded("mpi_comm_disconnect"))
        mpi.comm.disconnect(.comm)
    else mpi.comm.free(.comm)
    mpi.quit()
}
if (mpi.comm.rank(0) == 0) {
    #options(echo=TRUE)
    mpi.barrier(0)
    if (mpi.comm.size(0) > 1)
        slave.hostinfo(1)
}
.Last <- function(){
    if (is.loaded("mpi_initialize")){
        if (mpi.comm.size(1) > 1){
            print("Please use mpi.close.Rslaves() to close slaves")
            mpi.close.Rslaves(comm=1)
        }
    }
    print("Please use mpi.quit() to quit R")
    mpi.quit()
}

For additional reading

http://cran.r-project.org/web/packages/npRmpi/vignettes/npRmpi.pdf