R
Hamilton supports using the R programming language.
Several versions of R are available through the module command. The commands module avail r and module avail rstudio show what is available. We currently recommend that you load the r/4.1.2 module:
module load r/4.1.2
If you need to use RStudio (the standard R interactive development environment) and are not connecting from the University campus, we recommend that you connect to Hamilton using X2GO (see the Login page for details) to improve performance and stability, and then load both the r/4.1.2 and rstudio/2021.09.1 modules.
Note that, by default, a fairly old version of R is available via the R command on the login nodes. It is provided by the underlying operating system and we do not advise its use.
For further information, see the sections below: Installing R packages, Running R jobs and Making R faster.
Installing R packages
Each user is able to install R packages in their own account. This allows you to install packages, or alternative versions of packages, without waiting for the Hamilton administrators to install them for you.
The R module sets an environment variable, $R_BUILD_MODULES, which contains a list of the modules used to build the R module itself. Reproducing this environment can help when installing some packages. To see which modules it lists, type:
echo $R_BUILD_MODULES
For example, to install the package Matrix (some of the output is omitted for brevity; files will be saved under a folder called R in your home directory):
[aabb22@login1 ~]$ module load r/4.1.2
[aabb22@login1 ~]$ module load $R_BUILD_MODULES
[aabb22@login1 ~]$ R
R version 4.1.2 (2021-11-01) -- "Bird Hippie"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> install.packages("Matrix")
Warning in install.packages("Matrix") :
'lib = "/apps/applications/r/4.1.2/1/default/lib64/R/library"' is not writable
Would you like to use a personal library instead? (yes/No/cancel) yes
Would you like to create a personal library
'~/R/x86_64-pc-linux-gnu-library/4.1'
to install packages into? (yes/No/cancel) yes
* DONE (Matrix)
The downloaded source packages are in
'/tmp/RtmpKVcvHk/downloaded_packages'
>
Once a package is installed, R will find it automatically and you can load it with the library("<package>") command.
Note that the exact steps required may differ. For example, you may be asked to select a mirror site to download the package from. In this case, UK (Bristol) [https] and UK (London 1) [https] are sensible choices.
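The prompts in the session above need an interactive R session. As a minimal sketch, the same installation can be done without prompts, which may be useful from a script. This assumes the personal library is the one named by the R_LIBS_USER environment variable (R normally sets it to the path proposed in the prompt above). The mirror URL https://cloud.r-project.org is an assumption, not a Hamilton recommendation, and install_if_missing is a hypothetical helper name.

```r
# Assumption: use the personal library that R proposes by default
# (R_LIBS_USER), creating it if it does not exist yet.
user_lib <- path.expand(Sys.getenv("R_LIBS_USER"))
dir.create(user_lib, recursive = TRUE, showWarnings = FALSE)

# Put the personal library first on the search path for this session.
.libPaths(c(user_lib, .libPaths()))

# Hypothetical helper: install a package only if R cannot already find it.
# Returns TRUE if an install was attempted, FALSE otherwise.
install_if_missing <- function(pkg, lib = user_lib,
                               repos = "https://cloud.r-project.org") {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))
  }
  install.packages(pkg, lib = lib, repos = repos)
  invisible(TRUE)
}

install_if_missing("Matrix")
```

As in the interactive example, load r/4.1.2 and $R_BUILD_MODULES before starting R so that packages with compiled code build in the same environment as R itself.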
Running R jobs
Example R program, in file my_r_program.R:
print("hello, world!")
Example job script my_r_job.sh to run an R program using a single CPU core:
#!/bin/bash
# Request resources:
#SBATCH -c 1 # 1 CPU core
#SBATCH --mem=1G # 1 GB RAM
#SBATCH --time=1:0:0 # 1 hour (days-hours:minutes:seconds)
# Run in the 'shared' queue
# (job will share node with other jobs)
#SBATCH -p shared
# Make R available:
module load r/4.1.2
# Commands to be run:
R CMD BATCH my_r_program.R
Submit it to the queue with the command: sbatch my_r_job.sh
Output from the job, including any messages from the batch queue system, will be found in a file called slurm-<jobid>.out, and output from the R program will be found in my_r_program.Rout.
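The job script above requests its CPU cores with -c. If you later request more than one core, the R program can read the allocation from the environment rather than hard-coding it. The following is a rough sketch only: it uses the parallel package that ships with R (not future or Rmpi), relies on Slurm setting SLURM_CPUS_PER_TASK when -c is given, and slow_task is a hypothetical stand-in for real work.

```r
# Illustrative variant of my_r_program.R
library(parallel)  # part of the standard R distribution

# Slurm sets SLURM_CPUS_PER_TASK when the job requests cores with -c.
# Fall back to 1 core if it is unset or unusable, e.g. on a login node.
n_cores <- suppressWarnings(
  as.integer(Sys.getenv("SLURM_CPUS_PER_TASK", unset = "1"))
)
if (is.na(n_cores) || n_cores < 1L) n_cores <- 1L

# Hypothetical workload: sum of squares from 1 to n.
slow_task <- function(n) sum(as.numeric(seq_len(n))^2)

inputs <- c(10, 100, 1000)

# mclapply() forks worker processes; with mc.cores = 1 it runs serially.
results <- mclapply(inputs, slow_task, mc.cores = n_cores)

print(n_cores)
print(unlist(results))
```

With -c 1, as in the job script above, this runs on a single core; raising the -c value in the job script is then the only change needed to use more workers.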
Making R faster
If you need to make your R code run faster, there are a number of things you can try. Roughly, in order of importance (most important first):
- Use a profiler such as the lineprof library, or benchmark using the microbenchmark library to understand: where your program is spending most of its time; where you need to concentrate your effort; and the impact of any changes made to speed up the code.
- Use vectorised functions that act on an object as a whole, such as rowSums(), instead of looping over its individual elements.
- Try writing the most numerically intensive part of your program in another language, such as Fortran or C, and call it from R.
- Try using packages such as future or Rmpi to make use of more than one CPU core.
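The first two suggestions can be illustrated with a small sketch. It uses base R's system.time() for timing instead of lineprof or microbenchmark, so it needs no extra packages. The matrix size and the function name row_sums_loop are arbitrary choices made for illustration, and actual timings will vary from machine to machine.

```r
set.seed(1)
m <- matrix(runif(1e6), nrow = 1000)  # 1000 x 1000 matrix of random numbers

# Element-by-element version: an explicit double loop over rows and columns.
row_sums_loop <- function(x) {
  out <- numeric(nrow(x))
  for (i in seq_len(nrow(x))) {
    for (j in seq_len(ncol(x))) {
      out[i] <- out[i] + x[i, j]
    }
  }
  out
}

# Time both versions; system.time() reports user, system and elapsed seconds.
t_loop <- system.time(a <- row_sums_loop(m))
t_vec  <- system.time(b <- rowSums(m))

# Check that both approaches agree before comparing their speed.
stopifnot(isTRUE(all.equal(a, b)))

print(t_loop["elapsed"])
print(t_vec["elapsed"])
```

Checking that the two versions give the same result before comparing times is a useful habit whenever code is rewritten for speed.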