RStudio



Attention: Research Computing Documentation has Moved

Please use the new documentation site, https://docs.rc.uab.edu/, for all Research Computing documentation needs.

As a result of this move, we have deprecated use of this wiki for documentation. We are providing read-only access to the content to facilitate migration of bookmarks and to serve as a historical record. All content updates should be made at the new documentation site; the original wiki will not receive further updates.

Thank you,

The Research Computing Team

RStudio is an integrated development environment (IDE) for R. It includes a console and a syntax-highlighting editor that supports direct code execution, as well as tools for plotting, history, debugging, and workspace management. To learn more, see the RStudio website.

Starting an RStudio server session

An RStudio server session can be started on Cheaha with the rserver command.

[ravi89@login001 ~]$ rserver
Waiting for RStudio server to start
...

SSH port forwarding from laptop
ssh -L 8700:c0082:8700 ravi89@cheaha.rc.uab.edu

Connection string for local browser
http://localhost:8700

Authorization info for Rstudio
Username: ravi89
Password: ................

[ravi89@login001 ~]$

Accessing the created RStudio session

Once the RStudio session has started, the rserver command prints the information you need to connect to it.

Here are the steps to connect, based on that output:

Port forwarding

SSH port forwarding from laptop
ssh -L 8700:c0082:8700 ravi89@cheaha.rc.uab.edu

If you are on a Mac or Linux system, open a new terminal (or tab) and run the command shown under SSH port forwarding from laptop , which in the example above is ssh -L 8700:c0082:8700 ravi89@cheaha.rc.uab.edu . On a Windows system, you can set up the same port forwarding with your SSH client's tunneling options, as sketched below.
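
The general shape of the tunnel command is the same anywhere an OpenSSH client is available, including PowerShell on Windows 10 and later. A minimal sketch, with placeholders standing in for the values your own rserver output reports:

# Substitute the port and compute node printed by rserver, plus your own
# Cheaha username; the local and remote port numbers match in rserver's output.
ssh -L <port>:<compute_node>:<port> <username>@cheaha.rc.uab.edu

Leave this terminal open: closing it tears down the tunnel, and the browser connection with it.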

Local Browser Connection

Connection string for local browser
http://localhost:8700

Now start a web browser of your choice (Google Chrome, Firefox, Safari, etc.) and go to the address shown under Connection string for local browser , which in the example above is http://localhost:8700 .
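
If the page does not load, you can first verify from another terminal on your laptop that the tunnel is actually up. A quick check, assuming curl is installed locally:

curl -I http://localhost:8700    # any HTTP response means the tunnel and RStudio are reachable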

Authorization Info

Authorization info for Rstudio
Username: ravi89
Password: ................

Each RStudio server session is secured with a random temporary password, shown under Authorization info for Rstudio . Use this information to log in to the RStudio server in your web browser.

Setting your own password

You can set your own password for the RStudio session via the environment variable RSTUDIO_PASSWORD . Set it with the following command on Cheaha before starting rserver:

[ravi89@login001 ~]$ export RSTUDIO_PASSWORD=asdfghjkl
[ravi89@login001 ~]$ rserver 
Waiting for RStudio server to start
.............

SSH port forwarding from laptop
ssh -L 8742:c0076:8742 ravi89@cheaha.rc.uab.edu

Connection string for local browser
http://localhost:8742

Authorization info for Rstudio
Username: ravi89
Password: asdfghjkl

[ravi89@login001 ~]$ 
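
The variable only lasts for your current shell session. If you want the same password every time you log in, one option is to append the export line to your shell startup file; note the obvious tradeoff that the password then sits in plain text in your home directory. A sketch, with a placeholder password:

echo 'export RSTUDIO_PASSWORD=<your_password>' >> ~/.bashrc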

Default parameters

If you run rserver without any additional parameters, it starts with the following defaults:

Partition: Short
Time: 12:00:00
mem-per-cpu: 1024
cpus-per-task: 2
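
These defaults are equivalent to spelling out the request yourself. A sketch using the flags from the next section (note the lowercase partition name, matching the example below):

 rserver --partition=short --time=12:00:00 --mem-per-cpu=1024 --cpus-per-task=2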

Setting parameters

You can pass your own parameters to rserver, such as time, partition, and memory.

Example:

 rserver --time=05:00:00 --partition=short --mem-per-cpu=4096

The full list of parameters you can pass to rserver:

Parallel run options:
  -a, --array=indexes         job array index values
  -A, --account=name          charge job to specified account
      --bb=<spec>             burst buffer specifications
      --bbf=<file_name>       burst buffer specification file
      --begin=time            defer job until HH:MM MM/DD/YY
      --comment=name          arbitrary comment
      --cpu-freq=min[-max[:gov]] requested cpu frequency (and governor)
  -c, --cpus-per-task=ncpus   number of cpus required per task
  -d, --dependency=type:jobid defer job until condition on jobid is satisfied
      --deadline=time         remove the job if no ending possible before
                              this deadline (start > (deadline - time[-min]))
      --delay-boot=mins       delay boot for desired node features
  -D, --workdir=directory     set working directory for batch script
  -e, --error=err             file for batch script's standard error
      --export[=names]        specify environment variables to export
      --export-file=file|fd   specify environment variables file or file
                              descriptor to export
      --get-user-env          load environment from local cluster
      --gid=group_id          group ID to run job as (user root only)
      --gres=list             required generic resources
      --gres-flags=opts       flags related to GRES management
  -H, --hold                  submit job in held state
      --ignore-pbs            Ignore #PBS options in the batch script
  -i, --input=in              file for batch script's standard input
  -I, --immediate             exit if resources are not immediately available
      --jobid=id              run under already allocated job
  -J, --job-name=jobname      name of job
  -k, --no-kill               do not kill job on node failure
  -L, --licenses=names        required license, comma separated
  -M, --clusters=names        Comma separated list of clusters to issue
                              commands to.  Default is current cluster.
                              Name of 'all' will submit to run on all clusters.
                              NOTE: SlurmDBD must be up.
  -m, --distribution=type     distribution method for processes to nodes
                              (type = block|cyclic|arbitrary)
      --mail-type=type        notify on state change: BEGIN, END, FAIL or ALL
      --mail-user=user        who to send email notification for job state
                              changes
      --mcs-label=mcs         mcs label if mcs plugin mcs/group is used
  -n, --ntasks=ntasks         number of tasks to run
      --nice[=value]          decrease scheduling priority by value
      --no-requeue            if set, do not permit the job to be requeued
      --ntasks-per-node=n     number of tasks to invoke on each node
  -N, --nodes=N               number of nodes on which to run (N = min[-max])
  -o, --output=out            file for batch script's standard output
  -O, --overcommit            overcommit resources
  -p, --partition=partition   partition requested
      --parsable              outputs only the jobid and cluster name (if present),
                              separated by semicolon, only on successful submission.
      --power=flags           power management options
      --priority=value        set the priority of the job to value
      --profile=value         enable acct_gather_profile for detailed data
                              value is all or none or any combination of
                              energy, lustre, network or task
      --propagate[=rlimits]   propagate all [or specific list of] rlimits
      --qos=qos               quality of service
  -Q, --quiet                 quiet mode (suppress informational messages)
      --reboot                reboot compute nodes before starting job
      --requeue               if set, permit the job to be requeued
  -s, --oversubscribe         over subscribe resources with other jobs
  -S, --core-spec=cores       count of reserved cores
      --signal=[B:]num[@time] send signal when time limit within time seconds
      --spread-job            spread job across as many nodes as possible
      --switches=max-switches{@max-time-to-wait}
                              Optimum switches and max time to wait for optimum
      --thread-spec=threads   count of reserved threads
  -t, --time=minutes          time limit
      --time-min=minutes      minimum time limit (if distinct)
      --uid=user_id           user ID to run job as (user root only)
      --use-min-nodes         if a range of node counts is given, prefer the
                              smaller count
  -v, --verbose               verbose mode (multiple -v's increase verbosity)
  -W, --wait                  wait for completion of submitted job
      --wckey=wckey           wckey to run job under
      --wrap[=command string] wrap command string in a sh script and submit

Constraint options:
      --contiguous            demand a contiguous range of nodes
  -C, --constraint=list       specify a list of constraints
  -F, --nodefile=filename     request a specific list of hosts
      --mem=MB                minimum amount of real memory
      --mincpus=n             minimum number of logical processors (threads)
                              per node
      --reservation=name      allocate resources from named reservation
      --tmp=MB                minimum amount of temporary disk
  -w, --nodelist=hosts...     request a specific list of hosts
  -x, --exclude=hosts...      exclude a specific list of hosts

Consumable resources related options:
      --exclusive[=user]      allocate nodes in exclusive mode when
                              cpu consumable resource is enabled
      --exclusive[=mcs]       allocate nodes in exclusive mode when
                              cpu consumable resource is enabled
                              and mcs plugin is enabled
      --mem-per-cpu=MB        maximum amount of real memory per allocated
                              cpu required by the job.
                              --mem >= --mem-per-cpu if --mem is specified.

Affinity/Multi-core options: (when the task/affinity plugin is enabled)
  -B  --extra-node-info=S[:C[:T]]            Expands to:
       --sockets-per-node=S   number of sockets per node to allocate
       --cores-per-socket=C   number of cores per socket to allocate
       --threads-per-core=T   number of threads per core to allocate
                              each field can be 'min' or wildcard '*'
                              total cpus requested = (N x S x C x T)

      --ntasks-per-core=n     number of tasks to invoke on each core
      --ntasks-per-socket=n   number of tasks to invoke on each socket
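
As a worked example combining several of these options, the sketch below asks for a longer session on more cores and has Slurm send mail when the job ends. The partition name, limits, and email are placeholders or assumptions; adapt them to your allocation:

 rserver --partition=medium --time=48:00:00 --cpus-per-task=4 \
         --mem-per-cpu=4096 --job-name=rstudio \
         --mail-type=END --mail-user=<your_email>@uab.edu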



Setting per-project package libraries

Please see the page on R Per-Project Package Libraries for more information.
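
As a rough illustration of the idea (the linked page is authoritative), a per-project library can be as simple as pointing R_LIBS_USER at a project-local directory through a .Renviron file in the project root; the paths below are placeholders:

mkdir -p ~/myproject/library
echo 'R_LIBS_USER=~/myproject/library' >> ~/myproject/.Renviron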

Moving rstudio directory

As you accumulate RStudio session data and packages, the hidden .rstudio directory can take up a lot of space in your $HOME directory, which can lead to interactive sessions failing to start. You can resolve this by moving the directory and creating a symlink to the new location in its place.

How to: Move a pre-existing rstudio directory and create a symlink

cd ~                                  # start from your home directory
mv ~/.rstudio $USER_DATA/             # move the directory to your $USER_DATA space
ln -s $USER_DATA/.rstudio .rstudio    # leave a symlink at the old location
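
Afterwards, you can confirm the symlink points where you expect:

ls -ld ~/.rstudio    # should show a symlink pointing into $USER_DATA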