# RCC Computing Guide
Here we show the link needed to sign up for an account, and then the appropriate answers to each question in the application.
- Obtain a new account. Visit this link: RCC Website Link
- Use the following responses to answer the application questions:
  - Principal Investigator account name: `pi-nord`
  - Software and system tools that you anticipate using for computational research at the RCC: "We will use scientific Python and deep learning codebases."
  - A brief summary of your work that will use RCC resources: "We will perform research at the intersection of physics, cosmology, and artificial intelligence."
If you want multiple accounts, apply as above and then apply a second time, listing the first account on that application (there's a space to list existing account affiliations).
RCC access does not require the use of a VPN, but it can make remote notebook access (see here) easier.
- Download Cisco AnyConnect Secure Mobility Client here
- Log into the VPN using the address `vpn.uchicago.edu` in the Cisco AnyConnect Secure Mobility Client dialog box
- Authenticate with your Duo multi-factor authentication application
- Log in with SSH at the command line: `ssh <cnetid>@midway2.rcc.uchicago.edu`
- Authenticate your ID with the Duo multi-factor application.
- Create an alias on your local machine to simplify your login:
  - Open `~/.bash_profile` locally (on your home machine).
  - Add this line: `alias sshrcc='> ~/.ssh/known_hosts; ssh midway2.rcc.uchicago.edu'`
  - Save and exit the file.
  - Test at the command line: `sshrcc`
- RCC often changes its IP address, which may cause SSH host-key errors on your local machine. This is why we recommend creating the alias.
- `<cnetid>` is your UChicago username.
- If the username on your computer is the same as your `<cnetid>`, you can use `ssh midway2.rcc.uchicago.edu` instead.
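You can check what the alias will do before relying on it: define it in a shell and ask the shell to print it back. The `> ~/.ssh/known_hosts` part empties the known-hosts file, which is what sidesteps the host-key error when RCC's IP address changes:

```shell
# Define the alias exactly as in ~/.bash_profile ...
alias sshrcc='> ~/.ssh/known_hosts; ssh midway2.rcc.uchicago.edu'
# ... then ask the shell to print the stored definition back.
alias sshrcc
```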
- `login` nodes are your landing nodes -- you always log in to a `login` node
- `compute` nodes can be accessed from the `login` nodes -- use these for memory-intensive computations
- Functioning:
  - Default (the most robust choice: it appears to assign you to the least-used login node when both are up, or to the live one when one is down -- this is our best interpretation; we haven't confirmed with RCC staff): `ssh midway2.rcc.uchicago.edu`
  - login1 (if you know you want to be on this particular node): `ssh midway2-login1.rcc.uchicago.edu`
  - login2 (if you know you want to be on this particular node): `ssh midway2-login2.rcc.uchicago.edu`
- Non-Functioning:
  - `midway2-login3.rcc.uchicago.edu` is not typically available
  - `midway.rcc.uchicago.edu` is decommissioned
- The most common (and recommended) way to access compute nodes is by running `sinteractive` (with optional flags as described on the RCC website here, and in this guide here)
- If you want to run a job for longer than you wish to be actively logged in, you can use `tmux`
- If you want to submit many jobs at once and don't want a different `tmux` session for each, consider batch computing instead, as described here
- Functionality is described in the user guide here.
- RCC recommends `sinteractive` for most use cases to select a node for compute.
- If `kicpaa` is your primary affiliation (rather than `pi-nord`), it might work without the flags.
- There is a separate partition for the KICP/A&A GPU allotment -- to access it, use a different partition from the above, but the same account name:
- To use a CPU: `sinteractive -A kicpaa -p kicpaa`
- To use a GPU: `sinteractive -p kicpaa-gpu -A kicpaa`
- Create a `.bash_profile` in your home directory on RCC and add any commands you want to run by default immediately upon logging in
- I always load tmux in case I wind up running a job for longer than I want to remain actively logged in, so I include `module load tmux`
- Activate a virtual environment (described more here): `source activate <env_name>`
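Putting the pieces above together, a minimal `~/.bash_profile` on the RCC side might look like the following sketch (the module name and the `<env_name>` placeholder follow the conventions used elsewhere in this guide; adjust both to your setup):

```shell
# ~/.bash_profile on RCC -- runs at every login
module load tmux                      # so long jobs can outlive a dropped connection
module load python/anaconda-2021.05   # Anaconda python (latest as of March 2023)
source activate <env_name>            # RCC-preferred activation; never conda init/activate
```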
- RCC uses `conda` for package management
- RCC provides many prebuilt conda modules.
- For guidance on creating virtual environments, see the second warning in this list of "mistakes to avoid"
- After creating a virtual environment named `env_name`, RCC prefers `source activate <env_name>`, as described here
- Load the latest Anaconda python (as of March 2023): `module load python/anaconda-2021.05`
- Use `source activate`
- Never use `conda init` (it has been known to break things like ThinLinc)
- Never use `conda activate`
There is an RCC guide for running Jupyter notebooks available here. Usage depends on whether or not you're on VPN.
The basic approach is as follows:
- Write a script: `mybatchscript.sh`
- Submit the script: `sbatch mybatchscript.sh <arguments>`
The batch file can use different flags. For instance, to run on the kicpaa partition with the kicpaa account while passing two arguments to a script mypyscript.py, the contents would look like:
```shell
#!/bin/bash
#SBATCH --job-name=Hruns
#SBATCH --time=02:00:00
#SBATCH --account=kicpaa
#SBATCH --partition=kicpaa
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=12G
python mypyscript.py $1 $2
```

and the two arguments get passed to mypyscript.py via `sys.argv` (for example). With these tools, you can iterate over the two arguments from 10 to 15 and 0 to 5, respectively, with `for i in {0..5}; do for j in {10..15}; do sbatch <file name>.sh $j $i; done; done`
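To see the argument flow concretely, here is a hypothetical stand-in for mypyscript.py (the real script's contents aren't part of this guide), run directly with the two values the batch machinery would forward:

```shell
# Create a stand-in mypyscript.py that just reports its arguments.
cat > mypyscript.py <<'EOF'
import sys
j, i = sys.argv[1], sys.argv[2]  # the two values forwarded by the batch script
print(f"j={j} i={i}")
EOF
# Invoke it the way a batch job would, e.g. with j=12 and i=3:
python mypyscript.py 12 3   # prints: j=12 i=3
```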
Submitting that many independent jobs might be unnecessarily slow if SLURM is reluctant to dispatch them all. An alternative is to use the `--array` flag of sbatch, in which case mybatchscript.sh instead looks like:
```shell
#!/bin/bash
#SBATCH --job-name=Hruns
#SBATCH --time=02:00:00
#SBATCH --array=10-15
#SBATCH --account=kicpaa
#SBATCH --partition=kicpaa
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=12G
python mypyscript.py $SLURM_ARRAY_TASK_ID $1
```

The `--array` flag will ensure that mypyscript.py is run with the first argument taking on values from 10-15 (inclusive) for any value of the second argument. To iterate over the second argument from 0-5 (inclusive), use the following at the CLI:
`for i in {0..5}; do sbatch mybatchscript.sh $i; done`

These two sets of scripts and commands produce the same output, but the scheduler handles them differently (e.g., array-job names carry a subscript with their $SLURM_ARRAY_TASK_ID value, so the output files will be slurm_123456789_10.out, slurm_123456789_11.out, etc., rather than slurm_123456789.out, slurm_123456790.out, slurm_123456791.out, etc.)
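Before submitting for real, it can help to dry-run a submission loop by echoing the commands it would issue (a sketch, not from the RCC guide; drop the `echo` to actually submit):

```shell
# Print each sbatch invocation instead of submitting it.
for i in {0..5}; do
  echo sbatch mybatchscript.sh $i
done
```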
Compute nodes aren't connected to the internet, so if you want to clone a GitHub repository hosted at https://github.com/<myrepo>, do the following on a login node.
- Log in to a login node
- At the CLI: `git clone https://github.com/myrepo.git`
- Provide your username and either your password (which you will need to re-enter every time) or a token (as documented by git here)
- If you need compilers to install a development package, RCC has `gcc`, though it must be loaded with `module load gcc`
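If re-entering the token every time gets tedious, git's built-in credential store can remember it after the first successful authentication (an optional convenience, not from the RCC docs; note it writes the token in plain text to `~/.git-credentials`):

```shell
# Remember credentials after the first successful authentication.
git config --global credential.helper store
# Verify the setting:
git config --global credential.helper   # prints: store
```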
- DeepBench
- DeepGotData
- DeepUtils
- Google Colaboratory
- Elastic Analysis Facility (EAF; Fermilab)
- Research Computing Center (UChicago)
- coming soon.
