These patterns apply to both the RCC and DSI clusters because both use SLURM. The templates below are designed to be copied into your project and edited.
When to use interactive vs batch jobs¶
| Use case | Recommended |
|---|---|
| Debugging code | srun |
| Testing environments | srun |
| Exploratory analysis | srun |
| Long-running jobs | sbatch |
| Production workflows | sbatch |
sbatch: batch job submissions¶
Before submitting:
Create a logs/ directory (and any output directories your code expects)
Update #SBATCH --partition=... to a partition you can access (see the check below)
Request the minimum resources you need so jobs start sooner
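For example, a minimal pre-submission check might look like the following; the project path is a placeholder and general is simply the partition used in the templates below, so adjust both to your cluster:
# Hypothetical project location; adjust to your own layout
cd /path/to/your/project
mkdir -p logs results
# Confirm the partition you plan to request exists and is up, and note its time limit
sinfo -p general -o "%P %a %l %D"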
GPU job (batch)¶
Request the minimum time/memory/GPU you need
Smaller requests often start sooner
Submit:
sbatch scripts/gpu.sbatch
You may want to determine what partitions you can run on by reviewing a list of available partitions on the cluster:
sinfo -a
(RCC partitions: https://)
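To see which partitions advertise GPUs, one option is to include the generic-resources column in the sinfo output:
# %P = partition, %G = generic resources (e.g. GPUs), %a = availability
sinfo -a -o "%P %G %a"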
File contents:
#!/bin/bash
# NOTE: Consider enabling SLURM email notifications (see Appendix: SLURM Email Notifications)
#SBATCH --job-name=gpu_example
##SBATCH --account=pi-account # <-- change to an allowed account on your cluster - RCC CLUSTER ONLY (uncomment if needed)
#SBATCH --partition=general # <-- change to an allowed GPU partition on your cluster
#SBATCH --gres=gpu:1 # <-- request 1 GPU (adjust as needed)
##SBATCH --gres=local:200G # <-- Request node local storage - DSI CLUSTER ONLY (uncomment if needed)
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=32G
#SBATCH --time=02:00:00
#SBATCH --output=/path/to/logs/%x_%j.out
#SBATCH --error=/path/to/logs/%x_%j.err
# ===============================
# GPU JOB TEMPLATE (ANNOTATED)
# ===============================
# Tips:
# - Request the minimum resources you need (time/mem/CPU/GPU) to start sooner.
# - On RCC compute nodes, outbound internet is typically blocked.
# - Create a logs/ directory (and any output dirs) before submitting.
# - Uncomment the steps you need below and customize them for your workflow.
set -euo pipefail
# 1) Load modules (adjust to your software stack)
# module purge
# module load cuda/12.1 # example; check `module avail` on your cluster
# 2) Activate your environment
# Recommended: keep envs in your project/scratch, not in $HOME if large.
# source /path/to/venv/bin/activate
# OR for conda/mamba:
# source ~/.bashrc
# conda activate myenv
# 3) (Optional) Use node-local scratch for high I/O temporary files
# SLURM may set $TMPDIR / $SLURM_TMPDIR on some clusters.
# If set, it is FAST but DELETED when the job ends.
WORKDIR="${SLURM_TMPDIR:-/local/scratch}/${USER}_${SLURM_JOB_ID}"
mkdir -p "$WORKDIR"
echo "Working directory: $WORKDIR"
# 4) Copy inputs to node-local storage (optional)
# cp -r /path/to/input "$WORKDIR/"
# 5) Run your workload
# Example: Python training script (replace with your command)
python -u /path/to/your/project/gpu.py \
--epochs 5 \
--batch-size 64 \
--outdir "${WORKDIR}/run_${SLURM_JOB_ID}"
# 6) Copy outputs back to persistent storage if you used node-local scratch
# rsync -av "$WORKDIR/" /scratch/midway3/$USER/somewhere/
# 7) Deactivate your environment
# deactivate
# conda deactivate
echo "Done."
See this repository directory for a full example: https://
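After submitting, a quick way to confirm the job was accepted and to follow its output, assuming the log path matches the #SBATCH --output pattern shown above:
sbatch scripts/gpu.sbatch
squeue -u $USER
# Once the job is running, follow its log; %x_%j.out expands to gpu_example_<JOBID>.out
tail -f /path/to/logs/gpu_example_<JOBID>.out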
Job arrays¶
Run many similar jobs over different inputs. Reference: https://
Submit:
sbatch scripts/array.sbatch
File contents:
#!/bin/bash
# NOTE: Consider enabling SLURM email notifications (see Appendix: SLURM Email Notifications)
#SBATCH --job-name=array_example
##SBATCH --account=<PI_ACCOUNT> # <-- change to an allowed account on your cluster - RCC CLUSTER ONLY (uncomment if needed)
#SBATCH --partition=<PARTITION> # <-- change to an allowed partition on your cluster
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --mem=4G
##SBATCH --gres=local:200G # <-- Request node local storage - DSI CLUSTER ONLY (uncomment if needed)
#SBATCH --time=00:30:00
#SBATCH --array=0-9 # 10 tasks: indices 0..9
#SBATCH --output=/path/to/logs/%x_%A_%a.out
#SBATCH --error=/path/to/logs/%x_%A_%a.err
# =====================================
# JOB ARRAY TEMPLATE (ANNOTATED)
# =====================================
# Use job arrays when you have many similar tasks over different inputs:
# - parameter sweeps
# - per-sample preprocessing
# - independent simulations
#
# Key environment variables:
# - SLURM_ARRAY_JOB_ID (the parent job ID)
# - SLURM_ARRAY_TASK_ID (the index for this array task)
set -euo pipefail
TASK_ID="${SLURM_ARRAY_TASK_ID}"
echo "Array task: ${TASK_ID}"
# Option A: Pass input file directly to Python script
# The Python script will use SLURM_ARRAY_TASK_ID to select which element to process
MANIFEST="/path/to/your/input.json"
echo "Input file: ${MANIFEST}"
echo "Array task ID: ${TASK_ID}"
# Run your program
python -u /path/to/your/array.py --input "$MANIFEST" --output "/path/to/your/results/out_${TASK_ID}.txt"
# Option B: Map task IDs to input file elements via a manifest (alternative approach)
# MANIFEST="inputs.txt"
# INPUT=$(sed -n "$((TASK_ID+1))p" "$MANIFEST") # +1 because sed is 1-indexed
# echo "Input for this task: ${INPUT}"
# python -u array.py --input "$INPUT" --output "outputs/out_${TASK_ID}.txt"
# Option C: Map task IDs to parameters (example)
# PARAM=$(python - <<'PY'
# import os
# tid = int(os.environ["SLURM_ARRAY_TASK_ID"])
# print([0.1,0.2,0.5,1.0,2.0][tid])
# PY
# )
# echo "Param: $PARAM"
echo "Done."
See this repository directory for a full example: https://
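If you use the manifest approach (Option B), you can size the array at submission time instead of hard-coding --array=0-9 in the script; the command-line value overrides the directive in the file, and inputs.txt here is the hypothetical manifest from Option B:
# One input per line; array indices run 0..(number of lines - 1)
N=$(wc -l < inputs.txt)
sbatch --array=0-$((N-1)) scripts/array.sbatch
# Append a throttle such as --array=0-99%10 to limit how many tasks run at once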
Multi-node MPI jobs¶
Run a job across multiple nodes in the cluster. Reference: https://
Submit:
sbatch scripts/mpi.sbatch
RCC Cluster submission file contents:
#!/bin/bash
#SBATCH --account=<PI_ACCOUNT> # <-- change to an allowed account
#SBATCH --job-name=mpi_example
#SBATCH --partition=<PARTITION> # <-- change to an allowed partition
#SBATCH --nodes=2                 # <-- multiple nodes for the multi-node example
#SBATCH --ntasks-per-node=16
#SBATCH --cpus-per-task=1
#SBATCH --mem=0
#SBATCH --time=01:00:00
#SBATCH --output=/path/to/your/project/logs/%x_%j.out # <-- change to your project's logs directory
#SBATCH --error=/path/to/your/project/logs/%x_%j.err # <-- change to your project's logs directory
set -euo pipefail
# ---- User paths ----
SCRIPT="/path/to/your/project/mpi.py"
INPUT="/path/to/your/project/mpi_input.txt"
LOGS="/path/to/your/project/logs"
RESULTS="/path/to/your/project/results"
VENV="/path/to/your/project/.venv"
mkdir -p "$LOGS" "$RESULTS"
# ---- Runtime settings ----
export PYTHONUNBUFFERED=1
# Open MPI transport preferences (Midway3)
# - Try UCX first; keep TCP + shared memory as fallback
export OMPI_MCA_pml=ucx
export OMPI_MCA_btl=self,vader,tcp
export OMPI_MCA_btl_tcp_if_include=ib0
# ---- Environment ----
module load openmpi/4.1.8
source "${VENV}/bin/activate"
echo "Nodes allocated:"
scontrol show hostnames "$SLURM_NODELIST"
echo "Total tasks: ${SLURM_NTASKS} (ntasks-per-node: ${SLURM_NTASKS_PER_NODE:-unknown})"
# Hostfile from Slurm allocation (cleaned up on exit)
HOSTFILE="$(mktemp "${SLURM_SUBMIT_DIR:-/tmp}/hostfile.${SLURM_JOB_ID}.XXXX")"
trap 'rm -f "$HOSTFILE"' EXIT
scontrol show hostnames "$SLURM_NODELIST" > "$HOSTFILE"
# Optional debug: sbatch --export=ALL,DEBUG=1 mpi.sbatch
DEBUG="${DEBUG:-0}"
if [[ "$DEBUG" == "1" ]]; then
echo "Hostfile: $HOSTFILE"
cat "$HOSTFILE"
echo "Interfaces per node:"
srun -N "$SLURM_JOB_NUM_NODES" -n "$SLURM_JOB_NUM_NODES" --ntasks-per-node=1 \
bash -lc 'echo HOST=$(hostname); ip -o -4 addr show | awk "{print \$2,\$4}"'
echo "mpi4py linked against:"
python -c "import mpi4py.MPI as M; print(M.Get_library_version())"
fi
# ---- Launch (Open MPI via mpirun; works around Slurm PMI/PMIx limitations) ----
mpirun -np "$SLURM_NTASKS" \
--hostfile "$HOSTFILE" \
--map-by "ppr:${SLURM_NTASKS_PER_NODE}:node" \
--bind-to none \
--tag-output --timestamp-output \
python -u "$SCRIPT" \
--input "$INPUT" \
--output "${RESULTS}/out_${SLURM_JOB_ID}.txt"
# ---- Deactivate environment ----
# conda deactivate
deactivate
echo "Done."
DSI Cluster submission file contents:
#!/bin/bash
#SBATCH --job-name=mpi_example
#SBATCH --partition=general
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=16
#SBATCH --cpus-per-task=1
#SBATCH --mem=0
#SBATCH --time=01:00:00
#SBATCH --output=/path/to/your/project/logs/%x_%j.out
#SBATCH --error=/path/to/your/project/logs/%x_%j.err
set -euo pipefail
# Optional debug: sbatch --export=ALL,DEBUG=1 mpi.sbatch
DEBUG="${DEBUG:-0}"
# ---- Paths ----
SCRIPT="/path/to/your/project/mpi.py"
INPUT="/path/to/your/project/mpi_input.txt"
LOGS="/path/to/your/project/logs"
RESULTS="/path/to/your/project/results"
VENV="/path/to/your/project/.venv"
MPI_TYPE="pmix_v3"
mkdir -p "$LOGS" "$RESULTS"
# ---- Runtime settings ----
export PYTHONUNBUFFERED=1
# Force Open MPI to use TCP over a clean interface set (avoid docker/loopback bridges)
export OMPI_MCA_btl=self,tcp
export OMPI_MCA_btl_tcp_if_exclude=lo,docker0,virbr0
# Conservative PML (avoid UCX surprises on this cluster)
export OMPI_MCA_pml=ob1
# ---- Environment ----
# module purge
# module load openmpi/<version>
source "${VENV}/bin/activate"
echo "Nodes allocated:"
scontrol show hostnames "$SLURM_NODELIST"
echo "Total tasks: ${SLURM_NTASKS} (ntasks-per-node: ${SLURM_NTASKS_PER_NODE:-unknown})"
echo "DEBUG=${DEBUG} MPI_TYPE=${MPI_TYPE}"
if [[ "$DEBUG" == "1" ]]; then
echo "Interfaces per node:"
srun -N "$SLURM_JOB_NUM_NODES" -n "$SLURM_JOB_NUM_NODES" --ntasks-per-node=1 \
bash -lc 'echo HOST=$(hostname); ip -o -4 addr show | awk "{print \$2,\$4}"'
echo "mpi4py linked against:"
python -c "import mpi4py.MPI as M; print(M.Get_library_version())"
fi
# ---- Launch ----
SRUN_ARGS=(--mpi="${MPI_TYPE}" --label --kill-on-bad-exit=1)
if [[ "$DEBUG" == "1" ]]; then
SRUN_ARGS+=(--output="${LOGS}/%x_%j_%t.out" --error="${LOGS}/%x_%j_%t.err")
fi
srun "${SRUN_ARGS[@]}" \
python -u "$SCRIPT" \
--input "$INPUT" \
--output "${RESULTS}/out_${SLURM_JOB_ID}.txt"
deactivate
echo "Done."
See this repository directory for a full example: https://
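The DSI template launches with srun --mpi=pmix_v3. Before relying on a specific plugin, you can check what your cluster's Slurm and module system actually provide:
# List the MPI plugin types this Slurm installation supports (e.g. pmix, pmi2)
srun --mpi=list
# See which Open MPI builds are installed as modules
module avail openmpi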
srun: interactive jobs¶
Interactive jobs let you run commands directly on a compute node instead of submitting a batch script. They are ideal for:
Debugging jobs that fail in batch mode
Testing software environments and modules
Running short experiments or exploratory analyses
Developing code before scaling up to batch jobs
srun is the recommended and most flexible way to start an interactive session directly on a compute node.
Basic interactive CPU job¶
srun --partition=general --nodes=1 --ntasks=1 --cpus-per-task=2 --mem=4G --time=01:00:00 --pty bash -i
Use this to (see the example session below):
Activate virtual environments
Compile code
Run small test cases
Verify file paths and permissions
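A typical session inside that interactive shell might look like the following; the module name, environment path, and test script are placeholders:
# On the compute node, inside the interactive shell
module load python                # hypothetical module name; check `module avail`
source /path/to/venv/bin/activate
python small_test.py              # quick test before scaling up to sbatch
exit                              # end the session and release the allocation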
Interactive GPU job¶
srun --partition=schmidt-gpu --gres=gpu:1 --cpus-per-task=4 --mem=32G --time=01:30:00 --pty bash
Verify GPU access:
nvidia-smi
Attach to a running job¶
srun --jobid=<JOBID> --pty bash
Useful for inspecting logs or diagnosing issues.
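Once attached, you are in a shell inside the job's allocation and can inspect it directly, for example:
# Inside the attached shell on the compute node
top -u $USER                                  # CPU and memory usage of your processes
nvidia-smi                                    # GPU utilization (GPU jobs only)
tail -f /path/to/logs/<jobname>_<JOBID>.out   # follow the job's log, if written there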
sinteractive: convenience wrapper (RCC only)¶
sinteractive is an RCC-provided wrapper around srun.
Example:
sinteractive
With options:
sinteractive --mem=8G --time=01:00:00
Notes:
RCC-specific (not available on DSI)
Defaults may request more resources than intended
Prefer srun for clarity and reproducibility
Open OnDemand (RCC only)¶
Open OnDemand (OOD) is a web-based interface to the RCC clusters. It provides an alternative to the command line for common tasks and is especially useful for visualization, notebooks, and interactive workflows.
RCC documentation: https://
What Open OnDemand is good for
Use Open OnDemand when you want to:
Browse and manage files through a web interface
Launch interactive desktops on compute nodes
Run Jupyter notebooks without setting up port forwarding
Use GUI-based tools (e.g., visualization, IDEs) on cluster hardware
Open OnDemand still submits jobs through SLURM—it does not bypass scheduling or resource limits.
Accessing Open OnDemand
Log in with your UChicago credentials
Choose an app or interactive session from the menu
Open OnDemand vs srun and sbatch
| Task | Recommended |
|---|---|
| Debugging via terminal | srun |
| Jupyter notebooks | Open OnDemand |
| Interactive desktop / GUI apps | Open OnDemand |
| Scripted, repeatable workflows | sbatch |
| Lightweight exploration | Either |
Important notes
Open OnDemand sessions:
Run on compute nodes, not login nodes
Count against your allocation
Are subject to the same time and memory limits as srun
If a session ends or times out, unsaved work may be lost
For long or unattended jobs, always use sbatch
Common mistakes¶
Forgetting --time
Requesting excessive resources
Running large jobs interactively
Leaving interactive sessions idle (see the cleanup example below)
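If you notice an idle interactive session still holding resources, you can find it and cancel it yourself:
# List your jobs with state and elapsed time
squeue -u $USER -o "%.18i %.8j %.2t %.10M %R"
# Cancel the idle session by its job ID
scancel <JOBID>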
Software Modules¶
On RCC clusters (like Midway2 / Midway3) and to some extent the DSI cluster, most scientific software isn’t available in your shell by default.
Instead, packaged tools, languages, libraries, and compilers are managed through Environment Modules, a system that lets you load, unload, and switch software versions cleanly.
A module is basically a script that sets up your environment (e.g., PATH, LD_LIBRARY_PATH) so a specific software package and its dependencies become available in your session.
You use the module command (module avail, module load, etc.) to interact with these.
The benefit: no conflicting software versions, easy switching between versions, and reproducibility across compute sessions.
Modules may also provide:
Libraries (e.g., FFTW, MKL) used by other software
Developer tools (CMake, debuggers, profilers)
Language environments (Perl, Java)
These support both building complex codes and running them smoothly on compute nodes.
How to run software modules¶
See what’s available:
module avail
Load what you need:
module load <software>/<version>
Check what's loaded:
module list
Run your code inside a job script or interactive session (see the example below).
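Inside a batch script, the same commands are usually combined with module purge so every run starts from a clean, reproducible environment; the module name below is a placeholder, so check module avail on your cluster:
# After the #SBATCH lines in your job script
module purge              # start from a clean environment
module load python/3.11   # hypothetical module/version
module list               # record what was loaded in the job log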
Monitoring Jobs¶
Once you submit a job, Slurm provides several commands to help you track its status, priority, and resource usage on RCC clusters.
Check Job Status: squeue¶
View your running and pending jobs:
squeue -u $USER
Key fields include job ID, state (PD = pending, R = running), elapsed time, and node or pending reason.
For a custom view:
squeue -u $USER -o "%.18i %.9P %.8j %.2t %.10M %.6D %R"
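For pending jobs, Slurm can also report an estimated start time (the estimate is not guaranteed and may be blank while the scheduler is still evaluating the job):
squeue -u $USER --start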
Check Job Priority: sprio¶
If a job is pending, use sprio to understand its scheduling priority:
sprio -u $USER
sprio only prints output for pending jobs. It shows how factors like job age, fairshare, and job size affect when your job will start.
Check Job Efficiency: seff¶
After a job finishes, summarize how efficiently it used resources:
seff <JOBID>
Use the reported CPU and memory efficiency to adjust future job requests.
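If job accounting is enabled on the cluster (an assumption; the command prints nothing useful otherwise), sacct gives a similar per-job summary and also works while the job is still running:
sacct -j <JOBID> --format=JobID,JobName,Elapsed,MaxRSS,State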