
Running Jobs (SLURM)

These patterns apply to both RCC and DSI because both use SLURM. These templates are designed to be copied into your project and edited.

When to use interactive vs batch jobs

Use case               | Recommended
Debugging code         | srun
Testing environments   | srun
Exploratory analysis   | srun
Long-running jobs      | sbatch
Production workflows   | sbatch

sbatch Batch job submissions

Before submitting:

GPU job (batch)

Submit:

sbatch scripts/gpu.sbatch

To determine which partitions you can run on, list the available partitions on the cluster:

sinfo -a

(RCC partitions: https://docs.rcc.uchicago.edu/partitions/)
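
If you need a GPU partition, a custom sinfo format can also show the generic resources (GRES) each partition offers; the format string below is one reasonable choice:

sinfo -o "%P %G %l %D"    # partition, GRES (e.g. gpu:...), time limit, node count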

File contents:

gpu.sbatch
#!/bin/bash
# NOTE: Consider enabling SLURM email notifications (see Appendix: SLURM Email Notifications)
#SBATCH --job-name=gpu_example
##SBATCH --account=<PI_ACCOUNT>  # <-- change to an allowed account on your cluster - RCC CLUSTER ONLY (uncomment if needed)
#SBATCH --partition=general      # <-- change to an allowed GPU partition on your cluster
#SBATCH --gres=gpu:1             # <-- request 1 GPU (adjust the count as needed)
##SBATCH --gres=local:200G       # <-- Request node local storage - DSI CLUSTER ONLY (uncomment if needed)
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=32G
#SBATCH --time=02:00:00
#SBATCH --output=/path/to/logs/%x_%j.out
#SBATCH --error=/path/to/logs/%x_%j.err

# ===============================
# GPU JOB TEMPLATE (ANNOTATED)
# ===============================
# Tips:
# - Request the minimum resources you need (time/mem/CPU/GPU) to start sooner.
# - On RCC compute nodes, outbound internet is typically blocked.
# - Create a logs/ directory (and any output dirs) before submitting.
# - Uncomment and edit the steps below to customize this script to your workflow.

set -euo pipefail

# 1) Load modules (adjust to your software stack)
# module purge
# module load cuda/12.1  # example; check `module avail` on your cluster

# 2) Activate your environment
# Recommended: keep envs in your project/scratch, not in $HOME if large.
# source /path/to/venv/bin/activate
# OR for conda/mamba:
# source ~/.bashrc
# conda activate myenv

# 3) (Optional) Use node-local scratch for high I/O temporary files
# SLURM may set $TMPDIR / $SLURM_TMPDIR on some clusters.
# If set, it is FAST but DELETED when the job ends.
WORKDIR="${SLURM_TMPDIR:-/local/scratch}/${USER}_${SLURM_JOB_ID}"
mkdir -p "$WORKDIR"
echo "Working directory: $WORKDIR"

# 4) Copy inputs to node-local storage (optional)
# cp -r /path/to/input "$WORKDIR/"

# 5) Run your workload
# Example: Python training script (replace with your command)
python -u /path/to/your/gpu.py \
  --epochs 5 \
  --batch-size 64 \
  --outdir "${WORKDIR}/run_${SLURM_JOB_ID}"

# 6) Copy outputs back to persistent storage if you used node-local scratch
# rsync -av "$WORKDIR/" /scratch/midway3/$USER/somewhere/

# 7) Deactivate your environment
# deactivate
# conda deactivate

echo "Done."

See this repository directory for the full example: https://github.com/chicago-aiscience/chicago-aiscience.github.io/tree/main/docs/user_guide/scripts/gpu
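
As the template's tips note, the logs directory referenced by --output/--error must exist before you submit. One possible sequence (paths are placeholders):

mkdir -p /path/to/logs        # create the directory used by --output/--error
sbatch scripts/gpu.sbatch     # submit the job
squeue -u $USER               # confirm it is queued (PD) or running (R)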

Job arrays

Run many similar jobs over different inputs. Reference: https://slurm.schedmd.com/job_array.html

Submit:

sbatch scripts/array.sbatch

File contents:

array.sbatch
#!/bin/bash
# NOTE: Consider enabling SLURM email notifications (see Appendix: SLURM Email Notifications)
#SBATCH --job-name=array_example
##SBATCH --account=<PI_ACCOUNT>        # <-- change to an allowed account on your cluster - RCC CLUSTER ONLY (uncomment if needed)
#SBATCH --partition=<PARTITION>         # <-- change to an allowed partition on your cluster
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --mem=4G
##SBATCH --gres=local:200G              # <-- Request node local storage - DSI CLUSTER ONLY (uncomment if needed)
#SBATCH --time=00:30:00
#SBATCH --array=0-9                     # 10 tasks: indices 0..9
#SBATCH --output=/path/to/logs/%x_%A_%a.out
#SBATCH --error=/path/to/logs/%x_%A_%a.err

# =====================================
# JOB ARRAY TEMPLATE (ANNOTATED)
# =====================================
# Use job arrays when you have many similar tasks over different inputs:
# - parameter sweeps
# - per-sample preprocessing
# - independent simulations
#
# Key environment variables:
# - SLURM_ARRAY_JOB_ID  (the parent job ID)
# - SLURM_ARRAY_TASK_ID (the index for this array task)

set -euo pipefail

TASK_ID="${SLURM_ARRAY_TASK_ID}"
echo "Array task: ${TASK_ID}"

# Option A: Pass input file directly to Python script
# The Python script will use SLURM_ARRAY_TASK_ID to select which element to process
MANIFEST="/path/to/your/input.json"
echo "Input file: ${MANIFEST}"
echo "Array task ID: ${TASK_ID}"

# Run your program
python -u /path/to/your/array.py --input "$MANIFEST" --output "/path/to/your/results/out_${TASK_ID}.txt"

# Option B: Map task IDs to input file elements via a manifest (alternative approach)
# MANIFEST="inputs.txt"
# INPUT=$(sed -n "$((TASK_ID+1))p" "$MANIFEST")  # +1 because sed is 1-indexed
# echo "Input for this task: ${INPUT}"
# python -u array.py --input "$INPUT" --output "outputs/out_${TASK_ID}.txt"

# Option C: Map task IDs to parameters (example)
# PARAM=$(python - <<'PY'
# import os
# tid = int(os.environ["SLURM_ARRAY_TASK_ID"])
# print([0.1,0.2,0.5,1.0,2.0][tid])
# PY
# )
# echo "Param: $PARAM"

echo "Done."

See this repository directory for the full example: https://github.com/chicago-aiscience/chicago-aiscience.github.io/tree/main/docs/user_guide/scripts/array
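
Option B works well when you generate the manifest just before submitting. A minimal sketch (the data path and file pattern are hypothetical) that also sizes the array to match the manifest:

ls /path/to/your/data/*.csv > inputs.txt          # one input path per line
N=$(wc -l < inputs.txt)                           # number of tasks needed
sbatch --array=0-$((N-1)) scripts/array.sbatch    # command-line --array overrides the value in the script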

Multi-node MPI jobs

Run a program across multiple nodes of the cluster. Reference: https://docs.open-mpi.org/en/main/launching-apps/slurm.html

Submit:

sbatch scripts/mpi.sbatch

RCC Cluster submission file contents:

mpi-rcc.sbatch
#!/bin/bash
#SBATCH --account=<PI_ACCOUNT>    # <-- change to an allowed account
#SBATCH --job-name=mpi_example
#SBATCH --partition=<PARTITION>    # <-- change to an allowed partition
#SBATCH --nodes=2                  # <-- adjust the node count as needed
#SBATCH --ntasks-per-node=16
#SBATCH --cpus-per-task=1
#SBATCH --mem=0
#SBATCH --time=01:00:00
#SBATCH --output=/path/to/your/project/logs/%x_%j.out    # <-- change to your project's logs directory
#SBATCH --error=/path/to/your/project/logs/%x_%j.err    # <-- change to your project's logs directory

set -euo pipefail

# ---- User paths ----
SCRIPT="/path/to/your/project/mpi.py"
INPUT="/path/to/your/project/mpi_input.txt"
LOGS="/path/to/your/project/logs"
RESULTS="/path/to/your/project/results"
VENV="/path/to/your/project/.venv"

mkdir -p "$LOGS" "$RESULTS"

# ---- Runtime settings ----
export PYTHONUNBUFFERED=1

# Open MPI transport preferences (Midway3)
# - Try UCX first; keep TCP + shared memory as fallback
export OMPI_MCA_pml=ucx
export OMPI_MCA_btl=self,vader,tcp
export OMPI_MCA_btl_tcp_if_include=ib0

# ---- Environment ----
module load openmpi/4.1.8
source "${VENV}/bin/activate"

echo "Nodes allocated:"
scontrol show hostnames "$SLURM_NODELIST"
echo "Total tasks: ${SLURM_NTASKS} (ntasks-per-node: ${SLURM_NTASKS_PER_NODE:-unknown})"

# Hostfile from Slurm allocation (cleaned up on exit)
HOSTFILE="$(mktemp "${SLURM_SUBMIT_DIR:-/tmp}/hostfile.${SLURM_JOB_ID}.XXXX")"
trap 'rm -f "$HOSTFILE"' EXIT
scontrol show hostnames "$SLURM_NODELIST" > "$HOSTFILE"

# Optional debug: sbatch --export=ALL,DEBUG=1 mpi.sbatch
DEBUG="${DEBUG:-0}"
if [[ "$DEBUG" == "1" ]]; then
  echo "Hostfile: $HOSTFILE"
  cat "$HOSTFILE"
  echo "Interfaces per node:"
  srun -N "$SLURM_JOB_NUM_NODES" -n "$SLURM_JOB_NUM_NODES" --ntasks-per-node=1 \
    bash -lc 'echo HOST=$(hostname); ip -o -4 addr show | awk "{print \$2,\$4}"'
  echo "mpi4py linked against:"
  python -c "import mpi4py.MPI as M; print(M.Get_library_version())"
fi

# ---- Launch (Open MPI via mpirun; works around Slurm PMI/PMIx limitations) ----
mpirun -np "$SLURM_NTASKS" \
  --hostfile "$HOSTFILE" \
  --map-by "ppr:${SLURM_NTASKS_PER_NODE}:node" \
  --bind-to none \
  --tag-output --timestamp-output \
  python -u "$SCRIPT" \
    --input "$INPUT" \
    --output "${RESULTS}/out_${SLURM_JOB_ID}.txt"

# ---- Deactivate environment ----
# conda deactivate
deactivate
echo "Done."

DSI Cluster submission file contents:

mpi-dsi.sbatch
#!/bin/bash
#SBATCH --job-name=mpi_example
#SBATCH --partition=general
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=16
#SBATCH --cpus-per-task=1
#SBATCH --mem=0
#SBATCH --time=01:00:00
#SBATCH --output=/path/to/your/project/logs/%x_%j.out
#SBATCH --error=/path/to/your/project/logs/%x_%j.err

set -euo pipefail

# Optional debug: sbatch --export=ALL,DEBUG=1 mpi.sbatch
DEBUG="${DEBUG:-0}"

# ---- Paths ----
SCRIPT="/path/to/your/project/mpi.py"
INPUT="/path/to/your/project/mpi_input.txt"
LOGS="/path/to/your/project/logs"
RESULTS="/path/to/your/project/results"
VENV="/path/to/your/project/.venv"
MPI_TYPE="pmix_v3"

mkdir -p "$LOGS" "$RESULTS"

# ---- Runtime settings ----
export PYTHONUNBUFFERED=1

# Force Open MPI to use TCP over a clean interface set (avoid docker/loopback bridges)
export OMPI_MCA_btl=self,tcp
export OMPI_MCA_btl_tcp_if_exclude=lo,docker0,virbr0
# Conservative PML (avoid UCX surprises on this cluster)
export OMPI_MCA_pml=ob1

# ---- Environment ----
# module purge
# module load openmpi/<version>
source "${VENV}/bin/activate"

echo "Nodes allocated:"
scontrol show hostnames "$SLURM_NODELIST"
echo "Total tasks: ${SLURM_NTASKS} (ntasks-per-node: ${SLURM_NTASKS_PER_NODE:-unknown})"
echo "DEBUG=${DEBUG}  MPI_TYPE=${MPI_TYPE}"

if [[ "$DEBUG" == "1" ]]; then
  echo "Interfaces per node:"
  srun -N "$SLURM_JOB_NUM_NODES" -n "$SLURM_JOB_NUM_NODES" --ntasks-per-node=1 \
    bash -lc 'echo HOST=$(hostname); ip -o -4 addr show | awk "{print \$2,\$4}"'
  echo "mpi4py linked against:"
  python -c "import mpi4py.MPI as M; print(M.Get_library_version())"
fi

# ---- Launch ----
SRUN_ARGS=(--mpi="${MPI_TYPE}" --label --kill-on-bad-exit=1)

if [[ "$DEBUG" == "1" ]]; then
  SRUN_ARGS+=(--output="${LOGS}/%x_%j_%t.out" --error="${LOGS}/%x_%j_%t.err")
fi

srun "${SRUN_ARGS[@]}" \
  python -u "$SCRIPT" \
    --input "$INPUT" \
    --output "${RESULTS}/out_${SLURM_JOB_ID}.txt"

deactivate
echo "Done."

See this repository directory for the full example: https://github.com/chicago-aiscience/chicago-aiscience.github.io/tree/main/docs/user_guide/scripts/mpi
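
Before launching a long MPI run, a quick rank check can confirm that mpi4py and the launcher agree. A minimal sketch (assumes mpi4py is installed in your active environment; adjust the partition and --mpi plugin, e.g. pmix_v3 on the DSI cluster):

srun --partition=<PARTITION> -N 2 --ntasks-per-node=2 --time=00:05:00 --mpi=pmix_v3 \
  python -c "from mpi4py import MPI; c = MPI.COMM_WORLD; print(f'rank {c.Get_rank()} of {c.Get_size()} on {MPI.Get_processor_name()}')"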

srun Interactive jobs

Interactive jobs let you run commands directly on a compute node instead of submitting a batch script. They are ideal for debugging code, testing environments, and exploratory analysis (see the table above).

srun: direct interactive jobs (recommended). This is the preferred and most flexible way to start an interactive session.


Basic interactive CPU job

srun --partition=general --nodes=1 --ntasks=1 --cpus-per-task=2 --mem=4G --time=01:00:00 --pty bash -i

Use this for light debugging, testing your environment, and exploratory work directly on a compute node.
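
Once the shell starts on the compute node, you can confirm what was allocated using standard SLURM environment variables:

echo "Job $SLURM_JOB_ID on $(hostname)"
echo "CPUs per task: $SLURM_CPUS_PER_TASK"
scontrol show job "$SLURM_JOB_ID" | grep -E "TimeLimit|TRES"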


Interactive GPU job

srun --partition=schmidt-gpu --gres=gpu:1 --cpus-per-task=4 --mem=32G --time=01:30:00 --pty bash

Verify GPU access:

nvidia-smi
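
If PyTorch is installed in your environment, you can also confirm the framework sees the GPU (substitute the equivalent check for your framework):

python -c "import torch; print(torch.cuda.is_available(), torch.cuda.device_count())"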

Attach to a running job

srun --jobid=<JOBID> --pty bash

Useful for inspecting logs or diagnosing issues.
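
For example, once attached you might check GPU utilization or follow the job's log file (the log path is a placeholder):

nvidia-smi
tail -f /path/to/logs/gpu_example_<JOBID>.out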


sinteractive: convenience wrapper (RCC only)

sinteractive is an RCC-provided wrapper around srun.

Example:

sinteractive

With options:

sinteractive --mem=8G --time=01:00:00

Notes:


Open OnDemand (RCC only)

Open OnDemand (OOD) is a web-based interface to the RCC clusters. It provides an alternative to the command line for common tasks and is especially useful for visualization, notebooks, and interactive workflows.

RCC documentation: https://docs.rcc.uchicago.edu/open_ondemand/open_ondemand/

What Open OnDemand is good for

Use Open OnDemand when you want to run Jupyter notebooks, launch interactive desktops or GUI apps, or do other visualization work from the browser (see the table below).

Open OnDemand still submits jobs through SLURM—it does not bypass scheduling or resource limits.


Accessing Open OnDemand

  1. Go to: https://midway3-ondemand.rcc.uchicago.edu

  2. Log in with your UChicago credentials

  3. Choose an app or interactive session from the menu


Open OnDemand vs srun and sbatch

Task                           | Recommended
Debugging via terminal         | srun
Jupyter notebooks              | Open OnDemand
Interactive desktop / GUI apps | Open OnDemand
Scripted, repeatable workflows | sbatch
Lightweight exploration        | Either

Important notes


Common mistakes

Software Modules

On RCC clusters (like Midway2 / Midway3) and to some extent the DSI cluster, most scientific software isn’t available in your shell by default.

Instead, packaged tools, languages, libraries, and compilers are managed through Environment Modules, a system that lets you load, unload, and switch software versions cleanly.

A module is basically a script that sets up your environment (e.g., PATH, LD_LIBRARY_PATH) so a specific software package and its dependencies become available in your session.

You use the module command (module avail, module load, etc.) to interact with these.
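
You can also inspect what a module would change before loading it; the module name below is a hypothetical example (use one listed by module avail):

module show cuda/12.1    # prints the PATH, LD_LIBRARY_PATH, and other changes the module applies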

The benefit: no conflicting software versions, easy switching between versions, and reproducibility across compute sessions.

Modules may also provide:

These support both building complex codes and running them smoothly on compute nodes.

How to run software modules

  1. See what’s available:

module avail

  2. Load what you need:

module load <software>/<version>

  3. Check what’s loaded:

module list

  4. Run your code inside a job script or interactive session.
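
A typical session might look like the following (the module name and version are hypothetical; use whatever module avail shows on your cluster):

module avail python          # list Python-related modules
module load python/3.11      # hypothetical version string
module list                  # confirm it is loaded
python --version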

Monitoring Jobs

Once you submit a job, SLURM provides several commands to help you track its status, priority, and resource usage.

Check Job Status: squeue

View your running and pending jobs:

squeue -u $USER

Key fields include job ID, state (PD = pending, R = running), elapsed time, and node or pending reason.

For a custom view:

squeue -u $USER -o "%.18i %.9P %.8j %.2t %.10M %.6D %R"

Check Job Priority: sprio

If a job is pending, use sprio to understand its scheduling priority:

sprio -u $USER

sprio only prints output for pending jobs. It shows how factors like job age, fairshare, and job size affect when your job will start.

Check Job Efficiency: seff

After a job finishes, summarize how efficiently it used resources:

seff <JOBID>

Use CPU and memory efficiency to adjust future job requests.
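
For more detail than seff, sacct reports per-step accounting; the fields below are standard sacct format fields:

sacct -j <JOBID> --format=JobID,JobName,Elapsed,MaxRSS,State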