====== Cluster Introduction ======

Faculty and their research students can request access to the cluster by contacting Mike Conner [connerms], Linux System Administrator.

To prepare jobs for the scheduler, users write a shell script, called the submission script, that sets all necessary variables and contains all of the commands to be run as the job.

====== Connecting to the Grinnell High Performance Compute Cluster ======

After obtaining an account on the cluster, the easiest way to connect, set up your workload, and begin submitting jobs is to direct your browser to https://

Open OnDemand gives you easy access to tools to create and submit jobs, manage your files on the cluster, and even open an interactive shell on the cluster. Open OnDemand (web-based) connections are restricted to network connections that are wired, on campus, or connected to the Grinnell College secure wireless network.

Cluster users may also connect to the cluster via SSH. SSH connections are subject to the same restriction: wired, on campus, or connected to the Grinnell College secure wireless network.

  $ ssh <username>@<cluster address>

Then enter your password when prompted.

====== The SLURM Scheduler ======

Compute jobs on the cluster are managed by a scheduler: SLURM. To run jobs on the cluster you'll need to prepare your jobs, then submit them to the scheduler along with some information about the resources needed to run them. (Refer to the SLURM documentation at https:// for more information.)

Jobs can be submitted to the scheduler using the ''sbatch'' command.

These are common parameters passed to SLURM when submitting a job:
  * Set working directory (''-D'', ''--chdir'')
  * Nodes (''-N'', ''--nodes'')
  * Sockets (or CPUs) per node (''--sockets-per-node'')
  * Cores per socket (''--cores-per-socket'')
  * Tasks (''-n'', ''--ntasks'')
  * Memory required per node (''--mem'')
  * Time limit (''-t'', ''--time'')
  * Output path (''-o'', ''--output'')
  * Error path (''-e'', ''--error'')
  * Name (''-J'', ''--job-name'')

A complete list of parameters that ''sbatch'' accepts is available in the SLURM documentation, or by running ''man sbatch'' on the cluster.
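
Pulled together, these options are commonly written as ''#SBATCH'' directives at the top of a submission script. The sketch below is only illustrative: every resource value is a placeholder, not a recommendation, and ''man sbatch'' is the authoritative reference for each option:

```shell
#!/bin/bash
# Illustrative resource request -- all values below are placeholders.
#SBATCH -D /home/<username>/myjob    # working directory for the job
#SBATCH -N 1                         # number of nodes
#SBATCH --sockets-per-node=1         # sockets per node
#SBATCH --cores-per-socket=4         # cores per socket
#SBATCH -n 4                         # total number of tasks
#SBATCH --mem=8G                     # memory required per node
#SBATCH -t 00:10:00                  # time limit (HH:MM:SS)
#SBATCH -o myjob.out                 # standard output path
#SBATCH -e myjob.err                 # standard error path
#SBATCH -J myjob                     # job name

# ...the commands to run as the job go here...
```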
| + | |||
| + | ====== Run an Interactive Job on a Compute Node ====== | ||
| + | |||
| + | Running an interactive job on a compute node is the preferred method of testing jobs and scripts. Doing so avoids running resource-intensive workloads on the master (or login) node. | ||
| + | To start a job with an interactive shell on a compute node you can use this command: | ||
| + | |||
| + | srun -N 1 -n 1 --pty /bin/bash | ||
| + | |||
====== Basic job submission ======

Consider the following command. If you enter it on the command line of a terminal, it will print the date, wait 5 seconds, then print the date again.

  $ date; sleep 5; date

To execute this simple series of commands as a job, you can create a very simple script:

File ''date.sub'':

  #!/bin/bash
  date
  sleep 5
  date
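
If you want to sanity-check the script before involving the scheduler, it can be run directly in a shell. The snippet below recreates ''date.sub'' with a heredoc so it is self-contained, then times the run:

```shell
# Recreate date.sub with the same contents as the listing above.
cat > date.sub <<'EOF'
#!/bin/bash
date
sleep 5
date
EOF
chmod +x date.sub

# Run the script directly and measure how long it takes; it should
# print two timestamps roughly 5 seconds apart.
start=$(date +%s)
./date.sub
end=$(date +%s)
elapsed=$((end - start))
echo "elapsed: ${elapsed}s"
```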
| + | |||
| + | The commands are now in a format that can be submitted to the scheduler to be run as a job on the cluster. But we also need to tell the scheduler what resources are needed. This information can be passed to SLURM directly in the '' | ||
| + | |||
The job can be submitted to the scheduler by adding the resource flags to the ''sbatch'' command:

  $ sbatch -J SleepJob -N 1 --ntasks-per-node=1 date.sub

To place the resource request in the script itself, we modify the script to include additional ''#SBATCH'' lines for SLURM:

File ''date.sub'':

  #!/bin/bash
  #SBATCH -N 1
  #SBATCH --ntasks-per-node=1
  #SBATCH -J SleepJob

  date
  sleep 5
  date

Then the job is submitted with ''sbatch'', this time with no extra flags needed:

  $ sbatch date.sub
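
If the job is accepted, ''sbatch'' replies with the job ID assigned by the scheduler (the ID shown here is made up for illustration):

  $ sbatch date.sub
  Submitted batch job 12345

Since no output path was specified, the job's output is written to a file named ''slurm-<jobID>.out'' (''slurm-12345.out'' in this example) in the directory the job was submitted from.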
| + | |||
| + | The '' | ||
| + | |||
| + | $ squeue --job < | ||
| + | -or- | ||
| + | $ squeue -n <Job name if specified upon submission> | ||
| + | |||
| + | |||
| + | ====== Next Steps ====== | ||
| + | |||
| + | A step-by-step tutorial for creating and submitting a job using Open OnDemand is available here. | ||
| + | |||
| + | Comprehensive [[https:// | ||
| + | |||
| + | |||