To submit jobs to Slurm in batch mode, use the Slurm command: sbatch
Slurm options can be passed to sbatch on the command line or, preferably, written inside the batch script itself.
A batch script must start with this line:
#!/bin/bash (or #!/bin/csh)
After the first line (as described above), the batch script must list the Slurm options, each prefixed with "#SBATCH". If any other command appears before the Slurm options, Slurm ignores all of them and allocates the default job resources: a 2-hour time limit, 1 CPU core, and 7.5GB of RAM.
sbatch stops processing #SBATCH options once it reaches the first non-comment, non-whitespace line in the script. This means that all Slurm options must be grouped together, one option per line, and any executable commands of the job you want to run must come after the Slurm options in the script.
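For example, in the following (hypothetical) script the echo line ends option processing, so the #SBATCH line after it is treated as a plain bash comment and silently ignored:
#!/bin/bash
echo "starting my job"
#SBATCH --time=05:00:00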
Naming conventions: bash scripts end with the .sh extension.
The following editors are available on the cluster: vim/vi, nano, emacs, or gedit (recommended for Windows users who are not familiar with Linux text editors, but it requires a graphical tunnel).
Create a new file in the current working directory:
gedit my-first-sbatch-script.sh
Enter the following lines in the new file you just created:
#!/bin/bash
#SBATCH --time=05:00:00
#SBATCH --ntasks=2
#SBATCH --mem=1G
Submitting this script to Slurm will allocate a time limit of 5 hours, 2 CPU cores, and 1GB of RAM for your job, but the job will terminate immediately since the script contains no actual commands.
After the Slurm options in the script you can enter any bash commands required for running your code, set environment variables, and load modules.
For example, these are the commands to add to the sbatch script for running an R script with an R version that must first be loaded with the module command:
module load R4/4.1.0
srun Rscript /path/to/my/R/script.R
The final script will be:
#!/bin/bash
#SBATCH --time=05:00:00
#SBATCH --ntasks=2
#SBATCH --mem=1G
module load R4/4.1.0
srun Rscript /path/to/my/R/script.R
Now that you have this script you can submit it:
sbatch my-first-sbatch-script.sh
The output in the shell will be:
Submitted batch job <job id>
sbatch exits immediately after the script is successfully submitted to Slurm and assigned a job ID. The job is not necessarily granted resources immediately; it may wait in the queue of pending jobs for some time until its required resources become available.
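While the job is pending or running, you can check its state with the standard squeue command, for example:
squeue -u $USER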
Note the use of the srun command before calling the actual program executable or bash command. Properly constructed sbatch scripts execute all commands using the srun command, so that the cluster management software (Slurm) is aware of all job steps.
This also allows for more advanced distribution of resources within the scripts, for example:
#!/bin/bash
#SBATCH --time=05:00:00
#SBATCH --ntasks=2
#SBATCH --cpus-per-task=1
#SBATCH --mem=1G
module load R4/4.1.0
# Launch two job steps in parallel, one task each, and wait for both to finish
srun --ntasks=1 Rscript /path/to/my/R/script.R &
srun --ntasks=1 Rscript /path/to/my/R/script.R &
wait
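If job accounting is enabled on the cluster, each srun invocation appears as a separate job step that you can inspect after the job finishes with the standard sacct command (replace <job id> with the ID reported by sbatch):
sacct -j <job id>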
Instead of entering Slurm options into a script, you can pass them as command-line arguments.
For example:
sbatch --time 05:00:00 --ntasks 2 --mem 1G /my/code/to/run.sh
With this sbatch command, Slurm will allocate 2 CPU cores and 1GB of RAM with a time limit of 5 hours. Once resources are allocated for the job, the script at the path you entered will be executed.
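Command-line options take precedence over the #SBATCH options inside a script, so the same script can be reused with different resources. For example, this submission overrides the #SBATCH --mem=1G line in the script created earlier:
sbatch --mem=2G my-first-sbatch-script.sh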
You can also use the --wrap option to submit a command, or a few commands separated by semicolons. For example:
sbatch --time 05:00:00 --ntasks 2 --mem 1G --wrap="first_command;second_command"
By default, both standard output and standard error are directed to a file named "slurm-<jobid>.out", where <jobid> is the job allocation number.
You can control this behaviour with the following options:
-e, --error=<filename_pattern>
This option instructs Slurm to redirect the batch script's standard error directly to the file name specified in the "filename pattern".
-o, --output=<filename_pattern>
This option instructs Slurm to redirect the batch script's standard output directly to the file name specified in the "filename pattern".
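The filename pattern may contain replacement symbols; for example, %j is replaced by the job ID, which keeps the output files of different jobs separate:
#SBATCH --output=my-job-%j.out
#SBATCH --error=my-job-%j.err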
If you need to load modules in your job, you must create a script and submit it with the sbatch command.
The Moriah cluster doesn't support the --module flag that some CS clusters provide.
Using the --wrap flag of sbatch won't work either: --wrap runs your commands in an sh shell, which doesn't support the module command.
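In other words, a submission along these lines will fail on Moriah (using the same R module and script path as above):
sbatch --wrap="module load R4/4.1.0; Rscript /path/to/my/R/script.R"
Put the module load and Rscript lines in a batch script instead, as shown earlier.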
You should not use a loop in your script to submit multiple sbatch jobs.
This overloads the Slurm scheduler, which impacts job submission for all users on the cluster and can even cause the scheduler to fail.
This is mentioned in the Slurm documentation.
The correct way to submit many similar jobs is to use job arrays, as explained here.
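As a minimal sketch of a job array (the array range, resources, and the use of the task ID as a script argument are illustrative), a single submission like this replaces a loop of ten separate sbatch calls:
#!/bin/bash
#SBATCH --time=01:00:00
#SBATCH --array=1-10
#SBATCH --mem=1G
srun Rscript /path/to/my/R/script.R $SLURM_ARRAY_TASK_ID
Each array task runs the same script with its own value of the SLURM_ARRAY_TASK_ID environment variable (1 through 10 here).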