
Slurm

What is it

The Slurm Workload Manager (formerly known as Simple Linux Utility for Resource Management or SLURM), or Slurm, is a free and open-source job scheduler for Linux and Unix-like kernels, used by many of the world's supercomputers and computer clusters.

It provides three key functions:

- allocating exclusive and/or non-exclusive access to resources (computer nodes) to users for some duration of time so they can perform work,
- providing a framework for starting, executing, and monitoring work (typically a parallel job such as MPI) on a set of allocated nodes, and
- arbitrating contention for resources by managing a queue of pending jobs.

Requirements

All you need for this guide is a Debian-based system with apt (the examples below were run on an NVIDIA Jetson Xavier) and the munge and slurm-wlm packages, which are installed in the next section.

How to do it

Install packages

sudo apt install munge slurm-wlm
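
To confirm that the packages installed correctly, you can print the Slurm version (just a sanity check; nothing is configured yet):

sinfo -V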

Create configuration files

Run slurmd -C to print this node's hardware configuration; its output is exactly the NodeName line you will need in the COMPUTE NODES section of slurm.conf below:

sudo slurmd -C
NodeName=unit32-xavier CPUs=8 Boards=1 SocketsPerBoard=4 CoresPerSocket=2 ThreadsPerCore=1 RealMemory=15822

You can open /usr/share/doc/slurmctld/slurm-wlm-configurator.easy.html in your browser to generate a configuration file.

/etc/slurm-llnl/slurm.conf

# slurm.conf file generated by configurator easy.html.
# Put this file on all nodes of your cluster.
# See the slurm.conf man page for more information.
#
ControlMachine=unit32-xavier #<YOUR-HOST-NAME>
#ControlAddr=
#
#MailProg=/bin/mail
MpiDefault=none
#MpiParams=ports=#-#
ProctrackType=proctrack/pgid
ReturnToService=1
SlurmctldPidFile=/var/run/slurm-llnl/slurmctld.pid
#SlurmctldPort=6817
SlurmdPidFile=/var/run/slurm-llnl/slurmd.pid
#SlurmdPort=6818
SlurmdSpoolDir=/var/lib/slurm-llnl/slurmd
SlurmUser=slurm
#SlurmdUser=root
StateSaveLocation=/var/lib/slurm-llnl/slurmctld
SwitchType=switch/none
TaskPlugin=task/none
#
#
# TIMERS
#KillWait=30
#MinJobAge=300
#SlurmctldTimeout=120
#SlurmdTimeout=300
#
#
# SCHEDULING
FastSchedule=1
SchedulerType=sched/builtin
#SchedulerPort=7321
SelectType=select/linear
#
#
# LOGGING AND ACCOUNTING
AccountingStorageType=accounting_storage/none
#AccountingStoragePass=/var/run/munge/global.socket.2
ClusterName=unit32-xavier #<YOUR-HOST-NAME>
#JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/none
#SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurm-llnl/slurmctld.log
#SlurmdDebug=4
SlurmdLogFile=/var/log/slurm-llnl/slurmd.log
#
#
# COMPUTE NODES
NodeName=unit32-xavier CPUs=8 Boards=1 SocketsPerBoard=4 CoresPerSocket=2 ThreadsPerCore=1 RealMemory=15822
PartitionName=long Nodes=unit32-xavier Default=YES MaxTime=INFINITE State=UP
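
This guide runs the controller and the compute node on the same machine, so the file only needs to exist locally. If you later add more compute nodes, the same slurm.conf (and the munge key in /etc/munge/munge.key, so that authentication works across the cluster) has to be copied to every node. A minimal sketch, assuming a second node named node2 that is reachable over ssh as root (adjust to your setup):

scp /etc/slurm-llnl/slurm.conf root@node2:/etc/slurm-llnl/slurm.conf
sudo scp /etc/munge/munge.key root@node2:/etc/munge/munge.key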

Enable and start daemons

Enable and start the manager slurmctld

sudo systemctl enable slurmctld
sudo systemctl start slurmctld

If you see Failed to start Slurm controller daemon, run the daemon in the foreground with sudo slurmctld -Dvvv to see why. You will probably find a message like Slurmctld has been started with "ClusterName=unit32-xavier", but read "testclusternode" from the state files in StateSaveLocation. In that case, delete the file /var/lib/slurm-llnl/slurmctld/clustername and restart the daemon:
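
sudo rm /var/lib/slurm-llnl/slurmctld/clustername
sudo systemctl restart slurmctld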

Enable and start agent slurmd

sudo systemctl enable slurmd
sudo systemctl start slurmd
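
If slurmd fails to start, the same approach works for the agent: run it in the foreground with extra verbosity to see the error.

sudo slurmd -Dvvv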

Start the authentication service munge

sudo systemctl start munge

munge --help shows the full list of options.
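
A quick way to verify that munge works is to create a credential and decode it locally; unmunge should report STATUS: Success (0).

munge -n | unmunge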

How to check status

sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
long*        up   infinite      1   idle unit32-xavier
scontrol show node
NodeName=unit32-xavier Arch=aarch64 CoresPerSocket=2
   CPUAlloc=0 CPUErr=0 CPUTot=8 CPULoad=0.00
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=unit32-xavier NodeHostName=unit32-xavier Version=17.11
   OS=Linux 4.9.140-tegra #1 SMP PREEMPT Tue Apr 28 14:06:23 PDT 2020
   RealMemory=15822 AllocMem=0 FreeMem=12095 Sockets=4 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=long
   BootTime=2020-07-10T17:54:58 SlurmdStartTime=2020-07-10T17:55:06
   CfgTRES=cpu=8,mem=15822M,billing=8
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
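
As an additional end-to-end check you can run a short interactive command through Slurm; it should print the hostname of the allocated node (unit32-xavier here):

srun -N1 hostname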

Run a test job

Create test.sh file

#!/bin/sh
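# wait for 20 seconds, then print the current wall-clock time (HH:MM:SS)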
sleep 20
date +%T
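
Job options such as the name, the output file and a time limit can also be embedded in the script via #SBATCH directives, which sbatch reads from the top of the file. A small sketch (the values below are only examples):

#!/bin/sh
#SBATCH --job-name=test
#SBATCH --output=slurm-%j.out
#SBATCH --time=00:05:00
sleep 20
date +%T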

Make test.sh file executable

chmod +x test.sh

Submit the test.sh script

sbatch test.sh

Check the status - sinfo and squeue

sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
long*        up   infinite      1  alloc unit32-xavier
squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
                 4      long  test.sh     root PD       0:00      1 (Resources)
                 3      long  test.sh     root  R       0:11      1 unit32-xavier
cat slurm-<JOB ID>.out
09:18:36
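
If a submitted job should not run after all (for example job 4, which is still pending in the squeue output above), it can be removed from the queue with scancel and the job ID:

scancel 4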