Facts About SLURM Usage
We recently received a ticket from a buy-in account user asking about the SLURM usage quota. The following list explains how the CPU/GPU usage quota shown in the output of the “SLURMUsage” command applies to general and buy-in users.
The “SLURMUsage” command counts only the CPU and GPU time of jobs that run under a “general” account. If a user submits a job under a buy-in account (either explicitly with --account=<buyin_account> or because a buy-in account is set as the default account), its computation time is not counted in SLURMUsage.
A job using a buy-in account can still be scheduled on non-buy-in nodes, or on the nodes of other users' buy-in accounts, when the nodes of its own buy-in account are not available. In other words, a job that uses a buy-in account is not limited to the nodes belonging to that account, but its usage is still counted against that buy-in account.
Another advantage of a buy-in account is that a job submitted under it has a priority that qualifies it for consideration by the backfill scheduler. This may increase its chance of being scheduled earlier.
Note that there are buy-in accounts where the members’ total CPU/GPU usage hours are greater than the available CPU/GPU hours that their total buy-in nodes could possibly provide.
“SLURMUsage” counts the CPU/GPU hours a user reserves, not the CPU/GPU hours actually used. For example, suppose a user reserves five CPUs for two hours for a job, and the job finishes after one hour of walltime while using only one CPU because the program does not run in parallel across the five CPUs. The usage counted in SLURMUsage is five CPU hours, not one. It is therefore the users' responsibility to make sure the program can actually run in parallel on the number of CPUs reserved, to avoid wasting system resources as well as their usage quota.
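The arithmetic in the five-CPU example can be sketched in a few lines of Python. This is only an illustration of the accounting rule described above, not the actual SLURMUsage implementation; the function name and variables are hypothetical, and the sketch assumes, as in the example, that the charge is based on the job's elapsed walltime rather than the requested time limit.

```python
def reserved_cpu_hours(cpus: int, walltime_hours: float) -> float:
    """CPU hours for a given number of CPUs held over a walltime (hypothetical helper)."""
    return cpus * walltime_hours


# Values from the example: five CPUs reserved, one hour of walltime,
# but the program keeps only one CPU busy.
cpus_reserved = 5
cpus_busy = 1
walltime_hours = 1.0

charged = reserved_cpu_hours(cpus_reserved, walltime_hours)  # what counts toward the quota
useful = reserved_cpu_hours(cpus_busy, walltime_hours)       # what the program really used
idle_fraction = 1 - useful / charged                         # share of the charge spent on idle CPUs

print(f"Charged: {charged} CPU hours, useful: {useful} CPU hours")
print(f"Fraction of the charge spent on idle CPUs: {idle_fraction:.0%}")
```

Under these assumptions, four of the five charged CPU hours go to CPUs that did no work, which is why matching the reservation to what the program can actually use matters.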
Xiaoge Wang, PhD
ICER Research Consultant