Tracking paid accounts

3,397 bytes added, 18:35, November 8, 2019
[[Category:Siku]]

If your firm or organization has a paid contract with ACENET for compute time on [[Siku]], then this page explains what you should know about running jobs.

== Terminology ==

* "User" is the obvious one: one person, one login name.
* "Account" is not the same as "user" in this context. Your firm or organization has a contract with ACENET (or you probably wouldn't be reading this). On Siku, that contract is represented by an "account" with a name like "pd-abc-123".
* "QoS" stands for "Quality of Service", but it might be better to forget that expansion and instead think of a "QoS" as a software object which remembers how many CPU hours etc. an account is allowed to use, and how much has been used already. Each paid account is associated with its own QoS, and the QoS has the same name as the account, like "pd-abc-123".
* "Billing units" measure the use of the system. One CPU-minute is worth one billing unit; 4G of RAM for one minute is also worth one billing unit (that is, 0.25 billing units per gigabyte-minute). A GPU-minute is worth 35 billing units. See the formula and examples below.

== How much computing can I do? ==

You can see the number of billing units available to you through your QoS by running the utility acct-tool. The output will look something like this:

<pre>
Available QoSs: pd-abc-123
Default QoS: pd-abc-123

For QoS 'pd-abc-123':
Billing units limit: 100000000
Billing units used: 7196979
</pre>

When your team gets close to the limit, you may find that your jobs are not starting, but instead staying in PD (pending) state with "QOSGrpBillingMinutes" showing in the "Reason" field of squeue (or sq). This is because the job would put you over your billing limit if it ran for the time requested. Contact support@ace-net.ca to discuss refreshing your billing limit.
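The check described above can be pictured with a small Python sketch. It is a simplified model for intuition only, not Slurm's actual logic: the function names are hypothetical, the limit and usage figures are taken from the sample acct-tool output, and the assumption that a job exactly equal to the remaining balance may still start is ours.

```python
def remaining_units(limit, used):
    """Billing units still available under a QoS."""
    return limit - used


def fits_in_budget(limit, used, job_cost):
    """Return True if a job with the given worst-case cost fits in the balance.

    job_cost is the billing units the job would consume if it ran for its
    full requested time. Treating an exact fit as allowed is an assumption.
    """
    return job_cost <= remaining_units(limit, used)


# Figures from the sample acct-tool output.
limit, used = 100_000_000, 7_196_979
print(remaining_units(limit, used))             # 92803021
print(fits_in_budget(limit, used, 225_360))     # True: one GPU node for 24 hours
print(fits_in_budget(limit, used, 95_000_000))  # False: would exceed the limit
```

Because the comparison uses the requested time rather than the time a job actually runs, asking for a shorter but still realistic time limit lowers the worst-case cost of a job.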

== Billing units formula ==

<pre>
BillingUnits = ( CPUs * 1.0
+ RAM_GB * 0.25
+ GPUs * 35.0 ) * minutes
</pre>

Examples:
* A job that reserves one CPU and 4G of RAM and runs for one minute consumes (1 * 1 + 4 * 0.25) * 1 = 2 billing units.
* One CPU and 1G of RAM for one minute consumes (1 * 1 + 1 * 0.25) * 1 = 1.25 billing units.
* Our GPU-equipped nodes have 40 CPUs, 186G of RAM, and two GPUs. Using one of these nodes for 24 hours costs (40 * 1 + 186 * 0.25 + 2 * 35) * 24 * 60 = 225360 billing units.
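To estimate the cost of your own jobs before submitting them, the formula can be written as a short Python function. This is a minimal sketch: the function name and parameter names are hypothetical, and the weights are copied from the formula in this section.

```python
def billing_units(cpus, ram_gb, gpus, minutes):
    """Billing units for a job, using the weights from the formula above."""
    return (cpus * 1.0 + ram_gb * 0.25 + gpus * 35.0) * minutes


# The three examples from this section.
print(billing_units(cpus=1, ram_gb=4, gpus=0, minutes=1))              # 2.0
print(billing_units(cpus=1, ram_gb=1, gpus=0, minutes=1))              # 1.25
print(billing_units(cpus=40, ram_gb=186, gpus=2, minutes=24 * 60))     # 225360.0
```

Pass the resources and time you intend to request, since those are the figures the text above uses in its examples.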

== What if I have more than one account or QoS? ==

A user typically has access to only one account and one QoS, and your jobs are automatically associated with that account and QoS. You could have more than one account if, for example, your firm made two separate contracts with ACENET for separate projects and you are working on both, or if you are a freelancer working for two different firms that each have an ACENET contract. In that case it is up to you to assign each job you submit to the correct QoS using the --qos= option to sbatch, salloc, or srun.
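If you submit jobs from a script, one way to avoid charging the wrong contract is to build the command line with the QoS always stated explicitly. The following Python sketch only assembles and prints the command; it does not run it. The helper name, the job script name "job.sh", and the second QoS name "pd-xyz-456" are hypothetical.

```python
import shlex


def submit_command(script, qos, tool="sbatch"):
    """Build a Slurm command line that names the QoS explicitly."""
    if tool not in ("sbatch", "salloc", "srun"):
        raise ValueError(f"unsupported tool: {tool}")
    return [tool, f"--qos={qos}", script]


# One command per contract; the QoS names here are placeholders.
print(shlex.join(submit_command("job.sh", "pd-abc-123")))  # sbatch --qos=pd-abc-123 job.sh
print(shlex.join(submit_command("job.sh", "pd-xyz-456")))  # sbatch --qos=pd-xyz-456 job.sh
```

The same option can of course be typed directly on the command line, as in the printed output.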

== Why the funny word, QoS? ==

You may well ask, "Why have QoSs at all, why not just use accounts?" That has to do with Slurm internals. We would like you to be able to think in terms of a "bank account" of computing time, but to implement that we had to use Slurm's QoS mechanism. If we were then to call a QoS a "bank account", when the term "account" in Slurm means something slightly different but closely related, that would cause great confusion if and when you ever have to consult the generic Slurm documentation at https://slurm.schedmd.com.
