Subordinate Queues

Legacy documentation

This page describes a service provided by a retired ACENET system. Most ACENET services are now provided by national systems; for those, please visit https://docs.computecanada.ca.


Introduction

The subordinate queue "sub.q" is associated with the Green ACENET program. Briefly, Green ACENET is an arrangement whereby researchers purchase computing equipment which is administered on their behalf by ACENET. In return for this service, the "spare cycles" are made available to general ACENET users under the condition that the purchasing researchers can always get priority access to their own resources.

This transaction is mediated by the Sun Grid Engine (SGE). Within SGE, Green hardware is organized into groups (technically "cluster queues") to which the owning group members have sole access. For example, members of the CMMS research group are the only ones whose jobs will run in the "cmms.q" cluster queue.

However, the same hardware is also a member of the sub.q cluster queue. When one of these hosts is not running any job for the owning group, jobs in sub.q can run there. If a member of the owning group then starts a job via SGE, any jobs running in sub.q on that host are automatically suspended.

Is it for me?

If you

  • have serial jobs or shared-memory parallel jobs which run for a long time, and
  • can make use of intermediate results without interrupting the job which generates them, or
  • don't mind your jobs being interrupted and resumed arbitrarily,

then you might be able to use the subordinate queue profitably.

How to use sub.q

The subordinate queue only exists at Placentia and Glooscap.

A job must explicitly request the "suspendable" resource in order to qualify for the subordinate queue. For example:

#$ -l h_rt=1000:0:0,susp=true
./my_application

Conversely, no job with "susp=true" will go into the regular production queues.

You can check the availability of sub.q nodes coarsely with qsum, or in more detail with qstat -f -q sub.q. Nodes marked "S" are suspended; nodes with nothing in the "state" column are available.
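The check described above can be scripted. The following is a minimal sketch, not an ACENET tool: the function name summarize_sub_q is hypothetical, and it assumes that each queue-instance line of qstat -f output has five whitespace-separated columns when the state column is empty and a sixth column when a state is set. Verify that against the output on your cluster before relying on it.

```shell
#!/bin/sh
# Summarize queue instances of sub.q from "qstat -f" output read on stdin.
# Assumption: lines look like "sub.q@host qtype slots load_avg arch [states]".
summarize_sub_q() {
    awk '
        $1 ~ /^sub\.q@/ {
            if (NF == 5)       print $1, "available"   # empty state column
            else if ($6 ~ /S/) print $1, "suspended"   # "S" marks suspension
            else               print $1, "state=" $6   # any other state
        }'
}

# Only query the scheduler when qstat is actually installed.
if command -v qstat >/dev/null 2>&1; then
    qstat -f -q sub.q | summarize_sub_q
fi
```

Piping the output through grep available would narrow the list to hosts that can accept a suspendable job right now.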

Run times and memory

The hard run time limit h_rt is a wall-clock limit, not a CPU time limit. Because jobs in sub.q can be suspended and therefore an unpredictable amount of time can elapse before they complete, there is no restriction on what h_rt you can request. But you must still supply one!

Another practical limitation is that Grid Engine will not start a job that it expects to run into the next scheduled outage. Scheduled outages are announced on the Cluster Status page and in the login message for each cluster. Take these into account when choosing h_rt for a suspendable job.
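As a rough illustration of that arithmetic, the sketch below computes the largest whole-hour h_rt that still ends before an outage. The function name max_h_rt is hypothetical, and both times are assumed to be given in seconds since the epoch. It measures from the moment you run it, so it gives an upper bound only: a job that waits in the queue before starting would need a smaller value.

```shell
#!/bin/sh
# Print the largest whole-hour h_rt (as H:0:0) that ends before the outage.
# Arguments: start time and outage time, both in seconds since the epoch.
max_h_rt() {
    start=$1
    outage=$2
    hours=$(( (outage - start) / 3600 ))   # integer division drops the remainder
    if [ "$hours" -le 0 ]; then
        echo "no whole hour left before the outage" >&2
        return 1
    fi
    echo "${hours}:0:0"
}
```

For example, with GNU date available, max_h_rt "$(date +%s)" "$(date -d '2030-01-01 08:00' +%s)" would print a value usable as h_rt; the date shown is only a placeholder for an announced outage time.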

Jobs in sub.q cannot reserve more than 2G of memory per slot (h_vmem=2G).

Parallel jobs

The subordinate queue is suspended and resumed host-by-host. It is difficult to ensure that a parallel job will behave properly if it is suspended on one host while other parts continue to run. Therefore shared-memory jobs (-pe openmp and -pe gaussian) are the only kind of parallel jobs permitted in sub.q.
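A shared-memory job for sub.q might look like the sketch below. The slot count of 4 and the application name are placeholders, and matching the thread count to NSLOTS (the slot count that Grid Engine exports to the job) is a common convention rather than something this page prescribes.

```shell
#!/bin/sh
#$ -pe openmp 4
#$ -l h_rt=1000:0:0,susp=true,h_vmem=2G

# NSLOTS is set by Grid Engine; fall back to 1 when run outside the scheduler.
OMP_NUM_THREADS="${NSLOTS:-1}"
export OMP_NUM_THREADS
echo "Using $OMP_NUM_THREADS OpenMP thread(s)"

# ./my_application is a placeholder for your own program.
if [ -x ./my_application ]; then
    ./my_application
else
    echo "my_application not found in $(pwd)" >&2
fi
```

Because h_vmem is requested per slot, this example stays within the 2G-per-slot limit while asking for four slots on a single host.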

Email notification

You can be notified when one of your sub.q jobs is suspended by adding the "s" flag to the "-m" Grid Engine directive:

-M user@mail.host
-m eas

You may find this useful if you wish to manually interrupt and resubmit a suspended job.

Termination instead of suspension

If you want your jobs to be terminated when they are pre-empted, instead of being suspended indefinitely, you should submit jobs to the subordinate queue with qsub -notify. This will cause Grid Engine to send a SIGUSR1 signal to your application 15 seconds before sending a SIGSTOP. The intended purpose of this function is to allow your application to trap SIGUSR1 and save state, but the default action on receipt of SIGUSR1 is to terminate the application.

See man 7 signal and man qsub.

See the GNU manual for more on trapping signals in C programs. See "Fortran Signal Handling" for notes on trapping signals in a Fortran program.
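The same idea can be sketched in a shell job script. This is an illustration under stated assumptions, not an ACENET-supplied wrapper: the function name run_with_notify_trap and the return code 99 are hypothetical, and it assumes the SIGUSR1 warning reaches the job script itself. If Grid Engine also delivers the signal to the application, the application will terminate on its own unless it traps SIGUSR1.

```shell
#!/bin/sh
#$ -l h_rt=1000:0:0,susp=true
#$ -notify

# Run a command in the background and react to SIGUSR1 (the -notify warning).
# Returns 99 (an arbitrary code) if the warning arrived, otherwise the
# command's own exit status.
run_with_notify_trap() {
    notified=0
    worker_pid=
    # On SIGUSR1: remember the warning and ask the worker to stop.
    trap 'notified=1; [ -n "$worker_pid" ] && kill -TERM "$worker_pid" 2>/dev/null' USR1

    "$@" &
    worker_pid=$!

    # wait returns early (status > 128) when a trapped signal arrives.
    wait "$worker_pid"
    status=$?

    if [ "$notified" -eq 1 ]; then
        wait "$worker_pid" 2>/dev/null   # reap the worker once it has stopped
        echo "pre-empted: stopped after SIGUSR1"
        trap - USR1
        return 99
    fi
    trap - USR1
    return "$status"
}

# Arguments given after the script name on the qsub command line are run here.
if [ "$#" -gt 0 ]; then
    run_with_notify_trap "$@"
fi
```

The command is started in the background because a shell runs its trap handlers only between commands; waiting on a background process lets the handler run as soon as the signal arrives. A real job would replace the echo with whatever is needed to save state or resubmit.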