Qstat
Legacy documentation
This page describes a service provided by a retired ACENET system. Most ACENET services are currently provided by national systems, for which please visit https://docs.computecanada.ca.
- Main page: Job Control
The fundamental command for monitoring job status in Grid Engine is qstat. By default (that is, with no arguments) it will show information about your own jobs in a one-line-per-job format which is described on the Job Control page.
qstat supports many options. Some control which jobs are displayed, e.g.:
$ qstat -u \*                    Jobs belonging to all users
$ qstat -q test.q -f             Jobs running in test.q
$ qstat -q \*@cl001 -f -u \*     Jobs running on node cl001
Some control what sort of information is displayed, e.g.:
$ qstat -g t        One line per parallel process
$ qstat -j jobid    Details on one job including error causes and resource usage
The definitive and complete reference is man qstat.
When a job has ended it no longer appears in qstat, but some information about it is available through qacct.
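As a rough illustration of looking up a finished job, the sketch below filters the output of qacct -j down to a handful of fields. The helper name qacct_summary and the choice of fields are assumptions made for this example; the field names themselves (jobname, failed, exit_status, ru_wallclock, maxvmem) are standard Grid Engine accounting keys, though the exact set may vary between versions.

```shell
#!/bin/sh
# Hypothetical helper: read `qacct -j` output on stdin and print a few
# commonly inspected fields as key=value pairs. Only the first word of
# each value is kept, which is enough for the fields selected here.
qacct_summary() {
    awk '$1 ~ /^(jobname|failed|exit_status|ru_wallclock|maxvmem)$/ { print $1 "=" $2 }'
}

# Usage on a system with Grid Engine installed (12345 is a placeholder job ID):
#   qacct -j 12345 | qacct_summary
```

A non-zero failed or exit_status value is usually the first thing to check when a job ended unexpectedly.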
Job statuses
The common job status identifiers output by qstat are listed below. They often appear in combinations like Eqw, hqw, or dr:
qw | job is waiting
r | job is currently running
t | job is being transferred to the compute nodes
s or S | job is suspended; this should only be seen in subordinate queues
h | job is being held due to a job dependency or due to sysadmin action
E | submission is in error state; use qstat -j job_id to find out why
R | job has been restarted (Rr) or is waiting to be restarted (Rq); this typically follows a node crash
d | job has been registered for deletion; usually seen if a node has crashed
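To get a quick overview of how many jobs are in each state, the status column can be tallied with a small filter. This is a minimal sketch, not part of Grid Engine: the function name count_states is made up for this example, and it assumes the default one-line-per-job qstat layout, with two header lines followed by rows whose fifth whitespace-separated field is the state.

```shell
#!/bin/sh
# Hypothetical helper: read default `qstat` output on stdin and print
# each job state with the number of jobs currently in it.
# Assumption: the first two lines are the header and its dashed rule,
# and the state (qw, r, Eqw, ...) is the fifth field of every job row.
count_states() {
    awk 'NR > 2 && NF >= 5 { n[$5]++ }
         END { for (s in n) print s, n[s] }' | LC_ALL=C sort
}

# Usage on a system with Grid Engine installed:
#   qstat | count_states          # your own jobs
#   qstat -u \* | count_states    # jobs belonging to all users
```

Any job reported in a state containing E can then be examined individually with qstat -j job_id.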