Difference between revisions of "Cluster Status"

Revision as of 17:35, February 22, 2013

Click a cluster name in the table below to jump to the corresponding section of this page. The outage schedule section lists all scheduled ACEnet outages in one place.

Cluster	Status	Planned Outage
Brasdor	Offline	No outages
Mahone	Online	No outages
Placentia	Online	No outages
Fundy	Online	No outages
Glooscap	Online	No outages
Courtenay	Online	No outages

Legend:

Online	cluster is up and running
Offline	no users can log in or submit jobs
Online	some users can log in and/or there are problems

Outage schedule

Grid Engine will not schedule any job with a run time (h_rt) that extends into the beginning of a planned outage period. This is so the job will not be terminated prematurely when the system goes down.
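The rule above can be sketched as a simple time comparison. This is only an illustration of the arithmetic, not Grid Engine's actual scheduler logic. The function names and the example dates are hypothetical, and it assumes h_rt is given either as HH:MM:SS or as a plain number of seconds, and that a job ending exactly at the outage start is still allowed.

```python
from datetime import datetime, timedelta


def parse_h_rt(h_rt: str) -> timedelta:
    """Parse an h_rt value written as HH:MM:SS or as plain seconds (assumed formats)."""
    parts = [int(p) for p in h_rt.split(":")]
    if len(parts) == 1:
        return timedelta(seconds=parts[0])
    if len(parts) == 3:
        hours, minutes, seconds = parts
        return timedelta(hours=hours, minutes=minutes, seconds=seconds)
    raise ValueError(f"unsupported h_rt format: {h_rt!r}")


def fits_before_outage(start: datetime, h_rt: str, outage_start: datetime) -> bool:
    """Return True if a job starting at `start` would finish by `outage_start`.

    Hypothetical helper: a job whose requested run time extends past the
    start of the outage would not be scheduled.
    """
    return start + parse_h_rt(h_rt) <= outage_start


if __name__ == "__main__":
    # Hypothetical outage start, for illustration only.
    outage = datetime(2013, 2, 22, 8, 0)
    print(fits_before_outage(datetime(2013, 2, 20, 9, 0), "24:00:00", outage))
    print(fits_before_outage(datetime(2013, 2, 21, 9, 0), "24:00:00", outage))
```

In practice this means that requesting a shorter h_rt ahead of a planned outage may let a job start before the outage instead of waiting until the system is back.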

Brasdor

Brasdor is not responding to any login attempts. We are investigating.

13:35, February 22, 2013 (AST)

Brasdor is back up after the AC outage. The delay was due to a faulty motor in one of the roof compressors that kept tripping the power breaker to the room.

13:48, February 21, 2013 (AST)

The temperature in the Brasdor machine room is still high. We do not yet know when the AC will be properly repaired. We will post updates as we learn more.

13:44, February 18, 2013 (AST)

One AC is working again, and the other needs more work. Brasdor should be back up later today. All nodes had to be turned off, so jobs will have been lost.

12:35, February 15, 2013 (AST)

The AC has failed in the Brasdor machine room; it might be due to a power failure. FacMan is looking into the issue. More information to come.

15:59, February 14, 2013 (AST)

Mahone

Queues enabled. NQS is available.

17:09, January 15, 2013 (AST)

All the queues have been temporarily disabled to prevent jobs from using the NQS filesystem, which will be temporarily unmounted for a filesystem check to resolve the utilization problem.

12:04, January 15, 2013 (AST)

Placentia

There was a power surge on Feb 10 around 9am that brought down some compute nodes.

13:07, February 11, 2013 (AST)

Back online after the power outage due to a snow storm.

14:46, January 14, 2013 (AST)

Placentia is inaccessible. We are investigating whether it is related to the snow storm in Newfoundland. MUN is closed.

08:50, January 11, 2013 (AST)

Fundy

The head node was rebooted due to abuse of the head node policy.

08:33, February 8, 2013 (AST)

We are having problems with the NFS servers. Affected compute nodes have been disabled, but existing jobs might crash. Support calls have been opened with Oracle.

09:59, February 6, 2013 (AST)

Glooscap

Network switches were replaced on nodes cl098-cl183 to fix cooling and airflow problems. The service was completed ahead of schedule and the entire cluster is back in service.

15:43, December 17, 2012 (AST)

The Grid Engine queue master is running.

08:11, November 19, 2012 (AST)

Courtenay

Courtenay is back online.

11:47, November 28, 2012 (AST)

Courtenay is offline due to an NFS/network problem. We are sorting this out.

08:23, November 28, 2012 (AST)

@@ Line 9: / Line 9: @@
 ! scope="col" align=left width="250px" | Notes
 |- valign=top bgcolor="#f5faff"
-| [[Cluster Status#Brasdor | Brasdor]] || style="color:green" | '''Online''' || [[Cluster Status#Outage schedule | No outages]] ||
+| [[Cluster Status#Brasdor | Brasdor]] || style="color:red" | '''Offline''' || [[Cluster Status#Outage schedule | No outages]] ||
 |- valign=top bgcolor="#f5faff"
 | [[Cluster Status#Mahone | Mahone]] || style="color:green" | '''Online''' || [[Cluster Status#Outage schedule | No outages]] ||
@@ Line 36: / Line 36: @@
 == Brasdor ==
+* Brasdor is not responding to any login attempts......investigating
+: 13:35, February 22, 2013 (AST)
 * Brasdor is back up after the AC outage.  The delay was due to a faulty motor in one of the roof compressors that kept tripping the power breaker to the room.
 : 13:48, February 21, 2013 (AST)
