This page is maintained manually. It gets updated as soon as we learn new information.
Clusters
Click a cluster name in the table below to jump to the corresponding section of this page. The outage schedule section lists all scheduled ACEnet outages in one place.
Services
- Legend:
  - Online: cluster is up and running
  - Offline: no users can log in or submit jobs, or the service is not working
  - Online (with problems): only some users can log in, and/or there are problems
Outage schedule
Grid Engine will not schedule any job with a run time (h_rt) that extends into the beginning of a planned outage period. This is so the job will not be terminated prematurely when the system goes down.
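As a rough illustration of the rule above, the following sketch checks whether a requested h_rt would finish before an outage begins, and computes the longest h_rt that still fits. It is not how Grid Engine itself is implemented. The function names, the dates in the usage note, and the treatment of a job ending exactly at the outage start are assumptions made for the example. It accepts h_rt only in the H:M:S form.

```python
from datetime import datetime, timedelta


def parse_h_rt(h_rt: str) -> timedelta:
    """Convert an h_rt string in H:M:S form (e.g. '48:00:00') to a timedelta."""
    hours, minutes, seconds = (int(part) for part in h_rt.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def fits_before_outage(now: datetime, h_rt: str, outage_start: datetime) -> bool:
    """Return True if a job started at `now` would end by `outage_start`.

    Assumption: a job ending exactly at the outage start is treated as
    fitting; the real scheduler may handle that boundary differently.
    """
    return now + parse_h_rt(h_rt) <= outage_start


def max_h_rt(now: datetime, outage_start: datetime) -> str:
    """Return the longest h_rt (H:M:S) that still ends before the outage."""
    remaining = int((outage_start - now).total_seconds())
    if remaining <= 0:
        return "0:00:00"
    hours, rest = divmod(remaining, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"
```

For example, with a hypothetical outage starting at 08:00 on January 10 and a job submitted at 10:45 on January 8, a request of h_rt=48:00:00 would not fit, while h_rt=24:00:00 would. Lowering h_rt to a value that fits lets the job be scheduled before the outage instead of waiting until after it.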
Mahone
- 12:01, November 18, 2014 (AST)
Placentia
- There was a power event this morning at around 3:40am that took down a number of compute nodes. Please check your jobs and resubmit if necessary.
- 10:30, January 5, 2015 (AST)
Fundy
- Thirteen compute nodes have been temporarily taken out of production so that we can further investigate and replace the faulty line card in the InfiniBand switch. These nodes will be put back in production as Ethernet-only if the repair takes longer than anticipated.
- 12:28, January 7, 2015 (AST)
- Mellanox has informed us that there is no replacement or support for products that have reached their End of Life date. We are investigating alternatives.
- 14:25, January 6, 2015 (AST)
- We have contacted Mellanox support to deal with what appears to be a problem with the IB switch.
- 12:24, January 6, 2015 (AST)
Glooscap
- Some queue changes have been made in connection with the 2015 Compute Canada NRAC. 'qsum' may display incorrect numbers of available slots while jobs drain from reassigned equipment, perhaps as late as January 20th. We regret any confusion this may cause.
- 10:45, January 8, 2015 (AST)