Cluster Status
Clusters
Cluster | Status | Planned Outage | Notes |
---|---|---|---|
Siku | Online | no outages | |
Placentia | Online | no outages | Restricted since March 2019 |
Nefelibata | Offline | - | Electrical failure |
Humus | Offline | - | Electrical failure |
Argo | Online | no outages | |
For national clusters (Arbutus, Béluga, Cedar, Graham, Narval, Niagara) see status.alliancecan.ca
Services
Service | Status | Planned Outage | Notes |
---|---|---|---|
Globus at Argo | Online | - | |
Globus at Siku | Online | - | Academic users only |
Account creation | Manual | No outages | Write to support |
PGI and Intel licenses | Online | No outages | |
- Legend:
Online | cluster is up and running |
Offline | no users can log in or submit jobs, or the service is not working |
Online | some users can log in and/or there are problems affecting your work |
Outage schedule
Jobs will not be scheduled with a run time (--time=) that extends into the beginning of a planned outage period. This is so the job will not be terminated prematurely when the system goes down.
- There are currently no planned outages.
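To illustrate the scheduling rule above, here is a minimal sketch that computes the longest --time value a job could request and still finish before an outage begins. The function name `max_time_limit` and the safety margin are hypothetical and not part of any ACENET or Slurm tool. The output uses Slurm's `days-hours:minutes:seconds` time format.

```python
from datetime import datetime, timedelta
from typing import Optional


def max_time_limit(now: datetime,
                   outage_start: datetime,
                   margin: timedelta = timedelta(minutes=10)) -> Optional[str]:
    """Return the longest run time, formatted as D-HH:MM:SS, that ends
    before outage_start, or None if no time is left.

    The margin is an assumed allowance for the delay between submitting
    a job and the job actually starting.
    """
    remaining = outage_start - now - margin
    total_seconds = int(remaining.total_seconds())
    if total_seconds <= 0:
        return None
    days, rest = divmod(total_seconds, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{days}-{hours:02d}:{minutes:02d}:{seconds:02d}"


if __name__ == "__main__":
    # Example dates only; substitute the start of the announced outage.
    limit = max_time_limit(datetime(2025, 3, 3, 8, 0),
                           datetime(2025, 3, 5, 9, 0),
                           margin=timedelta(minutes=30))
    print(f"sbatch --time={limit} job.sh")
```

A job that waits in the queue starts later than the moment of submission, so a more generous margin may be needed on a busy cluster.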
Siku
2025
- On Thu Feb 20 and between Mon Mar 03 and Thu Mar 13, we will be performing rolling updates of all compute nodes, so Siku will operate at reduced total capacity. Because we reserve only a small fraction of nodes each day and all other nodes remain available, the impact on user jobs should be small.
- 11:00, January 30, 2025 (NST)
For older outages see: Previous outages
- Our newest cluster, Siku, is now in production. Access is currently restricted to invited users only. Access request form.
- 13:00, December 10, 2019 (NST)
Argo
2025
- Due to a critical cooling failure in the data centre, we had to perform an emergency shutdown of Argo on the morning of Saturday, February 15. We expect Argo to become available again sometime on Monday, February 17.
- 12:30, Feb 15, 2025 (NST)
- Update #1: Argo's login nodes and filesystems are available again; however, the compute nodes will remain offline until next week.
- 14:30, Feb 15, 2025 (NST)
- Update #2: Over the course of today we have released about half of Argo's CPU nodes and all GPU nodes back into production. We continue to work on the remaining nodes.
- 16:30, Feb 17, 2025 (NST)
- Update #3: Most of Argo's compute nodes are back in production and we will continue enabling the remaining ones as soon as they are available.
- 13:30, Feb 19, 2025 (NST)
- Argo suffered an electrical power event on Friday evening (Jan 17) around 18h00 NST (21h30 UTC) which brought down some components. The cluster is back in production at this hour. Some compute nodes have not yet recovered; we are working to bring them back.
- 10:30, Jan 20, 2025 (NST)
2024
- Argo suffered an electrical power event last night (Nov 19-20) which brought down some components. The cluster is back in production at this hour. Some compute nodes have not yet recovered; sysadmins are working to bring them back.
- 12:10, Nov 20, 2024 (NST)
- Argo was offline from October 28 to 30, 2024 for electrical power work, upgrades of some infrastructure machines, and software and firmware updates. Service resumed on Thursday, October 31 at around 14h00 NDT with about 75% of its CPU capacity; the remaining nodes are still being worked on.
- 14:40, Oct 31, 2024 (NDT)
- Update: The GPU nodes argo[72-73] have been returned to service.
- 17:00, Nov 1, 2024 (NDT)
Placentia
- Placentia was retired from general service as of 2019 Mar 31. A reduced number of compute nodes remain in service, with access restricted to MUN users who have made suitable arrangements. Contact support@ace-net.ca if you believe you should have access.
Nefelibata
- Nefelibata has had its shared storage replaced, but the Slurm scheduler service has not yet been restored. Restoration is waiting for personnel to become available from other work.
- 2024-03-18
- Nefelibata will be unavailable on 2023 September 5, Tuesday, for operating system and driver updates. We expect return-to-service on Wednesday Sept 6.
- Update at 2023-09-07 12:00 NDT: Outage complete, Nefelibata back in service.