Changes

Cluster Status/Previous outages

2,449 bytes added, 14:29, May 10, 2023
Siku: added 2022 outage info from Cluster Status page.
== Siku ==
==== 2022 ====
* Siku has been back online since 12:30pm NDT (15h00 UTC). This was the last of three scheduled power outages.
: 12:45, May 30, 2022 (NDT)
 
* Siku has been back online since 12:00pm NDT (14h30 UTC). There will be one further outage, from 11:30am NDT (14h00 UTC) on Friday, May 27 until midday on Monday, May 30.
: 13:00, May 16, 2022 (NDT)
 
* Siku has been back online since 12:30pm NDT (15h00 UTC). There will be two similar outages: May 13-16 and May 27-30.
: 12:32, May 2, 2022 (NDT)
 
* Siku has been offline since 11:30am NDT (14h00 UTC) to facilitate electrical work in the data centre by Memorial University facilities management. We expect a return to service by midday on Monday, May 2, 2022.
: 11:40, April 29, 2022 (NDT)
 
* A time-sensitive maintenance outage was carried out on Monday, March 28. Work began at 7:30am Newfoundland time (10h00 UTC) and was completed by 5:30pm Newfoundland time (20h00 UTC). The work expanded our InfiniBand network and increased the capacity of our backend infrastructure to accommodate almost 30 additional nodes, which will be added over the coming days.
: 17:30, March 28, 2022 (NST)
 
* Memorial University IT services interrupted network service to Siku just after midnight Newfoundland time (03h30 UTC) on Tuesday, Mar 1, 2022, to perform maintenance. The interruption lasted less than 30 minutes. During this time, jobs were prevented from starting to avoid failures caused by the lack of an external network connection; job starts have now resumed.
: 00:25, March 1, 2022 (NST)
 
* Memorial University's networks are online again and access to Siku has been restored. Siku's scheduler had stopped at some point during the outage, but was restarted on Sat, Jan. 8 at 10:20am (NST). Jobs have been running on Siku since then.
: ''Update Monday Jan 10, 13:15 (NST)'': The onset of the network interruption was also accompanied by a power fluctuation that caused some (but not all) compute nodes to reboot.
: 13:00, January 8, 2022 (NST)
 
* Memorial University has announced that they are experiencing a widespread internet outage. Access to Siku is therefore currently not possible, but we expect the system to continue running jobs until internet access is restored.
: ''Update 14:10 NST'': Memorial University has announced on their Twitter account that the issue was caused by an internal technology malfunction. MUN-ITS is working on a fix.
: 13:30, January 7, 2022 (NST)
==== 2021 ====