==== 2025 ====
* '''March 18-20, 2025''': Both '''Siku and Argo''' were offline from March 18 to 20 for network and system maintenance.<br/>During the outage, the public IP addresses of both clusters changed and moved to a different subnet, and software updates were installed.<br/>This outage also resolved a performance regression for certain MPI jobs that ran on nodes connected across more than one InfiniBand leaf switch and were close to their scaling limit.
*: Wed Mar 12 2025 11:30 NDT
*: '''UPDATE March 18, 08h30 NDT''': The planned maintenance has started. We will continue to post updates here.
*: '''UPDATE March 20, 10h10 NDT''': The planned maintenance has been completed and job scheduling has been resumed.
* '''Siku''' is operating at reduced capacity due to problems with the cooling in the data centre. New jobs will not be started until we are confident that the temperature in the room will remain stable.