Uppsala Multidisciplinary Center for Advanced Computational Science

Maintenance window Wednesday 2017-03-01 -- finished


Maintenance starts at 0900 hours and will probably last all day long. This time, we will:

  • Install new main Ethernet switches, which will affect access to all login nodes in the morning, because UPPMAX will be disconnected from the internet. Installing the new switches will also affect the connection to Pica during the day, so we will stop the queues on Fysast1, Milou, Rackham, and Tintin.

  • Upgrade kernel and other system software on all nodes of Bianca, Fysast1, Irma, Milou, Mosler, and Rackham.

  • Decommission Tintin. UPPMAX will start migrating all active Tintin projects to Rackham. The migration is expected to take up to three days. When it has finished, all Tintin users will be able to log in to Rackham, and their projects can continue their activities there. Home directories will remain on Pica, as they are now, but project directories will move to Crex, a new storage system on Rackham. Note that the job queue on Tintin will be dropped, so any jobs still queuing when the maintenance window starts will need to be requeued on Rackham.

Update at 1130 hours:

The new Ethernet switches have been installed and are now configured. We have also corrected a few errors in our electrical UPS (uninterruptible power supply) system.

Please do not log in yet. Not everything is working as intended yet.

Update at 1540 hours:

We are still configuring our networks and upgrading our clusters.

Update at 1600 hours:

We have finished today's maintenance for Mosler.

Update at 1810 hours:

Bianca and Irma are also back in production.

Update at 1830 hours:

We have finished maintenance on Fysast1 and Milou.

Rackham is still in maintenance. We also have a problem with internet access from compute nodes, which we hope to fix tomorrow, Thursday.

Update at 2030 hours:

Maintenance of all compute resources has now finished.

We still cannot reach the internet from compute nodes. We will try to fix that problem tomorrow. Please tell us if you notice other problems.

Update Thursday at 1710 hours:

We have not yet been able to give our compute nodes internet access, but we will continue tomorrow.

Update Friday at 1100 hours:

Compute nodes now have internet access again, as before the maintenance, so we are closing the maintenance window.

The next maintenance window is planned for Wednesday, April 5th.