Uppsala Multidisciplinary Center for Advanced Computational Science

Maintenance window Wednesday 2017-05-03 -- finished

2017-05-03

The monthly maintenance window begins at 0900 hours on the first Wednesday of the month (that is, today).

This time we will:

  • Upgrade Slurm, the Linux kernel, and other system software on Bianca, Fysast1, Irma, Milou, and Rackham.
  • Upgrade the Linux kernel and other system software on Castor and Grus.
  • Physically move one of the OpenStack server machines of Bianca from one chassis to another.

Bianca and Grus will be unavailable while we service them.

We will restart all login nodes of Fysast1, Irma, Milou and Rackham, probably only once.

Slurm jobs on Fysast1, Irma, Milou and Rackham will continue to run, but Slurm commands will be unavailable at times during the day.

Slurm queues on Bianca will be stopped and, for most of the day, logins to Bianca will not be possible.

We plan to keep you informed about our progress with the maintenance through updates here.

Update at 1210 hours

Parts of Bianca and Castor have been updated.

We have run into some unexpected problems with the new Slurm version. The first machine we are testing it on is Irma, so Slurm is currently unavailable on Irma. We apologize for that.

Update at 1605 hours

We are now giving up on the new Slurm version and going back to the old one.

Update at 1730 hours

We have changed back to the Slurm version that was running yesterday.

Some login nodes have not yet been restarted, but will be soon.

Servicing of Bianca continues tomorrow. Milou-f will be restarted this evening or tomorrow.

Update Thursday at 0845 hours

We will soon restart the login node of Fysast1.

Maintenance of Bianca continues today. We are trying to improve the compute nodes of the project clusters.

Irma, Rackham, and the UPPNEX part of Milou are back in production. Compute nodes will upgrade themselves automatically, so the waiting time in Slurm queues will be longer than normal today.

Update Thursday at 1545 hours

We have lost part of the connection to the compute nodes of Fysast1 and are working to restore it.

Maintenance on Bianca has finished and we will soon allow new logins.

Update Thursday at 1600 hours

Bianca is back in production.

Update Friday at 0920 hours

Most compute nodes of Fysast1 are now available. We will probably close the maintenance window soon.

Update Friday at 1135 hours

The connection to the compute nodes of Fysast1 has been fully restored. Maintenance is now finished.

The next maintenance day is June 7th.
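
For anyone planning jobs around future windows, the first-Wednesday rule stated at the top of this post can be checked with a few lines of Python. This is only a minimal sketch: the function name first_wednesday is made up for illustration and is not part of any UPPMAX tool, and actual maintenance dates may be moved, so treat the announcements here as authoritative.

```python
import datetime

def first_wednesday(year, month):
    """Return the date of the first Wednesday of the given month.

    Hypothetical helper for illustration only.
    """
    first = datetime.date(year, month, 1)
    # date.weekday() counts Monday as 0, so Wednesday is 2.
    offset = (2 - first.weekday()) % 7
    return first + datetime.timedelta(days=offset)

print(first_wednesday(2017, 5))  # 2017-05-03, this maintenance window
print(first_wednesday(2017, 6))  # 2017-06-07, the next one
```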
