Maintenance window Wednesday 2016-12-07 -- FINISHED
The monthly maintenance window begins at 09:00.
This time we will:
- Upgrade kernel and other system software on all nodes of Fysast1, Irma, Milou, Tintin, and Smog
- Restart Lupus
Slurm queues are stopped on Irma, but not on the other systems. We expect to finish sometime during Wednesday afternoon.
Login nodes will be rebooted once during the day (we will warn logged-in users an hour in advance).
During the maintenance day, we will post occasional progress updates on this web page.
Update at 13:30
Upgrades are done on Milou and Tintin. This means all running jobs will finish, and then the nodes will be restarted and updated before new jobs are started. No jobs are interrupted, but the queue might be slower during the coming week.
Update at 18:00
Most of our filesystem checks on Lupus are done. We are working on getting Irma back in production this evening or tomorrow.
Update December 8, 14:00
The work planned for the service window is complete on Irma and Lupus, and the queues on Irma are open again.
Problem with Slurm on Milou -- fixed
Interruptions in the Slurm service on Rackham -- fixed
Bianca's storage system Castor has problems -- fixed
Resetting your password from the homepage is not working -- fixed
Resetting your password from this page is currently not working. If you need to reset your password, please contact firstname.lastname@example.org.
Update 2017-04-18: This issue should now be fixed.
Funk-accounts and new certificates
Some of the shared funk-accounts used on Irma and Milou might stop working due to the IP address change.
Maintenance window Wednesday 2017-04-05 -- finished
Smog will be decommissioned on Wednesday 5th of April
Smog will be decommissioned on Wednesday 5th of April. As previously mentioned, the SNIC Cloud Team is currently working on bringing up a new cloud to replace Smog and join the other two regions in the SNIC Science Cloud project.
For questions, please contact email@example.com (and not the UPPMAX support queues).
Rackham2, one of Rackham's login nodes, had problems -- now fixed
Maintenance window for Bianca Wednesday 2017-03-22 -- finished
Problem with file permissions in certain projects
Poor performance using Intel MPI on Rackham
We have identified performance issues when using Intel MPI on Rackham. In some cases you may see a 10x slowdown (or worse) using Intel MPI compared to Open MPI. We are investigating this issue and hope to have it solved soon. For now, please use Open MPI.
Fixed: "Project p123456 may not run jobs on this cluster (rackham)"
An issue exists on Rackham affecting projects with names of the form "p123456". These projects are not allowed to run jobs because the monthly core allocation is incorrectly set to 0 hours. We are investigating why this happens.
Update 2017-03-10: The issue should now be fixed.
Rackham will soon be open for all users
Many Tintin users have missed that Rackham will replace Tintin. We are currently migrating all projects from Tintin to Rackham, and when this is done, all users will get access to Rackham. We will announce this by email and on our homepage.
Maintenance window Wednesday 2017-03-01 -- finished
Today we decommission Tintin
The 1st of March 2017 is the day we decommission Tintin. It will be replaced by our new Rackham cluster, and all projects on Tintin will be moved there.
Poor performance on Milou and Tintin