Rackham User Guide
This is the user guide to Rackham, a high performance computer cluster at UPPMAX. Guides for the other systems at UPPMAX can be found here.
Please read this User Guide for up-to-date information.
All heavy usage of the cluster must go through the batch system, SLURM. The login nodes only allow up to 30 minutes of CPU time per process.
The login node for Rackham is called rackham.uppmax.uu.se. In fact, there may be four login nodes hidden behind this name; you will be automatically redirected to any one of these.
Rackham consists of 304 compute servers (nodes), each with two 10-core Xeon E5-2630 V4 processors running at 2.2 GHz (turbo 3.1 GHz). We provide 272 nodes with 128 GB memory (r33-r304) and 32 nodes with 256 GB (r1-r32). All nodes are interconnected with a 2:1 oversubscribed FDR (56 Gb/s) Infiniband fabric. In total Rackham provides 6080 CPU cores in compute nodes.
The login nodes (rackham1-rackham4) are identical to the compute nodes but each with 256 GB of memory and a dedicated Nvidia Quadro K2200 graphics card.
Important information about computer architecture
Note: This section is for users migrating their code from the Tintin cluster to Rackham. If you have never compiled your code for Tintin you can skip this section.
Rackham has a significantly different architecture from Tintin, so it may be important for you to recompile your applications to get them to run faster on Rackham. For good speed, UPPMAX recommends that you compile with the Intel compiler, the Intel Math Kernel Library (MKL) and Intel MPI. These build tools are available from the module system. Note that, due to features available only on Rackham CPUs, codes compiled for Tintin's AMD CPUs will normally *not* run at all on Rackham. So even for non-performance-critical codes you should take care to run on Rackham only codes compiled on Rackham, and on Tintin only codes compiled on Tintin.
Symptoms of mixing machines when compiling and running could, apart from missing libraries, be the program terminating due to illegal instructions.
OS and software
There are several compilers available through the module system on Rackham. This gives you flexibility to obtain programs that run optimally on Rackham.
- gcc - the newest version usually generates the best code, if you tell it to use the new instructions. Check which version is the newest by doing module avail.
The compiler executable is named gcc for C, g++ for C++, and gfortran for Fortran.
To use the new instructions available on Rackham (AVX2 and FMA3), give the additional options "-mavx2 -mfma" to gcc. For good performance with this compiler you should also specify optimization at level -O2 or, preferably, -O3. Also try using -march=broadwell for GCC >= 4.9.0 or -march=core-avx2 for GCC 4.8.x, which will enable all the instructions on the CPU.
- Intel+MKL - usually generates the fastest code. As with gcc, it is good to use the latest version. The compiler executable is named icc for C, icpc for C++, and ifort for Fortran. You should give optimization options at least -O2, preferably -O3 or -fast. You can also try to use the -xCORE-AVX2 option to the compiler to output AVX2 instructions.
- pgi - often generates somewhat slower code, but it is stable, so it is often easier to obtain working code, even with quite advanced optimizations. The compiler executable is named pgcc for C, pgCC for C++, and pgfortran, pgf77, pgf90, or pgf95 for Fortran. For this compiler, you can generate code for Rackham using the following options "UPDATES NEEDED". Also give optimization options at least -O2, preferably -Ofast. The compile times are much longer, but the result is often worth the wait.
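The gcc advice above depends on the compiler version, so the choice of -march flag can be scripted. The following is a minimal sketch, not an official UPPMAX tool: the function name gcc_march_flag is hypothetical, and it only covers the two cases described above (GCC 4.8.x and GCC >= 4.9.0).

```shell
#!/bin/bash
# Hypothetical helper: pick the -march flag for Rackham from a GCC version string.
# Only the cases described in this guide are handled (4.8.x and >= 4.9.0);
# older compilers are not checked for.
gcc_march_flag() {
    local version="$1"
    local major minor
    major="${version%%.*}"      # text before the first dot
    minor="${version#*.}"       # text after the first dot ...
    minor="${minor%%.*}"        # ... up to the next dot
    if [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 9 ]; }; then
        echo "-march=broadwell"     # GCC >= 4.9.0
    else
        echo "-march=core-avx2"     # GCC 4.8.x
    fi
}

# Possible use after loading a gcc module (myprog.c is a placeholder):
#   CFLAGS="-O3 $(gcc_march_flag "$(gcc -dumpversion)")"
#   gcc $CFLAGS -o myprog myprog.c
```

Newer gcc releases may print only the major number for -dumpversion (for example "9"); the sketch treats that as a version newer than 4.9.0.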
See our software pages for more details about OS, compilers and installed software.
You will probably have good use of the following commands:
- uquota - telling you about your file system usage.
- projinfo - telling you about the CPU hour usage of your projects.
- jobinfo - telling you about running and waiting jobs on Rackham.
- finishedjobinfo - telling you about finished jobs on Rackham.
- projmembers - telling you about project memberships.
- projsummary [project id] - summarizing some useful information about projects.
For SLURM commands and for commands like projinfo, jobinfo and finishedjobinfo, you may use the "-M" flag to ask for the answer to be given for a system that you are not logged in to. E.g., when logged in to another UPPMAX system, you may ask about current core hour usage on Rackham with the command projinfo -M Rackham
Accounts and log in
All access to this system is done via secure shell (a.k.a. SSH) interactive login to the login node, using the domain name rackham.uppmax.uu.se (replace username with your own UPPMAX user name):
ssh -AX username@rackham.uppmax.uu.se
To get a user account and start using UPPMAX, see the Getting Started page.
For questions concerning accounts and access to Rackham, please contact UPPMAX support.
Note that the machine you arrive at when logged in is only a so-called login node, where you can do various smaller tasks. We have some limits in place that restrict your usage. For larger tasks you should use our batch system, which pushes your jobs onto other machines within the cluster.
Using the batch system
To allow a fair and efficient usage of the system we use a resource manager to coordinate user demands. On Rackham we use the SLURM resource manager. Read our SLURM user guide for detailed information on how to use SLURM.
- There is a job walltime limit of ten days (240 hours).
- We restrict each user to at most 5000 running and waiting jobs in total.
- Each project has a 30-day running allocation of CPU hours. We do not forbid running jobs after the allocation is overdrawn. Instead, such jobs are submitted with a very low queue priority, so that you may be able to run them anyway if a sufficient number of nodes happens to be free on the system.
- Very wide jobs will only be started within a maintenance window (just before the maintenance window or at the end of the maintenance window). These are planned for the first Wednesday of each month. On Rackham a "very wide" job asks for 100 nodes or more.
$SNIC_TMP - Path to node-local temporary disk space
The $SNIC_TMP variable contains the path to a node-local temporary file directory that you can use when running your jobs, in order to get maximum disk performance (since the disks are local to the current compute node). This directory will be automatically created on your (first) compute node before the job starts and automatically deleted when the job has finished.
The path specified in $SNIC_TMP is equal to the path: /scratch/$SLURM_JOB_ID, where the job variable $SLURM_JOB_ID contains the unique job identifier of your job.
WARNING: Please note that in your "core" (see below) jobs, if you write data in the /scratch directory but outside of the /scratch/$SLURM_JOB_ID directory, your data may be automatically deleted during your job run.
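A common way to use $SNIC_TMP is to copy input to the node-local directory, run there, and copy the results back before the job ends, since the directory is deleted afterwards. The following is a minimal sketch under that assumption; the function name run_in_scratch and the "work" subdirectory are hypothetical and not part of the system.

```shell
#!/bin/bash
# Hypothetical helper: run a command in node-local scratch and copy results back.
# Usage: run_in_scratch INPUT_FILE RESULT_DIR COMMAND [ARGS...]
run_in_scratch() {
    local input="$1" result_dir="$2"
    shift 2
    # $SNIC_TMP is provided inside a job; refuse to run without it
    if [ -z "${SNIC_TMP:-}" ] || [ ! -d "$SNIC_TMP" ]; then
        echo "SNIC_TMP is not set or is not a directory" >&2
        return 1
    fi
    # Stay inside $SNIC_TMP (i.e. /scratch/$SLURM_JOB_ID), never directly in /scratch
    local work="$SNIC_TMP/work"
    mkdir -p "$work" "$result_dir" || return 1
    cp "$input" "$work/" || return 1
    # Run in a subshell so the caller's working directory is unchanged
    ( cd "$work" && "$@" ) || return 1
    # Copy everything back before the job ends and the directory is deleted
    cp -R "$work"/. "$result_dir"/
}

# Possible use inside a job script (file and program names are placeholders):
#   run_in_scratch input.dat "$HOME/results" ./myprog input.dat
```

The copy back also includes the staged input file; a real script might copy only the output files it needs.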
Details about the "core" and "node" partitions
A normal Rackham node contains 128 GB of RAM and 20 compute cores. An equal share of RAM for each core would mean that each core gets at most 6.4 GB of RAM. This simple calculation gives one of the limits mentioned below for a "core" job.
You need to choose between running a "core" job or a "node" job. A "core" job must keep within certain limits, to be able to run together with up to 19 other "core" jobs on a shared node. A job that cannot keep within those limits must run as a "node" job.
Some serial jobs must run as "node" jobs. You tell Slurm that you need a "node" job with the flag "-p node". (If you forget to tell Slurm, you are by default choosing to run a "core" job.)
A "core" job:
Will use a part of the resources on a node, from a 1/20 share to a 19/20 share of a node.
Must specify fewer than 20 cores, i.e. between "-n 1" and "-n 19".
Must not demand "-N", "--nodes", or "--exclusive".
Is recommended not to demand "--mem".
Must not demand to run on a fat node (see below, for an explanation of "fat"), a devel node or a GPU node.
Must not use more than 6.4 GB of RAM for each core it demands. If a job needs half of the RAM, i.e. 64 GB, you need to reserve also at least half of the cores on the node, i.e. 10 cores, with the "-n 10" flag.
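The sizing rule above can be worked out mechanically: a job needs at least as many cores as its memory demand divided by the equal share of a thin node (128 GB over 20 cores), and anything that would need all 20 cores has to be a "node" job. The following is a minimal sketch of that arithmetic; the function name suggest_flags is hypothetical, it takes whole numbers only, and it does not handle jobs that need more than 128 GB (those need a fat node).

```shell
#!/bin/bash
# Hypothetical helper: suggest sbatch flags for a job on a thin node
# (128 GB RAM, 20 cores). Usage: suggest_flags CORES MEM_GB
suggest_flags() {
    local cores="$1" mem_gb="$2"
    local node_cores=20 node_mem_gb=128
    # Cores needed so that the equal RAM share covers mem_gb:
    # ceil(mem_gb * 20 / 128), done in integer arithmetic
    local cores_for_mem=$(( (mem_gb * node_cores + node_mem_gb - 1) / node_mem_gb ))
    # Take the larger of the requested cores and the cores implied by memory
    local n=$(( cores > cores_for_mem ? cores : cores_for_mem ))
    if [ "$n" -lt 1 ]; then n=1; fi
    if [ "$n" -lt "$node_cores" ]; then
        echo "-p core -n $n"     # fits as a "core" job (1 to 19 cores)
    else
        echo "-p node"           # needs the whole node
    fi
}

suggest_flags 1 64    # one process needing half the RAM of a thin node
```

For the 64 GB case mentioned above, the sketch prints "-p core -n 10", matching the "-n 10" recommendation.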
A "core" job is accounted on your project as one "core hour" (sometimes also called a "CPU hour") per core you have been allocated, for each wallclock hour that it runs. On the other hand, a "node" job is accounted on your project as twenty core hours for each wallclock hour that it runs, multiplied by the number of nodes that you have asked for.
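As a rough illustration of this accounting, the charge for a job can be estimated before submitting it. The following is a minimal sketch, assuming a "node" job is charged for all 20 cores of each node it occupies; the function name core_hours is hypothetical and it handles whole hours only.

```shell
#!/bin/bash
# Hypothetical helper: estimate the core hours charged to a project.
# Usage: core_hours core CORES HOURS
#        core_hours node NODES HOURS
core_hours() {
    local kind="$1" count="$2" hours="$3"
    local cores_per_node=20   # assumption: 20 cores in every Rackham compute node
    case "$kind" in
        core) echo $(( count * hours )) ;;
        node) echo $(( count * cores_per_node * hours )) ;;
        *)    echo "unknown job type: $kind" >&2; return 1 ;;
    esac
}

core_hours core 4 10    # "core" job with -n 4 running for 10 hours
core_hours node 2 24    # "node" job on 2 nodes running for 24 hours
```

The two calls print 40 and 960, so a short 4-core job is far cheaper than occupying two full nodes for a day.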
Rackham has two node types, thin being the typical cluster node and fat nodes having double the amount of memory available normally (256 GB). You may specify a node with more RAM, by adding the words "-C mem256GB" or "-C fat" to your job submission line and thus making sure that you will get 256 GB of RAM on each node in your job. Please note that there are only 32 nodes with this amount (or more) of RAM.
To request a fat node, use -C mem256GB or -C fat in your sbatch command. Note the upper-case -C; the lower-case -c flag has a different meaning in sbatch (CPUs per task).
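A job script requesting a fat node might look like the following. This is a minimal sketch, not an official template: the project ID myproject is a hypothetical placeholder, the one-hour time limit is arbitrary, and the final lines only print where the job runs. The -C option is sbatch's constraint flag, and "node" is used because "core" jobs must not demand a fat node.

```shell
#!/bin/bash -l
#SBATCH -A myproject      # hypothetical project ID, replace with your own
#SBATCH -p node           # fat nodes cannot be requested by "core" jobs
#SBATCH -N 1              # one whole node
#SBATCH -C mem256GB       # or: -C fat
#SBATCH -t 01:00:00       # requested wall time (arbitrary example value)

# SLURM_JOB_ID is set by Slurm inside a job; "unknown" is shown when run by hand
msg="Job ${SLURM_JOB_ID:-unknown} running on $(uname -n)"
echo "$msg"
```

The same options can be given on the command line instead, for example sbatch -p node -N 1 -C fat followed by the script name.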
File storage and disk space
At UPPMAX we have a few different kinds of storage areas for files; see the Disk Storage User Guide for more information and recommended use.