Uppsala Multidisciplinary Center for Advanced Computational Science

Test Galaxy on UPPMAX

This is a tutorial on how to try out Galaxy running on UPPMAX. Keep in mind that this is not the final product. The goal is to integrate all of this into the upcoming Bioclipse client, making the whole process as easy as pressing a button.

Meanwhile, prepare to get your hands a little dirty.

If you are running Windows, please use MobaXterm as your SSH program. I have tried the port forwarding with it and it works perfectly.

For information about Galaxy itself, please visit its official wiki.

Installation

If this is the first time you are doing this, you have to install Galaxy in a suitable location. This can be anywhere you have write permission, but your own glob/ or your project's glob/ is recommended.

Step 1: Load the Galaxy module

Since there is no Galaxy module at the moment, we'll have to simulate loading it by manually doing what module loading usually does: changing your PATH variable. Type the following command in your UPPMAX terminal window:

$ export PATH=$PATH:/sw/apps/build/slurm-drmaa/other/scripts
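If you run the export line several times in the same session, the same directory ends up in PATH more than once. Here is a small sketch that only appends the directory when it is missing. The function name path_append is my own invention, not something provided by UPPMAX.

```shell
# Append a directory to PATH only if it is not already listed,
# so repeating the step does not pile up duplicate entries.
# (path_append is a hypothetical helper name.)
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
    export PATH
}

path_append /sw/apps/build/slurm-drmaa/other/scripts

# Tell the user if the installer still cannot be found.
command -v galaxy-installer >/dev/null 2>&1 \
    || echo "galaxy-installer not found on PATH" >&2
```

The warning on the last line only means that the scripts directory does not exist on the machine you are on, for example if you run this on your own computer instead of on UPPMAX.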

Step 2: Install Galaxy

The installation script will simply copy a working Galaxy to the location you specify.

$ galaxy-installer [installation path] [default project]

E.g.

$ galaxy-installer ./galaxy b2010074
or
$ galaxy-installer /bubo/glob/g18/dahlo/galaxy b2010074

That's it, now you have Galaxy installed! These steps are only necessary the first time you do this.
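To double-check that the copy ended up where you expected, you can look for run.sh in the installation path, since that is the script used to start Galaxy later on. A minimal sketch, assuming the installer places run.sh directly in the path you gave it; the function name galaxy_installed is made up for this example.

```shell
# Return success if <dir> looks like a Galaxy installation.
# Assumption: run.sh sits directly in the installation path.
# (galaxy_installed is a hypothetical helper name.)
galaxy_installed() {
    [ -f "$1/run.sh" ]
}
```

Use it like this: galaxy_installed ./galaxy && echo "Galaxy is in place"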

Starting Galaxy

These are the steps you will follow every time you want to start Galaxy. If you come up with some clever way to do things that I have not thought of, please let me know!

Step 1: Book a node/core

Since there is a 30 min time limit for running programs on the login nodes, we have to book a node or a core to run Galaxy on. I have not tried the core booking, so please do and let me know if it works. I mean, what's the worst that could happen?

EDIT: Tried the core booking now, and it seems to be working perfectly. But as Roman pointed out during the meeting, there could be performance impacts if multiple users are doing stuff simultaneously.

To book a node/core that will remain active even when no programs are running on it:

$ salloc -A [uppmax project] -t [time allocation] -p [node or core] --no-shell

E.g.

$ salloc -A b2010074 -t 24:00:00 -p node --no-shell
or
$ salloc -A b2010074 -t 2-00:00:00 -p core --no-shell
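In the two examples, 24:00:00 is hours:minutes:seconds and 2-00:00:00 is days-hours:minutes:seconds. If you want to avoid typos, the booking command can be put together by a small helper that only prints it, so you can look it over before running it. This is just a sketch; the function name galaxy_salloc_cmd is hypothetical.

```shell
# Print (do not run) the salloc command for a Galaxy booking.
# Usage: galaxy_salloc_cmd <uppmax project> <time allocation> <node|core>
# (galaxy_salloc_cmd is a hypothetical helper name.)
galaxy_salloc_cmd() {
    case "$3" in
        node|core) ;;
        *) echo "partition must be 'node' or 'core'" >&2; return 1 ;;
    esac
    printf 'salloc -A %s -t %s -p %s --no-shell\n' "$1" "$2" "$3"
}
```

For example, galaxy_salloc_cmd b2010074 2-00:00:00 core prints the second command above.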

Step 2: Check which node/core you got

To check which node your booking is located on, wait until your job has started and type:

$ jobinfo -u [your user name]

E.g.

$ jobinfo -u dahlo

This will give you a printout of all your jobs and then some more. The top part of it will show you your currently running jobs:

CLUSTER: milou
Running jobs:
JOBID PARTITION NAME USER ACCOUNT ST START_TIME TIME_LEFT NODES CPUS NODELIST(REASON)
1464950 core (null) dahlo b2010074 R 2011-10-07T11:08:26 11:49:25 1 1 q47

Look at the rightmost part of it, q47 in this example. This is the name of the node your job is running on. Remember this.
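If you would rather not pick the node name out by eye, the last column of the running job line can be extracted with awk. This is a rough sketch that assumes the column layout shown above (job ID first, state in the sixth column, node last) and a job name without spaces. The function name running_nodes is my own.

```shell
# Read jobinfo-style output on stdin and print the node of every running job.
# Assumptions: the line starts with a numeric job ID, column 6 is the
# state ("R" = running) and the last column is the node list.
# (running_nodes is a hypothetical helper name.)
running_nodes() {
    awk '$1 ~ /^[0-9]+$/ && $6 == "R" { print $NF }'
}
```

Use it like this: jobinfo -u dahlo | running_nodes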

Step 3: Start Galaxy on the booked node

Open up a new terminal on your computer, one which is not connected to UPPMAX. The following command will create a double SSH tunnel that forwards port 8080 (or any other local port, if you wish) on your computer to port 8080 on the booked node at UPPMAX, which is where Galaxy listens.

$ ssh -L 8080:localhost:[random port number] [user name]@milou.uppmax.uu.se 'ssh -t -t -L [random port number]:localhost:8080 [node name] "sh [path to galaxy]/run.sh"'

There are 4 variables that you have to insert on your own here:

[random port number] = pick a number between 20 000 and 63 000
Use the same number in both locations in the command.

[user name] = your UPPMAX user name

[node name] = the name of your booked node. q47 in this example.

[path to galaxy] = the same path as you typed in the installation process
/glob/dahlo/work/galaxy in this example.

E.g.

$ ssh -L 8080:localhost:34567 dahlo@milou.uppmax.uu.se 'ssh -t -t -L 34567:localhost:8080 q47 "sh /glob/dahlo/work/galaxy/run.sh"'
or, using local port 1234 instead of 8080 (in which case the address in Step 4 becomes http://localhost:1234):
$ ssh -L 1234:localhost:50000 dahlo@milou.uppmax.uu.se 'ssh -t -t -L 50000:localhost:8080 q312 "sh ~/glob/work/galaxy-central/run.sh"'

Executing this command will first ask you for your UPPMAX password, and then loads of text will flash across the screen while Galaxy is starting up. When the text stops flowing and you see the following:

Starting server in PID 17459.
serving on http://127.0.0.1:8080

the server is started.
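Since the tunnel command in this step is long and easy to get wrong, here is a sketch of a helper that fills in the four values and prints the finished command for you to copy. It does not run ssh itself. The function names random_port and galaxy_tunnel_cmd are hypothetical, the host name is the one from the command template above, and the node side is fixed to port 8080 because that is where Galaxy reports it is serving.

```shell
# Pick a port number in the suggested range 20000-63000.
# (random_port is a hypothetical helper name.)
random_port() {
    awk 'BEGIN { srand(); print 20000 + int(rand() * 43001) }'
}

# Print (do not run) the double SSH tunnel command.
# Usage: galaxy_tunnel_cmd <user name> <node name> <path to galaxy> [random port] [local port]
# The same random port is used in both places, as the step requires.
# (galaxy_tunnel_cmd is a hypothetical helper name.)
galaxy_tunnel_cmd() (
    user=$1
    node=$2
    galaxy=$3
    mid=${4:-$(random_port)}     # port on the login node
    lport=${5:-8080}             # port on your own computer
    printf "ssh -L %s:localhost:%s %s@milou.uppmax.uu.se 'ssh -t -t -L %s:localhost:8080 %s \"sh %s/run.sh\"'\n" \
        "$lport" "$mid" "$user" "$mid" "$node" "$galaxy"
)
```

For example, galaxy_tunnel_cmd dahlo q47 /glob/dahlo/work/galaxy 34567 prints the tunnel command for user dahlo, node q47 and port 34567. Leave out the port number to get a random one.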

Step 4: Connect to Galaxy

Here is the easy part. Just open your favorite browser and type in the address:

http://localhost:8080

Congratulations, you are now running Galaxy!

If you are experiencing problems with Galaxy complaining that some programs are not installed, it might be because the module for that program is not loaded. Edit the file [path to galaxy]/startup_settings and add that module to the list of modules being loaded when Galaxy starts.

The default modules are:

export SLURM_MOD="bioinfo-tools bowtie samtools tophat"
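Editing the file by hand works fine, but if you want to script it, the following sketch appends a module to that line. It assumes startup_settings contains a single line of exactly the form shown above. The function name add_galaxy_module is my own, and the module name is treated as a pattern, so stick to plain names.

```shell
# Append a module to the SLURM_MOD line of a startup_settings file.
# Usage: add_galaxy_module <path to startup_settings> <module>
# Assumption: the file has one line of the form export SLURM_MOD="..."
# (add_galaxy_module is a hypothetical helper name.)
add_galaxy_module() (
    file=$1
    module=$2
    # Nothing to do if the module is already in the list.
    if grep -Eq "^export SLURM_MOD=\"([^\"]* )?${module}( [^\"]*)?\"" "$file"; then
        exit 0
    fi
    tmp=$(mktemp) || exit 1
    # Insert the module just before the closing quote, then write the
    # result back into the original file (keeps its permissions).
    sed "s|^\(export SLURM_MOD=\"[^\"]*\)\"|\1 ${module}\"|" "$file" > "$tmp" \
        && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    exit "$status"
)
```

For example, add_galaxy_module /glob/dahlo/work/galaxy/startup_settings bwa would add bwa to the end of the list (bwa is only an example module name). Restart Galaxy afterwards so the new list is loaded.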