Uppsala Multidisciplinary Center for Advanced Computational Science

Visualizing sequencing data with UCSC Genome Browser

General information

This guide walks through, step by step, how to visualize sequencing data with the UCSC Genome Browser.

WARNING!!
Your data will be available on a web server until you delete the copy of the data in the webexport folder. The server does its best to protect the data by only allowing access from the UCSC servers, but web servers are by nature insecure and should not be trusted with highly sensitive data. For most users this will not be a concern.

Also keep in mind that, until you delete the copy of the data in the webexport folder, the data is accessible to other UPPMAX users through SSH.
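As a rough sketch of how the copy could be deleted once you are done, the helper below removes a single exported run. It assumes the directory layout /proj/<project id>/webexport/ucscVis/<run directory>, taken from the example path later in this guide. The function name remove_ucscvis_run and the PROJ_ROOT variable are made up for illustration and are not part of the www-tools module.

```shell
# Hypothetical helper, not part of www-tools: delete one exported ucscVis run.
# Assumed layout: $PROJ_ROOT/<project id>/webexport/ucscVis/<run directory>
# PROJ_ROOT defaults to /proj and exists only so the helper can be tried
# out on a scratch directory first.
remove_ucscvis_run() {
    proj="$1"
    run="$2"
    root="${PROJ_ROOT:-/proj}"

    # Refuse to run with missing arguments, so an empty variable can never
    # widen the path that gets deleted.
    if [ -z "$proj" ] || [ -z "$run" ]; then
        echo "usage: remove_ucscvis_run <project id> <run directory>" >&2
        return 2
    fi

    dir="$root/$proj/webexport/ucscVis/$run"
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi

    rm -r -- "$dir"
    echo "removed $dir"
}
```

For the run used as an example in this guide, the call would be: remove_ucscvis_run b2010074 2011-10-03-17-02-35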

Visualize data with UCSC

Step 1: Activate your webexport folder

Follow the Webexport Guide to activate your webexport folder, if you have not already done so.

Step 2: Connect to UPPMAX with SSH

ssh <username>@milou.uppmax.uu.se

Ex:

ssh dahlo@milou.uppmax.uu.se

On Windows, MobaXterm is the recommended SSH client. Linux and Mac have built-in terminals that can be used instead.

Step 3: Load the www-tools module

module load www-tools

Step 4: Run the ucscVis script

ucscVis --proj <project id> --bam <bam-files> (--gtf <gtf-files>) (--db <UCSC assembly name>)

Explanation of options:
--proj The UPPMAX project ID
--bam Comma separated list of sorted and indexed BAM-files (file1.bam,file2.bam,file3.bam)
--gtf Optional. Comma separated list of GTF-files (file1.gtf,file2.gtf,file3.gtf)
--db Optional. Name of the UCSC reference genome assembly you want to display your data on. See http://genome.ucsc.edu/FAQ/FAQreleases.html#release1 (column "UCSC version") for the available names. If this option is not specified, hg19 is used by default.

The --gtf option can be used to visualize custom annotations together with the BAM-files, for example assembled transcripts from Cufflinks. The files must be in GTF format.

The files specified in --bam MUST be sorted and indexed, with the .bai file present in the same directory as the .bam-file and named as follows: if the bam-file is named 'file1.bam' the index file must be named 'file1.bai'.
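The naming rule above can be checked before running the script. The following is a minimal sketch of such a pre-flight check. The function name check_bam_indexes is made up for illustration and is not part of the www-tools module. It only checks that the files exist, not that the BAM-files are actually sorted.

```shell
# Hypothetical pre-flight check, not part of www-tools.
# Takes the same comma separated list that is passed to --bam and verifies
# that every BAM-file has its index next to it, named as this guide
# requires (file1.bam -> file1.bai).
# The body runs in a subshell, so the IFS and globbing changes stay local.
check_bam_indexes() (
    IFS=','
    set -f          # no filename globbing while splitting the list
    status=0
    for bam in $1; do
        bai="${bam%.bam}.bai"
        if [ ! -f "$bam" ]; then
            echo "missing BAM-file: $bam" >&2
            status=1
        elif [ ! -f "$bai" ]; then
            echo "missing index: $bai" >&2
            status=1
        fi
    done
    exit "$status"
)
```

A possible use is to chain it in front of the real command: check_bam_indexes file1.bam,file2.bam && ucscVis --proj b2010074 --bam file1.bam,file2.bam

If an index is missing, it can be created with an indexing tool of your choice. With samtools, for instance, the output name can usually be given as a second argument (samtools index file1.bam file1.bai), but check the documentation of the version installed on the cluster.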

Examples:

ucscVis --proj b2010074 --bam file1.bam,file2.bam
ucscVis --bam file1.bam,file2.bam --gtf transcripts.gtf,extra.gtf,myOwn.gtf --proj b2010074 --db hg17
ucscVis --proj b2010074 --gtf transcripts.gtf,extra.gtf,myOwn.gtf --bam ../../file1.bam,/home/dahlo/myBamFile.bam --db myoLuc2

Step 5: Look at your data

When the script has finished running (this may take a while, since it copies all the data to the webexport directory), a message like the following is displayed in the terminal:

URL to results:

https://export.uppmax.uu.se/b2010074/ucscVis/2011-10-03-17-02-35/[chr nr].[start].[stop]
E.g.
https://export.uppmax.uu.se/b2010074/ucscVis/2011-10-03-17-02-35/4.10000.11000
https://export.uppmax.uu.se/b2010074/ucscVis/2011-10-03-17-02-35/X.10000.11000

To reset UCSC Genome Browser, visit the following URL: http://genome.ucsc.edu/cgi-bin/cartReset

In this case, the same information is also written to /proj/b2010074/webexport/ucscVis/2011-10-03-17-02-35/2011-10-03-17-02-35.url, in case you want to read it again later.

To view your data, simply go to the address given to you, in this case https://export.uppmax.uu.se/b2010074/ucscVis/2011-10-03-17-02-35/

To view a specific region of the genome, navigate using the UCSC Genome Browser's built-in tools, or write the address in the following way:

https://export.uppmax.uu.se/b2010074/ucscVis/2011-10-03-17-02-35/[chr nr].[start].[stop]

where [chr nr] is the chromosome name (e.g. 2, 15, X or MT), [start] is the starting position, and [stop] is the end position.
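If you look at many regions, the address can be assembled in the shell instead of by hand. This is only a small sketch of the pattern described above. The function name region_url is made up for illustration.

```shell
# Hypothetical helper: build a region URL of the form
#   <results URL>/[chr nr].[start].[stop]
# Arguments: results URL, chromosome, start position, stop position.
region_url() {
    base="${1%/}"   # drop one trailing slash, if present
    printf '%s/%s.%s.%s\n' "$base" "$2" "$3" "$4"
}

# Example with the results URL used in this guide:
region_url "https://export.uppmax.uu.se/b2010074/ucscVis/2011-10-03-17-02-35/" 4 10000 11000
# prints https://export.uppmax.uu.se/b2010074/ucscVis/2011-10-03-17-02-35/4.10000.11000
```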

Step 6: Reset the UCSC Genome Browser

If you switch between different visualizations, you might notice that UCSC keeps all the recent custom tracks in the same visualization. To clear out all custom tracks and reset UCSC, go to the following URL:

http://genome.ucsc.edu/cgi-bin/cartReset