What Does the Genome Browser Do


7/16/2019 What Does the Genome Browser Do

http://slidepdf.com/reader/full/what-does-the-genome-browser-do 1/35

What does the Genome Browser do?

As vertebrate genome sequences near completion and research refocuses on their analysis, the issue of effective sequence display becomes critical: it is not helpful to have 3 billion letters of genomic DNA shown as plain text! As an alternative, the UCSC Genome Browser provides a rapid and reliable display of any requested portion of genomes at any scale, together with dozens of aligned annotation tracks (known genes, predicted genes, ESTs, mRNAs, CpG islands, assembly gaps and coverage, chromosomal bands, mouse homologies, and more). Half of the annotation tracks are computed at UCSC from publicly available sequence data. The remaining tracks are provided by collaborators worldwide. Users can also add their own custom tracks to the browser for educational or research purposes.

The Genome Browser stacks annotation tracks beneath genome coordinate positions, allowing rapid visual correlation of different types of information. The user can look at a whole chromosome to get a feel for gene density, open a specific cytogenetic band to see a positionally mapped disease gene candidate, or zoom in to a particular gene to view its spliced ESTs and possible alternative splicing. The Genome Browser itself does not draw conclusions; rather, it collates all relevant information in one location, leaving the exploration and interpretation to the user.

The Genome Browser supports text and sequence based searches that provide quick, precise access to any region of specific interest. Secondary links from individual entries within annotation tracks lead to sequence details and supplementary off-site databases. To control information overload, tracks need not be displayed in full. Tracks can be hidden, collapsed into a condensed or single-line display, or filtered according to the user's criteria. Zooming and scrolling controls help to narrow or broaden the displayed chromosomal range to focus on the exact region of interest. Clicking on an individual item within a track opens a details page containing a summary of properties and links to off-site repositories such as PubMed, GenBank, Entrez, and OMIM. The page provides item-specific information on position, cytoband, strand, data source, and encoded protein, mRNA, genomic sequence and alignment, as appropriate to the nature of the track.

A blue navigation bar at the top of the browser provides links to several other tools and data sources. For instance, under the "View" menu, the "DNA" link enables the user to view the raw genomic DNA sequence for the coordinate range displayed in the browser window. This DNA can encode track features via elaborate text formatting options. Other links tie the Genome Browser to the BLAT alignment tool, provide access to the underlying relational database via the Table Browser, convert coordinates across different assembly dates, and open the window at the complementary Ensembl or NCBI Map Viewer annotation.

The browser data represents an immense collaborative effort involving thousands of people from the international biomedical research community. The UCSC Bioinformatics Group itself does no sequencing. Although it creates the majority of the annotation tracks in-house, the annotations are based on publicly available data contributed by many labs and research groups throughout the world. Several of the Genome Browser annotations are generated in collaboration with outside individuals or are contributed wholly by external research groups. UCSC's other major roles include building genome assemblies, creating the Genome Browser work environment, and serving it online. The majority of the sequence data, annotation tracks, and even software are in the public domain and are available for anyone to download.

In addition to the Genome Browser, the UCSC Genome Bioinformatics group provides several other tools for viewing and interpreting genome data:

-- BLAT: a fast sequence-alignment tool similar to BLAST.

-- Table Browser: convenient text-based access to the database underlying the Genome Browser.

-- Genome Graphs: a tool that allows you to upload and display genome-wide data sets such as the results of genome-wide SNP association studies, linkage studies and homozygosity mapping.

-- Gene Sorter: expression, homology, and other information on groups of genes that can be related in many ways.

Getting Started: Genome Browser gateways 

The UCSC Genome Bioinformatics home page provides access to Genome Browsers on several different genome assemblies. To get started, click the Browser link on the blue sidebar. This will take you to a Gateway page where you can select which genome to display.

Opening the Genome Browser at a specific position

To get oriented in using the Genome Browser, try viewing a gene or region of the genome with which you are already familiar, or use the default position. To open the Genome Browser window:

1.  Select the clade, genome and assembly that you wish to display from the corresponding pull-down menus. Assemblies are typically named by the first three characters of an organism's genus and species names. To access older assembly versions that are no longer available from the menu, look in the Genome Browser archives.

2.  Specify the genome location you'd like the Genome Browser to open to. To select a location, enter a valid position query in the search term text box at the top of the Gateway page or accept the default position already displayed. The search supports several different types of queries: gene symbols, mRNA or EST accession numbers, chromosome bands, descriptive terms likely to occur in GenBank text, or specific chromosomal ranges.

3.  Click the submit button to open up the Genome Browser window to the requested location. In cases where a specific term (accession, gene name, etc.) was queried, the item will be highlighted in the display.
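The naming convention from step 1 and the chromosomal-range query from step 2 can be sketched in a few lines of Python. This is only an illustration, not Genome Browser code: both function names are hypothetical, the range syntax (chromosome, colon, start, hyphen, end, with optional thousands separators) is an assumption about a typical query, real assembly names also carry a version suffix, and some assemblies (human, for example) do not follow the genus/species pattern.

```python
import re


def assembly_prefix(genus: str, species: str) -> str:
    """Hypothetical helper: first three letters of genus plus first three of species."""
    return genus[:3].lower() + species[:3].capitalize()


def parse_range(query: str):
    """Hypothetical helper: split an assumed 'chrom:start-end' query into its parts."""
    match = re.fullmatch(r"\s*(\w+):([\d,]+)-([\d,]+)\s*", query)
    if match is None:
        raise ValueError(f"not a chromosomal range: {query!r}")
    chrom = match.group(1)
    # Thousands separators are stripped before converting to integers.
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError("start must not exceed end")
    return chrom, start, end


if __name__ == "__main__":
    print(assembly_prefix("Pan", "troglodytes"))
    print(parse_range("chr7:127,471,196-127,495,720"))
```

A query that does not match the range pattern raises an error here; in the Gateway it would instead be treated as one of the other query types (gene symbol, accession, and so on).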

Occasionally the Gateway page returns a list of several matches in response to a search, rather than immediately displaying the Genome Browser window. When this occurs, click on the item in which you're interested and the Genome Browser will open to that location.

The search mechanism is not a site-wide search engine. Instead, it primarily searches GenBank mRNA records whose text annotations can include gene names, gene symbols, journal title words, author names, and RefSeq mRNAs. Searches on other selected identifiers, such as NP and NM accession numbers, OMIM identifiers, and Entrez Gene IDs are supported. However, some types of queries will return an error, e.g. post-assembly GenBank entries, withdrawn gene names, and abandoned synonyms. If your initial query is unsuccessful, try entering a different related term that may produce the same location. For example, if a query on a gene symbol produces no results, try entering an mRNA accession, gene ID number, or descriptive words associated with the gene.

Finding a genome location using BLAT

If you have genomic, mRNA, or protein sequence, but don't know the name or the location to which it maps in the genome, the BLAT tool will rapidly locate the position by homology alignment, provided that the region has been sequenced. This search will find close members of the gene family, as well as assembly duplication artifacts. An entire set of query sequences can be looked up simultaneously when provided in fasta format.

A successful BLAT search returns a list of one or more genome locations that match the input sequence. To view one of the alignments in the Genome Browser, click the browser link for the match. The details link can be used to preview the alignment to determine if it is of sufficient match quality to merit viewing in the Genome Browser. If too many BLAT hits occur, try narrowing the search by filtering the sequence in slow mode with RepeatMasker, then rerunning the BLAT search.

For more information on conducting and fine-tuning BLAT searches, refer to the BLAT section of this document.
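Because BLAT accepts a whole set of query sequences when they are supplied in fasta format, it can help to assemble that text programmatically. The sketch below is one possible way to do it; the function name, the record names, and the 60-character line width are assumptions made for illustration.

```python
def to_fasta(records, width=60):
    """Hypothetical helper: format (name, sequence) pairs as multi-record fasta text."""
    lines = []
    for name, seq in records:
        lines.append(f">{name}")  # header line introduces each record
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])  # wrap the sequence at a fixed width
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_fasta([("query1", "ACGTACGTAC"), ("query2", "GGCCTTAA")]), end="")
```

The resulting text can be pasted into the BLAT sequence box as a single submission.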

Opening the Genome Browser with a custom annotation track

You can open the Genome Browser window with a custom annotation track displayed by using the Add Custom Tracks feature available from the gateway and annotation tracks pages. For more information on creating and using custom annotation tracks, refer to the Creating custom annotation tracks section.

Annotation track data can be entered in one of three ways:

-- Enter the file name for an annotation track source file in the Annotation File text box.

-- Type or paste the annotation track data into the large text box.

-- If the annotation data are accessible through a URL, enter the URL name in the large text box.

Once you've entered the annotation information, click the submit button at the top of the Gateway page to open up the Genome Browser with the annotation track displayed.
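For the "type or paste" route, the annotation text can be generated by a short script. This section does not describe the data format (that is covered in the Creating custom annotation tracks section), so the sketch below assumes a simple BED-style layout: one track line followed by tab-separated chromosome, start, end, and name columns. The function name, track name, and coordinates are made up for illustration.

```python
def make_custom_track(name, description, items):
    """Hypothetical helper: build BED-style custom track text (format assumed, see lead-in)."""
    header = f'track name="{name}" description="{description}"'
    rows = [
        f"{chrom}\t{start}\t{end}\t{label}"
        for chrom, start, end, label in items
    ]
    return "\n".join([header] + rows) + "\n"


if __name__ == "__main__":
    text = make_custom_track(
        "myRegions",
        "Regions of interest",
        [("chr7", 127471196, 127472363, "regionA"),
         ("chr7", 127475864, 127477031, "regionB")],
    )
    print(text, end="")
```

Check the generated text against the format rules in the Creating custom annotation tracks section before uploading it.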

The Genome Browser also provides a collection of custom annotation tracks contributed by the UCSC Genome Bioinformatics group and the research community.

NOTE: If an annotation track does not display correctly when you attempt to upload it, you may need to reset the Genome Browser to its default settings, then reload the track.

For information on troubleshooting display problems with custom annotation tracks, refer to the troubleshooting section in the Creating custom annotation tracks section.

Viewing genome data as text

The Table Browser, a portal to the underlying open source MySQL relational database driving the Genome Browser, displays genomic data as columns of text rather than as graphical tracks. For more information on using the Table Browser, see the section Getting Started: on the Table Browser.

Opening the Genome Browser from external gateways

Several external gateways provide direct links into the Genome Browser. Examples include: Entrez Gene, AceView, Ensembl, SuperFamily, and GeneCards. Journal articles can also link to the browser and provide custom tracks. Be sure to use the assembly date appropriate to the provided coordinates when using data from a journal source.

Tips for Use

To facilitate your return to regions of interest within the Genome Browser, save the coordinate range or bookmark the page of displays that you plan to revisit or wish to share with others.

It is usually best to work with the most recent assembly even though a full set of tracks might not yet be ready. Be aware that the coordinates of a given feature on an unfinished chromosome may change from one assembly to the next as gaps are filled, artifactual duplications are reduced, and strand orientations are corrected. The Genome Browser offers multiple tools that can correctly convert coordinates between different assembly releases. For more information on conversion tools, see the section Converting data between assemblies.

To ensure uninterrupted browser services for your research during UCSC server maintenance and power outages, bookmark a mirror site that replicates the UCSC genome browser.

Bear in mind that the Genome Browser cannot outperform the underlying quality of the draft genome. Assembly errors and sequence gaps may still occur well into the sequencing process due to regions that are intrinsically difficult to sequence. Artifactual duplications arise as unavoidable compromises during a build, causing misleading matches in genome coordinates found by alignment.

Interpreting and fine-tuning the Genome Browser display 

The Genome Browser annotation tracks page displays a genome location specified through a Gateway search, a BLAT search, or an uploaded custom annotation track. There are five main features on this page: a set of navigation controls, a chromosome ideogram, the annotation tracks image, display configuration buttons, and a set of track display controls.

The first time you open the Genome Browser, it will use the application default values to configure the annotation tracks display. By manipulating the navigation, configuration and display controls, you can customize the annotation tracks display to suit your needs. For a complete description of the annotation tracks available in all assembly versions supported by the Genome Browser, see the Annotation Track Descriptions section.

The Genome Browser retains user preferences from session to session within the same web browser, although it never monitors or records user activities or submitted data. To restore the default settings, click the "Click here to reset" link on the Genome Browser Gateway page. To return the display to the default set of tracks (but retain custom tracks and other configured Genome Browser settings), click the default tracks button on the Genome Browser page.

Display conventions

The annotation tracks displayed in the Genome Browser use a common set of display conventions:

-- Annotation track descriptions: Each annotation track has an associated description page that contains a discussion of the track, the methods used to create the annotation, the data sources and credits for the track, and (in some cases) filter and configuration options to fine-tune the information displayed in the track. To view the description page, click on the mini-button to the left of a displayed track or on the label for the track in the Track Controls section.

-- Annotation track details pages: When an annotation track is displayed in full, pack, or squish mode, each line item within the track has an associated details page that can be displayed by clicking on the item or its label. The information contained in the details page varies by annotation track, but may include basic position information about the item, related links to outside sites and databases, links to genomic alignments, or links to corresponding mRNA, genomic, and protein sequences.

-- Gene prediction tracks: Coding exons are represented by blocks connected by horizontal lines representing introns. The 5' and 3' untranslated regions (UTRs) are displayed as thinner blocks on the leading and trailing ends of the aligning regions. In full display mode, arrowheads on the connecting intron lines indicate the direction of transcription. In situations where no intron is visible (e.g. single-exon genes, extremely zoomed-in displays), the arrowheads are displayed on the exon block itself.

-- Pattern Space Layout (PSL) alignment tracks: Aligning regions (usually exons when the query is cDNA) are shown as black blocks. In dense display mode, the degree of darkness corresponds to the number of features aligning to the region or the degree of quality of the match. In pack or full display mode, the aligning regions are connected by lines representing gaps in the alignment (typically spliced-out introns), with arrowheads indicating the orientation of the alignment, pointing right if the query sequence was aligned to the forward strand of the genome and left if aligned to the reverse strand. Two parallel lines are drawn over double-sided alignment gaps, which skip over unalignable sequence in both target and query. For alignments of ESTs, the arrows may be reversed to show the apparent direction of transcription deduced from splice junction sequences. In situations where no gap lines are visible, the arrowheads are displayed on the block itself. To prevent display problems, the Genome Browser imposes an upper limit on the number of alignments that can be viewed simultaneously within the tracks image. When this limit is exceeded, the Browser displays the best several hundred alignments in a condensed display mode, then lists the number of undisplayed alignments in the last row of the track. In this situation, try zooming in to display more entries or to return the track to full display mode. For some PSL tracks, extra coloring to indicate mismatching bases and query-only gaps may be available.

-- "Chain" tracks (2-species alignment): Chain tracks display boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the genome of the first species or an insertion in the genome of the second species. Double lines represent more complex gaps that involve substantial sequence in both species. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one species. In cases where there are multiple chains over a particular portion of the genome, chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes. In the fuller display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment.

-- "Net" tracks (2-species alignment): Boxes represent ungapped alignments, while lines represent gaps. Clicking on a box displays detailed information about the chain as a whole, while clicking on a line shows information on the gap. The detailed information is useful in determining the cause of the gap or, for lower level chains, the genomic rearrangement. Individual items in the display are categorized as one of four types (other than gap):

    Top - The best, longest match. Displayed on level 1.
    Syn - Lineups on the same chromosome as the gap in the level above it.
    Inv - A lineup on the same chromosome as the gap above it, but in the opposite orientation.
    NonSyn - A match to a chromosome different from the gap in the level above.

-- "Wiggle" tracks: These tracks plot a continuous function along a chromosome. Data is displayed in windows of a set number of base pairs in width. The score for each window displays as "mountain ranges". The display characteristics vary among the tracks in this group. See the individual track descriptions for more information on interpreting the display. If the "mountain peak" is taller or shorter than what can be shown in the display, it is clipped and colored magenta.
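The clipping rule described for wiggle tracks can be illustrated with a small sketch. It assumes a viewing range given by a minimum and a maximum value and returns, for each window score, the value that would be drawn together with a flag marking it as clipped (the case shown in magenta). The function and parameter names are hypothetical; this is not the Browser's drawing code.

```python
def clip_for_display(values, view_min, view_max):
    """Hypothetical helper: clamp each score to the viewing range and flag clipped ones."""
    drawn = []
    for value in values:
        clipped = min(max(value, view_min), view_max)
        # A score that had to be changed lies outside the viewing range.
        drawn.append((clipped, clipped != value))
    return drawn


if __name__ == "__main__":
    for height, is_clipped in clip_for_display([5, 50, 120, -3], 0, 100):
        print(height, "magenta" if is_clipped else "normal")
```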

Changing the display mode of an individual annotation track

Each annotation track within the window may have up to five display modes:

-- Hide: the track is not displayed at all. To hide all the annotation tracks, click the hide all button. This mode is useful for restricting the display to only those tracks in which you are interested. For example, someone who is not interested in SNPs or mouse synteny may want to hide these tracks to reduce track clutter and improve speed. There are a few annotation tracks that pertain only to one specific chromosome, e.g. Sanger22, Rosetta. In these cases, the track and its associated controller will be hidden automatically when the track window is not open to the relevant chromosome.

-- Dense: the track is displayed with all features collapsed into a single line. This mode is useful for reducing the amount of space used by a track when you don't need individual line item details or when you just want to get an overall view of an annotation. For example, by opening an entire chromosome and setting the RefSeq Genes track to dense, you can get a feel for the known gene density of the chromosome without displaying excessive detail.

-- Full: the track is displayed with each annotation feature on a separate line. It is recommended that you use this option sparingly, due to the large number of individual track items that may potentially align at the selected position. For example, hundreds of ESTs might align with a specified gene. When the number of lines within a requested track location exceeds 250, the track automatically defaults to a more tightly-packed display mode. In this situation, you can restore the track display to full mode by narrowing the chromosomal range displayed or by using a track filter to reduce the number of items displayed. On tracks that contain only hide, dense, and full modes, you can toggle between full and dense display modes by clicking on the track's center label.

-- Squish: the track is displayed with each annotation feature shown separately, but at 50% the height of full mode. Features are unlabeled, and more than one may be drawn on the same line. This mode is useful for reducing the amount of space used by a track when you want to view a large number of individual features and get an overall view of an annotation. It is particularly good for displaying tracks in which a large number of features align to a particular section of a chromosome, e.g. EST tracks.

-- Pack: the track is displayed with each annotation feature shown separately and labeled, but not necessarily displayed on a separate line. This mode is useful for reducing the amount of space used by a track when you want to view the large number of individual features allowed by squish mode, but need the labeling and display size provided by full mode. When the number of lines within the requested track location exceeds 250, the track automatically defaults to squish display mode. In this situation, you can restore the track display to pack mode by narrowing the chromosomal range displayed or by using a track filter to reduce the number of items displayed. To toggle between pack and full display modes, click on the track's center label.
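The automatic fallback described for full and pack modes can be summarized as a small decision function. The 250-line limit and the pack-to-squish fallback come from the descriptions above. The text only says that full mode falls back to "a more tightly-packed display mode", so the choice of pack in this sketch is an assumption, as are the function and constant names.

```python
MAX_LINES = 250  # limit quoted in the mode descriptions above


def effective_mode(requested: str, line_count: int) -> str:
    """Hypothetical helper: display mode actually used for a track."""
    if line_count <= MAX_LINES:
        return requested
    if requested == "full":
        return "pack"  # assumption: the text does not name the tighter mode
    if requested == "pack":
        return "squish"  # stated in the Pack description
    return requested  # hide, dense and squish are unaffected in this sketch


if __name__ == "__main__":
    for mode, lines in [("full", 40), ("full", 400), ("pack", 400), ("dense", 400)]:
        print(mode, lines, "->", effective_mode(mode, lines))
```

Narrowing the displayed range or applying a track filter lowers the line count, which is why either action restores the requested mode.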

The track display controls are grouped into categories that reflect the type of data in the track, e.g. Gene Prediction Tracks, mRNA and EST tracks, etc. To change the display mode for a track, find the track's controller in the Track Controls section at the bottom of the Genome Browser page, select the desired mode from the control's display menu, and then click the refresh button. Alternatively, you can change the display mode by using the Genome Browser's right-click navigation feature, or can toggle between dense and full modes for a displayed track (or pack mode when available) by clicking on the optional center label for the track.

Changing the display mode for a group of tracks

Track display modes may be set individually or as a group on the Genome Browser Track Configuration page. To access the configuration page, click the configure button on the annotation tracks page or the configure tracks and display button on the Gateway page. Exercise caution when using the show all buttons on track groups or assemblies that contain a large number of tracks; this may seriously impact the display performance of the Genome Browser or cause your Internet browser to time out.

Hiding the track display controls

The entire set of track display controls at the bottom of the annotation tracks page may be hidden from view by unchecking the Show track controls under main graphic option in the Configure Image section of the Track Configuration page.

Changing the display of a track by using filters and configuration options

Some tracks have additional filter and configuration capabilities, e.g. EST tracks, mRNA tracks, NCI60, etc. These options let the user modify the color or restrict the data displayed within an annotation track. Filters are useful for focusing attention on items relevant to the current task in tracks that contain large amounts of data. For example, to highlight ESTs expressed in the liver, set the EST track filter to display items in a different color when the associated tissue keyword is "liver". Configuration options let the user adjust the display to best show the data of interest. For example, the min vertical viewing range value on wiggle tracks can be used to establish a data threshold. By setting the min value to "50", only data values greater than 50 percent will display.
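The liver example amounts to assigning a color to each item according to whether its tissue annotation contains the filter keyword. A minimal sketch of that idea follows; the function name, the item names, and the colors are hypothetical, and the case-insensitive substring match is an assumption rather than documented filter behavior.

```python
def color_items(items, keyword, highlight="red", default="black"):
    """Hypothetical helper: pick a display color for each (name, tissue) pair."""
    needle = keyword.lower()
    return [
        (name, highlight if needle in tissue.lower() else default)
        for name, tissue in items
    ]


if __name__ == "__main__":
    ests = [("est1", "Liver"), ("est2", "brain"), ("est3", "fetal liver")]
    for name, color in color_items(ests, "liver"):
        print(name, color)
```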

To access filter and configuration options for a specific annotation track, open the track's description page by clicking the label for the track's control menu under the Track Controls section, the mini-button to the left of the displayed track, or the "Configure..." option from the Genome Browser's right-click popup menu. The filter and configuration section is located at the top of the description page. In most instances, more information about the configuration options is available within the description text or through a special help link located in the configuration section.

Filter and configuration settings are persistent from session to session on the same web browser. To return the Genome Browser display to the default set of tracks (but retain custom tracks and other configured Genome Browser settings), click the default tracks button on the Genome Browser tracks page. To remove all user configuration settings and custom tracks, and completely restore the defaults, click the "Click here to reset" link on the Genome Browser Gateway page.

Zooming and scrolling the tracks display

At times you may want to adjust the amount of flanking region displayed in the annotation tracks window or adjust the scale of the display. At a scale of 1 pixel per base pair, the window accurately displays the width of exons and introns, and indicates the direction of transcription (using arrowheads) for multi-exon features. At a coarser scale, certain features, such as thin exons, may disappear. Also, some exons may falsely appear to fall within RepeatMasker features at some scales.

Click the zoom in and zoom out buttons at the top of the Genome Browser page to zoom in or out on the center of the annotation tracks window by 1.5, 3 or 10-fold. Alternatively, you can zoom in 3-fold on the display by clicking anywhere on the Base Position track. In this case, the zoom is centered on the coordinate of the mouse click. To view the base composition of the sequence underlying the current annotation track display, click the base button.
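The arithmetic behind these zoom controls can be sketched as follows, assuming the window is described by plain integer start and end coordinates. The function name is hypothetical, and the sketch does not clamp the result to the ends of the chromosome as a real display would need to.

```python
def zoom_in(start: int, end: int, factor: float, center=None):
    """Hypothetical helper: shrink the window by `factor` around `center`.

    With no center given, the zoom is applied to the middle of the window,
    as the zoom buttons do; passing a coordinate mimics a click on the
    Base Position track.
    """
    width = end - start
    if center is None:
        center = start + width / 2
    new_width = max(1, round(width / factor))
    new_start = round(center - new_width / 2)
    return new_start, new_start + new_width


if __name__ == "__main__":
    print(zoom_in(1000, 4000, 3))               # button: zoom on the window center
    print(zoom_in(1000, 4000, 3, center=1500))  # click: zoom on the clicked base
```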

Quickly zoom to a specific region of interest by using the browser's "drag-and-zoom" feature. To define the region you wish to zoom to, click-and-hold the mouse button on one edge of the desired zoom area in the Base Position track, drag the mouse right or left to highlight the selection area, then release the mouse button. The annotation tracks image will automatically zoom to the new region. To drag-and-zoom in a part of the image other than the Base Position track, depress the shift key before clicking and dragging the mouse. Note that the Enable advanced javascript features option on the Track Configuration page must be toggled on to use this feature.

To scroll (pan) the view of the entire tracks image horizontally, click on the image and drag the cursor to the left or right, then release the mouse button, to shift the displayed region in the corresponding direction. The view may be scrolled by up to one image width. To scroll the annotation tracks horizontally by set increments of 10%, 50%, or 95% of the displayed size (as given in base pairs), click the corresponding move arrow. It is also possible to scroll the left or right side of the tracks by a specified number of vertical gridlines while keeping the position of the opposite side fixed. To do this, click the appropriate move start or move end arrow, located under the annotation tracks window. For example, to keep the left-hand display coordinate fixed but increase the right-hand coordinate, you would click the right-hand move end arrow. To increase or decrease the gridline scroll interval, edit the value in the move start or move end text box.
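The zoom and scroll arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the function names, the 1-based inclusive window convention, and the rounding and clamping choices are assumptions, not the Genome Browser's actual implementation.

```python
from typing import Tuple

# A window is (start, end), 1-based and inclusive, as in chrN:start-end.
Window = Tuple[int, int]


def zoom_in(window: Window, click: int, factor: int = 3) -> Window:
    """Shrink the window by `factor`, centered on the clicked coordinate."""
    start, end = window
    new_size = max(1, (end - start + 1) // factor)
    # Assumption: the window is never allowed to start before base 1.
    new_start = max(1, click - new_size // 2)
    return new_start, new_start + new_size - 1


def scroll(window: Window, fraction: float) -> Window:
    """Shift the window by a fraction of its size (negative values move left)."""
    start, end = window
    size = end - start + 1
    shift = round(size * fraction)
    # Assumption: clamp at base 1; the chromosome end is not checked here.
    shift = max(shift, 1 - start)
    return start + shift, end + shift


if __name__ == "__main__":
    print(zoom_in((1, 9000), click=4500))  # 3-fold zoom around the click
    print(scroll((1001, 2000), 0.50))      # move right by 50% of the window
    print(scroll((1001, 2000), -0.10))     # move left by 10% of the window
```

The 10%, 50%, and 95% move arrows correspond to calling `scroll` with fractions of 0.10, 0.50, and 0.95 (or their negatives for leftward moves).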

Changing the displayed track position

To display a completely different position in the genome, enter the new query in the position/search text box, then click the jump button. For more information on valid entries for this text box, refer to the Getting Started section.

If a chromosome image (ideogram) is available above the track display, click anywhere on the chromosome to move to that position (the current window size will be maintained). Select a region of any size by clicking and dragging in the image. Finally, hold the "control" key while clicking on a chromosome band to select the entire band.

Changing the order of the displayed tracks

To vertically reposition a track in the annotation track window, click-and-hold the mouse button on the side label, then drag the highlighted track up or down within the image. Release the mouse button when the track is in the desired position. To move an entire group of associated tracks (such as all the displayed subtracks in a composite track), click-and-hold the gray mini-button to the left of the tracks, then drag.

Changing the width of the annotation track window

The first time the annotation track window is displayed, or after the Genome Browser has been reset, the size of the track window is set by default to the width that best fits your Internet browser window. If you horizontally resize the browser window, you can automatically adjust the annotation track image size to the new width by clicking the resize button under the track image. To manually override the default width, enter a new value in the image width text box on the Track Configuration page, then click the submit button. The maximum supported width is 5000 pixels.

Changing the width of the label area to the left of the image

The item labels (or track label, when viewed in dense mode) are displayed to the left of the annotation image. The width of this area is set to 17 characters by default. To change the width, edit the value in the label area width text box on the Track Configuration page, then click Submit.

Changing the text size in the annotation track image

The annotation track image may be adjusted to display text in a range of font sizes from "tiny" to "huge". To change the size of the text, select an option from the text size pull-down menu on the Track Configuration page, then click Submit. The text size is set to "small" by default.

Hiding the annotation track labels

The track and element labels displayed above and to the left of the tracks in the annotation tracks image may be hidden from view by unchecking the Display track descriptions above each track and Display labels to the left of items in tracks boxes, respectively, on the Track Configuration page.

Hiding the display grid on the annotation tracks image

The light blue vertical guidelines on the annotation tracks image may be removed by unchecking the Show light blue vertical guidelines box on the Track Configuration page.

Hiding the chromosome ideogram

The chromosome ideogram, located just above the annotation tracks image, provides a graphical overview of the features on the selected chromosome, including its bands, the position of the centromere, and an indication of the region currently displayed in the annotation tracks image. To hide the ideogram, uncheck the Display chromosome ideogram above main graphic box on the Track Configuration page.

Enabling item and exon navigation

When the Next/previous item navigation configuration option is toggled on, on the Track Configuration page, gray double-headed arrows display in the Genome Browser tracks image on both sides of the track labels of gene, mRNA and EST tracks (or any standard tracks based on BED, PSL or genePred format). Clicking on the gray arrows shifts the image window toward that end of the chromosome so that the next item in the track is displayed. Similarly, the Next/previous exon navigation configuration option displays white double-headed arrows on both the 5' and 3' end of each track item that has exons positioned beyond the edges of the current image. Clicking on one of the white arrows shifts the image window to the next exon located towards that end of the feature.

Enabling the right-click navigation feature

Several of the common display and navigation operations offered on the Genome Browser tracks page may be quickly accessed by right-clicking on a feature on the tracks image and selecting an option from the displayed popup menu. Depending on context, the right-click feature allows the user to:

-- change the track display mode

-- zoom in or out to the exact position coordinates of the feature

-- open the "Get DNA" window at the feature's coordinates

-- display details about the feature

-- open a popup window to configure the track's display

-- display the entire tracks image in a separate window for inclusion in spreadsheets or other documents. (Note that the Genome Browser "PDF/PS" option described below can also be used to generate a high-quality annotation tracks image suitable for printing.)

To use the right-click feature, make sure the Enable advanced javascript features option on the Track Configuration page is checked, and configure your internet browser to allow the display of popup windows from genome.ucsc.edu. When enabled, the right-click navigation feature replaces the default contextual popup menu typically displayed by the internet browser when a user right-clicks on the tracks image. A few combinations of the Mozilla Firefox browser on Mac OS do not support the right-click menu functionality using secondary click; in these instances, ctrl+left-click must be used to display the menu.

Printing a copy of the annotation track window

The Genome Browser provides a mechanism for saving a copy of the currently displayed annotation tracks image to a file that can be printed or edited. Images saved in PostScript format can be printed at high resolution and edited by drawing programs such as Adobe Illustrator. This is useful for generating figures intended for publication. Images can also be saved in PDF format for viewing by Adobe Acrobat Reader.

To print or save the image to a file:

1.  In the blue navigation bar at the top of the screen, from the "View" menu, click the "PDF/PS" link.

2.  Click one of the PDF or EPS links.

NOTE: If you have configured your browser image to use one of the larger font sizes, the text in the resulting screen shot may not display correctly. If you encounter this problem, reduce the Genome Browser font size using the Configuration utility, then repeat the save/print process.

Using BLAT alignments 

BLAT (BLAST-Like Alignment Tool) is a very fast sequence alignment tool similar to BLAST. For more information on BLAT's internal scoring schemes and its overall n-mer alignment seed strategy, refer to W. James Kent (2002) BLAT - The BLAST-Like Alignment Tool, Genome Res 12(4):656-664.

On DNA queries, BLAT is designed to quickly find sequences with 95% or greater similarity of length 25 bases or more. It may miss genomic alignments that are more divergent or shorter than these minimums, although it will find perfect sequence matches of 33 bases and sometimes as few as 22 bases. The tool is capable of aligning sequences that contain large introns. On protein queries, BLAT rapidly locates genomic sequences with 80% or greater similarity of length 20 amino acids or more. Gene family members that arose within the last 350 million years can generally be detected. More divergent sequences can be aligned to the human genome by using NCBI's BLAST and psi-BLAST, then using BLAT to align the resulting match onto the UCSC genome assembly. In practice, DNA BLAT works well on primates, and protein BLAT works well on land vertebrates.

Some common uses of BLAT include:

-- finding the genomic coordinates of mRNA or protein within a given assembly

-- determining the exon structure of a gene

-- displaying a coding region within a full-length gene

-- isolating an EST of special interest as its own track 

-- searching for gene family members

-- finding human homologs of a query from another species.

Making a BLAT query

To locate a nucleotide or protein within a genome using BLAT:

1.  Open the BLAT Search Genome page by clicking on the "Tools" pulldown in the top blue menu bar of the Genome Browser.

2.  Select the genome, assembly, query type, output sort order, and output type. To order the search results based on the closeness of the sequence match, choose one of the score options in the Sort output menu. The score is determined by the number of matches vs. mismatches in the final alignment of the query to the genome.

3.  If the sequence to be uploaded is in an unformatted plain text file, enter the file name in the Upload sequence text box, then click the submit file button. Otherwise, paste the sequence or fasta-formatted list into the large edit box, and then click the submit button. Input sequence can be obtained from the Genome Browser as well as from a custom annotation track.

Header lines may be included in the input text if they are preceded by > and contain unique names. Multiple sequences may be submitted at the same time if they are of the same type and are preceded by unique header lines. Numbers, spaces, and extraneous characters are ignored:

>sequence_1
ATGCAGAGCAAGGTGCTGCTGGCCGTCGCCCTGTGGCTCTGCGTGGAGACCCGGGCCGCCTCTGTGGGTTTGCCTAGTGTTTCTCTTGATCTGCCCAGGC
>sequence_2
ATGTTGTTTACCGTAAGCTGTAGTAAAATGAGCTCGATTGTTGACAGAGATGACAGTAGTATTTTTGATGGGTTGGTGGAAGAAGATGACAAGGACAAAG
>sequence_3
ATGCTGCGAACAGAGAGCTGCCGCCCCAGGTCGCCCGCCGGACAGGTGGCCGCGGCGTCCCCGCTCCTGCTGCTGCTGCTGCTGCTCGCCTGGTGCGCGG

BLAT limitations 

DNA input sequences are limited to a maximum length of 25,000 bases. Protein or translated input sequences must not exceed 5000 letters. Up to 25 sequences may be submitted at the same time. The maximum combined length of DNA input for multiple sequence submissions is 50,000 bases (with a 25,000 base limit per individual sequence). For protein or translated input, the maximum combined input length is 12,500 letters (with a 5000 letter limit per individual sequence).
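The limits above can be checked before submitting a query. The following Python sketch parses fasta-formatted input and reports any limit that would be exceeded. The function names are hypothetical, and the rule that only alphabetic characters count toward the length is an assumption based on the statement that numbers, spaces, and extraneous characters are ignored.

```python
from typing import Dict, List

MAX_SEQUENCES = 25
# (per-sequence limit, combined limit) as stated in the text above
LIMITS = {"dna": (25_000, 50_000), "protein": (5_000, 12_500)}


def parse_fasta(text: str) -> Dict[str, str]:
    """Return {name: sequence}; input without a header is named 'query'."""
    records: Dict[str, List[str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            current = fields[0] if fields else ""
            if not current or current in records:
                raise ValueError(f"header must carry a unique name: {line!r}")
            records[current] = []
        else:
            if current is None:  # headers are optional for a single sequence
                current = "query"
                records[current] = []
            # Assumption: digits, spaces, and punctuation are simply dropped.
            records[current].append("".join(ch for ch in line if ch.isalpha()))
    return {name: "".join(parts) for name, parts in records.items()}


def check_limits(records: Dict[str, str], query_type: str = "dna") -> List[str]:
    """Return a list of human-readable problems (empty if the input fits)."""
    per_sequence, combined = LIMITS[query_type]
    problems = []
    if len(records) > MAX_SEQUENCES:
        problems.append(f"{len(records)} sequences exceeds the limit of {MAX_SEQUENCES}")
    for name, seq in records.items():
        if len(seq) > per_sequence:
            problems.append(f"{name}: {len(seq)} letters exceeds {per_sequence}")
    total = sum(len(seq) for seq in records.values())
    if total > combined:
        problems.append(f"combined length {total} exceeds {combined}")
    return problems


if __name__ == "__main__":
    example = ">sequence_1\nATGCAGAGCA 10\nAGGTG-CTGC\n>sequence_2\nATGTTGTTTA\n"
    print(check_limits(parse_fasta(example)))
```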

NOTE: Program-driven BLAT use is limited to a maximum of one hit every 15 seconds and no more than 5000 hits per day.
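A script that submits BLAT queries can enforce these limits on its own side. The sketch below is one possible client-side throttle; the class name is hypothetical, and treating "per day" as a rolling 24-hour window starting at the first request is an assumption, since the text does not say how the day is counted.

```python
import time
from typing import Callable, Optional

SECONDS_PER_DAY = 24 * 60 * 60


class BlatThrottle:
    """Blocks until the next program-driven BLAT request is allowed."""

    def __init__(self, min_interval: float = 15.0, daily_limit: int = 5000,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = min_interval
        self.daily_limit = daily_limit
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None
        self._window_start: Optional[float] = None
        self._count = 0

    def wait(self) -> None:
        """Call once before every request."""
        now = self._clock()
        # Assumption: the daily budget resets 24 hours after the window opened.
        if self._window_start is None or now - self._window_start >= SECONDS_PER_DAY:
            self._window_start = now
            self._count = 0
        if self._count >= self.daily_limit:
            raise RuntimeError("daily BLAT request budget exhausted")
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
        self._count += 1
```

A caller would create one `BlatThrottle()` and call `wait()` immediately before each request, whatever mechanism is used to send it.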

BLAT query search results

If a query returns successfully, BLAT will display a flat database file that summarizes the alignments found. A BLAT query often generates multiple hits. This can happen when the genome contains multiple copies of a sequence, paralogs, pseudogenes, statistical coincidences, artifactual assembly duplications, or when the query itself contains repeats or common retrotransposons. When too many hits occur, try resubmitting the query sequence after filtering in slow mode with RepeatMasker.

Items in the search results list are ordered by the criteria specified in the Sort output menu. Each line item provides links to view the details of the sequence alignment or to open the corresponding view in the Genome Browser. The details link gives the letter-by-letter alignment of the sequence to the genome. It is recommended that you first examine the details of the alignment for match quality before viewing the sequence in the Genome Browser.

When several nearby BLAT matches occur on a single chromosome, a simple trick can be used to quickly adjust the Genome Browser track window to display all of them: open the Genome Browser with the match that has the lowest chromosome start coordinate, paste in the highest chromosome end coordinate from the list of matches, then click the jump button.
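The same trick can be expressed as a small calculation: take the lowest start and the highest end among the matches and build a single position string from them. The function name and the tuple layout below are hypothetical, and the coordinates in the usage example are made up for illustration.

```python
from typing import Iterable, Tuple

# (chromosome, start, end) for one BLAT match
Match = Tuple[str, int, int]


def spanning_position(matches: Iterable[Match]) -> str:
    """Return a chrN:start-end string that covers every match."""
    matches = list(matches)
    if not matches:
        raise ValueError("no matches given")
    chroms = {chrom for chrom, _, _ in matches}
    if len(chroms) != 1:
        raise ValueError("all matches must lie on a single chromosome")
    start = min(m[1] for m in matches)
    end = max(m[2] for m in matches)
    return f"{chroms.pop()}:{start}-{end}"


if __name__ == "__main__":
    hits = [("chr7", 5000, 5200), ("chr7", 1000, 1300), ("chr7", 9000, 9100)]
    print(spanning_position(hits))
```

The resulting string can be pasted into the position/search text box.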

Creating a custom annotation track from BLAT output

To make a custom track directly from BLAT, select the PSL format output option. The resulting PSL track can be uploaded into the Genome Browser by pasting the data into the data text box on the Genome Browser Add Custom Tracks page, accessed via the "add custom tracks" button on the Browser gateway and annotation tracks pages. See the Creating custom annotation tracks section for more information.

Using BLAT for large batch jobs or commercial use

For large batch jobs or internal parameter changes, it is best to install command line BLAT on your own Linux server. Sources and executables are free for academic, personal, and non-profit purposes. BLAT source may be downloaded from http://www.soe.ucsc.edu/~kent (look for the blatSrc*.zip file with the most recent date). For BLAT executables, go to http://genome-test.cse.ucsc.edu/~kent/exe/ ; binaries are sorted by platform. Non-exclusive commercial licenses are available from the Kent Informatics website.

BLAT documentation 

For more information on the BLAT suite of programs, see the BLAT Program Specifications and the Blat section of the Genome Browser FAQ.

 Annotation track descriptions 

Detailed information about an individual annotation track, including display characteristics, configuration information, and associated database tables, may be obtained from the track description page accessed by clicking the mini-button to the left of the displayed track in the Genome Browser, or by selecting the "Open details..." or "Show details..." option from the Genome Browser's right-click menu. Click the "View table schema" link on the track description page to display additional information about the primary database table underlying the track. Table schema information may also be accessed via the "describe table schema" button in the Table Browser. For more information on configuring and using the tracks displayed in the Genome Browser track window, see the section Interpreting and Fine-tuning the Genome Browser display.

Tips for viewing annotation track data 

-- To display a description page with more information about the track, click on the mini-button to the left of a track.

-- To display a details page with additional information about a specific line item within a track in full display mode, click on the item or its label.

-- A track does not appear in the browser if its display mode is set to hide. To restrict the browser's display to only those tracks in which you're interested, set the display mode of the unwanted tracks to hide.

-- A track set to full display mode will default to a more tightly packed display mode if the total number of lines in the track exceeds 250.

-- To quickly toggle between full and dense or pack display modes, click on the track's center label.

-- Only the most recent assemblies are fully active. Older assemblies may be archived.

-- Not all tracks appear in all assemblies. Only a basic set of tracks appears initially in a new assembly.

-- Track data can be viewed as text tables using the Table Browser.

-- Credit goes to many individuals and institutions for generously contributing the tracks. For specific information about the contributors of a given track, look at the Credits section on a track's description page.

Getting started on the Table Browser 

The Table Browser provides text-based access to the genome assemblies and annotation data stored in the Genome Browser database. As a flexible alternative to the graphical-based Genome Browser, this tool offers an enhanced level of query support that includes restrictions based on field values, free-form SQL queries, and combined queries on multiple tables. Output can be filtered to restrict the fields and lines returned, and may be organized into one of several formats, including a simple tab-delimited file that can be loaded into a spreadsheet or database as well as advanced formats that may be uploaded into the Genome Browser as custom annotation tracks. The Table Browser provides a convenient alternative to downloading and manipulating the entire genome and its massive data tracks. (See the Downloading Genome Data section.)

For information on using the Table Browser features, refer to the Table Browser User Guide.

Getting started using Sessions 

The Sessions tool allows users to configure their browsers with specific track combinations, including custom tracks, and save the configuration options. Multiple sessions may be saved for future reference, for comparison of scenarios or for sharing with colleagues. Saved sessions persist for four months after the last access, unless deleted. User-generated tracks can be saved within sessions.

This tool may be accessed by clicking the "My Data" pulldown in the top blue navigation bar in any assembly and then selecting Sessions. To ensure privacy and security, you must create an account and/or log in to use the Session tool. Individual sessions may be designated by the user as either "shared" or "non-shared" to protect the privacy of confidential data. To avoid having a new shared session from someone else override existing Genome Browser settings, users are encouraged to open a new web-browser instance or to save existing settings in a session before loading a new shared session.

For more detailed information on using the Session tool, see the Sessions User Guide. 

Getting started on Genome Graphs 

The Genome Graphs tool can be used to display genome-wide data sets such as the results of genome-wide SNP association studies, linkage studies, and homozygosity mapping. This tool is not pre-loaded with any sample data; instead, you can upload your own data for display by the tool.

Once you have uploaded your data, you can view it in a variety of ways. You can view multiple sets of genome-wide data simultaneously either as superimposed graphs or side-by-side graphs. Once you see an area of interest in the Genome Graphs view, you can click on it to go directly to the Genome Browser at that position. You can also set a significance threshold for your data and view only regions or gene sets that meet that threshold.

For information on using the Genome Graphs features, refer to the Genome Graphs User Guide.

Using the VisiGene Image Browser 

VisiGene is a browser for viewing in situ images. It enables the user to examine cell-by-cell as well as tissue-by-tissue expression patterns. The browser serves as a virtual microscope, allowing users to retrieve images that meet specific search criteria, then interactively zoom and scroll across the collection.

To start the VisiGene browser, click the VisiGene link in the left-hand sidebar menu on the Genome Browser home page.

Images Available

The following image collections are currently available for browsing:

-- High-quality high-resolution images of eight-week-old male mouse sagittal brain slices with reverse-complemented mRNA hybridization probes from the Allen Brain Atlas, courtesy of the Allen Institute for Brain Science

-- Mouse in situ images from the Jackson Lab Gene Expression Database (GXD) at MGI

-- Transcription factors in mouse embryos from the Mahoney Center for Neuro-Oncology

-- Mouse head and brain in situ images from NCBI's Gene Expression Nervous System Atlas (GENSAT) database

-- Xenopus laevis in situ images from the National Institute for Basic Biology (NIBB) XDB project

Searching the Image Database

The image database may be searched by gene symbols, authors, years of publication, body parts, GenBank or UniProtKB accessions, organisms, Theiler stages (mice), and Nieuwkoop/Faber stages (frogs). The search returns only those images that match all the specified criteria. For a list of sample search strings, see the VisiGene Gateway page.

The wildcard characters * and ? are supported for gene name searches. For example, to view the images of all genes in the Hox A cluster, search for hoxa*. When searching on author names that include initials, use the format Smith AJ.
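To get a feel for how the two wildcards behave, the matching can be imitated with Python's standard fnmatch module, where * matches any run of characters and ? matches exactly one. This is an approximation for illustration only: the function name is hypothetical, case-insensitive matching is an assumption, and fnmatch also accepts bracket patterns that VisiGene may not support.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def match_genes(pattern: str, gene_names: Iterable[str]) -> List[str]:
    """Return the gene names that match a * / ? wildcard pattern."""
    lowered = pattern.lower()  # assumption: searches ignore case
    return [name for name in gene_names if fnmatchcase(name.lower(), lowered)]


if __name__ == "__main__":
    genes = ["Hoxa1", "Hoxa13", "Hoxb1", "Pax6"]
    print(match_genes("hoxa*", genes))  # any Hox A gene in the list
    print(match_genes("hoxa?", genes))  # Hox A genes with a single trailing character
```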

Image Navigation

Following a successful search, VisiGene displays a list of thumbnails of images matching the search criteria in the left-hand pane of the browser. By default, the image corresponding to the first thumbnail in the list is displayed in the main image pane. If more than 25 images meet the search criteria, links at the bottom of the thumbnail pane allow the user to toggle among pages of search results. To display a different image in the main browser pane, click the thumbnail of the image you wish to view.

By default, an image is displayed at a resolution that provides optimal viewing of the overall image. This size varies among images. The image may be zoomed in or out, sized to match the resolution of the original image or best fit the image display window, and moved or scrolled in any direction to focus on areas of interest. The original full-sized image may also be downloaded.

Zooming in: To enlarge the image by 2X, click the Zoom in button above the image or click on the image using the left mouse button. Alternatively, the + key may be used to zoom in when the main image pane is the active window.

Zooming out: To reduce the image by 2X, click the Zoom out button above the image or click on the image using the right mouse button. Alternatively, the - key may be used to zoom out when the main image pane is the active window.

Sizing to full resolution: Click the Zoom full button above the image to resize the image such that each pixel on the screen corresponds to a pixel in the digitized image.

Sizing to best fit: Click the Zoom fit button above the image to zoom the image to the size that best fits the main image pane.

Moving the image: To move the image viewing area in any direction, click and drag the image using the mouse. Alternatively, the following keyboard shortcuts may be used after clicking on the image:

-- Scroll left in the image: Left-arrow key or Home key

-- Scroll right in the image: Right-arrow key or End key

-- Scroll up in the image: Up-arrow key or PgUp key

-- Scroll down in the image: Down-arrow key or PgDn key

Downloading the original full-sized image: Most images may be viewed in their original full-sized format by clicking the "download" link at the bottom of the image caption. NOTE: due to the large size of some images, this action may take a long time and could potentially exceed the capabilities of some Internet browsers.

If you have an image set you would like to contribute for display in the VisiGene Browser, contact Jim Kent.

DNA text formatting 

The Genome Browser provides a feature to configure the retrieval, formatting, and coloring of the text used to depict the DNA sequence underlying the features in the displayed annotation tracks window. Retrieval options allow the user to add a padding of extra bases to the upstream or downstream end of the sequence. Formatting options range from simply displaying exons in upper case to elaborately marking up a sequence according to multiple track data. The DNA sequence covered by various tracks can be highlighted by case, underlining, bold or italic fonts, and color.

The DNA display configuration feature can be useful to highlight features within a genomic sequence, point out overlaps between two types of features (for example, known genes vs. gene predictions), or mask out unwanted features.
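The simplest of these formats, exons in upper case with optional padding, can be imitated with a short Python sketch. It is an illustration under stated assumptions rather than the browser's own code: the function name is hypothetical, coordinates are 0-based and half-open, and the feature is assumed to lie on the forward strand so that "upstream" means lower coordinates.

```python
from typing import Iterable, Tuple


def format_dna(chrom_seq: str, start: int, end: int,
               exons: Iterable[Tuple[int, int]],
               upstream: int = 0, downstream: int = 0) -> str:
    """Return the region in lower case with exon bases in upper case."""
    # Pad the requested range, staying inside the available sequence.
    lo = max(0, start - upstream)
    hi = min(len(chrom_seq), end + downstream)
    bases = list(chrom_seq[lo:hi].lower())
    for exon_start, exon_end in exons:
        # Only the part of each exon that falls inside the window is marked.
        for pos in range(max(exon_start, lo), min(exon_end, hi)):
            bases[pos - lo] = bases[pos - lo].upper()
    return "".join(bases)


if __name__ == "__main__":
    sequence = "ACGTACGTACGTACGTACGT"
    print(format_dna(sequence, 4, 12, exons=[(6, 9)]))
    print(format_dna(sequence, 4, 12, exons=[(6, 9)], upstream=2))
```

Other markups described above (underlining, bold, color) would follow the same pattern, with each track contributing a set of intervals and a style instead of a change of case.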

Using the DNA text formatting feature 

To access the feature, click on the "View" pulldown on the top blue menu bar on the Genome Browser page and select "DNA", or select the "Get DNA..." option from the Genome Browser's right-click menu depending on context. The "Get DNA in Window" page that appears contains sections for configuring the retrieval and output format.

To display extra bases upstream of the 5' end of your sequence or downstream of the 3' end of the sequence, enter the number of bases in the corresponding text box. This option is useful in looking for regulatory regions.

The Sequence Formatting section lists several options for adjusting the case of all or part of the DNA sequence. To choose one of these formats, click the corresponding option button, then click the get DNA button. To access a table of extended formatting options, click the Extended case/color options button.

The Extended DNA Case/Color page presents a table with many more format options. The page provides instructions for using the formatting table, as well as examples of its use. The list of tracks in the Track Name column is automatically generated from the list of tracks available on the current genome.

Tips for Use

A few caveats mentioned on the Extended DNA Case/Color page bear repeating. Keep the formatting simple at first: it is easy to make a display that is pretty to look at but is also completely cryptic. Also, be careful when requesting complex formatting for a large chromosomal region: when all the HTML tags have been added to the output page, the file size may exceed the size limits that your internet browser, clipboard, and other software can safely display. The maximum size of genome that can be formatted by the tool is approximately 10 Mbp.

Converting data between assemblies 

Coordinates of features frequently change from one assembly to the next as gaps are closed, strand orientations are corrected, and duplications are reduced. Occasionally, a chunk of sequence may be moved to an entirely different chromosome as the map is refined. There are three different methods available for migrating data from one assembly to another: BLAT alignment, coordinate conversion, and coordinate lifting. The BLAT alignment tool is described in the section Using BLAT alignments.

Coordinate conversion 

The Genome Browser Convert utility is useful for locating the position of a feature of interest in a different release of the same genome or (in some cases) in a genome assembly of another species. During the conversion process, portions of the genome in the coordinate range of the original assembly are aligned to the new assembly while preserving their order and orientation. In general, it is easier to achieve successful conversions with shorter sequences.

When coordinate conversion is available for an assembly, click on the "View" pulldown on the top blue menu bar on the Genome Browser page and select the "In Other Genomes (Convert)" link. You will be presented with a list of the genome/assembly conversion options available for the current assembly. Select the genome and assembly to which you'd like to convert the coordinates, then click the Submit button. If the conversion is successful, the browser will return a list of regions in the new assembly, along with the percent of bases and span covered by that region. Click on a region to display it in the browser. If the conversion is unsuccessful, the utility returns a failure message.

Lifting coordinates

The liftOver tool is useful if you wish to convert a large number of coordinate ranges between assemblies. This tool is available in both web-based and command line forms, and supports forward/reverse conversions as well as conversions between species.

Web-based coordinate lifting 

To access the graphical version of the liftOver tool, click on the "Tools" pulldown in the top blue menu bar of the Genome Browser, then select LiftOver from the menu.

To convert one or more coordinate ranges using the default conversion settings:

1.  Select the genome and assembly from which the ranges were taken ("Original"), as well as the genome and assembly to which the coordinates should be converted ("New").

2.  Select the Data Format option: Browser Extensible Data format (BED) or position (coordinates of the form chrN:start-end).

3.  Enter coordinate ranges in the selected data format into the large text box, one per line.

4.  Click Submit.

Alternatively, you may load the coordinate ranges from an existing data file by entering the file name in the upload box at the bottom of the screen, then clicking the Submit File button.

The default parameter settings are recommended for general purpose use of the liftOver tool. However, you may want to customize settings if you have several very large regions to convert.
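The two input formats describe the same ranges in different coordinate systems, which is a common source of off-by-one errors. The sketch below converts between them, assuming the usual UCSC convention that position strings are 1-based and inclusive while BED lines are 0-based and half-open. The function names are hypothetical.

```python
import re

# chrN:start-end, with optional thousands separators in the numbers
_POSITION = re.compile(r"^([^:\s]+):([\d,]+)-([\d,]+)$")


def position_to_bed(position: str) -> str:
    """Convert 'chrN:start-end' (1-based, inclusive) to a 3-column BED line."""
    match = _POSITION.match(position.strip())
    if match is None:
        raise ValueError(f"not a chrN:start-end position: {position!r}")
    chrom = match.group(1)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start < 1 or end < start:
        raise ValueError(f"invalid range in {position!r}")
    # BED start is 0-based, BED end is exclusive.
    return f"{chrom}\t{start - 1}\t{end}"


def bed_to_position(bed_line: str) -> str:
    """Convert the first three BED columns back to 'chrN:start-end'."""
    chrom, start, end = bed_line.split()[:3]
    return f"{chrom}:{int(start) + 1}-{int(end)}"


if __name__ == "__main__":
    print(position_to_bed("chr1:1,001-2,000"))
    print(bed_to_position("chr1\t1000\t2000"))
```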

Command-line coordinate lifting

The command-line version of liftOver offers the increased flexibility and performance gained by running the tool on your local server. This utility requires access to a Linux platform. The executable file may be downloaded here. Command-line liftOver requires a UCSC-generated over.chain file as input. Pre-generated files for a given assembly can be accessed from the assembly's "LiftOver files" link on the Downloads page. If the desired conversion file is not listed, send a request to the genome mailing list and we may be able to generate one for you.
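For scripted use, the command-line tool can be driven from Python. The sketch below assumes the basic invocation `liftOver oldFile map.chain newFile unMapped` and that the executable is on the PATH; the helper names and all file names are hypothetical examples.

```python
import subprocess
from typing import List


def liftover_command(old_file: str, chain_file: str, new_file: str,
                     unmapped_file: str, executable: str = "liftOver") -> List[str]:
    """Build the argument list: liftOver oldFile map.chain newFile unMapped."""
    return [executable, old_file, chain_file, new_file, unmapped_file]


def run_liftover(old_file: str, chain_file: str, new_file: str,
                 unmapped_file: str) -> None:
    """Run liftOver and raise CalledProcessError if it exits with an error."""
    command = liftover_command(old_file, chain_file, new_file, unmapped_file)
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Example file names only; substitute your own input and chain files.
    print(" ".join(liftover_command("regions.bed", "assemblyAToAssemblyB.over.chain",
                                    "regions.lifted.bed", "regions.unmapped.bed")))
```

Ranges that cannot be converted are written to the last file, so it is worth checking that file after each run.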

Downloading genome data 

Most of the underlying tables containing the genomic sequence and annotation data displayed in the Genome Browser can be downloaded. All of the tables are freely usable for any purpose except as indicated in the README.txt file in the download directories. This data was contributed by many researchers, as listed on the Genome Browser Credits page. Please acknowledge the contributor(s) of the data you use.

Downloading the data

Genome data can be downloaded in two different ways:

-- Via ftp: The UCSC Genome Bioinformatics ftp site contains download directories for all genome versions currently accessible in the Genome Browser. The ftp command ftp://hgdownload.cse.ucsc.edu/goldenPath/ will take you to a directory that contains the genome download directories. This download method is recommended if you plan to download a large file or multiple files from a single directory. Use the mget command to download multiple files: mget filename1 filename2, or mget -a (to download all the files in the directory).

-- Via the Downloads link: Click the Downloads link on the left side bar on the UCSC Genome Bioinformatics home page to display a list of all database directories available for download. If the data you wish to download pre-dates the assembly versions listed, look in the archives accessible from the Archive link on the home page.
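The ftp route can also be scripted. The following Python sketch uses the standard ftplib module to fetch several files from one directory, much like mget. The host name comes from the text above; the function name, the assembly directory, and the file name in the usage example are hypothetical, and whether the server still accepts anonymous ftp connections is an assumption.

```python
import os
from ftplib import FTP
from typing import Iterable, List, Optional

HOST = "hgdownload.cse.ucsc.edu"  # host named in the text above


def fetch_files(ftp, remote_dir: str, dest_dir: str,
                names: Optional[Iterable[str]] = None) -> List[str]:
    """Download `names` from `remote_dir`, or everything listed if None."""
    ftp.cwd(remote_dir)
    # Note: nlst() may also list subdirectories, which RETR cannot fetch.
    wanted = list(names) if names is not None else ftp.nlst()
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for name in wanted:
        local_path = os.path.join(dest_dir, os.path.basename(name))
        with open(local_path, "wb") as handle:
            ftp.retrbinary(f"RETR {name}", handle.write)
        saved.append(local_path)
    return saved


if __name__ == "__main__":
    # Hypothetical assembly directory and file name, shown only as an example.
    with FTP(HOST) as connection:
        connection.login()  # anonymous login
        print(fetch_files(connection, "/goldenPath/hg19/bigZips",
                          "downloads", ["README.txt"]))
```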

Types of data available

There may be several download directories associated with each version of a genome assembly: the full data set (bigZips), the full data set by chromosome (chromosome), the annotation database tables (database), and one or more sets of comparative cross-species alignments.

BigZips contains the entire draft of the genome in chromosome and/or contig form. Depending on the genome, this directory may contain some or all of the following files:

-- chromAgp.zip: Description of how the assembly was generated, unpacking to one file per chromosome.

-- chromFa.zip: The assembly sequence chromosomes, in one file per chromosome. Repeats from RepeatMasker and Tandem Repeats Finder are shown in lower case; non-repeating sequence is in upper case. The main assembly is contained in the chrN.fa files, where chrN is the name of the chromosome. The chrN_random.fa files contain clones that are not yet finished or cannot be placed with certainty at a specific place on the chromosome. In some cases, including the human HLA region on chromosome 6, the chrN_random.fa files also contain haplotypes that differ from the main assembly.

-- chromFaMasked.zip: The assembly sequence chromosomes, in one file per chromosome. Repeats are masked by capital Ns; non-repeating sequence is shown in upper case.

-- chromOut.zip: RepeatMasker .out file for chromosomes, generated by RepeatMasker at the -s sensitive setting.

-- chromTrf.zip: Tandem Repeats Finder locations, filtered to keep repeats with period less than or equal to 12, translated into one .bed file per chromosome.

-- contigAgp.zip: Description of how the assembly was generated from fragments at a contig layout level.

-- contigFa.zip: The assembly sequence contigs, in one file per contig. All contigs are in forward orientation relative to the chromosome. In some cases, this means that contigs will be reversed relative to their orientation in the NCBI assembly. Repeats are shown in lower case; non-repeating sequence is shown in upper case.

-- contigFaMasked.zip: The assembly sequence contigs, in one file per contig. Repeats are masked by capital Ns; non-repeating sequence is shown in upper case.

-- contigOut.zip: RepeatMasker .out file for contigs, generated by RepeatMasker at the -s sensitive setting.

-- contigTrf.zip: Tandem Repeats Finder locations, filtered to keep repeats with period less than or equal to 12, and translated into one .bed file per contig.

-- database.zip: The Genome Browser database as tab-delimited files and associated MySQL table-creation files (eliminated in later assemblies due to size restrictions).

-- est.fa.zip: Sequences of all GenBank ESTs for the selected species.


-- liftAll.zip: The offsets of contigs within chromosomes.

-- mrna.zip: mRNAs in GenBank from the selected species.

-- refmrna.zip: RefSeq mRNAs from the selected species.

-- upstream1000.zip: Sequences 1000 bases upstream of annotated transcription start of RefSeq genes. This includes only cases where the transcription start is annotated separately from the coding region start.

-- upstream2000.zip: Same as upstream1000, but with 2000 bases.

-- upstream5000.zip: Same as upstream1000, but with 5000 bases.

-- xenoMrna.zip: All GenBank mRNAs from species other than the selected one.
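The difference between the chromFa and chromFaMasked conventions in the list above (repeats in lower case versus repeats replaced by capital Ns) can be illustrated with a short sketch. The function names hard_mask and repeat_fraction are hypothetical, and the input is assumed to be plain FASTA text with ">" header lines.

```python
def hard_mask(fasta_text):
    """Replace lower-case (repeat) bases with N; header lines are left untouched."""
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            out.append(line)
        else:
            out.append("".join("N" if ch.islower() else ch for ch in line))
    return "\n".join(out)


def repeat_fraction(fasta_text):
    """Fraction of sequence characters that are lower case (soft-masked repeats)."""
    seq = "".join(
        line for line in fasta_text.splitlines() if not line.startswith(">")
    )
    return sum(ch.islower() for ch in seq) / len(seq) if seq else 0.0
```

Applied to a soft-masked record, hard_mask yields the kind of sequence found in the masked files, while repeat_fraction gives a quick estimate of how much of a sequence was flagged as repeat.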

Chromosomes contains the assembled sequence for the genome in separate files for each chromosome in a zipped fasta format. The main assembly can be found in the chrN.fa files, where N is the name of the chromosome. The chrN_random.fa files contain clones that are not yet finished or cannot be placed with certainty at a specific place on the chromosome. In some cases, the chrN_random.fa files also contain haplotypes that differ from the main assembly.

Database contains all of the positional and non-positional tables in the genome annotation database. Each table is represented by 2 files:

-- .sql file: the MySQL commands used to create the table.

-- .txt.gz file: the MySQL database table data in tab-delimited format and compressed with gzip.

Schema descriptions for all tables in the genome annotation database may be viewed by using the "describe table schema" button in the Table Browser.
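As an illustration of the .txt.gz layout described above, the following sketch decompresses a table dump and splits it into rows. It assumes the dump has no header row, so the column names (which would come from the matching .sql file) are passed in by the caller; the function name read_table_dump and the column names used in any call are hypothetical.

```python
import gzip


def read_table_dump(raw_bytes, column_names):
    """Decode a gzip-compressed, tab-delimited table dump into a list of dicts."""
    text = gzip.decompress(raw_bytes).decode("utf-8")
    rows = [line.split("\t") for line in text.splitlines() if line]
    return [dict(zip(column_names, row)) for row in rows]
```

All values are returned as strings; converting coordinates to integers is left to the caller because the column types differ from table to table.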

Cross-species alignments directories, such as the vsMm4 and humorMm3Rn3 directories in the hg16 assembly, contain pairwise and multiple species alignments and filtered alignment files used to produce cross-species annotations. For more information, refer to the READMEs in these directories and the description of the Multiple Alignment Format (MAF).

Creating custom annotation tracks 

The Genome Browser provides dozens of aligned annotation tracks that have been computed at UCSC or have been provided by outside collaborators. In addition to these standard tracks, it is also possible for users to upload their own annotation data for temporary display in the browser. These custom annotation tracks are viewable only on the machine from which they were uploaded and are automatically discarded 48 hours after the last time they are accessed, unless they are saved in a Session. Optionally, users can make custom annotations viewable by others as well.

Custom tracks are a wonderful tool for research scientists using the Genome Browser.

Because space is limited in the Genome Browser track window, many excellent genome-wide tracks cannot be included in the standard set of tracks packaged with the browser. Other tracks of interest may be excluded from distribution because the annotation track data is too specific to be of general interest or can't be shared until journal publication. Many individuals and labs have contributed custom tracks to the Genome Browser website for use by others. To view a list of these custom annotation tracks, click the Custom Tracks link on the Genome Browser home page.

Custom annotation tracks are similar to standard tracks, but never become part of the MySQL genome database. Each track has its own controller and persists even when not displayed in the Genome Browser window, e.g. if the position changes to a range that no longer includes the track. Typically, custom annotation tracks are aligned under corresponding genomic sequence, but they can also be completely unrelated to the data. For example, a track can be displayed under a long sequence consisting of millions of Ns.

Genome Browser annotation tracks are based on files in line-oriented format. Each line in the file defines a display characteristic for the track or defines a data item within the track. Annotation files contain three types of lines: browser lines, track lines, and data lines. Empty lines and those starting with "#" are ignored.
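The three line types can be told apart by their first word. The sketch below shows one way to classify the lines of an annotation file; the function name classify_line is hypothetical, and this is not a description of how the Genome Browser itself parses files.

```python
def classify_line(line):
    """Return 'ignored', 'browser', 'track', or 'data' for one annotation-file line."""
    if not line.strip() or line.startswith("#"):
        return "ignored"          # empty lines and "#" comments are skipped
    first_word = line.split()[0]
    if first_word == "browser":
        return "browser"
    if first_word == "track":
        return "track"
    return "data"                 # anything else is treated as a data line
```

A quick pass with classify_line over a file is a cheap way to confirm that every data set is preceded by a track line before uploading.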

To construct an annotation file and display it in the Genome Browser, follow these steps:

Step 1. Format the data set

Formulate your data set as a tab-separated file using one of the formats supported by the Genome Browser. Annotation data can be in standard GFF format or in a format designed specifically for the Human Genome Project or UCSC Genome Browser, including bedGraph, GTF, PSL, BED, bigBed, WIG, bigWig, BAM, VCF, MAF, BED detail, Personal Genome SNP, broadPeak, narrowPeak, and microarray (BED15). GFF and GTF files must be tab-delimited rather than space-delimited to display correctly. Chromosome references must be of the form chrN (the parsing of chromosome names is case-sensitive). You may include more than one data set in your annotation file; these need not be in the same format.

Step 2. Define the Genome Browser display characteristics

Add one or more optional browser lines to the beginning of your formatted data file to configure the overall display of the Genome Browser when it initially shows your annotation data. Browser lines allow you to configure such things as the genome position that the Genome Browser will initially open to, the width of the display, and the configuration of the other annotation tracks that are shown (or hidden) in the initial display. NOTE: If the browser position is not explicitly set in the annotation file, the initial display will default to the position setting most recently used by the user, which may not be an appropriate position for viewing the annotation track.

Step 3. Define the annotation track display characteristics

Following the browser lines -- and immediately preceding the formatted data -- add a track line to define the display attributes for your annotation data set. Track lines enable you to define annotation track characteristics such as the name, description, colors, initial display mode, use score, etc. The track type=<track_type> attribute is required for some tracks. If you have included more than one data set in your annotation file, insert a track line at the beginning of each new set of data.

Example 1: Here is an example of a simple annotation file that contains a list of chromosome coordinates.

browser position chr22:20100000-20100900
track name=coords description="Chromosome coordinates list" visibility=2
chr22 20100000 20100100
chr22 20100011 20100200
chr22 20100215 20100400
chr22 20100350 20100500
chr22 20100700 20100800
chr22 20100700 20100900

Click here to view this track in the Genome Browser.

Example 2: Here is an example of an annotation file that defines 2 separate annotation tracks in BED format. The first track displays blue one-base tick marks every 10000 bases on chr22. The second track displays red 100-base features alternating with blank space in the same region of chr22.

browser position chr22:20100000-20140000
track name=spacer description="Blue ticks every 10000 bases" color=0,0,255,
chr22 20100000 20100001
chr22 20110000 20110001
chr22 20120000 20120001
track name=even description="Red ticks every 100 bases, skip 100" color=255,0,0
chr22 20100000 20100100 first
chr22 20100200 20100300 second
chr22 20100400 20100500 third

Click here to view this track in the Genome Browser.
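Files like the first track in Example 2 are easy to generate rather than type by hand. The following is a minimal sketch under a few assumptions: the function name tick_track and its parameters are hypothetical, data fields are written tab-separated as recommended in Step 1, and the description is assumed to contain no double quotes.

```python
def tick_track(chrom, start, end, step, name, description, color):
    """Build an annotation file with one-base ticks every `step` bases in [start, end)."""
    lines = [
        f"browser position {chrom}:{start}-{end}",
        f'track name={name} description="{description}" color={color}',
    ]
    for pos in range(start, end, step):
        lines.append(f"{chrom}\t{pos}\t{pos + 1}")   # one-base feature at each tick
    return "\n".join(lines) + "\n"
```

For example, tick_track("chr22", 20100000, 20130000, 10000, "spacer", "Blue ticks every 10000 bases", "0,0,255") produces a browser line, a track line, and three tick features at the same positions as the first track above.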

Example 3: This example shows an annotation file containing one data set in BED format. The track displays features with multiple blocks, a thick end and thin end, and hatch marks indicating the direction of transcription. The track labels display in green (0,128,0), and the gray level of each feature reflects the score value of that line. NOTE: The track name line in this example has been split over 2 lines for documentation purposes. If you paste this example into the Genome Browser, you must remove the line break to display the track successfully. Click here for a copy of this example that can be pasted into the browser without editing.

browser position chr22:1000-10000
browser hide all
track name="BED track" description="BED format custom track example" visibility=2
color=0,128,0 useScore=1
chr22 1000 5000 itemA 960 + 1100 4700 0 2 1567,1488, 0,2512
chr22 2000 7000 itemB 200 - 2200 6950 0 4 433,100,550,1500 0,500,2000,3500

Click here to view this track in the Genome Browser.
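The last three fields of the data lines in Example 3 (block count, block sizes, block starts) describe the multiple blocks of each feature. As a rough illustration, the sketch below converts them to absolute chromosome coordinates, relying on the standard BED convention that block starts are offsets from the feature's start position. The function names are hypothetical.

```python
def parse_int_list(text):
    """Parse a comma-separated integer list, tolerating a trailing comma."""
    return [int(item) for item in text.split(",") if item]


def bed_blocks(bed_line):
    """Return the (start, end) chromosome coordinates of each block in a 12-field BED line."""
    fields = bed_line.split()
    if len(fields) < 12:
        raise ValueError("expected at least 12 BED fields")
    chrom_start = int(fields[1])
    block_count = int(fields[9])
    sizes = parse_int_list(fields[10])
    starts = parse_int_list(fields[11])
    if not (len(sizes) == len(starts) == block_count):
        raise ValueError("block count does not match block sizes/starts")
    return [(chrom_start + s, chrom_start + s + size) for s, size in zip(starts, sizes)]
```

For itemB the blocks come out as 2000-2433, 2500-2600, 4000-4550, and 5500-7000, so the last block ends exactly at the feature's end position, which is a useful sanity check on hand-written BED lines.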

Step 4. Display your annotation track in the Genome Browser 

To view your annotation data in the Genome Browser, open the Genome Browser home page and click the Genome Browser link in the top menu bar. On the Gateway page that displays, select the genome and assembly on which your annotation data is based, then click the "add custom tracks" button. (Note: if the Gateway displays the "manage custom tracks" button instead, see Displaying and Managing Custom Tracks for information on how to display your track.)

On the Add Custom Tracks page, load the annotation track data or URL for your custom track into the upper text box and the track documentation (optional) into the lower text box, then click the Submit button. Tracks may be loaded by entering text, a URL, or a pathname on your local computer. The track type=<track_type> attribute is required for some tracks. For more information on these methods, as well as information on creating and adding track documentation, see Loading a Custom Track into the Genome Browser.

If you encounter difficulties displaying your annotation, read the section Troubleshooting Annotation Display Problems.

Step 5. (Optional) Add details pages for individual track features

After you've constructed your track and have successfully displayed it in the Genome Browser, you may wish to customize the details pages for individual track features. The Genome Browser automatically creates a default details page for each feature in the track containing the feature's name, position information, and a link to the corresponding DNA sequence. To view the details page for a feature in your custom annotation track (in full, pack, or squish display mode), click on the item's label in the annotation track window.

You can add a link from a details page to an external web page containing additional information about the feature by using the track line url attribute. In the annotation file, set the url attribute in the track line to point to a publicly available page on a web server. The url attribute substitutes each occurrence of '$$' in the URL string with the name defined by the name attribute. You can take advantage of this feature to provide individualized information for each feature in your track by creating HTML anchors that correspond to the feature names in your web page.

Example 4: Here is an example of a file in which the url attribute has been set to point to the file http://genome.ucsc.edu/goldenPath/help/clones.html. The '#$$' appended to the end of the file name in the example points to the HTML NAME tag within the file that matches the name of the feature (cloneA, cloneB, etc.). NOTE: The track line in this example has been split over 2 lines for documentation purposes. If you paste this example into the browser, you must remove the line break to display the track successfully. Click here for a copy of this example that can be pasted into the browser without editing.


browser position chr22:10000000-10020000
browser hide all
track name=clones description="Clones" visibility=2 color=0,128,0 useScore=1
url="http://genome.ucsc.edu/goldenPath/help/clones.html#$$"
chr22 10000000 10004000 cloneA 960
chr22 10002000 10006000 cloneB 200
chr22 10005000 10009000 cloneC 700
chr22 10006000 10010000 cloneD 600
chr22 10011000 10015000 cloneE 300
chr22 10012000 10017000 cloneF 100

Click here to display this track in the Genome Browser.
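The '$$' substitution used in Example 4 amounts to a plain string replacement. The sketch below shows the link each feature would receive and a matching HTML anchor for the target page; the function names details_url and name_anchor are hypothetical and only mimic the behavior described above.

```python
def details_url(url_template, item_name):
    """Substitute every '$$' in the track line url value with the item name."""
    return url_template.replace("$$", item_name)


def name_anchor(item_name):
    """HTML anchor to place in the target page so that '#<item name>' resolves."""
    return f'<a name="{item_name}"></a>'
```

With the template from Example 4, details_url gives a link ending in clones.html#cloneA for the first feature, and name_anchor("cloneA") is the anchor the page would need at the corresponding entry.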

Step 6. (Optional) Share your annotation track with others

The previous steps showed you how to upload annotation data for your own use on your own machine. However, many users would like to share their annotation data with members of their research group on different machines or with colleagues at other sites. To learn how to make your Genome Browser annotation track viewable by others, read the section Sharing Your Annotation Track with Others.

Loading a Custom Track into the Genome Browser 

Using the Genome Browser's custom track upload and management utility, annotation tracks may be added for display in the Genome Browser, deleted from the Genome Browser, or updated with new data and/or display options. You may also use this interface to upload and manage custom track sets for multiple genome assemblies.

To load a custom track into the Genome Browser:

Step 1. Open the Add Custom Tracks page

Click the "add custom tracks" button on the Genome Browser Gateway page. (Note: if one or more tracks have already been uploaded during the current Browser session, additional tracks may be loaded on the Manage Custom Tracks page. In this case, the button on the Gateway page will be labeled "manage custom tracks" and will automatically direct you to the track management page. See Displaying and Managing Custom Tracks for more information.)

Step 2. Load the custom track data

The Add Custom Tracks page contains separate sections for uploading custom track data and optional custom track descriptive documentation. Load the annotation data into the upper section by one of the following methods:

-- Enter one or more URLs for custom tracks (one per line) in the data text box. The Genome Browser supports both the HTTP and FTP (passive-only) protocols. Data provided by a URL may need to be preceded by a separate line defining the type=<track_type> attribute required for some tracks, for example "track type=broadPeak".

-- Click the "Browse" button directly above the data text box, then choose a custom track file from your local computer, or type the pathname of the file into the "upload" text box adjacent to the "Browse" button. The custom track data may be compressed by any of the following programs: gzip (.gz), compress (.Z), or bzip2 (.bz2). Files containing compressed data must include the appropriate suffix in their names.

-- Paste or type the custom track data directly into the data box. Because the text in this box will not be saved to a file, this method is not recommended unless you have a copy of the data elsewhere.
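Because the suffix has to match the compression program, it can help to compress the data and derive the file name in one step. This is a minimal sketch using Python's standard gzip and bz2 modules; the function name compress_track is hypothetical, and the compress (.Z) format is left out because the standard library has no writer for it.

```python
import bz2
import gzip

# Suffix and compression function for each supported method.
METHODS = {
    "gzip": (".gz", gzip.compress),
    "bzip2": (".bz2", bz2.compress),
}


def compress_track(filename, data, method="gzip"):
    """Return (new file name with the matching suffix, compressed bytes)."""
    if method not in METHODS:
        raise ValueError(f"unsupported compression method: {method}")
    suffix, compress = METHODS[method]
    return filename + suffix, compress(data)
```

The returned name and bytes can then be written to disk and selected with the "Browse" button, or placed on a web server and loaded by URL.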

Multiple custom tracks may be uploaded at one time on the Add Custom Tracks page through one of the following methods:

-- Put all the tracks into the same file (rather than separate files), then load the file via the Browse button.

-- Place your track files in a web-accessible location on your server, then load them into the Genome Browser by pasting their URLs into the data box.

Step 3. (Optional) Load the custom track description page

If desired, you can provide optional descriptive text (in plain or HTML format) to accompany your custom track. This text will be displayed when a user clicks the track's description button on the Genome Browser annotation tracks page. Descriptive text may be loaded by one of the following methods:

-- Click the "Browse" button directly above the documentation text box, then choose a text file from your local computer, or type the pathname of the file into the "upload" text box adjacent to the "Browse" button.

-- Paste or type the descriptive text directly into the documentation text box. Note that the text in this box will not be saved to a file; therefore, this method is not recommended except for temporary documentation purposes.

-- If your descriptive text is located on a website, you can reference it from your custom track file by defining the track line attribute "htmlUrl": htmlUrl=<external_url>. In this case, there is no need to insert anything into the documentation text box.

To format your description page in a style that is consistent with standard Genome Browser tracks, click the template link below the documentation text box for an HTML template that may be copied and pasted into a file for editing.

If you load multiple custom tracks simultaneously using one of the methods described in Step 2, a track description can be associated only with the last custom track loaded, unless you upload the descriptive text using the track line "htmlUrl" attribute described above.

Step 4. Upload the track

Click the Submit button to load your custom track data and documentation into the Genome Browser. If the track uploads successfully, you will be directed to the custom track management page where you can display your track, update an uploaded track, add more tracks, or delete uploaded tracks. If the Genome Browser encounters a problem while loading your track, it will display an error. See the section Troubleshooting Annotation Display Problems for help in diagnosing custom track problems.


Displaying and Managing Custom Tracks 

After a custom track has been successfully loaded into the Genome Browser, you can display it -- as well as manage your entire custom track set -- via the options on the Manage Custom Tracks page. This page automatically displays when a track has been uploaded into the Genome Browser (see Loading a Custom Track into the Genome Browser). Alternatively, you can access the track management page by clicking the "manage custom tracks" button on the Gateway or Genome Browser annotation tracks pages. (Note that the track management page is available only if at least one track has been loaded during the current browser session; otherwise, this button is labeled "add custom tracks" and opens the Add Custom Tracks page.)

The table on the Manage Custom Tracks page shows the current set of uploaded custom tracks for the genome and assembly specified at the top of the page. If tracks have been loaded for more than one genome assembly, pulldown lists are displayed; to view the uploaded tracks for a different assembly, select the desired genome and assembly option from the lists.

The following track information is displayed in the Manage Custom Tracks table:

-- Name: a hyperlink to the Update Custom Track page where you can update your track configuration and data.

-- Description: the value of the "description" attribute from the track line, if present. If no description is included in the input file, this field contains the track name.

-- Type: the track type, determined by the Browser based on the format of the data.

-- Doc: displays "Y" (Yes) if a description page has been uploaded for the track; otherwise the field is blank.

-- Items: the number of data items in the custom track file. An item count is not displayed for tracks lacking individual items (e.g. wiggle format data).

-- Pos: the default chromosomal position defined by the track file in either the browser line "position" attribute or the first data line. Click this link to open the Genome Browser or Table Browser at the specified position (Note: only the chromosome name is shown in this column). The Pos column remains blank if the track lacks individual items (e.g. wiggle format data) and the browser line "position" attribute hasn't been set.

Displaying a custom track in the Genome Browser

Click the "go to genome browser" button to display the entire custom track set for the specified genome assembly in the Genome Browser. By default, the browser will open to the position specified in the browser line "position" attribute or first data line of the first custom track in the table, or the last-accessed Genome Browser position if the track is in wiggle data format. To open the display at the default position for another track in the list, click the track's position link in the Pos column.

Viewing a custom track in the Table Browser

Click the "go to table browser" button to access the data for the custom track set in the Table Browser. The custom tracks will be listed in the "Custom Tracks" group pulldown list.

Loading additional custom tracks

To load a new custom track into the currently displayed track set, click the "add custom tracks" button. To change the genome assembly to which the track should be added, select the appropriate options from the pulldown lists at the top of the page. For instructions on adding a custom track on the Add Custom Tracks page, see Loading a Custom Track into the Genome Browser.

Removing one or more custom tracks

To remove custom tracks from the uploaded track set, click the checkboxes in the "delete" column for all tracks you wish to remove, then click the "delete" button. A custom track may also be removed by clicking the "Remove custom track" button on the track's description page. Note: removing the track from the Genome Browser does not delete the track file from your server or local disk.

Updating a custom track

To update the stored information for a loaded custom track, click the track's link in the "Name" column in the Manage Custom Tracks table. A custom track may also be updated by clicking the "Update custom track" button on the track's description page.

The Update Custom Track page provides sections for modifying the track configuration information (the browser lines and track lines), the annotation data, and the descriptive documentation that accompanies the track. Existing track configuration lines are displayed in the top "Edit configuration" text box. In the current implementation of this utility, the existing annotation data is not displayed. Because of this, the data cannot be incrementally edited through this interface, but instead must be fully replaced using one of the data entry methods described in Loading a Custom Track into the Genome Browser. If description text has been uploaded for the track, it will be displayed in the track documentation edit box, where it may be edited or completely replaced. Once you have completed your updates, click the Submit button to upload the new data into the Genome Browser.

If the data or description text for your custom track was originally loaded from a file on your hard disk or server, you should first edit the file, then reload it from the Update Custom Track page using the "Browse" button. Note that edits made on this page to description text uploaded from a file will not be saved to the original file on your computer or server. Because of this, we recommend that you use the documentation edit box only for changes made to text that was typed or pasted in.

Browser Lines 

Browser lines configure the overall display of the Genome Browser window when your annotation file is uploaded. Each line defines one display attribute. Browser lines have the format:

browser attribute_name attribute_value(s) 

For example, if the browser line browser position chr22:1-20000 is included in the annotation file, the Genome Browser window will initially display the first 20000 bases of chr22.

The following browser line attribute name/value options are available. The value track_name must be set to the name of the primary table on which the track is based. To identify this table, open up the Table Browser, select the correct genome assembly, then select the track name from the track list. The table list will show the primary table. Alternatively, the primary table name can be obtained from a mouseover on the track name in the track control section.

Note that composite track subtracks are not valid track_name values. To find the symbolic name of a composite track, look in the tableName field in the trackDb table, or mouseover the track name in the track control section. It is not possible to display only a subset of the subtracks at this time.

-- position <position> - Determines the part of the genome that the Genome Browser will initially open to, in chromosome:start-end format.

-- hide all - Hides all annotation tracks except for those listed in the custom track file.

-- hide <track_primary_table_name(s)> - Hides the listed tracks. Multiple track names should be space-separated.

-- dense all - Displays all tracks in dense mode. NOTE: Use the "all" option cautiously. If the browser display includes a large number of tracks or a large position range, this option may overload your browser's resources and cause an error or timeout.

-- dense <track_primary_table_name(s)> - Displays the specified tracks in dense mode. Symbolic names must be used. Multiple track names should be space-separated.

-- pack all - Displays all tracks in pack mode. See NOTE for "dense all".

-- pack <track_primary_table_name(s)> - Displays the specified tracks in pack mode. Symbolic names must be used. Multiple track names should be space-separated.

-- squish all - Displays all tracks in squish mode. See NOTE for "dense all".

-- squish <track_primary_table_name(s)> - Displays the specified tracks in squish mode. Symbolic names must be used. Multiple track names should be space-separated.

-- full all - Displays all tracks in full mode. See NOTE for "dense all".

-- full <track_primary_table_name(s)> - Displays the specified tracks in full mode. Symbolic names must be used. Multiple track names should be space-separated.

Definition: <track_primary_table_name(s)>. You can find the primary table name by clicking "View Table Schema" from the track's description page, or from the Table Browser. It will be listed as the Primary Table. Alternatively, you can mouse-over the track label in the Browser and look at the URL the link points to. The part after the g= in the URL is the track's primary table name (e.g., for UCSC Genes you will see g=knownGene in the URL; the track primary table is knownGene).

Note that the Genome Browser will open to the range defined in the Gateway page search term box or the position saved as the default unless the browser line position attribute is defined in the annotation file. Although this attribute is optional, it's recommended that you set this value in your annotation file to ensure that the track will appear in the display range when it is uploaded into the Genome Browser.
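Browser lines follow a simple "browser attribute value(s)" pattern, so they can be assembled programmatically. The sketch below is one possible helper, not part of the Genome Browser: the names parse_position and browser_lines are hypothetical, and the position check assumes chromosome names of the form chrN as required for annotation data.

```python
import re

POSITION_RE = re.compile(r"^(chr[^:\s]+):(\d+)-(\d+)$")
MODES = ("hide", "dense", "pack", "squish", "full")


def parse_position(position):
    """Split 'chromosome:start-end' into (chromosome, start, end)."""
    match = POSITION_RE.match(position)
    if match is None:
        raise ValueError(f"not in chromosome:start-end format: {position}")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError("start must not exceed end")
    return chrom, start, end


def browser_lines(position=None, **modes):
    """Build browser lines; each mode takes 'all' or a list of primary table names."""
    lines = []
    if position is not None:
        parse_position(position)                 # validate before emitting
        lines.append(f"browser position {position}")
    for mode, tracks in modes.items():
        if mode not in MODES:
            raise ValueError(f"unknown display mode: {mode}")
        names = tracks if isinstance(tracks, str) else " ".join(tracks)
        lines.append(f"browser {mode} {names}")
    return lines
```

For instance, browser_lines("chr22:1-20000", hide="all", pack=["knownGene"]) yields a position line, a "hide all" line, and a line that shows the knownGene track in pack mode.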

Track Lines 

Track lines define the display attributes for all lines in an annotation data set. If more than one data set is included in the annotation file, each group of data must be preceded by a track line that describes the display characteristics for that set of data. A track line begins with the word track, followed by one or more attribute=value pairs. Unlike browser lines - in which each attribute is defined on a separate line - all of the track attributes for a given set of data are listed on one line with no line breaks. The inadvertent insertion of a line break into a track line will generate an error when you attempt to upload the annotation track into the Genome Browser.

The following track line attribute=value pairs are defined in the Genome Browser:

-- name=<track_label> - Defines the track label that will be displayed to the left of the track in the Genome Browser window, and also the label of the track control at the bottom of the screen. The name can consist of up to 15 characters, and must be enclosed in quotes if the text contains spaces. We recommend that the track_label be restricted to alpha-numeric characters and spaces to avoid potential parsing problems. The default value is "User Track".

-- description=<center_label> - Defines the center label of the track in the Genome Browser window. The description can consist of up to 60 characters, and must be enclosed in quotes if the text contains spaces. The default value is "User Supplied Track".

-- type=<track_type> - Defines the track type. The track type attribute is required for BAM, BED detail, bedGraph, bigBed, bigWig, broadPeak, narrowPeak, Microarray, VCF and WIG tracks.

-- visibility=<display_mode> - Defines the initial display mode of the annotation track. Values for display_mode include: 0 - hide, 1 - dense, 2 - full, 3 - pack, and 4 - squish. The numerical values or the words can be used, i.e. full mode may be specified by "2" or "full". The default is "1".

-- color=<RRR,GGG,BBB> - Defines the main color for the annotation track. The track color consists of three comma-separated RGB values from 0-255. The default value is 0,0,0 (black).

-- itemRgb=On - If this attribute is present and is set to "On", the Genome Browser will use the RGB value shown in the itemRgb field in each data line of the associated BED track to determine the display color of the data on that line.

-- colorByStrand=<RRR,GGG,BBB RRR,GGG,BBB> - Sets colors for + and - strands, in that order. The colors consist of three comma-separated RGB values from 0-255 each. The default is 0,0,0 0,0,0 (both black).

  useScore=<use_score> - If this attribute is present and is set to 1, the score field in each of the track's data lines will be used to determine the level of shading in which the data is displayed. The track will display in shades of gray unless the color attribute is set to 100,50,0 (shades of brown) or 0,60,120 (shades of blue). The default setting for useScore is "0". This table shows the Genome Browser's translation of BED score values into shades of gray:

score in range:   ≤ 166 | 167-277 | 278-388 | 389-499 | 500-611 | 612-722 | 723-833 | 834-944 | ≥ 945
shade:            lightest gray (leftmost range) through darkest gray (rightmost range)

  group=<group> - Defines the annotation track group in which the custom track will display in the Genome Browser window. By default, group is set to "user", which causes custom tracks to display at the top of the track listing in the group "Custom Tracks". The value for "group" must be the "name" of one of the predefined track groups. To get a list of allowable group names for an assembly, go to the Table Browser and select "group: All Tables", "table: grp", and "get output". Entries in the "name" column may be used. (Note that mirrors may define other group names in the grp table.)

  priority=<priority> - When the group attribute is set, defines the display position of the track relative to other tracks within the same group in the Genome Browser window. If group is not set, the priority attribute defines the track's order relative to other custom tracks displayed in the default group, "user".

  db=<UCSC_assembly_name> - When set, indicates the specific genome assembly for which the annotation data is intended; the custom track manager will display an error if a user attempts to load the track onto a different assembly. Any valid UCSC assembly ID may be used (e.g., hg18, mm8, felCat1). The default setting is blank, allowing the custom track to be displayed on any assembly.

  offset=<offset> - Defines a number to be added to all coordinates in the annotation track. The default is "0".

  maxItems=<#> - Defines the maximum number of items the track can contain. The default value is 250. Be aware that tracks with an extremely large number of items can cause system instability. The Kent source utility bedItemOverlapCount can assist in analyzing base overlap with large tracks.

  url=<external_url> - Defines a URL for an external link associated with this track. This URL will be used in the details page for the track. Any '$$' in this string will be substituted with the item name. There is no default for this attribute.

  htmlUrl=<external_url> - Defines a URL for an HTML description page to be displayed with this track. There is no default for this attribute. A template for a standard format HTML track description is here.

  bigDataUrl=<external_url> - Defines a URL to the data file for BAM, bigBed, bigWig or VCF tracks. This is a required attribute for those track types. There is no default for this attribute.
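As an illustration of the attributes above, the following Python sketch assembles a track line and looks up which column of the useScore shading table a score falls into. The function names build_track_line and score_to_shade_bin are hypothetical helpers written for this example and are not part of the Genome Browser. The sketch enforces only the length limits stated for name and description and the rule that values containing spaces must be quoted; it does not validate attribute names or values otherwise.

```python
import bisect

# Upper bounds of the first eight score ranges in the useScore table;
# scores of 945 and above fall into the ninth (last) range.
SCORE_UPPER_BOUNDS = [166, 277, 388, 499, 611, 722, 833, 944]


def score_to_shade_bin(score):
    """Return the 0-based column of the shading table that a BED score falls in."""
    return bisect.bisect_left(SCORE_UPPER_BOUNDS, score)


def build_track_line(**attributes):
    """Assemble a single-line track definition from attribute=value pairs."""
    limits = {"name": 15, "description": 60}  # maximum lengths stated above
    parts = ["track"]
    for key, value in attributes.items():
        text = str(value)
        if key in limits and len(text) > limits[key]:
            raise ValueError(f"{key} exceeds {limits[key]} characters: {text!r}")
        if " " in text:
            text = f'"{text}"'  # values containing spaces must be quoted
        parts.append(f"{key}={text}")
    return " ".join(parts)


print(build_track_line(name="My Track", description="Example regions",
                       visibility="full", useScore=1))
# track name="My Track" description="Example regions" visibility=full useScore=1
print(score_to_shade_bin(500))  # 4, i.e. the 500-611 column
```

Because all pairs are joined with single spaces, the result is always one line, which matters because a track line must not contain a line break.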

Sharing Your Annotation Track with Others 

To make your Genome Browser annotation track viewable by people on other machines or at other sites, follow the steps below. (Note that some of the URL examples in this section have been broken up into 2 lines for documentation display purposes.)

Step 1. Put your formatted annotation file on your web site. Be sure that the file permissions allow it to be read by others.

Step 2. Construct a URL that will link this annotation file to the Genome Browser. The URL must contain 3 pieces of information specific to your annotation data:

  The species or genome assembly on which your annotation data is based. To automatically display the most recent assembly for a given organism, set the org parameter, e.g., org=human. To specify a particular genome assembly for an organism, use the db parameter, db=database_name, where database_name is the UCSC code for the genome assembly. For a list of these codes, see the Genome Browser FAQ. Examples of this include: db=hg16 (Human July 2003 assembly), db=mm6 (Mouse Mar. 2005 assembly).

  The genome position to which the Genome Browser should initially open. This information is of the form position=chr_position, where chr_position is a chromosome number, with or without a set of coordinates. Examples of this include: position=chr22, position=chr22:15916196-31832390.

  The URL of the annotation file on your web site. This information is of the form hgt.customText=URL, where URL points to the annotation file on your web site. An example of an annotation file URL is http://genome.ucsc.edu/goldenPath/help/test.bed.

  If a login and password are required to access data loaded through a URL (e.g., via the https: protocol), this information can be included in the URL using the format protocol://user:password@server.com/somepath. Only Basic Authentication is supported for HTTP. Note that passwords included in URLs are not protected. If a password contains a non-alphanumeric character, such as $, the character must be replaced by the hexadecimal representation for that character. For example, in the password mypwd$wk, the $ character should be replaced by %24, resulting in the modified password mypwd%24wk.
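The password substitution described above can be done with Python's standard urllib.parse.quote function. This is a minimal sketch: credentials_url is a hypothetical helper, and the host name server.example is a placeholder, not a real server. Note that quote leaves letters, digits, and the characters "_", ".", "-" and "~" unchanged, so it encodes "$" but not every non-alphanumeric character.

```python
from urllib.parse import quote


def credentials_url(protocol, user, password, host, path):
    """Build protocol://user:password@host/path with login and password encoded."""
    # safe="" makes quote() encode reserved characters such as "$", "/" and "@".
    login = quote(user, safe="")
    secret = quote(password, safe="")
    return f"{protocol}://{login}:{secret}@{host}/{path}"


print(quote("mypwd$wk", safe=""))  # mypwd%24wk
print(credentials_url("https", "user", "mypwd$wk", "server.example", "somepath"))
# https://user:mypwd%24wk@server.example/somepath
```

As noted above, a password embedded in a URL this way is not protected.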

Combine the above pieces of information into a URL of the following format (organism_name, chr_position, and URL stand for the information specific to your annotation file):

http://genome.ucsc.edu/cgi-bin/hgTracks?org=organism_name&
position=chr_position&hgt.customText=URL

Example 10: The following URL will open up the Genome Browser window to display chr 22 of the latest human genome assembly and will show the annotation track pointed to by the URL http://genome.ucsc.edu/goldenPath/help/test.bed:

http://genome.ucsc.edu/cgi-bin/hgTracks?org=human&position=chr22&hgt.customText=http://genome.ucsc.edu/goldenPath/help/test.bed  
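The URL in Example 10 can also be assembled programmatically. The Python sketch below joins the three pieces of information from Step 2 exactly as they appear in the example, without applying any percent-encoding. The function name custom_track_url is hypothetical, and requiring exactly one of org or db is an assumption of this sketch rather than a documented rule.

```python
HGTRACKS = "http://genome.ucsc.edu/cgi-bin/hgTracks"


def custom_track_url(position, annotation_url, org=None, db=None):
    """Combine the three pieces of information from Step 2 into one hgTracks URL."""
    if (org is None) == (db is None):
        # Assumption of this sketch: the caller picks either org or db.
        raise ValueError("give exactly one of org= or db=")
    assembly = f"org={org}" if org is not None else f"db={db}"
    # The pieces are joined verbatim, as in Example 10.
    return f"{HGTRACKS}?{assembly}&position={position}&hgt.customText={annotation_url}"


print(custom_track_url("chr22",
                       "http://genome.ucsc.edu/goldenPath/help/test.bed",
                       org="human"))
```

The printed URL is the one shown in Example 10; passing db="hg16" instead of org="human" would pin the link to a specific assembly.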

Step 3. Provide the URL to others. To upload a custom annotation track pointed to by a URL into the Genome Browser, paste the URL into the large text edit box on the Add Custom Tracks page, then click the Submit button.

If you'd like to share your annotation track with a broader audience, send the URL for your track (along with a description of the format, methods, and data used) to the UCSC Genome mailing list, genome@soe.ucsc.edu.

Example 11: If you would like to share a URL that your colleague can click on directly, rather than loading it in the Custom Track tool (as in Example 10), then the URL will need a few extra parameters. Let's assume that your data is on a server at your institution in one of the large data formats: bigBed, bigWig, BAM, or VCF. In this case, the URL must include an hgct_customText parameter, which simulates the text box on the Custom Tracks page. Also, the URL must include the bigDataUrl that points to the data file on your server. So, a clickable URL that opens a remote bigBed track for the hg18 assembly to a certain location on chr21 would look like this:

http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg18&position=chr21:33038447-33041505&hgct_customText=track%20type=bigBed%20name=myBigBedTrack%20description=%22a%20bigBed%20track%22%20visibility=full%20bigDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/bigBedExample.bb
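In the URL above, the spaces of the track line appear as %20 and the double quotes as %22. The following Python sketch produces the same encoding with the standard urllib.parse.quote function. The helper name clickable_track_url is hypothetical, and keeping "=", ":" and "/" unencoded is a choice made here only to reproduce the form used in Example 11.

```python
from urllib.parse import quote

HGTRACKS = "http://genome.ucsc.edu/cgi-bin/hgTracks"


def clickable_track_url(db, position, track_line):
    """Embed a track line in the hgct_customText parameter of an hgTracks URL."""
    # Keep "=", ":" and "/" readable; spaces become %20 and quotes become %22.
    encoded = quote(track_line, safe="=:/")
    return f"{HGTRACKS}?db={db}&position={position}&hgct_customText={encoded}"


track_line = (
    'track type=bigBed name=myBigBedTrack description="a bigBed track" '
    "visibility=full "
    "bigDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/bigBedExample.bb"
)
print(clickable_track_url("hg18", "chr21:33038447-33041505", track_line))
```

With these inputs the function prints the URL shown in Example 11.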

Tip: Multiple tracks can be placed into one custom track submission. To do so, create a new file that contains the track lines for each file that will be included. To submit this custom set of tracks, merely use the URL to this new file.

Troubleshooting Annotation Display Problems 

Occasionally users encounter problems when uploading annotation files to the Genome Browser. In most cases, these problems are caused by errors in the format of the annotation file and can be tracked down using the information displayed in the error message. This section contains suggestions for resolving common display problems. If you are still unable to successfully display your data, please contact genome@soe.ucsc.edu for further assistance. Messages sent to this address will be posted to the moderated genome mailing list, which is archived on a public Web-accessible pipermail archive. This archive may be indexed by non-UCSC sites such as Google.

Problem: When I try one of your examples by cutting and pasting it into the Genome Browser, I get an error message.
Solution: Check that none of the browser lines, track lines, or data lines in your annotation file contains a line break. If the example contains GFF or GTF data lines, check that all the fields are tab-separated rather than space-separated.

Problem: When I click the submit button, I get the error message "line 1 of custom input:".
Solution: Check that none of the browser lines, track lines, or data lines in your annotation file contains a line break. A common source for this problem is the track line: all of the attribute pairs must be on the same line and must not be separated by a line break. If you are uploading your annotation file by pasting it into the text box on the Genome Browser Gateway page, check that the cut-and-paste operation did not inadvertently insert unwanted line feeds into the longer lines.

Problem: When I click the submit button, I get the error message "line # of custom input: missing = in var/val pair".
Solution: Check for incorrect syntax in the track lines in the annotation file. Be sure that each track line attribute pair has the format attribute=value.
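A track line can be checked for this kind of syntax problem before uploading. The Python sketch below is a rough approximation, not the Genome Browser's own parser: it uses the standard shlex module to split the line while keeping quoted values together, then reports every token that is not of the form attribute=value. The function name malformed_track_attributes is hypothetical.

```python
import shlex


def malformed_track_attributes(track_line):
    """Return the tokens of a track line that are not of the form attribute=value."""
    tokens = shlex.split(track_line)  # keeps quoted values with spaces together
    if not tokens or tokens[0] != "track":
        raise ValueError("not a track line")
    return [tok for tok in tokens[1:] if "=" not in tok or tok.startswith("=")]


print(malformed_track_attributes('track name="My Track" visibility=full'))
# []
print(malformed_track_attributes("track name=My Track visibility full"))
# ['Track', 'visibility', 'full']
```

The second call shows two separate mistakes: an unquoted value containing a space, and an attribute written without "=".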

Problem: When I click the submit button, I get the error message "line # of custom input: BED chromStarts[i] must be in ascending order".
Solution: This is most likely caused by a logical conflict in the Genome Browser software. It accepts custom GFF tracks that have multiple "exons" at the same position, but not BED tracks. Because the browser translates GFF tracks to BED format before storing the custom track data, GFF tracks with multiple exons at the same position will cause an error when the BED is read back in. To work around this problem, remove duplicate lines in the GFF track.
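One way to apply this workaround is to filter the GFF file before uploading it. The Python sketch below removes exact duplicate lines while keeping the first occurrence of each line in its original order. It assumes that the duplicated exons appear as identical lines; records that share a position but differ in another field would need a different comparison key. The function name and the sample records are invented for illustration.

```python
def drop_duplicate_lines(lines):
    """Return the lines with exact duplicates removed, keeping first occurrences."""
    seen = set()
    unique = []
    for line in lines:
        key = line.rstrip("\n")  # ignore differences in the trailing newline
        if key in seen:
            continue
        seen.add(key)
        unique.append(line)
    return unique


# Hypothetical tab-separated GFF records; the second one repeats the first.
gff = [
    "chr22\tsource\texon\t1000\t2000\t.\t+\t.\tgene1\n",
    "chr22\tsource\texon\t1000\t2000\t.\t+\t.\tgene1\n",
    "chr22\tsource\texon\t3000\t4000\t.\t+\t.\tgene1\n",
]
print(len(drop_duplicate_lines(gff)))  # 2
```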

Problem: When I click the submit button, the Genome Browser track window displays OK, but my track isn't visible.
Solution: Check the browser and track lines in your annotation file to make sure that you haven't accidentally set the display mode for the track to hide. If you are using the Annotation File box on the Genome Browser Gateway page to upload the track, check that you've entered the correct file name. If neither of these is the cause of the problem, try resetting the Genome Browser to clear any settings that may be preventing the annotation from displaying. To reset the Genome Browser, click the Click here to reset link on the Gateway page. If the annotation track still doesn't display, you may need to clear the cookies in your Internet browser as well (refer to your Internet browser's documentation for further information).

Problem: I am trying to upload some custom tracks (.gz files) to the Genome Browser using a URL from a GEO query. However, it is not uploading and I receive the error "line 1 of somefile.gz: thickStart after thickEnd".
Solution: The custom track mechanism supports plain BED files (not bigBed) that are of the type broadPeak or narrowPeak. If you supply the track type=<track_type> attribute, then the loader will understand the special columns at the end of each line. Your entry should be two lines, the first to define the track type and the second for the URL. For example:

track type=broadPeak
http://www.ncbi.nlm.nih.gov/geosuppl/...

Problem: I've gotten my annotation track to display, but now I can't make it go away! How do I remove an annotation track from my Genome Browser display?
Solution: Reset the Genome Browser by clicking the Click here to reset link on the Gateway page. This should reset your Genome Browser display to its default settings and remove all your custom tracks. To remove only one track of several, click the Manage Custom Tracks button and delete the track using the checkbox and Delete button.

Byte-Range Request Errors 

Occasionally when trying to visualize custom tracks in the Browser, users have received the error message "Byte-range request was ignored by server."

If this occurs, verify that the web server supports byte-range requests by issuing the following command:

> curl -I <URL>

The output should contain the following:

> Accept-Ranges: bytes

If you do not receive that output, you should take the following steps:

  Check with the administrators about the configuration of the server. This will often solve the problem.

  Find another site with a server that supports byte-ranges.

  Install an Apache alternative http server such as Cherokee.
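The same header check can be scripted when several data URLs need to be verified. The Python sketch below mirrors the curl -I command above by sending a HEAD request with the standard urllib.request module and looking for an Accept-Ranges header whose value mentions bytes. The function names are hypothetical, and the check reports only what the server advertises in its headers.

```python
import urllib.request


def accepts_byte_ranges(headers):
    """Return True if the response headers advertise 'Accept-Ranges: bytes'."""
    for name, value in headers.items():
        if name.lower() == "accept-ranges":  # header names are case-insensitive
            return "bytes" in value.lower()
    return False


def check_url(url, timeout=10):
    """Send a HEAD request (like `curl -I`) and inspect the response headers."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return accepts_byte_ranges(response.headers)


print(accepts_byte_ranges({"Accept-Ranges": "bytes"}))   # True
print(accepts_byte_ranges({"Content-Type": "text/plain"}))  # False
```

Calling check_url with the URL of a bigBed, bigWig, BAM, or VCF file returns True when the server answers with the header shown above.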

Getting Started on Track Hubs 

Track hubs are web-accessible directories of genomic data that can be viewed on the UCSC Genome Browser alongside native annotation tracks. Hubs are a useful tool for visualizing a large number of genome-wide data sets. The Track Hub utility allows efficient access to data sets from around the world through the familiar Genome Browser interface. Browser users can display tracks from any public track hub that has been registered with UCSC. Additionally, users can import data from unlisted hubs or can set up, display, and share their own track hubs.

For information on using the Track Hub features, refer to the Genome Browser Track Hub User Guide. For specific information on configuring your trackDb.txt file, refer to the Track Database Definition Document.