Drawing genome annotations with AnnotationSketch

In a previous post, I mentioned the GenomeTools package and, in particular, the AnnotationSketch tool. Here I give a few examples of how easy it is to use and customize.

Let’s say I have three sets of genome annotations for the same genomic region. Running AnnotationSketch with the defaults will give the following result.

$ gt sketch maize_default.png zm.chr5.0.yrgate.gff3 zm.chr5.1.cpgat.gff3 zm.chr5.2.msdo.gff3

AnnotationSketch graphic

Looking at the graphic sure beats trying to compare three GFF3 files by eye, but unfortunately the graphic is still pretty busy. It took me a few days to figure out how to clean it up, but it turns out the AnnotationSketch tool is very customizable. Here are the adjustments I made.

  • I made the graphic a bit wider using the command-line option -width.
  • Since there are annotations from three different sources, I thought it would be nice to have a specific color scheme for each source. I copied the default style file from $GT_INSTALL_DIR/gtdata/sketch/default.style and edited the copy. Style files allow you to apply styles conditionally using callback functions. You can see in the style file where I’ve used callback functions to choose colors based on the file from which each feature came. Additionally, I collapsed all features related to protein-coding transcripts into the mRNA tracks and disabled the display of unrelated feature types by setting their max_show_width properties to 0. This drastically reduced the visual clutter. When I run AnnotationSketch again, I must remember to specify this style file with the -style command-line option.
  • Finally, I used the -flattenfiles option to collapse all three mRNA tracks into a single track. By default, AnnotationSketch creates a separate track for each feature type from each input file, but since I’ve applied a custom color scheme, the separate tracks are unnecessary.
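
The style-file step can be sketched as a small shell helper. This is only a minimal sketch: it assumes GT_INSTALL_DIR is set to the GenomeTools installation prefix (as in the path above), and copy_default_style is a hypothetical helper name, not something GenomeTools provides. It copies the stock style file to a working copy and refuses to overwrite a copy that may already contain edits.

```shell
# Copy the stock AnnotationSketch style file to a working copy for editing.
# Assumption: GT_INSTALL_DIR points at the GenomeTools installation prefix.
# copy_default_style is a hypothetical helper, not part of GenomeTools.
copy_default_style() {
    dest="${1:-gt.three.style}"
    src="${GT_INSTALL_DIR:-}/gtdata/sketch/default.style"

    # Fail early if the installation directory is unset or the file is missing.
    if [ -z "${GT_INSTALL_DIR:-}" ] || [ ! -f "$src" ]; then
        echo "default style not found: $src" >&2
        return 1
    fi

    # Refuse to clobber a style file that may already contain edits.
    if [ -e "$dest" ]; then
        echo "refusing to overwrite existing file: $dest" >&2
        return 1
    fi

    cp "$src" "$dest"
}
```

Calling `copy_default_style gt.three.style` would leave an editable copy in the working directory, which is then passed to AnnotationSketch with the -style option.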

With all these adjustments, I can rerun AnnotationSketch with the following command, producing a much more pleasing graphic.

$ gt sketch -flattenfiles -width 1200 -style gt.three.style maize_improved.png zm.chr5.0.yrgate.gff3 zm.chr5.1.cpgat.gff3 zm.chr5.2.msdo.gff3

AnnotationSketch graphic

The style files are quite easy to work with once you’ve been through the process the first time. GenomeTools provides some pretty in-depth documentation, including developer tools for C, Lua, Python, and Ruby if extensive customization is required.
