I had the opportunity to attend Cold Spring Harbor’s Genome Informatics conference this year. Here are a couple of my favorite highlights.
Michael Schatz’s presentation briefly mentioned metassembly, but a student at Notre Dame (a collaborator and former intern of Schatz’s) presented a poster dedicated to the subject. He implemented a program called Metassembler, which takes as input two different assemblies of the same data (perhaps from different assemblers, or from the same assembler run with two different parameter settings) and derives a consensus assembly of higher quality than either input.
When I spoke to the student presenting the poster, he said it would be a couple of weeks before the code was ready for distribution. Given that their wiki has not been updated since before the conference, I’m not holding my breath…although I will be very interested to try this software out when it is available.
Another poster I enjoyed was presented by a student (undergrad?) of Dr. John Karro of Miami University Ohio. The student implemented a Hidden Markov Model to identify alternative sites of polyadenylation in transcripts. The HMM was pretty simple, but I enjoyed discussing the relevant biology, which was new to me.
One of the main sessions included a presentation about the Assemblathon genome assembly contest (since published in Genome Research). I don’t really remember much about which submissions or methods performed better than the others; what I enjoyed most about this presentation was the discussion of the different comparison metrics they developed to measure the relative quality of the submitted genome assemblies. One I remember off the top of my head was the cc50 measure, the “correct contiguity” analog of the n50 measure. Essentially, cc50 measures the distance at which 50% of the contigs (or scaffolds?) in the assembly are situated correctly with reference to the other contigs. They defined several other metrics to assess a variety of important characteristics of assembly quality. This is something I will definitely be going back to look at in more depth.
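For anyone unfamiliar with the n50 measure that cc50 is modeled on, here is a minimal Python sketch of the standard n50 calculation. The function name and the toy contig lengths are my own, made up for illustration; this is not the Assemblathon code, and it does not compute cc50 itself, which needs a reference to decide what counts as “correct”.

```python
def n50(lengths):
    """Return the n50 of a list of contig (or scaffold) lengths.

    n50 is the length L such that contigs of length >= L together
    contain at least half of all the bases in the assembly.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Doubling the running sum avoids a fractional half-total.
        if 2 * running >= total:
            return length


# Toy assembly (hypothetical lengths, 300 bases in total).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)
print(toy_n50)  # 70: the two longest contigs (80 + 70) reach half of 300
```

Note that n50 only looks at lengths, so a misassembled contig counts as much as a correct one. That gap is what measures like cc50 are meant to address.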
Steven Salzberg gave a presentation about the GAGE competition his research group conducted. Unlike the Assemblathon, which accepted community submissions, the GAGE project was conducted entirely by Salzberg’s lab. Essentially, they tested a wide variety of already available genome assemblers on real data and tried to assess the relative performance of each assembler. Rather than trying to drive innovation, this project is trying to address practical questions commonly faced by biologists in this new information age of biology.
The takeaway message I got from the presentation is that, by traditional assembly quality metrics (n50, n90, longest scaffold length, etc.), SOAPdenovo consistently generated the best results, followed by AllPathsLG. However, for each assembly, there was a high-quality reference assembly available for comparison, so they also assessed the quality of the assemblies after contigs and scaffolds were split in regions containing large amounts of error. For these corrected assemblies, AllPathsLG consistently provided the best performance, followed by SOAPdenovo. At the end of the day, SOAPdenovo provides the largest, (perhaps) most complete assemblies, while AllPathsLG provides assemblies that may be a bit smaller but have far fewer errors.
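To make the “split at errors, then re-measure” idea concrete, here is a small Python sketch. Everything in it is an assumption on my part: the function names, the representation of a contig as a (length, error positions) pair, and the toy numbers are invented for illustration and are not GAGE’s actual pipeline or data. It only shows how a ranking can flip once error-containing contigs are broken up.

```python
def n50(lengths):
    """Length L such that contigs of length >= L hold at least half the bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def split_at_errors(length, breakpoints):
    """Split one contig at the given error positions; return piece lengths."""
    pieces = []
    start = 0
    for bp in sorted(set(breakpoints)):
        if 0 < bp < length:  # ignore positions at or beyond the contig ends
            pieces.append(bp - start)
            start = bp
    pieces.append(length - start)
    return pieces


def corrected_n50(contigs):
    """n50 after breaking every contig at its (hypothetical) error positions."""
    pieces = []
    for length, breakpoints in contigs:
        pieces.extend(split_at_errors(length, breakpoints))
    return n50(pieces)


# Toy data: each contig is (length, [error positions found against a reference]).
big_but_error_prone = [(100, [50]), (60, [20, 40]), (40, [])]
smaller_but_clean = [(70, []), (60, []), (40, []), (20, [])]

raw_a = n50([length for length, _ in big_but_error_prone])
raw_b = n50([length for length, _ in smaller_but_clean])
fixed_a = corrected_n50(big_but_error_prone)
fixed_b = corrected_n50(smaller_but_clean)

print(raw_a, raw_b)      # 100 60: the error-prone assembly looks better
print(fixed_a, fixed_b)  # 50 60: the clean assembly wins after splitting
```

In this toy case the larger assembly wins on raw n50 but loses after correction, which is the same kind of reversal described above for SOAPdenovo and AllPathsLG.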
I enjoyed many other presentations and posters, but I only have so much time to sit and reflect on them now!
The end of the semester has given me an opportunity to reflect on my first symposium experience as a graduate student. I attended the RNA in Motion symposium back in September, but since the deadline for my written review was the end of the semester, I have only had (made?) time recently to really reflect on my experience.
Science aside, what I enjoyed most about the symposium were the mealtimes. Each day during lunch, the organizers provided a map showing assigned seats for all the invited speakers, who were spread out so that there were no more than two invited speakers per table. This made the brightest minds at the conference very accessible: I was able to seek out and speak with two accomplished scientists in depth about their research at these lunches. At other conferences I’ve attended, the scientific bigwigs either congregated together to the exclusion of us lesser mortals (as with smaller conferences) or were so busy and spread out that you were lucky to see them anywhere but delivering a keynote address (as with big conferences). Spreading the invited speakers out at lunchtime was genius and made the whole experience much better.
In terms of cool science, I was intrigued by Jonathan Staley’s work on fidelity mechanisms in RNA splicing, particularly the fact that suboptimal splicing substrates can collect in the cytoplasm and undergo translation. I also enjoyed Alain Laederach’s presentation on riboSNitches and how he used exhaustive search approaches to identify mutations whose effect on RNA structure could be counteracted by another mutation.