Validating genome annotations revisited: gt speck

I have written previously (here and here) about my frustrations with the lack of validation tools for GFF3 data. I had even begun prototyping my own system for writing schemas that could be used to validate GFF3 files (much like the XML schemas used to validate XML files). I hadn’t gotten very far, though, before the talented folks behind GenomeTools released the gt speck tool. The speck tool provides a domain-specific language in which individual users, groups, or communities of practice can explicitly define and enforce a set of rules regarding how annotations should be formatted, how features should be related, and which attributes are required. Most scientists and software tools have requirements beyond what is detailed in the GFF3 specification, but in most cases these requirements are implicit and inconsistently enforced. I look forward to integrating the speck tool into my annotation quality control workflow, and I am confident that the genomics community stands to benefit greatly from wide adoption of tools like this.
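To make the idea of explicit rules concrete, here is a minimal Python sketch of two requirements that go beyond the GFF3 specification: every gene must carry a Name attribute, and every mRNA must be the child of a gene. This is not the gt speck language and does not use GenomeTools at all; the function names, the two rules, and the sample data are assumptions chosen purely for illustration.

```python
# Illustrative only: a hypothetical rule checker, NOT the gt speck DSL.
# The two rules and the sample records below are assumptions for demonstration.

def parse_gff3(text):
    """Parse GFF3 text into feature dicts (simplified: no URL-unescaping,
    no multi-parent handling, no FASTA section)."""
    features = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.split("\t")
        if len(cols) != 9:
            raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
        attrs = dict(pair.split("=", 1) for pair in cols[8].split(";") if pair)
        features.append({
            "seqid": cols[0],
            "type": cols[2],
            "start": int(cols[3]),
            "end": int(cols[4]),
            "attrs": attrs,
        })
    return features


def check_rules(features):
    """Return human-readable descriptions of every rule violation."""
    by_id = {f["attrs"]["ID"]: f for f in features if "ID" in f["attrs"]}
    problems = []
    for f in features:
        attrs = f["attrs"]
        # Assumed rule 1: genes must have a Name attribute.
        if f["type"] == "gene" and "Name" not in attrs:
            problems.append(f"gene {attrs.get('ID', '?')} lacks a Name attribute")
        # Assumed rule 2: an mRNA's Parent must exist and be a gene.
        if f["type"] == "mRNA":
            parent = by_id.get(attrs.get("Parent"))
            if parent is None or parent["type"] != "gene":
                problems.append(f"mRNA {attrs.get('ID', '?')} is not the child of a gene")
    return problems


SAMPLE = "\n".join([
    "##gff3-version 3",
    "chr1\tdemo\tgene\t100\t900\t.\t+\t.\tID=gene1",
    "chr1\tdemo\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1",
    "chr1\tdemo\tmRNA\t950\t1200\t.\t+\t.\tID=mrna2",
])

problems = check_rules(parse_gff3(SAMPLE))
for problem in problems:
    print(problem)
```

Both sample violations are perfectly legal GFF3, which is exactly the gap described above: the file passes a specification check but still breaks conventions that downstream users and tools may silently depend on.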
