Proficiency with a command-line text editor is a valuable skill for a biologist. Computers that have enough resources to tackle large-scale bioinformatics problems often don’t have a graphical interface, so the only way to create and modify files is to use an editor like nano. Programmers have very strong feelings about which editor is best, but it doesn’t really make a difference to the average biologist. The nano editor is probably the easiest for a beginner to learn.
When using any kind of text editor, I find that syntax highlighting is an extremely useful feature. It color-codes the text according to its function in the code, making it easy to browse through the code, understand what each statement does, and quickly identify errors. The text editors I mentioned above support syntax highlighting for common programming languages.
Using custom nano configuration files (.nanorc files), you can define a custom syntax highlighting scheme for a given file type. Since I do a lot of work with gene annotations, I created a custom .nanorc that defines syntax highlighting for GFF3 files. GFF3 files contain data (not code), but the syntax highlighting can still be useful. The color coding is still pretty simple at this point, but it adds another dimension when viewing the file.
Here is the .nanorc file (to be placed in your home directory)…
## GFF3 files
##
syntax "gff3" "\.gff3$"

## Directives
color brightblue "^##[a-z].*"

## Resolution symbol
color red,white "^###$"

## Reserved attributes
color brightblack "(ID|Name|Alias|Parent|Target|Gap|Derives_from|Note|Dbxref|Ontology_term|Is_circular)"

## Comments
color white "^#[^#].*"
…and here is an example GFF3 file, with and without highlighting.
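If you want to check which lines each rule picks up without opening nano, here is a minimal Python sketch that runs the same patterns over a few lines. The rule names, the `matching_rules` helper, and the sample lines are made up for illustration. Python's `re` module is not the regex engine nano uses, but these particular patterns only use syntax that both should treat the same way.

```python
import re

# Patterns copied from the .nanorc above, in the same order.
# The rule names are hypothetical labels, not part of nano's syntax.
RULES = [
    ("directive", r"^##[a-z].*"),
    ("resolution", r"^###$"),
    ("reserved_attribute",
     r"(ID|Name|Alias|Parent|Target|Gap|Derives_from|Note|Dbxref|Ontology_term|Is_circular)"),
    ("comment", r"^#[^#].*"),
]


def matching_rules(line):
    """Return the names of all rules whose pattern matches somewhere in the line."""
    return [name for name, pattern in RULES if re.search(pattern, line)]


# Made-up sample lines, one for each kind of GFF3 line the rules target.
SAMPLE = [
    "##gff-version 3",
    "# a free-text comment",
    "chr1\t.\tgene\t1000\t2000\t.\t+\t.\tID=gene1;Name=abc",
    "###",
]

if __name__ == "__main__":
    for line in SAMPLE:
        print(matching_rules(line), repr(line))
```

One thing this makes visible: the reserved-attribute pattern is not anchored to the attributes column, so a comment line such as `# Note: draft` matches both the attribute rule and the comment rule.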