I typically use R for visualization, and I am familiar with a few basic visualization types. The simplest way to plot the distribution of a 1-dimensional data set is the histogram, available via the function hist. However, sometimes I prefer plotting a curve rather than the bars that hist creates. So given some 1-dimensional data, I typically end up using some combination of shell commands, including uniq -c, to convert the data to X and Y values that can be used with R's plot function.
However, in the last couple of days I learned a nifty little shortcut that will save me the trouble in the future. Rather than pre-processing the data, I can store the histogram as an object and retrieve the X and Y values from that object. For example, this week I wanted to look at the distribution of read lengths in some Illumina data I had just cleaned. I used the following commands to plot the distribution as a curve rather than a histogram. No pre-processing required!
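    # Read the lengths (one value per line) into a data frame
    lengths <- read.table("lengthdist.dat")
    # Compute the histogram without drawing it
    h <- hist(lengths$V1, plot=FALSE)
    # Plot bin midpoints against bin densities as a line
    plot(h$mids, h$density, type="l", col="red", main="", xlab="Read length", ylab="Density")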
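The object returned by hist holds more than the midpoints and densities: it also stores the raw counts and the bin boundaries. The sketch below uses simulated values (the name read_lengths and the numbers are made up for illustration, standing in for the real data file) to show how counts and density relate, and plots the counts so the y axis is a true frequency.

```r
# Simulated read lengths; hypothetical stand-in for real data
set.seed(42)
read_lengths <- round(rnorm(1000, mean = 90, sd = 5))

# Compute the histogram without drawing it
h <- hist(read_lengths, plot = FALSE)

# Bin widths, taken from consecutive break points
widths <- diff(h$breaks)

# density is the count per bin divided by (total count * bin width),
# so the area under the density curve sums to 1
density_check <- h$counts / (sum(h$counts) * widths)

# Plot raw counts as a curve
plot(h$mids, h$counts, type = "l", col = "red", main = "",
     xlab = "Read length", ylab = "Frequency")
```

If the curve should be comparable across data sets of different sizes, h$density is probably the better choice for the y values; for a plain tally per bin, h$counts does the job.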