R: plot histogram as a curve

I typically use R for visualization, and I am familiar with a few basic plot types. The simplest way to plot the distribution of a 1-dimensional data set is a histogram, available via the function hist. Sometimes, though, I prefer a curve to the bars that hist draws. So given some 1-dimensional data, I typically end up using some combination of the shell commands sort and uniq -c to convert the data to X and Y values that can be passed to R's plot function.

However, in the last couple of days I learned a nifty little shortcut that will save me the trouble in the future. Rather than pre-processing the data, I can store the histogram as an object and retrieve the X and Y values from that object. For example, this week I wanted to look at the distribution of read lengths in some Illumina data I had just cleaned. I used the following commands to plot the distribution as a curve rather than a histogram. No pre-processing required!

lengths <- read.table("lengthdist.dat")
h <- hist(lengths$V1, plot=FALSE)
plot(h$mids, h$density, type="l", col="red", main="", xlab="Read length", ylab="Density")
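
For anyone who wants to try this without a data file, here is a minimal sketch of the same idea. The simulated read lengths, the variable name read_lengths, and the one-unit-wide breaks are my own assumptions for illustration, not part of the original analysis. It plots h$counts (raw counts per bin) rather than h$density (scaled so the total area is 1), and overlays the result of table(), which is roughly the in-R equivalent of sort and uniq -c for discrete values.

```r
# Simulated read lengths stand in for lengthdist.dat (hypothetical data).
set.seed(42)
read_lengths <- sample(50:100, size = 1000, replace = TRUE)

# plot=FALSE returns the binning as an object without drawing anything.
# Breaks at half-integers give one bin per integer length (an assumption
# that suits discrete data like read lengths).
h <- hist(read_lengths, breaks = seq(49.5, 100.5, by = 1), plot = FALSE)

# h$mids holds the bin midpoints; h$counts holds the raw count per bin.
plot(h$mids, h$counts, type = "l", col = "red",
     main = "", xlab = "Read length", ylab = "Frequency")

# table() counts occurrences of each distinct value, much like sort | uniq -c.
tab <- table(read_lengths)
lines(as.numeric(names(tab)), as.vector(tab), col = "blue", lty = 2)
```

Use h$counts when the Y axis should read as a frequency, and h$density when curves from data sets of different sizes need to be compared on the same scale.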
