Thursday, August 20, 2020

hist

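Create a histogram. The hist function is part of base R (it lives in the graphics package, which is loaded by default). With hist, the only param you need (out of about 20) is x, a numeric vector of the values to plot. A data frame will not work directly; pass one of its numeric columns instead.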
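Here is a minimal sketch of the simplest call. It uses faithful, a data frame that ships with R, purely as an illustration (it is not the data set used later in this post), and shows that hist wants a single numeric column rather than the whole data frame.

```r
# faithful is a built-in data frame with two numeric columns:
# eruptions and waiting.
str(faithful)

# Pass one numeric column; every other parameter keeps its default.
hist(faithful$eruptions)

# Passing the whole data frame is an error, because a data frame
# is not a numeric vector. try() lets the script keep running.
result <- try(hist(faithful), silent = TRUE)
failed <- inherits(result, "try-error")
print(failed)
```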

There are other histogram functions available from other packages (histogram in the lattice package, for example). Regardless of which you choose, the advantages and disadvantages are the same.

Use a histogram when you want to view the frequency of occurrence of something across a set of sequential bins. Another way of saying it is that a histogram shows the distribution of a single variable. For example, you might count silt depth measurements that fall from shore to 3', from 3' to 6', from 6' to 9', and so on. A histogram is distinguished from a bar graph, which compares values across separate categories rather than across ranges of one continuous variable.

Histogram Weakness

You must take care determining the bin size for your histogram. A poorly chosen bin size can give a misleading picture: bins that are too wide hide features of the distribution, and bins that are too narrow make it look noisy. With hist, experiment with the breaks parameter until the graph shows the information you need.
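One way to experiment is to draw the same data with several breaks settings and compare. The sketch below uses the built-in precip data set; the values 40 and seq(0, 70, by = 10) are arbitrary picks for illustration, not recommendations.

```r
# Default: hist picks the bins using the "Sturges" rule.
h_default <- hist(precip)

# A single number is a suggested bin count (R may adjust it).
h_fine <- hist(precip, breaks = 40)

# A numeric vector gives the exact bin edges. The edges must
# cover the whole range of the data.
h_wide <- hist(precip, breaks = seq(0, 70, by = 10))

# Compare how many bins each call actually produced.
bin_counts <- c(default = length(h_default$counts),
                fine    = length(h_fine$counts),
                wide    = length(h_wide$counts))
print(bin_counts)
```

Passing a vector of edges is the option to reach for when you need specific bin boundaries.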

R Example

This example shows average precipitation, in inches, for each of 70 cities in the U.S. and Puerto Rico. The precip data set is built into R.

hist(precip, breaks = 25)
[Figure: histogram of precip in US cities]

The breaks parameter asks hist for 25 bins. Note that if you count the bins, you don't get 25. When breaks is a single number, R treats it as a suggestion only and passes it to pretty(), which chooses round-numbered bin edges that cover the data. Anyway, four cities fall in the lowest bin, and one city had rainfall over 65 inches.
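To check what hist actually did with breaks = 25, you can look at the object it returns instead of counting bars by eye. This is a small sketch; the seq() call at the end is just one possible way to force exactly 25 equal-width bins and is not from the original example.

```r
# plot = FALSE returns the histogram object without drawing it.
h <- hist(precip, breaks = 25, plot = FALSE)

print(h$breaks)          # the bin edges R settled on
print(length(h$counts))  # the number of bins actually used

# hist gets those edges from pretty(), using the data range
# and the suggested bin count.
edges <- pretty(range(precip), n = 25, min.n = 1)
print(isTRUE(all.equal(h$breaks, edges)))

# To get exactly 25 bins, supply the 26 edges yourself.
exact_edges <- seq(min(precip), max(precip), length.out = 26)
h_exact <- hist(precip, breaks = exact_edges, plot = FALSE)
print(length(h_exact$counts))
```

The trade-off with exact edges is that they land on awkward values, which is why R prefers the round ones.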

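You can have hist show frequency (counts, the default) or probability density by setting freq = FALSE. With equal-width bins, the shape of the graph stays the same; only the left-hand scale changes.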
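A quick sketch of the two scales, using the same precip call as above. The check at the end shows what the density scale means: the bar areas add up to 1.

```r
# Default: the y-axis shows counts per bin.
h_freq <- hist(precip, breaks = 25)

# freq = FALSE: the y-axis shows probability density.
h_dens <- hist(precip, breaks = 25, freq = FALSE)

# The bins and counts are identical; only the plotted scale differs.
same_bins <- identical(h_freq$breaks, h_dens$breaks)
print(same_bins)

# Bar area = density * bin width; the areas sum to 1.
total_area <- sum(h_dens$density * diff(h_dens$breaks))
print(total_area)
```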
