Thursday, October 8, 2020

boxplot

Boxplots are a fast way to judge spread, shape, and center. Also called box-and-whisker plots, they depict the five-number summary (minimum, lower quartile, median, upper quartile, maximum) in a structured shape.
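As a quick illustration of the five-number summary itself, R's fivenum() returns the minimum, lower hinge, median, upper hinge, and maximum of a vector. The sketch below applies it to spray A from the built-in InsectSprays data set; the variable name spray_a is only for illustration.

```r
# Counts for spray A from the built-in InsectSprays data set
spray_a <- InsectSprays$count[InsectSprays$spray == "A"]

# Minimum, lower hinge, median, upper hinge, maximum
print(fivenum(spray_a))
```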

Use a boxplot when you want a visual summary of a data set and you don't have (or don't want to show) many data points. I think the main drawback of boxplots is that you may need to explain to people what they are seeing, unlike a histogram or bar chart, which are ubiquitous in public culture.

The boxplot function has a pile of options, but you only need a formula (here count ~ spray) and the data argument to show something. The horizontal = TRUE option just turns the boxes on their side.

> boxplot(count ~ spray, data = InsectSprays, horizontal = TRUE)

The summary values are represented by the lines in and around each box.

min - the end of the left whisker (the smallest value that is not an outlier). Insecticides A and B both have a min of 7.

IQR - the box itself, which spans the lower to the upper quartile. Insecticide F has the widest IQR, and insecticide D has the narrowest.

median - the heavy line within the box

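max - the end of the right whisker (the largest value that is not an outlier). Insecticide D's whisker ends at 6, just past the upper edge of its box at 5, which is also where its median sits.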

outliers - circles beyond the whiskers, which by default are points more than 1.5 times the IQR away from the box (C and D in this plot each have one outlier)
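To check these readings against the actual numbers, here is a minimal sketch. It assumes the built-in InsectSprays data set and relies on boxplot() returning its statistics as a list; plot = FALSE skips the drawing. The names bp, stats, box_width, and outliers are just for illustration.

```r
# Compute the boxplot statistics without drawing the plot
bp <- boxplot(count ~ spray, data = InsectSprays, plot = FALSE)

# One column per spray; rows run from the lower whisker to the upper whisker
stats <- bp$stats
colnames(stats) <- bp$names
rownames(stats) <- c("lower whisker", "lower hinge", "median",
                     "upper hinge", "upper whisker")
print(stats)

# Width of each box (upper hinge minus lower hinge)
box_width <- stats["upper hinge", ] - stats["lower hinge", ]
print(box_width)

# Outliers, paired with the spray each one belongs to
outliers <- data.frame(spray = bp$names[bp$group], count = bp$out)
print(outliers)
```

The whisker rows give the smallest and largest values that are not flagged as outliers, so they can differ from the true min and max of a group.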


