Let H be the distance between the 75th and 25th percentiles of the data (the interquartile range). The outside values are those values that are more than 1.5H but no more than 3H above the upper quartile, or more than 1.5H but no more than 3H below the lower quartile. The far outside values are those values that are more than 3H above the upper quartile or more than 3H below the lower quartile.
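The 1.5H and 3H rules can be sketched in a few lines of Python. This is only an illustration, not the procedure PROPHET itself uses: the function name `classify_outside_values` and the sample data are hypothetical, the quartiles come from the standard library's "inclusive" interpolation method (other quartile conventions can give slightly different fences), and a value lying exactly 3H from a quartile is assumed to count as outside rather than far outside.

```python
from statistics import quantiles


def classify_outside_values(data):
    """Split data into outside and far outside values (1.5H and 3H rules)."""
    # quantiles(..., n=4) returns the three quartile cut points.
    # Assumption: the "inclusive" interpolation method; other conventions exist.
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    h = q3 - q1  # H: distance between the 75th and 25th percentiles

    outside, far_outside = [], []
    for x in data:
        # Distance beyond the nearer quartile (0 if x lies inside the box).
        excess = max(x - q3, q1 - x, 0)
        if excess > 3 * h:
            far_outside.append(x)
        elif excess > 1.5 * h:
            outside.append(x)
    return outside, far_outside


# Hypothetical sample: quartiles are 3.5 and 8.5, so H = 5.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 40]
outside, far_outside = classify_outside_values(sample)
print(outside)      # [20]  (11.5 above the upper quartile: between 1.5H and 3H)
print(far_outside)  # [40]  (31.5 above the upper quartile: more than 3H)
```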
If there are only a few outliers, the mean (marked +) is close to the median (the center line in the box), and the median line evenly divides the box, then there may be anomalous data values in a sample that otherwise comes from a normal or near-normal distribution. If there are numerous outliers to one side or the other of the box, or the median line does not evenly divide the box, then the population distribution from which the data were sampled may be skewed. If there are numerous outliers on both sides of the box, the population distribution from which the data were sampled may be heavy-tailed.
Here is an example of a boxplot with a possible outlier at the lower range of the data:
Here is a hypothetical example of a boxplot for data sampled from a mixture of two normals with the same mean but different variances:
Here is a hypothetical example of a boxplot for data sampled from a mixture of two normals with the same variance but different means:
Here is a hypothetical example of a boxplot for data sampled from a normal distribution truncated at the left:
Here is a hypothetical example of a boxplot for data sampled from a normal distribution truncated at the right:
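The four hypothetical cases above can be imitated with simulated data. The following is a rough sketch, not the data behind the original figures: the sample size, mixing proportions, means, standard deviations, truncation points, and the helper names `count_outside` and `truncated_left` are all assumptions chosen for illustration. It counts how many values fall more than 1.5H beyond each quartile, which is what would be drawn as individual points on each side of the box.

```python
import random
from statistics import quantiles


def count_outside(data):
    """Return (below, above): counts of values more than 1.5H beyond each quartile."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    h = q3 - q1
    below = sum(1 for x in data if x < q1 - 1.5 * h)
    above = sum(1 for x in data if x > q3 + 1.5 * h)
    return below, above


def truncated_left(rng, cutoff, size):
    """Draw standard normal values by rejection, keeping only those >= cutoff."""
    kept = []
    while len(kept) < size:
        x = rng.gauss(0, 1)
        if x >= cutoff:
            kept.append(x)
    return kept


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 2000                # assumed sample size

# Same mean, different variances (assumed: 80% sd 1, 20% sd 4).
scale_mixture = [rng.gauss(0, 1) if rng.random() < 0.8 else rng.gauss(0, 4)
                 for _ in range(n)]

# Same variance, different means (assumed: equal parts, means -2 and 2).
location_mixture = [rng.gauss(-2, 1) if rng.random() < 0.5 else rng.gauss(2, 1)
                    for _ in range(n)]

# Normal truncated at the left (assumed cutoff -0.5), and its mirror image,
# which is a normal truncated at the right (cutoff 0.5).
left_truncated = truncated_left(rng, -0.5, n)
right_truncated = [-x for x in truncated_left(rng, -0.5, n)]

for name, data in [("scale mixture", scale_mixture),
                   ("location mixture", location_mixture),
                   ("left-truncated", left_truncated),
                   ("right-truncated", right_truncated)]:
    below, above = count_outside(data)
    print(f"{name}: {below} outside below, {above} outside above")
```

Under these assumed settings, the scale mixture should show outside values on both sides of the box (the heavy-tailed pattern), while each truncated sample should show them only on the untruncated side.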
©1996 BBN Corporation All rights reserved.