A Box Plot helps teams see variation and compare process groups quickly. It is useful before statistical testing because it shows center, spread, overlap, and unusual observations.

Definition

A Box Plot, also called a box-and-whisker plot, is a graphical display of a data distribution. The box usually shows the interquartile range (IQR) from the first quartile to the third quartile, and the line inside the box shows the median. The whiskers extend to the most extreme observations allowed by a defined rule, most commonly the farthest points that lie within 1.5 times the IQR of the box edges. Points beyond the whiskers are plotted individually as potential outliers.
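
The numbers behind a box plot can be computed directly. The sketch below is a minimal illustration, assuming the common 1.5 x IQR whisker rule and the "inclusive" quartile convention of Python's `statistics.quantiles`. Other tools may use slightly different quartile definitions. The function name `box_plot_stats` and the cycle-time values are hypothetical.

```python
import statistics

def box_plot_stats(values, whisker_factor=1.5):
    """Return the values a box plot draws, assuming the 1.5 x IQR whisker rule."""
    data = sorted(values)
    # "inclusive" is one of several quartile conventions; results can
    # differ slightly between software packages.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - whisker_factor * iqr
    upper_fence = q3 + whisker_factor * iqr
    # Whiskers end at the most extreme observations inside the fences.
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": [x for x in data if x < lower_fence or x > upper_fence],
    }

# Hypothetical cycle times in minutes (illustrative data only)
cycle_times = [12, 14, 15, 15, 16, 17, 18, 19, 21, 40]
stats = box_plot_stats(cycle_times)
print(stats)
```

With this data the box runs from 15.0 to 18.75 with a median of 16.5, the whiskers end at 12 and 21, and the value 40 is flagged as a potential outlier to investigate.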

Box plots are especially useful for comparing multiple groups because they show center, spread, skew, and unusual values in a compact form. They support exploratory analysis before hypothesis testing, ANOVA, regression, capability studies, or root cause investigation.

History

Box plots were popularized by statistician John Tukey as part of exploratory data analysis. The intent was to give analysts a simple visual method to understand data shape and compare groups without relying only on averages and tables.

In quality improvement, box plots became common because process data often varies by machine, shift, supplier, product family, operator, material lot, region, or method. A box plot gives teams a quick way to see whether groups behave similarly or differently.

When to Use

Use a Box Plot when comparing continuous data across groups or studying the spread of a single distribution. Good uses include cycle time by shift, dimension by machine, strength by supplier, wait time by clinic, claim duration by region, and test score by training method.

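Do not rely on a box plot alone when sample sizes are very small or when time order matters. With only a handful of points per group, the quartiles are unstable, so plot the individual values as well. For time-ordered behavior, use a run chart or control chart. For detailed distribution shape, use a histogram as a companion view.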
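
The time-order limitation can be illustrated with a short sketch. It assumes hypothetical daily measurements with a steady drift and uses a helper named `five_number_summary` that is introduced here only for illustration. The drifting series and a shuffled copy produce the same summary, so their box plots would look identical.

```python
import random
import statistics

def five_number_summary(values):
    """Minimum, quartiles, and maximum (inclusive quartile convention)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))

# Hypothetical daily measurements with a steady upward drift
drifting = [10.0 + 0.5 * day for day in range(20)]

# The same values in random order, so the sequence shows no trend
shuffled = drifting[:]
random.Random(0).shuffle(shuffled)

# Count of day-to-day increases: only the time order reveals the drift
increases = sum(b > a for a, b in zip(drifting, drifting[1:]))

same_box = five_number_summary(drifting) == five_number_summary(shuffled)
print(same_box)  # True
```

Because the summary depends only on the sorted values, the drift is visible in a run chart but not in the box plot.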

Step-by-Step

  1. Define the response. Choose the continuous metric to analyze, such as time, length, weight, cost, temperature, or score.
  2. Select comparison groups. Group the data by a factor such as machine, shift, supplier, method, product, site, or period.
  3. Check data quality. Verify the measurement system and units, and check for missing values and obvious data entry errors.
  4. Create the plot. Show median, quartiles, whiskers, and potential outliers for each group.
  5. Compare center. Look at median differences across groups.
  6. Compare spread. Review interquartile range and whisker length to understand variation.
  7. Look for skew and outliers. Unusual points or asymmetric whiskers may indicate special causes, mixed populations, or process issues.
  8. Follow with analysis. Use process knowledge, stratification, hypothesis testing, ANOVA, or regression to confirm what the visual suggests.
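
The grouping and comparison steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the records, machine labels, and diameters are hypothetical, missing values are marked with `None`, and quartiles use the "inclusive" convention of `statistics.quantiles`.

```python
import statistics
from collections import defaultdict

# Hypothetical records: (machine, diameter in mm); None marks a missing value
records = [
    ("A", 10.01), ("A", 10.03), ("A", 9.98), ("A", 10.00), ("A", 10.02),
    ("B", 10.10), ("B", 10.25), ("B", 9.95), ("B", 10.18), ("B", None),
    ("B", 10.05),
]

# Steps 2-3: group by the chosen factor and drop missing values
groups = defaultdict(list)
for machine, diameter in records:
    if diameter is not None:
        groups[machine].append(diameter)

# Steps 4-6: summarize center (median) and spread (IQR) for each group
summary = {}
for machine, values in groups.items():
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    summary[machine] = {"n": len(values), "median": median, "iqr": q3 - q1}

for machine, row in sorted(summary.items()):
    print(f"{machine}: n={row['n']} median={row['median']:.3f} IQR={row['iqr']:.3f}")
```

In this illustrative data, machine B has both a higher median and a wider IQR than machine A. To draw the plot itself, a plotting library can be used. For example, matplotlib provides `matplotlib.pyplot.boxplot`, which accepts a list of value lists, one per group.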

Examples

  • Machine comparison: A box plot of part diameter by machine shows one machine with a higher median and wider spread, guiding maintenance and setup investigation.
  • Supplier study: Material strength is compared across four suppliers. One supplier has similar median strength but much larger variation.
  • Service wait time: Clinic wait times by day of week show Monday has a wider distribution and more outliers, prompting staffing review.
  • Training method: Assessment scores by training method show higher median and lower variation for hands-on instruction.
  • Cycle time analysis: A box plot by product family shows that one family drives most long-cycle outliers.

Common Pitfalls

  • Ignoring sample size. A box plot looks equally tidy whether it summarizes five points or five hundred, so state the sample size for each group and treat small samples with caution.
  • Overreacting to outliers. Outliers are signals to investigate, not automatic errors to delete.
  • Relying only on averages. The median and spread often tell a different story than the mean alone.
  • Forgetting time order. A box plot hides sequence, trends, shifts, and cycles.
  • Comparing mixed groups. If groups combine different products or conditions, conclusions can be distorted.
  • Assuming statistical significance. Visual differences should be confirmed with appropriate statistical methods when decisions require proof.
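
One way to check whether a visual difference in medians could be due to chance is a permutation test. The sketch below is an illustration only, not a prescribed method. The function name `permutation_test_median`, the number of permutations, the seed, and the wait-time data are all assumptions made for this example. Other tests may suit a given data set better.

```python
import random
import statistics

def permutation_test_median(group_a, group_b, n_permutations=5000, seed=0):
    """Approximate two-sided p-value for a difference in medians."""
    rng = random.Random(seed)
    observed = abs(statistics.median(group_a) - statistics.median(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random and recompute the difference
        rng.shuffle(pooled)
        diff = abs(
            statistics.median(pooled[:n_a]) - statistics.median(pooled[n_a:])
        )
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate away from exactly zero
    return (extreme + 1) / (n_permutations + 1)

# Hypothetical clinic wait times in minutes (illustrative data only)
monday = [32, 41, 38, 55, 47, 60, 44, 39]
tuesday = [30, 28, 35, 33, 29, 31, 36, 34]

p_value = permutation_test_median(monday, tuesday)
print(f"approximate p-value: {p_value:.4f}")
```

A small p-value suggests the median difference is unlikely under random group assignment. It does not explain the cause, so process knowledge and follow-up investigation are still needed.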
