Box plot

A box plot is an efficient way to summarize the distribution of a variable. By default, the command graph box depicts:

  • the median
  • the 25th and the 75th percentile
  • the adjacent values (the largest and smallest observations that lie no more than 1.5 times the interquartile range beyond the nearest quartile)
  • any remaining observations beyond the adjacent values (outside values, i.e. outliers), plotted as individual points

* Box plots
sysuse auto.dta, clear 

// graph box works much like graph bar:
// the command follows the same structure of yvars and over().
// By default it shows the median, the 25th and 75th percentiles,
// the upper and lower adjacent values, and the outside values.
graph box price 
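
// A rough sketch (not part of graph box itself) of how the adjacent values
// for price could be reproduced by hand. Assumption: the quartiles returned
// by summarize, detail match the ones graph box draws. The local macro names
// (p25, p75, iqr) are illustrative only.
quietly summarize price, detail
local p25 = r(p25)
local p75 = r(p75)
local iqr = `p75' - `p25'
// largest observation no more than 1.5*IQR above the 75th percentile
quietly summarize price if price <= `p75' + 1.5*`iqr'
display "upper adjacent value: " r(max)
// smallest observation no more than 1.5*IQR below the 25th percentile
quietly summarize price if price >= `p25' - 1.5*`iqr'
display "lower adjacent value: " r(min)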

// use the over() option to draw one box per group; the total suboption adds a box for all observations
graph box price, over(foreign, total)
// suppress the outside values (outliers) with nooutsides
graph box price, over(foreign, total) nooutsides
// remove the note that Stata adds when outside values are excluded
graph box price, over(foreign, total) nooutsides note("")
// relabel the last category (the total box, here the third one)
graph box price, over(foreign, total relabel(3 "All cars")) nooutsides note("") 
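// add text at arbitrary positions using the plot's coordinate system:
// text(y x "string"), with y in units of price and x along the categorical axis (here apparently on a 0-100 scale)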
graph box price, over(foreign, total relabel(3 "All cars")) nooutsides note("") ///
text(9100 14 "You can add") text(10000 50 "text") text(9100 86 "here")