Togaware DATA MINING
Desktop Survival Guide
by Graham Williams

Stem and Leaf Plots

A stem-and-leaf plot (http://en.wikipedia.org/wiki/Stem_and_leaf_diagram) is a simple textual plot of numeric data that is useful for getting an idea of the shape of a distribution. It is similar to the graphical histograms that we will see next, but is a quick and useful place to start for smaller datasets. A stem-and-leaf plot has the advantage of showing the actual data values in the plot rather than just a bar indicating frequency.
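To make the idea concrete, the following is a minimal sketch of building a stem-and-leaf display by hand. The vector x holds illustrative values only (it is not taken from the wine dataset), and the choice of the tens digit as the stem is an assumption made for this example; the built-in stem() function chooses its own stems and may lay the plot out differently.

```r
# Illustrative values only (not from the wine dataset).
x <- c(12, 15, 21, 23, 23, 28, 34, 37, 41)

# Assumed split for this sketch: stem = tens digit, leaf = units digit.
stems  <- x %/% 10
leaves <- x %% 10

# Collect the sorted leaves for each stem into a single string.
plot_rows <- tapply(leaves, stems, function(l) paste(sort(l), collapse = ""))

# Print one row per stem, for example "2 | 1338".
cat(paste(names(plot_rows), "|", plot_rows), sep = "\n")

# The built-in function gives a comparable display.
stem(x)
```

Because every leaf is a digit from the data, the original values can be read back from the display, which a histogram bar does not allow.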

In reviewing a stem-and-leaf plot we might look to see whether there is a clear central value or whether the data is widely spread out. We look at the spread to see whether it is roughly symmetric about the central value or skewed in one direction. We might also look for any data values that lie a long way from the bulk of the data.
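The same questions can also be checked numerically. The sketch below uses illustrative values only (not the wine dataset), with the last value deliberately placed far from the rest. Comparing the mean with the median as a hint of skew, and using the boxplot rule of 1.5 times the interquartile range to flag distant values, are assumptions chosen for this example rather than methods prescribed by the text.

```r
# Illustrative values only; the last value is deliberately far from the rest.
x <- c(70, 78, 85, 88, 90, 92, 95, 98, 101, 105, 112, 162)

# Central value and spread.
centre <- median(x)
spread <- IQR(x)

# A mean well above the median hints at a skew towards larger values.
skew_hint <- mean(x) - median(x)

# Values beyond 1.5 * IQR from the hinges (the rule used by boxplot()).
far_out <- boxplot.stats(x)$out

print(c(centre = centre, spread = spread, skew_hint = skew_hint))
print(far_out)
```

These summaries complement the plot rather than replace it: the stem-and-leaf display still shows where the individual values fall.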



> stem(wine$Magnesium)

  The decimal point is 1 digit(s) to the right of the |

   7 | 0
   7 | 888
   8 | 0000012444
   8 | 55555566666666666777888888888888899999
   9 | 0000112222233444444
   9 | 55566666666777778888888889
  10 | 000111111111222222233333444
  10 | 55666677778888
  11 | 00011122222233
  11 | 5566678889
  12 | 0001234
  12 | 678
  13 | 24
  13 | 69
  14 |
  14 |
  15 | 1
  15 |
  16 | 2

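The stem is to the left of the bar and the leaves are to the right. Here each stem appears on two rows: the first row holds the leaves 0 to 4 and the second holds the leaves 5 to 9.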
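How many rows are used for the stems can be adjusted through the scale argument of stem(). The following is a small sketch using illustrative values (not the wine dataset); the exact layout that results depends on the data, so no output is shown here.

```r
# Illustrative values only (not from the wine dataset).
x <- c(70, 78, 85, 88, 90, 92, 95, 98, 101, 105, 112, 162)

# Default plot length.
stem(x)

# A larger scale asks for a longer plot, spreading the leaves over more rows.
res <- stem(x, scale = 2)
```

A longer plot can help when many leaves are crowded onto a few rows.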

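Note the change in where the decimal point is in the next plot. For Magnesium the decimal point is one digit to the right of the bar, so 7 | 0 is read as 70. For Alcohol it is one digit to the left of the bar, so 110 | 3 is read as 11.03. The Alcohol stems also step by two, so each row covers two adjacent stems, which is why the leaves within a row are not in sorted order.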

> stem(wine$Alcohol)

  The decimal point is 1 digit(s) to the left of the |

  110 | 3
  112 |
  114 | 1566
  116 | 1245669
  118 | 1224476
  120 | 000478888867
  122 | 01255599993346777777
  124 | 22235711238
  126 | 004790022779
  128 | 124556783369
  130 | 355555578116677
  132 | 034478902469
  134 | 0015889900126688
  136 | 2347891123345678
  138 | 23346678804
  140 | 266002369
  142 | 01223047889
  144 |
  146 | 5
  148 | 3
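The arithmetic behind the decimal point message can be sketched as a small helper. The function decode_leaf and its shift argument are hypothetical names introduced for this illustration: shift is the number of digits the decimal point lies to the right of the bar, and is negative when it lies to the left. For rows that cover two stems, as in the Alcohol plot, the sketch assumes the leaf belongs to the stem that is printed.

```r
# Hypothetical helper: turn a stem, a leaf and the decimal point position
# into the data value they represent.
# shift > 0: decimal point is shift digit(s) to the right of the bar.
# shift < 0: decimal point is -shift digit(s) to the left of the bar.
decode_leaf <- function(stem, leaf, shift) {
  (stem * 10 + leaf) * 10^(shift - 1)
}

# First row of the Magnesium plot: 7 | 0, one digit to the right.
magnesium_first <- decode_leaf(7, 0, 1)      # 70

# First row of the Alcohol plot: 110 | 3, one digit to the left.
alcohol_first <- decode_leaf(110, 3, -1)     # 11.03

print(c(magnesium_first, alcohol_first))
```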

Copyright © 2004-2006 [email protected]