Recently I came across a nice exploratory data analysis technique called the stemplot. I like it because it seems such a low-tech way to explore data. Its advantage is that it retains the original data values while still ordering and grouping them the way a histogram does.

In statistics, a stemplot (or stem-and-leaf plot) is a graphical display of quantitative data that is similar to a histogram and is useful for visualizing the shape of a distribution. Stemplots are generally associated with the Exploratory Data Analysis (EDA) ideas of John Tukey and the course Statistics in Society (NDST242) of the Open University, although in fact Arthur Bowley did something very similar in the early 1900s.

Typically, the leaf contains the last digit of the number and the stem contains all of the other digits. In the case of very large or very small numbers, the data values may be rounded to a particular place value (such as the hundreds place) that will be used for the leaves. The remaining digits to the left of the rounded place value are used as the stems.
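
To make the stem/leaf split concrete, here is a minimal Python sketch of how such a plot could be built. The function name `stemplot` and its `leaf_unit` parameter are my own choices for illustration, not part of any library, and the sketch assumes a non-empty list of non-negative numbers. Each value is rounded to the leaf's place value, the last digit becomes the leaf, and the remaining digits become the stem.

```python
from collections import defaultdict


def stemplot(values, leaf_unit=1):
    """Return the text lines of a simple stem-and-leaf display.

    Illustrative sketch: assumes a non-empty list of non-negative numbers.
    leaf_unit is the place value of the leaf digit (1 = units, 100 = hundreds).
    """
    stems = defaultdict(list)
    for v in sorted(values):
        # Round to the leaf's place value, then split off the last digit.
        scaled = int(round(v / leaf_unit))
        stem, leaf = divmod(scaled, 10)
        stems[stem].append(leaf)

    lines = []
    # Include empty stems too, so gaps in the data stay visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(d) for d in stems.get(stem, []))
        lines.append(f"{stem:>3} | {leaves}")
    return lines


if __name__ == "__main__":
    for line in stemplot([12, 15, 21, 27, 27, 35, 48]):
        print(line)
    #   1 | 25
    #   2 | 177
    #   3 | 5
    #   4 | 8
```

For large numbers, passing `leaf_unit=100` rounds each value to the hundreds place first, so 1234 ends up as stem 1 with leaf 2. Note that rounding discards the lower digits, so in that case the plot no longer retains the exact original values.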

Tags: data mining, statistics