by George Taniwaki
Often, one has a lot of data to display that is grouped in pairs, such as men vs. women. Further, we want to show both members of each pair, but comparing them to each other is not the main goal. Instead, we are more interested in comparing different groups on the same side of the pair than in comparing the two sides within a group. For instance, our groups could be age ranges, and we are more interested in comparing women aged 20-24 to women aged 25-29 than in comparing women aged 20-24 to men aged 20-24.
The diverging stacked bar chart is a very good way to display a pair of values next to each other. To allow easier comparison of the length of adjacent bars, there is usually no white space between them. To allow easier comparison between the left and right bar, they usually have no space between them but are different colors. The values for the bars running to the left are not negative. They are positive, just like the values on the right. The left and right bars are paired and measure the values for two different related groups.
The most common use of the diverging stacked bar chart is to display age distribution of a population broken down by gender. This type of chart is often called a population pyramid. An example of a population pyramid using data from the 2000 U.S. Census is shown in Figure 1.
The population pyramid is a special case of the diverging stacked bar chart. Notice that each of the horizontal bars has the same thickness and covers the same age range (except the oldest group). Thus, the height of each bar represents the same number of years, and the stack of bars forms a vertical axis showing age. Similarly, the area of each bar represents the proportion of the population in that age group, and the area of all the bars shows the total size of the population. A well-drawn population pyramid shows three dimensions at once: age, gender, and counts.
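To make the geometry concrete, here is a minimal Python sketch of how the bar extents for a population pyramid could be computed. The function names, the choice to put males on the left, and the counts are all my own assumptions for illustration; they are not Census figures. Note that the left bars get negative plotting positions only so they extend leftward. The values they represent, and the axis labels, stay positive.

```python
from typing import List, Tuple

def pyramid_bars(age_groups: List[str],
                 male: List[float],
                 female: List[float]) -> List[Tuple[str, float, float]]:
    """Return (age_group, left_extent, right_extent) as percent of total population.

    Assumption: males are drawn to the left of zero and females to the right.
    The left extent is negated only for plotting; the underlying value is positive.
    """
    total = sum(male) + sum(female)
    rows = []
    for label, m, f in zip(age_groups, male, female):
        m_pct = 100 * m / total
        f_pct = 100 * f / total
        rows.append((label, -m_pct, f_pct))
    return rows

def tick_label(x: float) -> str:
    """Format an axis position as a positive percentage, whichever side it is on."""
    return f"{abs(x):g}%"

# Made-up counts for three age groups (illustrative only, not real data)
groups = ["0-4", "5-9", "10-14"]
rows = pyramid_bars(groups, male=[30, 25, 20], female=[28, 24, 23])
for label, left, right in rows:
    print(f"{label:>6}  {tick_label(left):>10} | {tick_label(right)}")
```

Because every bar is expressed as a share of the same total, the bar lengths on both sides sum to 100 percent, which is what lets the overall area stand for the whole population.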
The shape of a population pyramid tells a lot about the population growth of a group (which itself is a result of economic and political conditions that affect fertility, infant survival, immigration and emigration, and longevity). Figure 2 shows the four commonly seen shapes for a population pyramid.
The two triangles at the right (labeled stage 1 and stage 2) describe a group where a combination of high birth rates, high emigration, and high mortality causes the number of young to greatly exceed the number of old. Several countries in Sub-Saharan Africa and India have population pyramids of this shape.
The flatter shape (labeled stage 3) describes a group where births, immigration/emigration, and mortality are in balance. Most of the developing countries and the U.S. have population pyramids of this shape.
Finally, the egg-shaped pyramid (labeled stage 4) has a base that is smaller than the center. This describes a group where a combination of low birth rate, high immigration rate, and low mortality causes a bulge in the middle. If the fertility rate is below the replacement rate (about 2.1 children per woman over her lifetime), then the population is growing older and may even be shrinking. Nearly all of the developed countries and China have population pyramids of this shape.
Figure 2. Four commonly seen shapes for a population pyramid. Image from Wikipedia
If you would like to explore population pyramids on a national, state, and metro area basis, go to http://www.censusscope.org/us/chart_age.html
Diverging stacked bar charts can also be used in cases where there are more than two categories. In a paper presented at the 2011 Joint Statistical Meetings, Naomi Robbins and Richard Heiberger suggest that Likert scale data should be presented using this method. If the questionnaire uses the standard 5-point scale, they argue that the “Strongly disagree,” “Disagree,” and half of the “Neither agree nor disagree” counts should be shown on the left bar. The counts for “Strongly agree,” “Agree,” and the other half of “Neither agree nor disagree” should be shown on the right bar. An example is shown in Figure 3.
Figure 3. Diverging stacked bar chart used to display Likert scale data. Image from 2011 JSM
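The split described above can be sketched in a few lines of Python. This is only my own illustration of the idea, not the authors' code: the function name, the segment layout, and the response counts are hypothetical. Each segment is returned as a starting position and a width, with the whole stacked bar shifted so that the middle of the neutral segment sits at zero.

```python
from typing import Dict, List, Tuple

# Response levels of a standard 5-point scale, ordered from most negative
# to most positive.
LEVELS = [
    "Strongly disagree",
    "Disagree",
    "Neither agree nor disagree",
    "Agree",
    "Strongly agree",
]

def diverging_segments(counts: Dict[str, float]) -> List[Tuple[str, float, float]]:
    """Return (level, start, width) for one question, centered on zero.

    Disagreement plus half of the neutral count extends left of zero;
    agreement plus the other half of the neutral count extends right.
    """
    neutral = counts["Neither agree nor disagree"]
    left_total = counts["Strongly disagree"] + counts["Disagree"] + neutral / 2
    segments = []
    start = -left_total  # leftmost edge of the whole stacked bar
    for level in LEVELS:
        width = counts[level]
        segments.append((level, start, width))
        start += width  # the next segment begins where this one ends
    return segments

# Hypothetical counts for a single question (illustrative only)
example = {
    "Strongly disagree": 10,
    "Disagree": 20,
    "Neither agree nor disagree": 30,
    "Agree": 25,
    "Strongly agree": 15,
}
segments = diverging_segments(example)
for level, start, width in segments:
    print(f"{level:<28} from {start:6.1f} to {start + width:6.1f}")
```

The start and width pairs are in the form that most plotting tools accept for horizontal stacked bars, for example the `left` and `width` arguments of matplotlib's `barh`, so the same numbers could be passed to a chart once they are computed.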
I’ve tried a bunch of different ways of presenting Likert scale data (as well as other scaled data for importance, satisfaction, and other opinions) and have never been happy with my efforts. I really like this technique. If you review the paper, you will see eight common methods for displaying Likert scale data that the authors label as “Not recommended.” I’ve used many of them.
For instance, I’ve used the standard colored bar chart like the one shown in Figure 4. The problem is that every bar is the same length, so the ends of the bars, which your eyes are drawn to, don’t convey any data. All of the data is conveyed at the interior boundaries between segments. By comparison, in Figure 3, the data is conveyed by the length of each bar and the proportion of each bar that is filled with the darker shade of color.
Figure 4. Standard bar chart to display Likert scale data. Image from 2011 JSM
So how do you create your own diverging stacked bar charts? If you are an R user, you can use functions available in the HH package and the latticeExtra package. These functions are also available through RExcel, an add-in that connects R to Excel on Windows.
If you are not an R user, you can create diverging stacked bar charts manually using Excel or Tableau. For instructions using Excel, Amy Emery has a good tutorial on Slideshare.net.
Incidentally, if you have time, check out some of Ms. Emery’s other slide shows. They are quite good and cover a range of topics. There is even one to help R novices like me get started in learning the language.
Many thanks to my friend and colleague Carol Borthwick for pointing me to this new use for the diverging stacked bar chart.