by George Taniwaki

A graphic in Bloomberg Businessweek Mar 2013 (reproduced below) lists the four metro areas with the greatest economic growth over the five-year period 2007-2011, along with their population change during the same period. It also lists the four cities where both population and GDP declined during that period.


Figure 1. Ranking 8 cities by total growth. Image from Bloomberg Businessweek

This chart is a bit light on data, containing only 16 data points. The changes in population and GDP are also not directly comparable: the population change is reported cumulatively for the four years (total number of years minus one), while the GDP change is annualized. Let’s calculate the cumulative GDP change as follows:

Total change GDP = (1 + Annual change GDP)^Years – 1.
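For readers who want to check the arithmetic, here is a minimal Python sketch of the formula above. The function name `cumulative_change` and the 5% sample rate are illustrative assumptions, not values taken from the chart.

```python
def cumulative_change(annual_change: float, years: int) -> float:
    """Compound an annualized growth rate into a total change over `years`."""
    return (1 + annual_change) ** years - 1


# Illustrative input only: 5% per year compounded over 4 years.
total = cumulative_change(0.05, 4)
print(f"{total:.1%}")  # prints 21.6%
```

Note that compounding makes the total slightly larger than simply multiplying the annual rate by the number of years (21.6% rather than 20% in this made-up case).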

Also, notice that the data has differing numbers of significant digits. The annualized GDP changes are displayed with two digits. The population changes show one, except for Chicago and Providence which have several. I’m sure this was done to show that the populations of these two cities were falling rather than flat. Let’s get rid of those extra digits.

This chart ranks the best and worst performing metro areas. One could reasonably argue that the metro areas with the greatest absolute GDP growth are the best. (I will argue otherwise shortly.) But should the worst performing areas be defined as the four that had both declining population and declining GDP? For a counterexample, consider a city where the population is growing but GDP is falling. I would say it is actually in worse shape based on the negative value of its per capita GDP growth. In fact, any city where the population is growing faster than GDP (or shrinking slower than GDP) would have negative per capita GDP growth. Perhaps GDP per capita is a better measure of performance than total GDP change.

To address this, let’s calculate the change in GDP per capita as follows:

Change GDP per capita = ((Change in GDP + 1) / (Change in pop. + 1)) – 1.
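The per capita formula can be sketched in a few lines of Python. The function name `per_capita_change` is a hypothetical helper; the inputs are the rounded chart values for New Orleans (GDP up 8%, population up 16%).

```python
def per_capita_change(gdp_change: float, pop_change: float) -> float:
    """Change in GDP per capita implied by total changes in GDP and population."""
    return (1 + gdp_change) / (1 + pop_change) - 1


# Rounded chart values for New Orleans: GDP +8%, population +16%.
new_orleans = per_capita_change(0.08, 0.16)
print(f"{new_orleans:.0%}")  # prints -7%
```

The result is negative whenever population grows faster than GDP, which is the point of the counterexample above.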

The normalized data from the chart is summarized in the table below.

Metro area    Total change in pop    Total change in GDP    Total change in GDP per capita
Portland                       4%                    22%                               18%
San Jose                       3%                    19%                               15%
Austin                        12%                    14%                                2%
New Orleans                   16%                     8%                               -7%
Detroit                       -4%                   -19%                              -16%
Cleveland                     -1%                    -5%                               -4%
Chicago                       -0%                    -2%                               -2%
Providence                    -0%                    -1%                               -1%

Notice now that New Orleans, despite having very high GDP growth, has large negative per capita GDP growth because its population is growing faster than its GDP. Austin’s performance now looks less impressive too. And while Cleveland, Chicago, and Providence all have negative per capita GDP growth, they are not doing as badly as it first appears.

Even when normalized, the data in the table is still lacking context. It doesn’t give the reader a feel for the big picture. For instance, how many metropolitan areas with populations over 1 million are there in the U.S.? What is the average population change and GDP change among those cities? Which cities had the greatest change, either positive or negative, in population, GDP, and per capita GDP?

Continuing that analysis, we would want to know if most cities were growing near the average rate or if there is a large dispersion. What is the shape of this dispersion? Are there geographic location, city size, or other factors that correlate with growth? Finally, are there time series trends? To answer these questions we need to go back to the source data and create our own charts.

Creating the metro population and GDP dataset

The footnote to the Bloomberg Businessweek chart says the data is from the Bureau of Economic Analysis and the Census Bureau. The BEA GDP data is available from an interactive website. I selected Table = GDP by metro area, Industry = All industry total, Area = All MSAs, Measures = Levels, and Year = 2007 to 2011.

The Census Bureau population estimates for the metropolitan statistical areas (MSAs) are available for download from the Census website. I downloaded the historical decennial data for 2000 to 2009 and the current decennial data that covers 2010 to 2012. I merged the three data sets, keyed on core-based statistical area (CBSA) code.

Note that several of the CBSAs changed in 2010, meaning the code changed too. The most significant is that Los Angeles-Long Beach-Santa Ana, CA (31100) changed to Los Angeles-Long Beach-Anaheim, CA (31080).
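A merge like this can be sketched in plain Python. This is only an illustration of the idea, not the spreadsheet workflow I used: the function names, column names, and row values are all made up, and only the Los Angeles code change named above is included in the remapping table.

```python
# Old CBSA code -> new CBSA code. Only the Los Angeles change named in the
# text is listed; other changed codes would need to be added.
CBSA_REMAP = {"31100": "31080"}


def normalize_code(code: str) -> str:
    """Map a pre-2010 CBSA code to its current equivalent, if it changed."""
    return CBSA_REMAP.get(code, code)


def merge_by_cbsa(*tables: dict[str, dict]) -> dict[str, dict]:
    """Merge tables of {cbsa_code: {column: value}} into one record per MSA."""
    merged: dict[str, dict] = {}
    for table in tables:
        for code, row in table.items():
            merged.setdefault(normalize_code(code), {}).update(row)
    return merged


# Placeholder rows (values are made up to show the shape of the data).
pop_2000s = {"31100": {"pop_2007": 100}}
pop_2010s = {"31080": {"pop_2011": 110}}
gdp = {"31080": {"gdp_2007": 50, "gdp_2011": 55}}

merged = merge_by_cbsa(pop_2000s, pop_2010s, gdp)
print(merged["31080"])
```

Without the remapping step, the pre-2010 and post-2010 Los Angeles rows would end up as two separate, incomplete records.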

In addition to the MSA records, I created two additional records. One contains the total population and GDP for all MSAs and the other for MSAs with population greater than one million.
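The two summary records amount to sums over the MSA rows. Below is a rough sketch; the `aggregate` helper and the field names are hypothetical, the two rows are made-up placeholders, and using the 2011 population for the one-million cutoff is an assumption made for the example.

```python
def aggregate(records: list[dict], min_pop: float = 0) -> dict:
    """Sum population and GDP over MSAs whose 2011 population exceeds min_pop."""
    selected = [r for r in records if r["pop_2011"] > min_pop]
    fields = ("pop_2007", "pop_2011", "gdp_2007", "gdp_2011")
    return {f: sum(r[f] for r in selected) for f in fields}


# Placeholder rows, not real MSA data.
records = [
    {"name": "A", "pop_2007": 900_000, "pop_2011": 1_100_000, "gdp_2007": 40, "gdp_2011": 45},
    {"name": "B", "pop_2007": 500_000, "pop_2011": 520_000, "gdp_2007": 20, "gdp_2011": 21},
]

all_msas = aggregate(records)
large_msas = aggregate(records, min_pop=1_000_000)
print(all_msas)
print(large_msas)
```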

Since the geographic names of the MSAs are often quite long, I wanted shorter labels that I could use on a scatterplot. I decided to use airport codes. These are short, unique, cover any big city with an airport worldwide, and if you travel a lot, you’ve possibly memorized quite a few, so you don’t need a legend to decode them. I appended an airport code to each record.

Finally, I calculated the following descriptive statistics for each MSA and appended them to the records:

Change in population = (Pop on Jul 2011 / Pop on Jul 2007) – 1

Change in GDP = (GDP for 2011 / GDP for 2007) – 1

Per capita GDP for year 20xx = GDP for year 20xx / Pop on Jul of year 20xx

Change in GDP per capita = (Per capita GDP for 2011 / Per capita GDP for 2007) – 1
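The four statistics above can be computed together from the raw levels. This is a minimal sketch: `describe` is a hypothetical helper and the input numbers are invented for illustration.

```python
def describe(pop_2007: float, pop_2011: float, gdp_2007: float, gdp_2011: float) -> dict:
    """Compute the change statistics for one MSA from its 2007 and 2011 levels."""
    per_capita_2007 = gdp_2007 / pop_2007
    per_capita_2011 = gdp_2011 / pop_2011
    return {
        "pop_change": pop_2011 / pop_2007 - 1,
        "gdp_change": gdp_2011 / gdp_2007 - 1,
        "per_capita_change": per_capita_2011 / per_capita_2007 - 1,
    }


# Invented levels: population 1.00M -> 1.05M, GDP $50.0B -> $56.7B.
stats = describe(1_000_000, 1_050_000, 50.0e9, 56.7e9)
for name, value in stats.items():
    print(f"{name}: {value:.1%}")
```

Computing the per capita change from levels gives the same answer as the earlier formula based on the changes themselves, ((Change in GDP + 1) / (Change in pop. + 1)) – 1, so the two tables are calculated consistently.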

An Excel spreadsheet containing the original data, the merged table, and the scatterplot is available on SkyDrive.


Comparing the data I collected with the Bloomberg Businessweek data, the rankings of the top four cities match, but the values for population change and GDP change do not. This could be because different data was used (historical population and GDP estimates are revised annually).

The data for the bottom four cities don’t match at all. The data I collected shows only one city that had falling population and GDP during the time period, Detroit. The three other cities showed rising GDP and two showed rising population as well. And despite the falling population and GDP reported in the chart, all four cities showed rising GDP per capita in my data.

The data for those 8 metro areas plus a few outliers are shown below. The means for all 51 MSAs with population greater than one million are included for comparison.

Metro area    Total change in pop    Total change in GDP    Total change in GDP per capita
Portland                     4.5%                  23.1%                             17.8%
San Jose                     5.0%                  28.6%                             12.9%
Austin                      11.7%                  18.4%                              6.0%
New Orleans                  9.4%                  17.5%                              7.4%
Salt Lake City               1.3%                  14.9%                             13.4%
Mean                         4.2%                   6.9%                              2.6%
Detroit                     -3.8%                  -2.4%                              1.4%
Cleveland                   -1.5%                   3.5%                              5.0%
Chicago                      0.5%                   5.4%                              4.9%
Providence                   0.0%                   6.7%                              6.6%
Las Vegas                    7.0%                  -5.9%                            -12.1%
Charlotte                   36.7%                   9.3%                            -20.1%

Visualization and Analysis

I generated a simple scatterplot of change in GDP against change in population for all 51 MSAs. The cities from the table above are highlighted in green and red. I added a population-weighted trend line shown in brown. The trend line passes through the mean (4.2%, 6.9%) and has a y-intercept at 2.6%.
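One common way to get a population-weighted trend line is weighted least squares, sketched below in plain Python. This is an assumption about the method, not a record of how the chart was built, and the three data points are made up so the answer is easy to verify by hand (they lie exactly on y = 2x + 1).

```python
def weighted_fit(x: list[float], y: list[float], w: list[float]):
    """Weighted least-squares line y = slope * x + intercept.

    Returns (slope, intercept, (weighted mean of x, weighted mean of y)).
    """
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept, (x_bar, y_bar)


# Made-up points and weights (weights stand in for MSA populations).
pop_change = [0.0, 2.0, 4.0]
gdp_change = [1.0, 5.0, 9.0]
population = [1.0, 2.0, 3.0]

slope, intercept, mean_point = weighted_fit(pop_change, gdp_change, population)
print(slope, intercept, mean_point)
```

A line fitted this way always passes through the weighted mean point, which matches the behavior described above. Taking the two numbers reported for the chart at face value, the implied slope is (6.9 – 2.6) / 4.2, or roughly 1.02.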

I could have made the chart fancy by adding information using the size, shape and color of the markers. For instance I could change the size of the markers based on the population of the MSA, change the shape of the marker based on whether the city was coastal or inland, and change the color of the marker based on which Census region it was in.


Figure 2. A simple scatterplot showing the top 51 MSAs. Image by George Taniwaki

The four metro areas with the highest GDP growth are all above the trend line and have high per capita GDP growth. However, the Bloomberg Businessweek chart leaves off Salt Lake City. Its GDP growth is lower, but because its population grew only 1.3% during the period, its per capita GDP growth is a very high 13.4%.

The fastest shrinking metro area is Detroit, which matches the Bloomberg Businessweek result. Note that it lies above the 45-degree diagonal running through the origin, meaning its GDP decline is less than its population decline, so it has positive GDP per capita growth. However, it is still below the trend line, meaning it is growing slower than the average.

The other three metro areas in the Bloomberg Businessweek chart, Cleveland, Chicago, and Providence, all show slow or negative population growth, but are all above the trend line. They probably should not be considered bust-towns. The true bust-town in the scatterplot is Las Vegas. It is an outlier with a population growth of 7% but a GDP decline of 6% which results in a 12% drop in GDP per capita.
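The two comparisons used here (against the 45-degree diagonal and against the trend line) can be checked numerically. The sketch below assumes a straight trend line defined by the two numbers given earlier, the 2.6% y-intercept and the mean point (4.2%, 6.9%); the `position` helper is hypothetical, and the city values come from the table above.

```python
# Trend line from the text: y-intercept 2.6%, passing through the mean (4.2%, 6.9%).
INTERCEPT = 2.6
SLOPE = (6.9 - INTERCEPT) / 4.2  # about 1.02, implied by those two points


def position(pop_change: float, gdp_change: float) -> tuple[bool, bool]:
    """Return (above 45-degree diagonal, above trend line) for changes in percent."""
    above_diagonal = gdp_change > pop_change  # GDP per capita rose
    above_trend = gdp_change > INTERCEPT + SLOPE * pop_change
    return above_diagonal, above_trend


# (change in population, change in GDP) in percent, from the table above.
cities = {
    "Detroit": (-3.8, -2.4),
    "Cleveland": (-1.5, 3.5),
    "Chicago": (0.5, 5.4),
    "Providence": (0.0, 6.7),
    "Las Vegas": (7.0, -5.9),
}

results = {name: position(pop, gdp) for name, (pop, gdp) in cities.items()}
for name, (above_diagonal, above_trend) in results.items():
    print(f"{name}: above diagonal={above_diagonal}, above trend={above_trend}")
```

Under these assumptions, Detroit comes out above the diagonal but below the trend line, Cleveland, Chicago, and Providence are above both, and Las Vegas is below both.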

The final outlier is Charlotte. It shows a population gain of nearly 37%, which is more than double that of the next fastest growing city. But it has only a 9% increase in GDP, leaving it with a 20% drop in GDP per capita. This is a sign that rapid growth can actually be very bad for a city.

Data error and bias

The statement in the paragraph above assumes that the data for change in economic activity and in population at the MSA level, developed by two separate organizations for separate reasons, is accurate and comparable. Neither of these assumptions is particularly sound. Specifically, there is a big discontinuity in the population estimate for Charlotte between Jul 2009 (the last estimate based on the 2000 census) and Jul 2010 (the first estimate based on the 2010 census) that accounts for most of the population gain. Thus, the annual population estimates may need to be smoothed before calculating the change between years.
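Before smoothing anything, it helps to find where such discontinuities are. The sketch below flags years whose year-over-year change is far larger than the typical change in the same series. It only detects jumps and does not smooth them. The `flag_jumps` helper, the factor of 5, and the population series are all made up for illustration and are not the Charlotte estimates.

```python
from statistics import median


def flag_jumps(series: dict[int, float], factor: float = 5.0) -> list[int]:
    """Return years whose year-over-year change is far larger than typical."""
    years = sorted(series)
    changes = {y: series[y] / series[p] - 1 for p, y in zip(years, years[1:])}
    typical = median(abs(c) for c in changes.values())
    return [y for y, c in changes.items() if abs(c) > factor * typical]


# Made-up annual population estimates (thousands) with a break at 2010.
estimates = {2007: 1000, 2008: 1020, 2009: 1040, 2010: 1300, 2011: 1325}
print(flag_jumps(estimates))  # prints [2010]
```

A flagged year that coincides with the switch from one census base to the next is a hint that the jump reflects a revision in method rather than real migration.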

I believe the BEA estimate of economic activity for an MSA is based partly on the population estimate for the MSA. Thus, if the population estimate changes (it is revised annually), then the GDP estimate will no longer be valid and will need to be updated.

Finally, you should be careful when combining data from different sources and comparing them. We do it all the time, but we have to be conscious of what the consequences are. This is an especially important point since everybody today is rapidly building giant data warehouses and running analytics on data that has never been combined before.