Real Numeracy


MentalError

Don’t let mental errors cloud your thinking. Image by Jan Buchczik for The Atlantic

by George Taniwaki

Arthur Brooks is a conservative social scientist. He is on the faculty of Harvard Business School and was formerly president of the American Enterprise Institute. Since 2019, he has been writing a series of articles in The Atlantic, now called “How to Build a Life.” With the onset of the Covid-19 pandemic, the articles have included advice on how to live a happier and better life by understanding our life circumstances.

In his Apr 23, 2020 article entitled “Two Errors Our Minds Make When Trying to Grasp the Pandemic”, he makes the case that we would be happier if we understood the difference between two experiences that make us unhappy and two conditions that make us nervous. It is a very thought-provoking article and I highly recommend it.

Regret and disappointment

Regret and disappointment both lead to unhappiness. They seem similar but are not. We should only feel regret for bad decisions that we have made. Then we should work hard to develop strategies to do better next time. But we should not feel disappointment.

In contrast, we should only feel disappointment when we are in situations where we had no control, like the Covid-19 pandemic. And once we recognize we have no control, we should endeavor to stop our disappointment and get on with other thoughts that will make us happy. As Brooks says, “rumination on what you would be doing if it weren’t for the coronavirus is a destructive waste of your time.”

Risk and uncertainty

Most people dislike risk and uncertainty. Again, these conditions seem similar but are not. As Secretary of Defense Donald Rumsfeld famously stated, “There are known unknowns. That is to say, there are things that we know we don’t know. But there are also unknown unknowns. There are things we don’t know we don’t know.”

Risks can be thought of as the known unknowns. These are outcomes that we cannot accurately predict, but understand well enough that we can forecast them using stochastic models. We can also mitigate and manage risks by working hard using the appropriate strategies and interventions.

Uncertainties are the unknown unknowns. How many people will die from Covid-19? Is it safe to open schools in the fall? Will I or a family member get the disease? We don’t know and can’t predict these with the information currently available. That is, we as laypersons cannot convert uncertainty into risk. Thus, we should not spend a lot of time worrying about these questions. Doing so will exhaust us and make us unhappy without leading us to a better prediction.

Acknowledge, distinguish, resolve

Mr. Brooks has a three-step solution to overcoming these two cognitive errors. He calls his solution “acknowledge, distinguish, resolve.” As he writes, “Disappointment and uncertainty are inevitable, but we don’t have to turn them into suffering.”

TSA Pre✓Renewal

A simple questionnaire with a big flaw. Image from TSA Pre✓

by George Taniwaki

I recently received a voice mail message from the Transportation Security Administration. A woman’s voice told me that my Known Traveler Number (KTN) would be expiring soon and that I would need to renew it if I wanted to remain in the TSA Pre✓ program. That’s the short line through security at the airport.

I haven’t been to the airport recently (and I hope you haven’t either) so I don’t know how long the lines are right now. But joining the TSA Pre✓ program is not expensive ($85 for 5 years) and has been worth it for me. So I pointed my browser to https://universalenroll.dhs.gov/ and started the renewal process.

Near the end of the process, I landed on a very unexpected page. It was a survey form asking questions about my flying habits (see screenshot at top of post). There are many problems with this survey that market research experts will immediately catch. But check out the fourth question. “How satisfied are you with your overall airport security experience?”

Geez, I hate airport security. It is intrusive, arbitrary, and time consuming. It also subjects you to radiation and chemicals of unknown safety. I guess it would be worthwhile if it effectively stopped violence and terrorism at a reasonable cost. Unfortunately, there is no evidence of efficacy and lots of evidence that it is really expensive.

Now, how should I answer this question? There is no explanation on the page about how your response data will be used. Specifically, there is no assurance that the responses will not be associated with your personally identifiable information (PII) and only aggregated data will be provided to the TSA.

Since TSA can make your life miserable, including revoking your KTN, the safest thing to do is to tell them you love your experience with airport security. Question 4 has 10 unlabeled radio buttons with the phrases “Extremely Poor” and “Extremely Satisfied” at the ends. I decided to pick the 9th button. High but not perfect. I figured anyone picking the 10th button would also be flagged for attention as either a liar or an obsequious bootlicker.

Anyway, as a marketer you may be tempted to increase response rate to your market research survey by integrating it into a customer transaction flow. Don’t do this. Your responses will be biased.

* * * *

Update1: Revised the third paragraph to clarify that there are many other problems with this survey. Thanks to my friend and colleague Carol Borthwick for reminding me that not all readers of my blog are survey experts. Below is a list of some of the obvious errors in this survey.

  1. In the first question, how should one respond if you fly for both business and pleasure? And really, you fly to a destination for pleasure, you don’t fly because the experience itself is pleasurable. Almost nobody flies for pleasure, unless they are a pilot.
  2. In the second question, what is the TSA trying to measure? My guess is the number of times respondents are screened by TSA in a year. A round trip usually involves two waits through the TSA line. However, one should not count trips on private aircraft where you don’t go through TSA lines or flights that originate outside the U.S., even if you go through U.S. immigration at the foreign airport.
    Further, if you have a connecting flight on a US domestic flight, you usually do not go through a TSA line again. If you arrive from an international flight and pass through immigration after the flight, you usually do go through TSA before boarding the next flight.
  3. In any event, this survey was probably designed before the collapse in travel due to Covid-19. Does the TSA want to know the number of trips respondents took last year, this year (zero for me so far), or how many they would have taken if there were no pandemic? It doesn’t say.
  4. What’s up with those weird ranges in question 2? And which radio button should respondents select if they fly exactly 31 times a year?
  5. In the fourth question, notice that the wording of the two end point labels for the scale is not parallel. The low end should read “Extremely unsatisfied”. Also there are no labels for any of the intermediate points, leaving the distance between points up to the respondent’s imagination.

* * * *

Update2: Getting back to question 2, if you have a KTN, the TSA records each time you pass through security. So it should already know the actual distribution of how many times a year KTN holders pass through security. So what will it do with the survey data? Compare the response data to the actual data for accuracy? Check for lying and throw out outliers? Who knows.

DenverCovid19HospRate

Valverde neighborhood has the highest rate of Covid-19 hospitalization in Denver. Image from Christie Mettenbrink for Denver Public Health

by George Taniwaki

Denver’s Valverde neighborhood is just a few miles from the Barnum West neighborhood where I grew up. The streets there are busier and noisier, with more industrial businesses lining Alameda Ave. and Federal Blvd. Studies show excessive car traffic can lead to stress and chronic respiratory ailments, especially when combined with smoking, which is more common among residents there.

The houses are smaller, with more families living in multigenerational arrangements. Residents are more likely to ride public transportation to get to work or school. They are also more likely to have jobs that are considered essential. Crowded living and working conditions increase the likelihood of contracting Covid-19.

Finally, adults in Valverde are less likely to speak English at home, meaning they have less access to healthcare information. They are less likely to have health insurance and less access to healthcare providers, even if they have insurance.

This isn’t an accident of history. Cities like Denver had long adopted policies (Colorado Trust, May 2018) that encouraged racial segregation and discrimination. From the 1920s until the 1970s, the city worked with banks, mortgage companies, and property insurers to draw maps rating which neighborhoods were considered safe for lending, a practice called redlining. Similar maps were used by Denver Public Schools to plan the location of new buildings to ensure schools were kept racially segregated.

Even today, the impact of segregation is still visible. An excellent article in The Conversation (May 2020) looks at the distribution of Covid-19 hospitalization rates by neighborhood (see map at top). You can see more charts and an explanation at Denver Public Health (May 2020).

Nightingale-mortality

Example of polar area chart showing causes of mortality among soldiers by month during Crimean war. Image from Wikimedia

by George Taniwaki

May 12 is International Nurses Day, which recognizes the contribution nurses make and celebrates the birth of Florence Nightingale. Today marks the 200th anniversary of her birth. The World Health Organization named this year the Year of the Nurse and the Midwife in her honor. Certainly, with the Covid-19 pandemic in full force, 2020 will be remembered as the Year of the Nurse for many years to come.

Ms Nightingale, who was born in Florence, Italy, was the founder of the modern nursing profession. Prior to her efforts, nursing was a volunteer activity, most often undertaken by untrained family members, soldiers, or religious members. Ms Nightingale trained nurses during the Crimean War. She later founded the first secular nursing school and published many nursing textbooks.

In addition to advancing nursing in a clinical setting, Ms Nightingale was a social activist who advocated for more government spending on healthcare for the poor. She helped develop the field of public health nursing to reach patients who were poor and sick at home.

Finally, Ms Nightingale was an incredible statistician and a pioneer in data visualization. She kept thorough notes and documented which treatments worked and which did not, making it possible for others to replicate her results. She popularized a type of pie chart that she called a coxcomb (see image above) and that is now known as a polar area chart. She was the first woman elected to the Royal Statistical Society and became an honorary member of the American Statistical Association.

IHME-Covid1 IHME-Covid2

Yesterday’s forecast (left) and today’s (right). Images from IHME

by George Taniwaki

The Covid-19 story moves very fast. Yesterday, I posted a blog entry with a chart showing that the Institute for Health Metrics and Evaluation (IHME) forecast 72,000 deaths in the U.S. by June 2020 with almost no new deaths between then and August 2020. The IHME forecast assumed that stay-at-home orders would remain in place until August (see chart above left).

Today, three interesting pieces of news were reported. First, the IHME abandoned its assumption that the population will stay at home and instead switched to using smartphone location data provided by mobile carriers to estimate population mobility. This boosted their estimate of deaths in August to 134,000, an increase of 62,000 (see chart above right).

Second, a group of data scientists led by the University of Sydney’s Centre for Translational Data Science has reviewed the forecasts by IHME and found that they underestimate the uncertainty associated with Covid-19 deaths. In 70% of the state-level forecasts, the actual number of deaths fell outside the 95% prediction interval (arXiv May 2020). If the intervals were well calibrated, you would expect only about 5% of the actual counts to fall outside the 95% prediction interval.
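To make the idea of checking a prediction interval concrete, here is a minimal Python sketch. It is not the method used in the arXiv paper. The function name `interval_coverage` is my own, and every number in it is made up for illustration; none of it is IHME data.

```python
from typing import Sequence


def interval_coverage(
    actual: Sequence[float],
    lower: Sequence[float],
    upper: Sequence[float],
) -> float:
    """Return the fraction of actual values that fall inside [lower, upper]."""
    if not (len(actual) == len(lower) == len(upper)):
        raise ValueError("all inputs must have the same length")
    if len(actual) == 0:
        raise ValueError("need at least one forecast")
    inside = sum(lo <= a <= hi for a, lo, hi in zip(actual, lower, upper))
    return inside / len(actual)


# Hypothetical daily death counts for five states (illustration only).
actual = [120, 45, 300, 80, 15]
# Hypothetical lower and upper bounds of each state's 95% prediction interval.
lower = [100, 50, 150, 85, 10]
upper = [140, 70, 250, 120, 20]

coverage = interval_coverage(actual, lower, upper)
print(f"Inside the interval: {coverage:.0%}, outside: {1 - coverage:.0%}")
```

In this toy example only two of the five actual counts land inside their intervals. For a well-calibrated 95% interval, the share inside should be close to 95% once you have enough forecasts to judge.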

Finally, in yesterday’s blog post I discussed the ensemble forecast of Covid-19 deaths created by the Centers for Disease Control and Prevention (CDC). Until last week, the IHME forecast was included in the ensemble. On Friday, the CDC dropped the IHME forecast from its ensemble and replaced it with forecasts from Imperial College.

The IHME forecast was lower than most other forecasts and had been a favorite of the Trump administration (Politico Apr 2020) and of the CDC (Medium Apr 2020).

National-Forecast-2020-04-20 National-Forecast-2020-04-27-1280px

Last week, the CDC ensemble forecast (left) included the IHME data. This week’s forecast (right) does not. Image from CDC

DomoCovidTracker

Animated Covid-19 map, screenshot from Domo

by George Taniwaki

In order to make predictions about the future trajectory of the spread of Covid-19, you need to be able to make sense of the currently available data. There are several steps to get good data.

Medical event data

First, you have to be able to collect data from multiple sources, clean them, and aggregate them based on standard criteria. Each data record could include the following elements:

  1. Event (what was counted, e.g., tests administered, positive test results, negative results, hospital admissions, ICU status, ventilation status, discharges, recoveries, deaths, etc.)
  2. Location ID (where the event occurred, see below)
  3. Date of incidence (when the event occurred)
  4. Date of reporting (sometimes data is reported days or even months after the event and can be updated many times as errors are corrected or missing data is estimated)
  5. Value (a count)
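The record layout above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the class name `EventRecord`, the helper functions, the location IDs, and all of the counts are hypothetical and do not come from any of the datasets mentioned in this post. The sketch also shows why the date of reporting matters: when a count is revised, you want to keep only the most recently reported value.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class EventRecord:
    event: str               # what was counted, e.g. "deaths" or "tests"
    location_id: str         # where the event occurred
    date_of_incidence: date  # when the event occurred
    date_of_reporting: date  # when this value was reported or revised
    value: int               # the count


def latest_records(records):
    """Keep the most recently reported record for each (event, location, incidence date)."""
    latest = {}
    for r in records:
        key = (r.event, r.location_id, r.date_of_incidence)
        if key not in latest or r.date_of_reporting > latest[key].date_of_reporting:
            latest[key] = r
    return list(latest.values())


def total_by_location(records, event):
    """Sum the latest reported values of one event type for each location."""
    totals = defaultdict(int)
    for r in latest_records(records):
        if r.event == event:
            totals[r.location_id] += r.value
    return dict(totals)


# Made-up records. The second one revises the first after a late correction.
records = [
    EventRecord("deaths", "loc-A", date(2020, 4, 1), date(2020, 4, 2), 10),
    EventRecord("deaths", "loc-A", date(2020, 4, 1), date(2020, 4, 5), 12),
    EventRecord("deaths", "loc-A", date(2020, 4, 2), date(2020, 4, 3), 7),
    EventRecord("deaths", "loc-B", date(2020, 4, 1), date(2020, 4, 2), 3),
    EventRecord("tests", "loc-A", date(2020, 4, 1), date(2020, 4, 2), 500),
]

print(total_by_location(records, "deaths"))
```

Here loc-A totals 19 deaths (12 + 7) rather than 29, because the revised count of 12 replaces the original count of 10 instead of being added to it.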

The best repository of Covid-19 data is maintained by the New York Times (on GitHub) with an interactive viewer. Johns Hopkins University Coronavirus Resource Center also has a dataset. The best source for counts of tests in the U.S. is available from the Covid Tracking Project sponsored by the Atlantic.

NYTimesCovidMap

One of several graphics available from the New York Times

Public policy change data

In addition to medical events, there are public policy events that can be tracked, such as government orders to close nonessential businesses, travel restrictions, and so forth. These records could include the following elements:

  1. Event (what type of public policy change was made)
  2. Location ID (where the change applies to, see below)
  3. Date of incidence (when the change was implemented)
  4. Date of reporting (when change was reported, usually before the change is implemented)

Unfortunately, I could not find a centralized source of information on government restrictions and the dates they became effective. A different source of information that can help indicate how much contact there is between people is the amount of movement by people who carry smartphones. Smartphones contain a GPS antenna and can report their position. The position can be used to indicate what type of activity the person is engaging in. Google Health has a community mobility report that is updated regularly. An example report is shown below and the data in .csv format is available for download.

GoogleMobilityReport_en.pdf

Among those who own Android smartphones and participate in tracking, trips have declined. Screenshot from Google Health

Demographic and geographic data

To analyze the data, you will want to append demographic and geographic data about the locations. Unlike events, demographic and geographic data changes slowly, so it only needs to be collected once during the model building process. The following data elements could be useful to prepare a model or forecast:

  1. Location ID (from above)
  2. Name or description
  3. Location hierarchy (continent > country > region > state > county > city > zip code, etc.)
  4. Latitude and longitude of centroid
  5. Latitude and longitude of center of largest city
  6. Surface area (km²)
  7. Total population
  8. Age distribution
  9. Gender distribution
  10. Income distribution
  11. Race distribution
  12. Political party affiliation distribution
  13. Health insurance coverage distribution
  14. Comorbidity distribution (smoking, diabetes, etc.)
  15. Number of hospitals
  16. Number of hospital beds
  17. Number of ICU beds
  18. Number of ventilators

Some good sources for this type of data are US Census, United Nations Demographic Year Book, United Nations Development Programme’s (UNDP) Human Development Report and the World Bank’s World Development Report, Gapminder, and ESRI.

Visualize the data

Once the data is aggregated, there are many ways to visualize it. Maps are an obvious way to display location data. Line charts are an obvious way to display time series data. Domo, a developer of business intelligence software, has a very nice animation that displays time series data on a map (screenshot at top of blog).

Two caveats about their display. First, the number of cases is underreported because testing for infection was not widespread early in the pandemic, and is still too low today.

Second, outside the U.S. the data is reported by country, not state or other smaller region. A single marker is used to represent the location of events. This is probably fine for Europe or Africa, where countries tend to be small. However, it is misleading for larger countries like Canada, Russia, China, Indonesia, Australia, and Brazil. Even data for a state like California is distorted because one would expect separate markers for the Bay Area and for the LA Basin instead of a single one in the middle of the state.

Johns Hopkins Center for Systems Science and Engineering has produced a nice dashboard hosted on ArcGIS (screenshot below). It does a better job of dividing large countries into smaller geographic partitions, but the colors are dark. A description of the project was published in Lancet Infect Dis (Feb 2020) and in a press release (Jan 2020). All of the data and the dashboard are available in a GitHub repository.

JohnsHopkinsCSSECovid

Another example of a Covid-19 map. Screenshot from ArcGIS

A note about line charts. You often see Covid-19 growth charts by country that display time (either calendar date, or days since the nth event occurred) on the horizontal axis and count on the vertical axis. Both are scaled linearly. I find these charts hard to interpret and compare. I think a better way to display growth data is to plot the logarithm of counts per 100,000 population on the vertical axis and days since the n*(population/100,000)th event occurred on the horizontal axis. Even better would be to divide large countries into smaller regions so that all the charts covered regions with similar populations.
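Here is a minimal Python sketch of the rescaling I have in mind, assuming you already have a list of cumulative counts by day and the population of the region. The function name `normalize_curve`, the parameter `n_per_100k`, and both sets of numbers are made up for illustration.

```python
import math


def normalize_curve(cumulative_counts, population, n_per_100k=1.0):
    """Rescale a cumulative count series for comparison across regions.

    Returns (days_since_threshold, log10_rate) pairs, where day 0 is the first
    day the cumulative count reaches n_per_100k * (population / 100,000) and
    the rate is the cumulative count per 100,000 population.
    """
    if population <= 0 or n_per_100k <= 0:
        raise ValueError("population and n_per_100k must be positive")
    per_100k = [c * 100_000 / population for c in cumulative_counts]
    # Find the first day on which the per-capita threshold is reached.
    start = next((i for i, rate in enumerate(per_100k) if rate >= n_per_100k), None)
    if start is None:
        return []  # the region never reached the threshold
    return [(i - start, math.log10(rate)) for i, rate in enumerate(per_100k) if i >= start]


# Two hypothetical regions whose counts double every day.
big = normalize_curve([2, 5, 10, 20, 40, 80], population=1_000_000)
small = normalize_curve([1, 2, 4, 8], population=50_000)

for day, log_rate in big:
    print(f"big region   day {day}: log10(rate) = {log_rate:.2f}")
for day, log_rate in small:
    print(f"small region day {day}: log10(rate) = {log_rate:.2f}")
```

On the log scale, both regions rise by the same amount each day because both are doubling daily, so their lines are parallel even though one region is twenty times larger than the other. That is the comparison that is hard to see on a chart with raw counts and linear axes.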

Making Forecasts

There are many groups making forecasts of Covid-19 infection rates and death rates. The CDC has a summary of them along with its own ensemble forecast. It predicts under 100,000 deaths in the U.S. at the end of May. The Institute for Health Metrics and Evaluation (IHME) predicts about 72,000 total deaths at the end of May but with a range from 60,000 to 115,000. You can download the data from the Global Health Data Exchange.

In addition to forecasting deaths, the IHME forecasts hospital utilization. These forecasts are used by hospitals to schedule resources and plan for peak usage.

National-Forecast-2020-04-27-1280px

Individual forecasts of cumulative reported deaths in U.S. from Covid-19 (left) and CDC ensemble forecast (right). Image from CDC

IHME-Covid

Cumulative death forecast in U.S. Image from IHME.

One of the best forecasts I have seen was produced by the Economist. It synthesizes data from US Census, New York Times, Covid Tracking Project, IHME, Google Health, and Unacast. The choropleth map of the U.S. below shows risk factors for Covid-19 mortality at the county level. Green shows areas where the risk level is low (less than 1%) and red shows high (6% or above).

Economist20200425_GDC200

Dixie in the crosshairs. Image from Economist

* * * *

Update1: In just one day, the IHME forecast is obsolete. See my response at https://realnumeracy.wordpress.com/2020/05/04/tracking-the-growth-of-covid-19-redux/

Update2: Added a link to the New York Times dataset and interactive viewer.

AdvocacyInAction

Five steps for advocacy. From WebJunction.org

by George Taniwaki

Sue and I attended a class on advocacy. It was an eye-opening experience. If you want to get involved in your community and improve governance, you should attend an advocacy class and get going. The class we attended was sponsored by the Seattle/King County Coalition on Homelessness. If you live in the Seattle area, other organizations that hold advocacy classes include Northwest Harvest and Arc of King County.

We learned a lot. Below are some details.

Understanding advocacy

One of the first lessons is that advocacy is not lobbying. This is important since lobbying a government official can cause the nonprofit you are supporting to lose its tax-exempt status. Lobbying is approaching a politician or regulator and asking them to adopt a position that will directly or indirectly benefit you or the non-profit you represent, usually monetarily. Advocacy is asking a politician to do something that you think will address a social issue. It may benefit you or your organization, but only because it fills the unmet community need. Thus, all lobbying is advocacy, but not all advocacy is lobbying.

Advocate for bills not positions

You vote for legislators based on their positions. But there often isn’t a clear path to convert positions into actual legislation, especially related to budget bills. Your legislators are busy and subject to competing demands. They will not have time to read each bill in detail. They rely on community feedback on individual bills to gauge what is important to pass or defeat.

Thus, if you want to have impact, you have to determine which bills you want your legislator to vote for and against. (I’m assuming you are a layperson and not influential enough to actually write the bills you want to pass.) However, you do not have time to read each bill in detail either. Thus, you will need to rely on a nonprofit organization to read them for you and pick out the talking points to make when you contact your legislators.

How to find your legislators

Every state has a different number of legislators and districts. In Washington State, there is a senate and a house with identical districts. Senators serve 4-year terms and representatives serve 2-year terms. There are 49 districts and each district has one senator and two house representatives, so a total of 49 senators and 98 representatives.

To find your district and the names of your legislators, go to the district finder.

When to contact your legislators

Every state has a different legislative calendar. In Washington, in even years (like 2020), there is a short session lasting 60 days. In odd years, when the two-year budget is debated, there is a long session lasting 105 days, or sometimes longer if the budget is not approved in time. So in short session years, it is critical that you contact your legislators in January and February on the dates when the bills you are concerned about are “read” on the floor.

You may want to contact your legislators multiple times during the session. First is when the bill is in committee. If your legislator is on the committee that is debating the bill, you want to give a detailed comment. If they are not, you want to register your approval or disapproval of the bill so that they can include your tally and forward it to the committee. Once the bill is out of committee, you will want to contact your legislators again to indicate how you want them to vote. Finally, at the end of the session, thank the legislators for their vote. If they did not vote in the way you wanted, express your disappointment but say you hope they are still open to your future advocacy.

To find the status of the bills you are interested in, go to the bill report.

How to contact your legislators

There are many ways to contact your legislators and register comments. You can write an email, call by phone, send a fax, letter, or postcard, fill out a web form, or post comments on their social media sites (Facebook, Twitter, etc.).

The legislators may not read your comments personally. They may have an aide summarize them. Thus, it is important that you know which comments get summarized. Many legislators still do not get summaries of social media comments. So if you post something on their Facebook page, it may not get seen.

To comment by phone, call 1.800.562.6000

To send a comment by mail or email, go to the district finder.

To submit a comment on the web, go to Comment help.

Submitting a comment is fast and easy. If you want to get involved in advocacy, find a non-profit you want to support and get started now.

* * * *

Advocacy is cheap and easy and everyone can get involved. Lobbying is expensive and requires specialized skills as the story below shows.

“Let me tell you about the very rich. They are different from you and me.” So wrote F. Scott Fitzgerald. I realize this to a greater degree after taking the advocacy class. While moderately rich people like me often have the time and inclination to ask our representatives to vote for what we want at the local level, the very rich advocate at the national level.

I knew that many U.S. Senate races involve out-of-state money. But I hadn’t realized why. A story by David Frum in The Atlantic (Apr 2020) gives a good explanation. Most very rich people live in big cities, located in coastal states, which tend to vote for leftish politicians. The rich tend to be more conservative and often find it difficult to sway their own senator’s vote. But every senator gets one vote. For the sake of efficiency, it makes more sense for them to contribute money to the campaign committees of conservative senators in small red states and then advocate or lobby for what they want by approaching those senators.

Says Frum, “United States senators from smaller, poorer red states… do not… primarily represent their states. They represent, more often, the richest people in bigger, richer blue States who find it more economical to invest in less expensive small-state races. The biggest contributor to Mitch McConnell’s 2020 campaign and leadership committee is a PAC headquartered in Englewood, New Jersey…”
