May 2011

Here’s a riddle.

Question: I donated a kidney anonymously on Wednesday, September 29, 2010. This is a rare act. Perhaps 300 people worldwide did it last year. I also write extensively about kidney donation. I was reading Renal & Urology News May 2011 and I saw a story written by a person explaining why he/she donated a kidney to a stranger on Wednesday, September 29, 2010. But it wasn’t written by me. It was a weird experience reading the story about someone who is very similar to me. (I encourage you to read the article.) What are the odds that two people who enjoy writing also donate a kidney anonymously on the same day?

Short answer: Ex post, p=1.

Long answer: Before the two surgeries occur (called ex ante), the joint probability that my surgery falls on a particular day (let’s call it event A) and the other donor’s surgery falls on that same day (let’s call it event B) is written as P(AB). We want to break this probability into two parts. First is the probability of my surgery happening on that day given that the other person donates the same day. This is written as P(A|B). Second is the probability of the other donor’s surgery happening on that day, P(B). The joint probability is obtained by multiplying the two together, P(AB) = P(A|B)*P(B). By symmetry, we could equally write P(AB) = P(B|A)*P(A), where P(B|A) is the probability of the other donor’s surgery date given mine.

In this case, I am certain (P>.99) that the date of my surgery was not influenced by the other donor. I was unaware of the existence of the other donor until I saw the story in Renal & Urology News. Thus, we can write P(A|B) = P(A).

Further, I will assume that the other donor’s surgery date was unaffected by my date and so P(B|A) = P(B). Thus, I ignore the possibility that the other donor or his/her surgeons read this blog and selected the donation date to match mine. I will also ignore the possibility of spooky effects like quantum entanglement, ESP, and God’s will forcing the two surgery dates to be identical.

Now we have P(AB) = P(A|B)*P(B) = P(A)*P(B).

Now, I will assume that the surgery dates for both me and the other donor are drawn at random from the same set of equally likely dates. If this is true, then P(B) = P(A). Substituting gives us P(A)*P(B) = P(A)^2.

Actually, this is not quite true. Elective surgeries are not randomly scheduled. For instance, surgeons, like everyone else, want their weekends free and dislike scheduling elective surgeries on Saturday or Sunday. Similarly, surgeons like to visit their patients for two days after surgeries, but want to avoid coming in on weekends. Thus, they don’t schedule elective surgeries on Thursdays or Fridays. Finally, emergency care patients who enter the hospital on weekends are often taken into surgery on Monday. Thus, elective surgeries are nearly always scheduled on Tuesdays and Wednesdays. Eliminating the weeks of New Year’s, Christmas, and Thanksgiving, the Tuesdays after 3-day weekends, and allowing time off for vacations leaves about 90 possible surgery dates each year.
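As a rough check on that count, here is a small Python sketch. It only counts the Tuesdays and Wednesdays in 2010. The holiday weeks, the Tuesdays after 3-day weekends, and vacation time still have to be subtracted by hand, so the figure of about 90 remains an estimate and is not something the code derives. The function name is made up for illustration.

```python
from datetime import date, timedelta

def tuesdays_and_wednesdays(year):
    """Return every Tuesday and Wednesday in the given year."""
    day = date(year, 1, 1)
    found = []
    while day.year == year:
        # date.weekday() counts Monday as 0, so Tuesday is 1 and Wednesday is 2.
        if day.weekday() in (1, 2):
            found.append(day)
        day += timedelta(days=1)
    return found

candidates = tuesdays_and_wednesdays(2010)
print(len(candidates))                  # raw count, before any exclusions
print(date(2010, 9, 29) in candidates)  # is the surgery date a candidate?
```

The raw count is the upper bound. The gap between it and 90 is what the holiday and vacation exclusions are assumed to remove.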

Now, there are about 300 other nondirected donors, so on average a little over 3 (300/90) nondirected donors will have surgery on any given date. Note, however, that it is unlikely that the doctors at my hospital are on vacation the same dates as the doctors at the other donor’s hospital, or have the same holiday schedule, so this estimate isn’t quite right. Further, not all 300 donors like to write. And not all the writers will be English speakers. Now we have a complicated mess.
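Before giving up on the mess, the naive arithmetic can at least be written down. The Python sketch below takes the two rough figures from the text (90 dates, 300 other donors) and assumes every donor's date is drawn uniformly and independently, which is exactly the assumption questioned above. The variable names are mine, and the numbers are only as good as those estimates.

```python
N_DATES = 90    # possible surgery dates per year (estimate from the text)
N_DONORS = 300  # other nondirected donors per year (estimate from the text)

# Assumption: each donor's date is uniform over the dates and independent of mine.
p_date = 1 / N_DATES                    # P(A) = P(B) for one specific date
p_both_on_given_date = p_date ** 2      # P(AB) = P(A)^2
expected_per_date = N_DONORS / N_DATES  # average donors sharing any one date

# Chance that at least one of the other donors lands on my date.
p_nobody_shares = (1 - p_date) ** N_DONORS
p_someone_shares = 1 - p_nobody_shares

print(f"P(A)^2 = {p_both_on_given_date:.6f}")
print(f"expected donors per date = {expected_per_date:.2f}")
print(f"P(at least one shares my date) = {p_someone_shares:.3f}")
```

Under these assumptions, sharing a surgery date with some other nondirected donor is the normal case, not a rare one. The writing and language filters are what would make the estimate small, and the sketch does not attempt those.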

Yuck. Let’s start over. Instead, let’s look at the probability that an event will occur after we know the outcome, called ex post. It is always either 100% (it happened) or 0% (it didn’t happen). In this case, we know it happened so P=1.

[Update: I clarified the logic. I also changed the wording to indicate that I don’t know the gender of the other donor. On initial reading of the story, I thought it was written by a man. Now I think it is a woman. But since the writer is anonymous, I can’t be sure. About 60% of anonymous donors are female. (But that doesn’t mean there is a 60% chance that I am female.)]

I love looking at maps. Good maps have high data density, and if you are already familiar with the region a map describes, looking at it evokes your memories of passing through the area. A good map is a piece of art.

The current issue of the University of Chicago Magazine (May 2011) describes the map work of Eric Fischer, an engineer at Google.

One of his projects was to create a Flickr photostream with maps of various cities with colored lines showing people’s inferred paths based on the timestamps of their geocoded photos uploaded to Flickr or Picasa. The project is called the Geotaggers’ World Atlas. The map of the Seattle area is shown below.


Map showing paths of people who shared photos taken in Seattle. Image from Eric Fischer in the Geotaggers’ World Atlas

The color of the lines shows the estimated speed of the photographer. Black is by foot, red is by bike, and green is by car. You’ll notice that the densest set of lines is in downtown Seattle and most of them are black. The fewest lines are in the suburbs, and most of them are green. Even though the populations on either side of Lake Washington are similar, most of the photos are taken on the Seattle side. More proof that the eastside is boring.
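To make the coloring rule concrete, here is a Python sketch of how a speed could be inferred from two consecutive geotagged photos and mapped to a color. This is an illustration, not Mr Fischer's code. The speed cutoffs are placeholder values I picked, since the article does not state the real ones, and the function names and the example coordinates are hypothetical.

```python
import math
from datetime import datetime, timedelta

# Placeholder cutoffs in km/h; the real thresholds are not given in the article.
WALK_MAX_KMH = 10.0
BIKE_MAX_KMH = 30.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def segment_color(photo_a, photo_b):
    """Color the line between two consecutive photos by the implied speed."""
    (t1, lat1, lon1), (t2, lat2, lon2) = photo_a, photo_b
    hours = (t2 - t1).total_seconds() / 3600
    if hours <= 0:
        return None  # no speed can be inferred without elapsed time
    speed = haversine_km(lat1, lon1, lat2, lon2) / hours
    if speed <= WALK_MAX_KMH:
        return "black"  # by foot
    if speed <= BIKE_MAX_KMH:
        return "red"    # by bike
    return "green"      # by car

# Two approximate points in downtown Seattle, roughly 1.3 km apart.
start = datetime(2010, 9, 29, 12, 0)
first = (start, 47.6097, -122.3422)
print(segment_color(first, (start + timedelta(minutes=30), 47.6205, -122.3493)))
print(segment_color(first, (start + timedelta(minutes=2), 47.6205, -122.3493)))
```

The same pair of locations gets a different color depending only on the gap between timestamps, which is why the result is an estimate of speed and not a record of how the photographer actually traveled.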

Using the same data, Mr Fischer creates another set of maps that teases out whether the photographs are taken by tourists or locals. He does this by first seeing if there is a single city in which the user’s pictures span more than a month. If so, the user is considered a local in that city. All other pictures taken by the same user outside that city (and which have timestamps that span less than a month) are considered tourist pictures. This provides two interesting bits of data. First, do locals and tourists take pictures of different things? Second, do some cities attract more photo-happy tourists than others?
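The local-versus-tourist rule described above can be sketched in a few lines of Python. This is my reading of the rule, not Mr Fischer's implementation. I assume "a month" means 30 days, that photos arrive as (user, city, timestamp) tuples, and that a user with no qualifying city is left unlabeled. A user whose photos span more than a month in several cities is treated as a local in each, which the description does not address.

```python
from collections import defaultdict
from datetime import datetime, timedelta

ONE_MONTH = timedelta(days=30)  # assumption: "a month" is treated as 30 days

def label_photos(photos):
    """Label each (user, city, timestamp) photo 'local', 'tourist', or 'unknown'."""
    # Collect the timestamps each user has in each city.
    times = defaultdict(list)
    for user, city, taken in photos:
        times[(user, city)].append(taken)

    # A user's span in a city is the gap between the first and last photo there.
    span = {key: max(ts) - min(ts) for key, ts in times.items()}

    # Home cities: places where the user's photos span more than a month.
    homes = defaultdict(set)
    for (user, city), s in span.items():
        if s > ONE_MONTH:
            homes[user].add(city)

    labels = []
    for user, city, taken in photos:
        if city in homes[user]:
            labels.append("local")
        elif homes[user]:
            labels.append("tourist")  # has a home city, but this is not it
        else:
            labels.append("unknown")  # no home city could be inferred
    return labels

photos = [
    ("ann", "Seattle", datetime(2010, 1, 1)),
    ("ann", "Seattle", datetime(2010, 3, 1)),
    ("ann", "Paris", datetime(2010, 6, 1)),
    ("ann", "Paris", datetime(2010, 6, 3)),
    ("bob", "Seattle", datetime(2010, 7, 4)),
]
print(label_photos(photos))
```

Note that a user who uploads only a weekend's worth of photos, like the hypothetical "bob" here, cannot be classified either way under this reading.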

In another project, Mr Fischer has created a set of maps of major U.S. cities showing the level of racial segregation in 2010. Normally, maps like these start with neighborhood boundaries and then use a choropleth technique to indicate the population density of each race in the neighborhood. However, with the power of computers today, it is now possible to reverse the process. One can display households on the map and let the colors define the boundaries of neighborhoods.

Mr Fischer’s maps do this by placing a colored dot on a map to represent 25 households of a particular race (Black in blue, Caucasian in red, Asian in green, and Other in yellow) or Hispanic origin regardless of race (orange). He uses U.S. Census self-reported race and ethnicity data from both the 2000 and 2010 censuses at the census block level. A map of Seattle (it also includes most of King County and parts of Kitsap County) is shown below.

As can be seen in the map, Seattle has a smaller proportion of blacks than any other large urban area in the U.S., at 6.1%. The black population is concentrated in neighborhoods south of downtown. Asians make up a larger proportion of the population at 13.2%. They are also concentrated south of downtown but tend to be dispersed throughout the region, with pockets in Bellevue (east of Lake Washington), Renton, and Kent (both south of Bellevue).


Map showing Seattle’s racial distribution. Image from Eric Fischer
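The dot-placement idea behind these maps (one colored dot per 25 households in a census block) can be sketched in Python as follows. This is a guess at the general technique, not Mr Fischer's code. I assume dots are scattered uniformly inside a block's bounding box, that leftover households below 25 are dropped, and that blocks arrive as simple count dictionaries. The group keys and function name are my own labels, and real census blocks are polygons, not rectangles.

```python
import random

HOUSEHOLDS_PER_DOT = 25

# Dot colors as described in the text; the dictionary keys are hypothetical labels.
COLORS = {
    "black": "blue",
    "white": "red",
    "asian": "green",
    "other": "yellow",
    "hispanic": "orange",
}

def dots_for_block(counts, bbox, rng):
    """Return (x, y, color) dots for one census block.

    counts maps a group label to its household count;
    bbox is (min_x, min_y, max_x, max_y) for the block.
    """
    min_x, min_y, max_x, max_y = bbox
    dots = []
    for group, households in counts.items():
        # One dot per 25 households; the remainder is dropped in this sketch.
        for _ in range(households // HOUSEHOLDS_PER_DOT):
            x = rng.uniform(min_x, max_x)
            y = rng.uniform(min_y, max_y)
            dots.append((x, y, COLORS[group]))
    return dots

block = {"white": 130, "asian": 60, "black": 20}
dots = dots_for_block(block, (0.0, 0.0, 1.0, 1.0), random.Random(0))
print(len(dots))  # 5 red dots, 2 green dots, and no blue dots
```

Because the dots carry no boundary information of their own, any neighborhood edges a reader sees in the finished map come entirely from where the colors change.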

The idea for the maps comes from Bill Rankin’s map of Chicago that is available at a website called Radical Cartography. This site has a variety of other maps for cities, the United States, the world, and even the universe.

[Update: I corrected an error in the description of the race maps. Pacific Islanders are included in the Other race category, not in the Asian category. Pacific Islanders make up 0.5% of the Seattle population. I subtracted this from the population data for Asians.]