By John K. White
The most basic national demographic is a country’s population distribution. According to the latest U.S. census, 8,391,881 people live in New York, 3,831,868 in Los Angeles, and 2,851,268 in Chicago. Using only three such data points, we can easily work out the centre, or population-weighted mid-point, of the United States, and in the process learn about weighted averages and comparative statistics, ever more important as we celebrate the International Year of Statistics in an ever-shrinking globalized world.
Here, we multiply the longitude and latitude of each city by its share of the combined population, weighting the data in the same way a parent balances children on a teeter-totter. In this case, our teeter-totter is two-dimensional (north-south and east-west), revealing a three-pronged balance with its midpoint about 20 miles southwest of Terre Haute, Indiana.
| City | Longitude (deg) | Latitude (deg) | Population | Weighted longitude (deg) | Weighted latitude (deg) |
|---|---|---|---|---|---|
| 1. New York | -73.97 | 40.78 | 8,391,881 | -41.18 | 22.70 |
| 4. Weighted Average | | | 15,075,017 | -87.87 | 39.23 |
Three-city population-weighted midpoint of the United States
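The three-city calculation can be sketched in a few lines of Python. This is a minimal illustration, not the author's own worksheet: the populations and New York's coordinates come from the article, while the coordinates for Los Angeles and Chicago are approximate values assumed here, so the result lands close to, but not exactly on, the midpoint quoted above.

```python
# name: (longitude in degrees, latitude in degrees, population)
CITIES = {
    "New York": (-73.97, 40.78, 8_391_881),      # coordinates from the article
    "Los Angeles": (-118.24, 34.05, 3_831_868),  # coordinates ASSUMED (approximate)
    "Chicago": (-87.63, 41.88, 2_851_268),       # coordinates ASSUMED (approximate)
}


def weighted_midpoint(cities):
    """Return (longitude, latitude, total population) of the weighted midpoint."""
    total = sum(pop for _, _, pop in cities.values())
    # Each coordinate is weighted by the city's share of the combined population.
    lon = sum(x * pop for x, _, pop in cities.values()) / total
    lat = sum(y * pop for _, y, pop in cities.values()) / total
    return lon, lat, total


lon, lat, total = weighted_midpoint(CITIES)
print(f"{total:,} people, midpoint near {lat:.2f} N, {abs(lon):.2f} W")
```

Averaging degrees directly treats the map as flat, which is the same simplification the table makes; it is reasonable at this scale but would need care for points spread across the whole globe.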
Adding more cities gives a better estimate. Ten cities, or about 10% of the total population, are sufficient to calculate the middle of a country’s population distribution, and from there we can compare different countries in a standard way.
Ten-city population-weighted midpoint of the United States, Europe, India, China
How do such pictures help? Deciding where to build new power plants, erect cell phone masts, or create high-speed rail lines depends on the population-weighted distribution. With China rapidly building high-speed rail lines–projected to account for 50% of world capacity by 2020–the volume-weighted data confirms that a north-south corridor is best. The same is true in India.
We can also use volume-weighting to analyze the foreign exchange market, a $4-trillion daily operation. If we were to take each trade as equal, we would miss the real picture, that some are for millions and others peanuts. Sports statistics should also be weighted. In baseball, batting averages and earned run averages should include the opponent’s numbers to more accurately measure changing form. It’s no good just to hit soft pitchers or strike out easy batters.
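To see how much an unweighted average can mislead, here is a small sketch comparing a simple mean with a volume-weighted mean. The three trades are hypothetical numbers made up purely for illustration; they are not market data.

```python
# HYPOTHETICAL trades: (exchange rate, trade size in dollars).
trades = [
    (1.30, 5_000_000),  # one large trade
    (1.32, 10_000),     # two small trades
    (1.31, 20_000),
]


def simple_mean(trades):
    """Treat every trade as equal, regardless of size."""
    return sum(rate for rate, _ in trades) / len(trades)


def volume_weighted_mean(trades):
    """Weight each rate by the size of its trade."""
    total_volume = sum(size for _, size in trades)
    return sum(rate * size for rate, size in trades) / total_volume


print(f"simple mean:          {simple_mean(trades):.4f}")
print(f"volume-weighted mean: {volume_weighted_mean(trades):.4f}")
```

The weighted figure stays almost exactly at the rate of the single large trade, while the simple mean is pulled toward the two small ones.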
Using a weighted mean also allows us to understand higher statistical moments–standard deviation, skewness, and kurtosis–which can be worked out in just the same way.* But the real beauty is in the elegance and simplicity of weighting data, such as a population-weighted mean using only state capitals. Beauty is indeed in the weighted eye of the statistical beholder.
50-capital U.S. population-weighted midpoint (37.9 N, 92.2 W)
John K. White is an adjunct lecturer in the School of Physics, University College Dublin and author of Do The Math!: On Growth, Greed, and Strategic Thinking (Sage, 2013).
Further examples are available on the Do The Math! download site.
*Weighted moment calculation: n = 1 gives the mean, n = 2 leads to the standard deviation, n = 3 to the skewness, and n = 4 to the kurtosis. Note that the higher-order moments must be centred.
$$\mu_n = \frac{\sum f(x)\, x^n}{\sum f(x)}$$
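The weighted moment in the footnote can be sketched as follows, with the weights playing the role of f(x). The function names are hypothetical, and dividing the third and fourth centred moments by powers of the standard deviation is a common convention assumed here rather than something the footnote spells out.

```python
import math


def weighted_moment(values, weights, n, centred=True):
    """Weighted n-th moment: sum(w * x**n) / sum(w), centred on the mean for n > 1."""
    total = sum(weights)
    mean = sum(w * x for x, w in zip(values, weights)) / total
    if n == 1:
        return mean
    shift = mean if centred else 0.0
    return sum(w * (x - shift) ** n for x, w in zip(values, weights)) / total


def weighted_summary(values, weights):
    """Return (mean, standard deviation, skewness, kurtosis)."""
    mean = weighted_moment(values, weights, 1)
    std = math.sqrt(weighted_moment(values, weights, 2))
    # ASSUMED convention: normalise by std**n to get dimensionless shape measures.
    skewness = weighted_moment(values, weights, 3) / std ** 3
    kurtosis = weighted_moment(values, weights, 4) / std ** 4
    return mean, std, skewness, kurtosis


# Toy data: three values with the middle one counted twice.
mean, std, skewness, kurtosis = weighted_summary([1, 2, 3], [1, 2, 1])
print(mean, std, skewness, kurtosis)
```

In this sketch the second centred moment is the variance, and its square root is reported as the standard deviation.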