Monday, March 08, 2010

Correlations and causation

Hubble Space Telescope picture of the cluster of galaxies Abell S0740
Image Credit: NASA, ESA, and the Hubble Heritage Team

Many people think that there have been a lot of big earthquakes in the past few months.  In early January, a magnitude 7.1 earthquake hit the Solomon Islands, causing a small tsunami.   One week later, a magnitude 6.5 earthquake hit just off the California coast.  Two days after that, a magnitude 7.0 earthquake in Haiti killed hundreds of thousands of people. The Chilean magnitude 8.8 earthquake on February 27th was the 5th largest earthquake ever recorded. A few days ago, a magnitude 6.4 earthquake hit Taiwan, and today a magnitude 6.0 earthquake hit Turkey.  (For a more detailed list of recent large earthquakes, see this list by the US Geological Survey.)  Is there a relation between these earthquakes?  Or are we just hypersensitized after the tragedy of Haiti?  Or are we having the same frequency of big earthquakes as normal?  More on this later in the post.

In astronomy, one of the common questions we ask is whether two objects are related.  In the picture at the top of this post, you see galaxies at the center of the galaxy cluster Abell S0740.  Most of the galaxies you see are together in a big swarm around the giant central galaxy, but some of the galaxies (especially the ones that appear smaller) are unrelated, either much closer to us or much, much further away than the galaxies in the cluster.  One question you might ask is, are clusters of galaxies like these just random chance, where a bunch of galaxies happen to be close together, or is there a true correlation (roughly speaking, is there a reason that these galaxies are close together besides random chance)?

Random chances, or coincidences, are much more common than many people think.  One reason for this is that true randomness is not uniform.  A visual example is given in this blog post at In the Dark posted about a year ago; look for the images of dots about 1/3 of the way down the page.  In a completely random Universe, in some places galaxies will be clumped together, and in some places there will be a lack of galaxies. 
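
If you want to see this for yourself, here is a minimal sketch in Python. It scatters points completely at random over a square and counts how many land in each cell of a grid. The number of points, the grid size, and the seed are arbitrary choices of mine, not values from any real survey.

```python
import random
from collections import Counter

def cell_counts(n_points=1000, grid=10, seed=42):
    """Scatter points uniformly at random in the unit square and
    return the number of points that land in each grid cell."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_points):
        x, y = rng.random(), rng.random()          # both in [0, 1)
        counts[(int(x * grid), int(y * grid))] += 1
    # Include cells that received no points at all.
    return [counts.get((i, j), 0) for i in range(grid) for j in range(grid)]

counts = cell_counts()
print("average per cell:", sum(counts) / len(counts))
print("emptiest cell:   ", min(counts))
print("fullest cell:    ", max(counts))
```

The average is exactly 10 points per cell, but the fullest and emptiest cells differ from each other even though nothing in the code prefers one cell over another.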

Another reason coincidences are common is that, in a big place, even very rare events can happen, and can happen a lot.  Let's pretend that a Gallup poll finds that 99.67% of Americans think that taking candy from a baby should be illegal, and only 0.33% of Americans think that taking candy from a baby is just fine.  There are 300 million Americans, so that 0.33% adds up to about one million people! (This is why I am not impressed when some group tries to influence my opinion by advertising that they have several hundred thousand supporters.  In the USA, several hundred thousand people is a tiny fraction of a percent of everyone.)
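
The arithmetic is simple enough to check in a few lines of Python. The poll is pretend, as above, and the helper name `expected_count` is just something I made up for this illustration.

```python
def expected_count(population, percent):
    """Number of people in a population who fall into a group
    that makes up the given percentage of it."""
    return population * percent / 100

# Pretend poll from the text: 0.33% of roughly 300 million Americans.
candy_takers = expected_count(300_000_000, 0.33)
print(round(candy_takers))

# Going the other way: 300,000 supporters as a percentage of everyone.
supporter_percent = 100 * 300_000 / 300_000_000
print(supporter_percent)
```

A tiny percentage of a huge population is still close to a million people, and a few hundred thousand supporters is about a tenth of a percent.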

Back to galaxies.  When astronomers see galaxies close together, are they close because of the natural apparent clumping caused by randomness?  Are they close because there are so many galaxies in the universe that there are likely to be millions of close groupings of galaxies, and our eyes and computers find it easy to pick out the clumps?  Or is there actually a correlation between galaxies, that they prefer to be closer together than random chance would predict?

Astronomers have used a tool we call the 2-point (or galaxy-galaxy) autocorrelation function, often just the "correlation function" for short.  This tool is simply a measurement of how often galaxies appear close to each other compared with how often they would be close to each other if galaxies were distributed completely randomly.  Modern values from surveys using hundreds of thousands of galaxies find that there is a significant correlation for distances smaller than about 15 million light-years.  In other words, galaxies tend to like to be close to each other, closer than randomness alone would predict.  But if you look at big distances, galaxies tend to be more randomly distributed (by this particular statistical measurement).  This is a major simplification of the issue, but you can get the idea.
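
To make the idea concrete, here is a toy sketch in Python. It counts pairs of points at each separation in a "data" catalog (DD) and in a purely random catalog (RR), then compares them as DD/RR - 1, which is the simplest textbook form of the estimator. Everything here is an assumption for illustration: the points live in a flat 2-D square rather than on the sky, the clustered catalog is made up, and real surveys use more careful estimators and corrections for survey edges.

```python
import math
import random

def pair_counts(points, edges):
    """Count unique pairs whose separation falls in each
    [edges[k], edges[k+1]) bin."""
    counts = [0] * (len(edges) - 1)
    for i in range(len(points)):
        xi, yi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj = points[j]
            d = math.hypot(xi - xj, yi - yj)
            for k in range(len(counts)):
                if edges[k] <= d < edges[k + 1]:
                    counts[k] += 1
                    break
    return counts

def correlation(data, randoms, edges):
    """Simple estimate: (DD / RR), scaled by the total number of
    pairs in each catalog, minus 1. Zero means 'as random as random'."""
    dd = pair_counts(data, edges)
    rr = pair_counts(randoms, edges)
    n_d, n_r = len(data), len(randoms)
    norm = (n_r * (n_r - 1)) / (n_d * (n_d - 1))
    return [norm * d / r - 1 if r > 0 else float("nan")
            for d, r in zip(dd, rr)]

rng = random.Random(1)

# Made-up clustered catalog: 20 clumps of 15 points each.
clustered = []
for _ in range(20):
    cx, cy = rng.random(), rng.random()
    for _ in range(15):
        # Wrap around the edges so every point stays in the unit square.
        clustered.append((rng.gauss(cx, 0.02) % 1.0,
                          rng.gauss(cy, 0.02) % 1.0))

# Comparison catalog: the same number of points, placed at random.
randoms = [(rng.random(), rng.random()) for _ in range(300)]

edges = [0.0, 0.05, 0.2, 0.5]   # separation bins, in units of the box size
xi = correlation(clustered, randoms, edges)
for lo, hi, value in zip(edges, edges[1:], xi):
    print(f"separation {lo:.2f} to {hi:.2f}: xi = {value:+.2f}")
```

In the smallest separation bin the clustered catalog has far more close pairs than the random one, so the value comes out well above zero. That excess of close pairs is what the correlation function measures.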

The fact that galaxies are not randomly distributed, but that they are correlated (that they like to be close together), leads us to ask why these galaxies are correlated.  The research into why galaxies like to be close together helped astronomers develop our understanding of how the Universe went from a smooth Big Bang to the lumpy Universe we see today.  In fact, these studies forced astronomers to make significant changes to the Big Bang theory, because the distribution of galaxies predicted by the original Big Bang theory is different than what we actually see!

But the main points I want to make here are:

  • Things that are randomly distributed can look like they are clumped together,
  • Things that actually are clumped together on one scale can still be random on other scales,
  • Our own intuition into whether things are related or random is often wrong, but there are tools we can use to see if things like galaxies are random or related, and
  • These tools tend to require large numbers of objects

Now, back to big earthquakes.  Are the many big earthquakes we've seen in the last few months related? And let me be clear -- I am not talking about aftershocks, which we know are related, and I am not talking about big earthquakes that sometimes happen right next to each other, which we also know are related.  I'm talking about big earthquakes on different faults around the world.  According to some experts, they are not related, but random.  According to other experts, perhaps they are related.  (Read the links for a more detailed explanation.)  I actually think it is possible both are correct.  Perhaps a big earthquake can trigger other earthquakes nearby, or set off earthquakes that were going to happen soon anyway.  And over long times and distances, earthquakes are probably random.  (For example, I find it very, very unlikely that the recent earthquake in Haiti is related in any way to the earthquake that toppled the Colossus of Rhodes.)  On the other hand, maybe we've just hit a stretch where big earthquakes have hit densely inhabited areas.  To know for sure, we need robust statistics, which requires large numbers of earthquakes, which will take more time.  But the final word may come from the same statistics that astronomers have used for decades.
