Another challenge is whether the correlation is spurious because it is based on biased data. Harford notes this is a particular challenge for data sources such as Twitter, whose users do not represent the overall population. Money quote:
MBA students at NC State are getting training on a wide range of tools for analyzing big data, but they also are getting the theoretical insight they will need to make sense of the results they obtain.“Big data” has arrived, but big insights have not. The challenge now is to solve new problems and gain new answers – without making the same old statistical mistakes on a grander scale than ever.