Causation, Sampling Bias, Big Problems With Big Data

Problems With Big Data?

The promise that “N = All”, and therefore that sampling bias does not matter, is simply not true in most cases that count. As for the idea that “with enough data, the numbers speak for themselves” – that seems hopelessly naive in data sets where spurious patterns vastly outnumber genuine discoveries. “Big data” has arrived, but big insights have not. The challenge now is to solve new problems and gain new answers – without making the same old statistical mistakes on a grander scale than ever. (source below)

Big data: are we making a big mistake? - FT.com: "... Big data is a vague term for a massive phenomenon that has rapidly become an obsession with entrepreneurs, scientists, governments and the media ... As with so many buzzwords, “big data” is a vague term, often thrown around by people with something to sell ... Consultants urge the data-naive to wise up to the potential of big data. A recent report from the McKinsey Global Institute reckoned that the US healthcare system could save $300bn a year – $1,000 per American – through better integration and analysis of the data produced by everything from clinical trials to health insurance transactions to smart running shoes. But while big data promise much to scientists, entrepreneurs and governments, they are doomed to disappoint us if we ignore some very familiar statistical lessons. “There are a lot of small data problems that occur in big data,” says Spiegelhalter. “They don’t disappear because you’ve got lots of the stuff. They get worse.” ... Who cares about causation or sampling bias, though, when there is money to be made? ... There’s a huge false positive issue ... “We have a new resource here,” says Professor David Hand of Imperial College London. “But nobody wants ‘data’. What they want are the answers.” To use big data to produce such answers will require large strides in statistical methods ... we’re flying a little bit blind at the moment ..." (read more at the link above)
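
To make the "false positive issue" a little more concrete, here is a rough sketch of my own (not from the FT article). It correlates a pure-noise target with a growing number of pure-noise features and counts how many clear a conventional 5% significance bar. Everything here is an assumption for illustration: the function names `pearson` and `count_spurious` are made up, the data is simulated, and the cutoff `1.96 / sqrt(n)` is only a large-sample approximation for a correlation coefficient.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def count_spurious(n_rows, n_features, seed=0):
    """Count noise features that look 'significantly' correlated with a noise target.

    Hypothetical helper: target and features are independent Gaussian noise,
    so every feature counted here is a false positive by construction.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_rows)]
    # Approximate two-sided 5% cutoff for |r| under the null (large-sample rule).
    threshold = 1.96 / math.sqrt(n_rows)
    hits = 0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_rows)]
        if abs(pearson(feature, target)) > threshold:
            hits += 1
    return hits


if __name__ == "__main__":
    # More features tested means more "discoveries", none of them real.
    for k in (10, 100, 1000):
        print(k, "features tested ->", count_spurious(n_rows=100, n_features=k), "spurious hits")
```

Roughly one noise feature in twenty should clear the bar, so the number of spurious "patterns" grows in step with the number of things you test. Collecting more columns does not make the problem go away; it makes it bigger, which is the point Spiegelhalter makes above.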



