
The Future of Analytics

By Daniel Gold, Director, Kaggle

Just about every modern organization relies heavily on statistical thinking. Banks predict which loan applicants are likely to default, government treasuries forecast tax revenues, websites make personalized product recommendations, hedge funds crunch data to find trading opportunities and bioinformaticians search for links between genetic markers and disease.

However, a single organization almost never has access to the advanced machine learning and statistical techniques that would allow it to extract maximum value from its data. Meanwhile, data scientists crave real-world data to develop and refine their techniques.

Kaggle corrects this mismatch by offering companies a cost-effective way to harness the 'cognitive surplus' of the world's best data scientists.

Kaggle provides statistical and analytics outsourcing via global data modeling competitions. Companies, researchers and governments upload datasets and problems; the world's best data scientists then submit solutions and compete for prizes.

Crowdsourced data modeling is particularly effective because there are countless models that can be applied to solve any one problem. It is impossible to know at the outset which technique will be most effective. By exposing the problem to a wide audience, with different participants trying different techniques, competitions can very quickly reach the frontier of what's possible from a given dataset.
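To make the idea concrete, here is a minimal sketch in Python of how a competition might rank different techniques against the same held-out data. Everything in it is an illustrative assumption rather than a description of Kaggle's infrastructure: the three "entries", the synthetic dataset, the RMSE metric and names such as run_competition are all hypothetical.

```python
import math
import random


def rmse(predictions, targets):
    """Root-mean-square error: lower is better."""
    squared = sum((p - t) ** 2 for p, t in zip(predictions, targets))
    return math.sqrt(squared / len(targets))


# Three hypothetical entries. Each takes training data and returns a predictor.
def fit_mean(xs, ys):
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y  # always predicts the training average


def fit_linear(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x  # ordinary least squares line


def fit_nearest_neighbour(xs, ys):
    pairs = list(zip(xs, ys))
    # Predicts the target of the closest training point.
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def run_competition(entries, train, holdout):
    """Fit every entry on the training set and rank by holdout RMSE."""
    train_x, train_y = train
    hold_x, hold_y = holdout
    leaderboard = []
    for name, fit in entries.items():
        model = fit(train_x, train_y)
        score = rmse([model(x) for x in hold_x], hold_y)
        leaderboard.append((name, score))
    return sorted(leaderboard, key=lambda row: row[1])


def make_data(n, rng):
    """Synthetic data (an assumption): a noisy straight line."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0, 1.0) for x in xs]
    return xs, ys


rng = random.Random(0)
train = make_data(200, rng)
holdout = make_data(100, rng)  # hidden from entrants, used only for scoring

entries = {
    "mean baseline": fit_mean,
    "linear regression": fit_linear,
    "nearest neighbour": fit_nearest_neighbour,
}
leaderboard = run_competition(entries, train, holdout)
for rank, (name, score) in enumerate(leaderboard, start=1):
    print(f"{rank}. {name}: RMSE {score:.3f}")
```

The host does not need to guess in advance which approach will win. Every entry is scored on the same unseen data, and the ranking emerges from the results.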

Through Kaggle, an organization can have hundreds of PhD-level specialists working on its problem quickly and cheaply.

The result for our clients is cheaper, faster and more powerful analytics.

For example, a Drexel University researcher recently hosted a bioinformatics competition to predict the progression of viral load in HIV patients. Within a week and a half, the best entry had outperformed the best methods in the scientific literature.

A competition currently under way aims to improve the Elo chess rating system, which the World Chess Federation uses to rank players. Within the first 24 hours, three teams were outperforming Elo, and Elo now stands in 111th place! At the time of writing, the leader was a Portuguese physicist, followed by an Israeli mathematician.

Kaggle's infrastructure includes a user-friendly interface for competition uploads, user submissions and real-time scoring. We represent a step-change in the evolution of analytics outsourcing.
