Companies without enough data scientists, meet lots of analysts willing to take your work.
EMC's Greenplum division, which makes data analytics software, is joining forces with Kaggle, a company that finds and deploys people good at statistical inference, to produce a kind of Big Data engineer marketplace.
Customers of a Greenplum product called Chorus will be able to search and examine the profiles of thousands of people worldwide who have participated in Kaggle's online statistical competitions. The companies can then hire these master statisticians by the hour to solve their data problems.
Kaggle contests cover things as diverse as finding dark matter in the universe and predicting bond prices. In addition to solving specific problems, they identify the people who are good at problem solving. Kaggle then runs private data competitions for big companies, hiring its best contestants as contractors. It collects a fee for that.
So far, Kaggle has registered 55,000 contestants, any of whom can register to be hired for projects by Greenplum customers. The project with Greenplum was announced on Tuesday, so it is still unclear how many people will want to participate.
"It's a way to get them into companies and solving problems," said Anthony Goldbloom, the chief executive of Kaggle. "It should also be good for our business. Right now we attract people who are in the competitions for love. If this generates more money, it will attract more people."
Kaggle's private contests tend to be large-scale projects. Mr. Goldbloom expects the Greenplum customers to have simpler needs. Insurers, for example, are frequently looking for ways to improve their premiums system. Kaggle will charge $300 to $500 an hour for the service (the final price has not yet been set) and will collect a commission.
Scott Yara, a co-founder of Greenplum and now its senior vice president of products, said his company already has a dedicated staff of 25 data scientists but has more work than it can handle. "We'll never fill the gap," he said. "Even the biggest companies in the world are just starting out on this."
Greenplum's and Kaggle's effort may be one way that the market copes with a perceived shortage of data scientists. Last year, McKinsey Global Institute said that the United States needs perhaps 190,000 skilled data analysts and 1.5 million more data-literate managers to cope with all the information companies are collecting.
Other methods include offering marketplaces of algorithms and better user interfaces to automate some of the statistical process. These efforts are under way at established companies and start-ups alike.
Still, Mr. Yara said, there will be an acute need for talent. "In 1975, if you said we'll need millions of people writing software, it would have made no sense," he said. On the other hand, it happened.