Harvard Business Review once called it "the sexiest job of the 21st century."
Data scientist is not only the top job this year (the position ranked number one on Glassdoor’s list of top jobs for 2016), but based on hiring demand and the potential for salary growth, it’s poised to be the top job in the future as well.
But there may not be enough people to fill it. A McKinsey report predicted that by 2018, "the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills, as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions."
Why the deficit? It’s no secret that individuals and companies throw off massive amounts of data every day. From emails to social updates and location data, all that user-generated information (aka big data) is valuable, provided that it’s parsed into something that can be easily read by workers in any industry, from banking to retail and construction to government. In retail, for example, the same McKinsey report estimated that a retailer could increase its operating margin by 60% using big data. That’s where a data scientist comes in.
To become a data scientist, an individual needs to have skills in database management, statistics, machine learning, and distributed and parallel systems, according to the American Statistical Association. As such, many of the data scientists working today have PhDs, as well as research backgrounds in math or physics. In other words, this is not the skill level you would find in an average tech worker. Add to that the fact that it’s a relatively new field and only a few universities, such as MIT, offer dedicated degree programs, and it’s easy to see how the gap is growing.
There are some companies focused on narrowing that gulf. We talked to three of them about their strategies for meeting the demand for data scientists with a bigger supply.
We’ve reported recently that the master’s degree is the new bachelor’s degree, factoring heavily into employers’ hiring decisions.
As most undergraduates come out of four-year institutions with mountains of debt, getting an advanced degree might be out of reach, especially in a field like data science where PhDs are the norm. Coursera is aiming to make the master’s track for data science a bit more affordable and accessible.
An established player in the massive open online course (MOOC) landscape, Coursera announced earlier this week that it will offer a professional data science master’s degree from the University of Illinois at Urbana-Champaign.
Daphne Koller, president and cofounder of Coursera, tells Fast Company that before this launch, data science was the most popular topic for online learners on its platform. "We think this is due to a combination of the quick growth in demand for data science skills in the job market and the lack of formal education opportunities in data science until very recently," she says.
The degree costs $19,200, which is lower than the cost of a traditional on-campus master of computer science in data science (MCS-DS) degree. Students must apply for admission, and the first cohort of 150 will begin classes on August 22, 2016.
Koller points out that unlike other master’s degrees, students can try out the MCS-DS degree with a shorter specialization certificate program in data mining or cloud computing, earning credentials that can then fully transfer to the MCS-DS if they decide to pursue the degree later.
Koller says that because the field is so new, employers must be open to identifying skilled candidates who come from varied, nontraditional educational and career backgrounds. "However, [the] Illinois MCS-DS degree offers employers the best of both worlds: They can hire a data scientist who has a master’s degree in the discipline, provided by one of the world’s best departments in that area," she says, adding, "Many employers will value both the smaller credentials as well as the promise of an upcoming degree from a top department."
Taking the cost-conscious learner into further consideration is DataCamp. The online learning platform, which does not offer a degree, bills itself as the first to focus on data science and claims to have trained over 250,000 aspiring data scientists in over 150 countries since opening in November 2013. This is thanks to partnerships with companies such as Microsoft, IBM, and RStudio, as well as schools such as Princeton, Duke, and the University of Washington.
DataCamp founder and CEO Jonathan Cornelissen points out that DataCamp gives students a certificate of completion for every course at a cost of $25 per month (or $9 per month for students enrolled in other schools). "On average, a student completes courses after four to six hours and students can then share their course completion certificate on LinkedIn," he says.
Cornelissen underscores the fact that DataCamp isn’t like other learning platforms. "We don’t claim that our course completion certificates are an indication of mastery on the topic," he says, and the real value to the student and their employer is in mastery. "Certifying student mastery in data science requires a thoughtful certification approach that takes into account data science fluency and we are developing such a certification system that will be launched in Q3 of 2016," he explains. This is done in a learning-by-doing way similar to the language app Duolingo, he says.
Cornelissen says that approach is what sets DataCamp apart from MOOCs, too. While course completion rates are often below 10% at many MOOCs, DataCamp’s students complete most courses at rates between 30% and 60%.
DataCamp students are predominantly professionals: 60% work in technology, finance, and health care, and 10% are professors or researchers. The remaining 30% of users are students. "Even a few fifth graders have used DataCamp," Cornelissen says.
Just don’t call it a hack school like the ones designed to teach people to code in three months. Not only is DataCamp less expensive than most coding bootcamps, but Cornelissen maintains that its instructors and their methodology set it apart.
"Our students use DataCamp because they want to become more data literate and because they want to acquire data science skills that help them to get ahead in their career," Cornelissen explains. "A lot of people are bumping into the limits of proprietary and old-fashioned technologies such as Microsoft Excel, etc. and they are looking for new more innovative solutions to import, clean, manipulate, visualize and model their data."
"One of the most important skills required to be a successful data scientist is the ability to tell a story using data," says Correlation One’s CEO Sham Mustafa. Correlation One is a job matching platform for data scientists that just launched on March 31 after 18 months of beta testing.
Mustafa says that over 90% of the platform’s candidate pool has either a PhD or a graduate degree, "although exceptional candidates with undergraduate degrees have also been successful." Employers typically look for candidates who have deep expertise in statistics, math, Bayesian nonparametrics, and programming.
Mustafa tells Fast Company that so far, over 800 data scientists have joined the platform and almost two dozen employers have used one of the services provided by Correlation One. "We continue to add 40 to 50 candidates every week," he says.
Correlation One allows data scientists to create portfolios and validates their skills via a battery of proprietary online tests. A resume analysis algorithm (patent pending) identifies relevant factors from a candidate’s resume and issues a "report card" for employers so they can find the right talent more quickly.
Mustafa says that it’s free for job seekers to create a portfolio, but businesses are charged by the service. Resume ranking starts at $9,900 per quarter, and businesses that hire someone pay 25% to 30% of that candidate’s annual wages. To get a "Data Scientist On-Demand," companies can expect to pay $200 per hour or more.
"During our beta period, we helped employers identify over 100 candidates for positions such as data scientists, machine learning engineers, quantitative researchers, business intelligence analysts, and data analysts," Mustafa says. He was unable to comment on the number of candidates actually hired through the platform because of the strict nondisclosure agreements signed with participating companies.