The Potential Hidden Bias In Automated Hiring Systems

More companies are using machine-learning software to screen candidates, but it may be unwittingly perpetuating past bias.


About a generation ago, hiring, development, and management were at the heart of human resources. Because recruits usually ended up staying with a company for life, the hiring process was systematic and drawn out. Companies would periodically assess people with physical, performance, and IQ tests, and keep exhaustive files on individuals from the time they were hired.


By the 1990s, the rules of talent management had to be rewritten. Businesses became less predictable and had to become agile. Employees went from being lifers to job hoppers, and consequently, companies were no longer able to keep long, detailed notes on employees. People constantly moving in and out of companies made hiring overwhelming, and businesses found they needed to be savvier to find the right person for the evolving work environment.

The result is a growing reliance on technology to find and track candidates. Vendors like Ascendify and HackerRank help major corporations like GE, IBM, and Cisco find and hire talent faster.

“We moved to a model where we pushed decisions from the corporate offices and headquarters down to the individual hiring managers when it comes to making decisions about who to hire,” says Peter Cappelli, professor of management and director of the Center for Human Resources at the University of Pennsylvania’s Wharton School. Because the days of systematic hiring are long gone, managers end up making most of their hiring decisions based on gut, Cappelli says. “Part of the problem you have then is you have to persuade people to stop going with their gut.”

Basically, when it comes to predicting who is going to be good for a job, humans just aren’t very good, and apparently neither are machines.

A New Challenge In The Age Of Automation: Algorithmic Fairness

In the constant scramble to find the right person for the job, companies turn to machine-learning algorithms to compile more data faster in order to improve hiring decisions. Machine-learning software takes data a company already has and predicts who will be a good fit for the company, succeed in a specific role, perform at a high level, or even advance the fastest.


But as you might predict, if these systems use historical data to make these predictions, the results are likely to be biased as well, ranging from racial discrimination to narrow bracketing. Machine-learning systems make their predictions based on patterns, so if a technology is built to source candidates, it might “learn” that most hired candidates come from a specific school or zip code. This could encourage bias and widen disparities.
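To make the pattern concrete, here is a minimal, hypothetical sketch (the data and numbers are invented for illustration, not drawn from any real system). A naive screener scores candidates by the historical hire rate of their zip code. No protected attribute appears anywhere in the features, yet any past favoritism tied to location survives as a proxy:

```python
# Hypothetical sketch: "learning from history" with a proxy feature.
from collections import defaultdict

# Toy historical records of (zip_code, was_hired). Suppose past managers
# favored candidates from zip "10001" for reasons unrelated to merit.
history = [
    ("10001", True), ("10001", True), ("10001", True), ("10001", False),
    ("60629", False), ("60629", False), ("60629", True), ("60629", False),
]

def train(records):
    """The 'model' is just the observed hire rate per zip code."""
    hires, totals = defaultdict(int), defaultdict(int)
    for zip_code, hired in records:
        totals[zip_code] += 1
        hires[zip_code] += hired
    return {z: hires[z] / totals[z] for z in totals}

def score(model, zip_code):
    return model.get(zip_code, 0.0)

model = train(history)
# Two otherwise identical candidates get very different scores purely
# because of where past hires happened to live.
print(score(model, "10001"))  # 0.75
print(score(model, "60629"))  # 0.25
```

Real systems use far richer features and models, but the mechanism is the same: whatever correlated with past hiring decisions, fair or not, becomes a signal.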

“There’s no bias in the algorithms,” says Cappelli. “It treats everyone equally. The algorithm in the software doesn’t care that you’re white or black or what your last name is.” But what ends up happening is the algorithm might very well recommend that you hire white men because “white men have historically gotten higher appraisal scores and advance faster through organizations,” he says.

“You have to be aware that even though the algorithm is not biased, the history that created the algorithm is biased,” Cappelli continues. “And any bias that’s already in the data is going to be picked up and incorporated into something that’s systematic.”

Furthermore, because of the complexity and capacity of these systems, these problems can happen much faster and at a much larger scale, says Joshua New, a policy analyst at think tank The Information Technology and Innovation Foundation (ITIF). If businesses aren’t careful, this could create an even bigger problem, where instead of prejudice or discrimination being a series of one-off decisions made by individual managers, it could be institutionalized.

As machine-learning algorithms become more prevalent, the question of algorithmic fairness also gains importance. For businesses, this question should be asked regularly, far beyond the hiring process. Josh Bersin, principal and founder of Bersin by Deloitte, a research and advisory service for HR professionals, says that after you’ve hired someone, decisions like how much of a raise they get, whether they’re promoted, whether they’re chosen for a new assignment, and whether they’re placed in a high-potential program are usually all made based on historically judgmental talent practices.


For instance, if a company has always promoted white males who went to Harvard and got MBAs, the algorithm may suggest that everyone who falls under this category gets precedence in, say, the company’s coaching or leadership development programs. And when the system predicts that a certain individual is expected to be a high performer, that person might very well get treated like a high performer by their manager. “You don’t always know why a software system selected what it selected,” Bersin says, explaining that a lot of these algorithms are not yet auditable, and it’s up to HR to look at why the system made certain recommendations.

“I’m not down on it at all,” Bersin continues. “I don’t think it’s negative, but I think there’s a little bit of overconfidence about how great this is going to be.”

One bright side? Julie Fernandez, a partner at Information Services Group who leads efforts with automation in HR for the company, says that since these systems are rule-driven, it’s easier to identify when a machine is recommending discriminatory actions versus when a human does it.

Still, while automated decision-making continues to prove helpful in certain instances, relying on these predictions at this nascent stage without an auditing body is irresponsible. As these systems grow in prevalence, companies must ask themselves: What do we do with these predictions? If you set machine-learning software loose inside an organization and it finds patterns like who is most likely to steal or leak classified information, what do you do with that information? That’s an ethical question. That’s algorithmic fairness.

Who’s Responsible Here?

Training these systems requires a lot of data. The algorithms can’t yield reliable answers if they aren’t used regularly.


“If you’re a big company and you have 100,000 employees and you’re planning on rolling out a system that helps you figure out who to hire, who to promote, you better audit it,” warns Bersin. “If employees find out that the system is biased and can prove that the system is biased, you can be in a lot of trouble.”

With these kinds of consequences, the onus is on businesses to identify flawed systems, even if the recommendations are fact-based.
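One common starting point for such an audit, sketched below with invented numbers, is the “four-fifths rule” from U.S. employment guidelines: if any group’s selection rate falls below 80% of the highest group’s rate, the disparity is a red flag worth investigating. This assumes you can log each recommendation alongside the candidate’s demographic group:

```python
# Minimal adverse-impact audit sketch (four-fifths rule heuristic).
def selection_rates(outcomes):
    """outcomes: dict mapping group -> (selected_count, total_count)."""
    return {g: sel / total for g, (sel, total) in outcomes.items()}

def four_fifths_flags(outcomes, threshold=0.8):
    """Flag any group whose rate is below `threshold` of the best rate."""
    rates = selection_rates(outcomes)
    best = max(rates.values())
    return {g: rate / best < threshold for g, rate in rates.items()}

# Toy numbers: the system recommended 50 of 100 group-A candidates
# but only 20 of 100 group-B candidates.
flags = four_fifths_flags({"A": (50, 100), "B": (20, 100)})
print(flags)  # group B's rate (0.2) is 40% of group A's (0.5) -> flagged
```

A flag here doesn’t prove discrimination on its own, but it tells an auditor exactly where to dig into the model’s recommendations.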

“The people selling you things, the vendors, they don’t really want to explain all of the limitations, all of the uncertainties, all of the potential problems with what they’re selling you,” Cappelli explains. So businesses should be aware of their algorithms’ strengths and limitations. They should know how the algorithm is generated. They should know why they’re predicting what they’re predicting.

Since these predictions often involve complicated questions, and the algorithm, no matter how transparent it is, is challenging for the average worker to understand, businesses that choose to use these systems may want to hire in-house expertise to review the recommendations and make sure they’re rational before putting them into managers’ hands.

Just be aware, says New, that while actively training data scientists in this area is a huge step toward a smarter, better system, most data science development teams are “incredibly homogeneous.”


“They’re mostly all white men. Maybe it’s not that they have a subconscious bias against a particular demographic, it’s just that they might not be aware of factors that introduce those biases to their systems, whereas if you were to encourage diversity on the development teams, that would be a much less influential factor.”

In general, if these systems have so much risk, are they doing more harm than good for our society?

Cappelli says we don’t actually know: “A vendor is selling [you] software to identify who is going to be a good hire. Does it work? Well, how would you know? That requires a little preparation. You have to look at how the software scores different candidates, and look at how the hires actually perform later on. That’s at least a start. And the problem is, most companies don’t even know who a good performer is in any kind of systematic way. Companies are buying software to predict who is going to be a good hire, but they don’t themselves know what’s good performance.”

About the author

Vivian Giang is a business writer covering gender, leadership, entrepreneurship, workplace psychology, and whatever else she finds interesting related to work and play. You can find her on Twitter at @vivian_giang.