The company plans to release a data set of over a million facial images that anyone can use to help train their AI facial recognition system. The data set will be highly diverse, with a range of skin colors and tones. It is also over five times larger than the largest public data set currently available, which contains 200,000 images. The company is also releasing an additional data set of 36,000 facial images equally distributed across all ethnicities, ages, and genders. As IBM notes in a blog post:
AI holds significant power to improve the way we live and work, but only if AI systems are developed and trained responsibly, and produce outcomes we trust. Making sure that the system is trained on balanced data, and rid of biases is critical to achieving such trust.
However, the company also stresses that AI facial recognition systems should never replace human judgement:
As the adoption of AI increases, the issue of preventing bias from entering into AI systems is rising to the forefront. We believe no technology–no matter how accurate–can or should replace human judgement, intuition and expertise. The power of advanced innovations, like AI, lies in their ability to augment, not replace, human decision-making. It is therefore critical that any organization using AI–including visual recognition or video analysis capabilities–train the teams working with it to understand bias, including implicit and unconscious bias, monitor for it, and know how to address it.
Concerns have been growing recently that facial recognition systems could carry biases that lead them to profile some groups of people more than others.