IBM’s “Most Human” Computer Voice Makes HAL Jealous
Making a PC generate sounds that resemble human speech is relatively simple. But making a machine sound <em>convincingly</em> human is very tricky. Yet IBM claims to have coded a synthetic voice that is more similar to a human voice than any created before. Its mimicry is so thorough that it copies our verbal tics by umming, erring, and even sighing. If your attention is needed, it can gently cough, or shush you when you’re interrupting.
The newly patented tech has been dubbed “generating paralinguistic phenomena via markup in text-to-speech synthesis” and it’s designed for use in automated telephone systems, devices like GPS units, and possibly even cellphones. Its sophistication is in the “paralinguistic” part of its name: those gentle little quirks that make a human voice unique. As Andy Aaron of IBM’s speech research team says, “These sounds can be incredibly subtle, even unnoticeable, but have a profound psychological effect.” The system can pause for effect, react to situations by modulating its speech, and learn new affectations, which it will then place in the correct part of a sentence.
Intelligibility has been achieved in synthetic voices for decades (check out the video, which shows the first-ever singing computer, along with a classic clip of HAL), but these speaking devices are obviously not human, which may incline you to distrust them or dislike listening to them.