Typing is still the chief tool of the digital age. You can ask your iPhone little questions, or dictate emails to your tablet, if you speak like a nervous, semi-sedated translator. But getting your spoken thoughts, interviews, and meetings into text still requires transcription work. Now, however, the web has made affordable, fast, easy-to-use transcription convenient to obtain.
Rev, an audio transcription and translation service founded by early employees of online work marketplace oDesk, hits all those qualifying marks, and provides some assurances about its workforce, too. Getting your voice files to Rev is easy through its website, and nearly effortless if you use its recently launched iPhone/iPad recording app. Record through the app itself, fling the files to Rev with the speaker names attached, and the guaranteed turnaround for an hour of non-complex audio, with 98 percent accuracy, is 48 hours, for about $1 per minute. The results are mailed to you as a clean Word document.
I know this because I interviewed Rev CEO Jason Chicola, recorded the two-way conversation through Google Voice, and uploaded the 28-minute MP3 to Rev’s own servers shortly before 3:06 p.m. on a Tuesday. At 12:22 p.m. Wednesday, Rev emailed me to let me know that my 4,000 words were ready. On another occasion, I recorded a four-and-a-half-minute “note to self” about plans for a future e-book, and received the transcription back in 23 minutes.
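For anyone who wants to run the numbers on their own recordings, here is a rough back-of-the-envelope sketch in Python. It assumes the flat rate of about $1 per minute and the 98 percent accuracy figure quoted above, and it treats accuracy as a simple share of words, which is my simplification rather than Rev's stated definition. The function name and the calendar dates are placeholders of my own; only the times, the 28-minute length, and the 4,000-word count come from my test.

```python
from datetime import datetime

# Figures quoted in the article; the real rate is "about" $1 per minute.
PRICE_PER_MINUTE = 1.00
ACCURACY = 0.98


def estimate_job(audio_minutes, word_count, uploaded, delivered):
    """Return (cost in dollars, turnaround in hours, worst-case wrong words).

    Hypothetical helper for illustration only; it is not part of any Rev tool.
    """
    cost = audio_minutes * PRICE_PER_MINUTE
    turnaround_hours = (delivered - uploaded).total_seconds() / 3600
    # Assumes accuracy is measured per word, which is a simplification.
    max_wrong_words = round(word_count * (1 - ACCURACY))
    return cost, turnaround_hours, max_wrong_words


# Placeholder dates (a Tuesday and the following Wednesday); times are from the test.
uploaded = datetime(2013, 1, 1, 15, 6)
delivered = datetime(2013, 1, 2, 12, 22)

cost, hours, wrong = estimate_job(28, 4000, uploaded, delivered)
print(f"cost: ${cost:.2f}, turnaround: {hours:.1f} hours, up to {wrong} wrong words")
```

Under those assumptions, the interview works out to roughly $28, a turnaround of a little over 21 hours, and as many as 80 misheard words in a 4,000-word transcript.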
What does 98 percent accuracy on a transcript look like? I’ll use only Rev’s non-corrected transcription to quote Chicola for this post. In an interview, he noted that Rev’s price, accuracy, and speed were pretty good by industry standards, but not record-setting, at least not yet.
“We named the company Rev because we know that it’s not possible to be too fast, and so we’re doing a lot of stuff in HQ to make this faster and faster,” Chicola said. “We expect that the (turnaround numbers), while they’re true now, we’ll probably be twice that speed by the end of the year.”
Most of the non-enterprise transcription options on the web function as front-ends for global commission marketplaces, like Amazon’s Mechanical Turk. In the few times I’ve sought out interview transcriptions, I’ve come to services like NoNotes or Speechpad, among others, only to hesitate when faced with a week’s turnaround time, the higher prices for 48-hour jobs, or a general uncertainty. Rev’s simple pricing and convenient recording app are competitive, but it is the backend efficiency and upfront labor practices that take up the team’s attention.
More than two-thirds of Rev’s screened transcription workers are native English speakers. That is not a requirement, Chicola said, but a result of the firm’s proficiency requirements. Hired freelancers use Rev’s custom-built tools for transcribing: text editors with built-in shortcuts and automation tools, audio playback tools with fine-grained rewind and advance controls, foot-pedal integrations, and more.
There is a wealth of computer-powered services, free and paid, that offer quick transcription. But Chicola doesn’t see Siri, Evernote, Google Voice, or similar offerings as Rev’s competition.
“First and most importantly, the technologies that are out there tend do work extremely poorly when you have multiple speakers,” Chicola said (and Rev transcribed). “Technology that can be effective for one speaker is not effective for two speakers, and that’s an important distinction because nearly everything we transcribe is multiple speakers.”
The real challenge is expanding the market. Chicola said Rev’s market is segmented almost evenly between academia, lawyers, doctors, marketing firms doing surveys and panels, and journalists. People in jobs that haven’t traditionally used transcription services can think of Rev as a secretary, and the Rev Recorder as a kind of Dictaphone. Chicola and Rev have to get programmers and artists to consider transcription for calls with clients, and small boards of directors to send over their meetings for much more detailed minutes. To that end, Rev has an API that can automate the process even further.
Rev must, in other words, get people to trust that they can speak at length, by themselves or with others, and very quickly get those words back as text they can copy, paste, and run with.
“The bottom line is we realize it’s an enormous market and nobody was squarely focused on it,” Chicola said. “We saw the opportunity to really turn the market upside down.”
[Image: Flickr user Abulic Monkey]