BW Businessworld

My Voice Is My Password

I first encountered speech recognition on the PC, with your Dragon software. How far do you think we’ve come from then?
Frankly, when I joined Nuance, we had started to get concerned about whether people would still install the boxed Dragon dictation product and use it. Today, they probably wouldn’t. But voice is really seeing a resurgence. Cloud has changed everything. Certainly, if you spend a lot of time typing and entering professional information, as lawyers and doctors do, you need voice products. In the US, doctors now have tablets instead of pen and paper. They dictate to the tablet. It saves time and is more accurate. There’s a lot of workflow that can happen from that. We have a lot of solutions for healthcare that are tuned for hospitals. All the needed lexicon is built into the model.

What is happening with voice beyond dictation?
You can now bring some intelligence into that in the background. So if a doctor is saying Jason’s got a temperature and he has a blood pressure issue, etc., suggestions on what to explore can be brought up. In medicine there are a number of niche areas, including cardiology, where speech is being used. At Nuance we think that voice recognition is transitioning to conversations very rapidly. This means moving into natural language understanding, semantic processing and classification, because we have to work out not just what you said but what it means.

What other uses are there for intelligent speech recognition? Are you implementing any?
There are many uses in the enterprise, especially for customer care. For example, we’re doing something with Tata Sky where you can order a movie and update packages by interacting with a voice. We’re also seeing use in the call centre business and in banks, where the combination of semantic processing and voice recognition is helping deliver services. Of course, the biggest challenge is accents, especially in India, because there are so many varied ones. That’s something we are working on, building in many languages and accents. But it’s a very difficult piece of engineering, and it needs a lot of data and time for accuracy to grow gradually.

I use the Swype keyboard, which is one of your apps for voice input for text, but I don’t always find it accurate. What can be done to improve it?
It’s taken many years to work this out, but now the ability to send that data to the cloud has really helped us because we get a big bank of material to work on. So many factors can degrade the raw data, among them microphones, which are now getting more sophisticated, and also the way the data travels through multiple systems over cellular networks before it reaches us. These have been some of the challenges of achieving accuracy. But now, with data going straight to our cloud, accuracy is improving. Another thing we’re investing in a lot is noise cancellation, especially in cars. That too is enhancing the experience.

(This story was published in BW | Businessworld Issue Dated 20-04-2015)