
Sunday, August 30, 2009

Speech to Text Coming To iPhone?

According to a patent filing, Apple is working on speech-to-text technology for its iPhone and iPod product lines. Speech recognition could be the holy grail for data entry and retrieval on mobile devices, especially as they continue to shrink in size.

The Baltimore Sun found the patent and has included a diagram of how the system would work when composing an email.

There is a lot of engineering speak in the filing, but I could decipher a few tidbits of info - that and I've seen this stuff on Star Trek so I know how it is supposed to work. It seems the speech recognition module they are working on would be able to handle not only spoken text but also input that isn't spoken words, such as punctuation.

To varying degrees, this has been tried before on mobile devices. The most rudimentary are the voice snippets you can record into your phone for a few of your favorite contacts. One of the better speech tools for phones is Microsoft's Voice Command. It is really pattern recognition. You can say "Call Sally Jones at work" and it will search through your contacts, find a name that matches what your digitized voice said, and dial the number. You don't have to train it or record her name before you can use it. You can also ask it the time, battery level, signal strength, upcoming appointments and more. It is rather limited though: there is no way to compose an email with it or tell it to do anything outside of the dozen or so tasks it was written for.
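To make the "pattern recognition" point concrete, here is a rough Python sketch of that kind of fixed-command matching. It is not how Voice Command is actually built; it assumes the audio has already been turned into a text string, and the contact list, the phone numbers, the `handle_utterance` function, and the 0.6 similarity cutoff are all made up for illustration.

```python
import difflib
import re

# Hypothetical contact list; names and numbers are invented for this sketch.
CONTACTS = {
    "Sally Jones": {"work": "555-0100", "home": "555-0101"},
    "Sam Johnson": {"work": "555-0110"},
}

# The only command this sketch understands: "call <name>" or "call <name> at <place>".
CALL_PATTERN = re.compile(
    r"^call (?P<name>.+?)(?: at (?P<place>\w+))?$", re.IGNORECASE
)


def handle_utterance(text):
    """Return (contact, place, number) for a recognized call command, else None."""
    match = CALL_PATTERN.match(text.strip())
    if match is None:
        return None  # anything outside the fixed command set is ignored

    spoken_name = match.group("name").lower()
    # Assumption for the sketch: default to the work number if no place is given.
    place = (match.group("place") or "work").lower()

    # Fuzzy-match the spoken name against the contacts, so a slightly
    # misheard name can still find the right entry.
    lowered = {name.lower(): name for name in CONTACTS}
    candidates = difflib.get_close_matches(
        spoken_name, list(lowered), n=1, cutoff=0.6
    )
    if not candidates:
        return None

    name = lowered[candidates[0]]
    number = CONTACTS[name].get(place)
    if number is None:
        return None
    return name, place, number
```

With this sketch, `handle_utterance("Call Sally Jones at work")` returns `("Sally Jones", "work", "555-0100")`, while something like `handle_utterance("Compose an email")` returns `None`, which is the limitation in a nutshell: anything that doesn't fit one of the pre-written patterns simply isn't understood.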

I recall one demo by Bill Gates a few years ago where he spoke into a Pocket PC (that is what they were called way back when) and got nearly flawless text recognition out of it. The trick was that the voice data was digitized and sent wirelessly to a powerful server, which did the heavy lifting and returned the text to the screen. On the GPRS networks of the day, that just wasn't feasible, which is why he was using WiFi. Today, with 3G networks, it is more realistic, but there is still the question of who is going to pay for servers that could be handling hundreds of thousands of voices simultaneously.

It seems to me from perusing the patent that the speech recognition module is a separate chip or other such hardware in the device, purpose-built for this job, much like a video card offloads graphics from your computer's main processor. If Apple can pull this off, they will have a huge win on their hands.

I just hope they put an altimeter on it that cuts the module off at 10,000 feet so I don't have to listen to the guy next to me on a cross country flight dictate a research paper into his phone.
http://www.informationweek.com/blog/main/archives/2009/08/speech_to_text.html;jsessionid=1P544IFZ1JHDPQE1GHPCKH4ATMY32JVN
