Baidu's Voice Recognition Software Is More Accurate Than Typing (thestack.com) 55
The massive Chinese web services company Baidu has launched its sophisticated new TalkType 'keyboard' app, which defaults to voice recognition. An anonymous reader quotes The Stack: Baidu claims that the app's speech recognition is more accurate than actual typing, having developed and tested the technology alongside speech software experts at Stanford University... The researchers concluded that Baidu's technology was three times faster than a typical user typing in English. The results showed that the TalkType error rate was 20.4% lower than an English texter hunting and tapping for letters. The accuracy was even greater for those typing in Mandarin, with the error rate dropping 63.4% when using TalkType.
Of course, last year Baidu was also accused of gaming the testing for their image-recognition software.
Better than hunt and peck? (Score:3)
That is like the test where someone claimed they defeated the Turing test by pretending to be a retarded foreign boy who didn't speak English.
I guess they also only picked people who had never before typed anything in their life as well.
Leads to hunt-n-smash (Score:5, Insightful)
Another thing -- when I'm typing, and there is an error, I'm right there to correct it.
With voice recog, at least right now, editing it after it's been screwed up by Google or whatever is more of a PITA than just typing it out in the first place.
Trying to actually do decent editing (at least on my S7) is seriously annoying. Cursor positioning is flaky as hell, parts of messages disappear above and below the edit point, I try to drag the edit point and it scrolls up or down so fast there's no chance of actually getting where I meant to go...
I grant you that this kind of thing is the result of bad design at some level in Android or some library most everyone is using, and could be corrected... but right now, it's SN/AFU. That's a big factor in why editing as I go, rather than trying to get "somewhere" in something already containing lots of text, is much easier on my temper.
That said, I would welcome 99.99999% accurate voice recog. Not holding my breath, though.
Re: Leads to hunt-n-smash (Score:2)
Re: (Score:2)
That's it, unfortunately: if you want to have some arrow keys to position the edit point more conveniently, you have to root your phone!
Character versus word errors (Score:2)
A character error is easy to read past be a word error changes the meeting.
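For anyone who wants to put numbers on the character-versus-word distinction, here is a rough sketch. It assumes plain Levenshtein edit distance as the error measure (the summary does not say how the study scored errors), and the function name and sample sentences are made up for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or word lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


# Illustrative sentences (assumptions, not data from the study).
reference = "a word error changes the meaning"
typo = "a word error chamges the meaning"      # typing slip, still readable
misheard = "a word error changes the meeting"  # valid word, wrong meaning

# Character-level and word-level counts for each kind of mistake.
typo_chars = edit_distance(reference, typo)
typo_words = edit_distance(reference.split(), typo.split())
misheard_chars = edit_distance(reference, misheard)
misheard_words = edit_distance(reference.split(), misheard.split())

print(typo_chars, typo_words, misheard_chars, misheard_words)
```

Both mistakes count as a single word error, so a word-level score cannot tell the harmless typo from the substitution that changes the sentence. Neither count measures whether the reader can still recover the meaning.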
Re: (Score:2)
I complete agree: a turd error changes the meating. [send] (goddammit)
Re: (Score:2)
Re: (Score:1)
Not really. The key is - (faster than) "English texter hunting and tapping for Mandarin letters".
Are those... clutches? (Score:1)
Here, have some Trumpy crotch grabbing humor. [twitter.com]
Texting isn't typing (Score:3)
Re: (Score:2)
You're right, texting isn't as fast or accurate as typing, but I think you got the numbers wrong.
Near the turn of the millennium, speech recognition software (ViaVoice, etc.) achieved a claimed 99% accuracy. So I tried it out. After training, I got over 95% by speaking carefully (and slowly). The problem was that finding and fixing those 5% of mistakes took longer than typing the whole document over would have taken.
And yeah, most touch typists can't get more than 35 wpm and touch screens are worse, so the dec
Re: (Score:2)
Current Keyboard Was Designed For Speed (Score:2)
While it is true that there are keyboard layouts that can make typing faster on a computer, the current keyboard was designed to space out the hammers so they would not jam on typewriters thus increasing the speed people could type.
Re: (Score:1)
Re: (Score:3)
On a full English language keboard
Heh.
there is no way speech is faster if you know how to type.
Nah, that's just not true. Most professional typists don't exceed 100wpm, while the average person talks at 130-150wpm.
If typing was so much faster than speaking, they wouldn't do live subtitles by having someone repeat the words into a mic for speech recognition. Which is what they do, with occasionally hilarious results.
Re: (Score:2)
Re: (Score:2)
"Revoicing" is becoming more popular for live TV captioning. Revoicers, also known as respeakers, repeat clearly what is being said during unscripted events using special software that's trained to recognise their voice. Their speech is then converted into text which appears on a caption unit, an LED or large screen. Revoicers also need to pare down (edit) the live dialogue or conversation, which means the text that appears isn't verbatim, although it will always give a good idea of what's being said.
Re: (Score:1)
Re: (Score:2)
and with a display and backspace key (since I'm human) my ultimate accuracy is 100%.
That applies to every form of input. Texting only has a low accuracy rate because people need to make corrections, kind of like you are doing which makes you 100%.
Also you typing 100wpm is atypical. Most people don't type that fast. No actually that's not right. I'll wager that very very few people are able to type that fast.
Re: (Score:2)
Then you have a major speech impediment and should probably see a therapist for it.
Using your post as a sample, I am able to read it aloud in 22 seconds at a conversational rate. This is the same rate I use reading stories aloud to my children. Using my slower, more enunciated "speech recognition" voice, usually reserved for Google input, it takes me 37 seconds and the only thing I had to correct afterwards was the ( and ) you used. That includes all of your punctuation and the automatic correction of "kebo
Re: (Score:2)
On a full English language keboard there is no way speech is faster if you know how to type.
How fast do you type?
I've transcribed hundreds of hours of tapes, mostly lectures and panel discussions. I tested ~72 wpm. I spent a lot of time perfecting my typing methods and speed.
I estimated that most lectures were about 120 wpm. Some people talk much faster, particularly in bursts. I think certified courtroom stenographers have to pass a test at 210 wpm.
I could never keep up with continuous speech. I used a transcribing machine, and played it back at a slower speed, and/or backpedaled. I could usually
I'm not the average typist (Score:4, Interesting)
Re: (Score:2)
Re: (Score:3)
The article talks about speech recognition, not voice recognition. EditorDavid has the two concepts mixed up: speech recognition is all about trying to recognize what you are saying, whereas voice recognition is all about recognizing a specific voice, e.g. to identify who is speaking.
[actual expert here]
Not exactly: "speech recognition" means taking in speech and putting out some kind of text; "speaker recognition" is a general term for identifying speakers or verifying speaker identity. "Voice recognition" is a term that is not used in the field (but is sometimes used in the media) which generally means the same thing as "speech recognition".
What about autocorrect? (Score:2)
The results showed that the TalkType error rate was 20.4% lower than an English texter hunting and tapping for letters.
How many of those errors could have been reliably corrected by some form of autocorrect, or was such already included in the tests?
If I try and type "thw quick rbown fox jump sover the lazy dog" as fast as I can... well, that's the result. Autocorrect could have fixed most of those problems.
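As a rough illustration of that point, here is a minimal per-word autocorrect sketch using Python's standard `difflib`. The eight-word vocabulary, the `autocorrect` helper, and the 0.6 similarity cutoff are all assumptions made for this example; a real phone keyboard uses a far larger dictionary plus a language model, and nothing here reflects how the study's keyboard worked.

```python
import difflib

# Toy vocabulary: an assumption for this sketch, not a real dictionary.
VOCAB = ["the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog"]


def autocorrect(text, vocab=VOCAB, cutoff=0.6):
    """Replace each unknown word with its closest vocabulary entry, if any."""
    fixed = []
    for word in text.split():
        if word in vocab:
            fixed.append(word)  # already a known word, leave it alone
            continue
        # Best match at or above the similarity cutoff, or no match at all.
        matches = difflib.get_close_matches(word, vocab, n=1, cutoff=cutoff)
        fixed.append(matches[0] if matches else word)
    return " ".join(fixed)


typed = "thw quick rbown fox jump sover the lazy dog"
corrected = autocorrect(typed)
print(corrected)
```

With a vocabulary this small, every slip maps back to the intended word, including the misplaced space in "jump sover", but only because each fragment happens to sit close to exactly one entry. Against a full dictionary, "jump" is itself a valid word and would be left alone, so a word-by-word fixer would likely catch most of the errors rather than all of them.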
Re: (Score:2)
Re: (Score:2)
About a month later, you will discover you never needed it in the first place. Plus, you will never have to deal with people who mis-interpret your meaning in your text communication, as the improperly spelled uncorrected version of whatever you were trying to say will be instantly recognizable by whomever is reading it for what it was supposed to be, because as humans we are very, very good at that.
Why would anyone w
Oh Good... (Score:3)
Oh good, more assholes yelling into their phones while in public spaces. That's exactly what we need.
Suspicious... (Score:3)
One of the things that characterises modern Chinese language is the proliferation of homophones (words that sound alike).
The way that Chinese people cope with this is extreme use of context and of spelling; the homophones don't have the same character. Sometimes Chinese people will clarify meaning by sketching a character in the air, often unconsciously.
If the error rate reduction is so huge based on speech recognition this would suggest that pinyin can replace characters for writing Chinese. And this has been disproved on many occasions; you can literally write an entire story using only the syllable 'ma'. In pinyin it all comes out as 'ma' with the 4 tones. In characters it's actually readable. Same with the story of the lion-eating poet in the stone den, which is all 'shi'.
So a great test of this Baidu software would be to get someone to read this to it and see what it comes up with:
https://chinesepod.com/blog/ho... [chinesepod.com]
https://en.wikipedia.org/wiki/... [wikipedia.org]
and see if it gets it right:
Shī Shì shí shī shǐ
Shíshì shīshì Shī Shì, shì shī, shì shí shí shī.
Shì shíshí shì shì shì shī.
Shí shí, shì shí shī shì shì.
Shì shí, shì Shī Shì shì shì.
Shì shì shì shí shī, shì shǐ shì, shǐ shì shí shī shìshì.
Shì shí shì shí shī shī, shì shíshì.
Shíshì shī, Shì shǐ shì shì shíshì.
Shíshì shì, Shì shǐ shì shí shì shí shī.
Shí shí, shǐ shí shì shí shī shī, shí shí shí shī shī.
Shì shì shì shì.
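To show why sound alone underdetermines the characters, here is a small sketch that groups a handful of characters from the poem by their pinyin. The `READINGS` table and the `strip_tones` helper are assumptions written for this example, not part of any input method or of Baidu's software.

```python
import unicodedata
from collections import defaultdict


def strip_tones(pinyin):
    """Drop tone marks by removing combining characters after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", pinyin)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


# A few characters from the poem with their standard pinyin readings.
READINGS = {
    "施": "shī", "獅": "shī", "詩": "shī",
    "食": "shí", "石": "shí", "十": "shí",
    "史": "shǐ",
    "氏": "shì", "室": "shì",
}

by_tone = defaultdict(list)      # keyed by syllable with tone
by_syllable = defaultdict(list)  # keyed by syllable without tone
for char, pinyin in READINGS.items():
    by_tone[pinyin].append(char)
    by_syllable[strip_tones(pinyin)].append(char)

for pinyin, chars in by_tone.items():
    print(pinyin, "".join(chars))
print(dict(by_syllable))
```

Even when the tones are kept, several different characters share each sound, so a recognizer has to lean on surrounding context to choose among them.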
Re: Suspicious... (Score:2)
In fact, Google's voice accuracy
Re: (Score:2)
Chinese input using New Phonetic Method actually does this too. Basically after I type the sounds then tone and move on to the next character it will change the first character based on the sounds and tones of the following characters and continue to do so until I press enter. Often I will type out an entire sentence before pressing enter, though sometimes it starts to spit out bad results if you go for too long. It also does some recognition based off previous character choice such as when using gendered
'hunting' and typing? (Score:2)
No big brother (Score:1)
What, not a single comment about the Big Brother implications... especially for a Chinese company, when each of them is accused of being an extension of the party?
And I don't care about "look what Yahoo did" or whatever; being an extension of the party is different from complying with the law in a democracy.
NSA announces a competing product (Score:2)
The problem isn't the low error rate... (Score:2)
The problem is the false sense of security and subsequent lack of proofreading and error correction.
Try voice recognition software for a week. You'll likely find that you will read over something that you dictated and not realize that there are errors in it. People are less likely to find errors in something they dictated than in something they typed.
Huh? (Score:2)
I can type more precisely and quickly and confidently than I can talk. Deciding which words to type is faster and easier. How is voice recognition going to improve on the speed by which I can speak, which is inferior to my typing ability whenever precision is required?
Meanwhile back at the ranch... (Score:1)