Australia Cellphones Handhelds Technology

Sophisticated Voice Commands the Next Big Step For Smartphones, Says Woz

splitenz writes "Sophisticated voice commands will be the next big step for the development of smartphones, according to Apple co-founder Steve Wozniak. 'We have gotten down to such tiny devices with amazing computers inside with all the human senses: vision, hearing, touch, location, acceleration and movement. I don't want to click buttons anymore and I just want to do things without having to think about which buttons to click.' He was speaking at the Australian Chamber Business Congress. Wozniak also sees a continuing place for touchpads."
This discussion has been archived. No new comments can be posted.

  • by h4rr4r ( 612664 ) on Friday June 03, 2011 @10:32AM (#36331562)

    I don't want to have to yell at my phone in public, I don't want to have to remember which keywords to say.

    I have Google voice search and all the other crap it does, and I never use it. It is far easier, faster and less annoying to myself and others to type in what I want.

    • by somersault ( 912633 ) on Friday June 03, 2011 @10:38AM (#36331616) Homepage Journal

      +1 concur.

      My Mac had voice commands in the 90s, but it didn't include useful stuff like being able to choose which file was selected etc, despite being able to open and close files/windows with it. Even if it could do everything that your mouse and keyboard can do, it's still faster to just use your hands for the most part. Voice command is great for people with disabilities, but on a smartphone in a busy environment, what's the point? It's either not going to work because of ambient noise, or you're just going to piss everyone off.

      Bonus clip. [youtube.com]

      • by gnick ( 1211984 )

        Some of the basics are just fine - Like when I'm driving it's convenient to say "Call ***** mobile" and have it ring my girlfriend. But for most other applications (games, calendar, movies), I'm going to have to interact with my phone anyway just because of the nature of the activity. What use is it to say "Play Fight Club" if I'm not going to be holding and viewing the device?

        • by bsharp8256 ( 1372285 ) on Friday June 03, 2011 @11:03AM (#36331842)
          I call my girlfriend ***** too.
        • > Some of the basics are just fine - Like when I'm driving it's convenient to say "Call ***** mobile" and have it ring my girlfriend. But for most other applications (games, calendar, movies), I'm going to have to interact with my phone anyway just because of the nature of the activity. What use is it to say "Play Fight Club" if I'm not going to be holding and viewing the device?

          Thankfully it would be optional for those instances. I actually see quite a bit of use for it, and was disappointed with my iPhone's meager set of commands, which I never use because activating them takes too much effort and is not smooth. If I could treat it more like a personal assistant than an input device it would be nice, especially when driving.
          À la Star Trek: "Computer, captain's log, make appointment on 15th of June, 3pm, for tennis lessons."
          "Computer, text Dave: 'Hey Dave, want to play a game?'"
          Compute

      • by syousef ( 465911 )

        > +1 concur.

        > My Mac had voice commands in the 90s, but it didn't include useful stuff like being able to choose which file was selected etc, despite being able to open and close files/windows with it. Even if it could do everything that your mouse and keyboard can do, it's still faster to just use your hands for the most part. Voice command is great for people with disabilities, but on a smartphone in a busy environment, what's the point? It's either not going to work because of ambient noise, or you're just going to piss everyone off.

        > Bonus clip. [youtube.com]

        I know "The Woz" is a geek favourite, and he certainly has technical prowess, BUT the man is also a self-aggrandising fool who has a bad habit of exaggerating things. He wrote a book "How I Invented the Personal Computer And Had Fun Doing It" for feck sake. The man, for all his prowess, did NOT invent the PC. I'm sure I'll be modded into oblivion but it has to be said. I wouldn't take any of his predictions seriously.

    • Exactly my experience as well. Some limited voice commands I have found useful (play song, call this person, etc) but the others are often too much broadcasting of my activities
      • Agreed. Have you ever noticed how in Trek, computer voice interactions were generally limited to single actors in a given scene, generally either the ranking officer or a technical expert? Something like that might work.

        The communicator was also voice-commanded, of course. But they never tried it in a bar. :)

    • by Rei ( 128717 ) on Friday June 03, 2011 @10:42AM (#36331648) Homepage

      Shut up friends. My internet browser heard us saying the word Fry and it found a movie about Philip J. Fry for us. It also opened my calendar to Friday and ordered me some french fries.

    • by kinnell ( 607819 )
      This [youtube.com] pretty much sums up my experiences with voice recognition technology.
    • by DdJ ( 10790 )

      > ...I don't want to have to remember which keywords to say.

      If you have to remember keywords, it's not the sort of system I think Woz is ultimately talking about. How many years until IBM's Watson will fit entirely within your cell phone? Imagine something that you could chat with under your breath as if it were a person, not something like the voice command software of the 1990s.

      Imagine being able to mutter under your breath "now how do I get to the doctor's office?", and your phone presenting a little not

      • by h4rr4r ( 612664 )

        Even if it was as good as a human, the failure rate would still be too high. Instead of whispering, typing is more accurate and much faster.

    • by wiedzmin ( 1269816 ) on Friday June 03, 2011 @11:16AM (#36331944)
      FORMAT C ENTER!
    • Agreed. Until computers reach the point where they can process and understand natural language like we see happen in Star Trek, I just don't see how voice can replace touch for all but a handful of situations (such as when hands aren't available while driving, exercising, etc.). If natural language processing can become capable enough that we can speak to computers normally and get the output we expect, then sure, it'll have a place. But until then, cell phone use has shown us that talking to someone

  • Yeah sure (Score:5, Funny)

    by nitehawk214 ( 222219 ) on Friday June 03, 2011 @10:32AM (#36331564)

    We can have more people yelling into their phones. "Call Frank. No, not Balls Sank! CALL FRANK!"

  • NOISE REJECTION... The iPhone voice control works great with my Bluetooth helmet below 40mph, but as soon as I hit highway speeds it stops responding to my commands.

    It's great to be able to ride and change songs, make and receive calls, but I'd love to be able to select podcasts too. Right now that does not work: only playlists do, or you need to start your podcast manually, and then it plays everything in that podcast folder.

    • by lxs ( 131946 ) on Friday June 03, 2011 @10:42AM (#36331652)

      That's nature's way of telling you to concentrate on the road.

    • Would a throat mic help?
    • BTW - I had this same problem and discovered that you can make a playlist *of* your podcasts and play that with voice commands.

      Apple made it possible entirely from the iPhone when they added playlist editing on the phone.

  • by LWATCDR ( 28044 ) on Friday June 03, 2011 @10:34AM (#36331582) Homepage Journal

    There are things I would love to do with voice on a mobile device. Playlists, nav, texting, dialing. What I don't want is to live in a world full of people talking to their phones or themselves. Can you imagine a mall full of people using voice to text?
    Or, more simply, hell.

    • by Kozz ( 7764 )

      > There are things I would love to do with voice on a mobile device. Playlists, nav, texting, dialing. What I don't want is to live in a world full of people talking to their phones or themselves. Can you imagine a mall full of people using voice to text?
      > Or, more simply, hell.

      Well, we can always hope that adequate social pressure will prevent most people from doing stupid/annoying/obnoxious things, as with anything in life. That being said, I can also imagine how useful it might be to have a kind of "always on" mode for my phone with regard to voice commands. Set it on the kitchen counter as I'm going about my day, and give it commands like

      • "Smartphone: grocery list addition, Jolt Cola"
      • "Smartphone: reminder, 1 hour, call Jake regarding LAN party"
      • "Smartphone: text message
      • > "Smartphone: text message to Jane. I had a great time on our date last night. What are you doing on Saturday?"

        I was shocked that my Nexus S can practically do this now. "Send text to XXXX, I'll be home around 6 p.m." It worked, mostly. The text recognition worked flawlessly, but I did have to hit the "send" button with a finger. Beats using the on-screen keyboard.
      • "Well, we can always hope that adequate social pressure will prevent most people from doing stupid/annoying/obnoxious things, as with anything in life. "
        You mean like having their ringer on in the movie theater? Having loud private conversations on their cell phone while standing in line at the store? Playing loud profanity-filled music from their car while parked at the 7-Eleven at 10am on a Saturday morning?
        Where do you live? Because I want to move there.

      • > "Smartphone: text message to Jane. I had a great time on our date last night. What are you doing on Saturday?"

        Phones can already do this. It's called a phone call.


  • Good idea, but too late for Woz. He could have tried "Phone, find me some proper dance steps for the Argentine Tango."
  • Presumably this is just part of some prank. We'll see who he gets to think their cell phone is out shopping for groceries.

  • This cannot be a replacement for tactile input. While using voice commands to accomplish a task can be very useful in some situations, there are many places where it would be inappropriate or inconvenient. Not to mention that typing is faster than talking, and using a mouse allows for precision when it is needed.

    A truly well developed voice command system would be a boon to accessibility and convenience in many situations but cannot fully replace the older styles of input.

    When we have a useful thought interface T

    • Actually, thought, while better than speech for some commands, is worse for data entry.

      Humans don't think in a straight line. Even when typing out a full story, you rarely think the entire thing out first; you figure it out as you go. Sometimes as you enter a large body of text you are also revising previous statements, or sorting out future ones.

      We can't easily program parallel processes yet, so why do you think we can do input in parallel with our minds?

      Mice might go away replaced with touch screens and thought co

      • > Actually, thought, while better than speech for some commands, is worse for data entry... Humans don't think in a straight line.

        Thought as we currently think, yes. But with some kind of mental training, we could learn to use a thought interface productively--perhaps setting a mental flag for dictation.

    • I think the thing that voice recognition proponents are missing is that all non-text entry via a voice command system has to happen in band. This creates the problem of either a) systems that erroneously respond to conversations that aren't directed at them, or b) systems that are so tightly limited to very specific cues that they're difficult to use. The infuriatingly non-intuitive escape sequences necessary to switch between direct literal transcription and command entry just add fuel to that fire. Voice
  • The situation with people yelling into their phones is already way too annoying when sitting in a restaurant, bus, or other public place without adding more vocal static to the background.

    I propose working on a Cone of Silence add-on for cell phones, or maybe a neural headset transmitter-to-speech accessory.

  • irrelevant commentator is irrelevant

  • Even among humans, how many times must one ask someone to say it again?

    Giving voice commands to a computer will get you a row boat when you ask for a robot and tickets for a nudist play when you want a new display.

  • Kinda surprising that they will be first to have a fully voice activated phone. Either way, it's coming to everybody.

    • by h4rr4r ( 612664 )

      And it will fail, for the same reasons it always has: there is no way to tell "colon" from ":", speech has a high error rate even between two humans communicating, it is slow, it does not work in noisy environments, and it is annoying to everyone else in quiet environments.

  • use it every day when driving, it just works...

    "hey vlingo, call steve"

    http://www.vlingo.com/ [vlingo.com]

    • by h4rr4r ( 612664 )

      Even the built-in one can handle that.
      That still leaves it useless in public, in quiet places, or for anything using long words. Try to send a text message including the names of some pharmaceuticals or other complicated words and see what kind of mess you get.

  • by Anonymous Coward

    While it seems we have many people here that would make good slapstick writers, I've never had to comically yell at Google search to make it work. I simply say:

    "Call Eve"
    "Text Eve, I'll be home in 20 minutes"
    "Map of Walmarts"
    "Directions to Walmart"
    "Navigate to Walmart"
    "Note to self, post something on Slashdot tonight"
    "Listen to Beethoven"
    "Go to Wikipedia"

    And other very useful commands. Yes, most of the time I use my fingers (especially in public), but there are still many times that the voice commands are i

    • by mridoni ( 228377 )

      The problem I have with Vlingo, or Google voice actions, is that they only work in English. While I have no problem in saying "Call" instead of "Chiama" (in Italian), I would have to "translate" the pronunciation of my contacts' names into English for Google to be able to find them. This is immensely awkward and, by the way, doesn't work well (I tried). It's a pity, also because Google's voice recognition engine works very well in Italian (Voice Search works, Voice Actions do not) and a major usability hurd

  • I was actually thinking about this earlier today. Admittedly, I do love how accurate speech-to-text is on my Android phone; typing out a text message *is* a lot quicker that way. However, my thoughts were more focused on how "For sales, press 1. For support, press 2. For billing, press 3." has been replaced with an automated voice (invariably female) who says "tell me what you're calling about", and then hasn't the slightest idea what to do when I say "technical support" or "representative"...or tries to ev

    • Higher, higher, a tad higher. No, lower. Just a smidge higher. There! That's it! Now pull back, more, more, a little less. Good. OK, shoot.
    • The call for "voice activation" goes out semi-regularly from certain high-profile people. Fred Brooks said the same thing in 1995 (Mythical Man-Month After 20 Years: "I expect the WIMP interface to be a historical relic in a generation... speech is surely the right way to express the verbs").

      My best guess is that these high-profile people spend their working days in home offices, private cars, etc., where there's no one else to contribute noise pollution or get annoyed by voice commands. For the rest of

    • "elevate to 32 degrees. pull to 80%. release."

      That might actually be easier than the current interface ;)

  • Voice commands work pretty well on my Windows phone. "Call xxx Home" never fails. Well, actually it did fail when I first set up the phone for English UK, (I want my U in colour) but it was expecting an English accent, and wouldn't respond to my Canadian. I'm wondering how it works in the Southern US. "Y'all Call Home"
  • is Scotty talking into the Mac's mouse.

    • by h4rr4r ( 612664 )

      And even in Star Trek they had to say "Computer" before they spoke. Yet PADDs still took touch input, since no one wants to say out loud, "Computer, interspecies gangbang porn with at least 3 sexes and a midget".

  • I tell my iPhone - Call. Jane. Bonner.

    iPhone says - Calling Dave Norwood.

    iPhone voice dialling is totally useless at the moment. Maybe they should fix that first.
    • The Android one works extremely well, and is very good for composing texts quickly as well. Its accuracy is pretty amazing. I think Apple just bought a company that specializes in voice (Nuance), so it's quite likely that it will be fixed.
  • First recognized command

    "Find Wankage Material"

    Second recognized command

    "fap...fap...fap...uhhhuhhhhhhh"

    Third recognized command

    "zzzzzzzzzzzzzzzzzzzzzzz"

  • Watch, Apple will make this the only way to do things on their devices and force it on their customers, who will buy their crap anyway instead of taking a stand. It's what they do.
  • Any more such brilliant insights, Mr. Woz? And who do you suggest Apple should try to rip off this time to get the technology?

    Of course, if Apple's foray into handwriting is any indication, Apple will "solve" this problem by having us speak in Morse code, just like they didn't manage to get a decent handwriting system together.

    • You say that as if handwriting is somehow superior to a virtual keyboard. Evidence?

      • by t2t10 ( 1909766 )

        Oh, dear, the analogy was lost on you. Let me explain.

        Apple's last tablet, the Newton, had lousy handwriting recognition. They bought some third party software, it didn't work, and it dragged the whole product down. Their solution? A cheap, slow on-screen keyboard on their next attempt.

        Speech recognition is similar to handwriting recognition. If Apple's past record is any predictor, they "solve speech recognition" by buying a third party speech recognizer (check), failing with it, and then having us s

        • When I think of Apple, I don't think of the Newton. Anything produced by Apple while Jobs wasn't there shouldn't be counted towards Apple's product philosophy.

          How has the current iOS voice recognition failed?

          • by t2t10 ( 1909766 )

            > When I think of Apple, I don't think of the Newton.

            That statement is fully in line with Apple's usual "innovation by redefinition". The 1984 commercial was prophetic.

            > How has the current iOS voice recognition failed?

            It's not an input method, it just recognizes a few commands.

            • That's not a failure. That's a design decision. I doubt there is a big demand at the moment for the type of voice recognition you're talking about, so it's silly for Apple to invest so much time in that area when there's still a lot of work to be done in more important areas.

              Who does have flawless voice recognition incorporated into a smartphone?

              • by t2t10 ( 1909766 )

                > I doubt there is a big demand at the moment for the type of voice recognition you're talking about,

                Well, Woz disagrees with you, hence the whole f*cking discussion.

                > Who does have flawless voice recognition incorporated into a smartphone?

                Nobody, because voice recognition doesn't work reliably yet. That's why Woz's suggestion is stupid. But Apple did the same kind of stupid thing with Newton.

                • Woz was talking about voice commands, not an input method. Woz hasn't had much input into Apple's design decisions in quite some time, anyway.

  • Over a decade ago, there was a really good voice-controlled phone system called Wildfire (audio demo) [virtuosity.com]. It took a lot of computer power for the time, it was an expensive service to provide (racks of machines in the central office) and originally cost about $5 to $10 a day. It let you juggle multiple calls and callers through a very fast-responding voice interface.

    Orange, the European mobile provider, offered Wildfire as an extra-cost service from 2000 to 2005, then discontinued it over customer objectio

    • > Orange, the European mobile provider, offered Wildfire as an extra-cost service from 2000 to 2005, then discontinued it over customer objections.

      Seems like it didn't work all that well in the field. And that seems to be The Problem. Microsoft aside, it demos pretty well, but ends up being a very niche sort of thing.

      I think Voice Recognition should be kept (and improved) especially for disabled persons and those niche applications, but I don't see it being a general method. Besides, as someone has previously pointed out, the current texting generation will probably fuse their fingers to a keyboard in the next decade or so, so they won't have to

  • My phone smells funny and tastes like eww.
  • Nothing like trying to call a friend late at night on a Saturday, and your phone thinks it's a great idea to call your landlord instead
  • Why would we use speech with our phone?

    (Returns to typing his blog on his iPhone...)

  • Slashdot comments will be filled with those who believe it must be an all-or-nothing concept, who can't imagine talking to their phone all the time, and who make up outlandish scenarios that like-minded editors with karma mod up.
    While the rest of us see it and think yeah, I had that same idea about 5 years ago and would find it very useful in certain situations I run into frequently. At least I'd have it as an option when I couldn't use my hands or was multitasking. Nice article and then move on to the next
  • The idea is to use speech to do complex stuff or answer questions that would take multiple screen based input steps....

    "order me a pizza" (phone leverages location data, payment data, etc to order)
    "how long will it take me to drive to Frank's house from here?" (phone responds with time/mileage/cost)
    "When was the last time Mike called?" (phone responds with date/time/call length etc.)
    "play me some Jay-Z music"
    "add an appointment for next Wednesday to see the doctor at 8am, alert me the day before"

    This is the

  • I remember reading about an interesting concept a long time ago in Dave Duncan's book "Strings" [amazon.com].

    Basically, the computer took a sample of your normal talking voice, then a sample of what they called 'command voice' or something. When characters were communicating to the central computer they'd simply use their command voice instead of their regular voice. The computer was able to tell which user was requesting what action based on voice identification, and would ignore regular speech unless instructed

  • I really need something to log my daily actions, and using the keyboard for that seems silly and unnecessary. E.g., I just want to say to my phone, "Log this, the gardener started working today". And 1 year later it can tell me "Increase his fee". Does this seem silly to you? I am so f. in need of something like this. The Nexus S seemed like the sh. but it wasn't what I was hoping for. But some good devs can extend this usage. Oh shit, I just gave an excellent idea to someone!
  • The environmental inputs change unpredictably, the data input rate is a snail's pace, there's too much variety in users' voices, accents, dialects, and vocabulary, and it requires massively more processing power than a keyboard or touch interface. It's as if your kid asked for a mouse for a pet and you gave them a schizophrenic incontinent three-legged elephant.
