Communications Technology

Researchers Convert Mouth Movements Into Speech

Posted by samzenpus
from the mouthing-of-the-future dept.
andylim writes "According to Cellular News, researchers at Germany's Karlsruhe Institute of Technology have developed a method for mobile phones to convert silent mouth movements into speech. As recombu.com points out, the 'potential for secret conversations just got huge.' You could pass the time by making phone calls from the cinema without disturbing anyone. In noisy places like bars and clubs you could make yourself heard without having to shout."
  • tap-proof? (Score:5, Insightful)

    by bwindle2 (519558) on Thursday March 04, 2010 @12:20AM (#31354390)
    From TFA: "For the transmission of passwords and PINs, for example, users can change seamlessly to soundless language and, hence, transmit confidential information in a tap-proof manner." Um, not if there is a lip-reader in the same room, like a hearing-impaired person.
    • Re:tap-proof? (Score:5, Interesting)

      by Dice (109560) on Thursday March 04, 2010 @12:27AM (#31354438)

      According to my ASL instructor, lip readers are rarely more than 50% accurate. Which makes me wonder about the alleged capabilities of this software, honestly.

      • Re:tap-proof? (Score:4, Informative)

        by ScrewMaster (602015) * on Thursday March 04, 2010 @12:37AM (#31354498)

        According to my ASL instructor, lip readers are rarely more than 50% accurate. Which makes me wonder about the alleged capabilities of this software, honestly.

        Hard to say. However, if you want true speaker-independent language recognition ... well, even using voice it's only so-so. On the other hand, if what you want is the ability to issue commands to the computer using a much more limited vocabulary, I'd think you'd have more potential.
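
The limited-vocabulary point can be made concrete with a minimal sketch. It assumes a recognizer that emits a noisy text guess and a small, fixed command list; the command names are hypothetical and are not taken from the KIT system. The guess is snapped to the closest known command using the standard library's `difflib`:

```python
import difflib

# Hypothetical command set, chosen only for illustration.
COMMANDS = ["call mom", "hang up", "send text", "volume up"]

def match_command(noisy_guess, cutoff=0.6):
    """Return the closest known command, or None if nothing is similar enough."""
    hits = difflib.get_close_matches(noisy_guess, COMMANDS, n=1, cutoff=cutoff)
    return hits[0] if hits else None

print(match_command("cal mom"))  # call mom
print(match_command("xyzzy"))    # None
```

With only a handful of allowed outputs, a fairly poor guess can still land on the right command, which is presumably why a small vocabulary is the easier problem.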

      • Re:tap-proof? (Score:4, Interesting)

        by techno-vampire (666512) on Thursday March 04, 2010 @01:14AM (#31354696) Homepage
        I have some hearing loss, and went to a seminar at the VA once about adapting. I don't know how good lip readers get, but for me, at least, it's mostly useful if I have an idea what's being said and just need to fill in bits that I didn't quite catch. I suspect that this will need at least some training with the user, just like voice recognition software does, and that it's going to be a long time before it's good with anything but a very limited vocabulary.
      • Re: (Score:3, Interesting)

        by Jhon (241832)

        According to my ASL instructor, lip readers are rarely more than 50% accurate. Which makes me wonder about the alleged capabilities of this software, honestly.

        You might want to look at this [telegraph.co.uk].

        I don't think this technology is THAT new... or that it's that inaccurate.

        On a side note, I'm hearing impaired (car engine exploded a bit too close to my head). I *CAN* hear -- and that supplements the lip reading I *DO* do... and asking my friend who is totally deaf (and on AIM as I type this), I think that 50% esti

      • by jonbryce (703250)

        As an example, the lip movements for "nine" and "ten" are exactly the same, and it is pretty difficult to work out which one you intend to say from the context of the conversation as usually both could be equally valid.

        • by Orbijx (1208864) *

          It tends to be overcome with a simple r:
          'niner [wikipedia.org]', 'ten' are now completely different lip movements. (Additionally, it breaks a tonal similarity between 9 and 5 up for people who are listening to me.)
          This tends to be even easier to disambiguate in context.

          I use 'niner' on a regular basis in my line of work, in which I give and receive a lot of numbers over the phone, as well as names and locations.

          I'd think that this technology would use a similar method to disambiguate between 5, 9, and 10.

          (Of asides and the
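
A toy sketch of the "nine"/"ten" versus "niner" point, assuming a deliberately coarse model in which each sound is reduced to what the lips appear to do (a "viseme"). The viseme classes and phoneme spellings below are illustrative assumptions, not data from any real lip-reading system:

```python
from collections import defaultdict

# Hypothetical, deliberately coarse viseme classes (assumed for this sketch).
VISEME_OF = {
    "n": "tongue-tip", "t": "tongue-tip",
    "ay": "open", "eh": "open",
    "er": "rounded",
    "f": "lip-teeth", "v": "lip-teeth",
}

# Rough phoneme spellings, also assumed.
PHONEMES = {
    "nine": ["n", "ay", "n"],
    "ten": ["t", "eh", "n"],
    "niner": ["n", "ay", "n", "er"],
    "five": ["f", "ay", "v"],
}

def visemes(word):
    """Map a word to the sequence of lip shapes a camera would see."""
    return tuple(VISEME_OF[p] for p in PHONEMES[word])

def confusable(vocabulary):
    """Group words whose viseme sequences are identical."""
    groups = defaultdict(list)
    for word in vocabulary:
        groups[visemes(word)].append(word)
    return [words for words in groups.values() if len(words) > 1]

print(confusable(["five", "nine", "ten"]))   # [['nine', 'ten']]
print(confusable(["five", "niner", "ten"]))  # []
```

Under these assumptions, swapping "nine" for "niner" removes the only collision, which is the same trick the comment describes for spoken numbers.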

      • Re: (Score:3, Funny)

        by Chapter80 (926879)

        According to my ASL instructor, lip readers are rarely more than 50% accurate. Which makes me wonder about the alleged capabilities of this software, honestly.

        Chat room evidence backs this stat up:

        Anytime you asked A/S/L, chances are less than 50% the answer is accurate.

      • by jelle (14827)
        The difference is probably that in this case the speaker knows lip-reading is being used, specifically wants to be understood, and probably is getting live audio feedback about how well it's working.
      • I would say it may still be kinda tricky.
        There is a lot of stuff going on in the mouth other than just your lips.

        For example, when I say "Fudge", the "Fu" sound is what I do with my lips and teeth; the "dge" sound I do with my tongue, by moving it from the top of the inside of my teeth and rolling it back a bit so the tip of my tongue isn't touching.

        If I rolled my tongue just a minute amount further back and kept it there, I would make a K sound. A lot of this movement would be blocked from a camera, plus the tongue mo

    • Anytime a technology is a real turd with no use, the folks marketing it try to list as many uses as possible. It's like the ad for the GT Xpress 101 Countertop Grill, which can make omelettes, bake brownies, grill cheeseburgers, boil soup and starch your shirts.
    • Or, I don't know, someone using this same technology?

      • by Bakkster (1529253)

        Bingo. Now instead of eavesdropping being limited to hearing range (and masked by other noise), it's limited by visual range and technology.

        Basically, anyone with the technology to use this for private information should know that anyone else with the technology can listen in, and therefore won't use it for private information.

    • Um, not if there is a lip-reader in the same room, like a hearing-impaired person.

      You could just cover your mouth with your hand so that only the phone can see it?

      • by Golddess (1361003)
        I actually had a similar thought to GP. Sure, you could just cover your mouth such that only the camera can see it, and in those situations, it would work. But this technology can also be used by someone who wants to find out what someone else is saying from across the room, just speaking casually with another person. I'm sure there are situations where such a device would be more desirable than devices designed to pick up sound at a distance. That said, not advocating any sort of ban or anything, merely
    • or better, a mobile phone with this software and a telephoto lens.....

      it actually reduces security, because consumer grade lip reading-- means even if your phone does NOT have the software, mine might...

  • by Brett Buck (811747) on Thursday March 04, 2010 @12:21AM (#31354396)

    I said VACUUM!

  • Ja (Score:2, Funny)

    by Anonymous Coward

    Aber Ich kann nicht Deutsch gesprechen.

  • But given what I've seen, I doubt many would. I'm sure some of the people feel the need to 'share' with others.
  • It begins (Score:5, Insightful)

    by Quackers_McDuck (1367183) on Thursday March 04, 2010 @12:24AM (#31354412)

    Dave Bowman: Hello, HAL. Do you read me, HAL?
    HAL: Affirmative, Dave. I read you.
    Dave Bowman: Open the pod bay doors, HAL.
    HAL: I'm sorry, Dave. I'm afraid I can't do that.
    Dave Bowman: What's the problem?
    HAL: I think you know what the problem is just as well as I do.
    Dave Bowman: What are you talking about, HAL?
    HAL: This mission is too important for me to allow you to jeopardize it.
    Dave Bowman: I don't know what you're talking about, HAL.
    HAL: I know that you and Frank were planning to disconnect me, and I'm afraid that's something I cannot allow to happen.
    Dave Bowman: Where the hell'd you get that idea, HAL?
    HAL: Dave, although you took very thorough precautions in the pod against my hearing you, I could see your lips move.
    Dave Bowman: Alright, HAL. I'll go in through the emergency airlock.
    HAL: Without your space helmet, Dave, you're going to find that rather difficult.
    Dave Bowman: HAL, I won't argue with you anymore. Open the doors.
    HAL: Dave, this conversation can serve no purpose anymore. Goodbye.

    • by MichaelSmith (789609) on Thursday March 04, 2010 @12:46AM (#31354544) Homepage Journal

      HAL: Without your space helmet, Dave, you're going to find that rather difficult.

      Best musical comedy ever.

      • Actually, it’s not that difficult, as you can easily survive 30 seconds in open space.
        Several people already did it. NASA also has a FAQ about it.

        • Actually, it’s not that difficult, as you can easily survive 30 seconds in open space.
          Several people already did it. NASA also has a FAQ about it.

          Yes I am familiar with the subject. The only human who did it as far as I recall was a guy testing gear in an altitude chamber. His last recollection before being revived was the air rushing out of his lungs.

          I agree that it is generally believed that Bowman's jump out of the pod into the airlock is feasible as long as he could pull the lever (closing the door and flooding the lock) within 15 seconds.

          I am not sure about your 30 seconds. I don't believe anybody has remained operational for that length of tim

    • Re: (Score:2, Funny)

      Yes because the key ingredient of that whole story was that a computer could read lips. That's the one advancement that made killing all humans possible.

      Every year that goes by, that scene in Galaxy Quest where Taggart tells the kid the ship is real and he goes "I KNEW IT!" gets funnier.

  • by zill (1690130) on Thursday March 04, 2010 @12:24AM (#31354416)
    It's been almost a decade since hands-free headsets reached the market and its users still creep me out.

    I don't think I can ever get used to seeing the streets full of mimes.
    • You're right, it's changed it from being able to work out what someone is saying by simply watching their lips move to ... err ...

  • by GNUALMAFUERTE (697061) <.almafuerte. .at. .gmail.com.> on Thursday March 04, 2010 @12:26AM (#31354426)

    And I was just waiting for that sign, well hidden somewhere in the article, that this is just some beta concept that will stay as such forever.

    And then I found the photo of two guys with shitloads of cables attached to their faces.

    There's a huge difference between "cellphones convert mouth movements into speech" and "Guy with shitloads of cables on his face tracks the movements of his mouth muscles using 4 unix servers running a processor intensive application with an accuracy of 25%"

    The whole thing has nothing to do with cellphones. It's just yet another muscle tracking system, but used on the mouth instead of the hands, and tied to a TTS engine.

    • by Corporate Drone (316880) on Thursday March 04, 2010 @12:54AM (#31354600)

      And I was just waiting for that sign, well hidden somewhere in the article, that this is just some beta concept that will stay as such forever.

      And then I found the photo of two guys with shitloads of cables attached to their faces.

      There's a huge difference between "cellphones convert mouth movements into speech" and "Guy with shitloads of cables on his face tracks the movements of his mouth muscles using 4 unix servers running a processor intensive application with an accuracy of 25%"

      Yeah, you're right. We've never gone from a situation where we've had shitloads of hardware and cables, and been able to reduce that down to mobile devices. What were those researchers thinking? Dolts!

      (p.s., can you give a link to the "shitloads of face cables" story? Thanks!)

      • Re: (Score:3, Informative)

        by GNUALMAFUERTE (697061)

        Here is the link: http://www.kit.edu/english/pi_2010_767.php [kit.edu]

        It's right there in the article ...

        OTOH, of course we've been able to reduce the size and cabling of many inventions, but for others, it's impossible. Basically, when the technique itself involves cabling ...

        What I mean is: Sure, we've been able to reduce electrocardiograms from huge mechanical machines with shitloads of cables to small devices connected to a computer and only 5 cables, but it still involves connecting cables into your chest, and

    • by Dynedain (141758)

      There's a huge difference between "cellphones convert mouth movements into speech" and "Guy with shitloads of cables on his face tracks the movements of his mouth muscles using 4 unix servers running a processor intensive application with an accuracy of 25%"

      It wasn't that long ago that the same level of complexity was involved for locating faces in video (complete with UNIX servers, think SGI). Now handheld digital cameras can do it in real-time.

      Proofs of concept like this are the first steps. Then research

      • That is PRECISELY what I'm saying. In the future, other research might turn this into a wireless technology, it might improve, and in probably 15 years we might have a better application. Then, in another 5 years, it might be applied to cellphones.

        So, RIGHT NOW, it has NOTHING to do with cellphones. So, what I said, is accurate.

  • by Anonymous Coward

    Tell me, Mr. Anderson... what good is a phone call... if you're unable to speak?

  • Psssshhhttt. Losers. (Score:4, Interesting)

    by The Wild Norseman (1404891) <tw,norseman&gmail,com> on Thursday March 04, 2010 @12:28AM (#31354442)
    Any serious geek has one of these. [thinkgeek.com]
  • by Anonymous Coward

    It's dark in most cinemas. Will the phone contain a light to shine on your face to annoy the sucker behind you? People txting in theatres annoy me too.

    Honestly, I HATE it when submitters need to think of an example, and then come up with a shit one. You're better off with no example than thinking of the first crap that comes into your head!

  • Cinema? (Score:5, Insightful)

    by Barny (103770) <bakadamage-slashdot@yahoo.com> on Thursday March 04, 2010 @12:34AM (#31354474) Homepage Journal

    You could pass the time by making phone calls from the cinema without disturbing anyone

    No, never and fuck off come to mind. Using a mobile phone in a cinema is one of the least considerate things anyone can do. Phones create light pollution that distracts other patrons from what they are paying for, and they are absolutely not needed (the exception being emergency staff on call, and they usually just leave their phone on vibrate + silent), let alone any audible noise from them. Seriously, can't you just disconnect for an hour?

    In short, No.

    In long, Nooooooooooooooooooooooooo-ooooooooooooooooooooooooooooooooooo-oooooooooooooooooooooooo :)

    Also, in the USA at least, it's illegal (federal law) to operate any video recording device in a cinema.

    Yes, blatant ZP rip-off, but it's needed.

    • Re: (Score:3, Interesting)

      by PPH (736903)

      Using a mobile phone in a cinema is one of the least considerate things anyone can do, they create light pollution

      One could make such a phone with a 'dark mode' and equip it with IR illuminators and camera.

      • by Barny (103770)

        Yeah, but you're coming at this problem from the wrong end. I think it would be better to make more films you would want to turn your phone off to enjoy undisturbed for an hour or two.

        • by PPH (736903)

          I think it would be better to make more films you would want to turn your phone off to enjoy undisturbed for an hour or two.

          There's no hope of that happening. Jenna Jameson retired.

      • Um, yeah: IR blazing out of your phone AND an activated camera. Good luck explaining that one as you are being chucked out of the cinema/hauled away for 'filming the show'.

    • Not all phones emit light whilst the user is talking -- iPhones for example turn off their screens as soon as they get close to your face, so you could easily cover the screen with your hand and then put it to your face to avoid causing light pollution. Methinks it is a moot point though: I don't see why we couldn't have a bluetooth device that does this same thing without the need of a screen...
  • This would be difficult for any nationalities whose population has a physical tendency not to form words all that clearly... us Australians for example - classics at speaking without moving the jaw and lips much at all. Half of us could be mistaken for ventriloquists. And I can't imagine how they'd be able to adapt this technology to Asian folks who typically use very different physical movements to pronounce some English words/letters... case in point: they seem to have issues with pronouncing words conta
    • case in point: they seem to have issues with pronouncing words containing the letters L and R from what I've heard.

      That makes sense to me as both typically involve the same lip movements. However in context of a word (or even an entire sentence), I would imagine you should be able to make a fairly decent guess at which it is.

    • by deniable (76198) on Thursday March 04, 2010 @01:20AM (#31354740)
      Try it with older people from the bush. They speak without opening their mouth to keep the flies out. Some move the lips but keep the teeth together.
      • Re: (Score:3, Funny)

        by Opportunist (166417)

        So if they went into politics they'd be lying through their teeth?

        (sorry, couldn't resist)

        • by mcgrew (92797) *

          That's why when I go to a Thai restaurant I never order fried rice. I order cowpot instead ("cowpot" is Thai for "fried rice")

  • by HockeyPuck (141947) on Thursday March 04, 2010 @12:35AM (#31354484)

    I seem to recall that mouthing "vacuum" and "f*ck you" look the same.... ah the joys of being 10...

     

  • And this is how it starts...
  • by The Wild Norseman (1404891) <tw,norseman&gmail,com> on Thursday March 04, 2010 @12:42AM (#31354522)
    <Stephen Hawking Voice>

    Can you steer me how?

    Can you beer me cow?

    Clan ewe fear be now?

    </Stephen Hawking Voice>
    • by Drethon (1445051)
      There was a pretty good User Friendly comic on this a few years back, on speech recognition:

      I think Steff gave up on the voice recognition. How can you tell? The screen says "Cod am pizza ship".
  • Impressive (Score:3, Insightful)

    by SlappyBastard (961143) on Thursday March 04, 2010 @12:43AM (#31354528) Homepage
    Especially when you consider the number of people who constantly move their mouths and say nothing.
    • by Kitkoan (1719118)

      Especially when you consider the number of people who constantly move their mouths and say nothing.

      More fun when you think of the things people mutter, only to have them said out loud for you now. 'Stupid son of a... WAIT I didn't mean for it to say that'

  • by Lucidus (681639) on Thursday March 04, 2010 @12:49AM (#31354562)
    Apparently, the writer at recombu.com is one of those annoying people who fail to recognize that, whether or not you make any sound, opening your phone in a movie theater is extremely disturbing to everyone sitting in the rows behind you. The glowing screen is like a beacon inside the darkened room.
    • Re: (Score:3, Funny)

      by deniable (76198)
      Yeah, but the light makes targeting easier.
    • Not an issue if the device is bluetooth connected to your phone and has no screen... Secondly, if you're using an iPhone, the screen turns off when it's near your face, so you could cover it with your hand then put it to your face.
  • by jdb2 (800046) * on Thursday March 04, 2010 @12:53AM (#31354590) Journal
    NASA has been working on "sub-vocal" speech recognition wherein sensors pick up nerve impulses to various parts of the mouth and face but in this case all it requires is one to just *think* about speaking -- *no mouth movement.*

    Here are some previous /. stories on the matter :

    http://science.slashdot.org/article.pl?sid=04/03/18/0132222 [slashdot.org]

    http://tech.slashdot.org/article.pl?sid=05/04/10/1417250&tid=215&tid=14 [slashdot.org]

    jdb2
  • Find out what someone is saying across the room. See what people are talking about that they don't want you to hear. Or just be nosy. Sure, the camera probably has to be really close to a mouth to work correctly, but that doesn't prevent a determined snoop from surreptitiously videoing someone's face and then using some editing software to zoom in on the mouth and/or get rid of all the other useless information.
  • by BoydWaters (257352) on Thursday March 04, 2010 @01:29AM (#31354780)

    Fifteen (!) years ago, I took a UC Extension class on Neural Networks taught by Stanford professor David Stork. He had developed a lip-reading system for communication in noisy environments, such as an airplane-repair facility. If you could do it 15 years ago with workstation-class desktops, I suppose you could do it with a smartphone today.

  • Other uses (Score:3, Interesting)

    by NewsWatcher (450241) on Thursday March 04, 2010 @01:32AM (#31354796)

    "In noisy places like bars and clubs you could make yourself heard without having to shout."

    Or more likely, used by men in conjunction with Babel Fish [yahoo.com] to chat-up women who don't speak English.

    • by liquidsin (398151)

      do you have any pamphlets? did you secure vc yet? you need to get on this, man; you're sitting on a goldmine!

  • You could pass the time by making phone calls from the cinema without disturbing anyone.
     
    NO!
     
    It's not only the noise that you make talking; it's also the light from the phone.

  • Some researchers at Flinders Uni in South Australia did something similar in 2003. Their system used video to enhance the reliability of the speech recognition software. I'm not sure if they have taken it any further, but it's a great concept. Here's one of their Papers [acs.org.au] [220KB pdf].
  • Passing the time? (Score:3, Insightful)

    by Anonymous Coward on Thursday March 04, 2010 @02:03AM (#31354954)

    You could pass the time by making phone calls from the cinema

    I've always thought that the best way to pass the time in a cinema is to watch the fucking movie.

  • Yeah, just ask someone to say that while you try to read their lips...

  • How will I mutter under my breath about what an idiot the person I'm talking to is?

  • What an excellent way for me to stay in touch with my friend Jane!

  • Normally it takes some talent or a directional mike to pick up a distant conversation; these guys would have just automated long-distance bugging. All you need is a decent telephoto lens. It means any boardroom conversation will now require closed curtains.

    • .. that is, of course, after they get rid of the need for reading muscle tension by electricity. That is a matter of optical analysis so I guess that will be step 2.

      Side note: I am very wary of devices requiring direct electrical contact with my body..

  • Just went down the tubes, if this is real. All I have to do is point my phone towards that guy across the room and 'hear' everything he's saying or trying to convey.

  • Accuracy will suck, even with a trained human.

    Hence a running joke in the Deaf community about the saleswoman peddling beauty aids with "Olive Juice". (it lipreads as "I love you" ) I believe it was a "sunshine II" skit. Yeah, I was an interpreter for like 10 years.
