Patents Microsoft Technology

Microsoft's Acoustic Caller ID Patent 185

theodp writes "A new patent granted to Microsoft Tuesday for automatic identification of telephone callers based on voice characteristics covers constructing acoustic models for telephone callers by identifying words or subject matter commonly used by callers and capturing the acoustic properties of any utterance. Not only that, it's done 'without alerting the caller during the call that the caller is being identified,' boasts Microsoft in the patent claims."
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward on Wednesday June 13, 2007 @07:49PM (#19499727)
    The only difference here (aside from what agencies have been doing since the 1960's) is that this analysis seems to be done in real time, rather than offline? I mean, haven't monitoring people been able to tell who is speaking based on sound synthesis since forever?
    • Re: (Score:3, Funny)

      by dreamchaser ( 49529 )
      I dunno how useful this is. I usually just recognize the voice myself. Our wetware has some wonderful capabilities.
      • Re: (Score:3, Insightful)

        It might not be useful in a home environment, but how about in an office where after the initial greeting the customer details are popped on-screen without you typing anything?
        • Re: (Score:3, Insightful)

          by Ctrl-Z ( 28806 )
          Isn't that why they ask for my account number?
          • You haven't had to ask a lot of people for account numbers have you?
            • Re: (Score:3, Insightful)

              by Nasarius ( 593729 )
              I have. As I remember, it's one of the least painful parts of working tech support.
          • by omeomi ( 675045 ) on Wednesday June 13, 2007 @11:07PM (#19501093) Homepage
            Isn't that why they ask for my account number?

            Good Lord, no. They ask for your account number just to irritate you because both you and the person you're talking to know damn well you had to key in your account number just 2 minutes ago.
            • I know this is off topic, but I have always wondered why they do this? I once asked the operator and they said I have no number...

              Is this a mystery like the missing sock in the laundry?
          • Chances are, the call center rep is already looking at your account when he/she asks for it. I goofed around with computer telephony a bit back in the early 90's and I remember that even then it was pretty standard tech to pull the number and do the database lookup automatically. Saving the call center a few seconds per call adds up. The reasons they ask for it are a) people found it disconcerting to have the call center rep greet them by name before they said anything and b) you might not be calling
    • NSA has had real-time voice ID since before '96 and possibly longer. How MS got this patent is beyond me. Our system is soooooooo broken
  • by grahamsz ( 150076 ) on Wednesday June 13, 2007 @07:49PM (#19499731) Homepage Journal
    Anecdotally I feel like some companies answer the phone quicker if you talk to their automated system in an irate and condescending manner. Could just be me though :)
  • Why? (Score:5, Insightful)

    by Aoreias ( 721149 ) on Wednesday June 13, 2007 @07:50PM (#19499747)
    What's the purpose of caller ID after I've picked up the phone? I'm not going to talk to some challenge response bot if I'm someone who needs to be IDd and screened anyway.
    • by Nymz ( 905908 ) on Wednesday June 13, 2007 @07:58PM (#19499815) Journal

      What's the purpose of caller ID after I've picked up the phone?

      If someone had acquired some of your personal information, and then tried to impersonate you, an automated voice recognition system could be useful by raising an alarm, or at least giving a percentage of how much their voice is like yours.
    • by taniwha ( 70410 )
      this is to shunt the really annoying tech support callers quickly to the waste bin ....
    • I'm sure identifying the speaker in live phone conversations isn't the only use. It probably works on any sort of audio - radio broadcasts, recordings, etc.
    • I think the purpose is so you can have a contact lookup while you are on the phone. Your computer would show you relevant details about the person. For example, if you were an account manager, you would get information about the client's account. The automated caller ID via voiceprint would avoid the need to type in information while you were on the phone
    • One word: Wiretapping. Now they can verify in real-time whether they're listening in on the right person.
    • What's the purpose of caller ID after I've picked up the phone? I'm not going to talk to some challenge response bot if I'm someone who needs to be IDd and screened anyway.

      Identification of who is talking on a conference call would be extremely useful. Especially since a lot of people sound the same and I have the memory of a goldfish. When someone speaks you could have a little display that tells you their name and the company they work for.

    • In case they're a fox?
  • that when someone calls me and says "Hi, this is John Smith," I will not be able to use that info to figure out that he's John Smith without violating Microsoft's patent? (Ditto when someone I know well says "Hi, it's me.")
  • You must have done that for longer than that, but YOU NEVER TELL the OTHER PARTY, you are doing that?

    Or do we have to assume that long before we make the call?
  • by Penguinisto ( 415985 ) on Wednesday June 13, 2007 @07:54PM (#19499789) Journal
    ...they're looking to patent-troll the CIA!

    Brilliant!

    /P

  • by Ngarrang ( 1023425 ) on Wednesday June 13, 2007 @07:55PM (#19499793) Journal
    I read the patent, but I guess I don't get it. How is what Microsoft claiming to do different from existing voice recognition systems?

    You have to train current voice systems so they recognize your voice pattern (or, acoustic ID) and translate it to text or action. Take that and add a system that keeps profiles for a more advanced version of caller ID. It seems like a natural evolution of the technology.
    • Yes, but their system will come pre-programmed with the important voice signatures.

      Bill Gates calling...
      Caller ID displays: God

      But, if there is ever an open source implementation of this, it will change to the following...

      Bill Gates calling...
      Caller ID displays: Don't even THINK about installing Windows(TM) on this caller ID
    • "How is what Microsoft claiming to do different from existing voice recognition systems?"

      Existing voice recognition systems might be more accurately called speech recognition. They don't recognize the voice (who is speaking); they recognize the speech (what is being said). They can be categorized as speaker dependent or speaker independent.

      Speaker dependent speech recognition (type 1) requires complex training by each user. It needs to know all the ways a person pronounces every possible phoneme. During
      • by cnettel ( 836611 )
        There are existing voice recognition systems as well. Some are used to choose which speech recognition profile to use, while some are used for other applications, similar to the one in the patent.
      • You're talking out of your ass.

        This is not speech recognition it's speaker recognition, and is nothing new.
    • That makes no sense. Just because you can train a system to work better at converting speech to text if it knows your voice pattern, doesn't mean that it can uniquely identify someone from the voice pattern. Those are two different things.....you can't just tell it to run the algorithm in reverse and expect there to be enough information. In fact, you aren't even running it in reverse if you don't have the text version of what they said.
  • Err (Score:4, Insightful)

    by OverlordQ ( 264228 ) on Wednesday June 13, 2007 @08:02PM (#19499833) Journal
    Wont this most likely violate wiretapping laws in two-party states?
    • yes, that is just what i was thinking, in order to do this acoustic caller ID thing they would have to record the caller's voice and recording people in telephone conversations is illegal unless the caller is notified of the call being recorded...
      • Not necessarily. The only "recording" going on is a few kbytes of data temporarily buffered for analysis: if it is immediately discarded and never available to be listened to by a human, I doubt there'd be a problem. The RIAA tried to make a deal out of the temporary storage of music data in a satellite receiver as being a "recording" but that didn't fly either, if I remember correctly.
        • Depends on the statute. The way the one in Pennsylvania is worded (it came up in a /. article a day or two ago), no actual recording has to be made - "interception" is sufficient to run afoul of the law.

          • Depends on the statute. The way the one in Pennsylvania is worded (it came up in a /. article a day or two ago), no actual recording has to be made - "interception" is sufficient to run afoul of the law.

            Here's the link to the PA statute: CHAPTER 57. Wiretapping And Electronic Surveillance [aol.com] It DOES sound like it could be considered "intercepting" and also "intentional use". Especially if some DA can interpret it to mean that videotaping in a public place falls under the same law. Of course, IANAL, YMMV,

      • By that logic, VoIP would be illegal too. I don't know what the US wiretap laws are like, but I doubt it's illegal to record into temporary buffers.

        If they were suggesting recording conversations for later identification, then I imagine there'd be an issue. This is doing the identification on the fly, so is unlikely to be an issue.
  • For real? (Score:2, Funny)

    by Anonymous Coward

    "Developers, Developers, Developers. I love this company, yeah" ** Sounds of flying chairs **

    Welcome to Microsoft patented caller Identity v1.0 beta
    Caller Identified: It's Steve... again
  • by Anonymous Coward
    I had no idea someone I might call might be able to identify me.

  • The sort of processing this patent covers is something that hasn't been possible until recently, but I think, in principle, is something absolutely necessary for robust AI, and that is doing recognition simultaneously on both low level features and high level features of data and on intersections of the two.

    By "high level" I mean things like word choice, language etc. By low level I imagine they mean things like the specific resonance characteristics of a voice. In voice there are intermediate levels of f
    • When machines become capable, for the first time, of being social or moral each basic step toward that will be patentable as well. We will have a patents that covers not-being-evil and one on not-being-an-asshole.

      It's a good thing we don't have that sort of problem with children, such that only one family can have children that, say, know the difference between right and wrong and since they patented that no one else is allowed. Or only one family that has children that have a sense of rhythm.

      But as compu
      • >We will have a patents that covers not-being-evil and one on not-being-an-asshole.

        AHA! That explains Bender. I guess Farnsworth couldn't afford to license the necessary patents at the time.

        Always wondered about that.
    • The sort of processing this patent covers is something that hasn't been possible until recently, but I think, in principle, is something absolutely necessary for robust AI.

      Do you know if there are medical applications for tech like this? For example, could it warn "life-line" support for seniors, the 911 dispatcher or EMT of patterns or changes that are probably significant but not obvious to the layman?

    • by donaldm ( 919619 )
      Do a Google search on "voice recognition" and as a starting point try http://en.wikipedia.org/wiki/Speech_recognition [wikipedia.org] however I don't think a patent is justified since a quick Google search with "patent" added on will give you 1,140,000 hits. Still it appears if you patent anything in the US and have the money it normally gets granted.
    • by rbanffy ( 584143 )
      "We will have patents on a machine not being stupid."

      Yes, but would an intelligent machine have the right to violate patents in order to preserve itself?
  • Comment removed (Score:3, Interesting)

    by account_deleted ( 4530225 ) on Wednesday June 13, 2007 @08:07PM (#19499873)
    Comment removed based on user account deletion
  • Does an ear count ? Seems like human beings have been doing this for ages. Wait, I will patent the act of refreshing oneself with one's arm whilst bent between hmmmm 0-90% - that should cover most beer drinkers, I want a tax from all pubs ......

    • by EvanED ( 569694 )
      Stop being dense. You don't patent a result -- you patent methods. (At least for utility patents. Design patents and biological patents are different, but neither of those apply here.)

      So unless their system works by intercepting acoustic waves with an eardrum that vibrates tiny bones that move a liquid that triggers tiny hairs which send electrical signals to a mass of neurons which somehow figures it out, no, the ear isn't prior art. Considering that we have not much better than "not a clue" how the brain a
      • Considering that we have not much better than "not a clue" how the brain actually associates the sound you hear to memory, I am skeptical that this is how their approach works.

        But "not a clue" is exactly what executives, patent lawyers and patent judges know about how software and say, mathematics, work, so how is this any different? They wrote a patent on something they don't understand and will approve it without understanding it. They might as well be patenting life - oh wait they do that too,
    • Does an ear count ?

      I assume you mean "does the human brain count" as the ear doesn't identify sounds. There is a lot of research into the human brain, and how it does what it does so well, but I doubt MS's latest innovation would match the intelligence methodology of the human brain.

      Remember, patents require more than an idea, otherwise every Sci-Fi movie in history that has an AI identify the main character when they use a phone would be prior art. You must also explain how it's done.

  • by theantipop ( 803016 ) on Wednesday June 13, 2007 @08:10PM (#19499909)
    /. should just put an RSS feed to newly issued patents on the front page. Would cut down on the number of stories per day though.
  • Wiretapping law (Score:3, Insightful)

    by w9ofa ( 68126 ) on Wednesday June 13, 2007 @08:12PM (#19499927) Homepage
    It is my understanding that recording a telephone conversation is against the law in most states, without notifying the other parties on the line.

    Thus, a practical device for this patent would most likely be illegal.

    • by xigxag ( 167441 )
      That's not the case. In most states [callcorder.com], you only need the consent of one party to tape record the call. Hence in most states, you can tape your calls without notifying others.
    • It is my understanding that recording a telephone conversation is against the law in most states, without notifying the other parties on the line.
      Thus, a practical device for this patent would most likely be illegal.

      Do you have to notify a caller that you are using caller ID? Do you have the right to make an anonymous phone call?

      This guide for journalists may be helpful: "Can We Tape?" [rcfp.org] But I am not sure that any existing law is a good fit for this new tech.

    • Re: (Score:2, Informative)

      If it processes it in real time, it doesn't need to record it, really. Just pass through in and out.
      • ...in order to IDENTIFY you correctly after a few calls it's going to have to record how you say your words and have it in a quickly-accessible database. Otherwise what's there for it to rely upon in identifying the person, a magic pixie?
    • Would you say something differently if you knew it was being recorded? Why?

      Years ago, I put up a sign in the lunch room where I worked, it said "wash your own dishes. Even if no one is looking."

      Seems to me the same principle applies here... Eh, what do I know? I hardly say anything to anyone, and when I do, I say what I mean.

      On the other hand, in today's world of digital recordings, cut-n-paste, out-of-context quotes, etc. I think "I never said that" should have the same legal weight as a "recording"

  • Microsoft, Sun, Apple and General Motors announced today they've also patented a talking TV-like device known as a "telescreen" that not only shows entertaining DRM'd media, but also reminds the user when they are behaving badly, eating poorly or being potentially offensive politically to others.
  • I know that by the way the article is written we're supposed to think it's an evil invasion of our privacy but honestly this sounds kind of cool.
  • by killmenow ( 184444 ) on Wednesday June 13, 2007 @08:33PM (#19500065)
    To patent anything, follow these steps:

    1. Choose something already being done in the real world, anything really
    2. describe it with maximum verbosity
    3. add "on the Internet" at the end

    Tada! PATENT!
  • by Anonymous Coward
    The keywords being:

    'without alerting the caller during the call that the caller is being identified'

    Don't we have laws against doing stuff with voices without informing people first? And since when is sampling audio, and then converting part or all of the audio to a format based on, and unique to the original, not an act of recording?
  • Hasn't this technology been explained over and over again in big-screen depictions of the NSA's technical capabilities?

    Maybe someone from the /. community should just patent 'patent trolling' and put an end to all this FUD.
  • "Do you have the box?" 5+ geek creds to anyone who also immediately thought of the same movie :-) Remember, kids. They're the US government. They don't DO that sort of thing. But they'll try.
  • that every time I recognise the person on the other end of the phone by recognising their "voice characteristics" I have to pay Microsoft tax?

    "Hi mom! oh damn..., I mean, hi stranger whose voice I don't recognise but I am wildly guessing is probably my mother..."

  • What this amounts to is the ability of MS to tell people they have to pay a royalty if they identify who they are talking to upon receiving a phone call.

    Ring Ring

    joe: hello

    Hello joe.

    joe: Who is this?

    You know who this is, so hows it going joe?

    Joe: Who is this?

    Stop fooling around Joe, Are you going to visit soon?

    Joe: Who is this?

    Well if you don't want to talk then good bye.

    click

    From the other end: My own son doesn't recognize his own mother's voice...

    From Joe's end: Must have been some crazy lady with MS stock
  • Inventors: Arthur C. Clarke and Stanley Kubrick

    First publication: 2001 A Space Odyssey (Released 1968). Heywood Floyd checks in to the space station:

    Female voice: "Thank you. You are cleared through Voiceprint Identification."

    http://www.imdb.com/title/tt0062622/quotes [imdb.com]
  • So ... (Score:3, Interesting)

    by Shadowlore ( 10860 ) on Wednesday June 13, 2007 @09:42PM (#19500569) Journal
    According to this:
    Not only that, it's done 'without alerting the caller during the call that the caller is being identified,'

    They are describing a means to RECORD callers without their knowledge, and hence without their consent. So would this software be illegal in some jurisdictions? You bet yer ass it would be.

    Wonder how it handles people who say "uhm" or "uhh" a lot. ;)
    • If they just process the stream, without saving, it is not "recording".

      N,IDNRTFA.

      • by Aladrin ( 926209 )
        The only way to 'process' the stream is to record it, even temporarily, to the memory of a computer.

        You can't work on data without putting it in memory at some point. This is even more so with data that is being analyzed, because it needs to deal with pieces of the data, not just the current position of the live stream, which is only milliseconds long to a computer.
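The buffering point in this sub-thread can be made concrete with a bounded in-memory window: the analysis needs several samples at once, yet nothing older than the window is kept. This is only a minimal sketch; the function name `windowed_energy`, the window size, and the energy measure are illustrative assumptions, not anything taken from the patent, and it says nothing about how any statute defines "recording".

```python
from collections import deque


def windowed_energy(samples, window=4):
    """Yield the mean energy of the most recent `window` samples of a stream.

    `samples` can be any iterable, including a live generator. Only the
    last `window` values are ever held in memory (hypothetical sizes).
    """
    buf = deque(maxlen=window)  # older samples are dropped automatically
    for s in samples:
        buf.append(s)
        if len(buf) == window:
            # Analysis looks at the whole window, not just the newest sample.
            yield sum(x * x for x in buf) / window


if __name__ == "__main__":
    for energy in windowed_energy(iter([1, 2, 3, 4, 5])):
        print(energy)
```

Whether a few samples held in a `deque` count as a "recording" is the legal question being argued above; the sketch only shows that some buffering is unavoidable for this kind of analysis.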
  • Sneakers (Score:3, Funny)

    by Loconut1389 ( 455297 ) on Wednesday June 13, 2007 @09:55PM (#19500665)
    My name is Werner Brandes, my voice is my passport. Verify me.
  • by TheTranceFan ( 444476 ) on Wednesday June 13, 2007 @10:11PM (#19500759) Homepage
    ...it's done 'without alerting the caller during the call that the caller is being identified.

    ...Sometimes...when the phone rings...
    ...I answer it...and just listen...
    ...I hear the caller's voice and identify them by their voice...
    ...Then hang up without saying anything.

    How insidious!
    What. Is. The. Difference.
    • I'm just wondering who talks on the phone if no one has answered the phone? If the phone is still ringing then are you really talking?

      Isn't caller ID good enough? And if someone blocks their phone, isn't waiting till they leave a message to pick up acceptable? Why do I need this on my answering machine?

      I guess I could see this being useful for telemarketers. They would then be able to tell who answered and say, "Hello, is your mother home?"

      • Think automated systems and fraud-detection / warning systems. Caller ID identifies the phone in a spoofable manner, if memory serves; it does not identify the person using the phone, nor is it useful if a trusted person is calling from a different phone.

        This might be useful for low-security automated systems where having people key in passcodes or account numbers isn't necessarily appropriate. It might also be useful for warning a human recipient when something seems not quite right -- imitating somebody
      • Good point. I guess one difference is that Caller ID can be blocked on the caller's end, but you can't use the phone without using a voice. A boon for those providing voice-obfuscation technology? (Never worked when I tried to call the office to get out of school back in the day ;-) )
    • Re: (Score:3, Funny)

      Them: "Is [insert partners name here] home?"
      Me: "Oh, hi [insert partners' friends name]. I'll go get her."
      Them: "How'd you know it was me?"

      Sheesh do anything with computers or on the internet and you can patent it.

  • by fyrewulff ( 702920 )
    I have Caller ID so I know who's calling BEFORE I pick up the phone, not afterwards.
  • 4th Amendment? (Score:3, Insightful)

    by ivanmarsh ( 634711 ) on Wednesday June 13, 2007 @10:57PM (#19501033)
    Should I even ask? Does the 4th Amendment mean anything anymore?

    Cops bust a guy for video taping them and charge him with wiretapping and Microsoft is going to be recording my voice and compiling a profile of me and that's okay?

    Words I'm guessing it will be looking for by default: bomb, liberal, weed, nuke, bush, 1st Amendment.

    My tinfoil hat is starting to look stylish.

  • I guess the NSA will come after them with prior art :-)

  • "Hello, this is Bill Gates. I know who you are."
    • "Hello, Bill Gates here again. I just wanted to announce some exciting new technology we are on the verge of patenting with our partners at Microsoft. To all of my friends, please do not take this for a junk call. Just Bill Gates sharing his fortune. If you hang up, you will repent later. Microsoft and AOL are now the largest Internet companies and in an effort to make sure that Internet Explorer remains the most widely used program, Microsoft and AOL are running a voice beta test.

      "When you repeat this call
  • Speaker identification has been researched for decades. Microsoft isn't offering a breakthrough solution to the problem, they are instead trying to patent the whole field.

    This is roughly the equivalent of trying to patent "3D graphics acceleration" or "data compression".
  • Hey, if Audible Magic can fingerprint and identify a copyrighted song regardless of compression or transcoding, why not this?
  • ...every time we recognize someone's voice on the phone.
  • by gringer ( 252588 ) on Thursday June 14, 2007 @04:08AM (#19502569)

    A method and apparatus are provided for identifying a caller of a call from the caller to a recipient. A voice input is received from the caller, and characteristics of the voice input are applied to a plurality of acoustic models, which include a generic acoustic model and acoustic models of any previously identified callers, to obtain a plurality of respective acoustic scores. The caller is identified as one of the previously identified callers or as a new caller based on the plurality of acoustic scores. If the caller is identified as a new caller, a new acoustic model is generated for the new caller, which is specific to the new caller.
    Hrm, sounds familiar for some reason. Oh, wait... there's a phone call. I'll just go and pick it up.

    me: hello?
    caller: Hello, I'm Suzi Cheatem from Dewey, Cheatem, and Howe. I was wondering if you'd like to answer a few questions about your behaviour while using the Internet.
    I think hrm, this sounds like one of those annoying telemarketers
    me: Sorry, I'm not interested in speaking to telemarketers
    caller: It seems like you have identified me from a previously identified acoustic model. I'm afraid I'm going to have to tell Microsoft that you have stolen their idea. You can expect a bill from them within two weeks.
    <hangs up>

    Gosh, those telemarketers get stranger every time they call me.
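For anyone who wants the abstract quoted above in concrete terms, here is a rough sketch of its flow: score the voice input against a generic model and against each previously identified caller, pick a known caller only if that model beats the generic one, and otherwise build a new model. Everything specific here is an assumption made for illustration: the feature frames are made-up number lists, the "acoustic model" is a single diagonal Gaussian, and names like `DiagonalGaussian` and `identify_caller` are hypothetical. The patent text quoted in this thread does not say which model type or decision rule is actually used.

```python
import math


class DiagonalGaussian:
    """Toy stand-in for an acoustic model: one diagonal-covariance Gaussian."""

    def __init__(self, frames):
        n = len(frames)
        dims = len(frames[0])
        self.mean = [sum(f[d] for f in frames) / n for d in range(dims)]
        # Variance floor keeps the log-likelihood finite for tiny samples.
        self.var = [
            max(sum((f[d] - self.mean[d]) ** 2 for f in frames) / n, 1e-3)
            for d in range(dims)
        ]

    def score(self, frames):
        """Average per-frame log-likelihood of `frames` under this model."""
        total = 0.0
        for f in frames:
            for x, m, v in zip(f, self.mean, self.var):
                total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        return total / len(frames)


def identify_caller(frames, generic_model, caller_models):
    """Return (caller_id, is_new); enrolls a new model when nobody matches.

    Assumed decision rule: a known caller wins only if their model scores
    higher than the generic model. `caller_models` is updated in place.
    """
    scores = {name: model.score(frames) for name, model in caller_models.items()}
    generic_score = generic_model.score(frames)
    best = max(scores, key=scores.get) if scores else None
    if best is not None and scores[best] > generic_score:
        return best, False
    new_id = f"caller-{len(caller_models) + 1}"
    caller_models[new_id] = DiagonalGaussian(frames)  # caller-specific model
    return new_id, True


if __name__ == "__main__":
    generic = DiagonalGaussian([[0, 0], [10, 10], [0, 10], [10, 0], [5, 5]])
    known = {}
    print(identify_caller([[1.0, 1.0], [1.2, 0.8], [0.8, 1.2]], generic, known))
    print(identify_caller([[1.1, 0.9], [0.9, 1.1]], generic, known))
```

Comparing against the generic model is what lets the sketch say "new caller" instead of always forcing a match to the closest known voice.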
  • this section
    Not only that, it's done 'without alerting the caller during the call that the caller is being identified,' boasts Microsoft in the patent claims.
    would probably run afoul of wiretapping laws...I know that if I had money, I would probably be willing to push a test case...
  • Ah, soon Drew Barrymore and friends will be pulled from their movies and put on the case. Cool!
  • ... if it works with heavy breathing?

    Honest, I was just wondering. No, really.

  • The only flaw found so far is that it can't identify Steve Ballmer because voice recognition software isn't able to make sense of the sound of chairs smashing into things.
  • Considering that a company must notify a consumer when they are being recorded, having tech like this out in the wild raises serious privacy concerns.

"The great question... which I have not been able to answer... is, `What does woman want?'" -- Sigmund Freud

Working...