Government Social Networks The Internet

When Metadata Analytics Goes Awry

Posted by samzenpus
from the six-degrees-of-separation dept.
jfruh writes "When blogger Dan Tynan started seeing lots of Latvians in his LinkedIn People You May Know list, it was pretty funny, considering he'd never been to Latvia or ever met anyone from there. But now that shadowy spy agencies are using algorithms similar to LinkedIn's to see if we're terrorists, mistakes like this are a lot scarier. From the article: 'More than ever -- and online in particular -- who you know can be more important than who you are. In fact, who somebody thinks you know may be more important than who you are, especially if that somebody is a faceless government bureaucracy with limitless power to izjaukt savu dzvi (mess up your life).'"
This discussion has been archived. No new comments can be posted.

  • by 140Mandak262Jamuna (970587) on Friday July 19, 2013 @07:14AM (#44326209) Journal
    I created a new Gmail ID to get price quotes from auto dealers. And now Google keeps telling me I might know someone named Steve Lexus and wants me to add him to my circles. Well, at least they seem to have filtered out Jane Honda and Palvayantheeswaran Toyota and Poponopoulous Mitsubishi.
    • Re: (Score:3, Funny)

      by Anonymous Coward

      I don't have a Facebook account, but a few months ago someone mistakenly sent me an invitation to join. I didn't think much of it and deleted that mail, since I didn't know the person and couldn't even read it since it was in Arabic. Ever since then Facebook has been spamming me once or twice a week with useless stuff about some Arab dudes I never heard of. It's really annoying, I have no fucking idea who these people are, I've never set foot anywhere even remotely close to that part of the world, it's absolutely im

      • Hmm, I can hear Agent Smith musing: "I see that you have a large number of 2 hop associates in the middle east. That doesn't look good, Mr Coward."
      • by AmiMoJo (196126) *

        Their incompetence earned them a free procmail rule.

        And put you in the terrorist watch list. Now bend over and prepare for your cavity search.

      • by fuckface (32611)

        Their incompetence earned them a free procmail rule.

        Yeah, man! Fight the power, hit 'em where it hurts. Accept their email and use your electricity and CPU cycles to process it and pipe it to /dev/null. That'll show 'em!

      • Maybe Facebook knows you are not Arab. But it gets paid to hose you with Arabic ads. It probably knows your pain tolerance well too. It knows how much it can pelt you with ads and make money before you decide to give up in disgust and go away from Facebook. Maybe by staying with Facebook even after being pelted with stupid ads for weeks, you told Facebook's algorithms, "ok, this guy is good for at least two weeks of ad blasts. Maybe more. Next time let us try three weeks". May be Facebook is not the chump her
        • by bonehead (6382)

          Someone apparently didn't read the very first line of his post where he clearly stated that he does NOT have a facebook account.

    • Re: (Score:3, Funny)

      Your loss. Jane Honda is really cute.
    • by TarPitt (217247)

      I set up a Facebook account for my dog. Female, 5 years old so I figured that was 35 years old in human years. Neutered, so obviously single and never married.

      Amazing the number of invites she gets from lesbian singles.

      • by Anonymous Coward

        I set up a Facebook account for my dog. Female, 5 years old so I figured that was 35 years old in human years. Neutered, so obviously single and never married.

        Amazing the number of invites she gets from lesbian singles.

        You did mention she's a bitch, right?

    • I have heard it said, perhaps apocryphally - If you look at the birth and death records for the State of Florida, you will conclude that a majority of people in that state are born Latino and die Jewish. Having reams of data is a start; but you must also have an accurate model.
  • by wbr1 (2538558) on Friday July 19, 2013 @07:15AM (#44326211)
    Facebook and google+ may recommend possible people you may know, based off of degrees of separation, contact lists, etc. Most of the time I do not know any of the people they suggest.

    If the NSA just reverses a similar algorithm, what happens when it says that Mahmoud Ahmadinejad may know me? Especially if I have access to centrifuges.

    Then I have to prove a negative, that I do not know this person. All their evidence points to the opposite. "He was in New York at the same time!" (BUT I LIVE THERE) "Doesn't matter". "Your father's cousin's uncle's former roommate went to Iran as an exchange student", etc, etc.

    • by Anonymous Coward

      By the time you have to prove yourself, it's already too late - the FBI has already planted a GPS tracker on your car, without a warrant. The proof that they're keeping America safe will come when they bust you buying a bag of weed from a totally unrelated person in a sting operation.

      -- Ethanol-fueled

      • by Anonymous Coward

        So what?

        He's probably guilty of something, and busting down the doors of a few extra people isn't a big deal.

        God Bless Ronald Reagan!

    • by Chrisq (894406) on Friday July 19, 2013 @08:22AM (#44326639)

      Then I have to prove a negative, that I do not know this person. All their evidence points to the opposite. "He was in New York at the same time!" (BUT I LIVE THERE) "Doesn't matter". "Your father's cousin's uncle's former roommate went to Iran as an exchange student", etc, etc.

      That's an excellent point - it's the classic example of the Prosecutor's fallacy [wikipedia.org]. By definition, anyone found through a meta-data search will appear to have strong evidence against them. If someone was caught independently plotting terrorist activities, then it would be valid to say that it would be very unlikely for them to have a lot of connections with known terrorists by chance. Trawl through databases and find someone who has a lot of known connections, and it doesn't say a lot. It's like a lottery: if you have evidence that someone tampered with the lottery and then won, you could point out that the chances of winning are one in 14 million (or whatever); but if you simply look for people who have won, it is not valid to say that they must have cheated because the odds against winning are so low - someone will win through chance!
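A rough sketch of the base-rate arithmetic behind the fallacy described above, in Python. Every number here (population size, number of real plotters, hit rate, false positive rate) is an assumption picked purely for illustration; none of them comes from the article or from any agency.

```python
def posterior(prior, p_flag_if_guilty, p_flag_if_innocent):
    """Bayes' rule: probability of guilt given that the screen flagged someone."""
    numerator = prior * p_flag_if_guilty
    denominator = numerator + (1 - prior) * p_flag_if_innocent
    return numerator / denominator

# All values below are hypothetical, for illustration only.
population = 310_000_000       # people whose metadata is trawled
actual_plotters = 1_000        # real plotters hidden in that population
hit_rate = 0.99                # screen flags 99% of real plotters
false_positive_rate = 0.001    # screen flags 0.1% of everyone else

prior = actual_plotters / population
p_guilty_given_flag = posterior(prior, hit_rate, false_positive_rate)
innocent_flagged = (population - actual_plotters) * false_positive_rate

print(f"P(plotter | flagged) = {p_guilty_given_flag:.4f}")
print(f"Innocent people flagged: {innocent_flagged:,.0f}")
```

Under these made-up inputs, a screen that looks very accurate on paper still flags far more innocent people than plotters, which is the point about trawling versus independent evidence.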

    • It's also amusing the ads and pages FB recommends to me. They seem to think I'm a black latina (I'm white, non-Hispanic).

  • by cold fjord (826450) on Friday July 19, 2013 @07:17AM (#44326221)

    Time to worry about the real problems [clutchfans.net] affecting people's lives.

    • I suppose I should point out that this is just another instance of metadata analysis gone awry.

    • by gronofer (838299)
      What's wrong with being associated with Latvians anyway? I don't believe they are any more evil than the average European.
      • No genuine disparagement of Latvians intended, just a reference to the article, and a joke. The Latvians are a fine people, too long oppressed by the Soviet Union, but who are now building a modern free state. I wish both them and you well.

        Having written that, you may want to view the link as well.

  • by Aguazul2 (2591049) on Friday July 19, 2013 @07:23AM (#44326245)

    When FB or Amazon recommends something/someone, I can usually see some sense behind it. LinkedIn is just plain random. I don't know 95% of the people it seems to want to connect me with. It is a joke.

    • That's odd - on mine it's mostly second and third level connections from my area, but, from TFA:

      nor do I have any recollection of saying yes, but that's just the way it goes in today's wacky world of social networking.

      No it's not. That's the way lazy people do social networking. I certainly don't approve connections from people I haven't met - LinkedIn has built-in support for finding people through your network - trying to artificially pad it out isn't going to be of any real help.

    • Please remember it is you that is the product that Linkedin and other services are advertising and selling so play the game and get linking, liking and friending.
    • by Kjella (173770)

      You must be providing Facebook with more data than I am, I just keep an account because some insist on using it as their RSVP system. The people it recommends are typically totally random people who have one friend in common with me.

      • by Aguazul2 (2591049)

        You must be providing Facebook with more data than I am

        Maybe that's it -- I have never given LinkedIn much data, so the results are crap. But still turning up random unknown people constantly just makes an even worse impression, and makes it even less likely I'd put information into it.

    • For me, LinkedIn mostly recommends people who are two or three degrees of separation away... but two of my favorite website operators are consistently in my top hits. I have no idea how or why; I haven't connected to them because 1) I don't know them that personally and 2) it's just plain eerie.
    • by gl4ss (559668)

      linked in's success metric for their internal analysis is number of connections made.

    • by godel_56 (1287256)

      When FB or Amazon recommends something/someone, I can usually see some sense behind it. LinkedIn is just plain random. I don't know 95% of the people it seems to want to connect me with. It is a joke.

      There's also this piece from the same author of TFA where he suspects Linkedin is mining your Gmail contacts.

      http://www.itworld.com/it-managementstrategy/254094/wtf-linkedin-doing-my-data

  • by macbeth66 (204889) on Friday July 19, 2013 @07:28AM (#44326269)

    Stop using social media. Some of the crud I've seen on LinkedIn is as bad as Facebook and I do not want to be associated with it.

    Side Note: If you do use LinkedIn, it is not a dating site. Some of my female colleagues have started complaining about unwanted attention. Just because she met you at that training class last month, and accepted your connection, does not mean she is interested in 'knowing' you. Sheesh.

  • Noting some key facts about the US terrorist watch list:

    Terror watch list grows to 875,000 [washingtontimes.com]

    As of December 2012, a factsheet from the center states, TIDE contained over 875,000 entries. Each one represents a known or suspected terrorist and includes all their known aliases and spelling variations on their name, the official said.

    Less than one percent, or fewer than 9,000, were Americans, including both citizens and legal permanent residents, he said, adding the center does not release exact numbers.

    So if there are only 9,000 known or suspected terrorists in the US out of 310,000,000+ Americans, how much impact is that likely to have? I wouldn't necessarily expect terrorists to be highly connected to people outside of their purpose.

      • The real question isn't who is connected to terrorists, but rather, "Who are the terrorists and their support network?"

        The intelligence agencies are not going to be much interested in accumulating every possible association, but rather in narrowing it to the people of interest for the purpose at hand. If I knew the pilot that flies Prime Minister Cameron and his guests, I could be connected to many of his guests with 2 hops. Let's say one of those guests was Angela Merkel. Anyone that cared to look would

        • by AHuxley (892839)
          technically exists or be possible, but as a practical matter it is pointless?
          Most groups in the 1960-90's had perfect papers - cleared by an embassy or state backed.
          Just like the freedom fighters for Libya, Syria, Chechnya, ~Yugoslavia, Iran.
          Funny how they and their weapons move with such a total lack of understanding of their support networks.
        • by sjames (1099)

          Except they just acknowledged that they DO look at every association out to 3 hops.

    • by Qzukk (229616)

      So if there are only 9,000 known or suspected terrorists

      So?

      Terror watch list grows to 875,000

      Guess what, THAT'S THE LIST THEY USE. If your name is on the 875,000 list, you don't get to fly, even if you're a Senator.

      • Only 9,000 or so of the 875,000 are US citizens or residents.

        • by Qzukk (229616)

          My bad, I misunderstood the 9000 part, but the issue remains: that's not 9000 people, that's 9000 names, or else we wouldn't have so much trouble with the no fly list.

    • by AHuxley (892839)
      Cold, different groups within the US gov look at different people. But making their lists over many years might not be very hard.
      http://en.wikipedia.org/wiki/Main_Core [wikipedia.org]
      When seeking funding they like to quote big numbers, when caught by the press, they like to quote any smaller database.
      Recall the http://en.wikipedia.org/wiki/Information_Awareness_Office [wikipedia.org] and withdrawal of funding?
      Transferred to other government agencies...
    • by gmuslera (3436)

      I remember an episode of ST:DS9 (paradise lost, i think) when just 2 "terrorists" that couldn't be found and could be anyone almost turned a government respectful of their citizens into a deep police state, even with the good will of the ones doing it. And in this case we can't assume the good will of everyone in the "law" side.

      The problem is that you don't need to have "real" connections to expand the circle of watching a lot. Assumed, misidentified (you know, all those biometric tests that have 90% accur

    • by sjames (1099)

      You call a plumber, terrorist calls the same plumber. Boom, you are connected to a terrorist at only two (out of three the NSA examines) hops. Better hope you don't have the same mechanic too.
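The plumber example can be sketched as a breadth-first search over a call graph. This is only an illustration in Python: the names and call records are invented, and the three-hop cutoff is taken from the figure quoted in this thread, not from any knowledge of how an actual system works.

```python
from collections import deque

def hops_from(graph, start, max_hops):
    """Breadth-first search: map everyone within max_hops of start to their hop count."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if dist[person] == max_hops:
            continue  # do not expand beyond the hop limit
        for contact in graph.get(person, ()):
            if contact not in dist:
                dist[contact] = dist[person] + 1
                queue.append(contact)
    return dist

# Hypothetical call records; each pair means "these two spoke on the phone".
calls = [
    ("suspect", "plumber"),
    ("plumber", "you"),
    ("you", "mechanic"),
    ("mechanic", "neighbour"),
]

# Build an undirected adjacency map from the call pairs.
graph = {}
for a, b in calls:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

reach = hops_from(graph, "suspect", max_hops=3)
for person, hops in sorted(reach.items(), key=lambda item: item[1]):
    print(f"{person}: {hops} hop(s)")
```

With these invented records, "you" land at two hops and your mechanic at three, purely because of who the plumber also happens to work for.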

  • by Anonymous Coward

    Seriously, if you have provided incomplete and inaccurate information to LinkedIn, how is it their fault that they suggest you only bullshit? Oh, and the NSA will blame that on you too if they shoot you "just in case". Oh oops, we were 99.99% sure that he was a terrorist. What an idiot, why didn't he fill out the facebook profile completely. We did nothing wrong. :-(

  • by Coeurderoy (717228) on Friday July 19, 2013 @07:42AM (#44326381)

    Originally, when the concept of "degrees of separation" was invented, the idea was that everybody was connected to everybody through 6 degrees of separation.
    At the same time, people think that they are "the good guy" who does not keep "bad company".

    With social media the length of the separation chain has considerably shrunk.

    Add to this that most people who "do something interesting" (like making really nice flower arrangement for instance) will tend to travel and meet a "much smaller" crowd of people who "move around".

    In this "smaller world" you can make "very short chains" to quite shady people. Actually it is trivial to create a chain from any US politician to the "big list of officially evil guys" that is at most 4 levels deep. (For instance, ex-HP head Carly Fiorina went to KSA and met large HP clients, including the heads of SBL, managed by the brother of that really bad guy who did get some support from the ex-President(s) Bush when he was against the Soviets...)
    And now comes the "suspicion creep": if you know Fiorina and one or two of the Bushes, then you know 2 suspicious characters that are 3 or fewer levels away from Really Suspicious guy.
    So "one" could be ok, but 2, hmm, very bad...

    So unless you take great pains to avoid anybody that might "be out of the ordinary", you are immediately 100% sure to become somehow "in contact" with somebody "suspicious".

    Or seen another way, being not completely boring gets you something like 200 contacts, among which you can expect at least 3 "super connectors" who do not really overlap, particularly if you are travelling, so even taking into account diminishing returns it is hard to end up with fewer than 3M "level 3 contacts",
    or 1/1000 of all adults in the world.

    The probability that fewer than 2 of them are "bad guys" is quite low.

    so be boring or be afraid, very afraid...
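A back-of-the-envelope check of that estimate, sketched in Python with a Poisson approximation. The 3M level-3 contacts and the "1/1000 of all adults" ratio come from the comment above; the sizes of the "bad guy" list are hypothetical (the largest reuses the 875,000 TIDE entry count quoted elsewhere in this thread, which counts entries, not necessarily distinct people), and it assumes flagged people are spread evenly through the population, which is certainly not true in practice.

```python
import math

def p_fewer_than_two(n_contacts, fraction_flagged):
    """Poisson approximation to P(fewer than 2 flagged people among n_contacts)."""
    lam = n_contacts * fraction_flagged   # expected number of flagged contacts
    return math.exp(-lam) * (1 + lam)     # P(0 flagged) + P(1 flagged)

level3_contacts = 3_000_000      # the comment's estimate
adults = 3_000_000_000           # implied by "1/1000 of all adults in the world"

# Hypothetical sizes for the list of "bad guys".
for flagged in (1_000, 10_000, 875_000):
    p = p_fewer_than_two(level3_contacts, flagged / adults)
    print(f"{flagged:>7,} flagged worldwide -> P(fewer than 2 in your level 3) = {p:.3g}")
```

The result is very sensitive to how long the list is: with a short list the probability of having fewer than two flagged contacts is still substantial, and it collapses towards zero as the list grows.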

    • by Anonymous Coward on Friday July 19, 2013 @08:31AM (#44326725)

      My work requires me to keep up to date with the computer industry. This means I must be connected with the hacker sites and ipso facto it also demands I am 1 degree separated from Mr. Snowden and many others who the US Government takes a dim view of. Get real, people: mere contact isn't criminality; in the case of the investigator, it is a necessity. This is why the whole concept of Probable Cause is such a necessity!

    • Re: (Score:3, Insightful)

      by OrugTor (1114089)
      "You" might "want" to "back off" on the "quotes".
    • by Anonymous Coward

      This would be easy to abuse:

      1. Become someone's friend. Politicians certainly want to have lots of friends. NSA agents are probably a lonely bunch...
      2. Make contact with known criminals, mafiosi and terrorists. (Not that hard, terrorists like those who sympathize with their cause...)

      Your marks are now connected to all sorts of low life through only one link - you! A fake profile might be handy if you don't want yourself associated with terrorists...

    • by dpilot (134227)

      So if I were a terrorist network, I would take a few new recruits, preferably of the "innocent-seeming" sort and dedicate them to fouling the system. I would have them keep their innocence, send them to no training camps, rather send them out onto social media and have them be as social and friendly as possible. I'd have them make as many connections as possible and get well entrenched in the system.

      Next I'd make sure they make online associations with known terrorists - enough online associations so it c

      • Except you would not need to do anything, nobody lives in a vacuum, so just the "natural" connectors are sufficient to "poison" the network.

  • by Stolpskott (2422670) on Friday July 19, 2013 @07:43AM (#44326383)

    As I used to work for an American company who have an office in Dubai (full of people with Arabic names, and lots of Muslims), a working team in India (very close to Pakistan, never mind the fact that the two countries hate each other almost as much as Chicago Bears and Green Bay Packers fans), and a development/support team in the Philippines (close to China, with a similar relationship to India and Pakistan, and with their own domestic terrorism issues), clients in sub-Saharan Africa, Russia and Texas, my LinkedIn and Facebook profiles are full of people in those areas.
    Given that the NSA does not stop at analyzing your own contacts [slashdot.org], I am apparently a person of interest if one of my contacts has any dubious friends, or if one of my contacts' contacts has any dubious friends.
    Kevin Bacon is indeed going to be screwed, we might as well just lock him up and start waterboarding him now, and save the NSA the trouble.

    • by sjames (1099)

      clients in sub-Saharan Africa, Russia and Texas,

      Wow, you're screwed! :-)

  • real question (Score:3, Insightful)

    by beefoot (2250164) on Friday July 19, 2013 @07:43AM (#44326391)
    If you are truly concerned about this problem, the real question to ask is why on earth you signed up with LinkedIn (or G+ or Facebook).
  • Well.... Duh! (Score:4, Interesting)

    by plopez (54068) on Friday July 19, 2013 @08:18AM (#44326595) Journal

    This is what I, and a host of others, have been screaming about for years. People are blindly using analytics and "big data" to make important decisions about health care, insurance, credit ratings, terrorist affiliations, etc. I have encountered so much bad data in my career that the thought that it is taken as "gospel" makes me sick. Bad data are out there and cleaning up a polluted data stream, when possible, is expensive and takes a long time.

    Then you add in the use of NoSQL database engines such as MongoDB which are not ACID compliant. You are virtually guaranteeing data will be corrupted. But then again, maybe I "just don't get it". But personally I think contributing to bad data is unethical.

    • But even if the data is not corrupted or polluted, there's an even more basic problem: people misunderstand statistics. Even statisticians will frequently misunderstand statistics. Good statistical analysis often runs contrary to our natural sense of understanding, and statistics, by its nature, does not provide certainties.

      • by plopez (54068)

        And the first thing they told me in Stats 101 was that you should not infer from the population or sample to the individual. This stuff is bad on so many levels it is mind boggling.

  • Purely speculative and all conjecture. I know nothing of the algorithms involved and make the following assumptions about the meta-data and the algorithms.

    Meta-data
    1: Geo-location of the event/person
    2: Time of the event/person
    Algorithm
    3: Compare location +1 correlation.
    4: Compare time +1 correlation.
    5: Location compares street.
    6: Time within 1 day.

    Given these very simplistic assumptions. We have two people. We'll call them Good Steve and Evil Steve. They have never met, never seen each other. One lives at S
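The scoring rules listed in that comment can be sketched in Python as follows. The `Sighting` record, the street name, and the timestamps are invented for illustration; the comment is cut off before it gives its own example, and nothing here reflects a real system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Sighting:
    """One metadata record: who, where (street level), and when."""
    person: str
    street: str
    when: datetime

def correlation(a, b):
    """Score two sightings with the comment's rules: +1 same street, +1 within one day."""
    score = 0
    if a.street == b.street:
        score += 1
    if abs(a.when - b.when) <= timedelta(days=1):
        score += 1
    return score

# Hypothetical records: the two Steves never meet, they just use the same street.
good = Sighting("Good Steve", "Elm Road", datetime(2013, 7, 18, 9, 0))
evil = Sighting("Evil Steve", "Elm Road", datetime(2013, 7, 18, 21, 30))

print(correlation(good, evil))
```

Two strangers who walk down the same street on the same day get the maximum score under these rules, which is the kind of false link the comment is worried about.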

  • by Anonymous Coward

    They are searching for patriots who may decide that they prefer some form of actual democracy to oligarchy.

  • It's not as if it's binary. It's not a know vs don't-know that makes for a link in this maze. There are weights attached to all links in graphs and what determines these weights is what makes the algorithm -- not the mere presence or absence of a link. As an extreme case, what if you considered yourself linked to everyone, just with links of weight 0? This whole "we are all connected" crap is not very meaningful without the subtle answer to the question "how much?"
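A small Python sketch of the "how much?" question. Multiplying link weights along a path is just one possible way to score a chain and is an assumption made here for illustration, as are all the names and weights; the comment does not say how real algorithms combine weights.

```python
def path_strength(weights, path):
    """Multiply link weights along a path; a missing link counts as weight 0."""
    strength = 1.0
    for a, b in zip(path, path[1:]):
        strength *= weights.get(frozenset((a, b)), 0.0)
    return strength

# Hypothetical weights in [0, 1]: how meaningful each relationship is.
weights = {
    frozenset(("you", "spouse")): 0.9,
    frozenset(("you", "plumber")): 0.05,
    frozenset(("plumber", "suspect")): 0.05,
}

weak_chain = path_strength(weights, ["you", "plumber", "suspect"])
no_chain = path_strength(weights, ["you", "spouse", "suspect"])
print(weak_chain, no_chain)
```

A two-hop chain through weak links ends up with a tiny score, and a chain with a missing link scores zero, so counting hops alone throws away most of the information.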
  • The "mess up your life" translates as "sabojat tavu dzivi". Right now it says "mess up one's own life". And no, slashdot is still not friendly to non-Latin alphabets.

  • ... it's the metadata. It's too easy to spam some systems with false links. Some Latvian CS graduate wants to pad his/her resume. So they write a script to find other people on some board with similar credentials and fire off a request to connect. If even a small fraction of the recipients blindly accept, suddenly they have an impressive list of professional contacts for some HR department to see.

    TLAs that do their own analysis have some experience at separating random links from meaningful ones. Many terr

  • I find the situation where a government treats the whole world as suspect quite objectionable.

  • ....it's worse than it has been portrayed:

    http://www.privacysos.org/node/1122 [privacysos.org]

