Privacy

'Anonymized' Credit Card Data Not So Anonymous, MIT Study Shows

schwit1 writes Scientists showed they can identify you with more than 90 percent accuracy by looking at just four purchases, three if the price is included — and this is after companies "anonymized" the transaction records, saying they wiped away names and other personal details. The study out of MIT, published Thursday in the journal Science, examined three months of credit card records for 1.1 million people. "We are showing that the privacy we are told that we have isn't real," study co-author Alex "Sandy" Pentland of the Massachusetts Institute of Technology, said in an email.
This discussion has been archived. No new comments can be posted.

  • As one who got tired of high fees, I dropped the use of credit/debit cards. I used a gift card for an online purchase. Nothing anon about it. It has my name and address on the order.

    • Not sure what you're talking about. My credit card has no fees, I pay the balance monthly so no interest, and I get 1.25% of everything I spend back in cash. I try to use it whenever I can.
      • by Anonymous Coward

        Read between the lines: he got into credit card trouble, can't get any more credit, and has probably gone through or is going through bankruptcy, or has some sort of agreement with the CC companies to pay them back. So he has to use Visa gift cards to purchase items online.

        If you work in retail you learn to recognize this behavior. A customer always paying in cash by digging out of their envelope from the bank after they cash their pay check. It's because that's the only way they can budget their money and the

        • by StormUP ( 892787 )
          Does it count if my cash comes from my wallet instead of an envelope? Does the retail setting matter? Perhaps I dislike the high % cut that credit card companies take from retailers and philosophically prefer cash. Cash also works when the credit transaction system goes down, though that is rare.
      • by mjwx ( 966435 ) on Thursday January 29, 2015 @07:56PM (#48936367)

        Not sure what you're talking about. My credit card has no fees

        It has no fees you know about... And banks want to keep it that way. When you pay for something by credit card, the merchant pays 3% or more for accepting the card. This means they have to pass the cost onto you in the form of higher prices.

        You didn't think the bank gave you free money did you?

        It's Machiavellian in its brilliance: you're robbing yourself of 3% in order to give yourself 1%, and you're so enamoured with it, you're trying to do this as much as possible.

        • by Anonymous Coward

          That 3% covers processing and fraud prevention. If you pay with a check, a guarantee service costs the same. Services cost money. Besides gas stations, I don't know of any retail establishment that will give you a 3% discount for using cash. Why would they, they would rather pocket the difference.

          If I was running a retail establishment, I would encourage credit and debit transactions. It is much easier to deal with plastic than cash, less exposure to risk for the merchant. After working in retail managemen

          • by sjames ( 1099 )

            Actually, many businesses had a credit surcharge for a while. Then the credit cards added a no surcharge clause to the merchant contracts. So they hiked their prices and offered a cash discount. Then the credit cards added a no cash discount clause.

            Yes, services cost money. That's no excuse for hiding how much it costs and forcing it to be paid for by people not using the service (for example, everyone that pays cash).

            Many merchants prefer cash because cash can't be charged back after the fact. For example,

        • Even if I were to spend $10000 on a credit card in a year that 3% doesn't even represent a full day's pay. I think I'll live somehow.

        • When you pay for something by credit card, the merchant pays 3% or more for accepting the card. This means they have to pass the cost onto you in the form of higher prices.

          Yes. But if they're like most merchants in the world (with the exception of some gas stations and a random shop here and there), they pass that cost onto YOU too, even if you don't use a credit card.

          It's Machiavellian in its brilliance: you're robbing yourself of 3% in order to give yourself 1%, and you're so enamoured with it, you're trying to do this as much as possible.

          Umm, well again if it's like most merchants in the world, you and I pay the same price if I pay by credit card and you pay by cash.

          The difference is that they're "robbing" 2% from me, while they rob 3% from you.

          Thus, I win if I use the card in the current system.

          Convince more merchants to offer cash disc

        • you're robbing yourself of 3% in order to give yourself 1% and you're so enamoured with it, you're trying to do this as much as possible.

          As opposed to your plan of not using credit cards while paying the same price and getting nothing back. You are so SMRT [youtube.com].

        • And yet a merchant who doesn't accept cash is likely to have lower costs at the end of the day.

          You don't think balancing the till, counting the money, trips to the bank, storing and maintaining a float, and dealing with cash in general were "free" and didn't include a whole boat load of inefficiencies for trading did you?

          It's like when I expense things for work: I only ever do so on a company credit card because at the end of the day I don't need to keep records, I don't need to fill out paperwork, I don't actuall

    • Re: (Score:2, Informative)

      As one who got tired of high fees, I dropped the use of credit/debit cards.

      What? Debit cards don't have fees. Credit cards are usually available with no fee, or with benefits (such as airline miles) that more than compensate for the fee. There may be good reasons to not use credit/debit cards, but "high fees" is not one of them.

    • by mjwx ( 966435 )

      As one who got tired of high fees, I dropped the use of credit/debit cards. I used a gift card for an online purchase. Nothing anon about it. It has my name and address on the order.

      It's less about the order itself, more about credit card companies selling the data to advertisers and other dodgy organisations. They claim the data is anonymised (which means they remove names from the orders) but it's trivial to de-anonymise the data.

      This is one of the reasons I use cash for most purchases.

    • by ruir ( 2709173 )
      Well, forget about being anonymous in the European Community: this year saw the passing of a law that any Internet sale has to pass a receipt to the buyer. The real idea is to 1) register who the buyer is, 2) cross-reference data for sales evasion, and 3) cross-reference data for expenditures to catch people spending more than they earn.
  • When NSA collects 'metadata', it's disturbing but also difficult to see how they benefit from corrupt use of the data. But corporate 'big data' just has many ways to make money off of it. Where is the Snowden of Citibank?
  • by jklovanc ( 1603149 ) on Thursday January 29, 2015 @05:29PM (#48935297)

    Where is the link to the actual study?

    • by cOle2 ( 225350 )
      I believe this is the study [sciencemag.org] in question.
    • Re:Study (Score:4, Informative)

      by Anonymous Coward on Thursday January 29, 2015 @05:51PM (#48935469)

      http://www.sciencemag.org/content/347/6221/468.full?intcmp=collection-privacy

      The published article the clickbait was based on has much better information. For instance: the transactions for a person all still shared a unique ID#. "All that remained were the metadata: amounts spent, shop type—restaurant, gym, or grocery store, for example—and a code representing each person."

      If you don't cycle the code per person regularly of course correlation attacks will always work.
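
A minimal sketch of the correlation attack described in the comment above, in Python. It assumes each record is a hypothetical `(code, shop, day, amount)` tuple in which `code` is the stable per-person identifier; the field layout, the toy data, and the `candidates` helper are illustrative assumptions, not the study's method or data.

```python
from collections import defaultdict


def candidates(records, known_points):
    """Return the pseudonymous codes whose history contains every known (shop, day) point."""
    known_points = set(known_points)
    seen = defaultdict(set)
    for code, shop, day, _amount in records:
        seen[code].add((shop, day))
    # A code is a candidate only if all known points appear in its history.
    return sorted(code for code, points in seen.items() if known_points <= points)


# Toy "anonymized" data: names removed, but each person keeps one stable code.
records = [
    ("u1", "grocery", 1, 42.10),
    ("u1", "restaurant", 3, 61.00),
    ("u1", "gas", 5, 35.75),
    ("u2", "grocery", 1, 18.20),
    ("u2", "gym", 3, 25.00),
    ("u3", "restaurant", 3, 48.00),
    ("u3", "gas", 5, 30.00),
]

# Outside knowledge about one real person: where they shopped and on which day.
known = {("grocery", 1), ("restaurant", 3), ("gas", 5)}
matches = candidates(records, known)

# If exactly one code matches, every record under that code is now linked to the person.
full_history = [r for r in records if r[0] == matches[0]] if len(matches) == 1 else []
```

With only the grocery point known, two codes match; adding the other two points leaves one. If the code were rotated periodically, points from different periods would sit under different codes and the subset test would stop linking them.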

  • by turkeydance ( 1266624 ) on Thursday January 29, 2015 @05:30PM (#48935303)
    Staff Sergeant Obvious reporting for duty.
    • by Livius ( 318358 )

      This isn't actually privacy, and it's sad that people aren't clearer about what is and isn't privacy.

      Though still a bit troubling.

    • Spafford, who wasn't part of the study, said it makes "one wonder what our expectation of privacy should be anymore."

      Privacy can't be monetized and retailers can't profit from privacy so therefore we know how much privacy we have; it's the small fraction left after they collect everything useful. This will continue this way until we have laws that make data retention and privacy violation such a legal liability hot potato that businesses will be tripping over themselves to delete data and avoid unnecessary

    • Damn! You were demoted?

  • by Ichijo ( 607641 ) on Thursday January 29, 2015 @05:49PM (#48935445) Journal

    ...using a fingerprint database to show that cash isn't anonymous.

  • When i make purchases with my credit card, i'm not worried about someone knowing it was me, Shadowrat, who made the purchase. When did people claim that you could anonymously buy anything with a credit card? Obviously that's stored in lots of places. I buy something online, the vendor needs to know where to ship it, my credit card company knows who to bill, amazon knows because they are passing the info on.

    What i worry about is someone stealing my number. This Honestly, i don't even worry about that so m
    • What you are saying has no relationship to the article. The article is talking about the supposed "anonymized" data given to marketers about your purchases. It's about thinking you were hiding yourself from the business you were buying from.

  • From TA

    It's easier to identify women, but the research couldn't explain why, de Montjoye said.

    Could it be that men tend to shop a lot less than women!?

    • Could it be that men tend to shop a lot less than women!?

      Men are either more likely to buy what everyone else is buying, or more likely to buy based on logic and not emotion. Or both.

    • by kogut ( 1133781 )

      Could it be that men tend to shop a lot less than women!?

      Or maybe men are identified by the proxy billing info for porn sites.

    • Best guess?

      The number of women buying unique items (i.e.: that one purse that's so cute) is 4-5 points higher than the number of men doing the same thing, which would mean a given data point is 4-5% more effective if the shopper is female.

  • by Wycliffe ( 116160 ) on Thursday January 29, 2015 @06:10PM (#48935665) Homepage

    The article says it can identify someone in as few as 3 transactions.
    But they aren't really identifying them, they are just showing that no other person hit the same exact set of shops.
    Well, they also mention that they get a datestamp with the transaction so assuming that datestamp has minutes
    or seconds then it should only take 1 transaction or 2 at the most. That being said, you really haven't identified
    this person as you don't know who they are in the real world just that they have a unique shopping pattern as
    everyone does.

    • Re:Why even 3? (Score:4, Insightful)

      by Courageous ( 228506 ) on Thursday January 29, 2015 @06:38PM (#48935853)

      This article isn't scary. What should be scary is that cell companies sell anonymized _geolocation_ data. That data can be used to determine: A) who you are, B) where you live, C) where you work, and D) who your friends are. Step #1: look where the phone is, regularly, at midnight. Step #2: cross-reference with public records databases on property ownership. That gets 65% of Americans right there. Now check where it parks every day at noon. Place of work found. And so forth.

      • by suutar ( 1860506 )

        combine the two and now they know that the person who was at shop A at time X, shop B at time Y, and shop C at time Z also appears to live at address Q and work at address R, and there you go: anyone who can get the "anonymized" data knows where you live, and that you just bought not only new living room electronics but also airline tickets.

        • combine the two and now they know that the person who was at shop A at time X, shop B at time Y, and shop C at time Z also appears to live at address Q and work at address R, and there you go: anyone who can get the "anonymized" data knows where you live, and that you just bought not only new living room electronics but also airline tickets.

          and then......?
          They send a salesman to your house from shops A, B, and C trying to sell you something?
          How often do you buy a lot of living room electronics, then go on vacation?

      • by Imrik ( 148191 )

        Should probably use 3am rather than midnight, far fewer people away from home. 10am or 2pm would probably be better than noon, people tend to eat around then and frequently leave their workplace.

    • Re:Why even 3? (Score:5, Informative)

      by Not_Wiggins ( 686627 ) on Thursday January 29, 2015 @07:03PM (#48936013) Journal

      The article is misleading. It talks about how it can be used to "identify someone." And with all the talk about privacy, it implies the identification of an individual.

      But, reading through it closely, they aren't talking about identifying a specific someone; the information isn't enough to say Not_Wiggins made these purchases.
      Instead, it focuses on identifying characteristics of purchasers and then extending it to see what other behavior purchasers in those groups would make.

      In the article example, they talked about someone making a purchase at both a bakery and a restaurant within a short time period. Finding that they had one such instance, named him Scott, then looked to see what other behaviors "Scott" had. By extending that logic, they are saying "look at the group of people who typically shop at a bakery and a restaurant... then you know those people are typically also interested in shoes."

      The example is a bit silly, but that's what they're saying.

      They're talking about documenting patterns of behavior on purchasing decisions.
      This article really isn't about loss of anonymity. It is about using anonymized credit card transactions to develop definitions of "user groups" and predicting their shared behavior pattern.

      To me, it seems more like the equivalent of last.fm... tell us what music you like, we'll compare it against what others who also have the same "likes" have said, and give you options for things that might fit your tastes.

      In this instance, it is: tell us what purchases you've made, we'll compare it against similar purchases that others have made, and we can predict what other purchases you might want/like that you haven't made yet.

      • by Anonymous Coward

        I think you missed the point. The point is that there's enough information in the time and place of our transactions that knowing only 3 places/times a person went shopping is sufficient to identify them. This means that if I saw you at the grocery store on Monday night, had dinner with you on Wednesday, and saw your facebook posting about getting gas on your road trip, and I have access to the "anonymized" data, I can reconstruct your complete credit card statement, and so discover your subscription to cat

        • Wow. Someone mod this up. I didn't get this from the summary or the article but this is dead on
          and kind of scary. It would be fairly simple to find 3 transactions for a real life person and then be
          able to cross reference it, especially if you could do it over several months. You could possibly
          even trick someone into making a couple of purchases and then just wait for the data.

        • by Imrik ( 148191 )

          To do that, however, you need both the information on the transactions and the information on the person, which is twice as much as advertised.

        • Comment removed based on user account deletion
        • by banda ( 206438 )

          In my experience the "timestamp" portion of the transaction date coming from merchant banks is pretty useless - often truncated to midnight, frequently transposed into the timezone of a corporate office... it's kind of garbage, and not nearly as useful as you think.

    • by mjwx ( 966435 )

      The article says it can identify someone in as few as 3 transactions.
      But they aren't really identifying them, they are just showing that no other person hit the same exact set of shops.
      Well, they also mention that they get a datestamp with the transaction so assuming that datestamp has minutes
      or seconds then it should only take 1 transaction or 2 at the most. That being said, you really haven't identified
      this person as you don't know who they are in the real world just that they have a unique shopping pattern as
      everyone does.

      Actually it's a lot more in depth than that.

      Also consider the class of store you visit. You hit up a hardware store, then an auto supply store and a Micky D's on the way home. They have a reasonable idea what you ordered at McD's from the price and a good line on where you live from the trail of stores you visited.

      Of course the "as little as 2 or 3 stores" is a bit of a misnomer, same as when a telco advertises "up to 4 Mbps": only a few can be that easily identified but realistically they don't need to

    • Re:Why even 3? (Score:4, Informative)

      by NicBenjamin ( 2124018 ) on Thursday January 29, 2015 @07:57PM (#48936371)

      And this only works if you have a lot of other data in your data set. If you don't know who Scott is, then you can't figure out he's the only person who could go to the bakery on that one exact day and that particular restaurant the next.

      I don't think anyone is particularly sanguine about the future of privacy if big companies manage to figure out a way to profit from combining their multiple massive databases. This is particularly true in the US, where it would be virtually impossible to stop the police from using said databases without warrants. Or worse, using info that the big companies forwarded them as the basis for warrants.

      If Apple or Google can silence one of its critics by figuring out he was paying a hooker with his supposedly anonymous Mastercard gift card, that is a really fucking bad thing.
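
The home and workplace heuristic from the geolocation replies earlier in this thread (check where the phone sits at a night hour, then at a working hour) can be sketched roughly as below. This is only an illustration: the `(hour, cell)` ping format, the cell names, and the `most_common_place` helper are assumptions made for the example, and real location data would need far more cleaning.

```python
from collections import Counter


def most_common_place(pings, hour):
    """Return the location cell seen most often at the given hour, or None if no pings match."""
    counts = Counter(cell for h, cell in pings if h == hour)
    return counts.most_common(1)[0][0] if counts else None


# Toy pings for one pseudonymous phone: (hour of day, coarse location cell).
pings = [
    (3, "cell_A"), (3, "cell_A"), (3, "cell_B"),
    (10, "cell_C"), (10, "cell_C"),
    (12, "cell_D"),  # lunch away from the workplace
    (14, "cell_C"),
]

# 3am and 10am follow the suggestion in the reply; midnight and noon were the original proposal.
home_guess = most_common_place(pings, hour=3)
work_guess = most_common_place(pings, hour=10)
```

Joining `home_guess` against a property-ownership database, as the comment proposes, is the step that would turn a pseudonym into a name; that join is left out here.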

  • by eepok ( 545733 ) on Thursday January 29, 2015 @07:53PM (#48936341) Homepage

    I don't know about you, but I think it's pretty fair to say that a record without any information directly identifying the subject is "anonymous".

    The ability to complete an analysis of multiple records and data sources, and thereby make a reasonable guess (90% accuracy) at who the subject might be, is insufficient to remove the title of anonymous.

  • A mathematician could easily prove 2 = 3, for large values of 2.

    For loose definitions of "identify" they could find sets of credit card transactions that would meet the given "pieces" of information. If Detective Paul Drake is looking for someone who went to a particular restaurant one night and then bought cake from some bakery next day, and Della Street knows the same person paid for toll the same evening, the super duper algorithm will tell Perry Mason all the sets of transactions that would match the give

  • Can't these articles link to the journal's entry for the paper? This is of professional interest to me and I'd like to read the abstract at least, maybe even purchase the damn thing.
  • The fact that someone calls him/her "Sandy" isn't useful information to me since we're not going to hang out and shoot the shit. Trim useless information from the summary.
  • They did NOT show that, from 3-4 transactions, they could provide your name, address and phone number, or even that if you have 3-4 transactions in a million transaction anonymized data set they can find out anything about you personally *unless they know you first*.

    What they did is show that if they know that you, personally, had 3 to 4 types of transactions on specific dates (you went to a grocery store and a gas station today, and a restaurant yesterday), they could identify which anonymized data set you belong to. Their discovery requires specific outside knowledge not contained in the data.

    This only matters if, say, a third party could identify specific purchases and dates - they could then comb the records and find the rest of your transactions on that specific card. IOW, someone has to be looking for you, and know at least something about you, to even start the search.
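
One way to put a number on the claim above (a few known transactions single out one record set) is to sample k points from a person's own history and count how often they match nobody else. The sketch below does that on toy data; the `(code, shop, day)` layout and the `unicity` helper are assumptions for illustration, and this is not a reproduction of the study's procedure or results.

```python
import random
from collections import defaultdict


def unicity(records, k, trials=200, seed=0):
    """Estimate the fraction of random k-point samples that match exactly one code."""
    rng = random.Random(seed)
    traces = defaultdict(set)
    for code, shop, day in records:
        traces[code].add((shop, day))
    # Only people with at least k distinct points can be sampled.
    eligible = sorted(c for c, pts in traces.items() if len(pts) >= k)
    if not eligible:
        return 0.0
    unique = 0
    for _ in range(trials):
        code = rng.choice(eligible)
        sample = set(rng.sample(sorted(traces[code]), k))
        # Count every code whose history contains all sampled points.
        matches = [c for c, pts in traces.items() if sample <= pts]
        unique += len(matches) == 1
    return unique / trials


records = [
    ("u1", "grocery", 1), ("u1", "gas", 2), ("u1", "cafe", 3),
    ("u2", "grocery", 1), ("u2", "gas", 2), ("u2", "gym", 3),
    ("u3", "grocery", 1), ("u3", "bakery", 2), ("u3", "cafe", 3),
]

one_point = unicity(records, k=1)
three_points = unicity(records, k=3)
```

On this toy set, three points always single out one code, while a single point usually does not. Note that the sample comes from the person's own history, which is exactly the outside knowledge the comment says an attacker must already have.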

  • Comment removed based on user account deletion
    • by moeinvt ( 851793 )

      I *think* what they're saying is that if they know it was you who bought the gas, took out the cash and ate at the restaurant, they can figure out that it was also you who went to the supermarket, and identify all of your other purchases for which they have records.

  • by gstoddart ( 321705 ) on Friday January 30, 2015 @08:18AM (#48938925) Homepage

    "We are showing that the privacy we are told that we have isn't real"

    Of course it's not bloody real.

    For us to believe this data has been 'anonymized', we have to assume that a) the company is qualified to do what is required to anonymize the data, b) that they actually give a shit, and c) that they bear any penalty if they do a terrible job.

    Entrusting these companies with this data in the first place is the problem. Allowing them to share it all over the place for profit and with no restriction is a terrible idea.

    This is precisely why sane countries have data protection and privacy laws -- because corporations are greedy, self serving entities, who won't give a crap if the collateral damage of their stuff is to damage the privacy of everybody they deal with.

    And this is precisely why all of those analytics companies in web pages are just parasites and not to be trusted.

  • From what I can tell, they first need to know the identity of the individual who made those 3 particular purchases. From that, they can link the individual to the entire set of his/her purchases in the "anonymized" CC data.
    I'm very concerned about privacy issues, but this doesn't really surprise or disturb me. It would be quite a coincidence for another person to engage in transactions at the same three places I did and at approximately the same times.
