
Ask Slashdot: Best Practices For Collecting and Storing User Information?

Posted by Unknown Lamer on Tuesday September 11, 2012 @12:05AM from the design-by-committee dept.

New submitter isaaccs writes "I'm a mobile developer at a startup. My experience is in building user-facing applications, but in this case, a component of an app I'm building involves observing and collecting certain pieces of user information and then storing them in a web service. This is for purposes of analysis and ultimately functionality, not persistence. This would include some obvious items like names and e-mail addresses, and some less obvious items involving user behavior. We aim to be completely transparent and honest about what it is we're collecting by way of our privacy disclosure. I'm an experienced developer, and I'm aware of a handful of considerations (e.g., the need to hash personal identifiers stored remotely), but I've seen quite a few startups caught with their pants down on security/privacy of what they've collected — and I'd like to avoid it to the degree reasonably possible given we can't afford to hire an expert on the topic. I'm seeking input from the community on best-practices for data collection and the remote storage of personal (not social security numbers, but names and birthdays) information. How would you like information collected about you to be stored? If you could write your own privacy policy, what would it contain? To be clear, I'm not requesting stack or infrastructural recommendations."

This discussion has been archived. No new comments can be posted.

Just don't do it (Score:5, Insightful)

by sublayer ( 2465650 ) writes: on Tuesday September 11, 2012 @12:09AM (#41296233)

Best practice from my perspective: do not collect the data at all.

- Re:Just don't do it (Score:5, Insightful)
  
  by puterguy ( 642044 ) writes: on Tuesday September 11, 2012 @12:17AM (#41296277)
  
  If you really feel the need to collect personal data and you *truly* care about the privacy concerns and needs of your customers, then don't go burying such disclosures in a privacy statement that the average user is unlikely to ever see let alone read.
  If you truly care about privacy, then either require the user to *opt-in* to such sharing or prominently display the lack of such privacy on the initial splash screen.
  Burying the collection of personal data in the middle of some lawyerly gobbledygook privacy statement is like mortgage lenders burying key terms in the middle of hundreds of pages of documentation. Yeah, it's legally there, but no one is actually going to read or understand it.
  
  - Re: (Score:3, Insightful)
    
    by philip.paradis ( 2580427 ) writes:
    
    Alternately, people could simply take responsibility for themselves and choose to avoid services which require agreement to miles of terms. Given your attitude on the topic, you probably haven't even bothered to read the terms of service for anything you're using right now. It seems you're trying to divert responsibility for yourself onto the backs of the service organizations you choose to deal with. Again, note the word "choose."
    You've also managed to miss the opportunity to discuss where data goes and ho
    - Re:Just don't do it (Score:5, Funny)
      
      by davester666 ( 731373 ) writes: on Tuesday September 11, 2012 @02:26AM (#41296845) Journal
      
      Yes, just store the data in plaintext, in a mysql database connected directly to the internet.
      Bonus points if you create mysql users for each unique user and use their username/password to authenticate connections to the database.
      
      - Re: (Score:2)
        
        by rwise2112 ( 648849 ) writes:
        
        I was going to say "email it to Anonymous - they'll back it up for you too", but your method will be just as effective!
    - Re:Just don't do it (Score:4, Interesting)
      
      by CodeBuster ( 516420 ) writes: on Tuesday September 11, 2012 @03:08AM (#41296949)
      
      Whenever I'm signing up for a new site or using a service for the first time, I always do a recon of their sign-up procedures using a fake name / email address so I can see what sort of information they "require" before I even get started, and even then I only give up what I absolutely have to. If I can get away with using the fake information permanently, then I do that. I keep track of all my fake identities in an encrypted file container by site name so that I can be consistent with my aliases. This strategy works well for me and I'm sure that I can't be the only person out there who does this. As Robert De Niro's character, Jack Byrnes, said in Meet the Fockers (paraphrased), "If you're outside the circle of trust, you're on a need to know basis and right now you don't need to know."
      
    - Re: (Score:1)
      
      by maxwell demon ( 590494 ) writes:
      
      Alternately, people could simply take responsibility for themselves and choose to avoid services which require agreement to miles of terms.
      Unfortunately that would mean having no internet access (good luck finding an internet provider without a big list of terms and requirements).
  - Re:Just don't do it (Score:5, Interesting)
    
    by jittles ( 1613415 ) writes: on Tuesday September 11, 2012 @10:05AM (#41299211)
    
    >Burying the collection of personal data in the middle of some lawyerly gobbledygook privacy statement is like mortgage lenders burying key terms in the middle of hundreds of pages of documentation. Yeah, it's legally there, but no one is actually going to read or understand it.
    When I bought my house, I spent about 3 hours at the title company reading and signing the mountain of paperwork. I would never commit myself to 30 years of anything without knowing and understanding the details. I will say that the notary was pissed. After 30 minutes she said "Are you really going to read the entire thing?" And later "I have an appointment, you're going to make me late." My responses were "Yes, I'd be stupid not to." and "You scheduled this entire block with me, it's not my fault you double booked yourself, you'll have to cancel your other appointment."
    
    - Re: (Score:2)
      
      by AwesomeMcgee ( 2437070 ) writes:
      
      Hah +1, yeah I've come to take joy in the pain of those who present me legal documents for signing. They never expect you to read the bloody thing and always get all cranky about how long you're taking. Apt leases, car loans, new banking accounts etc every single person who's handed me one of these after about 30 minutes of me reading it just looks so dejected, plus all the questions I ask as I go, and sometimes even demand addendums. They shouldn't be handing out legal paper work for signature if they don'
    - Re: (Score:2)
      
      by nullchar ( 446050 ) writes:
      
      That's great and all, but what happens when you read page 200 and say "uh, I don't agree with this"?
      Response: "Sorry, no house for you!"
      - Re: (Score:2)
        
        by jittles ( 1613415 ) writes:
        
        That's great and all, but what happens when you read page 200 and say "uh, I don't agree with this"?
        Response: "Sorry, no house for you!"
        Well, if the document does not match the preview doc they sent you, or match the terms and rates that they promised you (you get that in writing before you get the contract), then they have to update the contract. There are some crooks out there that will tell you one interest rate and slip another into the docs. You really need to trust your mortgage broker. I used a friend's dad, thankfully. He was very helpful, and I knew him to be honest. He even used his commission on my loan to buy me some points
    - Reading it. (Score:3)
      
      by hendrikboom ( 1001110 ) writes:
      
      Here in Quebec, the notary actually reads the entire document to you and asks you enough questions that he is sure you've understood it.
      - Re: (Score:2)
        
        by jittles ( 1613415 ) writes:
        
        That is probably how it should be. Most people have a hard time understanding the kind of language they use in contracts anyway. I had the advantage of working for lawyers for most of my college career, first doing help desk for the Attorney General and then later doing transcription as a pseudo-legal secretary for a large bankruptcy firm. Otherwise, the contract might have seemed like Latin...
- Re: (Score:2, Insightful)
  
  by hutsell ( 1228828 ) writes:
  
  Best practice from my perspective: do not collect the data at all.
  Exactly: "Put the Database down now, and step away from the Internet."
  Sorry, but my interest in giving beneficial doubt to the question's possible sincerity was lost when reading the part about the unoriginal solution for ensuring honesty and transparency -- the solution being hidden in (the lawyer make-work terms of) "our privacy disclosure".
  - Re: (Score:2)
    
    by isaaccs ( 1854142 ) writes:
    
    There were little to no details given as to how the privacy disclosure would be phrased or provided to users. As it were, your assumption is wrong. There is no desire to squirrel away anything in legalese. Indeed, the question asks: "If you could write your own privacy policy, what would it contain?". You describe the "hidden" (which you've assumed) solution as unoriginal, but provide no alternative suggestions (which was the point of submitting the question to the community in the first place).
- Re: (Score:3)
  
  by fm6 ( 162816 ) writes:
  
  So, Slashdot made a mistake in allowing you to create an account?
- Re: (Score:3)
  
  by c0lo ( 1497653 ) writes:
  
  Best practice from my perspective: do not collect the data at all.
  More detailed:
  Rule 1. don't do it
  Rule 2. if for some reasons, rule 1 cannot be followed, collect them but discard them immediately
  Rule 3. if for some reasons, the prev 2 rules cannot be obeyed, after collection put them on a WORN storage (that is: "Write Only, Read Never" media)
- Re: (Score:1)
  
  by Anonymous Coward writes:
  
  I hate to IANAL here but here goes:
  In your country of origin you have legislation that you must prove compliance with should the relevant government body find out that you are collecting user information. Personally identifiable stuff (name, address, phone number, e-mail) is considered sensitive; personally identifiable sensitive stuff (social insurance, health records, employment history, criminal records, etc., ad nauseam) comes with hefty legislation, like HIPAA, for each type of stuff. Again the parent h
  - Re: (Score:2)
    
    by sapgau ( 413511 ) writes:
    
    Mod Up +1
    No matter what you want to do with PI, you must first check that it is legal, first in your jurisdiction (state or province), then in the country (or countries) where you expect your customers to reside.
    It doesn't matter what good intentions you have, it might not be enough to keep you out of trouble.
    For example, if your jurisdiction forbids you from keeping DOB then make sure you are clean.
- Re: (Score:1)
  
  by Eadwacer ( 722852 ) writes:
  
  I think it was Robert X. Cringely who compared personal user data to toxic waste. You don't ever want to produce it. If you do produce it, it's your responsibility forever because you don't know where an undiscovered drum of it is hiding. If it touches something, that something becomes toxic also. Finally, the legal implications of it getting out into public are capable of destroying your company.
- - Re: (Score:1)
    
    by maxwell demon ( 590494 ) writes:
    
    Who rated this post "Insightful"?
    Someone with mod points.
    Why don't I see any rating buttons?
    Because you only get the option to moderate if you (a) are logged in (you cannot moderate as Anonymous Coward), (b) have enough Karma (which basically means your posts have been moderated up often enough, and certainly more often than down), and (c) happen to have some mod points (even if your Karma is high enough, you'll only get mod points every now and then, and if you don't use them, they'll expire in a few days)
risk vs. investment tradeoffs (Score:4, Informative)

by noh8rz10 ( 2716597 ) writes: on Tuesday September 11, 2012 @12:10AM (#41296235)

I think your mind is on the right track in identifying your resource limits (i.e. no tip-of-the-spear experts) and the sensitivity of the data (i.e., it's not all nuclear bomb codes). That is the first step. Next, think on the exact types of data that you're collecting, and try to group like data together, for example, all text data, screen caps, keylogging, audio or webcam video if you have it, and find a way to store them in an efficient structure while everything stays linked together. Finally, if possible, associate all data collection events with time (timestamp) and location (gps). This will allow a more complete analysis on the back end.

- Re:risk vs. investment tradeoffs (Score:4, Insightful)
  
  by SomePgmr ( 2021234 ) writes: on Tuesday September 11, 2012 @12:47AM (#41296453) Homepage
  
  Finally, if possible, associate all data collection events with time (timestamp) and location (gps).
  It started getting a little creepy there at the end, bud. ;)
  
  - Re: (Score:2)
    
    by asylumx ( 881307 ) writes:
    
    If you think that's bad, keep reading:
    this will allow a more complete analysis on the back end.
    He wants to analyze users "back ends"!!!
- - Re: (Score:1)
    
    by noh8rz10 ( 2716597 ) writes:
    
    Good point. White hat root kits!
Don't store the data. (Score:1)

by micheas ( 231635 ) writes:

Just don't.
When you get the expertise to store the data securely then consider it.
Once you get into the habit of justifying everything that you store you will be less prone to the woops! plain text password/username/real-name/creditcard table being found by intruders.
- Re: (Score:1)
  
  by isaaccs ( 1854142 ) writes:
  
  To start, I do appreciate the spirit of the comment - as a professional in a field, it's an argument I make often. But I don't totally agree in this context. It would prove extremely difficult, for example, to build a search engine such as Google without collecting or correlating user information. To build Instagram without collecting pictures (which I'd very much consider private user data/personal identifiers) might also prove vexing. The question wasn't "Should I collect user information?" but "How can
  - Re: (Score:2)
    
    by sapgau ( 413511 ) writes:
    
    I didn't see gp post as condescending, I think he is trying to make the point of how serious private information storage is.
    I did cringe at his reference to putting password/username/cc in one table, even encrypted. I suggest using hash values to replace those real values and masking CC numbers. So even if the encryption is broken, a hacker would not be able to identify the person.
    Doing this doesn't limit your innovation in any way. It's actually a burden we all have to deal with to avoid a legal bomb landing
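To make sapgau's masking suggestion concrete, here is a minimal Python sketch of masking a card number before it is stored or logged. The function name `mask_card_number` and the keep-last-four rule are assumptions for illustration, not something stated in the thread, and masking alone does not satisfy any particular payment-card compliance regime.

```python
def mask_card_number(card_number: str) -> str:
    """Return the card number with all but the last four digits replaced by '*'.

    Hypothetical helper: spaces and dashes are dropped, so only digits count.
    """
    digits = [ch for ch in card_number if ch.isdigit()]
    if len(digits) < 4:
        raise ValueError("card number too short to mask")
    # Only the last four digits survive; the rest can no longer be recovered.
    return "*" * (len(digits) - 4) + "".join(digits[-4:])


if __name__ == "__main__":
    print(mask_card_number("4111 1111 1111 1111"))
```

The point of the design is that the full number never reaches the table in the first place, so a leaked or decrypted row exposes only the last four digits.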
Let me have a login? (Score:1)

by aliquis ( 678370 ) writes:

Let me have a login for the benefit of having my data saved?
If I don't log in then don't store my details.
As for the rest whatever. Hash + salt or whatever?
If no-one can reach / use the data for anything then maybe say just e-mail address or something such as identifier.
- Re: (Score:2)
  
  by ThatsMyNick ( 2004126 ) writes:
  
  Well, if you are looking for developer/legal opinions there are better forums, but if you want legal, developer and user opinion (and a discussion based on them), Slashdot is not bad. Besides, you don't really know that OP has not also posted in a better developer/legal-oriented forum (and I find it strange that you mention that you wouldn't post on Slashdot, but fail to mention the forum that is appropriate for this question (unless you yourself were just trolling)).
  - Re: (Score:1)
    
    by Osgeld ( 1900440 ) writes:
    
    I don't know the appropriate forum, as I am not an experienced web developer, nor would I expect any serious answer from Slashdot when I do need it. I develop electronics; I don't post which FET has the best ESD damage resistance on Slashdot, nor would I expect anything but random opinion from it.
    When you're serious, you get the data from people who have been down that road, and test it yourself, not post to some news recycler and hope for the best.
    - Re:I'm an experienced developer (Score:4, Insightful)
      
      by SomePgmr ( 2021234 ) writes: on Tuesday September 11, 2012 @12:55AM (#41296491) Homepage
      
      I'd give him the benefit of the doubt, and assume this isn't the only place he's looking for best practices.
      Meanwhile, "I'm an experienced developer, I'm familiar with all the general rules for securing customer data, but I'd like to hear of any 'gotchas' that you know about"? That seems like a reasonable thing to ask.
      Again, assuming this isn't the one-and-only source. So instead of grabbing our pitchforks, maybe someone has some examples of what he asked about?
      
      - Re: (Score:1)
        
        by Anonymous Coward writes:
        
        There's the blatantly obvious stuff: keep the data heavily encrypted on a back-end d/b or file store, on a server nowhere near a public-facing interface (or DMZ); obfuscate and/or consolidate the individual, personal data as soon as you gather it, assuming you don't need specific per user info to be retained. Needless to say, keep all your OS/software/services/apps/etc patched with latest security on a weekly, if not daily basis, FFS!
        Also, invite some wannabe hack-meisters you can kind-of-trust to try &
        
        Re: (Score:2)
        
        by Electricity Likes Me ( 1098643 ) writes:
        
        Isn't your first bit of advice right there a classic gotcha?
        Encryption doesn't mean anything unless the access routes to that encrypted data are well defined and understood - since at some point it has to be unencrypted to be used. So who's doing the unencrypting, who holds the keys etc.
  - Re: (Score:1)
    
    by isaaccs ( 1854142 ) writes:
    
    In this forum, I submitted to seek the opinions of a community of technically minded individuals on a question that hinges on broader social concern. I did/do not expect a uniform or comprehensive answer. I expected to hear the voices of different people who have thought about, dealt with, or otherwise concern themselves with data collection. I am much aware that this is not a legal or technical venue - and I appreciate your acknowledgement that this may not be the only avenue I've pursued to inform myself.
- Re: (Score:2, Interesting)
  
  by Anonymous Coward writes:
  
  Agreed. People mistake this for a technical forum.
  - Re: (Score:1)
    
    by isaaccs ( 1854142 ) writes:
    
    The question specifically says "I'm not seeking stack or infrastructural recommendations." This is not a technical question. The question is posed to the community as it bears on *social* issues.
- Re: (Score:1)
  
  by noh8rz10 ( 2716597 ) writes:
  
  thank you. I'm updating my sig with your quote.
- - Re: (Score:1)
    
    by Anonymous Coward writes:
    
    the problem is not that /. has both good and bad, it is if you don't know the answer then how the hell do you think they will know enough to sort the good from the bad. reading through he has already gotten a mix of both for this topic.
- Re: (Score:2)
  
  by bloodhawk ( 813939 ) writes:
  
  Not sure why you got marked flamebait; even as a developer I find your comments spot on. If you are not experienced enough to know the answer to this topic then /. is not the place to be asking, as you won't have the knowledge to sort the garbage from the good advice. Incidentally I would love to know the name of this new site, as I think it is one I would avoid for my own safety.
Don't (Score:4, Informative)

by SmartyPants ( 27576 ) writes: on Tuesday September 11, 2012 @12:22AM (#41296317) Homepage

Honestly... try not to store it.
You need to examine why you actually need the data, and if you can't think of a good reason (except it might be valuable in the future), then don't store it.
If you do need it for analysis, machine learning apps, etc., try to anonymize it as early as possible, and don't keep raw data longer than you need it (say, raw data for 3 months, then just store aggregate info).
Also, for behavior, you don't need years of information. Studies have shown people change, so make sure the things people did recently are more important, and the old stuff gradually decays.
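The retention window and gradual decay described above can be sketched in a few lines of Python. The 90-day window comes from the "3 months" in the comment; the 30-day half-life and the function names are assumptions made for illustration, since the comment gives no decay rate.

```python
from datetime import datetime, timedelta, timezone

RAW_RETENTION = timedelta(days=90)  # "raw data for 3 months" from the comment
HALF_LIFE_DAYS = 30.0               # assumed value; tune to the application


def is_expired(collected_at: datetime, now: datetime) -> bool:
    """True when a raw record is older than the retention window."""
    return now - collected_at > RAW_RETENTION


def decayed_weight(collected_at: datetime, now: datetime) -> float:
    """Exponential decay: an event loses half its weight every HALF_LIFE_DAYS."""
    age_days = (now - collected_at).total_seconds() / 86400.0
    return 0.5 ** (age_days / HALF_LIFE_DAYS)


def purge_raw(event_times: list, now: datetime) -> list:
    """Keep only the raw event timestamps still inside the retention window."""
    return [t for t in event_times if not is_expired(t, now)]


if __name__ == "__main__":
    now = datetime(2012, 9, 11, tzinfo=timezone.utc)
    events = [now - timedelta(days=d) for d in (1, 30, 120)]
    kept = purge_raw(events, now)
    print(len(kept), [round(decayed_weight(t, now), 3) for t in kept])
```

A scheduled job would call something like `purge_raw` against the raw store, while any scoring code uses `decayed_weight` so that recent behavior dominates.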

- Re: (Score:2)
  
  by mcrbids ( 148650 ) writes:
  
  As a counterpoint, Don't process all that data.
  At my company, we store everything. Every click, every bit of data, nightly snapshots of all data, etc. Forever. This results in stupid amounts of data about our users and we pretty much don't bother to try to correlate the data, we just provide it upon request of the customer.
  Why try to correlate it, when our customers are eager to pay us to do other things with it? Just because you have the data, doesn't mean you have to be devious with it. Save everything re
- Re: (Score:2)
  
  by stephanruby ( 542433 ) writes:
  
  If your purpose is really just for "analysis and ultimately functionality, not persistence" then there is really no reason to keep an email or a name. Just assign a unique identifier, and then you're done.
  So if for some reason, the user wants to get in touch with you to file a bug report, or what not, then assign a unique identifier for the device to the bug report (in case you get other bug reports coming from the same source), but don't ask for his/her contact information unless the user ticks a box askin
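A rough Python sketch of stephanruby's idea follows: identify an installation by a random identifier rather than a name or e-mail, and record contact details on a bug report only when the user asks for a reply. `new_install_id`, `BugReport` and `file_bug_report` are hypothetical names, and the random UUID is an assumption; the comment only says "unique identifier".

```python
import uuid
from dataclasses import dataclass
from typing import Optional


def new_install_id() -> str:
    """Random identifier, generated once per installation; carries no PII."""
    return uuid.uuid4().hex


@dataclass
class BugReport:
    install_id: str
    description: str
    contact_email: Optional[str] = None  # stays None unless the user opts in


def file_bug_report(install_id: str, description: str,
                    wants_reply: bool = False,
                    email: Optional[str] = None) -> BugReport:
    """Attach the e-mail only when the user ticked the 'contact me' box."""
    contact = email if wants_reply else None
    return BugReport(install_id, description, contact)


if __name__ == "__main__":
    install_id = new_install_id()
    print(file_bug_report(install_id, "crash on launch"))
```

Reports from the same source can still be grouped by `install_id` without the service ever learning who the reporter is.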
Start reading about PII (Score:3, Informative)

by Anonymous Coward writes: on Tuesday September 11, 2012 @12:24AM (#41296331)

Wikipedia (http://en.wikipedia.org/wiki/Personally_identifiable_information) is a good start.

Break the association (Score:5, Insightful)

by cheros ( 223479 ) writes: on Tuesday September 11, 2012 @12:31AM (#41296377)

If at all possible, stay away from personally identifiable data. If your aim is to use identity as an index, work out a way in which you can translate an identity into an index or hash value (i.e. one way). This is not going to be perfect (there will be about a million "John Smith"s out there), but if you have a consistent pair such as name and phone number, turn that into a hash and use it as data index.
That means you can still do correlations, but a leak will not result in exposure of personal data.
However, first of all, look at what you're holding on personal data and simply assume you got hacked and it's "out there" - plan for that crisis first because there is one question you need to answer:
If you cannot afford to pay for security advice, can you afford to pay for the inevitable consequences?
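The name-plus-phone index described above might look like the following Python sketch. One deliberate substitution: it uses a keyed hash (HMAC-SHA256 with a server-side secret) rather than a plain hash, on the assumption that names and phone numbers are guessable enough for an unkeyed hash to be reversed by brute force. The function name, the normalisation rules and the key handling are all illustrative assumptions.

```python
import hashlib
import hmac


def pseudonym(name: str, phone: str, secret_key: bytes) -> str:
    """Derive a stable, one-way index from a (name, phone) pair.

    secret_key is assumed to live outside the database (for example in the
    application's configuration), so a leaked table alone cannot be
    brute-forced back into names and numbers.
    """
    normalized_name = " ".join(name.lower().split())
    normalized_phone = "".join(ch for ch in phone if ch.isdigit())
    message = f"{normalized_name}|{normalized_phone}".encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


if __name__ == "__main__":
    key = b"example-only-secret"  # placeholder; never hard-code a real key
    print(pseudonym("John Smith", "555-0100", key))
```

Because the same inputs always produce the same index, correlation across records still works, which is the property the comment is after.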

- Re: (Score:1)
  
  by noh8rz10 ( 2716597 ) writes:
  
  If you cannot afford to pay for security advice, can you afford to pay for the inevitable consequences?
  put another way: if you can't afford to do it right, how will you afford to do it again?
  - Re: (Score:1)
    
    by isaaccs ( 1854142 ) writes:
    
    There is validity to this point, but followed to its conclusion, many of the great boot-strapped startups of our time wouldn't exist. As your exposure and user base grows, so does your ability to consult with specialists and experts - but everyone must start somewhere.
- Re: (Score:2)
  
  by dgatwood ( 11270 ) writes:
  
  Or keep personally identifiable information separate from everything else. Ensure that you cannot get to one data set from the other and vice versa. Use login information as a hash into the identity database and the behaviors database. If you must store any time stamps on database records, make sure you do so in a way that prevents using them to easily correlate the two data sets (e.g. update the time stamp on the personal info record only when the user changes his/her password, address, or whatever, ra
- Re: (Score:2)
  
  by Lorens ( 597774 ) writes:
  
  If your aim is to use identity as an index, work out a way in which you can translate an identity into an index or hash value (i.e. one way). This is not going to be perfect (there will be about a million "John Smith"s out there), but if you have a consistent pair such as name and phone number, turn that into a hash and use it as data index.
  Bad idea when you get a hash collision. Account numbers do not have to be seen by the user, but there aren't (m)any useful ways of avoiding their use internally.
  If OP is storing data for analysis and not for immediate reuse, there are some often overlooked but stupidly easy things to do like making sure that the user-facing machines collecting the data only have append/insert access to the data (no read, no modify). Analysing the data would be done from another machine/subnet/database account whatever.
  - Re: (Score:3)
    
    by cheros ( 223479 ) writes:
    
    He said he had little money available, so I figured I gave him something that was easy vs. perfect. The key question is if the delta introduced by the odd hash collision is actually significant in the volume of data he is planning to process. If it isn't, I would not try to develop perfection - he can use his little funding better elsewhere..
    In other words, in theory you're absolutely right, in practice I suspect there is little difference. But my favourite way of avoiding issues with personal data is si
- Re: (Score:1)
  
  by fa2k ( 881632 ) writes:
  
  Great idea for some cases. If you need "telemetry" data to understand how people are using your application, assign each session a unique ID and don't store which user did it. It also works for some other statistical data. The argument against is that you may need the correlation between sessions later.
  Depending on the application, you could have a hierarchical system of databases where the lowest level contains session information, the next contains persistent user information but not personally identifiab
  - Re: (Score:1)
    
    by fa2k ( 881632 ) writes:
    
    Regarding my second paragraph, an important part was not obvious: Each session in the session database has a unique ID, and each anonymised user in the middle database has a list of sessions, and each user in the top database points to an anonymised user.
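The three-level layout fa2k describes can be sketched in Python with three separate stores. In-memory dictionaries stand in for what would be separate databases with separate access controls; every class and function name here is hypothetical.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Session:                      # lowest level: no user information at all
    session_id: str
    events: List[str] = field(default_factory=list)


@dataclass
class AnonymisedUser:               # middle level: persistent but not identifiable
    anon_id: str
    session_ids: List[str] = field(default_factory=list)


@dataclass
class IdentifiedUser:               # top level: the only place PII lives
    email: str
    anon_id: str


sessions: Dict[str, Session] = {}
anon_users: Dict[str, AnonymisedUser] = {}
identified_users: Dict[str, IdentifiedUser] = {}


def register(email: str) -> IdentifiedUser:
    """Create the top-level record and the anonymised user it points to."""
    anon = AnonymisedUser(anon_id=uuid.uuid4().hex)
    anon_users[anon.anon_id] = anon
    user = IdentifiedUser(email=email, anon_id=anon.anon_id)
    identified_users[email] = user
    return user


def start_session(anon_id: str) -> Session:
    """Create a session and add it to the anonymised user's session list."""
    session = Session(session_id=uuid.uuid4().hex)
    sessions[session.session_id] = session
    anon_users[anon_id].session_ids.append(session.session_id)
    return session


def sessions_for(email: str) -> List[Session]:
    """Walk top -> middle -> bottom; requires access to all three stores."""
    anon = anon_users[identified_users[email].anon_id]
    return [sessions[sid] for sid in anon.session_ids]
```

Analysis code that only needs telemetry can be given the `sessions` store alone; linking a session back to a person requires the other two stores as well.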
Collect as little as possible, throw it away... (Score:5, Interesting)

by IBitOBear ( 410965 ) writes: on Tuesday September 11, 2012 @12:36AM (#41296407) Homepage Journal

I have been toying with a site idea. Your account name is your public key fingerprint. Your public nickname is whatever you use in the message. Your login is validated because everything you send is signed with the key that matches the fingerprint (and encrypted with my public key for transmission). Input to user form is constrained and validated within those constraints (to prevent padding attacks).
I would then have a database "key x","paid through date y".
Sure, I couldn't sell any farmed data a la Facebook, but subpoena requests would be a breeze... "here's your hex dump..."

- P.S. (Score:2)
  
  by IBitOBear ( 410965 ) writes:
  
  Return email will be sent, if necessary, to whatever address(es) are registered in the public key database for that fingerprint, encrypted with that key.
  Obviously I have no control over your passphrase and can do nothing to help you "recover your password" or whatever. Please see your GPG or PGP documentation for a better explanation.
  Your account will not be "renewed" past the key expiration date.
- Re: (Score:2)
  
  by Rob Kaper ( 5960 ) writes:
  
  I have been toying with a site idea. Your account name is your public key fingerprint. Your public nickname is whatever you use in the message. Your login is validated because everything you send is signed with the key that matches the fingerprint (and encrypted with my public key for transmission). Input to user form is constrained and validated within those constraints (to prevent padding attacks).
  I would then have a database "key x","paid through date y".
  Sure, I couldn't sell any farmed data a la Facebook, but subpoena requests would be a breeze... "here's your hex dump..."
  If you accept payments, wouldn't those keys still be linked to contact information and/or payment transactions?
  - Payment Receipts (Score:2)
    
    by IBitOBear ( 410965 ) writes:
    
    Not for any longer than necessary. Likely I would make that opt-in.
    I would have a payment history (bob paid x dollars for y time) as an atomic event. Bob could check a box to say "remember this for me", or not at the time of payment.
    At the time of payment I would also send Bob a receipt. That receipt would say "Bob paid for a service". The receipt would also contain a dot-splash (e.g. QR code, a linear 2D barcode, depending on how much info space I turn out to need) that was the "proper join record for the da
- The little niceties (Score:2)
  
  by IBitOBear ( 410965 ) writes:
  
  There would be other little niceties.
  Aggressive use of POST instead of GET messages on all forms so that pen-trap requirements, if levied, would be largely moot, as in: user XXXXXX did POST to "/" at this site on these dates and times. [POST data is not legal to collect in pen traps in the USA as I understand the law.]
  Services a site could sell? POST the URL you want as part of the encrypted blob you send to this site; we will retrieve it, scrub it and send its content back to you encrypted with your key.
  Pa
Give me control and earn my trust (Score:4, Insightful)

by johnnick ( 188363 ) writes: on Tuesday September 11, 2012 @12:37AM (#41296411)

The short requirements:
1) Explain what you're collecting in real-time at the moment when you give me the option whether or not to permit you to collect it. Tell me what you will use it for, when you will delete it and the consequences if I don't give it to you. People don't read privacy disclosures. Give notice and ask permission at the moment of proposed collection. Make it opt-in, not opt-out.
2) Only request the information required to perform the service I've requested. Use the information I provide only to provide the service I've requested. Only share the information I provide with third parties to the limited extent necessary to provide the services I've requested. Obtain contractual commitments from those third parties that cause them to protect my information and delete it as soon as they've done what's required to provide the service I've requested. Keep information only as long as necessary to provide the service I've requested and delete it after you've done what's required to provide the service I've requested.
3) Protect my information. Encrypt in transit and at rest. Delete thoroughly and don't give in to the urge to collect and keep information just because it might be useful some time in the future. You can't lose what you don't have.
You say the collection "... is for purposes of analysis and ultimately functionality, not persistence." That seems inconsistent with the collection of name and email address. I can't think of too many use cases where you're collecting my name and email address and don't plan to keep it (and use it for marketing or otherwise share it in some way). If you need to contact me or I need to create a user-id that is my email address, you don't need my name.
Your privacy policy is your contract with your user. It is an operational document that must be consistent with your practices. The privacy policy should be consistent with your policies and procedures. If the information you collect, or the way you handle it changes, you must change your privacy policy.
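
For what it's worth, points 1 and 2 can be sketched in a few lines of Python. Everything named here (`ConsentRecord`, `ask_permission`, the 30-day default) is made up for illustration and not taken from any real framework; it only shows the shape of "ask at the moment of collection, record the purpose, and know the delete-by date up front":

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


def utc_now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass(frozen=True)
class ConsentRecord:
    """One opt-in decision for one field, captured at the moment of collection."""
    field_name: str        # what is being collected, e.g. "email"
    purpose: str           # what it will be used for, as shown to the user
    granted: bool          # True only if the user explicitly said yes
    granted_at: datetime   # when the user answered
    retention: timedelta   # how long the value may be kept

    def delete_after(self) -> datetime:
        return self.granted_at + self.retention

    def may_store(self, now: datetime) -> bool:
        # Store only with explicit permission and only inside the retention window.
        return self.granted and now < self.delete_after()


def ask_permission(field_name, purpose, user_said_yes, now, retention_days=30):
    # Hypothetical helper: a real app would show a prompt; here the answer is passed in.
    return ConsentRecord(field_name, purpose, bool(user_said_yes), now,
                         timedelta(days=retention_days))
```

The point of carrying `retention` on the record is that the delete-by date is decided when you ask, not whenever someone remembers to clean up.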

- Re: (Score:2)
  
  by TheDarkMaster ( 1292526 ) writes:
  
  I think your answer is the best I've seen for the issue.
Support OpenID (Score:2)

by interval1066 ( 668936 ) writes:

...and let your users, investors, and you sleep easier at night. Don't store anything at all except a few prefs.
You can't afford it, by your own admission. (Score:4, Insightful)

by VendettaMF ( 629699 ) writes: on Tuesday September 11, 2012 @01:40AM (#41296699) Homepage

If you can't afford the expert then you can't afford to collect such data. Move away from this project to something you have the ability to do.

- Re: (Score:3)
  
  by Mike610544 ( 578872 ) writes:
  
  If you can't afford the expert then you can't afford to collect such data. Move away from this project to something you have the ability to do.
  I'm surprised it took this long for someone to say that. The people who will exploit your system and extract something valuable from it can afford those experts.
OWASP (Score:5, Informative)

by FormOfActionBanana ( 966779 ) writes: <slashdot2@douglasheld.net> on Tuesday September 11, 2012 @01:42AM (#41296709) Homepage

OWASP has guidance; for instance, here: https://www.owasp.org/index.php/IOS_Developer_Cheat_Sheet#Insecure_Data_Storage_.28M1.29 [owasp.org]
From https://www.owasp.org/images/5/5e/Mobile_Security_-_Android_and_iOS_-_OWASP_NY_-_Final.pdf [owasp.org]
2. Insecure data storage
Solution
Avoid local storage inside the device for sensitive information
If local storage is “required”, encrypt data securely and then store
Use the Crypto APIs provided by Apple and Google
Avoid writing custom crypto code – prone to vulnerability

- Re: (Score:1)
  
  by fa2k ( 881632 ) writes:
  
  Avoid local storage inside the device for sensitive information
  That does make sense, but it still feels like I've fallen into opposite land.
  Avoid writing custom crypto code – prone to vulnerability
  Yes! I'll repeat it a couple of times
  Avoid writing custom crypto code – prone to vulnerability
  Avoid writing custom crypto code – prone to vulnerability
Book of best practices (Score:5, Insightful)

by Okian Warrior ( 537106 ) writes: on Tuesday September 11, 2012 @02:01AM (#41296751) Homepage Journal

In the US, we have the National Electrical Code [wikipedia.org] which explains in clear detail how house wiring is constructed.
Following the code is a legal requirement in many (most?) states, but from the point of view of an electrician it's a "book of best practices". Use this gauge wire for this current, staple the wire within 6" of the box, and so on. The code gets revised and added to over time as questions crop up, new technologies get added, and people get more experience.
There's a reason for everything. For example, the light in a bathroom should be on a separate breaker from the outlet next to the sink. It makes sense in retrospect, but this is not something that is obvious beforehand.
It's very detailed, but also very clear. Homeowners routinely understand the instructions and are able to make simple repairs and modifications to their home wiring which conform to the code.
We throw a lot of "best practices" around here as if they were simple and obvious at the outset, but maybe they're not. Hash your passwords, salt the hash, sanitize the form inputs, don't keep CC info... lots of best practices which in hindsight make sense but which aren't necessarily obvious beforehand.
Most web apps have common requirements for login, identity management, privacy, various forms of functionality, and so on.
Should we have a "book of best practices"?
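
As an example of what one page of such a book might look like, here is a minimal sketch of "hash your passwords, salt the hash" using only the Python standard library. The function names and the iteration count are my own assumptions, not a recommendation from any standard; a real deployment should pick the work factor for its own hardware:

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 200_000  # assumed work factor; tune for your hardware


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest) for storage; the plain password is never stored."""
    salt = os.urandom(16)  # fresh random salt per user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, digest)
```

Because each user gets a different salt, two users with the same password end up with different stored digests.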

- Re: (Score:2)
  
  by fuzzyfuzzyfungus ( 1223518 ) writes:
  
  I suspect that the big problem with that analogy is that data collection (unlike electrical wiring) is a substantially adversarial field.
  There is a certain amount of tension, (fast, cheap, good, pick any two, and the usual buyer/seller desire to not leave money on the table); but the buyer and the seller both share roughly the same ideal, though they may deviate from it out of laziness, cheapness, or incompetence.
  With data collection, the purely security/architectural aspects are somewhat similar; but there
  - Re: (Score:2)
    
    by khallow ( 566160 ) writes:
    
    The same tension exists in electrical wiring. But one can physically inspect the entire work. With data collection, it's pretty easy to hide what you are doing from the target of your collecting.
Aggregate Data (Score:2)

by Archangel Michael ( 180766 ) writes:

Aggregate the data as quickly as possible to anonymize it.
Collect "Mary did X, Y but not Z", but aggregate it to Three people did X, Two Y and TWELVE Z and drop Mary from the data. You don't need to know Mary did anything.
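
To make the Mary example concrete, a minimal Python sketch. The `aggregate` function and the sample events are invented for illustration:

```python
from collections import Counter


def aggregate(events):
    """Collapse (user, action) events into per-action counts of distinct users.

    The returned Counter contains no user names.
    """
    seen = set(events)  # de-duplicate so one user counts once per action
    return Counter(action for _user, action in seen)


raw_events = [
    ("Mary", "X"), ("Mary", "Y"),
    ("Bob", "X"), ("Bob", "Z"),
    ("Ann", "X"), ("Ann", "Y"), ("Ann", "Y"),
]
totals = aggregate(raw_events)
del raw_events  # the per-user rows are discarded once the totals exist
```

Keep in mind that very small counts can still point at one person, so treat this as a starting point rather than a guarantee of anonymity.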
- Re: (Score:2)
  
  by AwesomeMcgee ( 2437070 ) writes:
  
  +1 this is exactly what I was going to say and what I have done in the past when presented with these situations. Best bet if you *must* have non-aggregated data is to simply identify each user by a guid that gets embedded in each client, with no identifying information.
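
A rough sketch of the guid idea in Python, assuming the client generates the ID once at install time and attaches it to every event. `new_install_id` and `make_event` are hypothetical names, and the event fields are made up:

```python
import json
import uuid


def new_install_id() -> str:
    # uuid4 is random, so the ID carries no name, e-mail, or hardware serial.
    return str(uuid.uuid4())


def make_event(install_id: str, action: str) -> str:
    """Serialize an analytics event keyed only by the opaque install ID."""
    return json.dumps({"id": install_id, "action": action}, sort_keys=True)
```

The server can still group events per install, but nothing in the payload says who the person is.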
  
  Also there are a lot of laws around the world regarding things like this which can and cannot be tracked *at all* that no amount of legal disclosure will make lawful in some places. Seriously, just avoid any form of identifying data (pre
What is it for? (Score:1)

by Silvanis ( 152728 ) writes:

You say you aren't interested in persistence, so I don't see any reason why the data needs to be personally identifiable. Whether your index is John Smith in Albany,NY or User #71829382 doesn't matter for usage analytics. Even demographic information can at least be stripped of things like name and phone number.
If you REALLY need to tie this information to a particular instance, then use a hardware key from the mobile device and not a user's information. A hacked phone is easier to deal with than identity t
Also consider TLDR-TOS (Score:3)

by Krishnoid ( 984597 ) * writes: on Tuesday September 11, 2012 @02:40AM (#41296875) Journal

This site [tos-dr.info] provides summaries of the terms-of-service policies for various companies covering privacy, retention, and use of user information. You can use it to compare your plans with those of major companies and identify privacy or TOS concerns you may have overlooked.

Re: (Score:2)

by account_deleted ( 4530225 ) writes:

Comment removed based on user account deletion
On a need to know basis only (Score:2)

by istartedi ( 132515 ) writes:

My car insurance company needs to be able to pull my DMV records, perhaps even periodically. They could retain *none* of that information and ask me to visit a web site periodically where the info gets entered so they can do the query (and then forget the information required to perform the query). Most customers wouldn't mind them holding that information; but if I'm *that* security minded and they make it clear to me that I'll have to hit their site once a month to maintain my insurance... well... There a
Use (Score:2)

by MrKaos ( 858439 ) writes:

/dev/null
Read "Translucent Databases" by Peter Wayner (Score:2)

by cornicefire ( 610241 ) writes:

It explains how to store personal information so it can be used correctly. http://wayner.org/node/46 [wayner.org]
Collecting Personally Identifiable Information (Score:3)

by Rozzin ( 9910 ) writes: on Tuesday September 11, 2012 @07:46AM (#41297979) Homepage

On passwords, I liked Jeff Atwood's article, `You're Probably Storing Passwords Incorrectly' [codinghorror.com].
For Personally Identifiable Information (PII) [wikipedia.org], I liked Brian Danger Graham's article, `What's in a name database?' [blogspot.com].

Policies, Procedures, Standards, Trust all Useless (Score:2)

by anorlunda ( 311253 ) writes:

If your company goes bankrupt, or is sold to another, all its assets become the property of someone else. That someone cannot be constrained to respect anything you have promised. You may not even have the opportunity to wipe disks or change passwords.
For example, a hospital failed to pay the rent on a warehouse storing patient records. The landlord seized and sold those records as scrap. None of the hospital's patient privacy obligations transfer to the landlord, or to the scrap dealer.
Heed th
Keep it on the user's computer, not in the cloud (Score:1)

by jbrohan ( 1102957 ) writes:

Obviously not the solution for everybody. We write apps for Android Tablets (for old people actually). All the data like Name, email, pictures, and messages are stored in the Android tablet and kept on the Cloud only until they are downloaded. They are encrypted, even the pictures, while waiting on the Cloud database. In the registration part of the app the user does type in his email, but we do not keep it. How to contact the user? We put a record in a table which is checked periodically by an active us
Google Mobile Analytics (Score:2)

by monkeyhybrid ( 1677192 ) writes:

Although you state you're not looking for stack or infrastructure recommendations, I'd still recommend having a look at Google Mobile Analytics [google.com]. They have an SDK for Android and iOS that makes it very easy to integrate in your apps.
It Is a Matter of How to Encrypt (Score:2)

by trydk ( 930014 ) writes:

I think everybody would agree that the data should be encrypted, but often the problem with encryption is access to the data. If the server-side application stores the encryption key, this key could potentially be found (maybe through a vulnerability) and thus give access to the entire database.

Best practice is to encrypt each record with a unique key. This key could be generated by some unique identifiers per user like Visible User ID (maybe E-mail address) and Password and Hidden User ID (different from
Don't. (Score:2)

by BVis ( 267028 ) writes:

Analyze data on a nightly basis. Store the results. Scrub database after results are stored. The asshole MBA that your startup hires because it isn't making enough money then has nothing to turn around and sell for a quick buck.
If you have to store *anything at all*, hire the expert. Can't hire the expert? Your startup is inadequately funded.
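
The analyze, store, scrub cycle could look something like this in Python with SQLite. The table names, columns, and the `nightly_rollup` function are assumptions made up for the sketch, not anyone's production schema:

```python
import sqlite3


def nightly_rollup(conn: sqlite3.Connection, day: str) -> None:
    """Store per-action counts for `day`, then delete that day's raw rows."""
    with conn:  # one transaction: either both steps happen or neither does
        conn.execute(
            "INSERT INTO daily_totals (day, action, n) "
            "SELECT day, action, COUNT(*) FROM raw_events "
            "WHERE day = ? GROUP BY day, action",
            (day,),
        )
        conn.execute("DELETE FROM raw_events WHERE day = ?", (day,))


def make_db() -> sqlite3.Connection:
    # Hypothetical schema, in memory so the sketch is self-contained.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_events (user TEXT, day TEXT, action TEXT)")
    conn.execute("CREATE TABLE daily_totals (day TEXT, action TEXT, n INTEGER)")
    return conn
```

Note that a DELETE only removes rows from the live database; backups and old disk pages need their own handling.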
Some advice (Score:2)

by Minupla ( 62455 ) writes:

Disclaimer: I work in the field, but do not have nearly enough information on your particular situation, jurisdiction, etc to provide detailed recommendations. What follows is basic best practice stuff based on my jurisdiction and market sector.
* First, any sensitive information you are collecting, ask if you really REALLY REALLY need it. This stuff is toxic waste. Your first and best defense is not to store it if you don't need it.
* A hash of something like a SSN, Telephone number, etc is worthless in t
