Ask Slashdot: Best Practices For Collecting and Storing User Information?

New submitter isaaccs writes "I'm a mobile developer at a startup. My experience is in building user-facing applications, but in this case, a component of an app I'm building involves observing and collecting certain pieces of user information and then storing them in a web service. This is for purposes of analysis and ultimately functionality, not persistence. This would include some obvious items like names and e-mail addresses, and some less obvious items involving user behavior. We aim to be completely transparent and honest about what it is we're collecting by way of our privacy disclosure. I'm an experienced developer, and I'm aware of a handful of considerations (e.g., the need to hash personal identifiers stored remotely), but I've seen quite a few startups caught with their pants down on security/privacy of what they've collected — and I'd like to avoid it to the degree reasonably possible given we can't afford to hire an expert on the topic. I'm seeking input from the community on best-practices for data collection and the remote storage of personal (not social security numbers, but names and birthdays) information. How would you like information collected about you to be stored? If you could write your own privacy policy, what would it contain? To be clear, I'm not requesting stack or infrastructural recommendations."
This discussion has been archived. No new comments can be posted.

  • Just don't do it (Score:5, Insightful)

    by sublayer ( 2465650 ) on Tuesday September 11, 2012 @12:09AM (#41296233)
    Best practice from my perspective: do not collect the data at all.
  • by puterguy ( 642044 ) on Tuesday September 11, 2012 @12:17AM (#41296277)

    If you really feel the need to collect personal data and you *truly* care about the privacy concerns and needs of your customers, then don't go burying such disclosures in a privacy statement that the average user is unlikely to ever see let alone read.

    If you truly care about privacy, then either require the user to *opt-in* to such sharing or prominently display the lack of such privacy on the initial splash screen.

    Burying the collection of personal data in the middle of some lawyerly gobbledygook privacy statement is like mortgage lenders burying key terms in the middle of hundreds of pages of documentation. Yeah, it's legally there, but no one is actually going to read or understand it.

  • by cheros ( 223479 ) on Tuesday September 11, 2012 @12:31AM (#41296377)

    If at all possible, stay away from personally identifiable data. If your aim is to use identity as an index, work out a way to translate an identity into an index or hash value (i.e. one-way). This is not going to be perfect (there will be about a million "John Smith"s out there), but if you have a consistent pair such as name and phone number, turn that into a hash and use it as the data index.

    That means you can still do correlations, but a leak will not result in exposure of personal data.

    However, first of all, look at what you're holding on personal data and simply assume you got hacked and it's "out there" - plan for that crisis first because there is one question you need to answer:

    If you cannot afford to pay for security advice, can you afford to pay for the inevitable consequences?
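A minimal sketch of the one-way index described in the comment above, in Python. It swaps in a keyed hash (HMAC-SHA256) for a plain hash, because names and phone numbers are guessable enough that an unkeyed hash can be reversed by brute force. The function names, the normalization rules, and the key handling are all assumptions made for illustration and are not from the thread.

```python
import hashlib
import hmac
import unicodedata


def normalize(value: str) -> str:
    """Canonicalize a text field so equivalent inputs map to the same index."""
    return unicodedata.normalize("NFKC", value).strip().casefold()


def normalize_phone(phone: str) -> str:
    """Keep only the digits of a phone number (assumed normalization rule)."""
    return "".join(ch for ch in phone if ch.isdigit())


def identity_index(name: str, phone: str, key: bytes) -> str:
    """Derive a one-way index from a (name, phone) pair.

    `key` is a secret that must be stored apart from the data set
    (hypothetical setup); without it, the index cannot be recomputed
    from guessed names and numbers.
    """
    # A separator byte keeps ("ab", "c") distinct from ("a", "bc").
    message = "\x1f".join([normalize(name), normalize_phone(phone)])
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).hexdigest()
```

The same pair always yields the same index, so correlations still work, but the stored value alone does not reveal the name or number. Whoever holds the key can still test guesses, so the key needs at least as much protection as the data itself.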

  • by johnnick ( 188363 ) on Tuesday September 11, 2012 @12:37AM (#41296411)

    The short requirements:

    1) Explain what you're collecting in real-time at the moment when you give me the option whether or not to permit you to collect it. Tell me what you will use it for, when you will delete it and the consequences if I don't give it to you. People don't read privacy disclosures. Give notice and ask permission at the moment of proposed collection. Make it opt-in, not opt-out.

    2) Only request the information required to perform the service I've requested. Use the information I provide only to provide the service I've requested. Only share the information I provide with third parties to the limited extent necessary to provide the services I've requested. Obtain contractual commitments from those third parties that cause them to protect my information and delete it as soon as they've done what's required to provide the service I've requested. Keep information only as long as necessary to provide the service I've requested and delete it after you've done what's required to provide the service I've requested.

    3) Protect my information. Encrypt in transit and at rest. Delete thoroughly and don't give in to the urge to collect and keep information just because it might be useful some time in the future. You can't lose what you don't have.

    You say the collection "... is for purposes of analysis and ultimately functionality, not persistence." That seems inconsistent with the collection of name and email address. I can't think of too many use cases where you're collecting my name and email address and don't plan to keep it (and use it for marketing or otherwise share it in some way). If you need to contact me or I need to create a user-id that is my email address, you don't need my name.

    Your privacy policy is your contract with your user. It is an operational document that must be consistent with your practices. The privacy policy should be consistent with your policies and procedures. If the information you collect, or the way you handle it changes, you must change your privacy policy.
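The "keep information only as long as necessary" requirement above could be sketched roughly as follows, in Python. The record layout, the purpose names, and the retention periods are hypothetical; a real policy would take them from the privacy policy itself.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Record:
    user_index: str        # opaque index, not a raw identifier
    purpose: str           # why the data was collected
    collected_at: datetime


# Hypothetical retention periods, one per declared purpose.
RETENTION = {
    "analysis": timedelta(days=30),
    "support": timedelta(days=90),
}


def purge_expired(records, now):
    """Return only the records that are still within their retention period.

    Records whose purpose has no declared retention period are dropped,
    so the default is to delete rather than to keep.
    """
    kept = []
    for record in records:
        limit = RETENTION.get(record.purpose)
        if limit is not None and now - record.collected_at <= limit:
            kept.append(record)
    return kept
```

This only filters an in-memory list. Deleting thoroughly would also have to reach backups, logs, and any copies held by third parties.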

  • by SomePgmr ( 2021234 ) on Tuesday September 11, 2012 @12:47AM (#41296453) Homepage

    Finally, if possible, associate all data collection events with time (timestamp) and location (gps).

    It started getting a little creepy there at the end, bud. ;)

  • by SomePgmr ( 2021234 ) on Tuesday September 11, 2012 @12:55AM (#41296491) Homepage

    I'd give him the benefit of the doubt, and assume this isn't the only place he's looking for best practices.

    Meanwhile, "I'm an experienced developer, I'm familiar with all the general rules for securing customer data, but I'd like to hear of any 'gotchas' that you know about"? That seems like a reasonable thing to ask.

    Again, assuming this isn't the one-and-only source. So instead of grabbing our pitchforks, maybe someone has some examples of what he asked about?

  • by hutsell ( 1228828 ) on Tuesday September 11, 2012 @12:56AM (#41296507) Homepage

    Best practice from my perspective: do not collect the data at all.

    Exactly: "Put the Database down now, and step away from the Internet."

    Sorry, but my interest in giving the benefit of the doubt to the question's possible sincerity was lost when reading the part about the unoriginal solution for ensuring honesty and transparency -- the solution being hidden in (the lawyer make-work terms of) "our privacy disclosure".

  • by VendettaMF ( 629699 ) on Tuesday September 11, 2012 @01:40AM (#41296699) Homepage

    If you can't afford the expert then you can't afford to collect such data. Move away from this project to something you have the ability to do.

  • by Okian Warrior ( 537106 ) on Tuesday September 11, 2012 @02:01AM (#41296751) Homepage Journal

    In the US, we have the National Electrical Code, which explains in clear detail how house wiring is constructed.

    Following the code is a legal requirement in many (most?) states, but from the point of view of an electrician it's a "book of best practices". Use this gauge wire for this current, staple the wire within 6" of the box, and so on. The code gets revised and added to over time as questions crop up, new technologies get added, and people get more experience.

    There's a reason for everything. For example, the light in a bathroom should be on a separate breaker from the outlet next to the sink. It makes sense in retrospect, but this is not something that is obvious beforehand.

    It's very detailed, but also very clear. Homeowners routinely understand the instructions and are able to make simple repairs and modifications to their home wiring which conform to the code.

    We throw a lot of "best practices" around here as if they were simple and obvious at the outset, but maybe they're not. Hash your passwords, salt the hash, sanitize the form inputs, don't keep CC info... lots of best practices which in hindsight make sense but which aren't necessarily obvious beforehand.

    Most web apps have common requirements for login, identity management, privacy, various forms of functionality, and so on.

    Should we have a "book of best practices"?
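One entry in such a book, "hash your passwords, salt the hash", might look roughly like this in Python. This is a sketch using the standard library's PBKDF2; the iteration count and the function names are assumptions, and a dedicated password-hashing library may be a better choice in practice.

```python
import hashlib
import hmac
import secrets

# Assumed work factor; tune it to the hardware and current guidance.
ITERATIONS = 600_000


def hash_password(password: str, *, iterations: int = ITERATIONS):
    """Return (salt, digest) for a password, using a fresh random salt."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, digest


def verify_password(
    password: str, salt: bytes, expected: bytes, *, iterations: int = ITERATIONS
) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return hmac.compare_digest(candidate, expected)
```

The salt is stored next to the digest. Its job is to make identical passwords produce different digests, so a precomputed table cannot be reused across accounts.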

  • by philip.paradis ( 2580427 ) on Tuesday September 11, 2012 @02:09AM (#41296783)

    Alternately, people could simply take responsibility for themselves and choose to avoid services which require agreement to miles of terms. Given your attitude on the topic, you probably haven't even bothered to read the terms of service for anything you're using right now. It seems you're trying to divert responsibility for yourself onto the backs of the service organizations you choose to deal with. Again, note the word "choose."

    You've also managed to miss the opportunity to discuss where data goes and how it's protected after it's submitted. Oddly enough, this is the essential question posed by the submitter, and regardless of what any given set of terms says, it is actually the most important piece, one that very few people think about at all. In other words, you can trust an organization to high heaven based on what they say they will or won't do with your data, but if their infrastructure is a gaping mess of channels by which your information could get compromised, all of a sudden those terms don't mean much. I applaud the submitter for asking the right questions, and remind you to think more about your responses in terms of real world data acquisition and retention mechanisms before posting again.
