The Secure Public Data Repository? 175

jducoeur writes "So Hailstorm has died an unlamented death. But the demand for the idea of an information repository isn't going to go away -- users demand convenience, and this would be convenient. So here's a timely question looking for wild speculation: how would a truly secure, public data repository work? How would your data be stored? Would it be centralized or distributed? How would you grant access to specific elements within it? What would the business case for running such an archive be? Maybe if we can come up with a good design now, we can head off the next inevitable bad one..."
This discussion has been archived. No new comments can be posted.

  • Ocean Store (Score:5, Informative)

    by nweaver ( 113078 ) on Saturday April 13, 2002 @02:37PM (#3335775) Homepage

    The Oceanstore project [berkeley.edu] at Berkeley is aiming to do just that: create a distributed storage model to provide a global, distributed, persistent storage resource.

    • Earth Encyclopaedia (Score:3, Insightful)

      by Caltheos ( 573406 )
      I'm not sure how I feel about having a public repository for private information, at least not until cryptography/system design has reached a level where hacking into the data becomes impossible without destruction of the data (i.e. quantum crypto). There are already a lot of "Online Harddrive Space" websites out there, and for users who don't care about who sees what's on there, that's fine.

      I think it would be in the earth's best interest to create a distributed but moderated and indexed galactic encyclopaedia where information from astrophysics, zoology, political structures, history, the whole shebang, was to be found in one place. I know google is close, but structure would be nice.

    • No need for new projects -- already good distributed filesystems that you can set up big servers with
      afs [openafs.org]? (or here [stacken.kth.se])
      coda [cmu.edu]?
      intermezzo [inter-mezzo.org]?

      CMU, for example, uses AFS campus-wide. Your login scripts and dotfiles and whatnot all reside in your home directory (on AFS) so preferences migrate with you.

      You can make things world-readable, and because AFS has a global namespace, anyone can see them. If I do research at MIT as well, I just need to grab a Kerberos ticket from their KDC and start using my files over there.

      Just plonk a server in place, put an array of 100GB drives in place, make things readable by whomever you want, and you're good to go.

      If you want a system designed with fancy automated caching that people can use without dicking around with Kerberos, freenet [freenetproject.org]'s a good choice. Of course, there's no guarantee that the data will stay around, but c'est la vie.
      • Re:Ocean Store (Score:2, Insightful)

        by willis ( 84779 )
        OceanStore is much more than what you suggest. It's self-routing/self-healing/self-caching/self-everything -- it's designed to make things as low maintenance as possible. There are processes to defend against compromise (a small but sig. number of corrupted/hacked hosts can't bring it down). There are oceanstore processes that look into the oceanstore and make optimization decisions. (introspection, I believe).


        Check it out -- AFS is good for corporations/etc, but Oceanstore is somewhat viable for _everything_.

    • All users should definitely check out Stanford's IBE Secure E-Mail system (link [stanford.edu]) - AKA "IdentiCrypt". This would be a great use of the kind of distributed security model some people are proposing.

      With this system, email can be encrypted using an easily obtainable public key (no need to exchange keys beforehand) - the string "your@email.address". You can encrypt email to people that have not yet set up a key, just by knowing their email address. To decrypt, they grab their key from a server. You can request your key [stanford.edu] from Stanford's key servers. These would one day be replaced by a publicly-trusted resource.

      An elliptic curve variant of the Diffie-Hellman encryption model is used. A third party is necessary for the system and the distributed storage solutions being proposed could make good use of this technology.

      Read a technical description here [stanford.edu] or download here [stanford.edu].

  • by kjz ( 26521 ) on Saturday April 13, 2002 @02:39PM (#3335783)
    Why does the repository need to be public? In an era of very powerful client machines, why must we have a centralized database to make this work? Systems like Napster and Gnutella have already demonstrated the ability of end-user machines to distribute data effectively (though not always efficiently.)

    I believe the safest route would be to avoid the publicly accessible, centralized data store and focus on what has worked so well for the Internet in the past: standard communications protocols. By leaving the data on individual systems, we minimize the risk of exposing vast quantities of personal information, as an attacker would need to go after millions of machines in turn. It's possible, but it wouldn't be easy.
    • Let me give you a simple answer: control. It's all about control.
    • by crimoid ( 27373 ) on Saturday April 13, 2002 @03:52PM (#3336035)
      Once mobile phones, computers, watches, toasters and everything else under the sun become net enabled, the "powerful client" gets thrown out the window. The need then becomes one of availability. Needing to keep many of these gadgets "in sync" with one another (and your personal information) becomes hard. The easiest solution is some form of central repository, hence the "need".

      Now one might argue that in the future (present?) broadband will be able to allow everyone to "serve" their own information from their home PC (aka.. home server) but the infrastructure to do this in some sort of secure, standardized, highly-available way is more than "wouldn't be easy".

      For 99% of the population I'd imagine that their personal info would be safer in the hands of trusted professionals rather than residing on grandma's 486. The question will eventually come down to which professional do you trust the most.
      • A lot of credit card fraud is done by Merchants. That is because to deal with a lot of Merchants, you are supplying your credit card number. This is the direction that many banks want to move away from.

        The banks would love it if they could have this information. However there is the possibility for data harvesting of information of their own.

        Separate financial organisations are sort of in there, but they are in the same position as banks and merchants. Just to have those companies there has to impose some sort of fee on the transaction going past. Then there is the potential for data harvesting.

        From the point of view of a lot of people at the moment, yes, there should be somewhere central that you can "trust", because many home users cannot keep their systems under control.

        However the next generation of children who are growing up right now, are growing up in a world where they are a little more knowledgeable about such things, as well as who they can trust with their information.

        Whatever centralised system we come up with now, which could have severe flaws over time due to any number of unforeseen circumstances, is just going to be ignored, not only by many of the next generation, but also by a significant number of people now.

    • Why does the repository need to be public? In an era of very powerful client machines, why must we have a centralized database

      Why do you assume that public implies centralized? The article author certainly didn't; that was actually one of the questions s/he was asking. If you look at systems like OceanStore [berkeley.edu] or SFS [fs.net], or even Microsoft's own Farsite [microsoft.com], you'll quickly realize that your assumption is false.

  • As per many other postings here on /., we're hoping to make oNumber.net [onumber.net] a user controlled central repository. You create your entry, you manage it, you control who gets to see what and you can delete your listing anytime. There are built in features such as the SPACECARD and Resume generator that make it useful on its own. People access your SPACECARD using the unique oNumber that identifies your entry.
    • This thing is frightening. Absolutely frightening.

      I never subscribed to Big Brother hysteria. But this is as close to it as it gets.

      -Martin

      P.S.: The mapping feature is lovely. This way, burglars know where you live when the indicators say you're away from home ...
      • Martin,

        That's why there is the guest list system. You can 'lock' any item (including any or all parts of your street address AND your current location) so that information is ONLY shown to guests, who would normally be family, close friends etc. We came up with the idea in 1992 (before the net actually!) and have put MUCH thought into the privacy aspects. It's not perfect, but we're adding (future) features that will also allow authenticated messaging (as ICQ was supposed to do, but doesn't seem that secure). Of course, there is nothing to stop someone from stealing our servers, but a) We don't allow you to store anything that personal, such as medical info or credit card details as yet, and b) I hate to say it, but most if not ALL of the info you can enter into our system can be obtained pretty easily through other methods, from simply following someone to accessing government records or yellow pages, i.e. putting little bits of information together to get the whole lot.

        Incidentally, we have some code that makes it hard to guess a person's password, and a few other tricks to deter hackers. That said, nothing is perfect. oNumber.net is voluntary and it's up to you!

        • by Anonymous Coward

          We came up with the idea in 1992 (before the net actually!)

          Um, dude, people I work with were playing with the Internet in the early 1970s, and http was first implemented/used in 1990. See http://www.www.org/History.html [www.org].

          What you meant to say was before "the dot-com bubble".

          • True. What I should have said was the world-wide web as discovered by the mainstream media and entrepreneurs. I am aware that it all started with Arpanet in the 1960s in fact. Brilliant invention.
    • I read your privacy policy, but some questions remained unanswered:

      -Who owns the information?
      -Are there any cases where a user must agree to release info?
      -How do you make money / pay for site+bandwidth? I saw no advertising (this may be the scariest part)
      -"The site www.onumber.net is running Microsoft-IIS/5.0 on Windows 2000." (uptime.Netcraft.com [netcraft.com]). ooooh. baaad. Sorry... Must... resist... hating... Microsoft...
      -but seriously, how protected are you? What firewalls/encryption do you use?
      -I didn't go to the main account thing, but do you use SSL?

      • a) YOU own the information.

        b) It's $29 to join - nothing else to pay. As we add more features (for users, not anyone else), the fee will rise, but we'll probably introduce a subscription system for those that join later.

        c) We don't plan to run on MS for ever. Be happy!

        d) When you join, and use credit card, we use Worldpay, who handle Amazon, so we have faith.

        e) As regards internal security, we have a few tricks, but I'm not in engineering (I'm the GUI guy) so I don't know, but will find out for another posting. One important point: we do NOT collect anything that sensitive. THE most sensitive items of information are Resume, Home street address, and date of birth.

    • Hey, I did some web contracting for oWonder a while back.. for their telephone-ICQ gateway, now extinct.

      It's a small web after all.
      • Who is you? The reason it is extinct is because AOL while initially enthusiastic with what we were doing (we still have their e-mails), had a change of heart and asked us (via a lawyer) to switch off. We respected their request, after all, it was their database - and in view of the popularity of our service, their loss. Hence, we're doing our own thing. What did you do for O'WONDER? Drop an e-mail to crew@owonder.com (include your name!) and we'll continue the dialog away from /.
  • by !splut ( 512711 ) <sput AT alum DOT rpi DOT edu> on Saturday April 13, 2002 @02:44PM (#3335801) Journal
    We already have a public data repository. Just encrypt all your important documents, post them to various usenet groups, and let Google permanently archive them.

      • We already have a public data repository. Just encrypt all your important documents, post them to various usenet groups, and let Google permanently archive them.

      Rather abusive of all those poor News Servers out there, isn't it?

      If you have access to a Web Server, you might be able to get the Wayback Machine [archive.org] to capture it for you.

      Or, if you can keep the web server up, just access it from your own Web Server. You wouldn't be screwed if your machine went down or became inaccessible as long as you could get to the Google Cache.

    • Lol! Nice Thinking.
    • As one man called Linus T. once said: "Only wimps use tape backup: real men just upload their important stuff on ftp, and let the rest of the world mirror it."
  • Data haven (Score:1, Insightful)

    by Jaiden ( 64072 )
    Cryptonomicon anyone? How about sealand? Seems this has been tried before. People like to hang on to their own data, but most aren't qualified to keep it secure (run a secure server, etc). The problem is that no one trusts any big organization to keep their data for them. Especially microsoft. Perhaps what we need is an open source distributed encrypted system. multiple mirrors on regular pc's all sharing the collective data set, and all encrypted.
  • ...users demand convenience, and this would be convenient.

    "Convenience" and "security" can't really be used when describing something such as this. How many people use their/their kids/their usernames as passwords? IMO, there is nothing secure about something like that...
    • "Convenience" and "security" can't really be used when describing something such as this.

      Something such as this can only be described if "Convenience" and "security" is really used (to use your words in a different order). To say this is impossible right off the bat is like saying "public key cryptography is impossible" 10 years ago.

      tstock
  • No Way (Score:3, Insightful)

    by rjamestaylor ( 117847 ) <rjamestaylor@gmail.com> on Saturday April 13, 2002 @02:47PM (#3335812) Journal
    I will not have a single repository storing my information -- all my accounts and what not -- unless that repository is my brain. Period.

    Opposition to Hailstorm isn't an anti-Microsoft thing. As a matter of fact, most businesses want to have in their own domain the information provided by their customers, without a middle man.

    So, people (like me) and businesses (like mine) don't WANT a single repository, thank you very much. Forget this issue.

    • Opposition to Hailstorm isn't an anti-Microsoft thing.

      Sure it is.

      It is not the DATA that people worry about (hell, any hospital of decent size has so much information on its patients. . . ) so much as the people who HAVE the data.

      Microsoft is not exactly always open with how they use or collect Data, nor are they above taking actions for the top dollar.

      THAT is the problem that people had with Microsoft running a data system like this, not to mention that with how MS drafts their policies, they would likely have been able to shove a $50 per use surcharge on ya at any point in time without notice, heh. ^_^
  • What do you mean by truly secure, anyway? If you're always going to access the data from one computer, you might as well store it on that computer. If you are going to access the data from a multitude of computers, then you run the risk of a trojan horse on a public computer stealing your data (and this includes your encryption key if you encrypt the data on the public store).

    -a
  • Why Public... (Score:4, Interesting)

    by Peridriga ( 308995 ) on Saturday April 13, 2002 @02:50PM (#3335825)
    We already have systems such as SourceForge [sourceforge.com] to handle programs and other CVS systems exist...

    My data... public?

    I don't think so... I'll buy another 100gig drive before sending it off over the net to a public storage facility..

    If I wanted secure off-site storage, I would turn to Sea Land [sealandgov.com]

    20 Miles from anywhere and it doesn't respect any court of law in the world... So that's what I call secure (Even from the DMCA).
    • For those curious... A photo of the country of Sea Land...

      http://www.sealandgov.org/images/sealand_sm.jpg
    • 20 Miles from anywhere and it doesn't respect any court of law in the world... So that's what I call secure (Even from the DMCA).

      Except that they're not responsible to you for what they do with your data. They can look at it, parse it, copy it, distribute it. You store your neato new plans for a next generation personal mobility device on their servers and suddenly you find a company called SLMovers that's beaten you to the market with exactly your product.

      Hey! You can't do that! Oh wait. No. You can do whatever you want.

      Sweat
      • Yes but, I would trust them before an entire distributed community... Also if I were to store such neato new plans on a distributed system then I have lost my exclusive rights to it by distributing it.... Simple as that... Left with no legal recourse again
        • Except that just because you have your plans on a distributed system, does not mean that you have distributed your plans in such a way as to lose exclusive rights. They could be on any number of systems, but if they are secured in such a way so that they are not available to everyone, then you haven't lost any rights.
      • Heard of encryption?
    • this is from their acceptable use policy on sea land
      Unacceptable publications include, but are not limited to:
      Material that is unlawful in the jurisdiction of the server. For instance, if a customer's machine is hosted on Sealand by HavenCo, content which is illegal in Sealand may not be published or housed on that server. Sealand's laws prohibit child pornography. Sealand currently has no regulations regarding copyright, patents, libel, restrictions on political speech, non-disclosure agreements, cryptography, restrictions on maintaining customer records, tax or mandatory licensing, DMCA, music sharing services, or other issues; child pornography is the only content explicitly prohibited. At the present time, child pornography is not precisely defined; HavenCo is obeying rules similar to those of the United States, specifically a prohibition on any depiction of those under 18 in a sexual context.

    • ...and it doesn't respect any court of law in the world.

      That sword can definitely cut 2 ways.
    • Very interesting. I had never heard of it.

      But like other posters are saying, courts and laws are also there to protect you.

      But let's assume the ISP is 100% trustworthy. There are still reasons to be worried:
      1) They have few non-technical means for taking action against hacking attacks (hobbyist, government, or otherwise).
      2) A small, independent island nation may not be immune to political and/or military takeover. This means a) they could be attacked, or b) big nations could get favors from them in exchange for continued protection.

      Secure, I don't think so. A safe haven for publishing things like copyrighted works--for now, probably.
  • Why ask Slashdot?

    Security experts devote years to hardening security to the point that it is usable, and safe enough that the cost of breaking it exceeds the value gained through such a breach. Why would you want to hear a bunch of uninformed nitwits such as Slashdotters blather on about what we think of perhaps the most important security environment that could be placed on the Internet?

    Consider also that if there were some informed response that could be written by a Slashdotter, there would already be hundreds of misinformed and poorly argued responses flooding the pipeline before the true gem of wisdom could be composed.

    And before everyone points out that security through obscurity is not the answer -- just think how obscure the well-informed post would be on Slashdot.
  • Hailstorm (Score:3, Insightful)

    by igrek ( 127205 ) on Saturday April 13, 2002 @02:51PM (#3335829)
    In fact, Hailstorm was designed well enough. It's not perfect, but that's not the point. The problem was not on the technical, but on the business side. How do you persuade online businesses to use third-party repository? That's the problem.
    • How do you persuade online businesses to use third-party repository?

      How were they persuaded to use banks? Oh, wait, that's where the users keep their money...

      Same thing here: why use crude means of tracking user's habits? Just get everything from a central location...
  • by Moonshadow ( 84117 ) on Saturday April 13, 2002 @02:52PM (#3335833)
    Well, there's some newfangled thing like that today. It's called the "Internet" or something like that. Supposedly, anyone can put anything they want on there! Imagine that!

    Seriously, though, the Net is a public data repository. Each node is as secure as its sysadmins, and information can be public or private. It's publicly accessible, and you can protect whatever you want to protect from the public.

    Best of all, it's a network, not a centralized, attackable, censorable entity.

    Wheel, re-invent, why?

  • by logicnazi ( 169418 ) <gerdesNO@SPAMinvariant.org> on Saturday April 13, 2002 @02:52PM (#3335834) Homepage
    Okay so what features do we desire that this centralized repository is going to provide us? Presumably it will allow us to specify the amount of data released to third parties, charge fixed amounts without releasing our credit card numbers, and be portable. All of these problems are easily addressed with existing technology.

    Specifying how much data is released could be done quite simply with something as easy as a browser plugin. A company would include some code in the webpage to cause a request of certain information that you could then accept or deny. Charging fixed amounts is easily done through schemes like paypal, or even better some sort of digital cash technology. For convenience this too could be implemented as a browser plugin (as it would have to in either case).

    The only point where a centralized personal information database has any possible advantage is in portability. Even here, though, the advantage is fleeting: always-on internet access for people's home PCs is coming so fast that before long simply connecting to your home computer and requesting (possibly with various security levels) your profile will be a viable solution. This is essentially what all of us who ssh to our computers to check our mail are doing.
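A minimal sketch of the accept/deny idea described above, assuming the profile lives on the user's own machine and each site names the fields it wants. The `Profile` class, the site name, and the field names are all hypothetical; nothing here reflects an actual browser plugin API.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """Personal data kept on the user's own machine (all values are made up)."""
    data: dict
    # Fields the user has agreed to release, keyed by requesting site.
    allowed: dict = field(default_factory=dict)

    def grant(self, site: str, *fields: str) -> None:
        """Record that `site` may read the named fields."""
        self.allowed.setdefault(site, set()).update(fields)

    def respond(self, site: str, requested: list) -> dict:
        """Return only the requested fields that the user approved for `site`."""
        approved = self.allowed.get(site, set())
        return {name: self.data[name]
                for name in requested
                if name in approved and name in self.data}


profile = Profile(data={"name": "A. User",
                        "email": "user@example.org",
                        "phone": "555-0100"})
profile.grant("shop.example", "name", "email")

# The site asks for name and phone; only the approved field comes back.
released = profile.respond("shop.example", ["name", "phone"])
print(released)
```

The point of the design is that the default is denial: a site that was never granted anything gets an empty answer, and the data never has to leave the user's machine in bulk.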
  • XNS by OneName (Score:4, Informative)

    by kindbud ( 90044 ) on Saturday April 13, 2002 @02:52PM (#3335836) Homepage
    Here's [onename.com] a model that is implemented and attempting to gain adopters. It supports:
    • User authentication and authorization across multiple trust domains
    • Automated exchange, management, and auditing of consumer information, based on permissions and in compliance with government regulations
    • Automated customer registration and updating
    • Automated management of public key infrastructure security solutions
    • Synchronization of permissions, entitlements, and other context-based user information
    They were fairly actively seeking clients during the Bubble Years, but understandably things are not rolling along so well these days. Anyone care to comment on what is available at their site? It seems to implement everything people say they want in a single-signon solution. That's probably why it hasn't been widely adopted, too much control is given to the owner of the information (that'd be YOU). :)
    • XNS is dead (Score:3, Informative)

      As part of the development of Genio [theoretic.com] which later turned into PingID [pingid.org] we looked at XNS extensively.

      However, their technology is deeply flawed, not just in an engineering sense but also a legal one: it is tied down by patents and IP disputes, and their system is essentially centralised.

      They also have almost nobody on board at all, you can get an XNS "agent" but not use it anywhere. The technology is ludicrously complicated, hidden behind masses of white papers that don't really tell you what to do in order to make an implementation.

  • We need an openly web accessible XML based repository of information where the DEPOSITOR of information is, and is held, responsible for its accuracy.

    Furthermore it can ONLY be entered with your knowledge and approval using a biometric key to access the information. No cheezy password scheme will do. Period. None. Fuggedaboudid Bub.

    The encryption/decryption of the data could be done using another biometric key. (Retinal pattern with fingerprints and DNA as backups. Use one to be sure, two to know or all three to be CERTAIN.)

    This way, the information is a shit-load harder to steal or forge. It also means that you KNOW what information's on there. You were present when it was recorded. And you know who has access because you are present when it's accessed.

    The rest is untrustworthy and therefore should be untrusted (trusted as far as the drive which stores it can be thrown.)

    That will take care of crap in your Experian or Equifax records.

    That will take care of bogus credit card transactions.

    That will take care of liens being slapped on people's houses because the previous owner took out a second mortgage and "neglected" to inform the buyer.
    • and why does it have to be XML? I think an SQL solution would be much more efficient, and how exactly are you going to encrypt all this biometric data? and if it's stolen what do you use for authentication?

      me thinks this is a troll or someone reading one too many slashdotter posts that read (XML r00olz cuz IT F1x3s 7h3 int3rnet!)
    • In terms of secure authentication, biometrics are only useful as an enhancement to other authentication means such as passwords and physical tokens (keys, smart cards etc). Retina and Iris scans are good, but not proven to be absolutely unique, and equipment is not cheap. DNA could be absolute (hmm what about twins??) but is easily spoofed. Think of collecting a few hairs from someone's head. Watch Gattaca. It might be a movie but it presents enough scenarios to bypass most forms of biometrics.

      Fingerprint scans, on the other hand, are a poor form of authentication. Fingerprint scans suffer from a very high false negative rate. Back when American Biometric existed and were making the BioMouse [activcard.com], they were talking about a high-security mode of 1 in 10000 unique fingerprints, and a more reasonable operating mode of 1 in 5000 or lower. What that is saying is that given 5000 random fingerprints (only 500 people!), on average about one fingerprint will authenticate to the system as a false positive for a specific user. This is a result of a person's fingerprint scan changing day to day due to the temperature, the humidity, the person's health, stress, heartbeat, etc. If the system was absolutely secure the user would rarely be able to authenticate.
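To put a number on the 1-in-5000 figure, here is a small worked calculation. It assumes each random fingerprint is an independent trial with the same false-accept rate, which is a simplification and not something the vendor figures above state.

```python
def chance_of_false_accept(false_accept_rate: float, attempts: int) -> float:
    """Probability that at least one of `attempts` independent random
    fingerprints is wrongly accepted for a given user."""
    return 1.0 - (1.0 - false_accept_rate) ** attempts


rate = 1 / 5000  # the "more reasonable operating mode" quoted above

# 10 fingers per person, so 500 people supply 5000 fingerprints.
for attempts in (10, 500, 5000):
    print(attempts, chance_of_false_accept(rate, attempts))
```

Under that assumption, trying all 5000 fingerprints gives roughly a 63% chance of at least one false accept, even though the expected number of false accepts is exactly one.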

      Biometrics are good for some forms of authentication. Biometrics are great for quick and easy authentication where other access control features will mitigate some of the risk, or where strong authentication is overkill. Think of a door lock to a house. A fingerprint scan would be a quick and easy way for the owner to unlock the door. A burglar isn't going to try to bypass the fingerprint scan; they will throw a rock through the back window. Similarly, for a private office fingerprints can be used, as other access control features such as a guard at the front gate will mitigate the risk of a couple hundred people walking up to a fingerprint scanner and trying to get in. When combined with a unique token such as a smart card, an attack against the biometric authenticator is harder, as the attacker needs to steal the token (which should be reported by the owner so that the token is disabled) or the attacker needs to spoof the token, which should be more effort than the gain of bypassing the authenticator.

      Banks would love to add iris and retina scans to their bank machines. However, the machines are expensive. More importantly, the general public is not cool with the idea of lights shining in their eyes to take pictures. This is over and above the privacy freaks who don't want to be tracked everywhere they go. Iris scans are the better of the two by far, as they don't involve any bright lights and can authenticate people from a few metres (yards) away. However, iris scanners are still a tough sell to the general public.

      Regardless of the type of biometrics used, it still needs to be combined with a password for truly secure authentication. By today's standards strong authentication combines both "something you have" and "something you know." Biometrics, secure tokens, swipe cards, and cryptographic keys are all something you have. A password is something you know. If you want the most secure authentication it will involve a password.
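A rough sketch of combining "something you have" with "something you know", using only the Python standard library. The token here is simulated by a shared secret and an HMAC challenge-response, and the record layout and function names are hypothetical; a real smart card or biometric reader would replace `token_response`.

```python
import hashlib
import hmac
import os


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a slow, salted hash so the stored value is not the password."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def token_response(token_secret: bytes, challenge: bytes) -> bytes:
    """What a hypothetical hardware token would compute for a challenge."""
    return hmac.new(token_secret, challenge, hashlib.sha256).digest()


def authenticate(record: dict, password: str,
                 challenge: bytes, response: bytes) -> bool:
    """Accept only if BOTH the password and the token response check out."""
    knows = hmac.compare_digest(
        hash_password(password, record["salt"]), record["pw_hash"])
    has = hmac.compare_digest(
        token_response(record["token_secret"], challenge), response)
    return knows and has


# Enrollment (all values are illustrative).
salt, secret = os.urandom(16), os.urandom(32)
record = {"salt": salt,
          "pw_hash": hash_password("correct horse", salt),
          "token_secret": secret}

challenge = os.urandom(16)
both_factors = authenticate(record, "correct horse", challenge,
                            token_response(secret, challenge))
password_only = authenticate(record, "correct horse", challenge, b"\x00" * 32)
print(both_factors, password_only)
```

Either factor alone fails: a stolen password without the token, or a stolen token without the password, is rejected.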

      The bottom line to all of this is that biometrics aren't the most secure form of authentication. Biometrics are very convenient. A lot of people would prefer to use biometrics, as passwords get written down and forgotten, and physical tokens get lost and stolen.
  • Must Read Cryptonomicon. (Neal Stephenson)
  • by Wonko42 ( 29194 ) <ryan+slashdot@noSpam.wonko.com> on Saturday April 13, 2002 @02:58PM (#3335860) Homepage
    Who demands convenience? I don't demand convenience. I *prefer* not having all my eggs in one basket. I like being able to choose which companies get to know which details about me. If I have a hard time keeping track of all my different passwords or user accounts, I'll write my passwords down and store them in a text file that's PGP-encrypted with a 4096-bit key and a passphrase that I know I'll never forget.

    I don't want to have to trust some company to store all my information for me. I also don't want to trust some open source project with that information. In fact, I *especially* don't want to trust an open source project with it. The only person I trust with my personal information is me.

    • ... most web users are exactly like you. They know how to encrypt files, and they don't mind typing in dozens of different passwords, or entering personal data over and over, or not being able to use sites that store it, and have no trouble identifying sites that store it....
    • this might seem like a stupid question but how do u encrypt something @ 4096 b? or in other words what program do u use?
    • Well, most users seem to disagree ( :-( ) given what's happened to PGP.

      Depressing.
  • I don't think this is the exact answer to the question, but I think it's related. This book [wayner.org] is just appearing. The FAQ [wayner.org] makes it clear that it focuses on locking up some of the data but leaving some in the open, hence the title Translucent Databases.

  • Keeping lots of data safe in a central place is easy enough. Just encrypt it and give the key to whatever portion you want to reveal to whomever you like. But why?
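A toy sketch of "encrypt it and give the key to whatever portion you want to reveal": each field gets its own key derived from the owner's master key, so handing over one field key reveals one field. The cipher below is a simple SHA-256 keystream XOR written only for illustration, with no integrity protection; it is a stand-in for a real authenticated cipher and should not be used on real data. All names and values are hypothetical.

```python
import hashlib
import hmac
import os


def field_key(master_key: bytes, field_name: str) -> bytes:
    """Derive an independent key for one field from the owner's master key."""
    return hmac.new(master_key, field_name.encode(), hashlib.sha256).digest()


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with SHA-256(key || nonce || counter) blocks.
    Encrypting and decrypting are the same operation."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hashlib.sha256(key + nonce + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


master = os.urandom(32)
records = {"health": b"allergic to penicillin",
           "finance": b"account 12345"}

# What the central store holds: a nonce and ciphertext per field.
store = {}
for name, plaintext in records.items():
    nonce = os.urandom(16)
    store[name] = (nonce, keystream_xor(field_key(master, name), nonce, plaintext))

# The doctor is handed only the "health" key, never the master key.
doctor_key = field_key(master, "health")
nonce, ciphertext = store["health"]
print(keystream_xor(doctor_key, nonce, ciphertext))
```

The doctor's key decrypts the health record but yields only noise when applied to the finance record, and the store operator, holding no keys, sees ciphertext only.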

    Say you want to keep your health info there so that your doctor can access it. You could maintain the data online and then give your doctor permission to access it. Fine, but if you can give permission, then you can just as easily supply the data yourself, perhaps on a little smart-card you carry around. There's no need for a centralized system.

    In fact, I can't think of any application for this that wouldn't be better served by me maintaining my own data.

    Something that would be useful is centralized authentication, and that's easy too, technically at least. Politically it would be very hard to get everybody to agree on a standard and on who would administer the system.

  • While I appreciate RAID, I've never been able to get very good performance from it. Maybe that's my fault, but ultimately my lack of ability is not the focus here.

    I've always gotten more from assembling JBODs so that I could dedicate one disk to one task, and therefore one I/O stream.

    This has the consequence of tuning things at an atomicity that I can understand.

    My point here is that there may be no one way to design this, there may be a number of components that are integrated, and used by the service on demand at the time that a user demands them.

    Certainly, LDAP is a very good infrastructure for access to naming and location of services, as well as authentication, and storage of things like keys and such.

    After that, I think that files should be files, so I'd have to integrate DAV into an apache server, and back the auth. into the LDAP.

    There are places where users might wish to store relational data, and that is a bit trickier. But allowing access to a database would certainly be required, hell it would probably serve as the backend to the LDAP service.

    I guess, if I were to implement something like this, here is what my goal would be initially.

    1) provide one-time registration and authentication for users - be a registration provider to many web sites and services.

    2) provide a place to store flat files, be a backup for your hard drive, sort of.

    Yes, a service like this would be a sinkhole for security attacks, but I think good initial engineering can provide good security.

    Ultimately, like I said, it's going to take a componentized approach. I think all the tools are there, just waiting for someone to implement.
  • But the demand for the idea of an information repository isn't going to go away -- users demand convenience, and this would be convenient.

    How 'bout a harddrive as an "information repository."

    No one is "demanding" centralized information repositories. WTH is an information repository anyway?

    The average Joe computer user doesn't need a centralized data area with version control and the rest of the buzz words. The few corporate needs are already fulfilled with things like CVS and ClearCase -- not to even mention groupware suites such as phpGroupWare.

    It's all buzzwords. Six months ago it was XML and Java this, CSS and JSP that. So today the buzz is dotNET and Hailstorm with their information repository, well, guess what? MS just found out that this particular buzzword is utterly useless and has dropped it.

    We would do better to just forget these words even existed instead of trying to breathe life into something that was never meant to live in the first place.
  • Hypocrisy (Score:2, Interesting)

    by devleopard ( 317515 )
    Secure servers require some type of resources to manage. Microsoft has more resources than most of us can comprehend. However, I still don't want my information stored with them. People don't like Microsoft because they don't like being controlled - and that's what MS does, attempt to control as much as possible in their own interests. So I don't care who has a repository - Microsoft, the US Government, the EFF - the bottom line is that the information is controlled by someone. I'm sure someone will chime in with a statement about some techie solution, like PKI - but that's not the point. You still don't control the information. If anyone in the Slashdot/OSS community advocates a central repository, they are advocating control, which violates every principle that the community stands for. I will take a Microsoft with no reins (directly, they're only screwing other companies, whose bottom lines I could give a damn about) over a central repository (where I have a *huge* potential for getting screwed, big-time) any day.
    • Why should data ever be sent to these repositories in the clear? If you're concerned about the security of a piece of data that you decide to store offsite, shouldn't you encrypt it first before storing it?

      The only reason this "centralized control" is even an issue is that the data being stored there is given to the controlling entity as cleartext instead of an encrypted hunk of data.

      Give the data over in encrypted form and the only people who can access it are the people you give explicit authorization to do so. And so the only issue is one of data availability, which would be the only parameter left under the control of the owner of the repository.

      So here are the requirements for a workable centralized data repository as I see it:

      1. Freeform storage of data indexed by a unique ID that's assigned to the data's owner and a name (consider the ID to be the directory in which the data is stored, and the name to be the filename of the data).
      2. Multi-key encryption of the data made easy by the client. So if you want to give some entity access to the data, you create a key for them and add it to the keyring associated with the data. If you want to remove someone's access, you remove their keyring entry, reencrypt the data (so that a different, randomly generated, session key is generated for the data and stored in the keyring), and retransmit it to the repository.
      3. No offsite storage solution is a substitute for having your own copy of your data. You're a moron if you rely on someone else for the integrity of your data. Bottom line: keep a copy for yourself. You'll have to anyway in order to use the scheme outlined above, in order to revoke keys.
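      The keyring scheme in points 1 and 2 can be sketched roughly as below. This is only an illustration under loud assumptions: `toy_cipher` is a hypothetical stand-in (a SHA-256 keystream XOR) that is NOT secure and exists only so the sketch runs with the standard library, and `StoredItem`, `reencrypt` and `read` are made-up names. A real client would use a vetted cipher and public-key wrapping for each reader.

```python
import hashlib
import secrets

def toy_cipher(key, nonce, data):
    """XOR data with a SHA-256 counter keystream. Stand-in only, NOT secure."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + nonce + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)

class StoredItem:
    """What the repository holds: ciphertext plus one wrapped session key per reader."""

    def __init__(self, plaintext, reader_keys):
        self.reencrypt(plaintext, reader_keys)

    def reencrypt(self, plaintext, reader_keys):
        # A fresh random session key is generated on every (re)encryption,
        # so a revoked reader's old keyring entry is useless for the new blob.
        session_key = secrets.token_bytes(32)
        self.nonce = secrets.token_bytes(16)
        self.ciphertext = toy_cipher(session_key, self.nonce, plaintext)
        self.keyring = {}
        for name, reader_key in reader_keys.items():
            wrap_nonce = secrets.token_bytes(16)
            wrapped = toy_cipher(reader_key, wrap_nonce, session_key)
            self.keyring[name] = (wrap_nonce, wrapped)

    def read(self, name, reader_key):
        # Unwrap the session key with the reader's own key, then decrypt.
        wrap_nonce, wrapped = self.keyring[name]
        session_key = toy_cipher(reader_key, wrap_nonce, wrapped)
        return toy_cipher(session_key, self.nonce, self.ciphertext)
```

      Revoking someone then amounts to calling `reencrypt` from your own local copy of the plaintext with that reader left out of `reader_keys`, which is why point 3 says you need to keep a copy yourself.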

      The most worrisome situation will be the one where you authorize someone to access a piece of data and that someone turns out to be a Bad Guy associated with the repository's owner. This is a bad deal because they can collaborate on making the data unavailable (and thus impossible to rewrite) after you decide to remove the Bad Guy from your keyring. But the genie is out of the bottle by that time anyway: you gave the Bad Guy a key, so he had the opportunity to get the cleartext data. Once that happens, it's all over anyway (but see below).

      No storage scheme, including self storage, can change the nature of information: easily copied and impossible to control once it gets out. If you don't want anyone getting at a piece of your data, don't give anyone a key to that piece. Simple as that.

      One other thing: if a piece of information gets out of your control, the easiest way by far to put the genie back in the bottle is to make that piece of information irrelevant. That's why cancelling your credit cards and getting new ones issued works for dealing with credit card fraud (but not for preventing it).

  • I'm not even sure what Hailstorm was supposed to do, but here's my guess at such a system.

    Whether the repository is distributed or central, all the repository should do is swallow and spit out data for authorized users of an account without looking at the format. The authentication at this level could be a password entered by the user, or stored in a device.

    All of the encryption would be done by the user in the form of a software program, hardware dongle, or whatever is most convenient to the user. The type of and strength of encryption would be up to the user.

    Then whenever some kind of service needs personal info, the user can plug in their dongle or log into their client program, see which information the service needs, and authorize its transmission over a secure connection.

    I guess this would work with credit card numbers, medical records, whatever. It would keep the user in the middle of every transaction.

    If you were looking for a place to store your VCD collection, this wouldn't be it.
  • by iabervon ( 1971 ) on Saturday April 13, 2002 @03:22PM (#3335944) Homepage Journal
    What we need is not for someone to run a public data store, because whoever runs it isn't going to be trusted by some people. What we need is a protocol for getting data from such a store with the identity information in email address form. Then the users can put their data on a machine they trust, either one provided by an ISP or something or one of their own.

    For example, web sites should be able to authenticate users with a client certificate that the client provides when creating the web site account. This client certificate can be essentially anything, so long as it is how the client wishes to be identified. Of course, the client will want to be able to use a different certificate later (if the first one expires), so what the client really is identified by is the certificate chain, which has to have the same name up as far as the self-signed root certificate, and have the same root certificate.

    With a scheme like this, users need only find a certificate authority (or create one), and have a way to "log in" with the CA in order to get a client certificate (probably one which expires rapidly).

    The server that acts as a CA can also act as a store for other data. Ideally, the browser would be able to fetch form entries from the CA automatically, in response to the user requesting it after logging in. So you could move to the "credit card number" field, hit the "fetch identity value" button, type "CCN" (or whatever you've called it), and the browser would do a HTTPS request with your client cert to get that value and fill in the field with it.

    For most people, the CA and data store may be AOL or something, but there's no reason that the CA couldn't be your own machine. While you're at it, you could set it up to recognize other certificates than your own and provide the information you want to make available to these people. If you have a suitable field available to the right set of people, this solves the instant messaging location problem.
  • by sphealey ( 2855 ) on Saturday April 13, 2002 @03:22PM (#3335945)
    Microsoft announced that they were deferring for the time being the idea of Hailstorm as a fully, explicitly Microsoft-controlled depository in direct competition with their customers. They did not say that Hailstorm was going away, merely that it would now be broken up into multiple repositories managed in partnership with their customers (e.g. large banks and e-commerce sites). Which is not to say that (a) the concept no longer exists (b) the aggregate total will not be under Microsoft's control (c) they might not revive the central repository idea in the future.

    sPh
  • This isn't that hard (Score:3, Interesting)

    by Apreche ( 239272 ) on Saturday April 13, 2002 @03:24PM (#3335949) Homepage Journal
    You have many separate databases with powerful encryption and a hardware firewall. Have a very short list of places that can get direct access. Those places will only be allowed access to the parts they need. Everyone else in the world goes to one of those places to get their stuff.

    So you have the central database. This database has different parts to it. One for financial info, one for government info, one for medical information, etc. In the center is a list of general information like name, address, age, phone #.

    Now you have a very small #, less than 20, of people who have direct access to this. Each of these places has access to different sets of information. So if one of them provides credit card verification they have access to only the parts of the financial database they need. Then places like ebay and amazon, and paypal go to them to verify credit cards.

    Another group would provide medical information. This group would give doctors' offices access to only medical records of their patients, etc.

    Now to make it extra secure everything is encrypted with the strongest encryption available. If someone wants to use less encryption or no encryption, tough. Everything on the drives in the central database is encrypted. Public key encryption is used for transmission of data to providers. New keys are made as often as is practical. Data is re-encrypted on their drives. It is then sent to the users, who can decrypt only the parts they need (if for some reason they are accidentally sent something they shouldn't see) and use the information.

    Of course all the standard security measures are taken such as putting the central database in a secure location. Firewalls. IT professionals working there 24/7. The works.

    This may not be the most efficient design. It may not be a very specific or detailed design. It may be a design that provides a small group of people with a lot of power. However, it is I believe the most secure design. Make a special law about trying to hack it too, that'll make it even more secure. The only problem I foresee is the constant need to up the encryption because of faster processors and decryption methods, and the constant need for end users to update their keys/certificates.

    I don't feel like deleting everything I just wrote, but I just improved my idea. End users create public and private key pairs. When they want to put their information in the central database they type their information into a very secure web form and off it goes, along with their attached public key. Now there is a central database of information that only the owner of that information can easily read. If I want amazon to get some of that information, my computer will download it in encrypted form, decrypt the information I want to tell amazon, encrypt that with amazon's key and send it to them. Excuse my language, but ph33r that. Especially if you gave me the ability to change my key whenever I want.

    Only problem, getting home users to make RSA 4096 bit key pairs, or whatever the newest one is. That's security for you. Keep your information on someone else's computer, that's already incredibly secure, but only you can read it. Not even the guy who built the system can see what's in it. Except of course for his own info.
  • Realistically, there already are several companies in the world which know almost everything about consumers, one of the primary ones being TRW, as a fairly unregulated, and ubiquitous source of data.

    Personally, what I would have liked out of Hailstorm is the ability for the consumer to manage their own information, and more importantly to know who is looking at their information. Now, I know that this is a huge task, but it beats trying to repair a credit report, or figure out if a sheriff's department in podunkville mistakenly put your SSN on their most wanted list.

    The company that I used to work for transferred people every couple of years, and decided to make LDAP maintainable by the employees themselves. This resulted in a directory system that was really useful most of the time. The alternative system, waiting for HR to actually update the information in the corporate directory was a nightmare.

    Just my .5 cents
    Brian

  • by Radical Rad ( 138892 ) on Saturday April 13, 2002 @03:25PM (#3335952) Homepage
    I demand a centralized repository of my personal information because:

    __ I want every aspect of my personal life to be analyzed.

    __ I believe that all security exploits have already been discovered.

    __ My business is not my own. I submit to my corporate overlords.

    __ It's the only way to prevent another September 11th.

    __ Letting Mozilla's form manager fill in on-line forms is too hard.

    __ I want to be resurrected as a robot after my death based on all my personal info and preferences.

    __ Fashion their record needles into bones for CowbotRAD.

    Vote [ Results | Polls ]
    Comments:0 | Votes:1
  • Well, first of all, "truly secure" is impossible. All we can do is approach secure and hope.

    It's difficult to tell what will be the attributes of any method that will exist, but it's not hard to give requirements. I'll use the word "spyee" to mean the person whose data is being stored.

    * First of all, it cannot be done without people's permission. Every single piece of info that is stored MUST be there with the spyee's knowledge and consent. If someone wants to store their sexual preference or medical records, etc. etc. let them, but don't require me to tell you my SSN / Credit Card info.

    * Second: It MUST be distributed. This is because it can work iff (if and only if) the spyee retains ownership and complete rights to his data. Nobody else can even think for a minute that they own it. Even if they store it. It's paramount that each spyee's info be broken up and different chunks stored on different computers. In this sense, it would work like The Eternity Service [cypherspace.org] (here's even more info [cam.ac.uk]) or (my favorite), Freenet [freenetproject.org].

    * Third, every piece of info must be stored encrypted. Let the user's browser have a session key. Let the user have a few keys. That way, the user can access his data (with the help of front-end programs) and he can have a stupid form filler, but the company or Skriptkidd1e can't use it.

    *This MUST be a subscription service. I believe that it would be far too expensive for advertising to be the source of driving revenue. The storer MUST NOT be able to sell the data, thus depriving him of that form of revenue as well.

    *The user can pay the same way as payment worked in ZKS [zeroknowledge.com] FREEDOM [freedom.net] - The user bought an activation number and used it to buy the service - but the end user name _cannot_ be traced to the person who bought it (Hence "zeroknowledge"). It was awesome!

    This can be accomplished quite easily, and built in to any UI so that working it requires minimal gray matter. I think that the best way would be to store it on freenet [freenetproject.org]. It takes care of all the above problems, but introduces one of its own: data expiration.
    Reply and tell me what you think, this topic is fascinating.

    • Re:Freenet (Score:2, Interesting)


      I think that the best way would be to store it on freenet. It takes care of all the above problems, but introduces one of its own: data expiration.

      You can force any Freenet data to remain persistent as long as you periodically access it. Of course, the data may reside *only* on your node, but it will be as available (to the public) as your node is.


      I think that expecting somebody else to make your data available *forever* is an unrealistic expectation, regardless of the technology or circumstances.


      Even if I pay an ISP for secure webhosting with backups and everything, the most I can legally require is that they'll *TRY* to not lose my data.

  • after nine months of intense effort the company [Microsoft] was unable to find any partner willing to commit itself to the program.

    Microsoft tried this and it didn't work because no one wanted it. Why is there an Ask Slashdot story asking people to come up with ideas for a product that has been unilaterally rejected?

    Here's my design idea: How would a truly secure public data repository store data? By not storing data! The whole point of a public data repository is to gather, track and sell marketing information. User convenience is a cover.

  • Each time this topic comes back we need to be reminded that any uniform centralized information system is the first thing any "internal security" service puts in place. Why do we need to make it easy for them?

    There is a very visible pattern here: play on users' fear of attack and desire for security to convince them that it would be convenient for a "reliable authority" to store their identity information, etc., and before you know it you have lost your privacy and your freedom.

    There are many ways of doing this. Hailstorm was one, but the government is also playing that game with social security numbers and identity cards.
  • For the life of me I can't see what's wrong with a glorified cookie in this case.

    Each user has a 'contact details' record, a 'financial details' record, and an 'identity' record on their machine, like a cookie, but digitally signed to say that it is actually theirs. When user visits a site, they get a digitally signed message saying "This is [X corp], we need your financial details to continue. We will destroy this info within 24 hours and will not pass it on. Certified by [Y regulatory body] YES OR NO".

    If a site wants identification (unified logons) the site gives the user a random string to encrypt to the site's public key to verify they are who they say they are.

    No more funny business with Big Evil Corporations knowing everything you do. No worries about people hacking the central repository and getting 10,000 credit card numbers overnight. No worries about people stealing your password, 'cause it's never transmitted - it's just used to encrypt the token to enter the site clientside.

    Can someone tell me where I've gone wrong?
    • Each user has a 'contact details' record, a 'financial details' record, and an 'identity' record on their machine, like a cookie, but digitally signed to say that it is actually theirs.

      This could work, however there needs to be a uniform standard for this stuff. One of the main problems at the moment is the decision and adoption of the standard, as well as how to upgrade the standard when there is a technical flaw that allows it to leak information.

      Then there is the overhead of continuously sending 4K of extra HTTP headers with every single request. It will get to the stage where your upstream transmission is more than what you are getting downstream (in the case of 304 "you already have the most recent" responses).

      When user visits a site, they get a digitally signed message saying "This is [X corp], we need your financial details to continue. We will destroy this info within 24 hours and will not pass it on. Certified by [Y regulatory body] YES OR NO".

      The problem is that the credit card details are still turned over to the merchant. This is a large problem, because a lot of credit card fraud is actually done at the Merchant Level and not at the actual consumer. I seem to recall a figure of approx 90% about 10 years ago, and it appears that it is not that far off the value since. Cases of someone bombarding a merchant with "calculated" credit card numbers, as well as database hacking, do not contribute much to the overall fraud level, because the incidents are not high enough. The reason why it is considered so bad is that they occur as a result of a "4th" party being involved who has nothing to do with the transaction (1, 2, and 3 being the consumer, merchant and credit card network supplied by a financial institution). Merchant level fraud is difficult to eliminate because the merchant will always be part of the transaction.

      If a site wants identification (unified logons) etc.

      There are two ways to approach this from a "centralised" level. To eliminate merchant level fraud (from the perspective of the bank), it would be necessary to set up a mechanism to stop the merchants from directly getting their hands on the credit card numbers (including the expiry date). The other side of this is that it will be a lot worse if a bank would actually retrieve the details of what it was that you were purchasing. This is because it would be possible for the bank to actually track the purchases that someone makes across several businesses. The desire then is for the bank to try and get as many merchants using their services, because the amount of tracking that they can perform would also be enormous. This means that the bank can then construct profiles on people, etc. It is worse at that level because of the amount of information that can flow past.

      No more funny business with Big Evil Corporations knowing everything you do. No worries about people hacking the central repository and getting 10,000 credit card numbers overnight. No worries about people stealing your password, 'cause it's never transmitted - it's just used to encrypt the token to enter the site clientside.

      The idea sounds good. However some thought needs to go into it. The idea is that only the financial institutions should be able to read your credit card details, the merchant should not. However having a constant fixed value is not good because it allows for the replay of the transaction (which is one of the aspects of Credit Card fraud that is the largest).

      What is needed is something that is more along the lines of "challenge/response". Also it means that the merchant can only process the transaction when there has been an "OK" back from the credit card interface to the local bank. The amount is used as part of the transmission (so it can be verified and not tampered with by the merchant).
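      A rough sketch of that kind of challenge/response follows. Everything here is an assumption for illustration: the `Bank` class, the `sign` helper, and the use of an HMAC over a secret shared between customer and bank (the posts above talk about encryption and signatures, not HMAC). The point it shows is that the merchant only relays a one-time tag bound to the amount, so it never sees a reusable card number and cannot replay or alter the charge.

```python
import hashlib
import hmac
import secrets

def sign(key, nonce, merchant, amount_cents):
    """Customer side: bind the bank's nonce, the merchant and the amount together."""
    message = f"{nonce}|{merchant}|{amount_cents}".encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()

class Bank:
    def __init__(self):
        self.account_keys = {}     # account -> secret shared with the customer
        self.open_challenges = {}  # nonce -> account, each usable once

    def enroll(self, account):
        key = secrets.token_bytes(32)
        self.account_keys[account] = key
        return key  # handed to the customer out of band (assumed)

    def challenge(self, account):
        nonce = secrets.token_hex(16)
        self.open_challenges[nonce] = account
        return nonce

    def approve(self, nonce, merchant, amount_cents, tag):
        # pop() makes the nonce single-use, which is what defeats replay.
        account = self.open_challenges.pop(nonce, None)
        if account is None:
            return False
        expected = sign(self.account_keys[account], nonce, merchant, amount_cents)
        return hmac.compare_digest(expected, tag)
```

      The merchant would forward `(nonce, merchant, amount_cents, tag)` to the bank and ship only after `approve` returns True; changing the amount or resubmitting the same tuple gets a refusal.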

      Now all we need is a secure method of pin entry, and you can use this system for debit cards as well as credit cards (or even pin verified credit cards). Acceptance of that is a long way off.

      • Thanks for your reply.

        I didn't originally see that the merchant having CC numbers was a problem, but if that could be eliminated (along with 90% of CC fraud), that'd be excellent. In the e-commerce systems I've written, it's always been a case of 'get the customer's CC details as soon as possible and run them through the merchant account'. Nothing to stop the management of the company, or any of the technicians, taking all of the numbers.

        Thinking about it, that problem can also be solved by digital signatures. If the bank has your public key tied to your account (presumably signed in person when the bank account is set up), the merchant can send the customer a receipt of the transaction with the account number and the total on it. This can then be digitally signed by the customer, so the bank can tell that they (and no one else) have authorized the transaction. Only then is the customer charged, the merchant informed, and the customer gets the product.

        Granted, there could still be instances of people's machines getting hacked, their private keys stolen and their passphrases logged, but it's much easier to get credit card numbers by other means at the moment.

        It's all technically possible, but as you say, the most difficult task would be getting the three parties (bank backend, merchant e-commerce platform & customer browser) to agree on a standard.

        I hope this will someday be realised, because the idea of central repositories controlled by companies or governments is just a bit silly. If any single human has access to the data (it's OK if it's committee access), that access will be abused.
    • Check out http://www.onumber.net. (I'm too tired to insert the html. Sorry!)
  • I don't even accept the premise here. Hailstorm failed because the concept sucks.

    Why would a central repository of my information be more convenient? I can understand if a company wants to keep a central repository of my software settings, customization preferences, interface options, and maybe even documents I create that I designate to be stored on the repository. That's about it. The only information they need about me for that is a user id, and a way to bill me.

    Why do they need to know my address, favorite color, aunt's middle name, bank roll, etc.? Frankly, when it comes to transactions (and separately, interactions with the government), I prefer the bureaucracy and inconvenience of having my information stored in different places. I don't want everything linked together. The bureaucratic tape has a purpose: to make sure my life and information can't be altered without due process.

    The more eyes and ears that must be consulted, the better. The VISA/ credit card system is about as far as I'm willing to go.
  • As far as I can see, users aren't demanding this. Systems like hailstorm are technology/business strategy push, not user pull.

    Users don't demand things like a single logon. They just use the same password for everything (given a choice). Now we may think it better if this is a centrally administered login (especially if we get to be the administrator), but users aren't asking for this. It just is not all that inconvenient, and the process is transparent to the user. I think if you ask, the idea of there being such a large data honeypot about them sitting on the internet is scary.

    That's not to say that tech push can't be successful. I'm old enough to remember having to go to a human teller to get your money out of the bank. ATMs were pure tech-push. However, it's rare.
    • Good point.

      As a semi-intelligent user, any hailstorm/passport/liberty alliance projects are not for me. It's too bad most people aren't aware of the security and privacy infringing implications or at least the potential for these things in systems like these.

      Everything I want as a user I've already got. You'll find all you need in terms of password and personal info management under the Tasks > Privacy & Security menu in Mozilla and Netscape 6. It's too bad the big bully on the block (MSIE) doesn't even try to offer these sorts of things, for fear of killing the need for a hailstorm/passport system, and the risk of losing all that precious user data.
  • Wait a minute... A secure and public data repository? I've gotta think about this one...
  • by vkg ( 158234 ) on Saturday April 13, 2002 @03:59PM (#3336063) Homepage
    Firstly, all standards must be open and unencumbered.

    Secondly, XML is the right way to do this for political not technical reasons. But still use XML.

    Thirdly, and very importantly, all information held in the system is (C) the user, licensed under strict contract to the Information Repository to use. This is a protection against somebody buying the system if it becomes successful and changing the terms of service.

    Fourthly, information has to be protected in three important ways:
    • Every piece of information about you has to be accessible without linking it to any other piece of information about you (i.e. no Unique ID) - more on the technical aspects of this later.
    • Every site/organization which wants access to your information must agree not to use it in conjunction with other public information to compile a profile of you.
    • You must be able to revoke any and all information at any point.


    Fifth, no unusual public key cryptography should be used in the system. SSH/SSL yes, PGP/GPG no - this is to protect from the government's ire. Symmetric key ciphers for protecting your own information (i.e. passwords) seem OK to me.

    Sixth, two different sites/organizations, both accessing the same data about you, should not be able to tell from that request that they are accessing information about the same person: i.e. if A asks for your DOB, and B asks for it, they should not both be accessing UID234234.DOB. One scheme for this is that "permissions" are given to different organizations, of the form:

    HASH (organization_pass_word + your_pass_word + your_unique_ID + index_of_data_you_wish_to_reveal + data_store_added_noise)

    This protects your identity and prevents cross-correlation of different databases.
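    As a rough sketch of that permission hash, assuming SHA-256 as the HASH and with each part length-prefixed so that concatenation is unambiguous (both choices, and the function and parameter names, are illustrative assumptions rather than part of the proposal):

```python
import hashlib

def permission_token(org_password, user_password, user_id, field_index, store_noise):
    """Derive the opaque handle one organization uses for one field of one user."""
    h = hashlib.sha256()
    for part in (org_password, user_password, user_id, str(field_index), store_noise):
        data = part.encode()
        # Length prefix keeps ("ab", "c") from colliding with ("a", "bc").
        h.update(len(data).to_bytes(4, "big"))
        h.update(data)
    return h.hexdigest()

# Hypothetical example: two organizations asking for the same field (index 3, say DOB)
token_a = permission_token("org-A-secret", "my-pass", "UID234234", 3, "store-noise")
token_b = permission_token("org-B-secret", "my-pass", "UID234234", 3, "store-noise")
```

    Here `token_a` and `token_b` differ even though both point at the same person's DOB, so the two organizations cannot match records just by comparing the handles they were given.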

    Seventh, the standard should work like email: standard infrastructure can provide a server, anybody can operate one, and you have control of your use of these systems. No single operator.

    Eighth, and most importantly, none of this is worth shit without a constitutionally guaranteed right to privacy. Without that, any scheme can be forced over time into revealing more about users than they wish to reveal, either by legal, economic, social or political means.

    Strong cryptography is nothing without strong laws, and strong laws are something without any cryptography at all. Support GeekPAC! (the Geek Political Action Committee [thelinuxshow.com])

    vkg.
    • "Thirdly, and very importantly, all information held in the system is (C) the user, licensed under strict contract to the Information Repository to use. This is a protection against somebody buying the system if it becomes successful and changing the terms of service."
      Too bad so few companies would ever agree to or word a TOS this way...
      "Eighth, and most importantly, none of this is worth shit without a constitutionally guaranteed right to privacy. Without that, any scheme can be forced over time into revealing more about users than they wish to reveal, either by legal, economic, social or political means."
      In Canada, new laws are slowly being introduced that are much more oriented in favour of the individual, and that have pretty strong implications for businesses operating here and collecting personal data. My lawyer's main area of interest is privacy law, and his firm has some good links on its site, including some papers he has published on the subject, at: http://www.aikins.com/practice/tekno.htm [aikins.com]
  • There are plenty of projects underway trying to solve this for GNU.Net already. A lot of them seem to be doing just fine.

    My ideal system would be where I keep my data locally, and if someone requests it (or if I want to send it to an individual, or just to the general network for accessibility) I would use encryption. Perhaps a separate key for each entity wanting to see your data?
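
    A rough sketch of the "separate key for each entity" idea, assuming one locally held master secret from which per-recipient keys are derived with HMAC-SHA256. The function name and the recipient labels are made up for the example.

```python
import hashlib
import hmac
import secrets


def recipient_key(master_secret: bytes, recipient: str) -> bytes:
    """Derive a 32-byte key that is specific to one recipient."""
    label = b"recipient:" + recipient.encode("utf-8")
    return hmac.new(master_secret, label, hashlib.sha256).digest()


master = secrets.token_bytes(32)  # stays on my own machine
bank_key = recipient_key(master, "bank.example")
clinic_key = recipient_key(master, "clinic.example")
```

    Each entity gets a different key, so handing one key out (or revoking it) says nothing about the others, and I only have to protect the single master secret.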

    I remember thinking up something similar a while back. It was a lot more vague, however. A sort of mass internet storage system.
  • What we need is an industry standard protocol that is accepted by major institutions, yet flexible enough to be used by smaller businesses and mom-and-pop websites. Then the larger institutions provide a service for their customers, creating transactions on the web.

    I imagine this would work something like how PayPal works with eBay. PayPal provides a service to their customers. To make a purchase on eBay, I can use PayPal's service as a trusted intermediary. PayPal takes care of all the little details so its customers (in this case, both myself and eBay) don't have to worry about getting cheated.

    This is nice, but I'd like to use my credit union or credit card company directly instead of having to go thru PayPal. This is possible now, but I've run into a few folks who actually prefer to use PayPal rather than a Visa number, so I imagine there are a few kinks to be worked out.

    The credit union (or whatever institution manages your account for you) can then decide how to provide security and convenience. Do I allocate money in a special fund first, or is my checking account accessed directly? Do I preapprove transactions, or do I log in and check them off manually before they can clear? How are PINs and passwords secured? These are all questions that the protocol must address, and allow the institution to configure.

    This is pretty similar to MS's new strategy. They are selling their Hailstorm package to other institutions so that those other institutions can provide the service. I like this a lot more as it gives me real choices about who I do business with.

    If there needs to be a central repository, it should be minimal. Like a trusted authority in the PGP protocol, it could just define who are the trusted institutions for the protocol, and a basic verification (public key?) for that institution. This trusted authority should be managed by an industry consortium of some sort (and not directly cost me any money).

    If other sorts of information are needed (medical records, consumer info, etc.), then that specific industry should work out their own protocol and how to manage it.

    Any online repository should be authorized by the consumer first. I should be able to enable or disable my online account with my credit union. Ditto with my medical records or consumer info. This should never be automatic with any service, and legally should probably require an explicit, written and signed document just for that purpose. That'll help keep the number of unwanted accounts down. (I can just see a lot of online consumer accounts being created automatically for your "convenience" as soon as you sign up for some minor service. Not good.)

    That's it. Something that's industry standard and managed by an institution I trust. I propose we call it "mtp (money transfer protocol)". *grin*

  • as long as there's an Internet connection to my servers. I can implement any level of connectivity and security I'd like using tools like iptables, ssh, and gpg/pgp. Sure, I've got to make sure that my stuff is accessible from wherever I need to be, and that I'm packing the right resources to utilize it at the access points, but other than that, why would I trust someone else to do something that (1) I can do for myself and (2) knowing that I'm looking out for my own self-interests, not relying on someone or something that doesn't take those interests to heart as much as I do.
  • I think the answers are quite obvious (unless I am missing something, of course):
    • To prevent central control you need to store it at the service provider of your choice. Who says that everybody's information must be stored at the same place?
    • In order to protect your privacy it must be encrypted. As bandwidth will matter you cannot, of course, put all your personal data into a single file and encrypt it; you need a more clever scheme that allows random access to structures/blocks.
    • Because it is encrypted and your provider is not able to decode it for, e.g., a web interface, you need a smart client that understands, manages and displays the data to you.
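
    To make the "random access to blocks" point concrete, here is a toy sketch in which every block is encrypted on its own, so the client can fetch and decrypt block 2 without touching the rest. The HMAC-SHA256 keystream is only a stand-in so the example runs on the standard library; the block size and function names are assumptions, and a real client would use a vetted authenticated cipher from a maintained crypto library instead.

```python
import hashlib
import hmac

BLOCK_SIZE = 32  # bytes per independently encrypted block (tiny, for illustration)


def _keystream(key: bytes, block_index: int, length: int) -> bytes:
    """Pseudo-random bytes that depend on the key and the block's position."""
    out = b""
    counter = 0
    while len(out) < length:
        msg = block_index.to_bytes(8, "big") + counter.to_bytes(8, "big")
        out += hmac.new(key, msg, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def encrypt_blocks(key: bytes, plaintext: bytes) -> list:
    """Split the plaintext into blocks and encrypt each one separately."""
    blocks = []
    for start in range(0, len(plaintext), BLOCK_SIZE):
        chunk = plaintext[start:start + BLOCK_SIZE]
        stream = _keystream(key, start // BLOCK_SIZE, len(chunk))
        blocks.append(bytes(a ^ b for a, b in zip(chunk, stream)))
    return blocks


def decrypt_block(key: bytes, block_index: int, ciphertext_block: bytes) -> bytes:
    """Decrypt one block using only the key and the block's index."""
    stream = _keystream(key, block_index, len(ciphertext_block))
    return bytes(a ^ b for a, b in zip(ciphertext_block, stream))
```

    The provider only ever stores the list of ciphertext blocks; the key and all decoding stay in the smart client. Note that this toy reuses the same keystream if a block is rewritten, which a real scheme would have to avoid with per-write nonces.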
  • I have accounts on dozens of web systems (if not hundreds), with slightly different user names and passwords (this one demands a number in my password, this one won't allow me to use a number, etc., etc.).

    I want a single way of proving who I am to all of these people. As an extra, I'd like to be able to have separate additional identities, but I can live without that if necessary.

    Oh, and being the leftist that I am, I'd rather have the government provide a central id system (like it does the passport and driving license system) than have a company do it. At least I know how the government is likely to fuck me, I hate to think what companies will think of to do with it.
  • It is interesting to note that at least in theory, this problem has been well studied. There is this concept of ``secret sharing'' and ``information dispersal'' in cryptography where any information can be broken down into k chunks. Out of k chunks it is enough to recover m chunks to reconstruct the original data. The caveat is this - anything less than m chunks would not reveal even a bit of information. The k and m can be chosen to be any arbitrary numbers (of course m = k )

    In effect what this provides is redundancy (you can reconstruct the original data even if some links and stores are down) and security (not even a bit of data can be reconstructed without compromising at least a particular number of stores). To make this practically possible we, as a community, should have servers running in geographically diverse locations (just like the root servers) with many different flavors of OSes (so one exploit does not cause all the servers to be compromised) with strong authentication protocols.
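
    For the curious, a small sketch of one well-known secret-sharing construction, Shamir's scheme, using the same k (chunks handed out) and m (chunks needed) as above. The prime, the function names and the integer encoding of the secret are choices made for this example; it is an illustration, not hardened code.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; the secret must be smaller than this


def split_secret(secret: int, k: int, m: int) -> list:
    """Return k shares (x, y) such that any m of them recover the secret."""
    if not 1 <= m <= k:
        raise ValueError("need 1 <= m <= k")
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    # Random polynomial of degree m - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(m - 1)]
    shares = []
    for x in range(1, k + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares: list) -> int:
    """Lagrange-interpolate the polynomial at x = 0 from the given shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


shares = split_secret(123456789, k=5, m=3)
```

    Any three of the five shares give back 123456789 via `recover_secret`, while two shares alone are consistent with every possible secret. In this scheme each share is as large as the secret itself; information dispersal gives up that secrecy guarantee in exchange for smaller chunks.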

    Just my 2 cents.

    -Dracken
    • Okay, I meant m less than or equals k. (Slash thought that my less than symbol is a html tag bracket) If you are interested you might want to check out This [nec.com] paper - which surprisingly is old (1988).

      -Dracken
  • Yep, definitely something like PingID [pingid.org], which I'm now helping out with. These guys are smart, and have some big names involved. They want to do it right: the protocols involved with the Digital ID system we're developing will be submitted to the W3C.

    Anyway, I got involved through my earlier work with Genio [theoretic.com], which was a complete open source system not just for personal data storage but also single-sign-on, a la Passport.

  • I certainly hope it's possible to have a central place where people can store personal data and control exactly who's allowed to access it. This is the only way we can have an electronic "I am who I say I am."

    And without some kind of reliable identity mechanism, Spam is gonna be a permanent problem. As long as email is based on informal mutual recognition, we don't have any really good spam filtering mechanism. You can ban it (hard to enforce, and there are free speech issues), filter it (and miss a lot of legitimate email in the process), shut down servers that tolerate or support it (which I find disturbingly Scientology/Jack Valenti), and various other things that mostly just create new problems.

    That leaves being very careful who gets your email address. Which makes it nearly impossible for people to find each other. I really hate not having an email white pages!

    The only real solution involves a system where you limit your correspondents to a list of verifiably real people. People can ask to correspond with you -- if they can prove they're somebody you want to talk to.

  • I was given a name, a driver's licence, an ID card, several bank accounts, email addresses, homepages, passwords and PGP key-rings and, to top it all, I should create a 'secure' storage with additional keys and data to protect? Finally I'm no longer needed; I can simply and silently die, because all my relevant data is already handled in a unified, standardized, automatic system - in my electronic 'persona' online.

    No.

    A human is something different than a 'person'. Of these great, important 'persons' we have all had enough - more than enough. The more important they are, the more wars or suppression or power-greedy games are on their account. Alexander "The Great", Bill Gates "The Billionaire" and Osama Bin Laden "Fighter of the Holy War".

    Time for fewer personalities and more humanness - according to my taste. Let's bake a cake, go for a walk with the children or joke with friends. How to store such things?

    Life cannot be stored - nor can I.
  • I think there are several different levels of personal data, which it makes sense to have different levels of security against.

    The lowest level of security would be unauthenticated attribution. i.e. someone quoting something I have written. You don't know if the quote is accurate, or even what the context is, so it would make as much sense for you to rely upon it as it would for me to encapsulate it in a gpg signature. One example would be a blog. While it is reasonable to assume that what you find in a blog is from the person attributed, it is rare indeed to find one gpg signed.

    Next up would be "for the record" personal data. This is data such as public keys, and personal data that I want publicly known. In this case the data should be stored in a manner that self-corrects. gpg signing is only part of the solution; distributed storage similar to RAID 5, spreading the data across many dispersed web servers such that removing one server does not remove any data, and removing up to a fifth or potentially more of the servers would not prevent accurate data reconstruction, could be appropriate.
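
    As a rough picture of the RAID 5 style idea, the sketch below splits a record into data chunks plus one XOR parity chunk, one chunk per server, and rebuilds any single lost chunk from the survivors. The function names and the zero-byte padding are assumptions for the example.

```python
from typing import List, Optional


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def make_stripe(data: bytes, n_data: int) -> List[bytes]:
    """Split data into n_data equal chunks and append one XOR parity chunk."""
    chunk_len = -(-len(data) // n_data)  # ceiling division
    padded = data.ljust(chunk_len * n_data, b"\0")
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(n_data)]
    parity = bytes(chunk_len)
    for chunk in chunks:
        parity = _xor(parity, chunk)
    return chunks + [parity]


def rebuild_missing(stripe: List[Optional[bytes]]) -> List[bytes]:
    """Restore at most one missing chunk (marked None) from the others."""
    missing = [i for i, chunk in enumerate(stripe) if chunk is None]
    if len(missing) > 1:
        raise ValueError("single parity can only repair one lost chunk")
    if not missing:
        return list(stripe)
    present = [chunk for chunk in stripe if chunk is not None]
    rebuilt = bytes(len(present[0]))
    for chunk in present:
        rebuilt = _xor(rebuilt, chunk)
    repaired = list(stripe)
    repaired[missing[0]] = rebuilt
    return repaired
```

    A single parity chunk survives the loss of exactly one server per stripe. Tolerating the loss of a fifth or more of the servers would need a stronger erasure code, Reed-Solomon being the usual example, which is beyond this sketch.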

    From here we move into data that we do not want generally available, but may want to make available to specific people or groups of people. Examples include a wife making a grocery list available to her husband, or my employer needing my home address, SSN, and bank account number (to ensure that I am insurable, collect taxes, and pay me by direct deposit/debit, respectively).

    Next up is data that I may want to maintain so that I can work with it as part of work, hobbies, or other things, that I do not think needs to be generally available, but would not be bothered if it were public knowledge. Raw un-filtered data, parts lists, etc.

    Then comes things like rough drafts of works I would like to publish, or incremental evaluations of results that are not complete. I don't know of an author around that wants to discover the second draft of their most recent book out on the internet. It could even cause them to be in violation of a publishing contract. Likewise research materials, general e-mail, personal diaries (not blogs) or journals. At this level you might find people questioning whether it is necessary to back up this data.

    The last level is for information that would be more expensive to be public than destroyed. Bank card PINs, Passwords, Private Keys, Love notes. At this level it may make sense to keep the specific data on a USB storage fob chained to your wrist, or secured by a program that maintains its encryption key on such a device.

    I am aware of some people who would maintain that all data that you do not want to be publicly available should be encrypted. For a lot of people maintaining an encrypting infrastructure is beyond them. You or I might think it trivial to set up an encrypted file storage area using gpg, rsa, or mandrake, but then I doubt that my dad would be able to do so.

    Worse, the best known examples of private/secure local storage are easily broken into. For example you can encrypt documents, outlook.pst folders, and the like, only to discover that for $19.99 you can break into any of these files. (Even less if you can find and compile the code to break into these files yourself.)

    Until real security is made easily usable, and businesses and people begin to understand that just because they want to know something does not mean that they should be given or be able to purchase that piece of information, I think we are going to ultimately see more companies desiring to archive, and make public or available for purchase, addresses for stars and embarrassing gaffes of politicians, and more people being fired for actions they unwittingly participated in before the rules saying that those actions are cause for termination were created.

    -Rusty
  • Most people here are talking about storing personal information on central business-run servers, central government servers, distributed servers, servers, servers, servers...

    What we really need is a personal storage device that is in charge of handling all your vital information and is carried around on your person. It would be universally accepted at hospitals, drug stores, government institutions, shopping malls, you name it.

    Here's what it would look like:

    The device would be paper thin and easily carried in a wallet or purse. It would have an adapter to allow you to update information on it from a PDA or personal computer.

    The information on the device would be divided up into a couple of different areas, some that are editable by you and some that aren't.

    - Medical information: known allergies, diseases, physical attributes that would be updateable by the individual and accessible to hospitals. Some of this information would be editable by you, some would be only editable by the hospitals. Copies of this information would be stored at your hospital and would be synched up anytime you visited. If you went to another hospital, the information would be immediately available.
    - Credit Card information: accessible to merchants. The card would have a touch pad screen to allow you to select method of payment, you'd swipe it at the POS and the sale would be complete. This information would be editable by the individual.
    - Identification: Some of this information would be editable by the individual, like address, phone number, email, etc. Government stored information, like driver's license number and social security number would not be editable and would be used by the government to verify your identity. Swipe the card at the airport and you are who you say you are.
    etc...

    Now, here's the cool part. The card could only be activated by the individual whose information is on it. When you first receive your card, your biometric information would be stored on it (nowhere else!), which means that unless you yourself are in possession of the card, none of the information on it would be available.

    This system requires no central repository for information. What it does require is a standard protocol for transferring data. No one agency would store all your information. Standard terminals everywhere would allow you to plug in and verify that you are the person you say you are. The division of information on the device would mean that only the information required by an institution would be available to them. Government bodies would not be able to access your hospital records unless you allowed them to. Merchants would not know your government information unless you specifically provided it to them. When shopping online, all you'd do is plug the card into your computer or PDA and make the transaction happen.
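
    The division of information could be modelled roughly like this: each field on the card lists which roles may read it and which may change it, and nothing is readable until the card is unlocked. The class names, role names and the boolean stand-in for the biometric check are all invented for this sketch.

```python
from dataclasses import dataclass


@dataclass
class CardField:
    value: str
    readers: frozenset  # roles allowed to read this field
    writers: frozenset  # roles allowed to change this field


class Card:
    def __init__(self):
        self._fields = {}
        self._unlocked = False

    def unlock(self, biometric_matches: bool) -> None:
        # Stand-in for the on-card biometric check; real matching is out of scope.
        self._unlocked = biometric_matches

    def add_field(self, name, value, readers, writers) -> None:
        self._fields[name] = CardField(value, frozenset(readers), frozenset(writers))

    def _require(self, name) -> CardField:
        if not self._unlocked:
            raise PermissionError("card is locked")
        return self._fields[name]

    def read(self, name, role) -> str:
        card_field = self._require(name)
        if role not in card_field.readers:
            raise PermissionError(f"{role} may not read {name}")
        return card_field.value

    def write(self, name, role, value) -> None:
        card_field = self._require(name)
        if role not in card_field.writers:
            raise PermissionError(f"{role} may not change {name}")
        card_field.value = value
```

    With an "allergies" field readable by the owner and hospitals only, a merchant terminal asking for it gets a `PermissionError`, and nobody gets anything while the card is locked.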

    Forget central databases. Put the information in the hands of the individuals themselves.
  • by Edmund Blackadder ( 559735 ) on Saturday April 13, 2002 @05:54PM (#3336508)
    I hate it when questionable statements are presented as undisputed facts:

    "But the demand for the idea of an information repository isn't going to go away -- users demand convenience, and this would be convenient."

    I can't see anybody other than advertising agencies or aspiring dictators demanding a central information repository.

    And yet the news story suggests that consumers are demanding it. I really really doubt that. Any customer convenience can be achieved if the customer data is stored on his/her computer and is completely under his/her control.

    This may be an interesting issue but is worded in a way that loads the question. Slashdot editors should be more careful.

  • so this is the new buzzword now? public data repositories? everyone is going to run around and find ways to do it without asking if they really need it? like the p2p frenzy. give me a break; when are people going to start solving real problems rather than just wanking. why not spend those research dollars on finding ways to improve the systems we know we need instead, rather than jumping bandwagon because everyone else is.

    why is it that the software world is so full of these obsessive notions that everyone has to use a certain technology, appropriate or not, for whatever they do in order to be cool. I know several examples of companies doing stupid products, just because they felt they had to do something that allows them to say they follow the latest silly trend in software.

    besides: we don't have public data repositories already? that is certainly news to me.

  • I think the only way this sort of thing would work, is through collaboration with the government. They already have most of our "private" information... and in the states they have pretty much obtained the right to confiscate/record any other info you haven't given them.

    Thing is, privacy is a fundamental human right, and most governments understand this. Most people running .com companies, however, have little to no education on the rights of we the people.

    I see this as a *huge* opportunity for the gvt. They could rent-out reliable, secure space to us, and in return, they would earn back the trust of their citizens... well until it gets hacked!!

  • ... a kernel module we could load in Linux that would allocate a certain percentage of CPU usage (determined by the Makefile) to distributed services. The distributed service would be the program that gives that share of your CPU, network connection, and/or disk space to a global p2p network, like Freenet.
  • Encryption is the problem.

    If there is a 'repository' then we need to be in control of the encryption we use.

    If the MS model is to use 56-bit then it's flawed. Hell, anything lower than 4096-bit isn't really all too safe.

    I'd just use PGP to create two keys with two different pass-phrases - put my secret keys on CD [floppy et. al.] - and then would I put the data 'out there'.

    It really doesn't matter who holds the data. The problem is what we use to protect the data. 128-bit isn't enough. RC5, DES [triple or not] and similar crappy ciphers are what make us afraid of a central system.

"How to make a million dollars: First, get a million dollars." -- Steve Martin
