Hugh Pickens writes "Information Week reports that the National Security Agency is taking a cloud computing approach in developing a new collaborative intelligence gathering system that will link disparate intelligence databases geographically distributed in data centers around the country. The system will house streaming data, unstructured text, large files, and other forms of intelligence data, and analysts will be able to add metadata and tags that, among other things, designate how securely information is to be handled and how widely it gets disseminated. For end users, the system will come with search, discovery, collaboration, correlation, and analysis tools. The intelligence agency is using the Hadoop file system, an implementation of Google's MapReduce parallel processing system, to make it easier to 'rapidly reconfigure data' and for Hadoop's ability to scale. The NSA's decision to use cloud computing technologies isn't about cutting costs or seeking innovation for innovation's sake; rather, cloud computing is seen as a way to enable new scenarios and unprecedented scalability. 'The object is to do things that were essentially impossible before,' says Randy Garrett, director of technology for NSA's integrated intelligence program."
When will people realise that having more data often makes it more difficult to find the needle in the haystack? I am all for utilising raw power in distributed networks to gain insights into patterns previously hidden, but given the sheer breadth of proposed data formats to be mined, this effort seems doomed to produce no tangible results at all, except the ongoing expenditure of tax dollars.
"Now how many crimes have you unknowingly broke today, citizen"
Well you've just committed one crime, you've shown dissent, citizen, by joking about Big Brother ;). So we need to add you to more of our lists. (After all, we need to know who to round up first when we next want to distract political opponents.)
You could say the same thing about search engines back in the mid-90s, before Google's PageRank. What's the point of indexing more and more information if it's impossible to find the relevant page?
Necessity is the mother of all inventions. The only regret here is that whatever they come up with probably won't see daylight until someone outside reimplements it.
I guess it really depends on the details. The two bottlenecks are certainly worrisome: 1) the need for analysts familiar with both the system, and domain experts, to classify the data; 2) that the data is made available to a wide range of users [wikipedia.org].
Listen, shut up about the "clouds" already. Just because you don't understand the architecture doesn't make it a "cloud". That word doesn't mean anything. I'm sick to death of hearing it already.
This would be really helpful for law enforcement, with their need to access data from other police jurisdictions. There are tonnes of cases that cross several jurisdictions, and the police have trouble integrating the investigations.
No, this just means the data will 'disappear' into the cloud when some citizen comes walking up with a FOIA request to see the data. "Hey, all that shit got lost someplace on the net, we ain't got it!!"
Since the average Slashdotter seems to think third-party server clusters, a.k.a. "The Cloud", are good and secure and useful for all us human beings, and that the intentions of said third parties are all well and good, I'm surprised that no one has suggested that the NSA should just use Google Docs or Facebook and get over the supreme silliness and unnecessary cost of private servers. Oo. Or what about Google Wave? I'm sure there's something in that for the NSA; it is the new cool thing after all.
Money well spent? (Score:2)
Re: (Score:2)
When will people realise that having more data often makes it more difficult to find the needle in the haystack?
The goal of government data collection isn't to find a needle in the haystack.
The goal is to turn more hay into needles.
Now how many crimes have you unknowingly broke today, citizen?
Re: (Score:2)
Well you've just committed one crime, you've shown dissent, citizen, by joking about Big Brother
"more data often makes it more difficult to find the needle in the haystack"
The more haystacks they look through the more needles they can find. It'll just cost a lot more to build such a big sys
Re: (Score:2)
Needles. Haystacks. How is it that in every government endeavor except intelligence agencies someone asks "Hey, exactly what is the cost per needle found in those haystacks?"
Why is there no commission that meets twice a year and announces to the public: we found 8 terrorists, killed 3. It cost 160 billion dollars. The commission should be composed of people the public knows and trusts. They can have their backgrounds examined by the agencies. They should give out as much information as possible w/o pu
Re: (Score:2)
Re: (Score:3, Insightful)
You could say the same thing about search engines back in the mid-90s, before Google's PageRank.
I couldn't agree more; however, the ramifications of inaccurate or misleading results are minimal - to the user if not Google's bottom line. What is at issue here is collating a resource from disparate sources, including vastly different formats, which would enable relevant agencies to better sort the wheat from the chaff. Having been part of a team aiming to standardise similar data sets to provide a search resource, I speak from experience when I say this is no simple undertaking. Slashdot has recently [slashdot.org] poste
Re: (Score:2)
Let's hope the trial is realistic enough to bring up potential problems before real people get pulled in because of an overreliance on technology...
Re: (Score:2)
"The system will house streaming data, unstructured text, large files, and other forms of intelligence data, and analysts will be able to add metadata and tags that, among other things, designate how securely information is to be handled and how widely it gets disseminated. For end users, the system will come with search, discovery, collaboration, correlation, and analysis tools."
More data is only bad if the signal to noise ratio doesn't also improve an equivalent amount. It doesn't matter if you've got 4
Centralized computing/storage (Score:2)
Do the "Cloud Vendors" sell big cloud stickers to put over the data center portion of the design diagrams to hide the details from management?
This sounds like centralized computing and storage on dedicated servers. They are not going to buy a slice of some public cloud computing infrastructure.
Distributed == cloud? (Score:5, Insightful)
The data you keep
The NSA
May get a peek.
Burma shave.
But seriously... Distributed data is now "the cloud"? Is my dirty laundry in "the cloud" because it is scattered in my bedroom?
Re: (Score:3, Funny)
Only if your bedroom is online. Or at a very high altitude.
Cloud this cloud that (Score:5, Insightful)
Re: (Score:1)
Who are you talking to? Hate to tell you, Mr Preacherman, but the congregation is thataway. We're the choir.
Re: (Score:1)
+1
And while we're at it, let's bury map-reduce, too.
Re: (Score:2)
Don't worry, if the Government wants to do it, then it's no longer hip. I expect the phrase "cloud computing" to go the way of "paradigm shift" and "mindshare" soon.
Too bad the NSA will piss away a few billion before they realize they aren't cool.
The Stasi, revisited . . . (Score:1, Informative)
. . . the Stasi (http://en.wikipedia.org/wiki/Stasi), the former East Germany's secret police, ended up collecting so much information on folks that they couldn't process it all.
From the Wikipedia article: "When informants were included, the Stasi had one spy per 66 citizens of East Germany. When part-time informer adults were included, the figures reach approximately one spy per 6.5 citizens."
Yo.
I guess the NSA thinks that they can do better than the Stasi with brute force computing power.
Re: (Score:2)
Brute force computing power has certainly advanced a lot since the Stasi. Look at what Google does with a large volume of cheap hardware.
Re: (Score:2)
Re: (Score:2)
I swear to God, there has to be some extension of Godwin's Law for every time someone brings up the Stasi. What is the ratio of NSA informers to the American population?
Exactly. Especially since the NSA is not (legally) allowed to analyze "signals" from US citizens, and the vast majority of their employees care about following that rule.
Analysed Use of Intelligent Models on a Cloud NSFW (Score:2)
"have-fun-securing-that" (Score:2)
Google "air gap".
Neat software (Score:2)
Re: (Score:2)
Future cloudy, try again? (Score:2)
Glad they're behind the eight ball on this one.
why not google docs? (Score:1)
Good point (Score:2)
Excellent point, besides, I thought the banks already had something like this called G.R.I.D. (Global Regulatory Information Database) [rdc.com]?
Hey You! (Score:1)
I store my data in a VM on the 99th U of the Rack
And I sit ~/ look at the windows
Imaging the world cash crops
Then in flies a guy who's all dressed up like a NSA Spook
And says, I've won five pounds if I have his kind of data packet
I said, Hey! You! Log off of my cloud
Hey! You! Log out of my cloud
Hey! You! Log off of my cloud
Don't login here two UIDs is a crowd
On my cloud, Baby
The VOIP is signaling
I say, "Hi it's me. Who is it there in the stream?"
A voice says, "Hi, hello, how are you"
Well I guess I'm doin' f