
Whistleblower: NSA Is So Overwhelmed With Data, It's No Longer Effective

An anonymous reader cites a report by ZDNet's Zack Whittaker: William Binney, a former NSA official who spent more than three decades at the agency, said the US government's mass surveillance programs have become so engorged with data that they are no longer effective, losing vital intelligence in the fray. That, he said, can -- and has -- led to terrorist attacks succeeding. Binney said that an analyst today can run one simple query across the NSA's various databases, only to become immediately overloaded with information. With about four billion people -- around two-thirds of the world's population -- under the NSA and partner agencies' watchful eyes, according to his estimates, there is too much data being collected. Perhaps that's one of the reasons why the NSA wants to dump the phone records it gathered over the past 14 years.
  • by Tim the Gecko ( 745081 ) on Wednesday March 23, 2016 @02:06PM (#51762755)

    "Where is the Life we have lost in living?

    Where is the wisdom we have lost in knowledge?

    Where is the knowledge we have lost in information?"

  • by Anonymous Coward

    This is called infoxication, and it has been known for decades.

  • Search Tools (Score:2, Insightful)

    by Anonymous Coward

    Sounds to me like their search and filtering capabilities are the problem, not the amount of data available.

    • Re: (Score:2, Insightful)

      by Anonymous Coward

      False positives, false negatives.

      If you have a correlation that gives an impossibly good 1% false positive rate and 1% false negative rate, you can expect that 1% of the subjects you are looking for will be overlooked and 1% of those who you are not concerned with will match. So, let's apply that to the current nuisance.
      1% false negative: for every 100 people with hostile intent, 1 will slip through the net and either bomb something or be stopped by civilians.
      1% false positive: for every 100 people without

      • You have a number of assumptions, the worst of which is symmetric false reports (1% and 1%). The more likely scenario, which is also tunable (gets better with more data) is that it is asymmetric in nature, and thus the conclusion is not only inaccurate but terribly so.

        It would have been MUCH better for you to not assume anything, and give general impressions of how the numbers might work. And 1% is way too high. You're more likely to see numbers in the "per 100,000" range (.00001 vs .01000), which is 1000 t

        • by TheCarp ( 96830 )

          Of course there is a question of what do you count as a "false positive". Every time you make someone toss a box cutter in the trash where you wouldn't have before. Every time you arrest somebody where you would have let it slide before, every time one of those people wasn't a terrorist, that is a false positive. Or at the very least a side effect.

        • Bayes FTW
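
The base-rate arithmetic argued over in this sub-thread can be sketched in a few lines. This is only an illustration under stated assumptions: the four billion figure comes from the summary, while the count of 10,000 hostile people is a made-up number and the function name is hypothetical. It compares the symmetric 1%/1% rates from the parent comment with the "per 100,000" false positive rate suggested in the reply.

```python
def screening_outcome(population, hostile, false_pos_rate, false_neg_rate):
    """Count correct and incorrect flags for a screening test (illustrative only)."""
    innocent = population - hostile
    true_hits = hostile * (1 - false_neg_rate)   # hostile people correctly flagged
    missed = hostile * false_neg_rate            # hostile people who slip through
    false_hits = innocent * false_pos_rate       # innocent people wrongly flagged
    # Probability that a flagged person is actually hostile (Bayes' rule in count form).
    precision = true_hits / (true_hits + false_hits)
    return {
        "true_hits": true_hits,
        "missed": missed,
        "false_hits": false_hits,
        "precision": precision,
    }


POPULATION = 4_000_000_000  # "about four billion people" from the summary
HOSTILE = 10_000            # hypothetical figure, chosen only for illustration

# Symmetric rates from the parent comment: 1% false positives, 1% false negatives.
symmetric = screening_outcome(POPULATION, HOSTILE, 0.01, 0.01)
# Asymmetric rates from the reply: 1 false positive per 100,000, same miss rate.
asymmetric = screening_outcome(POPULATION, HOSTILE, 0.00001, 0.01)

for label, result in (("symmetric", symmetric), ("asymmetric", asymmetric)):
    print(f"{label}: {result['false_hits']:,.0f} innocent people flagged, "
          f"{result['missed']:,.0f} missed, "
          f"P(hostile | flagged) = {result['precision']:.4%}")
```

Under these assumptions, even the much lower false positive rate leaves most flagged people innocent, simply because innocent people outnumber hostile ones so heavily. Changing HOSTILE shifts the numbers but not that basic pattern.
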
    • The problem isn't merely the volume of data. After all, LHC produces terabytes of data with each run. The problem is one of volume and variety. Imagine tracking every phone call made in the US, out of the US and into the US. We're talking about everything from calls to working spouses to pick up bread on the way home to ordering of products to sex chat calls and thousands of other topics. Filtering and searching those calls and the metadata surrounding them would be a monumental job of incredible complexity

      • While it is difficult, the article addresses information 10 years out of date. With enough computational power and progress in algorithms this will likely become less and less of an issue. Also it is trivial to cut away data you may not be interested in once you actually have some focus rather than treating it as a black box pre-crime device. An example is the phone data - while it may be impractical to speech to text every call over the last 14 years and use it to determine future attacks, having a p
        • by paiute ( 550198 )

          having a pool of known terrorists

          If we know who they are, why do we need all that data too?

          • having a pool of known terrorists

            If we know who they are, why do we need all that data too?

            To find many more of them, any private or corporate or state supporters, and all associated people including family.

      • Well, I probably contributed in some small way when talking with my brother on the phone a few nights ago. We were talking about the Apple case, NSA surveillance, etc, and I mentioned how just by saying "allahu akbar!" we'd probably set off a flag and get our conversation flagged for automatic transcription and further analysis.

    • Re:Search Tools (Score:5, Insightful)

      by Aighearach ( 97333 ) on Wednesday March 23, 2016 @02:49PM (#51763191) Homepage

      Even with good search tools, signal to noise ratio is still important.

      Excess data with no correlation to the problems NSA is trying to solve (without getting into a debate over what those are) is simply noise.

      • Even with good search tools, signal to noise ratio is still important.

        The signal to noise ratio doesn't change when you merely use less data. The whole point of good search tools is to extract the signal from the data, and filter out the noise. If you believe that "less data, but better data" is the answer, then you should also believe that whatever algorithm you use to decide which data are "better" during the collection phase, can also be used to filter the existing data during the analysis. So collecting less data would not help.

        The NSA may be wasting resources by colle

        • Re:Search Tools (Score:4, Insightful)

          by Aighearach ( 97333 ) on Wednesday March 23, 2016 @03:37PM (#51763595) Homepage

          The signal to noise ratio doesn't change when you merely use less data.

          False. Your statement is not true by default; it requires all the data to be known to be of equal quality.

          Any time that some data is more strongly correlated than others, your noise is going to go down when you throw out the lower quality data.

          Don't wave your hands, think it through and make a logical, reasoned argument.

          • your noise is going to go down when you throw out the lower quality data.

            Except that if you have an algorithm for recognizing "lower quality data" then you can exclude that data from your search results. So it is not going to affect your results.

    • If only they had more money, they could solve the problem.

    • by rtb61 ( 674572 )

      The problem is errant and false data. No amount of search or filtering can clear up a poisoned database. Bad data creates false associations and relationships, which means bad results are generated by all searches. You cannot clean up bad data without it also taking out good data, and you cannot get good data without also getting flooded out by bad data. In fact the best defence against those deep total data acquisition systems is to simply generate false data and allow those systems to flood themselves with

  • by Anonymous Coward

    What if there was a way to mark the data in a stream, not storing it permanently but being able to refer to the markers during a specific period of relevance?

  • by Anonymous Coward

    ... they want a google database of people's data/chats/records and behavior they can use against them at any time for political purposes.

    Our brains are much worse at reality and thinking than thought.
    Science on reasoning: []

    The (mass surveillance) by the NSA and abuse by law enforcement is just more part and parcel of state suppression of dissent against corporate interests. They're worried that the more people are going to wake up and corporate centers like the US and

  • Let's give them the "Big Data!" and "Analytics!" spiel that all the marketing wanks are cramming down our throats. Sound bites and spending huge bucks on them is the solution!

  • Wait a minute... (Score:5, Insightful)

    by fuzzyfuzzyfungus ( 1223518 ) on Wednesday March 23, 2016 @02:16PM (#51762875) Journal
    Is this guy saying that the NSA used to be effective? I do remember them doing good work back when they emphasized playing defense, and they have probably assisted with some really juicy targeted attacks on specific people of interest (whether criminals or well-placed figures in governments we are interested in getting to know better); but has the Total Information Awareness/dragnet-all-the-data stuff ever shown the slightest evidence of providing useful data?
    • by Megol ( 3135005 )

      Who knows? Not us, it's classified. It may have helped much in whatever NSA is targeted to or have been totally useless. I don't think that question is that interesting though, the important question is what are the costs of the program. And I'm not talking money - trust, morality and freedoms are worth much more.

      • Re:Wait a minute... (Score:5, Interesting)

        by F.Ultra ( 1673484 ) on Wednesday March 23, 2016 @05:58PM (#51764879)
        I actually think that we do know, because if they (either the NSA or the CIA) had ever found anything, it would have been posted all over the media. To really win over the population and get even more funds, all they need is that one true case. That they haven't announced one tells me that they have none to show, and instead they play the "if only we could tell you what we know" card.
    • by gweihir ( 88907 )

      At least not against terrorism. These people would have turned up in the legal system if caught. AFAIK no actual terrorist act has ever been prevented by the NSA, and there would be at least a few that we know about if they were actually effective in that area. On the other hand, their shoddy targeting information for state-sponsored murder has probably created a few terrorists.

  • Total lie (Score:5, Interesting)

    by axewolf ( 4512747 ) on Wednesday March 23, 2016 @02:17PM (#51762891)
    What are we supposed to think from this? That we need to pour more money into mass surveillance to aid data analysis to keep us safe? This is an obvious example of the ongoing damage control. All of the recent stories concerning the NSA seem to be dancing around the main point: our government has been proven to steal information from all of us. They have been monitoring and recording all electronic communications for years. This isn't just a breach of trust. This is a complete annihilation of trust for anyone who has the ability to reason. Nothing anyone says who is or was involved with intelligence is credible. The conclusion that must be drawn to preserve freedom is that the government is a mortal enemy to the vast majority of people. This bitter idea needs to be made palatable to everyone. Only then can reform be enacted.
  • by zenlessyank ( 748553 ) on Wednesday March 23, 2016 @02:18PM (#51762895)
    They need moar datas. Everybody call up someone in a country we don't like and ask them how their cat is doing. Ask for tech support from China for the missing buttons on that shirt you just bought from Wal-Mart. Find out when the lights change at an intersection in some obscure town.
    • Step 1: Develop open-source AI that can carry on rudimentary conversations - occasionally peppering in some words like "bomb" or "ISIS" to trigger NSA searches.
      Step 2: Have people register multiple VoIP accounts to run the AIs on.
      Step 3. AIs call each other, have conversations, and hang up to call the next AI. (This step repeats ad infinitum.)
      Step 4. Sit back and watch as the NSA's servers burst from too much data.

  • DUH! (Score:5, Interesting)

    by Lumpy ( 12016 ) on Wednesday March 23, 2016 @02:18PM (#51762897) Homepage

    The NSA and FBI etc are trivial to thwart.. I did it to my ex NSA professor at college.

    I bet him a solid 4.0 in his class that I could get an encrypted message past him and he would not be able to detect it. He agreed.

    I sent him 10 files. One had a message that I encrypted; the other 9 had the contents of /dev/random encrypted into them, matching the bit length of the message, so all encryption blocks were 100% identical in size.

    I won and was told I cheated.... I asked him if Spies follow rules and get in trouble if they cheat....

    • Interesting. Applying this example in a completely different area, I've said that we might not recognize signals from alien life because they would be using alien encryption/compression/protocols that might be indistinguishable from random "data." If your professor couldn't tell which of the 10 human encoded files had real data and what it was, what's the chance of us telling that some signal is actually an alien's video file in an alien codec using an alien compression/encryption algorithm?

      • by Lumpy ( 12016 )

        Absolutely, spread spectrum communication is very hard to detect, and we are currently only looking for intentional beacons on a specific frequency based on a scientific guess.

        Also, looking for signals not beamed into space at insane power levels is pretty much impossible for us. Even if a standard 100,000 watt omnidirectional TV transmitter was only 1 light year away, we would never be able to detect its signal, because the inverse square law will put the signal below the background noise well before it

      • I had thought it would be obvious to use deltas from true randomness to check whether something had info in it, but I don't think that's what they do.
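
A rough sketch of the "deltas from true randomness" idea, and of why it would not have helped the professor in the ten-file bet above. The assumptions are loud ones: the thread never says which cipher was used, so a one-time-pad XOR stands in for "encrypted"; the message text and the function names byte_entropy and xor_pad are hypothetical. Byte-level Shannon entropy separates plain text from random bytes easily, but a well-encrypted message scores the same as a decoy.

```python
import math
import os
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (8.0 is the maximum for byte data)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def xor_pad(message: bytes, pad: bytes) -> bytes:
    """XOR a message with an equally long random pad (stand-in cipher, an assumption)."""
    return bytes(m ^ p for m, p in zip(message, pad))


# Hypothetical message, repeated so the sample is large enough to measure.
plaintext = b"meet me at the usual place at noon " * 2000
pad = os.urandom(len(plaintext))
ciphertext = xor_pad(plaintext, pad)   # the one "real" file
decoy = os.urandom(len(plaintext))     # one of the nine decoy files

print(f"plaintext : {byte_entropy(plaintext):.3f} bits/byte")
print(f"ciphertext: {byte_entropy(ciphertext):.3f} bits/byte")
print(f"decoy     : {byte_entropy(decoy):.3f} bits/byte")
```

The plain text scores far below 8 bits per byte, while the ciphertext and the decoy both sit near the maximum. A frequency test like this can flag unencrypted or weakly encoded content, but it gives no way to pick the real file out of ten equally sized high-entropy blobs.
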

    • Re: (Score:2, Funny)

      by Anonymous Coward

      I bet him a solid 4.0 in his class... He agreed.

      I'll take 5 mod points in "Things that didn't happen." Alex.

  • He wanted plenty of spies! They were far cheaper than tons of warriors and all their food supplies and equipment.

    • by Gryle ( 933382 )
      Generals throughout the years have understood the effectiveness of good intelligence. George Washington is said to have remarked "I do not fear the enemy; I fear his spies" based on effectiveness of his own spies, the Culper Ring.
  • by Narcocide ( 102829 ) on Wednesday March 23, 2016 @02:22PM (#51762927) Homepage

    You insensitive clods! The NSA is having trouble keeping up with all your jibber-jabber!

    • If they figured out that being "liberal" or supporting Free Software is just political speech that they should ignore, that would help them pare it down a little. :)

      Tracking the Linux Journal readers alone probably costs them a lot of storage and search noise.

    • I've been trying to be considerate to the NSA by just posting jibber and leaving out the jabber. Won't someone think of the NSA?

  • I heard the very same news on the radio today; except it was in France and was about French intel service. Either it's a coincidence, or it's yet another press release begging for more power for intel services, around the world.
    • They can't exactly delete it before they get in trouble for having it unless they get the others who have it to delete it also. ;) Just my one conspiracy angle on this.

  • Unless you are living in a cave, you should have noticed the never-ending AI advertisement from IBM: Hi, my name is Watson!

    The reality is that it does not take Binney to say that having too much information is counterproductive. Thus be assured that military versions of AI are continuously poring over and monitoring the dossier files, currently maintained as relationship databases.

    You can be assured that there is an automated never-ending surveillance and the code, the AI, the algorithms will get better o

    • Not even AIs are going to be able to overcome the nature of false positives. Even if you get them to the same discriminatory level a human filter can have, and presuming they're operating at millions of times the speed of any human filter, you still have to deal with the fact that the intrinsic nature of the data itself makes false positives inevitable.

    • Newsflash, you don't have to live in a cave to not watch TV advertising. Wow, talk about "living in a cave." You think everybody does things the same as you!

      5% of American households don't even have a TV; up from 1% 10 years ago. Of those who have a TV, many only watch public television, or only watch during 1 sport season.

      Many more use time-shifting devices that automatically skip ads.

      I've certainly never seen the ad. And I don't live in a cave; only rich people can afford properties with caves. Good luck

  • If an analyst is overwhelmed with data by querying any single person's name, I imagine it won't be difficult to charge anybody with anything at all just because "DATA". Talk about abuse of power and authoritarian regimes - pick a person, pick a crime, pick circumstantial evidence from the big bad pool of "content", as Snowden puts it, and there you have it: reasonable suspicion à la carte.
  • by xanthines-R-yummy ( 635710 ) on Wednesday March 23, 2016 @02:30PM (#51763017) Homepage Journal

    How does he know that the NSA hasn't hired more informaticists in the past 10 years? If I read TFA correctly, he's been out for over a decade. I kind of doubt he's privy to top secret (or higher) information like that, although civilians are granted security clearances too sometimes.

    I'm not saying he's wrong, I'm just not clear on HOW he knows what he's saying is accurate. Just so you know, I'm no fan of the Patriot Act or the NSA's "hoovering" of data, meta or otherwise.

    • to be fair, the NSA has never been particularly effective at stopping terrorist attacks, so the conclusion is hardly surprising.....
    • He was in for 30 years prior. I'm sure he made some friends and still has contact with a few of them on the inside. They want to get the message out, they tell him, he tells us.

    • It's not a secret the NSA has been hiring more informaticists. He's saying you're not going to fix this with more informaticists, and while he's on the outside, the little data he's gotten since (including from unofficial contacts, of which there are a lot) confirms his opinion and his warnings. The article is a year old but his opinion hasn't changed in the last 15 years. Data gathering should be targeted, not trying to do a wide sweep. Targeted spying is much more effective, it's more legal and it's more moral. Th

  • by Schmorgluck ( 1293264 ) on Wednesday March 23, 2016 @02:30PM (#51763033)
    Seriously, it's nice that the NSA comes out as overwhelmed with data it can't exploit (although, as some have already pointed out, that's not particularly new - see 9/11 for an example too obvious to pass over), but every internal security agency in the West has been saying so for years (or rather, members of said organizations complained about it anonymously or through their unions). Intelligence requires data, but mass collection of data is of dubious help when the people in charge of examining it are already understaffed for exploiting classically collected data.
    • 9/11 was not an example of not being able to spot valuable information hidden among mountains of chaff. The CIA was actively investigating some of the conspirators prior to 9/11. The CIA managers in charge of that investigation weren't interested in sharing any of the glory from busting the conspirators and so when they entered the USA didn't inform the FBI. The CIA wasn't aware what the plan was but hoped that they'd be leaving the USA again soon so they could continue investigating them and eventually bus

      • My point was that the CIA had pointers (including some provided by allies), while the NSA was in the dark. I won't comment on what you say about what they did with that information. It doesn't match what I remember reading on the subject, but I'm not very sure of my sources.
    • They should give Harold Finch a call.
  • by Beerdood ( 1451859 ) on Wednesday March 23, 2016 @02:53PM (#51763225)
    Maybe the NSA could be convinced to do a special TV show appearance on Hoarders. Have some other agencies come together in an intervention to help 'em let go.

    DOJ: So NSA... we've got some recorded phone calls here from August 3rd, 2003 between a Darlene [redacted] and her grandson [redacted]
    NSA: Yes.. she's born in 1948, lives in Arlington, TX and her SSN is [redacted]. I remember when we first collected those calls.
    DOJ: Well then, we listened to this a few times, and it sounds like some fairly innocuous conversation. Nothing criminal whatsoever.
    NSA: Right
    DHS: So... do you think we can delete these calls then? I mean, there's no..
    NSA: NOOOOOO!! There could still be connections to terrorism in those calls... somehow! You never know what we might find on meta-data analysis
    DEA: Look... we've identified all the phone references with mentions of drugs, and made copies of those for investigations. We never use the rest of those recordings, and I'm the only one here that really uses those at all. Maybe we could just.. y'know.. delete...
    NSA: Don't touch that data! It's mine! I own it!
  • The government propaganda machine is in overdrive trying to fool the citizens...
  • Big data is great when doing statistical analysis, not so great for spear fishing.
  • by Tungbo ( 183321 ) on Wednesday March 23, 2016 @03:42PM (#51763665)
    I said this several years ago: all this metadata collection is easily defeated when the culprits use burner phones or SIM cards. That is exactly what they did in Brussels. Just because one has a lot of data doesn't mean one can make sense of it. Think of the Internet search engines before Google. You got TONS of useless hits. Google's results were better due to the massive amount of other people's usage patterns. Here the terror acts are so few that they offer little to help train any software. It is a very difficult problem that may not be solvable by Big Data.
    • All it takes is doing a few other logged activities at the same time you're carrying the powered-on burner phone and it's linked to you. That could be using any non-cash form of payment, driving home every day, or going to work/school.

  • Data is not information.
    Information is not knowledge.
    Knowledge is not wisdom.

  • between data and information.

  • by littlewink ( 996298 ) on Wednesday March 23, 2016 @06:29PM (#51765109)

    This data should be released to the world for all to see along with search tools to suit.

    Sociologists and citizens alike could plumb the depths of human behavior for years and finally, for once, get a clear view of political, economical and social alliances in all their (formerly) clandestine glory. Some changes might even result.

  • It doesn't matter now. They'll store it until they have the capability and need to mine it properly. Data never really goes away, it will all come back to bite us later.... and if they do dump something, it's because it's worthless and they probably have something juicier to replace it with. Then again, I'd be surprised if there isn't a backup somewhere. These things have a way of popping back up, long after you had forgotten about them.

    It also wouldn't surprise me if this is disinformation designed to put eve

  • The NSA is looking for needles and all they did is employ warrantless wiretapping to increase the size of the hay stack making their own work more difficult. I have the impression that the NSA and many other three letter agencies run these useless programs solely for the sake of running the programs. It is a self-fulfilling activity solely for the reason of asking Congress for more power and more money, essentially wrestling away any control Congress should have. There is a fix for that: cut the budgets for
