
Snowden Used Software Scraper, Say NSA Officials

Posted by timothy
from the what-would-christopher-boyce-do? dept.
An anonymous reader writes with this excerpt from the New York Times: "Intelligence officials investigating how Edward J. Snowden gained access to a huge trove of the country's most highly classified documents say they have determined that he used inexpensive and widely available software to 'scrape' the National Security Agency's networks, and kept at it even after he was briefly challenged by agency officials. Using 'web crawler' software designed to search, index and back up a website, Mr. Snowden 'scraped data out of our systems' while he went about his day job, according to a senior intelligence official. 'We do not believe this was an individual sitting at a machine and downloading this much material in sequence,' the official said. The process, he added, was 'quite automated.'"

  • Re:Amused (Score:4, Informative)

    by drinkypoo (153816) <> on Sunday February 09, 2014 @10:01AM (#46202499) Homepage Journal

    Oddly, the government is complaining that people will be able to take the various facts he assembled and figure out what we're really up to. You know, the kind of thing they say can't be done with our metadata.

  • Re:Stunning. (Score:5, Informative)

    by Jane Q. Public (1010737) on Sunday February 09, 2014 @02:31PM (#46204337)

    "Slightly more powerful than wget, to me, is a wrapper around wget. Perl and Bash scripts are way beyond the average user. To politicians, scripts can be claimed to be 'voodoo' or 'saintly' depending on who writes them. The NSA's scripts are obviously saintly, while anybody else's are probably voodoo."

    Even funnier is the assertion that such "web crawling" would be easy to detect. As someone who has done remote automation and data scraping for a living, I can tell you that it doesn't look any different than any other web traffic.

    About the only way to detect it is traffic analysis: checking whether the same IP address is hitting nodes a lot, or hitting many nodes in a short period of time, especially if the requests are rapid-fire.

    But the latter is easy to get around. I won't say just how here, because even if it's not hard to figure out, it's still something of a trade secret.
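
For anyone wondering what that kind of traffic analysis might look like, here is a minimal sketch in Python. It assumes you already have a time-sorted access log of (timestamp, client IP, node) tuples. The function name `flag_suspicious_clients` and the thresholds are made up for illustration and are not taken from any real monitoring product. It flags an IP when, inside a sliding time window, it makes too many requests or touches too many distinct nodes.

```python
from collections import defaultdict, deque


def flag_suspicious_clients(requests, window_seconds=60.0,
                            max_requests=100, max_nodes=20):
    """Flag client IPs whose recent traffic looks automated.

    requests: iterable of (timestamp, client_ip, node) tuples,
              sorted by timestamp (seconds as float).
    Returns a dict mapping client_ip -> set of reasons it was flagged.
    The thresholds are arbitrary illustration values.
    """
    recent = defaultdict(deque)   # ip -> (timestamp, node) pairs still in the window
    flagged = defaultdict(set)

    for ts, ip, node in requests:
        window = recent[ip]
        window.append((ts, node))
        # Drop entries that have fallen out of the sliding window.
        while window and ts - window[0][0] > window_seconds:
            window.popleft()

        # Same IP hitting the servers a lot in a short time.
        if len(window) > max_requests:
            flagged[ip].add("high request rate")
        # Same IP hitting many different nodes in a short time.
        if len({n for _, n in window}) > max_nodes:
            flagged[ip].add("many distinct nodes")

    return dict(flagged)


if __name__ == "__main__":
    # Hypothetical log: one rapid-fire client and one slow, ordinary client.
    log = [(i * 0.1, "10.0.0.5", f"node{i % 30}") for i in range(200)]
    log += [(i * 30.0, "10.0.0.9", "node1") for i in range(10)]
    log.sort()
    for ip, reasons in flag_suspicious_clients(log).items():
        print(ip, sorted(reasons))
```

The per-IP deque keeps memory bounded by the window size, and both checks run as each request arrives, so the same loop could be pointed at a live log stream rather than a saved list.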
