
White House Website Limits Iraq-Related Crawling

oscarcar writes "Dan Gillmor is reporting on the White House website's use of its robots.txt file to keep search engines from crawling certain material. Many excluded items in the robots.txt file involve mentions of Iraq, possibly to prevent people from finding changes to past statements and information when archived elsewhere."
  • by rot26 ( 240034 ) * on Monday October 27, 2003 @05:27PM (#7322178) Homepage Journal
    Many excluded items in the robots.txt file involve mentions of Iraq, possibly to prevent people from finding changes to past statements and information when archived elsewhere.

    Maybe, but I would think they might also be looking for "shady" spiders that ignore robots.txt. I wouldn't be surprised if there are a few honeypot pages in there too.
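The honeypot idea in the comment above can be sketched roughly as follows: list a path in robots.txt that nothing links to, then flag any client that requests it anyway. This is only an illustration. The `/trap/` path, the function name, and the sample requests are all made up, and nothing here reflects what whitehouse.gov actually does.

```python
# Hypothetical honeypot check: a client that requests a path listed only
# in robots.txt has evidently ignored (or mined) that file.
HONEYPOT_PREFIXES = ("/trap/",)  # assumed path, disallowed in robots.txt


def find_offenders(requests):
    """Return the sorted client IPs that touched a honeypot path.

    `requests` is an iterable of (client_ip, path) pairs.
    """
    offenders = {ip for ip, path in requests
                 if path.startswith(HONEYPOT_PREFIXES)}
    return sorted(offenders)


# Made-up sample traffic using documentation-range IP addresses.
sample_requests = [
    ("192.0.2.10", "/index.html"),
    ("192.0.2.11", "/trap/secret.html"),
    ("192.0.2.10", "/news/"),
    ("192.0.2.12", "/trap/"),
]

if __name__ == "__main__":
    print(find_offenders(sample_requests))
```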
  • by mcc ( 14761 ) <amcclure@purdue.edu> on Monday October 27, 2003 @05:30PM (#7322217) Homepage
    Perhaps their goal is simply so that when people google or whatnot for information on the Bush Administration and Iraq, they will be likely to find the Bush Administration's current views on and actions in Iraq, rather than outdated material?

    Completely ignoring for the moment the fact that these views and actions are really somewhat embarrassing for the Bush administration, this really makes sense from a practical viewpoint. Few things are as annoying as searching for something news-ish and finding primarily material from two years ago. And after all, if they ONLY were interested in people forgetting the old materials, they could have just removed those materials from the site totally. (Though perhaps they were aware removing the materials completely would cause mirrors, which would be fully searchable, to spring up.)
  • Mmmm.. Robots.. (Score:1, Interesting)

    by Anonymous Coward on Monday October 27, 2003 @05:30PM (#7322225)
    Robots accounted for well over 10% of all web traffic at a Huge E-commerce company I worked at a few years ago...

    Those robots consumed Many millions in system capacity.

    Of course this is completely different as our freedom is at stake.
  • Seriously though... (Score:4, Interesting)

    by MyNameIsFred ( 543994 ) * on Monday October 27, 2003 @05:31PM (#7322232)
    ...possibly to prevent people from finding changes to past statements and information when archived elsewhere...

    While anything is possible in politics, is it possible that the web admin is trying to limit the amount of traffic on the site? Is it possible that his analysis of the weblogs shows a lot of traffic from robots looking for Iraq-related info?
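The kind of weblog analysis the comment above speculates about might look something like this minimal sketch. It assumes Combined Log Format lines and a crude, made-up list of substrings that suggest a robot User-Agent; the sample log entries are invented for illustration and are not real White House traffic.

```python
import re

# Pulls the request path and the User-Agent out of a Combined Log Format line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (\S+) [^"]*" \d{3} \S+ "[^"]*" "([^"]*)"'
)
# Assumed heuristic: substrings that hint at an automated client.
ROBOT_HINTS = ("bot", "crawler", "spider", "slurp", "archiver")


def robot_share(log_lines, keyword="iraq"):
    """Fraction of requests whose path contains `keyword` that came from robots."""
    total = robots = 0
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that are not in the expected format
        path, agent = match.group(1).lower(), match.group(2).lower()
        if keyword not in path:
            continue
        total += 1
        if any(hint in agent for hint in ROBOT_HINTS):
            robots += 1
    return robots / total if total else 0.0


# Invented sample entries (documentation-range IPs).
SAMPLE_LOG = [
    '192.0.2.1 - - [27/Oct/2003:17:27:00 -0500] "GET /infocus/iraq/ HTTP/1.0" 200 5120 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '192.0.2.2 - - [27/Oct/2003:17:28:00 -0500] "GET /infocus/iraq/ HTTP/1.1" 200 5120 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"',
    '192.0.2.3 - - [27/Oct/2003:17:29:00 -0500] "GET /easter/ HTTP/1.1" 200 2048 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
    '192.0.2.4 - - [27/Oct/2003:17:30:00 -0500] "GET /news/releases/2003/05/iraq/20030501-15.html HTTP/1.0" 200 9000 "-" "ia_archiver"',
]

if __name__ == "__main__":
    print(f"robot share of Iraq-related requests: {robot_share(SAMPLE_LOG):.0%}")
```

Matching on User-Agent substrings is only a rough heuristic, since a robot can send any User-Agent it likes.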

  • or even considering that previous statements might not match current statements means that the terrorists win. The WH Ministry of Truth works hard to ensure that the spin for the day gets out to the party faithful above the filters of "news" with their "facts" that don't jibe with the message we're trying to deliver.

    If you persist in contemplating a world where whatever statements that the WH puts out, no matter how they might seem to contradict previous statements, are not totally true and correct, then a relocation expert from Guantanamo will be by in a few minutes. Just step away from the computer.

  • Re:Oh please (Score:5, Interesting)

    by cgranade ( 702534 ) <cgranade@gma i l . c om> on Monday October 27, 2003 @05:35PM (#7322280) Homepage Journal
    This gets modded up as Insightful? I mean, the White House is routinely editing their transcripts, and if bots like Google and Wayback can go and find that no, Bush said that we found weapons, not a weapons program, then there goes Bush's latest FUD... *thud*. Just because it's a tinfoil hat worthy theory doesn't mean it isn't true... most aren't, but therein lies the issue: most.
  • related links (Score:5, Interesting)

    by js7a ( 579872 ) * <`gro.kivob' `ta' `semaj'> on Monday October 27, 2003 @05:35PM (#7322288) Homepage Journal
    A couple of web sites that (1) have in the past done a great job of catching these kinds of things, and (2) have mailing lists you can subscribe to:

    Here's a minor example of something those two sites didn't catch: Remember Iraq's so-called "mobile biological weapons factories" [fas.org]? A month after the story broke that they were for weather balloons [slashdot.org], the CIA moved their report's URL [informatio...house.info].

    An intriguing fact about this whitehouse.gov/*/iraq thing is that they do in fact cover some of the important statements [bway.net] which are apparently not duplicated in the press release, conference, and briefing directories. Perhaps there was a "unique urgency" to cover up some poor choices of words?

  • by sketerpot ( 454020 ) <sketerpot&gmail,com> on Monday October 27, 2003 @05:39PM (#7322350)
    Honeypot or not, look at robots.txt. It's creepy: just about every entry is an Iraq-related page, and there are a lot of entries. If they wanted to just have a few honeypots, that shouldn't involve that many entries, or so many with the common theme of Iraq.
  • Re:Queue somebody... (Score:3, Interesting)

    by macshune ( 628296 ) on Monday October 27, 2003 @05:43PM (#7322392) Journal
    I found the original code on usenet, modified it and left the original french comments in. Heh, originally they made the referer the cia to scare unsuspecting webmasters. silly french:) this could easily be made to cycle through the robots.txt file, but i don't have the time right now, i'm in lab:)

    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::UserAgent;
    use HTTP::Request;

    my $ua = LWP::UserAgent->new;
    # super browser ! (pose as a desktop browser)
    $ua->agent("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.1.4322)");

    my $req = HTTP::Request->new(GET => 'http://www.whitehouse.gov/pathtostuff');
    $req->header(
        'Accept'  => 'text/html',
        'Referer' => 'http://www.yahoo.com', # pour faire flipper le webmestre :-) (to freak out the webmaster)
    );

    my $res = $ua->request($req);

    if ($res->is_success) {
        # traitement resultat $res (process the result in $res here)
    }
    else {
        print "Erreur : " . $res->status_line . "\n";
    }
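The poster mentions that the script could be made to cycle through the robots.txt file. A minimal sketch of that step, written here in Python rather than Perl, might pull the `Disallow` paths out of the file and turn them into absolute URLs to request. The function name is hypothetical, the `User-agent: *` line is an assumption about the file's layout, and the two sample entries are the ones quoted further down in this thread.

```python
from urllib.parse import urljoin


def disallowed_urls(robots_txt, base_url):
    """Return absolute URLs for every non-empty Disallow entry in robots_txt."""
    urls = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        field, _, value = line.partition(":")
        value = value.strip()
        if field.strip().lower() == "disallow" and value:
            urls.append(urljoin(base_url, value))
    return urls


# Sample text only; the real file would be fetched from the site.
SAMPLE = """\
User-agent: *
Disallow: /climatechangefactsheet/iraq
Disallow: /climatechangefactsheet/text
"""

if __name__ == "__main__":
    for url in disallowed_urls(SAMPLE, "http://www.whitehouse.gov/"):
        print(url)
```

Each resulting URL could then be handed to a fetch routine like the one in the post above.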

  • by brian1442 ( 640731 ) on Monday October 27, 2003 @05:57PM (#7322555)
    It seems like every single directory has had the word "iraq" appended to the end. Do you think that this might have been a knee-jerk reaction by some admin who didn't really know what they were doing? I can't really imagine there are legitimate iraq dirs under easter and teeball directories.
  • by buckminster ( 170559 ) on Monday October 27, 2003 @05:58PM (#7322569) Homepage
    It appears that this robots.txt file was probably auto-generated. It looks like someone used a script to crawl the site's entire directory structure appending /iraq and /text to every directory. In the process they seem to have created a pretty complete map of the site's underlying directory structure -- not necessarily a good thing.

    Having said that, I'm not even sure that this robots.txt file would work the way it's supposed to. Seems like these iraq references should all have a trailing slash or a .html if they're actual pages.

    Someone clearly doesn't want Google caching Whitehouse content on Iraq. The question is why? And how come they're so lame about it?
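On the trailing-slash question raised above: under the original robots exclusion convention a `Disallow` value is matched as a path prefix, so an entry without a trailing slash still covers the directory and everything beneath it. The sketch below checks this with Python's standard `urllib.robotparser`. The `User-agent: *` line and the robot name `AnyBot` are assumptions for illustration; the `Disallow` entry is one quoted elsewhere in this thread.

```python
from urllib.robotparser import RobotFileParser

# Build a parser from in-memory rules instead of fetching the live file.
rules = RobotFileParser()
rules.parse([
    "User-agent: *",
    "Disallow: /climatechangefactsheet/iraq",
])


def allowed(path, agent="AnyBot"):
    """True if a robot honoring these rules may fetch the given path."""
    return rules.can_fetch(agent, "http://www.whitehouse.gov" + path)


if __name__ == "__main__":
    for path in (
        "/climatechangefactsheet/iraq",
        "/climatechangefactsheet/iraq/",
        "/climatechangefactsheet/iraq/index.html",
        "/climatechangefactsheet/",
    ):
        print(path, "allowed" if allowed(path) else "blocked")
```

With this parser, only the last path is reported as allowed. Individual crawlers may interpret edge cases differently, so this shows one standard implementation rather than how every robot behaves.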

  • climate change? (Score:2, Interesting)

    by PaulGrimshaw ( 605950 ) <mail@pPASCALaulg ... m minus language> on Monday October 27, 2003 @06:08PM (#7322680) Homepage
    Disallow: /climatechangefactsheet/iraq

    Disallow: /climatechangefactsheet/text

    Now why would they want to stop these being crawled?

    Paul.
  • by Anonymous Coward on Monday October 27, 2003 @06:16PM (#7322779)
    Last year the Washington Post ran a story on who would benefit from a war in Iraq. It mentioned Halliburton and the $50+ million in stock options Dick Cheney received from that company as an employee. It also mentioned the fact that Saddam had cancelled the oil contracts of several American companies. I tried to find it again to show a friend but the story had disappeared from both the Washington Post search routine and Google. (The WP could have had it removed from Google, I doubt Google itself had anything to do with the story's disappearance.)
  • Re:Funny (Score:5, Interesting)

    by jovlinger ( 55075 ) on Monday October 27, 2003 @06:24PM (#7322864) Homepage
    true. true. Apparently some poor fool [kuro5hin.org] made similar remarks on k5 a while back, and did indeed receive a personal visit from the SS. No charges filed, but 'tis a rude awakening indeed when your online words come and knock on your door.

  • by Anonymous Coward on Monday October 27, 2003 @06:48PM (#7323093)
    Not only does http://www.whitehouse.gov/infocus/iraq/ exist, it is also currently indexed by google. [google.com]

    I guess the googlebot doesn't visit the page, but knows of its existence from other pages??? Either that, or the googlebot is a bad boy that ignores robots.txt.
  • Re: and your ... (Score:5, Interesting)

    by Zeinfeld ( 263942 ) on Monday October 27, 2003 @06:52PM (#7323140) Homepage
    every major journalist worth his/her salt would be all over it within hours. so it wouldn't succeed in obscuring information.

    Where have you been living the past five years? Journalists don't criticize Bush.

    They still have not published the fact that he deserted from the national guard during Vietnam and they practically ignored his DUI conviction.

    The GOP has the media cowed with their constant 'liberal media' babble. The number of journalists who are prepared to hold Bush to account is tiny - Krugman, Conason, Ivins, Alterman. After that it's Al Franken, Jon Stewart and David Letterman.

    it would create an incredible backlash as soon as detected. what purpose would this serve?

    The chances that the mainstream media will pick this one up are very small. Just think how they would have reacted if it was Clinton!

  • Re:Funny (Score:1, Interesting)

    by Anonymous Coward on Monday October 27, 2003 @06:57PM (#7323188)
    No. We live in a republic.

    Actually, we don't. The fix is in, friend. Much like the Roman Empire, where most people did not realize the republic was dead until over a century after the fact, our republic died many years ago and most people don't know it. Something akin to professional wrestling (i.e. a good guy and a bad guy... both working for the same promoter) was put in its place.
  • Re:Already queued? (Score:3, Interesting)

    by Dave2 Wickham ( 600202 ) on Monday October 27, 2003 @07:02PM (#7323233) Journal
    Yeah, but it removes any pages it has stored when it finds itself disallowed from the page, IIRC.
  • by Zeinfeld ( 263942 ) on Monday October 27, 2003 @07:36PM (#7323547) Homepage
    The "iraq" entries were probably added by mistake

    Bullshit.

    The Iraq entries could only have got there if someone was told to go and stop stories appearing in the Google cache.

    The person who got the job appears to have done it in a pretty clumsy way, that is pretty much par for the course for this type of work. Nixon did not expect Gordon Liddy and his pals to get caught in a third rate burglary either.

    It looks to me like someone was told to block out the Iraq files and simply did a directory listing on the web server and then appended /iraq to everything.

    If you want to find out for sure file some FOIAs.
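The generation process guessed at in this thread (take a directory listing and append a fixed suffix to every entry) is easy to picture as a short script. The sketch below is purely hypothetical: the function name is invented, the `User-agent: *` header is an assumption, and the directory names are just ones mentioned by other posters. It is not a reconstruction of whatever tool was actually used.

```python
def build_disallow_lines(directories, suffixes=("iraq", "text")):
    """Return robots.txt lines that disallow <directory>/<suffix> for each pair."""
    lines = ["User-agent: *"]  # assumed header; applies the rules to all robots
    for directory in directories:
        base = directory.strip("/")
        prefix = f"/{base}" if base else ""
        for suffix in suffixes:
            lines.append(f"Disallow: {prefix}/{suffix}")
    return lines


if __name__ == "__main__":
    # Directory names taken from comments in this thread.
    listing = ["easter", "teeball", "climatechangefactsheet"]
    print("\n".join(build_disallow_lines(listing)))
```

A blanket approach like this would explain entries for directories that have no Iraq content at all, and it also exposes the directory listing it was built from.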

  • Re: and your ... (Score:4, Interesting)

    by Zeinfeld ( 263942 ) on Monday October 27, 2003 @08:06PM (#7323792) Homepage
    The crux of this argument is that Bush missed some drills in 1972 while he was working on a political campaign in Alabama.

    The crux of the matter is that he refused to have his pilot's medical just after the Pentagon added a check for illegal drug use.

    You can try to spin this whichever way that Karl Rove tells you but the facts are against you. The fact is that your great leader is a coward who ducked the draft and then deserted to avoid a drug test.

  • by pclminion ( 145572 ) on Monday October 27, 2003 @08:48PM (#7324125)
    So, if somebody like Google blatantly defied the robots.txt and crawled the entire site anyway, would this piss off the White House? We all know that robots.txt is a "gentleman's" agreement to not go certain places. It isn't an authentication or access control mechanism.

    Would the White House sue for violation of the robots.txt file? Under what laws could they sue? Is robots.txt an implicit grant of permission to view copyrighted content? Would GWB press the Congress for a new bill, to mandate legal enforcement of the robots.txt?

    That's probably not going to happen anytime soon, but it raises an interesting question. Is robots.txt legally enforceable? And if it was, would that be a good thing or a bad thing?

    Your thoughts?

  • by alan_dershowitz ( 586542 ) on Monday October 27, 2003 @09:08PM (#7324293)
    While I agree that there ought to have been more transparency from the US govt regarding the Guantanamo Bay detainees by now, if you had SHOT at U.S. soldiers in an engagement in Great Britain, you'd be an illegal combatant. This is pretty much why these people are being detained.

    These people were probably by and large draftees, which unfortunately in Afghanistan, meant they weren't going to _get_ a uniform. They certainly have a right to public trial, but by and large they were probably arrested legitimately. I see this more as an indictment of the unfairness of the Geneva Conventions with regard to poor nations, or forces that aren't backed by recognized governments.

    It would be a lot easier to classify this as "disgusting" if we knew just what was happening down there. Right now, we don't really know much of anything, which is disturbing on several levels. But it isn't disgusting in the way I'd classify the very well documented types of suppression that were commonplace under the government these combatants were fighting for.
  • by drf5n ( 561106 ) on Monday October 27, 2003 @09:45PM (#7324542)
    Pardon me, but some of them do lead to interesting things. /news/releases/2003/05/iraq/ exists, and even contains different data than
    news/releases/2003/05/text/ or news/releases/2003/05/

    See for yourself:

    http://www.whitehouse.gov/news/releases/2003/05/text/20030501-15.html [whitehouse.gov] versus http://www.whitehouse.gov/news/releases/2003/05/iraq/20030501-15.html [whitehouse.gov] and http://www.whitehouse.gov/robots.txt [whitehouse.gov] has /news/releases/2003/05/iraq/ in it.

    Compare the headlines.
  • Re:Funny (Score:2, Interesting)

    by the_mad_poster ( 640772 ) <shattoc@adelphia.com> on Monday October 27, 2003 @10:20PM (#7324835) Homepage Journal

    Regarding your comment: it's a childish retaliation against another poster's .sig that appears (the link is broken so I'm going from the link text) to down France about its quite obvious ties to Iraq. You know.. the whole "Pot calling the Kettle black thing"?

    So no, it's not news that the fundamentalist USA supported secular Iraq in a war against fundamentalist Iran.

    Gee... and we wonder why so many Muslims in that part of the world think we're just a bunch of marauding, Koran-hating Christian crusaders. Mm mm... no mixed messages coming from THIS side of the ocean... noooooooo.

    Here's a tip for anybody thinking about replying to start an argument over Iraq:

    I don't care. Bush fucked the whole thing up from the beginning by "going it alone" and now it's too late, so we'll just have to slog through it.

    And vote that asshole out of office when the elections come around. "Bring em on". Yea fucker... bring em right on in to the White House and see how big you are then. Tough talk from a military deserter... goddamn idiot. "Bring em on"... yea, as long as it's not YOU and YOUR kids that are meeting them on the field.... right?

  • Re:Funny (Score:2, Interesting)

    A dictatorship assumes George Bush took absolute power of the government. Our government is way too inefficient to support that political model.

    I don't think USA can ever be a dictatorship. However, it can very well turn into a fascist government. Look at someone like Hitler or Mussolini. People always claim that Hitler was a dictator but that is missing the point. If there were elections held in Germany (open fair elections monitored by the UN), Hitler would still have won by a massive majority.

    The US govt CAN start practicing fascism. I'm not saying it is doing that now but it isn't inconceivable. Dictatorship, on the other hand, is highly unlikely...

    Sivaram Velauthapillai
  • Re:WMD's found! (Score:3, Interesting)

    by TPFH ( 92944 ) on Tuesday October 28, 2003 @03:25AM (#7326474) Homepage Journal
    What wasn't reported widely in the media was that Saddam Hussein had possession of 2 of the 3 Egyptian God Cards! If he was able to get ahold of the third remaining card and the Millennium Puzzle he would have been able to TAKE OVER THE WORLD!!!!!

    On a more serious note, as much as I hear people joke about "We kept the receipts" that actually is how the UN Weapons Inspectors were able to find the weapons that they did.

    (btw, what percentage of the country think that it was Saddam Hussein that kicked out the inspectors in 1997?)

    Anyway, according to Scott Ritter, by the time that Clinton kicked the inspectors out of Iraq they had accounted for 95% of the WMD, and the main reason they were not able to complete the job was more because of the Clinton administration than the Iraqis. (Not to say that there were not a bunch of problems from the Iraqis.)

    Scott Ritter has been very outspoken about these issues and as a Marine Corps Captain during Desert Storm and a Chief UN Weapons Inspector he is a very qualified authority. He risked his life searching for weapons and I think more people need to listen to him.
