White House Website Limits Iraq-Related Crawling
oscarcar writes "Dan Gillmor is reporting on the White House website's use of its robots.txt file to keep search engines from crawling certain material. Many excluded items in the robots.txt file involve mentions of Iraq, possibly to prevent people from finding changes to past statements and information when archived elsewhere."
Funny (Score:5, Funny)
Re:Funny (Score:2, Funny)
We have to give him credit for believing in the U.S. values enough not to shut the site down.
"There ought to be limits to freedom" - G.W. Bush (Score:4, Informative)
WMD's found! (Score:3, Funny)
Re:WMD's found! (Score:3, Interesting)
On a more serious note, as much as I hear people joke about "We kept the receipts" that actually is how the UN Weapons Inspectors were able to find the weapons that they did.
(btw, what percentage of the country think that it was Saddam Hussein that kicked out th
Re:Funny (Score:5, Funny)
I feel the same way about whitehouse.gov. Couldn't have said it better myself.
Re:Funny (Score:5, Interesting)
Re:What a bunch of flamebait (Score:3, Insightful)
I'm so sorry I expended my mod points earlier in the day. What a bunch of flamebait bullshit this line of crap is. "Dictatorship?" Get fucking real. Let me ask this in non-partisan terms:
Yeah, so what? I don't know about you, but part of my governmental conditioning program, er, public education, included a long history lesson attached to the flimsy statement of "Those who forget the past are doomed to repeat it." Noticing that many of the things going on in this country during Bush's term as president
Drawing farfetched conclusions (Score:4, Funny)
And your ... (Score:2)
Re:And your ... (Score:5, Insightful)
It looks like someone blocked off parts of the site to web-crawlers; I don't know for sure why all those blah/bloo/iraq entries are in there but they sure as hell don't lead to anything.
Censorship: 0
Screwups: 100
re: and your ... (Score:4, Insightful)
let's not get reactionary here, folks. it wouldn't make sense to do what's being alleged:
1. every major journalist worth his/her salt would be all over it within hours. so it wouldn't succeed in obscuring information.
2. it would create an incredible backlash as soon as detected. what purpose would this serve?
ed
Re: and your ... (Score:4, Insightful)
Don't be naive. How long do you think that any mainstream journalist who made a story of this would have a job for? The answer - not long. The US media in particular, although the UK is getting as bad, is little more than a relay system for government propaganda and real, detailed, complete examination of government behaviour, with equal air time to truly dissenting opinions (how many times has Chomsky been on CNN in the past 4 months?) is out of the question. What the government does is Good and Right and Should Not Be Questioned.
Media by the elite, serving the elite.
Re: and your ... (Score:5, Interesting)
Where have you been living the past five years? Journalists don't criticize Bush.
They still have not published the fact that he deserted from the national guard during Vietnam and they practically ignored his DUI conviction.
The GOP has the media cowed with their constant 'liberal media' babble. The number of journalists who are prepared to hold Bush to account is tiny - Krugman, Conason, Ivins, Alterman. After that it's Al Franken, Jon Stewart and David Letterman.
it would create an incredible backlash as soon as detected. what purpose would this serve?
The chances that the mainstream media will pick this one up are very small. Just think how they would have reacted if it was Clinton!
Re: and your ... (Score:4, Interesting)
The crux of the matter is that he refused to have his pilot's medical just after the Pentagon added a check for illegal drug use.
You can try to spin this whichever way that Karl Rove tells you but the facts are against you. The fact is that your great leader is a coward who ducked the draft and then deserted to avoid a drug test.
Re:And your ... (Score:5, Insightful)
Native people fighting against an occupying force are known as freedom fighters, not terrorists.
Try again, sparky.
Re:Drawing farfetched conclusions (Score:5, Insightful)
It looks like someone did a
find . -type d|perl -e 'while(<>){chomp; s/^\.//; print "Disallow: ${_}/iraq\n"; print "Disallow: ${_}/text\n";}' > robots.txt
I have no idea what the purpose would be, but it seems like a funny thing to do if you were trying to hide something.
By the way, who is going around looking at people's robots.txt files?
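For anyone who wants to try the guess above without Perl, here is a minimal Python sketch of the same idea: walk a directory tree and emit one Disallow line per directory for each suffix. The function name `generate_disallow_lines` and the default suffixes are assumptions made for illustration; nothing here is known about how the real file was produced.

```python
import os


def generate_disallow_lines(root, suffixes=("iraq", "text")):
    """Return one 'Disallow:' line per (directory, suffix) pair under root.

    Hypothetical helper: it only mimics the find|perl pipeline guessed at
    in the thread, not any tool the site is known to use.
    """
    lines = []
    for dirpath, dirnames, _filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        rel = os.path.relpath(dirpath, root)
        # The root itself maps to an empty prefix, so it yields "/iraq".
        prefix = "" if rel == "." else "/" + rel.replace(os.sep, "/")
        for suffix in suffixes:
            lines.append(f"Disallow: {prefix}/{suffix}")
    return lines
```

Calling `generate_disallow_lines(".")` from a document root and writing the result under a `User-agent: *` line would give a file with the same shape as the one being discussed, including entries for directories that contain no such subdirectory.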
Re: (Score:3, Insightful)
upside (Score:5, Funny)
Other, arguably more reasonable explanations (Score:4, Interesting)
Maybe, but I would think they might also be looking for "shady" spiders that ignored robots.txt. I wouldn't be surprised if there aren't a few honeypot pages in there too.
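A rough sketch of how that kind of check could work, assuming the server writes Common Log Format and that the trap paths are linked from nowhere except robots.txt. The function name, the regular expression and the `/trap/` prefix used in the note below are all hypothetical.

```python
import re

# Matches the client address and request path of a Common Log Format line.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:GET|HEAD|POST) (\S+)')


def robots_violators(log_lines, disallowed_prefixes):
    """Count requests per client that fall under a disallowed path prefix."""
    hits = {}
    for line in log_lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        client, path = match.groups()
        if any(path.startswith(prefix) for prefix in disallowed_prefixes):
            hits[client] = hits.get(client, 0) + 1
    return hits
```

Running `robots_violators(open("access.log"), ["/trap/"])` would list the addresses that fetched trap pages. A hit only suggests a crawler that ignores robots.txt; a person who read the file and typed the URL by hand would look exactly the same.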
Re:Other, arguably more reasonable explanations (Score:5, Funny)
Oh, crap. I just plugged in
Quick, Slashdot that link before the Agents get to my cube!
Re:Other, arguably more reasonable explanations (Score:5, Interesting)
Re:Other, arguably more reasonable explanations (Score:5, Funny)
A question that GW gets asked all the time.
Re:Other, arguably more reasonable explanations (Score:5, Insightful)
http://www.bway.net/~keith/whrobots/disdirs.html And, yes these files *are* relevant.
Someone's been busy (Score:4, Informative)
http://www.whitehouse.gov/infocus/iraq
Not any more.
Although the current Google cache [216.239.59.104] lists
[snip 22 lines]
the current robots.txt leaps from
to
Conspiracy theory over...
See the GOP trying to spin this... FOIA time (Score:3, Interesting)
Bullshit.
The Iraq entries could only have got there if someone was told to go and stop stories appearing in the Google cache.
The person who got the job appears to have done it in a pretty clumsy way, which is pretty much par for the course for this type of work. Nixon did not expect Gordon Liddy and his pals to get caught in a third-rate burglary either.
It looks to me like someone was told to block out the Iraq files and simply did a directory listing
Queue somebody... (Score:5, Insightful)
Of course, people would be less likely to trust random-Joe from the Internet than, say, The Wayback Machine, but I expect this is what will happen...
Re:Queue somebody... (Score:3, Interesting)
#!/usr/bin/perl -w
use strict;
use LWP::UserAgent;
my $ua = new LWP::UserAgent;
$ua->agent("Mozilla/4.0 \(compatible; MSIE 6.0; Windows NT 5.0;
my $req = new HT
Re:Already queued? (Score:3, Interesting)
Careful (Score:2, Funny)
Nugs
Just Ordinary Web Activity (Score:3, Insightful)
If this was some crazy government conspiracy and they were trying to hide the information, why would they put it on their website? There could be any number of reasons they have done this. Perhaps they were getting loads of hits from Google about Iraq-related things. But if anyone really wants the information, surely they can just visit the site.
Re:Just Ordinary Web Activity (Score:4, Insightful)
Actually, the motivation around this could be to prevent caching of the documents, so that it's not so easy to compare differently dated versions of the same document. See this piece at Caltech [caltech.edu] for an example of how things change with time.
Re:Just Ordinary Web Activity (Score:3, Informative)
The US government has no business with semi-private material. Either don't put it on the website, or make it publicly available to everyone, including Google and friends.
Interesting line (Score:4, Funny)
I didn't know gee-dub likes SpongeBob too! My nephew is gonna flip out when he hears this.
Devil's Advocation Follows. (Score:3, Interesting)
Completely ignoring for the moment the fact that these views and actions are really somewhat embarrassing for the Bush administration, this really makes sense from a practical viewpoint. Few things are as annoying as searching for something news-ish and finding primarily material from two years ago. And after all, if they ONLY were interested in people forgetting the old materials, they could have just removed those materials from the site totally. (Though perhaps they were aware removing the materials completely would cause mirrors, which would be fully searchable, to spring up.)
Re:Devil's Advocation Follows. (Score:2, Insightful)
I agree that this is yet another step in the misinformation campaign surrounding the current administration. The policies that we've heard flip through hoops like trained seals. There's just no logic to all the reversals of focus, the "misquotes" and the public snafus we've seen happen. This is just another one of them.
Re: (Score:3, Insightful)
Re:Devil's Advocation Follows. (Score:2, Insightful)
Are you surprised? (Score:2, Flamebait)
This isn't partisan politics, either. The Republican party has been trying to keep Bush from violating the Presidential Records Act.
Yes, yes, the country's at war. Makes you wonder why Bush doesn't want anybody to know about communications between Reagan and his advisors.
country is not at war (Score:3, Insightful)
Re:country is not at war (Score:5, Insightful)
Re:country is not at war (Score:3, Insightful)
I suspect that many of the people captured met neither condition.
Re:country is not at war (Score:4, Insightful)
In which case they should be charged with something, either spying (unlikely if they were in their own country) or something else. They should then have the opportunity to defend themselves in open court with the ability to avail themselves of all the rights guaranteed by the Universal Declaration of Human Rights [un.org] which the US has signed up to. If US soldiers in Britain arrested me, I would not be wearing a recognisable uniform because I am not part of the military or any recognisable fighting force of government. That does not give them the right to forcibly remove me from my home country and lock me up without ever even charging me with anything! The actions of Bush and his cohorts in the Whitehouse are absolutely disgusting.
Re:country is not at war (Score:3, Interesting)
These people were probably by and large draftees, which unfortunately in Afghanistan, meant they weren't going to _get_ a uniform. They certainly have a right to public trial, but by and large they were probably arrested legi
Re:country is not at war (Score:3, Insightful)
Damn the terrorists to hell! I pray to God that He will strike all those who think like the terrorists down, and thrust them into the deepest recesses of hell. How can He be a God of Justice and Love if He allows this kind of crap to go on unpunished? They are not honorable, and they should feel DAMN lucky we didn't go and slaught
Re: country is not at war (Score:3, Informative)
> There hasn't been a real declared war since WWII. You can't "declare war on terrorists" and be done with it either, wars are supposed to be declared on countries when you go to fight them.
Also, US wars have to be declared by the Congress rather than by the White House... or at least that's the way it worked back when the Constitution still meant something.
Seriously though... (Score:4, Interesting)
While anything is possible in politics, is it possible that the web admin is trying to limit the amount of traffic on the site? Is it possible that his analysis of the weblogs shows a lot of traffic from robots looking for Iraq-related info?
Questioning any statements put out by the WH ... (Score:4, Interesting)
If you persist in contemplating a world where whatever statements that the WH puts out, no matter how they might seem to contradict previous statements, are not totally true and correct, then a relocation expert from Guantanamo will be by in a few minutes. Just step away from the computer.
Orwell (Score:2)
Everything Iraq.... (Score:5, Informative)
But not a problem, on google.com I just specify the site by saying 'Iraq site:whitehouse.gov' and it had 14,000 hits... the first one is the root of
Re:Everything Iraq.... (Score:4, Informative)
Next time it crawls the site it won't read the forbidden directories and will delete them (if present) from the Google Cache, essentially erasing any official iraq history from google (and other search engines)
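To see what a well-behaved crawler does with such a file, here is a small sketch using Python's standard `urllib.robotparser`. The rule shown is an assumption for illustration (the paths in the real file are not reproduced in this thread), and `ExampleBot` is a made-up user agent.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only, not a copy of the real file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /infocus/iraq
"""


def allowed(url, robots_txt=ROBOTS_TXT, agent="ExampleBot"):
    """Return True if a crawler that honors robots_txt may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)
```

With these rules, `allowed("http://www.whitehouse.gov/infocus/iraq/index.html")` is False and `allowed("http://www.whitehouse.gov/news/")` is True. Note that the check is voluntary: the file asks crawlers to stay out, it does not stop a browser or a rude bot from requesting the page.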
Truly Frightening. (Score:5, Funny)
Obviously, they're keeping people from accessing the top-secret teeball Iraq files [whitehouse.gov] ! Besides:
Check out these other frightening examples of censorship:
Truly frightening.
Re:Truly Frightening. (Score:2, Funny)
Re:Truly Frightening. (Score:3, Insightful)
Barney, agent provocateur of the CIA? You Decide (Score:5, Funny)
Disallow:
Disallow:
Disallow:
Disallow:
Disallow:
Disallow:
Which is the same number as "cheney", "powell" had 4, "saddam" didn't have any and "bush" only comes up with "bushpets".
Clearly, there is something to do with Barney and Iraq that The White House doesn't want you to know about.
myke
not that far fetched (Score:2, Offtopic)
The current administration is trying its damnedest to control information that it doesn't like.
Re:not that far fetched (Score:3, Informative)
I, for one... (Score:5, Funny)
Diff between fact and fict: Fict must be believed (Score:3, Insightful)
Winston's greatest pleasure in life was in his work. Most of it was a tedious routine, but included in it there were also jobs so difficult and intricate that you could lose yourself in them as in the depths of a mathematical problem -- delicate pieces of forgery in which you had nothing to guide you except your knowledge of the principles of Ingsoc and your estimate of what the Party wanted you to say. Winston was good at this kind of thing. On occasion he had even been entrusted with the rectification of the Times leading articles, which were written entirely in Newspeak. He unrolled the message that he had set aside earlier. It ran:
times 3.12.83 reporting bb dayorder doubleplusungood refs unpersons rewrite fullwise upsub antefiling
In Oldspeak (or standard English) this might be rendered:
The reporting of Big Brother's Order for the Day in the Times of December 3rd 1983 is extremely unsatisfactory and makes references to non-existent persons. Rewrite it in full and submit your draft to higher authority before filing.
It does seem questionable... (Score:3, Insightful)
Chris
No Christian Holidays (Score:2)
Disallow:
Does this mean they're going to ban Christmas in Iraq too?
related links (Score:5, Interesting)
Here's a minor example of something those two sites didn't catch: Remember Iraq's so-called "mobile biological weapons factories" [fas.org]? A month after the story broke that they were for weather balloons [slashdot.org], the CIA moved their report's URL [informatio...house.info].
An intriguing fact about this whitehouse.gov/*/iraq thing is that they do in fact cover some of the important statements [bway.net] which are apparently not duplicated in the press release, conference, and briefing directories. Perhaps there was a "unique urgency" to cover up some poor choices of words?
let's rejoice george (Score:2)
wow, a webmaster changed his robots.txt. i'm amazed.
Whitehouse doing something right for a change (Score:2)
Disallow:
Thank goodness they're limiting the export of that blasted purple dinosaur!
Didn't work. (Score:3, Funny)
Well either that, or it's simply preventing search engines from indexing honeypot type pages used for mis-information... Either or... but I like the first version... since it's more paranoid, and I have plenty of tinfoil ready to be shaped into hats...
Or (Score:2)
Or maybe, just maybe, they're doing it to save their server from being constantly crawled by paranoid conspiracy-theorists looking for changed statements and information.
SHHH!!! (Score:2, Funny)
What's that? Oh, all of the real top-secret stuff is at the NSA website [nsa.gov]?
Never mind then.
Not conspiracy, but I don't know what it *is* eith (Score:5, Informative)
Re:Not conspiracy, but.. (Score:3, Informative)
Which might explain why at least one of the directories - /infocus/iraq/ - clearly has an index [whitehouse.gov]. However, if they moved or renamed a file under that path, it might be generating 404's. From personal experience, I've had bad requests from Googlebot for files that were over 4 years old.
I have
Re:Not conspiracy, but I don't know what it *is* e (Score:3, Informative)
That subdirectory seems to contain all or most of the transcripts of Ari Fleischer's and Bush's interviews and press conferences leading up to the war and after. An example is this:
http://www.whitehouse.gov/infocus/iraq/excerpts_sept26.html [whitehouse.gov]
Re:Not conspiracy, but I don't know what it *is* e (Score:3, Insightful)
I really do wonder what brings people to zealously defend actions like this. Sure, it could be a mix-up, but a really ill-conceived one. It's obvious that you don't have all the answers, just like others here.
My guess is that the poster feels that Slashdot posters are simply leaping to unjustified paranoid conclusions, and the depth of this faith (or so he pictures it) outrages him (or her).
The intensity of the poster's reaction is simply a reflection of his or her perception of Slashdot readers' zeal.
Most of them are blocked because they're 404's (Score:3, Informative)
Wayback Machine (Score:3, Informative)
Seems odd and pointless to me. I'd like a statement explaining it. A lot like the "Disallow: /hidden/passwd" kind of entries.
Take a look for yourself (Score:3, Informative)
Disallow:
Now, how many pages would this possibly block?
M@
Re:Take a look for yourself (Score:3, Funny)
Disallow:
Soylent Green is Iraqi people!
Not just Bush (Score:3, Funny)
File looks auto-generated (Score:3, Interesting)
Having said that, I'm not even sure that this robots.txt file would work the way it's supposed to. Seems like these iraq references should all have a trailing slash or a
Someone clearly doesn't want Google caching Whitehouse content on Iraq. The question is why? And how come they're so lame about it?
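On the trailing-slash question: under the original robots.txt convention a Disallow value is treated as a plain path prefix, so the version without the slash actually blocks more, not less. A tiny sketch of that rule follows; the `blocked` helper and the example paths are assumptions for illustration, and real crawlers may add extensions such as wildcards.

```python
def blocked(path, disallow):
    """Prefix rule from the original robots.txt convention.

    An empty Disallow value blocks nothing; otherwise any path that
    starts with the value is off limits.
    """
    return disallow != "" and path.startswith(disallow)


# Hypothetical paths, used only to compare the two spellings of the rule.
PATHS = [
    "/infocus/iraq",
    "/infocus/iraq/",
    "/infocus/iraq/index.html",
    "/infocus/iraqi-aid.html",
]

for rule in ("/infocus/iraq", "/infocus/iraq/"):
    print(rule, "blocks", [p for p in PATHS if blocked(p, rule)])
```

So a rule without the slash still covers everything under the directory, plus any sibling whose name merely begins with "iraq". A rule with the slash covers only the directory's contents and leaves the bare `/infocus/iraq` URL crawlable.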
We are at war with Eastasia. Eurasia is our ally. (Score:4, Insightful)
User-agent: *
Disallow:
Disallow:
Disallow:
Disallow:
Disallow:
Disallow:
Disallow:
Disallow:
Disallow:
Disallow:
Disallow:
Disallow:
Disallow:
Disallow:
Disallow:
Disallow:
Disallow:
Disallow:
Disallow:
Disallow:
Disallow:
Disallow:
And now, an offering for the lameness filter...
Oceania was at war with Eastasia: Oceania has always been at war with Eastasia. A large part of the political literature of five years was now completely obsolete. Reports and records of all kinds, newspapers, books, pamphlets, films, sound tracks, photographs- all had to be rectified at lightning speed. Although no directive was ever issued, it was known that the chiefs of the Department intended that within one week no reference to the war with Eurasia, or the alliance with Eastasia, should remain in existence anywhere. The work was overwhelming, all the more so because the processes that it involved could not be called by their true names. Everyone in the Records Department worked eighteen hours in the twenty-four, with two three-hour snatches of sleep. Mattresses were brought up from the cellars and pitched all over the corridors; meals consisted of sandwiches and Victory Coffee wheeled round on trolleys by attendants from the canteen. Each time that Winston broke off for one of his spells of sleep he tried to leave his desk clear of work, and each time that he crawled back sticky-eyed and aching, it was to find that another shower of paper cylinders had covered the desk like a snowdrift, half burying the speakwrite and overflowing onto the floor, so that the first job was always to stack them into a neat-enough pile to give him room to work. What was worst of all was that the work was by no means purely mechanical. Often it was enough merely to substitute one name for another, but any detailed report of events demanded care and imagination. Even the geographical knowledge that one needed in transferring the war from one part of the world to another was considerable.
This was written in 1948. Things have really progressed!
1984: simple answer (Score:5, Funny)
Obviously robots.txt just happened to be in the path!
Stupidity riegns supreme (Score:4, Funny)
Re:Stupidity riegns supreme (Score:3, Informative)
It's all still there for all to see, but it's not as easy to find. So they can say "We're not hiding anything." while they actually hide it.
Things that become inconvenient or embarrassing after the fact are hard to hide. At the time this quote by Dick seemed reasonabl
Not a surprise (Score:3, Funny)
Disallow:
Disallow:
Disallow:
Disall
Disallow:
Disallo
Di
Disall
Disallow:
Dis
Disallo
Dis
Disall
Disallow:
Disallow:
Disallow:
Disall
Disallow:
Disallo
Disallow:
Two conspiracy theory leaps (Score:3, Insightful)
1) First, a lot of these docs involve Iraq. So, without real factual information, it's assumed they're trying to do something fishy regarding Iraq info.
2) Using that assumption, the next assumption is that they're purposely trying to keep people from trying to find contradictory statements.
This could all be true, or it couldn't be. Either way, making two assumptions without any real facts is just pathetic yellow journalism.
I have no problem with this (Score:3, Insightful)
Is robots.txt enforceable? (Score:3, Interesting)
Would the White House sue for violation of the robots.txt file? Under what laws could they sue? Is robots.txt an implicit grant of permission to view copyrighted content? Would GWB press the Congress for a new bill, to mandate legal enforcement of the robots.txt?
That's probably not going to happen anytime soon, but it raises an interesting question. Is robots.txt legally enforceable? And if it was, would that be a good thing or a bad thing?
Your thoughts?
Re:More American Cencorship (Score:3, Insightful)
They do, it's called voting, not to mention public opinion polls, which were near 70% for the invasion when the US invaded.
Re:More American Cencorship (Score:5, Insightful)
And 70% of the people in this country STILL think that Saddam played some part in 9/11. What was your point again?
Comment removed (Score:4, Insightful)
Re:A CLASSIC QUOTE... (Score:5, Insightful)
Paranoia aside, I object to these restrictions as a matter of principle. They're making it more difficult to access publicly available information. It's not classified, and it never was. I, as a citizen of the U.S.A., have a right to know what my leaders have said and done.
Let's assume the whitehouse.gov search engine is completely honest, and faithfully returns a complete listing of all materials on the site having to do with Iraq. If that's so, then there should be no reason to disable other search engines, since their results would just confirm the internal results.
But the restrictions are in place, meaning that someone thought there was a good reason to do so. Restricting access makes it more difficult for people to research information pertaining to Iraq on the whitehouse.gov web site. Who are the people most likely to be doing that? Answer: journalists, activists, and concerned citizens. Obviously these restrictions aren't enough by themselves to dissuade a determined researcher; but it might slow them down. And it might actually stop a diffident researcher completely.
I'm not even going to go into scenarios where the whitehouse.gov search engine is not trustworthy, because serving up "doctored" speeches or information is highly unlikely. There are too many other archives to compare against, and it would be a major scandal if the administration was found to be altering records on its website. They'd have to be really, really dumb to do that.
The whole thing still leaves a bad taste in my mouth, though.
Re:A CLASSIC QUOTE... (Score:5, Insightful)
The other rule for transparency is that all material information be made available, kept, or destroyed in accordance with public regulation and individual policy. Individual policy must be consistent and decisions must be defensible based on policy.
The fact that people do not understand these two aspects of transparency is what allows situations like Enron to develop. The latter is what caused the destruction of Arthur Andersen. They have done nothing wrong, but they did not follow their own policy on document destruction, which made them look like at best idiots and at worst criminals.
We may compare this to other ventures to suggest policy. The NYT does not want Google to cache articles because the NYT sells those articles after a certain time. Many other companies do not want deep linking because it reduces ad revenue. A fascist government may want to ensure all users enter their site from a top page to make sure all users must go through the daily propaganda. A library tries hard to not track patrons so that no one is afraid of using the library. The rationale of the White House is beyond me.
The White House is not hiding documents. However, they are reducing the transparency of the government by limiting the avenues by which the public may access documents. Since the White House has stated many times that it believes in transparency, and in fact requires transparency when dealing with other governments, one can stipulate that transparency is the appropriate standard. So, until someone comes up with a policy that was developed and vetted through the normal processes used in the U.S., one has every reason to suspect nefarious motives.
And, if I may modify a statement that conservatives like to make, if you do not like transparency, go move to Iraq.
Re:Oh please (Score:5, Insightful)
I have to admit, when I first read the story I thought someone was being paranoid. But you really should RTF robots.txt file before you accuse the poster of being paranoid. The disallowed files are extraordinarily specific. I really can't come up with a plausible explanation beyond simoniker's.
Re:Oh please (Score:5, Interesting)
Re: (Score:3, Funny)
It must be true... (Score:2)
Re:Interesting allegation... (Score:3)
Re:Interesting allegation... (Score:5, Informative)
Compare the screenshots of what used to be on the white house website vs what's currently on the website.
Yes, I know, "how do we know this blogger didn't alter the screenshots?" You don't.
Re:EXACTLY (Score:5, Insightful)
Let's present an alternate scenario - since you have no evidence for yours, I don't have to present any evidence for mine.
It's May - Pres. makes his speech on the Carrier, the assumption by those-in-charge is that Chalabi's government will have control of the country within a couple of weeks and the US troops will be heading on home. The web folks (who want to make B & C look good) declare "combat's done! the troops are coming home! re-elect Bush!"
A few months later, that rosy scenario hasn't quite panned out. The aircraft carrier speech is becoming a liability for Bush - people started counting the number of dead troops in Iraq since he gave the speech, and it keeps going up. The web folks (who want to make B & C look good) say to themselves "this is a potential embarrassment to the president - let's see how we can make it less embarrassing."
And there you have it.
Re:that's because those are bad links (Score:3, Informative)
Disallow:
http://www.whitehouse.gov/infocus/iraq is a valid URL.
Re:Why the fuck does the government use robots.txt (Score:4, Insightful)
Nosirree, no legitimate webmaster would ever use robots.txt to gently guide visiting bots to the appropriate parts of the site and to keep them from trying to do silly things. The only possible use is to trample your rights while installing the new corporate-owned government.
Geez, people. Honestly.
Missing Iraq and 9.11 files (Score:5, Informative)