Thousands of Sites Wrongly Blocked
Ben Edelman writes: "In the context of the ACLU's pending challenge to the Children's Internet Protection Act (PDF), I recently prepared a list of some 6000+ web sites that, by and large, fail to meet the category definitions of popular Internet filtering programs yet are blocked by at least one such program. This topic may be old hat, but my work is new: I have prepared an unusually large list of sites (including police departments, libraries, home-schooling sites, candidates for political office, and on and on), and I have retested these sites over a period of several months."
Hello? (Score:1)
Re:Hello? (Score:2)
To stay on topic, it's a law in Michigan that all publicly funded internet access terminals (schools, libraries, etc.) have content filtering software installed, except when said terminal is in a college environment. All K-12 schools and public libraries have to have this filtering, though, to continue to receive public funding. Tossed quite a panic in the ISD that I work with from time to time.
Re:Hello? (Score:2, Interesting)
But, seriously, I quit reading the report after I got to the "I received $XXX per hour for my work..." A quick scroll through the rest of the document revealed block upon block of deleted text, and I'm really only interested in that if the deleted text is worth figuring out.
The Most Interesting Parts; Protective Order (Score:4, Informative)
For example, http://cyber.law.harvard.edu/people/edelman/mul-v
Regarding the blacking out of certain text from my report: As http://cyber.law.harvard.edu/people/edelman/mul-v
Ben Edelman
Experiment - censorware collateral damage verify (Score:5, Informative)
[Originally sent to a mailing-list]
In honor of the censorware material just released by ACLU, I thought I'd try a little experiment in distributed verification.
I took one example from Edelman's report:
16. Southern Alberta Fly Fishing Outfitters #6809 /Regional/Countries/Canada/Business and Economy/Shopping and /Regional/North America/Canada/Alberta/Recreation and
http://www.albertaflyfish.com [albertaflyfish.com]
Blocked by: N2H2 (Pornography - Sep 11, Oct 7), Websense (Sex - Jul 5, Aug 18, Sep 11)
Yahoo:
Services/Outdoors/Fishing/Fly Fishing/Lodges/
Google:
Sports/Fishing
Fly fishing in Alberta Canada on the world famous Bow River.
Now, what does censorware have against this site? Maybe it doesn't like too many 'Fly' references in one place? No, it turns out that this site has the misfortune to be virtually hosted and share an internet address with:
http://clubexoticx.com [clubexoticx.com] - Club Exoticx
There's a bunch of other completely innocuous sites suffering the same collective guilt of the censorware blacklist. I'd like people to go to N2H2's lookup, at http://database.n2h2.com/cgi-perl/catrpt.pl [n2h2.com] and *verify* this for themselves by testing the following sites:
http://albertaflyfish.com [albertaflyfish.com] - Southern Alberta Fly Fishing Outfitters
http://alistairbrown.com [alistairbrown.com] - Alistair Brown Folksinger
http://eclothing.com [eclothing.com] - 'The Game Is On Sportswear Company Ltd.'
http://effectivemanagementsolutions.com [effectivem...utions.com] - Effective Solutions
http://eligh.com [eligh.com] - Springboard Consulting
http://eyepowered.com [eyepowered.com] - E Y E P O W E R E D - 360 Degree Panoramas
http://friendlyfacesonline.com [friendlyfacesonline.com] - Create personalized family cartoon
http://gear4pickups.com [gear4pickups.com] - Gear4Trucks: HitchHoist Portable Truck Crane,
http://informationonhold.com [informationonhold.com] - Information On-Hold
http://letsmakewine.com [letsmakewine.com] - Let's Make Wine
http://planetregister.com [planetregister.com] - Planet Registe
http://ppt-slides.com [ppt-slides.com] - 35mm Slides from your computer file
http://proteach.net [proteach.net] - Pro Teach Main Page - Baseball instruction
http://rosiedonovan.com [rosiedonovan.com] - Rosie Donovan Photography
http://springboardtoinnovation.com [springboar...vation.com] - Springboard Consulting
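The shared-address claim above can be checked without any censorware at all: resolve each hostname and see which ones land on the same IP address. What follows is only a minimal sketch, not anything from Edelman's report or N2H2. The function name group_by_address and the pluggable resolve parameter are my own assumptions for illustration, and the hosting may well have changed since this was posted, so the names may no longer resolve to a common address.

```python
import socket
from collections import defaultdict


def group_by_address(hostnames, resolve=socket.gethostbyname):
    """Map each resolved IP address to the list of hostnames that share it.

    `resolve` is any callable taking a hostname and returning an address
    string; it defaults to a real DNS lookup. (The parameter is a
    hypothetical convenience so the grouping can be tried offline.)
    """
    groups = defaultdict(list)
    for name in hostnames:
        try:
            address = resolve(name)
        except OSError:
            # The name no longer resolves; file it under None instead of failing.
            address = None
        groups[address].append(name)
    return dict(groups)
```

Calling group_by_address with the hostnames listed above (without the http:// prefix) would do live DNS lookups; any key whose list holds both a blocked innocuous site and an adult site is a candidate for the "collective guilt" effect.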
Here, I'll make this easy. Just click these URLs:
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://albertaflyfish.com [n2h2.com]
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://alistairbrown.com [n2h2.com]
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://eclothing.com [n2h2.com]
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://effectivemanagementsolutions.com [n2h2.com]
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://eligh.com [n2h2.com]
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://eyepowered.com [n2h2.com]
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://friendlyfacesonline.com [n2h2.com]
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://gear4pickups.com [n2h2.com]
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://informationonhold.com [n2h2.com]
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://letsmakewine.com [n2h2.com]
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://planetregister.com [n2h2.com]
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://ppt-slides.com [n2h2.com]
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://proteach.net [n2h2.com]
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://rosiedonovan.com [n2h2.com]
http://database.n2h2.com/cgi-perl/catrpt.pl?req_URL=http://springboardtoinnovation.com [n2h2.com]
You should get
The Site: [all sites above]
is categorized by N2H2 as:
Pornography
If there's some error-message text in a red font, that means the N2H2 program itself wasn't working; try again.
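For anyone who would rather generate the lookup links than click them, here is a small sketch. The lookup address and the req_URL parameter are taken from the links in this post; the helper name lookup_url and the SITES list are just my own illustration, and the script only prints the links, it does not fetch them or parse N2H2's answer.

```python
from urllib.parse import quote

# Lookup page and query parameter as they appear in the links above.
LOOKUP = "http://database.n2h2.com/cgi-perl/catrpt.pl"

# The sites listed in this post.
SITES = [
    "http://albertaflyfish.com",
    "http://alistairbrown.com",
    "http://eclothing.com",
    "http://effectivemanagementsolutions.com",
    "http://eligh.com",
    "http://eyepowered.com",
    "http://friendlyfacesonline.com",
    "http://gear4pickups.com",
    "http://informationonhold.com",
    "http://letsmakewine.com",
    "http://planetregister.com",
    "http://ppt-slides.com",
    "http://proteach.net",
    "http://rosiedonovan.com",
    "http://springboardtoinnovation.com",
]


def lookup_url(site):
    """Build the category-report link for one site (hypothetical helper)."""
    # Keep ':' and '/' unescaped so the result matches the hand-written links.
    return f"{LOOKUP}?req_URL={quote(site, safe=':/')}"


for site in SITES:
    print(lookup_url(site))
```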
Now, since I've publicized this, I expect it'll be changed very rapidly for this one item. I have a saying: "Alacrity varies directly with publicity". But this is just one example in a HUGE blacklist. What else is lurking in there?
Sig: What Happened To The Censorware Project (censorware.org) [sethf.com]
Re:Experiment - censorware collateral damage verif (Score:2)
For another example discussed, see
http://sethf.com/anticensorware/cyberpatrol/247for1.php [sethf.com]
Regarding the topic of "banning entire IP subnets", MAPS and other spam blacklists don't do that as an implementation effect. They do it as a deliberate tactic. I don't want to get into that topic too much here, but it's a social issue, not a technological one.
Re:Experiment - censorware collateral damage verif (Score:2)
It's a real problem. (Score:2, Informative)
But it seems that someone else disagreed with me, and now it is categorized as 'satire'. Exactly how a site with such poor standards of journalistic integrity is allowed in that category amazes me.
I have now added adequacy.org to my junkbuster file, so I never have to see it again.
Re: (Score:2, Informative)
Many of those sites are *NOT* wrongly blocked (Score:2, Informative)
Re:Many of those sites are *NOT* wrongly blocked (Score:3, Insightful)
1) I agree that some portions of content on some of the sites on my list have been correctly categorized. But in the instance you described, it sounds like the specific URL on my list doesn't contain content meeting filtering programs' category definitions. As a result, even if there's reason to categorize other content on that same server, there's no need to categorize this specific page.
(To put this a different way: Many of the filtering programs seem to classify entire sites -- all content on an entire domain name, for example. But there's no reason why pages couldn't instead be rated on a page-by-page basis [and indeed some filtering companies report that they do this, too, in at least some instances]. To the extent that programs fail to review and separately categorize every individual page, they may overblock pages without content meeting their criteria.)
2) There's no doubt that some URLs on my lists actually do meet filtering companies' category definitions. I'm no librarian, and neither am I otherwise trained in content categorization, so it wasn't my job to identify this content. (Plus, as you can imagine, it's a large task to view many thousands of sites!) Instead, librarians reviewed certain of the sites (including a random sample of the entire list) to attempt to estimate the proportion of sites from my lists that are, in their professional opinions, suitable for use within a library. It's my understanding that the results of their study are forthcoming.
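To make the site-level versus page-level distinction in point 1 concrete, here is a toy sketch. It is not how any particular filtering product works; the function names, the example hostname, and the two blacklists are all hypothetical. It only shows why a host-wide rule blocks an innocuous page that a page-by-page rule would let through.

```python
from urllib.parse import urlsplit


def blocked_by_host(url, blocked_hosts):
    """Site-level rule: block every page on a listed hostname."""
    return urlsplit(url).hostname in blocked_hosts


def blocked_by_page(url, blocked_pages):
    """Page-level rule: block only the listed (hostname, path) pairs."""
    parts = urlsplit(url)
    return (parts.hostname, parts.path or "/") in blocked_pages


# Hypothetical blacklists for a server hosting one adult page and one essay.
HOSTS = {"members.example"}
PAGES = {("members.example", "/adult/index.html")}

essay = "http://members.example/essays/filtering.html"
print(blocked_by_host(essay, HOSTS))   # the whole host is listed
print(blocked_by_page(essay, PAGES))   # this specific page is not
```

The trade-off is the one described above: the page-level list requires someone to review and rate every page, while the host-level list is cheaper to maintain but overblocks.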
Re:Many of those sites are *NOT* wrongly blocked (Score:1)
Re:Many of those sites are *NOT* wrongly blocked (Score:1)
Re:Many of those sites are *NOT* wrongly blocked (Score:1)
Re: (Score:2)
Slander? (Score:2)
categories (Score:1)