The DOJ's New Spin on Blocking Software

Bennett Haselton writes "In recent arguments over the constitutionality of the Child Online Protection Act, both sides have argued over the effectiveness of Internet blocking software. While COPA would prohibit commercial U.S. websites from publishing freely available material that is "harmful to minors", the ACLU has argued that blocking software is a far more effective alternative, since among other things it can block porn sites located overseas, non-commercial websites, and p2p programs, all of which are beyond the reach of COPA. On the other hand, we had the surreal experience of watching the Department of Justice lawyer arguing in favor of a censorship law by saying that the blocking software alternative was unfair to children -- because it blocked too much legitimate material." The rest of Bennett's essay follows.

"For example," said DOJ attorney Eric Beane during opening arguments, "one filter even blocked a website promoting a marathon to raise funds for breast cancer research. Part of the CIA's World Fact Book was blocked. And a page with an ACLU calendar. [Blocking software blocks] a significant portion of other materials on the World Wide Web, materials that in many cases are necessary for a child to complete his homework." (Opening arguments transcript, p. 37.) As someone who has been publishing critiques of blocking software for years, I read those words and felt like cheering, despite the fact that I'm sitting in the other side's fan section for this match. (Beane is right, but he's missing the point: whatever problems exist with blocking software are minor compared to the problems with COPA, because blocking software raises no constitutional issues when a private party uses it in their own home, whereas COPA affects everyone in the U.S.)

The irony, of course, is that three years ago, in the trial over the similarly named Children's Internet Protection Act (CIPA), which required blocking software in all schools and libraries that receive federal funds, it was the ACLU pointing out the flaws in blocking software and the Department of Justice claiming that blocking software was accurate and effective.

At first it would seem that both sides are now guilty of flip-flopping. But reviewing what was said then and what is being said now, my conclusion is that the ACLU did nothing more than shift its focus to a different set of facts, while the government did contradict itself. And the source of this seeming flip-flop actually comes down to something pretty simple: two different ways of stating one set of numbers.

Now before going further I can't resist saying that I think the whole debate over "harmful to minors" material is pretty silly, because I don't think the pro-censorship side has ever put forth a reason why they think that pictures of naked people, or even people having sex with each other, are harmful to people under 18. I disagree with some people on matters like abortion and the death penalty, but I at least think they have some facts on their side; I don't know of any facts supporting people who think that pornography is dangerous. Why is a woman's nipple harmful but a man's nipple isn't? How are the majority of high school students, who have already had sex anyway, supposed to be harmed by pictures of other people having sex? And apart from the logical paradoxes, the pervasiveness of the Internet has now given us empirical data too: virtually all minors now have access to anything they want to get on the Internet (either at home, or by sneaking to a friend's house), and where's the evidence that adolescents' brains have been hormonally turned to mush any more than they always have been?

But for the remainder of the discussion, suppose you're addressing people who believe that nudity and sexual material really are harmful to people under 18. (In any case, the judges probably believe it, and even if they don't, they're bound by legal precedents that assume as much.) The question is how accurately blocking software achieves this goal.

Blocking software has two types of error rates: underblocking (failure to block porn sites) and overblocking (blocking of non-pornographic sites). Underblocking errors are usually expressed one way: the percentage of porn sites in a given sample that are not blocked. But overblocking errors can be stated in two ways: the percentage of non-porn sites that are blocked, or the percentage of blocked sites that are not pornographic. (There are borderline cases like nude art sites, but it turns out they're not common enough to affect the margin of error much; the vast majority of sites are either clearly porn or clearly not.)

The key is that if you want the overblocking rate to sound low, you talk about the percentage of non-porn sites that are blocked. If you want it to sound high, you talk about the percentage of blocked sites that are non-porn.
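To make the two formulas concrete, here is a minimal sketch in Python. The function name and the counts are hypothetical, made up purely for illustration (they are not figures from either trial): imagine a sample of 1,000,000 sites, 10,000 of them porn, and a filter that blocks 9,000 of the porn sites along with 9,000 clean ones.

```python
def error_rates(blocked_porn, blocked_clean, unblocked_porn, unblocked_clean):
    """Return the underblocking rate and both versions of the overblocking rate."""
    total_porn = blocked_porn + unblocked_porn
    total_clean = blocked_clean + unblocked_clean
    total_blocked = blocked_porn + blocked_clean
    return {
        # Share of porn sites that slip through the filter.
        "underblocking": unblocked_porn / total_porn,
        # Overblocking, "sounds low" version: share of clean sites that are blocked.
        "clean_sites_blocked": blocked_clean / total_clean,
        # Overblocking, "sounds high" version: share of blocked sites that are clean.
        "blocked_sites_clean": blocked_clean / total_blocked,
    }


# Hypothetical counts, chosen only to illustrate the gap between the two formulas.
rates = error_rates(
    blocked_porn=9_000,
    blocked_clean=9_000,
    unblocked_porn=1_000,
    unblocked_clean=981_000,
)
for name, value in rates.items():
    print(f"{name}: {value:.2%}")
```

With these made-up counts, the same 9,000 wrongly blocked sites are under 1% of all clean sites but fully half of everything on the blocklist.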

For example, in the 2003 Supreme Court arguments over CIPA, Department of Justice attorney Theodore Olson downplayed the error rates of blocking software by saying:

"But even if it's tens of thousands of the -- of the 2 billion pages of material that is on the Internet, we're talking about one two-hundredths of 1 percent, even if it's 100,000, of materials would be blocked."

Here he's referring to the percentage of non-porn sites that are filtered. Attorney Paul Smith, arguing against the law, countered:

"And so we have -- on these lists is a proportion, a huge proportion, perhaps 25, perhaps 50 percent of the sites that are blocked that are not illegal even for children."

"And the evidence is that there's about 11 million websites on the Internet, in -- in the accessible part of the Internet and that 100,000 of those are the sexually explicit ones and that the -- there are at least tens of thousands more that are on the list. So it's -- the Government also says in their brief that about one percent of the Internet is overblocked, which would be about 100,000 sites. So it is a substantial percentage. It is also a substantial amount. And most importantly, it's a very large percentage of what they're blocking is not what they intend to block."

That is, he was talking about the percentage of blocked sites that were non-pornographic. Both sides cited the same figure (100,000 non-pornographic sites blocked, apparently referring to an average across all blocking programs) -- but that same number could be seen as an "error rate" of either one two-hundredth of one percent, or 50%, depending on which formula you use.
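The arithmetic behind the two quotes can be checked in a few lines of Python. This is only a sketch: the variable names are mine, and treating "about 100,000" as the number of overblocked sites in both calculations is an assumption taken from the figures quoted above (note that Olson is dividing by pages while Smith is counting websites).

```python
overblocked = 100_000              # wrongly blocked sites, the figure both sides used
pages_on_internet = 2_000_000_000  # Olson: "2 billion pages"
websites = 11_000_000              # Smith: "about 11 million websites"
explicit_sites = 100_000           # Smith: the sexually explicit ones

# Olson's framing: wrongly blocked material as a share of everything online.
olson_share = overblocked / pages_on_internet

# Smith's framing, part one: wrongly blocked sites as a share of all websites
# (roughly the "about one percent" he attributes to the government's brief).
smith_share_of_web = overblocked / websites

# Smith's framing, part two: wrongly blocked sites as a share of the blocklist.
smith_share_of_blocklist = overblocked / (overblocked + explicit_sites)

print(f"{olson_share:.3%}")               # 0.005%, i.e. one two-hundredth of 1 percent
print(f"{smith_share_of_web:.1%}")        # 0.9%
print(f"{smith_share_of_blocklist:.0%}")  # 50%
```

The numerator never changes; only the denominator does.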

Then in this year's COPA trial, the ACLU called CMU professor Lorrie Faith Cranor who testified that in tests that she reviewed,

"[blocking software programs] correctly blocked an average of approximately 92 percent of objectionable content. And they incorrectly blocked an average of 4 percent of content not matching the test criteria."
(Oct. 24th transcript, p. 57.) Back to talking about the percentage of non-porn sites that are blocked -- which, again, when you put it that way, sounds low. On the other hand, although I couldn't find exact numbers cited by the DOJ's lawyers on the number of sites that were incorrectly blocked, in the portions of his opening argument quoted above, Eric Beane focused on the sad fact of the sites that were blocked -- not the fact that they comprised only a tiny fraction of sites on the Web. The two sides simply swapped formulas.

As for Peacefire's own studies over the years of blocking software error rates, one of the legitimate criticisms that could be made about our efforts was that we focused almost exclusively on the second number, the percentage of blocked sites that were non-porn. If you were interested in how blocking software actually affects the surfing experience of minors who are forced to use it, perhaps you would focus more on the first number, the percentage of non-porn sites that are blocked. You might even say that, as an organization addressing the blocking software issue specifically from a minors' rights point of view, we really should have focused on that number quite a bit! But I did get a bit preoccupied with playing "gotcha" with the blocking companies, focusing on the percentage of blocked sites that were obvious mistakes, because it was frankly too much fun publicizing the absurdly high error rates of their programs, which belied the claims made by most blocking companies that all sites on their blacklist were examined by a human at their company before being added. (Although it seems to have done some good -- as far as I know, no blocking company is making that claim about their product today.)

The error rates were indeed absurdly high; we took a sample of the first 1,000 .com domains in an alphabetical list, ran them through several programs, and found that of the sites blocked, between 20% and 80% (!) were errors. (The median error rate was about 50%, which corresponds to the figure given by Paul Smith in the CIPA trial oral arguments quoted above.) This surprised even critics of blocking software, and skeptics complained that we must have made mistakes or simply fudged the numbers. (The whole point of using the first 1,000 .com domains was that if we had used a random sample and gotten error rates like that, we could have been accused of "stacking the deck" and using a fake random sample that was loaded with known errors and not truly random.) Years later, it came out that the companies whose products we'd tested had been following a policy that if they found an objectionable site on a given IP address, all sites on that IP would be blocked, on the theory that hosting companies often group porn sites together on the same machine. Trouble was, while this may have often been true for bona fide porn sites, it was not true for most sites that featured just an incidental shot of someone's bare breasts or a large amount of profanity -- but that, too, was enough to get every site at a given IP blocked. So the 80% error rate was about what you'd expect after all.
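Here is a minimal sketch in Python of how a block-everything-on-that-IP policy produces numbers like these. The domain names, the addresses (taken from a range reserved for documentation), and the `block_by_ip` helper are all hypothetical; this illustrates the policy as described above and is not any vendor's actual code.

```python
def block_by_ip(site_ip, flagged):
    """Block every site that shares an IP address with at least one flagged site."""
    flagged_ips = {site_ip[site] for site in flagged}
    return {site for site, ip in site_ip.items() if ip in flagged_ips}


# Hypothetical shared-hosting layout: five unrelated sites on one machine,
# one site on another.
site_ip = {
    "flagged-example.com": "192.0.2.10",
    "plumber-example.com": "192.0.2.10",
    "soccer-example.com": "192.0.2.10",
    "siding-example.com": "192.0.2.10",
    "church-example.com": "192.0.2.10",
    "news-example.com": "192.0.2.20",
}
flagged = {"flagged-example.com"}  # the one site a reviewer found objectionable

blocked = block_by_ip(site_ip, flagged)
wrongly_blocked = blocked - flagged
error_share = len(wrongly_blocked) / len(blocked)

print(sorted(blocked))
print(f"{error_share:.0%} of blocked sites are errors")
```

In this toy layout, one flagged site takes four innocent neighbors down with it, so four out of five blocked sites are mistakes.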

You might think that a product with an 80% error rate could never survive in the marketplace, but consider who was buying the software. On the one hand, you had schools and companies buying the programs -- but they didn't care whether it worked so much as they cared about being able to show, for liability reasons, that they did something. On the other hand, you had parents who really did care about keeping porn off their computer -- but how many parents really did any thorough testing of the product, other than making sure it blocked the most obvious sites? A serious test could take days. Their kids are the only ones who would end up doing any thorough "testing" of the product, and if they found a way around it, it's not likely that they would tell their parents. With no market pressure to fix problems, an 80% error rate wasn't really surprising.

But even the most vocal critics of blocking software only pointed out that blocking software sometimes blocked sites about plumbing, or soccer, or aluminum siding; we never claimed that most of those sites would be blocked. Even with our high numbers of wrongly blocked sites, if they had been expressed as a percentage of non-porn sites that are blocked, they would have still sounded like a "low error rate".

The moral is, always keep track of what the "error rate" refers to in these debates. By moving around a few variables in a formula, the Department of Justice was able to go from saying in 2003 that blocking software was minimally intrusive, to making a speech in 2006 that made blocking software sound so tragically limiting that you could practically hear the violins playing. (I know, people who live in glass houses... *ahem*)

And what about the ACLU? If the Department of Justice is guilty of flip-flopping, from saying in 2003 that blocking software is a reasonable and narrowly tailored solution, to saying in 2006 that it's clumsy, ineffective, and overbroad, is the ACLU guilty of flip-flopping in the opposite direction?

Actually, the ACLU's position has always been consistent: blocking software has First Amendment problems when used in a school or library, due to overblocking and underblocking errors, but if used in the home it is still a lot more effective than a law like COPA, which would score pathetically on the same scale. As ACLU attorney Chris Hansen stated in opening arguments:

"COPA does not reach the 50% of all speech that is overseas... Filters are the most effective. Almost all of the filters that [expert witness] Mr. Mewett tested were at least 95% effective. Think about the 5% ineffectiveness compared to where we start with COPA being 50% ineffective..."
(Opening arguments, p. 22. Note: Chris Hansen has confirmed that the official transcript is wrong; it has him saying "35%" instead of "95%", which wouldn't make any sense.) As for overbreadth, COPA would criminalize speech by adults, intended for adults, something that no blocking program could ever do -- and as for minimizing collateral damage to innocent sites, does anyone think that even if COPA is upheld, parents will throw out their blocking software?

Even though the ACLU focused on different statistics in the two trials, in both cases they were focusing on the numbers that were relevant to the issue. When talking about constitutional problems with blocking software in schools and libraries, the percentage of blocked sites that are incorrectly blocked is important, because it is the First Amendment rights of those sites' publishers that are at issue. The DOJ lawyer talking about all the sites that weren't blocked was missing the point. If your site is being blocked, it hardly matters to you that for every blocked site there are hundreds that are not. "Hey, your site is not accessible, but don't worry, your competitors' sites are!"

On the other hand, when talking about the use of blocking software in the home, the publisher's First Amendment rights are not at issue; the issues that most parents would care about are how effective it is, and whether most clean sites are still accessible. Well, of course most of them are. Blocking software is not that bad.

Confused? The option to just stop making a big deal out of porn on the Internet is looking better all the time, isn't it?

