Pakistan Looking For Homegrown URL Blocking System
chicksdaddy writes "Tech-enabled filtering and blocking of Web sites and Internet addresses that are deemed hostile to repressive regimes has been a major political and human rights issue in the last year, as popular protests in Egypt, Tunisia, Libya and Syria erupted. Now it looks as if Pakistan's government is looking for a way to strengthen its hand against online content it considers undesirable. According to a request for proposals from the National ICT (Information and Communications and Technologies) R&D Fund, the Pakistani government is struggling to keep a lid on growing Internet and Web use and is looking for a way to filter out undesirable Web sites. The 'indigenous' filtering system would be 'deployed at IP backbones in major cities, i.e., Karachi, Lahore and Islamabad,' the RFP reads (PDF). It would be 'centrally managed by a small and efficient team stationed at POPs of backbone providers,' and must be capable of supporting 100Gbps interfaces and filtering Web traffic against a block list of up to 50 million URLs without latency of more than 1 millisecond."
Comment removed (Score:4, Interesting)
Re:A government that seems to understand the Inter (Score:5, Interesting)
Re:A government that seems to understand the Inter (Score:4, Interesting)
Maybe, but URL filtering in under 1 ms against any sizeable list of URLs is going to be pretty darn impossible. It's pretty tough to do much of anything to traffic that requires any sort of lookup that fast. I mean, a DRAM fetch alone is 5+ ns.
Even if you can search your lookup table fast enough, keep in mind that you are not just comparing values at fixed offsets, the way NAT and IP access lists do. You first have to figure out whether the traffic is HTTP at all, then locate the Host header and read until the newline. None of that is especially time consuming, but it's still going to be a chunk of that already tight millisecond.
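To make the steps in this comment concrete, here is a rough sketch of the per-request work: decide whether a payload looks like plain HTTP, find the Host header, and check host plus path against a block list. The function names and the one-entry `BLOCKLIST` are made up for illustration, and Python is used only to show the logic, not as something that would keep up with a backbone link.

```python
from typing import Optional

# Hypothetical block list; the RFP talks about up to 50 million entries.
BLOCKLIST = {"blocked.example/banned"}


def extract_url(payload: bytes) -> Optional[str]:
    """Return host + path for a plain HTTP/1.x request, or None otherwise."""
    # Only the header section matters; ignore any body after the blank line.
    head, _, _ = payload.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")

    # Step 1: is this HTTP at all? Expect "METHOD path HTTP/1.x".
    parts = lines[0].split(b" ")
    if len(parts) != 3 or not parts[2].startswith(b"HTTP/1."):
        return None
    path = parts[1]

    # Step 2: unlike an IP header, Host sits at no fixed offset, so scan for it.
    for line in lines[1:]:
        name, colon, value = line.partition(b":")
        if colon and name.strip().lower() == b"host":
            host = value.strip().lower()
            return (host + path).decode("latin-1")
    return None


def is_blocked(payload: bytes) -> bool:
    """Step 3: look the reconstructed URL up in the block list."""
    url = extract_url(payload)
    return url is not None and url in BLOCKLIST
```

A real filter would also have to deal with requests split across TCP segments and with encrypted traffic, which this sketch ignores.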
Re:A government that seems to understand the Inter (Score:5, Interesting)
1 millisecond is 1,000 microseconds or 1,000,000 nanoseconds. A 2 GHz CPU runs at least one instruction every nanosecond and usually more like 6-12 instructions. As you say, the DRAM fetch is significant, but a well-designed B-tree database already loaded in RAM limits the impact, because each lookup only has to touch a handful of nodes.
It's like an eternity in CPU time.
Of course, you can't write the code in Python, Perl or Ruby. You have to use C++.
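The arithmetic in this comment can be checked with a back-of-the-envelope sketch. It assumes, pessimistically, one DRAM miss per tree level; the 100 ns miss cost and the fan-out of 64 are assumptions picked for illustration (the thread itself only says "5+ns"), and Python is used here just as a calculator, not for the packet path.

```python
BUDGET_NS = 1_000_000        # 1 millisecond, from the RFP
BLOCKLIST_SIZE = 50_000_000  # up to 50 million URLs, from the RFP
DRAM_MISS_NS = 100           # assumed cost of one uncached memory fetch


def tree_depth(entries: int, fanout: int) -> int:
    """Levels needed for a tree with the given fan-out to cover `entries` keys."""
    depth, capacity = 0, 1
    while capacity < entries:
        capacity *= fanout
        depth += 1
    return depth


def worst_case_lookup_ns(entries: int, fanout: int,
                         miss_ns: int = DRAM_MISS_NS) -> int:
    """Pessimistic lookup cost: one memory miss for every level visited."""
    return tree_depth(entries, fanout) * miss_ns


if __name__ == "__main__":
    for label, fanout in (("binary search", 2), ("B-tree, fan-out 64", 64)):
        cost = worst_case_lookup_ns(BLOCKLIST_SIZE, fanout)
        share = 100 * cost / BUDGET_NS
        print(f"{label}: {tree_depth(BLOCKLIST_SIZE, fanout)} levels, "
              f"{cost} ns, {share:.2f}% of the budget")
```

Under these assumptions the lookup itself uses well under 1% of the millisecond, which is the point of the comment. It says nothing about how many such lookups per second a 100Gbps link would demand.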
With so many illiterates (Score:3, Interesting)