The Internet Security Your Rights Online

Tool Detects "In-Flight" Webpage Alterations 197

TheWoozle writes "In a follow-up to a recent story about ISPs inserting ads into web pages, the University of Washington security and privacy research group has teamed with the International Computer Science Institute (ICSI) to develop an online tool to help you identify if your ISP is inserting ads or otherwise modifying the web pages you request."
This discussion has been archived. No new comments can be posted.

  • by nokilli ( 759129 ) on Wednesday July 25, 2007 @11:29AM (#19983869)
    If that isn't desirable, do a patch to Apache that creates a header that holds a hash of the content.
    The hash gets calculated once for static content, which is usually the bulk of the traffic, no? So
    not too big of a hit.

    Browser sees content. Browser sees hash. Browser compares the two...
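A rough sketch of the "calculated once for static content" part, written for Node.js rather than as an actual Apache patch. The header name `x-content-hash`, the function names, and the in-memory cache are all assumptions made for illustration; a real server would also have to drop a cached entry whenever the file changes.

```javascript
const nodeCrypto = require("crypto");

const hashCache = new Map(); // path -> hex digest of the file body
let hashComputations = 0;    // counter kept only to show that the cache is used

// Compute the digest the first time a path is served, then reuse it.
function hashForStaticFile(path, body) {
  if (!hashCache.has(path)) {
    hashComputations += 1;
    const digest = nodeCrypto.createHash("sha256").update(body, "utf8").digest("hex");
    hashCache.set(path, digest);
  }
  return hashCache.get(path);
}

// Build a response object carrying the digest in a (hypothetical) header.
function respond(path, body) {
  return { headers: { "x-content-hash": hashForStaticFile(path, body) }, body };
}
```

Serving the same path repeatedly costs one hash computation in this sketch, which is the "not too big of a hit" argument; dynamic pages would still need a fresh hash per response.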

    --
    Censored [blogspot.com] by [blogspot.com] Technorati [blogspot.com] and now, Blogger too! [blogspot.com]
    • Frames (Score:3, Insightful)

      by benhocking ( 724439 )
      What if the ISP is simply putting the web-page in its own frame, and the advertisement in a second frame? Unless you add the ability for web-pages to dictate that they should not be in frames, this one can't really be trapped for like that. The ISP could create its own hash for the served web-page that holds the frames.
      • by XanC ( 644172 )
        While I'm not sure why frames are any different from whatever other kind of content modification, you're right that the ISP could modify the hash, so GP's idea apparently won't help. SSL would...
      • I don't think they could manage to do this without it being obvious to the user. Frames aren't exactly subtle.
      • Re: (Score:2, Interesting)

        by mdm-adph ( 1030332 )

        ...Unless you add the ability for web-pages to dictate that they should not be in frames, this one can't really be trapped for like that...

        <script type="text/javascript">
        // If this page is not the top-level window, it has been framed: break out.
        if (top !== self)
        {
          top.location.href = document.location.href;
        }
        </script>
        That should do it. ;)
        • Re: (Score:2, Interesting)

          by VGPowerlord ( 621254 )
          If the ISP is inserting it into a frame on the fly, you've successfully created a page that will continually try to reload itself, as it will never be the topmost ancestor.
      • Re: (Score:2, Interesting)

        by ixl ( 811473 )
        The hash would have to be signed by the originating website. So the frame would be detected, because the hash wouldn't be signed by the domain name that created the other content. Browsers could also display (at least) a warning when an unsigned frameset included a signed frame.
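A minimal sketch of what "signed by the originating website" could look like, using Node.js and an Ed25519 key pair. The choice of Ed25519, the function names, and the key pair generated on the spot are assumptions for illustration only; the thread does not say how a browser would obtain or trust the site's public key.

```javascript
const nodeCrypto = require("crypto");

// Stand-in for the originating site's key pair (hypothetical). A real site
// would keep the private key on its server and publish only the public key.
const { publicKey, privateKey } = nodeCrypto.generateKeyPairSync("ed25519");

// Site side: sign the page body with the private key.
function signPage(body) {
  return nodeCrypto.sign(null, Buffer.from(body, "utf8"), privateKey);
}

// Browser side: check the signature against the site's public key.
function pageIsAuthentic(body, signature) {
  return nodeCrypto.verify(null, Buffer.from(body, "utf8"), publicKey, signature);
}
```

An intermediary that wraps or rewrites the page can recompute a plain hash, but without the private key it cannot produce a signature that passes `pageIsAuthentic`.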
      • Re: (Score:2, Interesting)

        > What if the ISP is simply putting the web-page in its own frame, and the advertisement in a second frame?

        What if we just jail the billionaires who own the ISPs for altering the copyrighted content of web pages?

        A 99.9999997183% decrease in salary for hours worked accompanied by a change in lovers from Big Boobs to Big Bubba might be just what the doctor ordered.
    • by Bigby ( 659157 )
      What if the ISP, having the server's (Apache HTTPD) code, recomputes the hash in the same manner.

      Browser sees content. Browser sees hash. Browser compares the two...gets an OK.
      • by eheldreth ( 751767 ) on Wednesday July 25, 2007 @01:45PM (#19985897) Homepage

        What if the ISP, having the server's (Apache HTTPD) code, recomputes the hash in the same manner. Browser sees content. Browser sees hash. Browser compares the two...gets an OK.

        1.) Claim the hash is to protect the copyright on your site
        2.) Sue any ISP that alters the site without permission under the DMCA
        3.) ???
        4.) Profit!
    • But the ISP would just need to alter the header with the new hash for the adulterated page (which he can calculate as easily as the browser can). Also, this is no good for Ajax...
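To make this objection concrete, here is a sketch of an intermediary that rewrites the body and then fixes up an unsigned hash header. The header name `x-content-hash` and all function names are hypothetical.

```javascript
const nodeCrypto = require("crypto");

function sha256Hex(text) {
  return nodeCrypto.createHash("sha256").update(text, "utf8").digest("hex");
}

// What the origin server sends: a body plus an unsigned hash header.
function originResponse(body) {
  return { headers: { "x-content-hash": sha256Hex(body) }, body };
}

// What a rewriting intermediary can do: append an ad, then recompute the header.
function injectAd(response, adHtml) {
  const body = response.body + adHtml;
  return { headers: { "x-content-hash": sha256Hex(body) }, body };
}

// The browser's naive comparison of body against header.
function naiveCheck(response) {
  return sha256Hex(response.body) === response.headers["x-content-hash"];
}
```

In this sketch `naiveCheck` returns true for the altered response as well as for the original, which is why a bare hash in a header detects accidents but not a deliberate rewrite.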
    • by J'raxis ( 248192 )
      Not just a hash, but a message digest [google.com].
  • Do ISPs really do this? I've never really noticed anything like this.
    • Re: (Score:2, Interesting)

      by Anonymous Coward
      My hosting service (the University of Minnesota) sticks a little legal disclaimer (some h5 tags) in a contrasting color at the bottom of every HTML page it serves for non-official accounts. It's the typical "The University of Minnesota is not responsible for the content...blah blah blah" message.
      • by HeroreV ( 869368 )
        That's hosting. This is about ISPs.
      • by Knara ( 9377 )
        That's different. U of MN does that on student accounts that are hosted on their servers. This is basically the ISP you have network connectivity to the Internet through, intercepting an html page on its way to your browser, re-writing it to include their ads, and then sending it the rest of the way to your computer.
    • Do ISPs really do this? I've never really noticed anything like this.

      None! None whatsoever. No carrier would do that, because it would be unseemly.

      [ARE YOUR SEEMS TOO TIGHT? YOU NEED ACME MAGIC WEIGHT LOSS SUPER DIET GINGER ROOT SUPPLEMENT!]
  • When was the last time I saw an ad of a rival to Verizon in my verizon dsl line, I wonder.
  • by db32 ( 862117 ) on Wednesday July 25, 2007 @11:32AM (#19983923) Journal
    Do we sue the ad folks for inserting ads and stealing content? I mean, in just about any other medium this would wind up in court overnight as copyright and stolen content and so on. But now we have a circumvention tool to detect it...so are we going to get sued under DMCA like nonsense for attempting to circumvent the ad insertion?

  • I like UW and their tools. I think they've done wonderful work. Paint.NET is fun, easy, and I love that they are still working on it.

    Who/what is able to add to your pages:

    • Host ISP
    • browser
    • plug-ins
    • End User ISP? - in other words, your hosting ISP most definitely can add to your page. But can the end user's ISP insert content into the stream as it passes through? Technically, this would be feasible. Are there examples of this?
    • Why not? One of the win32 desktop firewalls (zone alarm pro? IIRC) would modify HTML code on the fly to remove javascript calls to window.open and a few other things to "stop popups"... I guess that the viewer's ISP could easily run everything thru a transparent proxy and either change the page or change the images (see "upside down internet", etc.)
  • by proverbialcow ( 177020 ) on Wednesday July 25, 2007 @11:33AM (#19983929) Journal
    ISPs intercepting, altering results from online security tool
    • Lest you think I'm merely joking, FTFA:

      Caveat 2: Our integrity checking mechanism is not cryptographically secure. If a "party in the middle" were modifying web pages that you visit, it could modify our scripts as well. Instead, our mechanism acts as a "tripwire" that is likely to catch any party that is currently unaware of our experiment. In the future, we could create a huge number of variants on the JavaScript tripwire. This would make it more difficult for a "party in the middle" to reliably determine
    • by nweaver ( 113078 ) on Wednesday July 25, 2007 @12:08PM (#19984493) Homepage
      We are specifically worried about this case. But we have some thoughts on how to make it more difficult for someone to do that, which will probably end up in a full paper later.
      • Why not simply sign the pages with your private key, rather than just a hash, and distribute the public key over HTTPS. For bonus points, do this with an embedded javascript file that is referenced by an HTTPS URL (and if you're doing this, you could even embed the key in the script), to prevent that being modified. This way you can still distribute most of your contents over plain HTTP and use client-side verification for everything else.
      • "But we have some thoughts on how to make it more difficult for someone to do that, which will probably end up in a full paper later."

        Here is my paper:

        Use SSL.

        Thank you very much for coming. Join us for coffee and danishes in the back.
        • That doesn't really solve the problem. The tool is designed to detect whether or not your ISP is futzing with the HTML it's carrying. Even supposing SSL would make it impossible to corrupt, if the ISP couldn't alter the page if it tried, the tool would register a false negative. You would need to be able to allow the ISP to alter the page, but have the tool be able to detect and report that corruption in some inalterable way.

          What makes this so tricky is that you can't trust ANYthing the ISP is sending to yo
  • by nweaver ( 113078 ) on Wednesday July 25, 2007 @11:33AM (#19983931) Homepage
    We (the authors of the page) will be answering questions in this thread.
    • Make a package that can be used as a simple drop-in for a website to detect this. If enough websites implement something that alerts users that the webpage was altered, ISPs will be forced to stop doing this.
      • by Qzukk ( 229616 )
        ISPs will be forced to stop doing this.

        That, or ISPs will work harder to defeat the detection.
        • That is a war that this package will win - probably with some cryptographic checks in version 2.0.
          • Eve can perform key exchange with Alice and with Bob, making them think they have performed key exchange with each other.
            • It isn't a question of doing a key-exchange, it's a question of authentication. I'd probably go for getting a checksum from an https:// page. Verify the certificate of the server, and you've got a valid checksum to use. The ISP would then have to pretend to be your https server to break this.
            • I don't think so.

              The dirtier the ISP has to play to smuggle their ads in, the worse the backlash. Come on - some inserted ads are simply unethical. But if the ISP starts breaking into SSL connections and someone finds out (and they WILL), the ISP is in for a big lawsuit. They may even be committing fraud.
          • Not quite... (Score:4, Interesting)

            by nweaver ( 113078 ) on Wednesday July 25, 2007 @02:44PM (#19986705) Homepage
            This is a war however which we can make damn difficult by using virus-like mutation techniques, so that every checker looks different: force THEM to solve the AV defender arms race.

            This works as long as the actual API used by the JavaScript is common enough that the ad-injectors can't recognize and block our code by keying in on the API calls rather than the overall JavaScript.

            The proper solution, adding integrity checking to all HTTP, seems like it's not happening.
            • by nweaver ( 113078 )
              Also, detecting the absence of the checker is insufficient, because Javascript might be turned off.
            • The proper solution, adding integrity checking to all HTTP, seems like it's not happening.

              True.

              Sad, but true.
            • Wouldn't a simple solution be to send traffic through https? The protocol exists, all major browsers support it. Some low-end machines might have trouble doing all of the cryptography in addition to page rendering, but multicores and dedicated crypto hardware are both becoming common and could change that.

              After all, encrypted traffic looks like a stream of random numbers to the ISP, right? Hard to modify.
              • Two problems:

                1. SSL is reasonably easy on the client side, but serving SSL still causes a significant performance hit on a server. A server that could handle 5000 requests per second with normal HTTP might only be able to handle 1500 with SSL enabled.
                2. Digital certificate issuance is a horrible mess. If you don't get a certificate from one of a very few issuers, browsers will flip their shit and tell the end user that your page is horribly insecure. Becoming an issuer in that category is possible, but Microso
    • Re: (Score:2, Funny)

      by Anonymous Coward
      Hi,

      What is your favorite flavor of ice cream?
    • Without reading the article (a slashdot tradition), why would your service be any better than using SSL? SSL was designed to detect alterations in content, and has been around for ages.
      • Re: (Score:3, Informative)

        by nweaver ( 113078 )
        Because people don't use SSL, and ISPs are actively inserting ads into web pages.

        And click the link anyway; we want to have as many people try it as possible.
        • by jZnat ( 793348 ) *
          People don't go to online banks? Or shop online? Or read their email online? *shock!*

          This is the 21st century where cryptography is common...
          • Re: (Score:2, Insightful)

            by EvanED ( 569694 )
            Oh c'mon. You're looking at the uncommon case. Do you really want to suggest that even a sizable minority of the sites you visit on a daily basis use HTTPS?

            I visit my banking site a couple times a week. I shop online a couple times a month. I read email online more commonly, but not *that* commonly from a web browser.

            By contrast, I visit /. several times a day, I visit Fark a couple times a day, I visit a couple blogs a time or two a day, I visit CNN a couple times a day, I visit a couple other forums a cou
    • Have you found that these services are applying modifications to requested pages that specifically state that they should not be cached, using the no-cache option [w3.org]? Have you found these modifications to also apply to AJAX requests?
      • by nweaver ( 113078 )
        We do not check for either of those cases (yet).
        • Re: (Score:3, Informative)

          by csreis ( 1132205 )

          Actually, our test page happens to answer these questions, to some extent.

          All of our test pages are marked with "Pragma: no-cache" and "Cache-control: no-cache" in the HTTP response headers, but we're observing changes to the pages anyway.

          Our integrity checking mechanism uses AJAX requests (XmlHttpRequests) to fetch the test page. ISPs can't distinguish between an AJAX request and a normal page request (i.e., they both look like normal HTTP requests), so they inject ads into both. However, we're

          • Re: (Score:3, Insightful)

            by Compholio ( 770966 )

            ISPs can't distinguish between an AJAX request and a normal page request (i.e., they both look like normal HTTP requests), so they inject ads into both.

            Under normal circumstances AJAX and "normal" requests are the same; however, XMLHttpRequest has a "setRequestHeader" method that can be used to set additional headers. This is significant in that HTTP/1.1 states:

            The Cache-Control general-header field is used to specify directives that MUST be obeyed by all caching mechanisms along the request/response chain.

            You'v

  • by maggard ( 5579 ) <michael@michaelmaggard.com> on Wednesday July 25, 2007 @11:34AM (#19983951) Homepage Journal

    No need for thousands of "All good in Kalamazoo" & "Up to date in Kansas City" posts.

  • by Spy der Mann ( 805235 ) <`moc.liamg' `ta' `todhsals.nnamredyps'> on Wednesday July 25, 2007 @11:43AM (#19984119) Homepage Journal
    A friend of mine had a similar problem with his webpages. They were on a free host (rolls eyes). I wrote a script for him to store special tags to denote the beginning and the end of his webpage content. After the webpage was loaded, a script erased everything and replaced all the html with his marked content. Ta-da, no ads!

    If you want to be stricter, encode your webpage content with base64 to make sure the ads don't intrude on your precious content.
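A small sketch of the marker-plus-base64 approach described above. The marker strings and function names are made up for illustration, and `Buffer` is Node.js-specific; a script running in the page itself would need the browser's own base64 functions instead.

```javascript
// Hypothetical marker strings; any pair unlikely to appear in injected markup would do.
const START = "<!--CONTENT-START-->";
const END = "<!--CONTENT-END-->";

// Wrap the real content, base64-encoded, between the two markers.
function wrapContent(html) {
  return START + Buffer.from(html, "utf8").toString("base64") + END;
}

// Recover the original content from a page that may have had markup added
// before or after the marked region. Returns null if the markers are missing.
function extractContent(page) {
  const from = page.indexOf(START);
  const to = page.indexOf(END, from + START.length);
  if (from === -1 || to === -1) return null;
  const encoded = page.slice(from + START.length, to);
  return Buffer.from(encoded, "base64").toString("utf8");
}
```

The base64 step means anything inserted inside the marked region would corrupt the encoding rather than silently render, while anything added outside the markers is simply discarded when the page is rebuilt.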
    • by Raistlin77 ( 754120 ) on Wednesday July 25, 2007 @12:01PM (#19984411)
      I'll bet that his user agreement with that free host also clearly states that circumventing their added content in the manner that your script does is prohibited. If they discover your script, they'll likely disable his account.
    • A friend of mine had a similar problem with his webpages. They were on a free host (rolls eyes).

      Sounds like someone's being a cheapskate. Paid hosting can be had where you get your own virtual server for $1 a month, though a domain name is extra. For as little as that costs, it's almost not worth any time dicking around trying to counter your free host's means of hosting his site.
    • by Excors ( 807434 ) on Wednesday July 25, 2007 @12:23PM (#19984737)
      For sites like GeoCities that add

      </object></layer></div></span></style></noscript></table></script></applet>(...adverts...)
      to the bottom of your page to stop you trying to hide their adverts, it could be good to add <plaintext style="display: none"> to your page just before the point where they add their junk. plaintext is the unstoppable monster [htmlcodetutorial.com] of HTML – there is no closing tag, and the rest of the page will be treated as plain text instead of HTML. It's a slightly obscure feature, but it has better support between web browsers than many other parts of HTML and it can be fun to play with...
    • by HeroreV ( 869368 )
      You're confused. You mentioned advertisements being inserted by the host of the website, but this is about ISPs adding ads to pages they do not host.
  • International Computer Science Institute (ISCI)
    It's ICSI. Pronounced Ee-ksee. It's where they exile you if you're not nerdy enough for Berkeley Computer Science proper, or something ;)
  • by NeoTerra ( 986979 ) on Wednesday July 25, 2007 @11:51AM (#19984249)
    A certain ISP in Canada [userfriendly.org] dealt with this not long ago...
  • I've wondered about this for a while as a way to defeat XSS attacks, but would adding some way to sign the content in an HTML response be beneficial here? You could use your SSL cert to simply add a signature to the response body for content transmitted over HTTP. A way to tell the browser to expect the signature, one that the ISP can't strip out, may be problematic though.

    The XSS idea would be to have the ability to have multi-part responses from the web server. The browser would put the page together fr
  • I can think of one way to do it - but it wouldn't be too hard for a determined ISP to defeat:

    Step 1: Calculate md5sum of webpage, store in separate location.
    Step 2: Include on the webpage some javascript to md5sum itself and compare this to md5sum in known location. Issue an alert if it differs.
    Step 3: Profit!
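Steps 1 and 2 can be sketched as follows, using Node.js for the hashing. The function names and the idea of passing the stored digest in as a plain string are assumptions; where the "known location" lives and how the script fetches it are left open, as they are in the comment.

```javascript
const nodeCrypto = require("crypto");

function md5Hex(text) {
  return nodeCrypto.createHash("md5").update(text, "utf8").digest("hex");
}

// Step 1 produces knownDigest from the page as published (stored elsewhere).
// Step 2 compares it with the digest of the page as actually received.
function checkPage(receivedHtml, knownDigest) {
  return md5Hex(receivedHtml) === knownDigest
    ? "page unchanged"
    : "ALERT: page differs from the published version";
}
```

MD5 is used here only because the comment names md5sum; the same comparison works with any digest function.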

    Of course, this is awkward for dynamically generated pages and if the ISP is happy to mess around with the page to insert ads, they're probably also happy to mess around with any javascript which dete
  • by SeanTobin ( 138474 ) * <byrdhuntr AT hotmail DOT com> on Wednesday July 25, 2007 @11:58AM (#19984377)
    It seems that everyone is concerned about downstream modification, and is completely ignoring the possibility of upstream modification. What if Sprint [verizon.com] started modifying upstream http [amazon.com]-posts to start a more viral ad distribution system? Not only would they be able to target their customers [barnesandnoble.com], they would also be able to target the customers of anyone who could read the post!

    This is the reason that we need to push for network neutrality [handsoff.org]. When the only choices are between a giant douche [summerseve.com] which alters content and a turd sandwich [panerabread.com] which alters content, the customer ends up screwed [lowes.com] in the end.

    • What if Sprint [verizon.com] started modifying upstream http-posts to start a more viral ad distribution system?

      Not for nothing, but I'd imagine Sprint [sprint.com] would be more likely to insert an ad for Sprint [sprint.com] than an ad for Verizon [verizon.com].

      Then again, maybe Verizon is your carrier... so maybe you would be directed to Sprint at Sprint [verizon.com].

    • That is an utterly preposterous idea. Not even the most depraved ISP would resort to that.

      Get your free 500 HOURS OF INTERNET [aol.com] today!
  • by ookabooka ( 731013 ) on Wednesday July 25, 2007 @12:12PM (#19984559)
    These guys actually want as much traffic as they can get to get a good idea of what isps are doing what. Go ahead, click online tool. [washington.edu] It's pretty nifty.
  • Old stuff. (Score:4, Interesting)

    by TheLink ( 130905 ) on Wednesday July 25, 2007 @12:31PM (#19984845) Journal
    Years ago on one April Fool's day, I got a list of ad sites (from the usual /etc/hosts files out there), then got the internal DNS server to resolve them to a server that served up the company logo instead (for all possible url paths).
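The list-to-override step might look something like this. It emits hosts-file style lines purely as an illustration; the comment used an internal DNS server, whose configuration format is not given, and the IP address in the usage note is a made-up placeholder.

```javascript
// Build hosts-file style lines that point each ad hostname at one internal server.
// Blank entries are dropped and hostnames are normalised to lower case.
function buildOverrides(adHosts, internalIp) {
  return adHosts
    .map((host) => host.trim().toLowerCase())
    .filter((host) => host.length > 0)
    .map((host) => `${internalIp} ${host}`);
}
```

For example, `buildOverrides(["ad.doubleclick.net"], "10.0.0.5")` yields a single line mapping that hostname to the hypothetical internal address.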

    FWIW, seemed only one person noticed that the forbes page they loaded somehow had the company logos everywhere :). Nope I didn't get fired or even reprimanded - plus even better - I was saving company bandwidth (remember this was years ago)... Nobody complained about the lack of ads from ad.doubleclick.net and gang.

    I toyed with the idea of substituting ads with reminders (meeting at 2pm, or "you have been on slashdot for 2 hours!") and other more useful information.

    Lastly, I don't think their naive hashing thing checks if you are altering the images - the page content may remain unchanged, but the content it links to may change (that isn't checked, from what I see), so it doesn't work for my scenario where different ads are substituted for the unaltered URL.

    That said, I'm still curious on:
    1) How many ISPs would bother modifying traffic from those 7 destinations they are testing.
    2) What the various laws around the world say about this.
    3) What those laws say about "sponsored internet access" where an ISP gives a cheaper package/plan in which the ads are substituted with those of the ISP's advertisers, with the risk of some corrupted info.
    4) What those laws say about "streamlined internet access" where an ISP provides a package/plan where ads and other crap are removed (or modified) for their customer.
  • by Sloppy ( 14984 ) on Wednesday July 25, 2007 @01:16PM (#19985511) Homepage Journal

    ..why not just use SSL?

    I can understand how this wouldn't help with hosting ISPs who insert ads into their own customers' pages, but if you're worried about your readers' ISPs modifying your pages, SSL seems like a no-brainer.

    What's the downside? It can't still be CPU, can it? It's 2007 now, and processing power is ridiculously cheap/fast.

    • by nweaver ( 113078 )
      As I mentioned in another reply:

      a) Certificates are a pain and cost $$$

      b) CPU isn't free. It costs ~.1 CPU second to do an SSL handshake. This is actually a big-deal amount, there is a reason why Gmail defaults to http except for authentication.
      • 0.1CPU-second to do an SSL handshake seems like a huge amount, since this would limit you to ten connections a second. Considering that hardware crypto devices that can handle SSL at over 100Mb/s exist, I'm really surprised that the numbers are that high. Can you cite the system you used to test this? Considering that CPU speeds have been doubling roughly every 18 months for a while, I find it somewhat hard to believe that five years ago it was taking around a whole CPU-second to do the handshake. If th
        • by Cheesey ( 70139 )
          Big sites do use hardware SSL crypto processors.

          Despite Moore's law, etc., the cost of RSA operations on general purpose CPUs is very high. Not a problem on the client side, but on the server side, one hardware accelerator could save you thousands of general purpose CPUs.
        • by nweaver ( 113078 )
          I don't have the stats for SSL, but for a simple SSH it takes .17 seconds to do a simple handshake, authenticated login, echo, exit (to a system that is .9 ms RTT latency), while only .06 seconds to do a shell script locally of "echo ".

          So I'm a little high on my .1 CPU second, but it's not THAT far off. RSA in software is slow...

          The bulk encryption is cheap, but that's another story.
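The arithmetic behind these figures, as a quick sketch. It treats the quoted seconds as if they were entirely CPU time on one core, which is an assumption: the .17 s SSH figure is an elapsed time that also includes network round trips, so it overstates the crypto cost.

```javascript
// Upper bound on full handshakes per second for one core,
// given the CPU seconds each handshake consumes.
function handshakesPerSecond(cpuSecondsPerHandshake) {
  return 1 / cpuSecondsPerHandshake;
}

// Figures quoted in the thread.
const claimed = 0.1;              // "~.1 CPU second" per SSL handshake
const sshMinusEcho = 0.17 - 0.06; // SSH round trip minus the local echo baseline

console.log(handshakesPerSecond(claimed).toFixed(1));      // prints "10.0"
console.log(handshakesPerSecond(sshMinusEcho).toFixed(1)); // prints "9.1"
```

Either way the estimate lands at roughly ten new handshakes per second per core, which is the scale the objection about server cost rests on.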
          • I just did a similar test to yours, and I got this:

            time ssh localhost echo foo
            foo

            real 0m0.399s
            user 0m0.066s
            sys 0m0.017s

            This is on an old, 1GHz Athlon. The real time is quite high, but the machine is doing a few other things, so much of that is time spent on other tasks. Since the crypto is all done by SSH, not by a hardware device, it must be part of the 'user' time. This gives 0.066s on an old machine doing the client part of SSH. A decent server chip should be able to do the same thing in 0.01

  • Doesn't just using HTTPS as the protocol to retrieve pages at URLs make the server sign the code, and encrypt it so no middlemen can change it "in flight"? I guess if the HTTPS server is controlled by the ISP, the server just signs the altered pages. But what kind of downstream test can stop that?

"Protozoa are small, and bacteria are small, but viruses are smaller than the both put together."
