Privacy Security The Internet

Involuntary Geolocation To Within One Kilometer

Posted by Soulskill
from the proxy-stock-rising dept.
Schneier's blog tips an article about research into geolocation that can track down a computer's location from its IP address to within 690 meters on average without voluntary disclosure from the target. Quoting: "The first stage measures the time it takes to send a data packet to the target and converts it into a distance – a common geolocation technique that narrows the target's possible location to a radius of around 200 kilometers. Wang and colleagues then send data packets to the known Google Maps landmark servers in this large area to find which routers they pass through. When a landmark machine and the target computer have shared a router, the researchers can compare how long a packet takes to reach each machine from the router; converted into an estimate of distance, this time difference narrows the search down further. 'We shrink the size of the area where the target potentially is,' explains Wang. Finally, they repeat the landmark search at this more fine-grained level: comparing delay times once more, they establish which landmark server is closest to the target."
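The first stage described in the quote — turning a round-trip time into a coarse distance bound — can be sketched in a few lines. This is a toy illustration, not the researchers' actual code; the ~0.6c fiber speed and the overhead figure are assumptions.

```python
# Rough sketch of the first-stage bound: a round-trip time puts an upper
# limit on how far away the target can be. Signal speed in fiber is
# assumed to be roughly 200 km/ms (about 2/3 of c).
SPEED_IN_FIBER_KM_PER_MS = 200.0

def max_distance_km(rtt_ms: float, overhead_ms: float = 0.0) -> float:
    """Upper bound on one-way distance implied by a round-trip time."""
    effective = max(rtt_ms - overhead_ms, 0.0)
    return (effective / 2.0) * SPEED_IN_FIBER_KM_PER_MS

# A 4 ms RTT with an assumed 2 ms of processing overhead bounds the
# target to within 200 km -- the coarse radius mentioned in the article.
print(max_distance_km(4.0, overhead_ms=2.0))  # 200.0
```

Note this only yields a radius; the landmark comparison in the second and third stages is what shrinks the circle down toward the ~690 m average.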


Comments Filter:
  • by Hazel Bergeron (2015538) on Friday April 08, 2011 @09:30AM (#35756400) Journal

    I don't know about your internet, but mine involves alternative routes to a particular physical location. Not just because that's how the Internet works, but because there are competing providers. And there are all sorts of things which delay, from WiFi to pipe congestion to intentional prioritisation to the OS having something more interesting to do.

    Although I should have stopped reading at "time it takes to send a data packet to the target" - really? How does one measure this precisely?

    • Re: (Score:3, Funny)

      by j00r0m4nc3r (959816)
      My internet is just a series of tubes, so all you need to do is measure the distance the hamster travels in the tube. Simple.
      • by thomasdz (178114)

        My internet is just a series of tubes, so all you need to do is measure the distance the hamster travels in the tube. Simple.

        My internet is also a series of tubes, but I think mine use compressed air to send messages around...so I think you must have "dial-up" and I must have that "high speed broadband".

      • my sex partner is just a series of tubes. coincidentally, a hamster is also involved

      • by rcamans (252182)

        Hamsters? I want my internet upgraded to Hamsters. All I got were worms.

        • Despite the advertising claims to the contrary, my Internet line appears to be turtles, all the way down.
      • Re:implications (Score:4, Informative)

        by cgenman (325138) on Friday April 08, 2011 @11:03AM (#35757710) Homepage

        It's easier than that. Just figure out how much energy a hamster consumes walking a mile in the tubes. Weigh them when you send them out, and weigh them again when they come back.

    • it's reporter-speak for a ping

      you could do this on a webpage with some fairly innocuous javascript that keeps track of timestamps and reports back

      and yes, if you have alternate routes, this method fails. except that describes only 0.1% of internet users. for your average bloke with a cable modem opening a webpage with a speck of seemingly harmless javascript, this method should work fairly reliably
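The browser side of what this commenter describes would be JavaScript firing timed XMLHttpRequests, but the server-side step — collapsing a few dozen noisy timestamps into one usable RTT — can be sketched language-agnostically. This is a hypothetical illustration of the idea, not anything from the paper; the sample values are made up.

```python
def estimate_rtt(samples_ms):
    """Collapse noisy RTT samples into one estimate.

    The minimum is often preferred over the mean for this, because
    queueing and congestion only ever ADD latency -- the floor of the
    samples is closest to the true propagation delay.
    """
    return min(samples_ms)

# Made-up timings from a burst of AJAX round trips:
samples = [42.1, 38.9, 55.0, 39.2, 38.8, 71.3]
print(estimate_rtt(samples))  # 38.8
```

Taking the minimum is what makes the "congestion breaks it" objections below weaker than they look: a few clean samples in the burst are enough.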

      • by Shados (741919)

        Bingo. I see a lot of people already going "BUT BUT THIS DOESN'T WORK WHEN (insert edge case here)."

        Even if this is only ~70% reliable, it would still be a marketing gold mine, where accuracy is very low to begin with and relies heavily on loose estimation.

        • by poetmatt (793785)

          70%? I wouldn't even gamble on it being reliable information beyond its use as a ping. 1 kilometer can be a small range or a huge range depending on population density and whether the area is urban or rural.

          • you have a speck of javascript on a webpage that opens an XMLHttpRequest (AJAX) and sends a series of overlapping timestamps. you could have a couple dozen samples in the time it takes you to read this comment, average them out on the server side, include some more sophisticated methods taking into account other extraneous measurements like traffic estimates for time of day and general location, type of modem/ internet provider, etc, and get a genuinely reliable lock for any average web user sitting on any average

          • by cgenman (325138)

            If it increases marketing responses by even 0.1%, you know it will be standard on every single web ad served up in three years.

          • by jonbryce (703250)

            If it is to send an ad for a local pizza delivery service, 1km is close enough wherever you are.

        • by JWSmythe (446288)

          I think that's why they said they could get the IP within 690 meters on average.

          You have to figure in that Google does plenty of data mining. Consider what they know about so many users. They know the name, address, phone number, and a bunch of demographics on a lot of users.

          Consider if Person A was to be located by Google. He comes from a particular subnet on a large ISP. They already know that recently active users on that subnet give a physical/mailing add

          • it's a given google pretty much knows more about the average bloke than the average bloke knows about himself

            but this research demonstrates a way anyone can piggy back on google's servers and get that info for themselves as well, which ups the creep factor considerably

            furthermore, with triangulation of servers, and a bunch of pings over time, i bet you could refine the results considerably, down to one location

            it's one thing for google, some advertiser, or the feds to be able to locate you by ip. it's anothe

            • by JWSmythe (446288)

              You know, I totally misread the article the first time around, and saw it as saying that it was a Google project.

              Triangulation doesn't really do much for you. You have to consider the routes used. I ran a side project at one job for a while, which mapped routes between our own points. Well, there is a full description here. [jwsmythe.com] In doing this, we had traceroutes run about once every 5 minutes.

              I had more detailed reporting that wasn't shown in the portfolio.

      • My ISP is 100 miles from where I am ... and I am not on wireless ... oh, you appear not to be anywhere near right ...

    • Re:implications (Score:5, Interesting)

      by Rinisari (521266) * on Friday April 08, 2011 @09:43AM (#35756556) Homepage Journal

      There was that story a while back about some physicists figuring out that they couldn't send email more than 500 miles [ibiblio.org].

      Back on topic, I'll bet VPNs throw wrenches in their methods.

      • by killmenow (184444)
        You know what else throws a wrench in their methods? Seven proxies.
      • Similarly, by looking at my ping times, it's possible to show that I am no more than six thousand kilometers from my ISP. I'm not sure that's good enough to find my street.

    • by jpapon (1877296)
      I don't see your point. It's very simple to measure the time it takes a packet to get somewhere and back.

      You seem to be under the impression that they're simply taking the speed of light and dividing by the delay to get distance. That is, of course, not what they are doing at all.

    • by jpapon (1877296)
      I of course meant multiply, not divide.
    • by gstoddart (321705)

      I don't know about your internet, but mine involves alternative routes to a particular physical location. Not just because that's how the Internet works, but because there are competing providers.

      Yeah, but in practice, depending on where you live and how your ISP is set up, you'll probably find the address allocated to your cable modem is fairly static, or at least consistently within a range. I suspect that if you're in a fairly major center this is already fairly well established.

      Fairly consis

    • by jhoegl (638955)
      Each packet sent has a time association with it.
      You do a packet capture on one end, and a packet capture on the other.
      Ping is not needed.
    • I assume that like most places, the cables aren't direct lines from A to B, so an accurate judge of distance seems hard to do. Cable length, perhaps... but coiled wires, vertical spans, and other runs of cable would seem to skew the judge of distances based on packet times. Am I wrong? Wouldn't that at least introduce a large margin of error? What about packet buffering?
      • by jd (1658)

        ICMP isn't significantly buffered (although all packets are buffered to some extent) and the law of large numbers suggests that the cable length issue will be the same for all possible paths given enough hops and enough paths, so will simply fall out of the equation given enough directions. You couldn't use triangulation on two paths, but the errors caused by such variation should fall off (albeit asymptotically to some minimum error - which seems to be 1 km) as the number of paths increases.

        My guess is that, in prac

    • by jd (1658)

      There haven't been competing providers for a VERY long time. Not in any serious sense. Most of the Internet is one gigantic spanning tree with no redundant connections anywhere. Because of a design flaw in the BGP4+ protocol, alternative routes can also cause router flaps.

      As for your other points, use Pathchar or PChar some time. It reports to you not only the time it takes to bounce packets but the pipe congestion at each link as well. You also want to look up "Internet Weather", which reports the overall

    • by guruevi (827432)

      It's basically triangulation with TCP or ICMP packets. It's not based on a single measurement from a single location. Let's say Google (Because they like to play where is waldo with their customers) has 100 datacenters. They measure a couple of times the time it takes from each datacenter over routers with known locations and average delays to your ISP's IP you connect from. They just keep drawing circles and the area that overlaps the most times is most likely the area you're in. Given that you or the rout
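The circle-drawing idea in this comment — many vantage points, each contributing a distance estimate, and the region covered by the most circles wins — can be sketched as a brute-force grid scan. This is a toy illustration on a flat plane with made-up numbers, not a real geographic implementation.

```python
import math

def best_overlap(circles, grid_step=1.0, extent=30.0):
    """Scan a grid and return the point covered by the most circles.

    circles: list of ((x, y), radius) vantage-point estimates, in km on
    a flat plane -- a toy stand-in for real geographic coordinates.
    """
    best, best_count = None, -1
    steps = int(extent / grid_step)
    for i in range(-steps, steps + 1):
        for j in range(-steps, steps + 1):
            x, y = i * grid_step, j * grid_step
            count = sum(1 for (cx, cy), r in circles
                        if math.hypot(x - cx, y - cy) <= r)
            if count > best_count:
                best, best_count = (x, y), count
    return best, best_count

# Three vantage points whose distance estimates all include (10, 10):
circles = [((0, 0), 15), ((20, 10), 11), ((10, 30), 21)]
point, hits = best_overlap(circles)
print(point, hits)  # all three circles overlap near (10, 10)
```

Real systems would intersect the circles analytically rather than scanning a grid, but the "area that overlaps the most times" logic is the same.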

  • Will the same technique work for IPv6?

    • Why shouldn't it? IPv4 and IPv6 are not that different. The only problem is that few web sites are IPv6-enabled currently, so you would have fewer landmark servers.
      • I haven't done a whole lot of reading on IPv6, so I was just curious whether the increased address space leads to any difference in how routing is done. It seems that with unique public addresses and no NAT there would be more direct routes that could be taken, which would potentially mean more routers with the same address in their routing tables, which would mean more targets to check. Then, depending on congestion along various paths, one landmark may *seem* like the closest when in fact it simply

        • Real (not from a tunnel broker) IPv6 is hierarchical. This means that the first half of the address will give you a rough geolocation, and you can use landmark servers with the same prefix to go from.

          The technique should work just fine.

    • by jd (1658)

      IPv6 creates an interesting problem, as it is fundamental to the protocol that you can transition from one ISP to another without loss of any connections and without having to use a packet forwarder. This means that under some circumstances a more accurate picture can be built with enough data (since you have to be on the border of the two ISPs) but equally it means that for the same amount of data the calculation will be less accurate because routing assumptions won't hold up. You're no longer comparing li

    • by aaarrrgggh (9205)

      No, it won't work for IPv6, since the speed of light is so much faster with v6.

  • Used to be, on the Internet, no one knows you're a dog.

    I've been playing a lawyer for a long time, but I guess it's better to disclose before being found out. You heard it here first.

  • by Burdell (228580) on Friday April 08, 2011 @09:36AM (#35756468)

    How do they expect to tell the difference between latency due to distance and latency due to protocols, encoding, etc.? For example, a local T1 might have round-trip latency in the 3-4ms range, while a DSL to the same location might be 10ms (in fast mode, even higher for interleaved). A dialup connection will be much higher, while a metro-ethernet might be less than 1ms. All those times also assume no congestion along the path.

    Since the speed of a signal in single-mode fiber is about 0.6c, each 1 ms difference in round-trip latency gives a 90 km margin of error.
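The 90 km figure is easy to verify: 0.6c covers roughly 180 km per millisecond, and a round trip covers the distance twice, so halve it.

```python
C_KM_PER_MS = 299_792.458 / 1000  # speed of light, km per millisecond

# Signal speed in single-mode fiber is roughly 0.6c (refractive index ~1.5).
fiber_speed_km_per_ms = 0.6 * C_KM_PER_MS       # ~180 km per ms
one_way_error_per_ms_rtt = fiber_speed_km_per_ms / 2  # RTT covers it twice

print(round(one_way_error_per_ms_rtt))  # 90
```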

    • by Dan East (318230)

      Further, the best accuracy you can obtain with DSL, for example, is the radius of area served by a particular station. The DSL latency times per kilometer are in the dozens of microseconds, so it would not be possible to resolve distances within a DSL service area just by millisecond ping times. In my rural area they push DSL out at least 3 miles. So even if you consider "average" as half of that radius, that gives an accuracy of 2,400 meters. I think they claim to narrow that down by the fact that DSL

    • by _0xd0ad (1974778)

      The amount of latency inherent in your connection wouldn't matter, so long as it was fairly consistent. As long as a route of longer distance consistently returned longer ping times than a route of shorter distance, it could be inferred that you're closer to the server which can ping you quicker.

    • by sootman (158191)

      Bruce Schneier is almost certainly a lot smarter than anyone posting on this page so it would be foolish to simply dismiss anything he says out of hand. OF COURSE all the subtle nuances of their work won't fit into a Slashdot summary. Don't you think it's likely that they did some testing and determined that their results had X accuracy Y percent of the time before they published their findings? This isn't just two morons BSing in a coffee shop saying "Hey, I bet we could..." and then publishing a blog post

  • by mbone (558574) on Friday April 08, 2011 @09:40AM (#35756522)

    Seems like this would be easy to counteract (although at the kernel hack level). All you would have to do is introduce a 30-50 msec time variable delay into all new packet sends (i.e., ICMP responses, first packet of a TCP session, etc.).

    In fact, if you encrypt everything, you may get these sorts of delays "for free."

    Also, this will not work well if you are using encrypted tunnels or VPNs to access the web. Your delay then is (tunnel delay) + (tunnel end point to attacker delay) + (encryption delays), so you seem a good deal further away than you really are.
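The proposed countermeasure — a random 30-50 ms delay on every response — can be sketched to show why it works: at ~90 km of error per millisecond of RTT, the jitter inflates every distance estimate by an unpredictable few thousand km. A toy sketch, not a real kernel patch; the 4 ms "true" RTT is made up.

```python
import random

def jittered_rtt(true_rtt_ms: float, rng: random.Random) -> float:
    """RTT an observer would measure if the host delays each response
    by a random 30-50 ms, as the parent comment proposes."""
    return true_rtt_ms + rng.uniform(30.0, 50.0)

rng = random.Random(1)  # seeded only so the sketch is repeatable
measured = [jittered_rtt(4.0, rng) for _ in range(5)]

# Every measurement now implies the host is thousands of km further
# away than a genuine 4 ms RTT would suggest -- and averaging doesn't
# help the attacker, since the offset varies per packet.
print([round(m, 1) for m in measured])
```

Note the delay must be variable: a constant offset would cancel out when the attacker compares delays from a shared router, as the landmark technique does.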

    • The problem with this is that you are further away from *everywhere*. That is, you are further away from all landmarks equally. For all intents and purposes, then, you are saying you are "straight down" from where you really are. Even then, you are only affecting the last leg of the route. You only have limited control over who you directly connect to, and that would seem to provide the maximum bound over which you have control. Of course, if you have a single link to the outside world through your da

  • So, in reality, they figured out a way to use ping responses the way kids at the lake (or pool) play Marco...Polo.

    I wonder how many they had already kicked back when they came up with their idea?

    Don't get me wrong--it's cool tech, but I continue to be amazed by how so many "new" technologies simply mimic things that already exist in other parts of life. Kudos to the researchers. I think I'd rather spend time at the lake.
  • by cavreader (1903280) on Friday April 08, 2011 @09:48AM (#35756610)
    Back in the early '80s, a physics grad student at Berkeley was working in their data center and noticed a discrepancy in user usage statistics and started investigating. He was able to isolate the user ID of the unauthorized user by analyzing the usage statistics. At the time, the user statistics were used for billing computer time. The user was basically trying to use the Berkeley system as a proxy for attacks on other systems. He eventually spliced into the network to intercept packets containing the user ID in question and calculated the amount of time it took for those packets to complete a round trip to determine the geolocation of the person hacking into the system. At first he thought he was wrong, because his calculations based on signal response time said the unauthorized user was 6,000 miles away. He later discovered the calculation was correct and the hacker was located in Germany. He published a book called "The Cuckoo's Egg" with all the details. It is a really good book.
  • 1.. "my connection is too weird/ unique/ confabulated/ etc..."

    yes, but you are 1% of internet users. the average bloke on a cable modem is reliably caught with this method

    2. "there is traffic/ no way to ping/ etc..."

    you have a speck of javascript on a webpage that keeps track of timestamps, opens an AJAX XMLHttpRequest and pings a lot, and the server averages things out. voila: you could get 60 samples in the time it takes you to read this comment, and therefore a good lock on your location

    INCOMING...

    • by black3d (1648913)

      the average bloke on a cable modem is reliably caught with this method

      Well, the average bloke is narrowed down to 1 km. That's still a good 50-100 residential properties, and no way for the "attacker" to know which, so this attack on its own doesn't do much. This coupled with perhaps someone's surname and a telephone book might get a hit for a malicious attacker, but a lot of folks don't list in telephone books anymore. Ahh.. who knows. It might be useful for something. :)

      • by _0xd0ad (1974778)

        the average bloke is narrowed down to 1km, that's still a good 50-100 residential properties, and no way for the "attacker" to know which, so this attack on it's own doesn't do much

        It'd be plenty good for showing him ads for restaurants and stores that he'd probably drive past on a regular basis, though.

        • i think you could do better than that by triangulating with different servers and averaging out over time

          i think law enforcement/ counterterrorism/ etc. could make good use of this methodology. yeah, those guys could just subpoena the ip address, but in time sensitive issues, this is a pretty neat trick

          heck, your average stalker weirdo with access to a number of servers in different farms/ colos either because of his job or just because he's a very committed stalker weirdo could do this

      • Same-Origin-Policy enforcement in the AJAX means the javascript can't hook out to other servers... unless you control 3 or 7 or 37 different servers in different farms/ colos under the same domain name. the distant servers couldn't receive the info, but you could have each server fire in cycle, and have one receiving server take the timestamps in. so with a heavy rotation of pings over a brief period of time, and a bunch of different servers to triangulate ping times over time, and some extraneous inf

    • by mikkelm (1000451)

      How does this get +5, Interesting?

      How far do you think that this "average bloke" on a cable modem is from his CMTS? How far in any other arbitrary direction do you think that another "average bloke" with a CM in the same addressing pool is from the same CMTS?

      • say i control a number of servers under the same domain, and i use a simple script to run many pings quickly. can't i correct for errors and refine the technique researched here and resolve you apart from your neighbor?

        • by mikkelm (1000451)

          No. Not realistically possible even with a single CMTS feeding a single neighborhood.

          Completely impossible is telling your location apart from another customer on the same CMTS, in the same addressing pool, topologically located as far from the CMTS as you are, but in the opposite direction. Unless your electrons carry a compass.

          • ok, thanks, that's useful. i understand what a ring is. so you can narrow it down to 2 possibilities then? i mean a ping time is a ping time, right?

            • by mikkelm (1000451)

              What you can narrow it down to, if you're conducting your delay measurements from an external network, is that the IP address /might/ be leased to a CM that's somewhere within a radius of 20 miles from the CMTS. Then you need to figure out where the CMTS is.

              This kind of accuracy is already being achieved by regular location databases.

              • why doesn't the ping supply info about location past the CMTS? assuming you could lock someone down to a particular CMTS, you could infer what portion of that ping time is due to travel beyond the CMTS to the CM, no? i understand one ping isn't reliable. but if you were talking about a scheme where you were bouncing off a number of servers and averaging out over say, 60-120 pings, with extraneous traffic, time of day, and internet provider recon mixed in, you could have reliable data, no?

                but you are correct

                • by mikkelm (1000451)

                  Ping certainly could provide information about /delay/ past the CMTS, assuming that the delay between the source system and the CMTS is constant and predictable, but you cannot know where the target CM is located past the CMTS merely by examining delay from an external source. One interface on a CMTS can provide service to hundreds of homes, many miles apart, so you have absolutely no way of knowing whether two CMs to which the measured delay is identical are in neighboring houses, or equally far from the C

                  • alright, you schooled me, thanks

                    i assumed that it's just a ring past a CMTS, so you have 2 options, rather than 1. however, you are telling me the topology past a CMTS is more variable. additionally, the most useful piece of info you tell me is that if a neighbor starts downloading a movie, or the other neighbor starts playing WoW, variances in ping time become completely meaningless from one day to another, one hour to another, or even one second to another

                    got it, case closed, this method is useless

  • All the location based adverts I see in the UK (mainly "hot girls in [your area] are waiting for you", but I digress...) seem to centre on the location of my ISP's data centre.

    The only routers visible to the outside world will be upstream of my ISP. Latency might tell someone how far I am from them +/- the distance from my ISP, but last time I looked my ISP blocked ping anyway.

    I would imagine this would apply to the majority of UK DSL users.

  • Good luck, boys, my cable modem is two miles from the house.

    • by drinkypoo (153816)

      Being able to find your repeater is as good as finding you... Now if you have multiple hops with directionals only on your side then it could take them a minute...

      • Yeah, I'm almost 20 devices, 4 houses, and multiple VDSL/802.11 conversions away from the Internet connection. One of the VDSL lines is buried and goes over a ridge.

        But, really, I'd give up any anonymity that provides for a cable or DSL line to the house - doing tech support for your neighborhood after an ice storm sucks.

  • I want to try this out and see how they do. Every other geolocation service I have tried puts me miles from where I am. I take that back: infosniper.com may have gotten it exactly right. They only show the town, but the marker was right on my office.

  • I have DSL. My ISP's closest PoP is over 500 km away in Toronto (I'm in Montreal). My PPPoE session is carried over an L2TP tunnel; my first hop is 500 km away. This is actually a very common scenario for anyone in Ontario or Quebec, since that's how all DSL in the region works. If you're on Bell Canada, your PoP is probably in the same city, but if you're using a wholesaler, it's probably not. Because the lowest possible latency to me is in Toronto, that's where this technique would see me.

    As such, it'd be

  • Note that it is not enough that there is a "landmark" router physically near you, it also has to be near you from a network topology sense. It doesn't help geolocation much if the museum next door has a landmark router if the peering point between your networks is 1000 km away.

    Now, if you are in a city on a major ISP, this is likely not to be a problem. If, on the other hand, you are out in the country, then there is unlikely to be a landmark router near, and if there is one, it is quite possibly on a differe

  • ... and sitting behind the mystical, seven anonymous proxies, the method is useless to find anyone actually smart enough to properly operate a computer.

    I suppose it'll be helpful to find the average user who's playing at cyberstalking or sending threatening emails.

  • Color me skeptical.

    heck i was going for the cheap shot

  • We don't use the metric system in the US! You'll never find me!
  • With this method, they could have finally found that coffee pot.
  • Ok, so he is using ping. Who in their right mind still allows their computer to respond to ICMP requests?
