Microsoft Tracking Behavior of Newsgroup Posters
theodp writes "Ever get the feeling your Usenet newsgroup list is being watched? By Microsoft? If so, consider yourself right. An interesting but troubling CNET interview with Microsoft's in-house sociologist goes into how the software giant is keeping a close eye on newsgroups and other public e-mail lists, tracking and rating contributors' social habits and determining "people who the system has shown to have value." Those concerned that it's not a good idea for computers to track their belongings and whereabouts are advised that they may ultimately have to fragment their identities, keeping multiple IDs and e-mail addresses."
Huh? (Score:5, Interesting)
Who isn't already doing this?
With the advent of spam, most people I know abandoned their first email address years ago. I have one for each service I use (including Slashdot).
I read the article! (Score:5, Interesting)
The AURA just sounds like the CueCat Digital Convergence people who wanted to put a bar code on everything. Again, MS is not the company I'd like to see doing this.
*Rather offtopic, but Digital Convergence used to advertise the CueCat with an 'Angel coming down to earth from heaven to barcode everything', and the well-known Digital Angel RFID people have also made a religious reference in the company's name. The hue and cry of Christians' 'the number of the beast' references begs the question:
Who the hell is doing marketing for these people? I remember getting an icky feeling when I saw the 'infomercial' for the CueCat, and similarly the Digital Angel website. And I'm not the 'churchy' type. I can only imagine what the fundies think...
* This idea is copyrighted. Use of this idea may not be used to more attractively market 'evil' technology, or put a chip in my head. Thanks.
Call me captain conspirator... (Score:2, Interesting)
Troubling? (Score:5, Interesting)
This is good, valid research, the type that applied research CS programs should be doing. This may actually make a difference in a deployed product.
I think we should tone down the M$ and SCO crap for a while.
Multiple addresses won't work (Score:1, Interesting)
I've seen some of the tracking ... (Score:5, Interesting)
The next day he was showing Ben Shneiderman [umd.edu] some of this stuff at the open house. A bunch of us looked on as they chatted, planned visits and golf outings, and talked about how it all worked.
Depending on the queries he gave it, this one program would chew through data from Usenet, give back all kinds of stats, and then draw relationships. It even did graphical representations of users' activity. Density of posts in a single thread versus starting new threads, frequency of posts, replies vs. new messages, etc. would be denoted by distance from the main timeline, darkness and width of the circle, and so forth. You would look at a wide but faint circle and say (and I may be off in how the key worked, but ...) "This guy sticks to the topic over a long period of time", or you could spot the flame warrior or the vagrant by their graphical representation, and so forth. The way the data was processed was really cool, and how quickly you could start to decipher the keys was really interesting.
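For anyone wondering what the raw numbers behind a display like that might look like, here is a minimal sketch. It is not the program from the demo: the `Post` record, its field names, and the `summarize_authors` helper are all made up for illustration, and the real tool surely tracked more than this.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Post:
    # Hypothetical record; field names are assumptions, not the demo's schema.
    author: str
    thread_id: str
    is_reply: bool  # False means the post started a new thread


def summarize_authors(posts):
    """Return per-author counts of posts, replies, thread starts, and distinct threads."""
    stats = defaultdict(lambda: {"posts": 0, "replies": 0, "starts": 0, "threads": set()})
    for post in posts:
        entry = stats[post.author]
        entry["posts"] += 1
        entry["replies" if post.is_reply else "starts"] += 1
        entry["threads"].add(post.thread_id)
    summary = {}
    for author, entry in stats.items():
        summary[author] = {
            "posts": entry["posts"],
            "replies": entry["replies"],
            "starts": entry["starts"],
            "threads": len(entry["threads"]),
            # Average posts per thread: a high value suggests sticking to a topic.
            "posts_per_thread": entry["posts"] / len(entry["threads"]),
        }
    return summary


sample = [
    Post("alice", "t1", False),
    Post("bob", "t1", True),
    Post("alice", "t1", True),
    Post("bob", "t2", False),
    Post("alice", "t1", True),
]
summary = summarize_authors(sample)
```

A drawing layer could then map something like `posts_per_thread` to circle width and posting frequency to darkness, though that mapping is a guess at how the key worked.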
The Big brother implications ... well that's a whole 'nother thing there too isn't it?
This is cool! This headline is utterly unfair! (Score:5, Interesting)
What's being talked about here is reverse engineering trust hierarchies, algorithmically, simply from a discussion corpus extracted from Usenet.
This is very, very cool stuff. It is a hard application of a soft science, and if its results match empirical data, it represents a greater level of understanding about the human mind.
This is something to celebrate and take interest in, not malign simply because it's Microsoft that's behind it.
I do remind the security paranoid that reputation management remains one of the few characteristics obsessively protected in otherwise anonymous systems.
Yours Truly,
Dan Kaminsky
DoxPara Research
http://www.doxpara.com
Re:Troubling? (Score:2, Interesting)
Re:Troubling? (Score:3, Interesting)
Re:Multiple addresses won't work (Score:4, Interesting)
However, if you're posting reviews to Amazon or ePinions your text is likely to have analyzable content.
I know someone who has done this type of analysis and discovered people who reply to their own posts on discussion boards under different IDs to make it look like they had some kind of consensus. When confronted with the analysis, they admitted the ruse.
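The post doesn't say how that analysis was done, so treat the following as one plausible approach rather than the actual method: compare word-frequency profiles of two IDs' posts with cosine similarity. The function names are invented for this sketch, and a serious attempt would need far more text and better features than raw word counts.

```python
import math
import re
from collections import Counter


def word_profile(text):
    """Count lowercase word occurrences in a text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count profiles, from 0.0 to 1.0."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty profile has nothing to compare
    return dot / (norm_a * norm_b)


same = cosine_similarity(word_profile("I totally agree with this"),
                         word_profile("I totally agree with this"))
different = cosine_similarity(word_profile("apples and oranges"),
                              word_profile("usenet newsgroup posting"))
```

A score near 1.0 between two supposedly unrelated IDs would only be a hint worth a closer look, not proof of anything.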
Dear USENET User (Score:2, Interesting)
It has been determined that your use of the USENET service is in violation of the Microsoft USENET License (MUSEL). According to our records, you have repeatedly posted to such groups as comp.os.linux, alt.php, and other non-Microsoft approved newsgroups. According to the terms of the license you agreed to by turning on your computer, you are only allowed to post to Microsoft-centric and/or owned groups.
Since this is a serious violation of the terms of MUSEL we are revoking your use of the USENET service and have already automatically updated your computer to reflect this change. As of 7:00am CST today, your computer will not allow you to access anything related to USENET (including GOOGLE Groups, Newfeeds.com, etc). Any attempt to bypass this restriction will result in a compliance violation being filed against you, a fine of up to $250,000, and up to 5 years in prison.
Thank you for your understanding in this matter. And thank you for allowing Microsoft to choose you as a user.
Sincerely,
The Microsoft USENET Compliance Team
Will someone please explain.. (Score:1, Interesting)
Knowledge and Data are two (very) different things (Score:2, Interesting)
Consider that what MS is doing is analogous to what TRW, Experian and Equifax do for consumers, or what Dun & Bradstreet does for corporations. They are trying to mine information from a publicly available source. There's nothing really wrong with that. The question becomes: what do you do with that information? I think most people are concerned about what someone can do with that sort of information when it can be correlated with other tangential information.
Consider:
MS mines a newsgroup - the FBI comes in and subpoenas the records of Joe Looser as "part of an ongoing investigation". (Joe isn't notified of this because the Patriot Act allows them to serve a search warrant and delay notification to the targeted party that the warrant is being served.) Afterwards, they go to the library and pull the records of the books that you just checked out. Been doing a little studying on microbiology, have we? Oh, and last year, you checked out a copy of the Koran. They then tap into your health records (which are now electronic, but protected by HIPAA) and see that you've filled a cipro prescription 3 times in the past 4 months. Couple this with your high school and college records that comment that you are a "troubled" loner and you get arrested on suspicion of terrorism. Given that you may or may not be allowed to talk to your attorney... who knows how long you could be detained.
In reality, your high school records indicate you're a troubled loner because you didn't get along with your guidance counselor, and you made the mistake of showing the school librarian how easy it was to crack into her Macintosh. (And we all know "those Hacker types" are all social miscreants.) Plus, you wore a "Free Kevin" shirt as a frosh. The books you got from the public library on microbiology were actually for a report you were doing on computer genetic algorithms, comparing and contrasting DNA in organic organisms vs. electronic programs. The Koran was required reading for your comparative religion class (damn those humanities requirements) but you were smart enough to get the book via interlibrary loan, and not have to buy a copy from the school bookstore. ($36 for a paperback? Yikes.) Your cat knocked over the first bottle of cipro and it spilled into the sink; you finished out your prescription and then refilled it, just in case... you never know when you'll end up with strep throat, and waiting three weeks to get a doctor's appt. at the campus clinic sucks.... Oh yeah, as it turns out, the "terroristic" posting on the Al'Queda message board was made by someone who had an email address that was identified by another computer as a likely email alias of a known terrorist.
Granted that this is a contrived scenario, but I think this could become "the rule" as opposed to the "exception". As the old saying goes, when you have a hammer, everything looks like a nail [c2.com]. When you have all this "data" it's very tempting to assume that you can turn it into knowledge.
Microsoft is wasting their sociological resources (Score:3, Interesting)
And I think Microsoft is simply wasting their time studying newsgroups and BBSes. For some stupid reason, governments and corporations only hire sociologists for BS two-bit studies with fairly insignificant or irrelevant findings.
What is Microsoft going to get out of this data? A new chat or email client? New MSN features? A fancy new search engine? New task bar icons with even more dialog bubbles that alert me every 5 minutes? Whoopdy freak'n do da!
(pssss... Microsoft... that should be the least of your concerns right now)
MS should hire more than one sociologist and have them analyze their product distribution / development model and Windows usability. Microsoft currently produces a fairly annoying operating system in an extremely inefficient way. Moreover, Microsoft's current tactics are the cause of a lot of lost money for that company.
Why not get some sociologists to look at Microsoft's business model, Microsoft's products, and the development of Microsoft's products? Microsoft could become a socially responsible company (and no, donating to a charity does not make up for all of the BS Microsoft does); Microsoft could have happy customers (like "Apple" happy... not "my computer hasn't crashed this month" happy); Microsoft's software could have fewer problems; and Microsoft could stop wasting money on multimillion dollar lawsuits that they bring upon themselves.
Business degrees, consultants, lawyers, and a few UI psychologists are not enough. There's another dynamic out there that MS is missing.
But hey, if MS wants to keep wasting money and keep pissing people off... by all means, they should keep doing what they're doing. It's only going to get worse.
Re:Huh? (Score:5, Interesting)
I set up a subdomain of one of my domains that forwards all mail to one of my real addresses. Every time I have to give out my email, I use something at that subdomain, for example, slashdot@catch.domain.com. If I get spam to that address, 1) I can block the address without affecting anything else, and 2) I know who got my name on the list.
Particularly useful when you have to register to get access to download or use something. I'm careful about giving out those addresses anyways, and always "opt-out", so I get a surprisingly small amount of spam to them. I've yet to receive spam for an address I gave to a company that said it wouldn't spam me.
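The "I know who got my name on the list" part is easy to automate. A minimal sketch, assuming the placeholder subdomain `catch.domain.com` from the example above and assuming the local part is just the service name; `address_for` and `leaked_by` are names made up here, not part of any mail software.

```python
CATCH_DOMAIN = "catch.domain.com"  # placeholder subdomain from the post above


def address_for(service):
    """Build the unique address to hand to one service."""
    return f"{service.lower()}@{CATCH_DOMAIN}"


def leaked_by(recipient):
    """Given the address spam arrived at, return the service it was given to, or None."""
    local, _, domain = recipient.lower().partition("@")
    if domain != CATCH_DOMAIN or not local:
        return None  # not one of the per-service addresses
    return local


slashdot_address = address_for("Slashdot")
culprit = leaked_by("slashdot@catch.domain.com")
```

Here `culprit` comes back as `"slashdot"`, which is all the bookkeeping the scheme needs.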
Actually MS has been doing this for quite a while (Score:3, Interesting)
One of the things MVPs were told was that MS tracked our posting habits in their newsgroups. They used our e-mail addresses for this. The tracking was purportedly to help determine if our MVP status would be retained from year to year. (It's an annual award.) Since they acknowledged way back when that they were tracking users on their own newsgroups, it really doesn't surprise me all that much that they'd expand it to cover more groups.
Actually, given that Google has an archive [google.com] of many of the newsgroups, it really wouldn't be all that difficult for pretty much anybody to track individual posting habits, etc. Just run some searches for the e-mail address of the user in question.
I did this for MS, too (Score:3, Interesting)
One choice quote from memory... "WE NEED A PATCH. GOD IF YOU SHOVED SOME COAL UP THERE ASSES YOUD GET A DIAMOND!!!LOL"
It paid $10/hr, and I needed the money.
Re:Good (Score:2, Interesting)
Re:I read the article! (Score:3, Interesting)
Your fear of overseas workers is clouding your judgement. The main reason people go to newsgroups is *precisely* because they want to avoid the cut and paste replies of unskilled people. And the main reason a company will support a newsgroup is precisely because their own customers (some of them skilled) will contribute to it without getting paid.
But if you know of a company stupid enough to do what you say they do -- please post a link to their forums. If you're speaking from personal experience, I'll assume you'll have a link for us. Right?
Re:Why is google better? (Score:3, Interesting)
Right now Google's reputation is one of the corporate assets. And they are taking good care of it. Next year, who knows.
P.S.: For-profit isn't what makes an organization untrustworthy. It's centralized power. Once centralized power comes to exist, the psychos become greedy to take it over. (And having observed myself, I know that sometimes these psychos were the same people who were benefactors before the power centralization occurred.) The power to control is a heady drug, and nobody should be considered immune to its lure. It's an old story. In fact, that was one of Tolkien's points. Frodo was trustworthy *because* he was incapable of using the ring to claim much power. (Well, and because he and Bilbo had already given proof that they were *relatively* immune to the siren call.)
Re:Huh? (Score:5, Interesting)
I use MyInitials_UniqueIdentity@mydomain.com. For example, when I bought tickets from that over-priced poor-quality monopolistic Ticketmaster, I created an entry in my
mf_ticketmaster_ca: mynormailmailbox
If I get spam, I comment the line out. I don't think your system allows anything extra... so I'm intrigued about your approach. Oh, and Ticketmaster did give away my email address. Their privacy statement is quite eye opening too.
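"Commenting the line out" can be scripted too. A rough sketch, assuming the entries live in a plain-text file of `name: target` lines where a leading `#` marks a comment; the post doesn't say which file or mail server this is, so that format and the `disable_alias` helper are guesses for illustration.

```python
def disable_alias(lines, alias):
    """Return a copy of lines with the active entry for `alias` commented out."""
    result = []
    for line in lines:
        is_comment = line.lstrip().startswith("#")
        name = line.split(":", 1)[0].strip()
        if ":" in line and name == alias and not is_comment:
            result.append("# " + line)  # disable this entry only
        else:
            result.append(line)  # leave everything else untouched
    return result


entries = [
    "mf_ticketmaster_ca: mynormailmailbox",
    "mf_slashdot: mynormailmailbox",
]
updated = disable_alias(entries, "mf_ticketmaster_ca")
```

Running it a second time on `updated` changes nothing, since the entry is already a comment.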
Re:Since the early days of netnews... (Score:3, Interesting)
I haven't used newsgroups much, and therefore my opinion may be inaccurate, but it seems like anyone looking at groups using software with these new search features is going to approach things very differently than people using traditional methods. Essentially, if a group can be called a community, it's probably that way because everyone who spends much time there knows each other (to some extent), follows whatever conventions have developed, and so on. Someone who comes in because their news reader told them the group was popular is not going to see any of that, and if it happened too often then it could be rather disruptive. If it happened a lot, then it could change the way people handle themselves in the group, and the methods used to rate threads and authors might become useless.
It also seems like the ratings could concentrate posts too much. If people use the system to search old threads then it wouldn't be an issue, but if it gets used to find places to ask things then it could increase the number of questions that could have been answered by searching, RTFMing, etc. If only the best resources get used, then they could grow to the point of becoming impossible to search while everything else is ignored.
Finally, I wonder about the "good posters as a support resource" attitude. Obviously plenty of people are willing to help others online, but that doesn't necessarily mean they want everyone coming to them for assistance. Again, it wouldn't be a problem if the system encouraged only searching resources, but if it ended up encouraging un-researched posts then it could flood good groups and authors with unnecessary questions. (Obviously some answerers are going to be fine with those questions, but in bulk they tend to get annoying.)
None of these things are necessarily an issue at all, of course -- they would only be a problem in the context of Microsoft releasing a news reader with their search features (as was implied by the last article on the topic) and getting a lot of people to use it. If it remained a search tool that wasn't used all the time then it could be very useful.
Re:This sounds familiar! (Score:4, Interesting)
What me worry? (Score:1, Interesting)
1. Warez, crack, and 0-day groups;
2. Music groups (mp3);
3. Porn groups; and
4. Multimedia groups (see #s 1, 2 and 3, above)
in addition to the traditional "discussion" groups. Their primary focus (the amount of information they choose to retrieve/store/analyze) may be variable, but the fact is that they have been and are maintaining a database for ALL usenet.
Secondly, I'll point out that while many groups are already actively monitored by a mix of different entities, Microsoft's involvement changes everything. The
a) losing one's posting privileges;
b) having one's news service account permanently cancelled;
c) losing your internet access;
d) finding yourself investigated by law enforcement;
e) being exposed to civil litigation; and
f) being exposed personally.
And third, subjecting to scrutiny by any interested party what has traditionally been a comfortable and fairly anonymous back alley for everyone to openly (ok, mostly anonymously) enjoy has many inherent problems and dangers. And if behind it all is Microsoft (enter favourite conspiracy theory here), should you be worried? Most of us are worried enough by their overt actions. And is any of this any different from someone implementing a way to track peer to peer file sharing? Are the dangers any different?
So, next time you want to post information, comments, naked pictures of your ex-girlfriend, pr0n from a website you subscribe to, or some mp3's ripped from some CDs you own, know that enough information has been collected about you and your activities to allow someone to pursue an action against you. And if you slip under the radar with an assortment of changing nics, fake e-mail addys, accounts and proxy servers, remember that what information does exist is being stored for later analysis. No one bothered in the past because it was too much trouble. Now, it's become easy.
On a lighter note
Re:Huh? (Score:4, Interesting)
Using a catch-all (mail to ANY address at that domain gets forwarded) means I don't have to set up anything in an alias file or whatever. I just have to enter it, and it works. If one address gets overly-spammed, I can block that specific address, while the catch-all continues to work.
Using a regular domain (domain.com) for that purpose just means you also get all the dictionary spam. Often spammers will try info@ sales@ administrator@ bob@ etc. If it's a sub-domain, they're a lot less likely to try that, if at all. If you do end up getting a large-scale dictionary attack on the subdomain, you can just make a new one. Though I think those large-scale attacks are targeted - one of my friends works at an ISP, and he says they get them quite a bit, where they just try thousands of common usernames.
Basically, using a sub-domain makes a bit less work, and gives you a bit more protection, if you need it.
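In code terms, the catch-all-plus-blocklist logic comes down to something like the sketch below. This is only the decision rule, not a real mail server config; the `route` function, its parameters, and the example addresses are all hypothetical.

```python
def route(recipient, catch_domain, blocked, forward_to):
    """Return the mailbox to deliver to, or None if the message should be dropped."""
    local, _, domain = recipient.lower().partition("@")
    if domain != catch_domain.lower():
        return None  # not addressed to the catch-all subdomain
    if local in blocked:
        return None  # this specific address got over-spammed and was blocked
    return forward_to  # anything else at the subdomain just works


blocked = {"ticketmaster"}
ok = route("Slashdot@catch.domain.com", "catch.domain.com", blocked, "me@domain.com")
dropped = route("ticketmaster@catch.domain.com", "catch.domain.com", blocked, "me@domain.com")
```

New addresses need no setup at all; only the blocked set ever grows.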
Re:Huh? (Score:1, Interesting)
But, since I have an ID per project, I do not have to reveal myself as being a contributor for other projects.
This can cause some administrative overhead, but I find it worth it. Forwarding mail to a common account is easy; you just have to be careful not to reply with the wrong ID.
I don't see it as paranoia or a virtualized kind of multiple personality syndrome, more a precaution to protect my privacy and my point of view on software development (which I see as freedom of expression) against more IP-oriented people.
OTOH I am not so paranoid as to, e.g., change my style of programming or writing for each ID.
Re:When they... (Score:1, Interesting)
An email address is not that much different than a phone number or address, so it is like they're tracking phone numbers and addresses. And how about MSN customers? Microsoft would already have the phone numbers and addresses, in addition to any other information a person reveals about themselves online. I find it disturbing that an ISP would take such a bold step and say "Yeah, we're watching you."
Everyone "tracks" users in their minds, remembering who's knowledgeable and who's a flamer.
Poor comparison. Microsoft isn't like "everyone". Microsoft has more political influence and money than a great number of countries on this planet.
This MS Sociologist seems to be doing this for research purposes though.
There is nothing in the article, or in Microsoft's past behavior, to support a premise that Microsoft would undertake such an endeavor simply to forward the pursuit of science. Here in a Six Sigma world, companies do not spend money unless they anticipate a return. The return is what we should be concerned about, as information about people is a valuable commodity.
It's no worse than keeping track of consumer car purchases of certain colors to decide on what color to make your own product. It's not really spying.
Oh really. Maybe your definition of spying is different than mine. Here's what dictionary.com has to say on the matter:
v. spied (spīd), spying, spies (spīz)
v. tr.
1. To observe secretly with hostile intent.
2. To discover by close observation.
3. To catch sight of: spied the ship on the horizon.
4. To investigate intensively.
Collecting information on people is an aggressive act. You collect detailed information about things you wish to control. It's not a matter of if it's spying or not. It's not a matter of whether you have something to hide. It's about power and control and how willingly and to what degree each of us chooses to submit. You seem content to coast along trusting these incursions into your free will to be benign and of no consequence and go so far as to advocate that everyone else submit as well. I pity you. To be born without a desire to have control over your own destiny is like being born without a brain. Like some amoeba floating around, content to drift wherever the water happens to be flowing on any given day.
If you want to become special attention to MS.... (Score:3, Interesting)
"I think it's a very important thing. And we have built NetScan to protect what I think are legitimate claims for privacy. Like a Net spider, NetScan takes publicly accessible documents off the Internet, and it respects metadata that says "Leave me alone!" There is the robots.txt file that says, "You can look at this but not that." With Usenet there is one that says "Leave my messages alone," and we respect that. We will not store your messages if you put that in them."
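The quote doesn't name the Usenet marker. The usual convention for this is an `X-No-Archive: yes` header, so the sketch below assumes that is what is meant. How NetScan actually checks it isn't described anywhere in the interview; `may_archive` is a made-up helper showing what an archiver honoring the header could do.

```python
from email import message_from_string


def may_archive(raw_article):
    """Return False if the article asks not to be archived via X-No-Archive."""
    msg = message_from_string(raw_article)
    # Header names are matched case-insensitively by the email package.
    value = msg.get("X-No-Archive", "")
    return value.strip().lower() != "yes"


opted_out = "From: a@example.com\nSubject: test\nX-No-Archive: yes\n\nLeave me alone!\n"
normal = "From: b@example.com\nSubject: test\n\nArchive away.\n"
results = (may_archive(opted_out), may_archive(normal))
```

Whether anyone honors the header is entirely up to whoever runs the archive, which is rather the point of the complaint that follows.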
Given how much MS lies.....
if you do these things mentioned above you will become special attention to MS
For certainly MS inhouse will be interested in what others don't want them to be interested in....
No different than a web search engine... (Score:3, Interesting)
demographics ahead of debugging (Score:2, Interesting)
Ah wait, they can't do that, can they.... they don't like revealing code so that the community can help them fix their bugz
Gates & Ballmer: look inside and fix problems, not outside and cause more
Blame IBM: if they had waited for CP/M-86 we would never have these leeches at MS; they would still be working on BASIC