Microsoft Tracking Behavior of Newsgroup Posters
theodp writes "Ever get the feeling your Usenet newsgroup list is being watched? By Microsoft? If so, consider yourself right. An interesting but troubling CNET interview with Microsoft's in-house sociologist goes into how the software giant is keeping a close eye on newsgroups and other public e-mail lists, tracking and rating contributors' social habits and determining "people who the system has shown to have value." Those concerned that it's not a good idea for computers to track their belongings and whereabouts are advised that they may ultimately have to fragment their identities, keeping multiple IDs and e-mail addresses."
Give it a break (Score:5, Informative)
Re:Tracking Slashdot too (Score:2, Informative)
Are you kidding me? Any pro-MS post is an instant karma killer. That accounts for the AC posts. I'll probably get modded down just because I didn't spell it M$.
RTFA, What is really going on here (Score:5, Informative)
The article is about this guy at MS and what he does there. There are several projects he is involved with.
One is the Netscan tool. This is available for use by the general public. You can run it yourself and see what it can and cannot do. [microsoft.com]
http://netscan.research.microsoft.com/
I believe that it was originally created in part to help identify helpful people in the user community so they could be rewarded (becoming an MVP, for instance). They do not discriminate against you based on what platform you use as a desktop or what OS your website is hosted on. It only matters whether you regularly post stuff and reply to posts.
I do not know much about the other tool except what is in the article.
The other tool is very much unrelated to newsgroups and is like the CueCat on steroids, except I do not believe data goes to the parent company.
Re:Post Frequently == Spammer? (Score:3, Informative)
If a user ID posts lots, and all of those posts are a new thread (instead of a response to an existing one), and if those new threads don't generate responses themselves, then those are characteristics that point to spamming.
However, if a user ID posts lots, and many of those posts are in response to other posts (i.e. answering questions), and many of those posts are in turn responded to (i.e. acknowledging useful information, or asking for more details), then those are characteristics that point to a guru who is a good source of information on the topic.
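The two patterns above can be sketched as a small heuristic. This is only an illustration of the idea, not how Netscan actually scores anyone: the `PosterStats` fields, the `classify_poster` function, and every threshold below are hypothetical values picked for the example.

```python
from dataclasses import dataclass


@dataclass
class PosterStats:
    """Per-user-ID counts (hypothetical fields, chosen for illustration)."""
    posts: int               # total posts by this user ID
    thread_starts: int       # posts that started a new thread
    posts_with_replies: int  # posts that drew at least one reply from someone


def classify_poster(stats, min_posts=20, mostly=0.8, many=0.5, few=0.1):
    """Return 'likely spammer', 'likely guru', or 'unclear'.

    All thresholds are assumptions, not values from the article.
    """
    if stats.posts < min_posts:
        return "unclear"  # not a frequent poster, so neither pattern applies

    start_ratio = stats.thread_starts / stats.posts
    reply_ratio = 1.0 - start_ratio
    answered_ratio = stats.posts_with_replies / stats.posts

    # Lots of posts, nearly all new threads, and almost nobody responds.
    if start_ratio >= mostly and answered_ratio <= few:
        return "likely spammer"

    # Lots of posts, many of them replies, and many of them get responses.
    if reply_ratio >= many and answered_ratio >= many:
        return "likely guru"

    return "unclear"


if __name__ == "__main__":
    print(classify_poster(PosterStats(100, 98, 2)))
    print(classify_poster(PosterStats(100, 10, 70)))
```

Anything that matches neither pattern comes back as "unclear", which is probably where most real posters would land.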
Re:Multiple addresses wont work (Score:5, Informative)
You just pulled that out of your ass, and you know it. There are so many gigantic misunderstandings underlying that statement that I can't even begin to attack it, so suffice it to say, a simple Bayesian analysis more than likely cannot identify people based solely on what they write.
Ok, I'll give you a hint. Suppose we apply this method to Slashdot. There are about 650000 Slashdot readers. You are talking about calculating the class-conditional probability for every user on Slashdot. The differences in class-conditional probability (per user) are going to be unbelievably small -- so small that any results you achieve are going to be statistically meaningless.
Bayesian techniques work okay for classifying when you've only got two or three buckets. But when you try to apply it to, say, thirty buckets (much less 650000!!), it breaks down really quickly.
Also, remember that the true name for the technique is "Naive Bayesian inference." In this case (heh, in most cases) the term "naive" doesn't mean "clever and infallible."
Yes, I do research on text analysis algorithms with applications to anti-spam filters, so I do have some clue what I'm talking about.
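For anyone who wants to see what "one bucket per author" means in practice, here is a minimal sketch of a multinomial naive Bayes classifier with add-one smoothing, treating each author as a class. The function names (`train`, `log_scores`, `posterior`) and the toy data are made up for illustration, and whitespace tokenization is an assumption. It doesn't settle the argument either way; it just shows that a score has to be computed for every candidate author and that the posterior is spread across all of them.

```python
import math
from collections import Counter


def train(posts_by_author):
    """Build per-author word counts, priors, and the shared vocabulary.

    posts_by_author: dict mapping author name -> list of post strings.
    """
    word_counts = {author: Counter() for author in posts_by_author}
    vocab = set()
    for author, posts in posts_by_author.items():
        for post in posts:
            words = post.lower().split()  # naive whitespace tokenization
            word_counts[author].update(words)
            vocab.update(words)
    total_posts = sum(len(posts) for posts in posts_by_author.values())
    priors = {a: len(p) / total_posts for a, p in posts_by_author.items()}
    return word_counts, priors, vocab


def log_scores(text, model):
    """Return log P(author) + sum of log P(word | author) for each author."""
    word_counts, priors, vocab = model
    words = [w for w in text.lower().split() if w in vocab]
    scores = {}
    for author, counts in word_counts.items():
        total = sum(counts.values())
        score = math.log(priors[author])
        for w in words:
            # Add-one (Laplace) smoothing so unseen words don't zero the score.
            score += math.log((counts[w] + 1) / (total + len(vocab)))
        scores[author] = score
    return scores


def posterior(scores):
    """Normalize log scores into probabilities that sum to 1."""
    top = max(scores.values())
    unnormalized = {a: math.exp(s - top) for a, s in scores.items()}
    z = sum(unnormalized.values())
    return {a: u / z for a, u in unnormalized.items()}


if __name__ == "__main__":
    toy = {
        "alice": ["the kernel scheduler patch compiles fine", "patch the kernel again"],
        "bob": ["my cat enjoys tuna", "tuna for the cat today"],
    }
    model = train(toy)
    print(posterior(log_scores("kernel patch", model)))
```

With two authors and disjoint vocabularies this is easy. The loop in `log_scores` runs once per author, so the cost and the number of competing buckets both grow with the size of the user base.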
Re:Give it a break (Score:3, Informative)
Any globally-available group is or can be available on their servers with no significant difficulty. I poked around and came up with local groups (e.g. chi.general) and non-MS language groups (e.g. comp.lang.python). Perhaps you're confusing the msnews.microsoft.com domain with the microsoft.public hierarchy?
Re:Who cares (Score:3, Informative)
Ha ha DA! (Score:2, Informative)
Anyhow, as far as I can remember it was this Japanese girl that did the site in-house (kiko, tiko or something like that).
Anyhow, I wouldn't worry about digital angel. They have no capital, no employees, no customers, devices never worked, no marketing, and their 'international offices' are one-man sales shops. Oh, and they have $90 mil credit with IBM Credit that they have to repay this year.
Re:I read the article! (Score:2, Informative)
Not sure if anyone else provided the link, but... (Score:2, Informative)
Sure, it works. (Score:3, Informative)
Microsoft has hated it forever [essential.org]. For much the same reasons movie makers [slashdot.org] and other large advertisers of shoddy junk hate information exchange. Large forums, such as TV/Radio, Slashdot, your local, state and federal governments can be astroturfed. [csuchico.edu] Microsoft's problem with smaller groups, like your local LUG, is that they can't spam them all. They don't have, and never will have, the resources to create trusted users in all of those groups. So long as reliable search engines exist, we will all continue to enjoy honest information from impartial sources.
Marc Smith's efforts represent Microsoft's response to such groups. Efforts to "add core value" and rank newsgroups from a company that's proved its willingness to lie to the public should not be trusted. Poor Marc has been at this for four years, but Microsoft's search engine, mail client and web browser all still blow. What I imagine M$ will do is start steering users of their OS to M$-friendly newsgroups. They will also try to destroy the structure of newsgroups themselves, limit who can run them, and focus harassment on groups unfavorable to them. They won't win but they will try. They have already forced most large ISPs to block ports on cable modems and DSL so that the average person has a hard time serving information. The push for control of information is ongoing.
This was part of an academic research project .... (Score:2, Informative)
Mentioned before on Slashdot (Score:3, Informative)
This has been mentioned before [slashdot.org], here on Slashdot, but not in this negative context. Previously, people just thought of Microsoft's newsgroup tracking as a curiosity, and not something with an ulterior motive.
USENET is losing its relevance these days, unfortunately, due to spammers and the difficulty of creating new groups to keep up with current trends. Most message-based chat nowadays takes place on innumerable topic-specific websites running "bulletin board" software such as YaBBSE [yabbse.org]. It might be a little too late to do anything to USENET now, either good or bad....