Armoring Spam Against Anti-Spam Filters
moggyf points to a BBC article about how spam can be successfully tweaked to slip past current filtering methods, excerpting "To find out how to beat the filters Mr Graham-Cumming sent himself the same message 10,000 times but to each one added a fixed number of random words. When a message got through he trained an 'evil' filter that helped to tune the perfect collection of additional words."
iluvspam adds "It's an interview with POPFile author John Graham-Cumming that summarizes his talk at the recent MIT Spam Conference. You can still listen to the technical details here (choose the Afternoon 1 session, he starts about 75 minutes in)."
That's dedication... :( (Score:3, Insightful)
Tch tch... (Score:5, Insightful)
"Make it idiot-proof, and someone will make a better idiot"
Re:Hmmm... (Score:5, Insightful)
It's a pisser that spammers now have another tool to circumvent filters; on the other hand, the people who write the filters know exactly what a spammer would do to make "better" spam.
The question is: who will implement first?
my spam filter (Score:5, Insightful)
It works a treat
The other trick I have found useful is the CamelCase nature of my name - spammers tend to mail me either as skarcher or SKARCHER, and both trip filters on my mailbox.
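A minimal sketch of that case trick, for anyone who wants to try it. The capitalization `SkArcher` and the `example.org` domain are assumptions made up for illustration; the post only says that the all-lowercase and all-uppercase forms trip the filter.

```python
def looks_like_harvested(recipient: str, canonical_local: str = "SkArcher") -> bool:
    """Return True when the local part matches the mailbox name only after
    case folding, i.e. the sender did not use the exact CamelCase spelling.

    `canonical_local` is a hypothetical spelling, not taken from the post.
    """
    local = recipient.split("@", 1)[0]
    same_name = local.lower() == canonical_local.lower()
    return same_name and local != canonical_local


# Mail addressed with the wrong capitalization is flagged; exact matches
# and unrelated addresses are left alone.
flags = {
    addr: looks_like_harvested(addr)
    for addr in ("skarcher@example.org", "SKARCHER@example.org", "SkArcher@example.org")
}
```

This only works as a hint, since some legitimate senders and mail programs lowercase addresses too, so it is probably better used to raise a score than to reject outright.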
Fool-proof spam method (Score:1, Insightful)
Sure it's a little awkward, but picking through your email for that valid email amongst the spam is even more so.
Re:combat the flaw? how? (Score:2, Insightful)
Although it may be difficult to discover where the spam originated, it's pretty clear where it wants you to go (probably the person who commissioned the spam in the first place).
Re:That's dedication... :( (Score:4, Insightful)
As you alluded to, it'd be easier to teach fish to fly. The internet essentially carries with it a stupid-user tax. Worms, viruses, spam, et al are the by-products of stupidity, but as with most taxes, it's just something that you have to deal with.
how NOT to get SPAM 101 (Score:4, Insightful)
2. don't use free email services hotmail etc.
3. don't use AOL
4. don't let anyone have your address that forwards messages like "cute bunny pic" or "funny anti-geek joke" etc.
5. don't post your email anywhere.
6. don't sign up for majordomo lists.
Only if you're the author. (Score:4, Insightful)
The article points out that the words listed are good for getting past his filter. If you don't normally have mail that uses those words, then your filter will still catch it as spam.
Now, if you do deal with the Berkshire Marriott frequently, asking them for comments on your wireless setup, then yes you're up the creek.
Re:Why bother? (Score:2, Insightful)
Just my 0.03c (adjusted for inflation)
Re:Here's a sneaky one... (Score:2, Insightful)
Having said that, the collaboration between spammers and pornographers means that they will have access to a lot of processing power. They just need to exchange porn for e-stamps. MS have probably thought about it, but I don't know what their solution is.
Really don't understand it. (Score:5, Insightful)
When I'm going through the webmail access to my spam-bait accounts (the ones that are listed on my websites that I don't bother retrieving with my POP email client anymore because of hundreds of spams a day to each), if I'm fooled into opening one up, most likely because of it having a subject header that might be someone legitimate, the moment I see that the message body says anything spammy I immediately click the Delete button. I imagine everyone else in the world is doing the same thing.
It's gotten to the point where the preoccupation of spamming is just to get past filters, the result of which is that the message is grumblingly deleted by the irritated recipient. Who out there is saying, "Oh, look, this message got past all my spam filters and contains a lot of jumbled, garbled nonsense text alongside a plug for herbal penis enlarging pills. This must be legitimate. Now, where's my credit card,"? Do the spammers think that we're all clones of Dilbert's pointy-haired manager?
Spamming is not only irritating, it's pointless. Who is paying these people to spam us? Are people actually buying penis enlarging pills and patches, herbal viagra, mortgage refinancing, credit repair kits, or any of that stuff? Enough to put millions of dollars a month into the hands of career spammers?
I'm hopelessly at sea in this matter.
Re:Obligatory POPFile Link (Score:2, Insightful)
While it is true that I still have to waste bandwidth and CPU cycles to get rid of this unwanted mail, I no longer have to waste time. I've got my parents, friends, and neighbors all hooked up with POPFile - I believe this is realistically the only way to fight spam - move the decimal place on their success ratio over a couple notches; dig into their bottom line.
Re:Ok fuck it (Score:5, Insightful)
The solicitation was made on a server located in the US. I don't doubt that Ashcroft would consider that US jurisdiction, regardless of the physical location of the poster.
There's a lot of guys in dog cages at Guantanamo Bay who've NEVER been to the US. I'm not so sure these days that when the US government is pissed off at you, where you are and where you did something matter a whole lot.
Re:Discovering Keyword Demographics (Score:2, Insightful)
Re:Really don't understand it. (Score:3, Insightful)
Obviously, if an individual has gone to some trouble to set up spam filters, then she doesn't want to be bothered and the spam is pointless. However, the vast bulk of these filters are set up by the ISPs, and there's some value to the spammer to get through them to the idiot on the other side who apparently might actually respond to the spam.
I don't see how this is necessarily a problem (Score:3, Insightful)
Re:how NOT to get SPAM 101 (Score:2, Insightful)
2. don't use free email services hotmail etc.
3. don't use AOL
4. don't let anyone have your address that forwards messages like "cute bunny pic" or "funny anti-geek joke" etc.
5. don't post your email anywhere.
6. don't sign up for majordomo lists.
Yeah, great, and I'm sure it works a treat, BUT: 1 and 6 are not practical for many people. As for 2 and 3, these services may suit some people for whatever reason (money constraints, location). Some people have friends or relatives who do 4; should they just start ignoring them? What if they want to converse with those people [are these playboy bunny pics by the way? ;-)]? As for 5, one simple mistake and you are done for anyway.
Also, why should spammers be allowed to prevent people from using the internet as they see fit? No, I'm sorry, but there are better solutions than trying to follow all your advice. I mean, whilst your points are valid you might as well say:
7. Don't use the internet
I guarantee that last one will work perfectly!
Re:infinite monkeys (Score:5, Insightful)
First, if the spammer sends thousands of copies of the same message and just changes the "extra words" that he is testing, it will take very little time for a Bayesian filter to adapt to the rest of the message. Suddenly, the rest of the message that previously contained non-spammy words will be considered very spammy and will overwhelm the "extra words" that each message contains. Each time the message is caught as spam, the probability that any future tests get through--regardless of the "extra words"--will be reduced even further.
Second, as the article said, it's a lot of work on the part of the spammer. They'd have to send out thousands of messages to each target to "sniff them out", and most of those wouldn't even be effective: most would be caught by filters, and of the few that got through, very few would load the HTML bugs to identify themselves.
Finally, it assumes that those that are using Bayesian filters are filtering their email but leaving their security (inasmuch as HTML bugs) wide open. While there may be some people that use Bayesian filtering and leave HTML bugs active, it has to be a small minority.
In short, it seems to me they've "found" a way to get around Bayesian that won't work, so to speak. I just don't see the problem.... ??
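The first point above (the filter adapting to the fixed part of the message) can be sketched with a toy classifier. Everything here is an assumption for illustration: `TinyBayes` is a hypothetical add-one-smoothed naive Bayes, not POPFile's actual code, and the training messages are invented.

```python
import math
from collections import Counter


class TinyBayes:
    """Toy word-count naive Bayes classifier (hypothetical, for illustration)."""

    def __init__(self):
        self.words = {True: Counter(), False: Counter()}  # keyed by is_spam
        self.docs = {True: 0, False: 0}

    def train(self, text, is_spam):
        self.words[is_spam].update(text.lower().split())
        self.docs[is_spam] += 1

    def spam_log_odds(self, text):
        """Positive means 'more likely spam', negative 'more likely ham'."""
        vocab = len(set(self.words[True]) | set(self.words[False])) + 1
        spam_total = sum(self.words[True].values()) + vocab
        ham_total = sum(self.words[False].values()) + vocab
        # Smoothed class prior, then one add-one-smoothed term per word.
        odds = math.log((self.docs[True] + 1) / (self.docs[False] + 1))
        for word in text.lower().split():
            p_spam = (self.words[True][word] + 1) / spam_total
            p_ham = (self.words[False][word] + 1) / ham_total
            odds += math.log(p_spam / p_ham)
        return odds


f = TinyBayes()
for ham in ("meeting notes for the wireless setup",
            "comments on the hotel booking",
            "wireless setup comments welcome"):
    f.train(ham, is_spam=False)
for spam in ("viagra mortgage refinance offer", "cheap viagra offer now"):
    f.train(spam, is_spam=True)

payload = "herbal pills order today"             # the part that never changes
probe = payload + " wireless setup comments"     # padded with ham-like words
before = f.spam_log_odds(probe)

# The user flags five copies that differ only in their padding words.
for padding in ("meeting notes", "hotel booking", "welcome comments",
                "wireless notes", "setup hotel"):
    f.train(payload + " " + padding, is_spam=True)
after = f.spam_log_odds(probe)
```

In this toy setup `before` is negative (the padded probe looks like ham) and `after` is positive, because the unchanging payload words have picked up spam counts that outweigh the padding. Real filters tokenize and combine scores differently, so treat this as the shape of the argument rather than a measurement.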
Re:infinite monkeys (Score:5, Insightful)
This is exactly the point. Most of the spam examples will die out because they have an ineffective collection of non-spam words. But a few will survive, and you can now train your own Bayesian filter which collects the versions of spam that generated webbug hits. After a while some words will shine prominently in your Bayesian filter database for being very effective at slipping through Bayesian spam filters.
Basically you are fighting the antidote with itself. And yes, you can automate the process. Just take your everyday spam (penis enlargement, unsecured credit, Nigerian business opportunities...), take a dictionary and then randomly mix dictionary words into your spam messages and send them out to your email database. Create a website to get the webbug hits and associate every spam message with a hash of the random dictionary words to identify successful sets of anti-spam words.
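For filter authors who want to see what that feedback loop looks like, here is a local simulation only; nothing is sent anywhere. The function names, the six-word dictionary, and the stand-in rule "a probe gets through if it contains `wireless`" are all assumptions invented for this sketch, standing in for real webbug hits.

```python
import hashlib
import random
from collections import Counter


def make_probe(payload, dictionary, n_words, rng):
    """Pad the payload with random dictionary words and tag it with a hash
    of those words, so a later 'hit' can be traced back to the word set."""
    words = rng.sample(dictionary, n_words)
    tag = hashlib.sha256(" ".join(sorted(words)).encode()).hexdigest()[:12]
    return tag, words, payload + " " + " ".join(words)


def rank_padding_words(probes, hits):
    """Order padding words by the fraction of their probes that got through."""
    seen, passed = Counter(), Counter()
    for tag, words, _body in probes:
        seen.update(words)
        if tag in hits:
            passed.update(words)
    return sorted(seen, key=lambda w: passed[w] / seen[w], reverse=True)


rng = random.Random(0)  # fixed seed so the simulation is repeatable
dictionary = ["wireless", "marriott", "apple", "river", "stone", "cloud"]
probes = [make_probe("buy pills", dictionary, 2, rng) for _ in range(200)]

# Stand-in for the webbug log: pretend the target filter lets through
# anything that mentions "wireless".
hits = {tag for tag, _words, body in probes if "wireless" in body.split()}
ranking = rank_padding_words(probes, hits)
```

The word the simulated filter is soft on ends up first in `ranking`, which is the "shine prominently" effect described above. The same tally, run by a filter author against their own filter, shows which innocent words carry too much weight.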
Re:combat the flaw? how? (Score:3, Insightful)
How NOT to get SPAM 201 - a more practical guide (Score:5, Insightful)
spam filtering useless on the long term (Score:2, Insightful)
I think I've seen something about NGMP at the Jabber Software Foundation and if I recall accurately there already is some implementation.
Re:combat the flaw? how? (Score:3, Insightful)
You are starting with a heretical premise that government, or rather, the large corporations which pull the strings, have the same objective as the end user (the end of spam). Of course, it could be stopped (by cracking down hard on those contracting the spammers). But it is much more useful for them if the "war on spam" goes on and on, while the measures with side-effects (on your wallet, your freedom and your privacy) are gradually introduced to "combat" the spam. Just recall other such "wars" such as "war on drugs" or "war on poverty/racism" or "war on smoking" or "war on guns" or the most recent "war on terrorism". This is an ancient recipe of control and enslavement, perfected by churches and priesthoods over millennia (war on sin/devil, war on death), merely translated into modern jargon and current circumstances.
Perhaps not! (Score:2, Insightful)
It doesn't take very much volume to defeat the function of spam-blocking.
I have a very effective spam filter on my server (customised spamassassin + some procmail scripts): 95-98% catch, virtually no false positives. The remaining spam is just nonsense; the mails make no sense, and the spammers are unable to sell anything from these spam mails. Their primary purpose seems to be to defeat the filter, so if I set up the filter to block them, it will also generate false positives.
Not to mention that it'd have to include a mechanism for the spammer to get paid for the victim sending the message.
They don't need to get paid for the "contaminated" emails. The purpose would be to create false positives, thereby forcing the operator to loosen the filter, and THEN get the real spam through.
I'd lose my patience quickly if someone I knew sent me spam a second time after I alerted them to their problem. Fortunately, I don't know that many clueless people.
I don't see how that will stop spammers trying to contaminate legit emails. A few clueless users is all it takes.
Re:Here's a sneaky one... (Score:4, Insightful)
I think you overestimate the intelligence of these creeps. The fact that spammers are using more and more of these garbage terms, randomizers, and other hacks to get around the filters actually encourages me -- it demonstrates that they really don't have the slightest clue how statistical content based filtering actually works. Currently, they are taking advantage of the extremely bad decision to assign a 0.4 score to unknown words. The spammers are exploiting a crack in the armor, which means the armor needs to be fixed.
A human can filter spam. A spammer can't weasel his way around human intelligence, so this sets an upper bound on how advanced the spammer techniques can get. All we have to do is get document classification up to the point of competitiveness with human performance, and the problem is solved. And research into these directions isn't wasted, because the motivation for the research is for actual important document organization tasks. The effect of stomping out spam will be a cool side effect.
If a spammer was ever actually intelligent enough to get around serious, well-constructed classifiers, I highly doubt he would be in the business of spamming. To suggest that spammers could intellectually compete with people who have spent years specializing in statistical language processing is a tad bit ridiculous.
At some point, to sell something, the spammer has to say something intelligible which is an advertisement. They can't hide this. Techniques which are foiled by bogus terms at the bottom of the email are broken. It's not a valid reason to believe that spammers are actually getting smart.
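To make the 0.4 complaint concrete, here is a rough sketch of the scoring scheme as I understand it from Paul Graham's "A Plan for Spam": unknown words score 0.4, the 15 tokens furthest from 0.5 are combined, and anything above 0.9 is treated as spam. The function name, the token scores, and the nonsense padding are assumptions for illustration, and using distinct tokens only is a simplification.

```python
from math import prod


def combined_spam_probability(message, known, unknown_score=0.4, top_n=15):
    """Combine the top_n most 'interesting' token scores (furthest from 0.5)
    as prod(p) / (prod(p) + prod(1 - p))."""
    scores = [known.get(tok, unknown_score) for tok in set(message.lower().split())]
    scores.sort(key=lambda p: abs(p - 0.5), reverse=True)
    top = scores[:top_n]
    spammy = prod(top)
    innocent = prod(1 - p for p in top)
    return spammy / (spammy + innocent)


# Hypothetical per-token spam probabilities learned by the filter.
known = {"viagra": 0.9, "pills": 0.9, "order": 0.9}

bare = "viagra pills order"
padded = bare + " " + " ".join(f"zx{i}" for i in range(12))  # 12 unseen words

p_bare = combined_spam_probability(bare, known)
p_padded = combined_spam_probability(padded, known)
p_padded_neutral = combined_spam_probability(padded, known, unknown_score=0.5)
```

With unknown words at 0.4, twelve made-up words are enough to pull this short message under the 0.9 cutoff; with unknown words treated as neutral (0.5) the same padding does nothing. That is the crack in the armor: the padding only helps when a message has too few known tokens to fill the top 15 on its own.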
Y'all are going to hate this, but... (Score:4, Insightful)
It is silly to assume that all these people are just morons. After all, Viagra is proven to work, it is a legitimate product of sorts. The internet is there for hefty short limp (ahem ahem) non-digerati as well as for propeller heads, God bless 'em.
It seems to me that spam is the runaway bastard-child of something which actually is good and useful -- that is, targeted marketing to the willing. Don't throw out the baby with the bathwater. There is a huge legitimate market out there, just begging to be flee^wmarketed.
The anti-spam people are fighting against the Invisible Hand. Good luck.
Re:infinite monkeys (Score:4, Insightful)
Bayesian filters rely on words. That means they depend upon word breaks and certain spellings. Well, spammers have been avoiding word breaks (either by removing spaces or introducing unnecessary ones) and obvious "spam words" by mangling the word or introducing "1337"-type spelling.
And Bayesian filters can't parse graphics, so a lot of spammers are careful to put words likely to trigger spam filters into graphics.
BTW, this article [brain-terminal.com] explains why there will never be a filtering-based solution to solving spam until SMTP itself is made more secure.
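On the word-mangling point, a filter can undo some of it before tokenizing. This is a minimal sketch under stated assumptions: the substitution table and the `normalize` name are made up for illustration, it handles inserted separators and digit-for-letter swaps only, and it does nothing about removed spaces or text hidden in graphics.

```python
import re

# Assumed "1337" substitutions; a real table would need tuning.
LEET = str.maketrans("013457@$", "oieastas")

# A run of single letters split by spaces or light punctuation, e.g. "v i a g r a".
SPLIT_LETTERS = re.compile(r"\b(?:[a-z][\s.\-_*]){2,}[a-z]\b")


def normalize(text):
    """Lowercase, undo digit-for-letter swaps, rejoin split-up letters,
    and return the resulting word tokens."""
    text = text.lower().translate(LEET)
    text = SPLIT_LETTERS.sub(lambda m: re.sub(r"[^a-z]", "", m.group()), text)
    return re.findall(r"[a-z]+", text)
```

For example, `normalize("V1AGRA")` and `normalize("v i a g r a now")` both yield a `viagra` token. The cost is that genuine numbers and initials get mangled too, so a filter would probably want to feed both the raw and the normalized tokens to the classifier rather than replace one with the other.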
easily combatable (Score:3, Insightful)
It could then mark it with a spam rating and be combined with spamassassin or such.
Plus, wouldn't the spamassassin logic be able to say, "hey, we're getting a lot of non-word stuff - our filters tell us it's spam" and defeat this spam already?
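The "lot of non-word stuff" check could be as simple as the fraction of tokens that are not in a word list. A minimal sketch, assuming a hypothetical `nonword_ratio` helper and a tiny made-up dictionary; wiring the result into spamassassin as a score is left out, since that depends on the local setup.

```python
import re


def nonword_ratio(text, dictionary):
    """Fraction of tokens in `text` that do not appear in `dictionary`.
    Returns 0.0 for text with no tokens at all."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    unknown = sum(1 for tok in tokens if tok not in dictionary)
    return unknown / len(tokens)


# Hypothetical word list; a real one would be a full dictionary file.
words = {"buy", "pills", "now"}
ratio = nonword_ratio("buy pills now xqzv blorf", words)  # 2 of 5 tokens unknown
```

Note the limit: this catches gibberish padding, but the technique from the article pads with real dictionary words, which would sail straight past a check like this.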