Spam The Internet Your Rights Online

Armoring Spam Against Anti-Spam Filters 511

moggyf points to a BBC article about how spam can be successfully tweaked to slip past current filtering methods, excerpting "To find out how to beat the filters Mr Graham-Cumming sent himself the same message 10,000 times but to each one added a fixed number of random words. When a message got through he trained an 'evil' filter that helped to tune the perfect collection of additional words." iluvspam adds "It's an interview with POPFile author John Graham-Cumming that summarizes his talk at the recent MIT Spam Conference. You can still listen to the technical details here (choose the Afternoon 1 session, he starts about 75 minutes in)."
This discussion has been archived. No new comments can be posted.

  • by bc90021 ( 43730 ) * <`bc90021' `at' `bc90021.net'> on Wednesday February 04, 2004 @11:22AM (#8179667) Homepage
    It's unfortunate that spam must be lucrative enough that one man will send himself the same message 10,000 times and train an evil filter! We need to get people to stop buying products advertised through spam (granted, easier said than done), as in the end, it's the financial incentive that makes a spammer spam. :(
  • Tch tch... (Score:5, Insightful)

    by supersam ( 466783 ) on Wednesday February 04, 2004 @11:22AM (#8179671) Homepage
    Didn't they know something as simple as...

    "Make it idiot-proof, and someone will make a better idiot"

  • Re:Hmmm... (Score:5, Insightful)

    by somethinghollow ( 530478 ) on Wednesday February 04, 2004 @11:25AM (#8179690) Homepage Journal
    Like many other academic studies, such as skinning humans alive to see how long they can live, I think this one should only be placed into the right hands.

    It's a pisser that spammers now have another tool to circumvent filters; on the other hand, the people who write the filters know exactly what a spammer would do to make "better" spam.

    The question is: who will implement first?
  • my spam filter (Score:5, Insightful)

    by SkArcher ( 676201 ) on Wednesday February 04, 2004 @11:26AM (#8179706) Journal
    if Message header = "type = text/html" then send to "Spam"

    It works a treat :)

    The other trick I have found useful is the CamelCase nature of my name - spammers tend to mail me either as skarcher or SKARCHER, and both trip filters on my mailbox.
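The two rules above can be sketched with Python's standard `email` package. This is only an illustration, not the poster's actual filter: the function name `looks_like_spam` and the constant `MY_LOCAL_PART` are made up for the example, and the HTML check here also looks inside multipart messages, which goes a little further than the one-line rule quoted above.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical: the exact capitalisation the mailbox owner uses.
MY_LOCAL_PART = "SkArcher"


def looks_like_spam(raw_message: str) -> bool:
    """Apply the two rules from the comment to a raw RFC 822 message."""
    msg = message_from_string(raw_message)

    # Rule 1: any part declared as text/html goes to Spam.
    if any(part.get_content_type() == "text/html" for part in msg.walk()):
        return True

    # Rule 2: right name, wrong capitalisation (skarcher, SKARCHER, ...).
    _, address = parseaddr(msg.get("To", ""))
    local_part = address.split("@", 1)[0]
    same_name = local_part.lower() == MY_LOCAL_PART.lower()
    return same_name and local_part != MY_LOCAL_PART
```

A plain-text message addressed to the correctly capitalised name passes; either rule alone is enough to file a message as spam.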
  • by Anonymous Coward on Wednesday February 04, 2004 @11:28AM (#8179718)
    A fool-proof anti-spam method is to reply to each piece of email sent to your account, asking the sender to validate themselves with you. This would only be necessary for senders whose addresses have not yet been validated. This would essentially stop spam dead.

    Sure, it's a little awkward, but picking through your email for that one valid message amongst the spam is even more so.
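A minimal sketch of the challenge-response idea described above, assuming an in-memory mailbox. The class `ChallengeResponseInbox` and its method names are hypothetical, and no mail is actually sent; a challenge is only recorded in a list.

```python
class ChallengeResponseInbox:
    """Hold mail from unknown senders until they answer a challenge."""

    def __init__(self):
        self.validated = set()     # senders who have answered a challenge
        self.pending = {}          # sender -> messages held back
        self.inbox = []            # (sender, body) pairs the user sees
        self.challenges_sent = []  # senders we asked to validate

    def receive(self, sender: str, body: str) -> str:
        sender = sender.lower()
        if sender in self.validated:
            self.inbox.append((sender, body))
            return "delivered"
        # Challenge each unknown sender once, then keep holding their mail.
        if sender not in self.pending:
            self.challenges_sent.append(sender)
        self.pending.setdefault(sender, []).append(body)
        return "held"

    def confirm(self, sender: str) -> None:
        """Called when the sender answers the challenge."""
        sender = sender.lower()
        self.validated.add(sender)
        for body in self.pending.pop(sender, []):
            self.inbox.append((sender, body))
```

Note that the sketch sends a challenge to whatever address the message claims to come from, so a forged sender address would receive it instead of the real sender.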
  • by RHS Bomber ( 549879 ) on Wednesday February 04, 2004 @11:31AM (#8179749)
    How about going after the people who own the links in the body of the spam?
    Although it may be difficult to discover where the spam originated, it's pretty clear where it wants you to go (probably the person who commissioned the spam in the first place.)
  • by andih8u ( 639841 ) on Wednesday February 04, 2004 @11:32AM (#8179759)
    We need to get people to stop buying products advertised through spam

    As you alluded to, it'd be easier to teach fish to fly. The internet essentially carries with it a stupid-user tax. Worms, virii, spam, et al are the by-products of stupidity, but as with most taxes, it's just something that you have to deal with.
  • by musikit ( 716987 ) on Wednesday February 04, 2004 @11:35AM (#8179795)
    1. don't sign up on any page that requires your email address to verify *cough*like this one [slashdot.org] *cough*

    2. don't use free email services hotmail etc.
    3. don't use AOL
    4. don't let anyone have your address that forwards messages like "cute bunny pic" or "funny anti-geek joke" etc.
    5. don't post your email anywhere.
    6. don't sign up for majordomo lists.
  • by Eevee ( 535658 ) on Wednesday February 04, 2004 @11:36AM (#8179803)

    In the article, it points out those words listed are good for getting past his filter. If you don't normally have mail that uses those words, then your filter will still catch it as spam.

    Now, if you do deal with the Berkshire Marriott frequently, asking them for comments on your wireless setup, then yes you're up the creek.

  • Re:Why bother? (Score:2, Insightful)

    by the real darkskye ( 723822 ) on Wednesday February 04, 2004 @11:39AM (#8179834) Homepage
    The answer is simple: the spammers (the ones doing the spammage, not the ones selling the products) are probably making money from every e-mail sent. As such, if they dropped the millions of e-mail addresses they knew were being blocked from their lists, they'd lose 1,000,000 * [profit per e-mail].

    Just my 0.03c (adjusted for inflation)
  • by AndrewHowe ( 60826 ) on Wednesday February 04, 2004 @11:47AM (#8179906)
    Nothing. People just have to realise that filtering based on content doesn't work, and will never work, until perhaps we have strong AI. Once the penny drops, we can move on...
    Having said that, the collaboration between spammers and pornographers means that they will have access to a lot of processing power. They just need to exchange porn for e-stamps. MS have probably thought about it, but I don't know what their solution is.
  • by The I Shing ( 700142 ) * on Wednesday February 04, 2004 @11:48AM (#8179907) Journal
    I've said this before, but I'll say it again. I really don't understand why all this even happens.

    When I'm going through the webmail access to my spam-bait accounts (the ones that are listed on my websites that I don't bother retrieving with my POP email client anymore because of hundreds of spams a day to each), if I'm fooled into opening one up, most likely because of it having a subject header that might be someone legitimate, the moment I see that the message body says anything spammy I immediately click the Delete button. I imagine everyone else in the world is doing the same thing.

    It's gotten to the point where the preoccupation of spamming is just to get past filters, the result of which is that the message is grumblingly deleted by the irritated recipient. Who out there is saying, "Oh, look, this message got past all my spam filters and contains a lot of jumbled, garbled nonsense text alongside a plug for herbal penis enlarging pills. This must be legitimate. Now, where's my credit card,"? Do the spammers think that we're all clones of Dilbert's pointy-haired manager?

    Spamming is not only irritating, it's pointless. Who is paying these people to spam us? Are people actually buying penis enlarging pills and patches, herbal viagra, mortgage refinancing, credit repair kits, or any of that stuff? Enough to put millions of dollars a month into the hands of career spammers?

    I'm hopelessly at sea in this matter.
  • by joebok ( 457904 ) on Wednesday February 04, 2004 @11:52AM (#8179946) Homepage Journal
    Yes - POPFile [slashdot.org] is fantastic! Since April 4th, my filter is 99.47% accurate at sorting my mail into 6 buckets. Over 18,000 spams have disappeared without me seeing them.

    While it is true that I still have to waste bandwidth and CPU cycles to get rid of this unwanted mail, I no longer have to waste time. I've got my parents, friends, and neighbors all hooked up with POPFile - I believe this is realistically the only way to fight spam - move the decimal place on their success ratio over a couple notches; dig into their bottom line.
  • Re:Ok fuck it (Score:5, Insightful)

    by swb ( 14022 ) on Wednesday February 04, 2004 @11:53AM (#8179950)
    Another example of people assuming that EVERYBODY lives in the USA or is under US law...

    The solicitation was made on a server located in the US. I don't doubt that Ashcroft would consider that US jurisdiction, regardless of the physical location of the poster.

    There's a lot of guys in dog cages at Guantanamo Bay who've NEVER been to the US. I'm not so sure these days that when the US government is pissed off at you, where you are and where you did something matter a whole lot.
  • by tbannist ( 230135 ) on Wednesday February 04, 2004 @11:54AM (#8179961)
    You're not thinking like a spammer; it won't change things very much. If a spammer discovers different keywords that reach different demographics, what do you think he'll do? I'm betting he'll just send the spam to every address once for each of the sets of keywords. So instead of half of all e-mail being spam, we'll see a huge jump where half of delivered e-mail is spam and 90% (or more) of all e-mail is spam.
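One way to read the percentages above, as a worked example with made-up numbers: assume 100 legitimate messages, and a spammer who sends nine keyword variants of what used to be 100 spams, of which filters stop all but one variant. The function name `spam_shares` and all the counts are assumptions for illustration only.

```python
def spam_shares(ham: int, spam_sent: int, spam_delivered: int):
    """Return (share of all mail that is spam, share of delivered mail that is spam)."""
    share_of_all = spam_sent / (ham + spam_sent)
    share_of_delivered = spam_delivered / (ham + spam_delivered)
    return share_of_all, share_of_delivered


# 100 legitimate mails; 9 variants x 100 spams sent; 1 variant gets through.
share_of_all, share_of_delivered = spam_shares(ham=100, spam_sent=900, spam_delivered=100)
```

Under those assumed counts, spam is 90% of everything sent but still half of what lands in the inbox, which matches the figures in the comment.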
  • by One Louder ( 595430 ) on Wednesday February 04, 2004 @11:57AM (#8179980)
    It all depends upon where the blocking is taking place. Clearly some people are responding to spams, so there appears to be some incentive for the spammers to get their message through.

    Obviously, if an individual has gone to some trouble to set up spam filters, then she doesn't want to be bothered and the spam is pointless. However, the vast bulk of these filters are set up by the ISPs, and there's some value to the spammer to get through them to the idiot on the other side who apparently might actually respond to the spam.

  • by PixelCat ( 58491 ) on Wednesday February 04, 2004 @11:58AM (#8179989)
    What he's doing is a brute-force attempt to find words with--for himself--a high ham probability. I don't see how this is necessarily going to be an effective general-purpose technique. If you need to start bombarding people with thousands of messages to find the good words you're just going to drive more people into using filters--and this will almost certainly coerce ISPs into doing more filtering as well. Plus, you've got to deal with the issue of keeping data on all those users to find out which words are good for them. This would require you to tailor your spam to each individual user, which probably is going to increase the cost to the spammer (at least in terms of disk storage and time, anyway) and, as Graham-Cumming implemented it, is going to fail utterly for anyone who isn't viewing mail as HTML, anyway.
  • by grandmofftarkin ( 49366 ) <3b16-ihd3@xemaps.com> on Wednesday February 04, 2004 @12:00PM (#8180000)
    1. don't sign up on any page that requires your email address to verify *cough*like this one [slashdot.org] *cough*
    2. don't use free email services hotmail etc.
    3. don't use AOL
    4. don't let anyone have your address that forwards messages like "cute bunny pic" or "funny anti-geek joke" etc.
    5. don't post your email anywhere.
    6. don't sign up for majordomo lists.

    Yeah, great, and I'm sure it works a treat, BUT: 1 and 6 are not practical for many people. As for 2 and 3, for whatever reason these services may suit some people (money constraints, location). Some people have friends or relatives who do 4; should they just start ignoring them? What if they want to converse with those people [are these playboy bunny pics by the way? ;-)]? As for 5, one simple mistake and you are done for anyway.

    Also, why should spammers be allowed to prevent people from using the internet as they see fit? No, I'm sorry, but there are better solutions than trying to follow all your advice. I mean, whilst your points are valid, you might as well say:

    7. Don't use the internet

    I guarantee that last one will work perfectly!

  • by letxa2000 ( 215841 ) on Wednesday February 04, 2004 @12:00PM (#8180009)
    I'm not sure I understand why they think this is a problem with Bayesian filtering. Basically, they're saying that if a spammer sends you the same message thousands of times but inserts a few slightly different words each time, and if the thousands of messages get through the Bayesian filter to the user, and if the user doesn't disable HTML bugs on his email client, then we have a problem...?

    First, if the spammer sends thousands of copies of the same message and just changes the "extra words" that he is testing, it will take very little time for Bayesian to adapt to the rest of the message. Suddenly, the rest of the message that previously contained non-spammy words will be considered very spammy and will overwhelm the "extra words" that each message contains. Each time the message is caught as spam, the probability that any future tests get through--regardless of the "extra words"--will be reduced even further.

    Second, as the article said, it's a lot of work on the part of the spammer. They'd have to send out thousands of messages to each target to "sniff them out", and most of those wouldn't even be effective, since most of them would be caught by filters and, of those few that got through, very few would load the HTML bugs to identify themselves.

    Finally, it assumes that those that are using Bayesian filters are filtering their email but leaving their security (inasmuch as HTML bugs) wide open. While there may be some people that use Bayesian and leave HTML bugs active, it has to be a small minority.

    In short, it seems to me they've "found" a way to get around Bayesian that won't work, so to speak. I just don't see the problem.... ??
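The first point above (the filter adapting to the unchanging part of the message) can be illustrated with a toy word-probability filter. This is a simplified sketch, not POPFile's or any real filter's algorithm: the class `TinyBayes`, the clamping to 0.01-0.99, the 0.4 default for unseen words and the tiny training corpus are all assumptions chosen to keep the example short.

```python
import math
from collections import Counter


class TinyBayes:
    """Toy per-word spam filter; every distinct token in a message counts once."""

    def __init__(self, unknown: float = 0.4):
        self.spam, self.ham = Counter(), Counter()
        self.n_spam = self.n_ham = 0
        self.unknown = unknown  # score given to never-seen words

    def train(self, text: str, is_spam: bool) -> None:
        tokens = set(text.lower().split())
        if is_spam:
            self.spam.update(tokens)
            self.n_spam += 1
        else:
            self.ham.update(tokens)
            self.n_ham += 1

    def token_prob(self, token: str) -> float:
        s, h = self.spam[token], self.ham[token]
        if s + h == 0:
            return self.unknown
        ps = s / max(self.n_spam, 1)
        ph = h / max(self.n_ham, 1)
        return min(0.99, max(0.01, ps / (ps + ph)))

    def score(self, text: str) -> float:
        """Combined spam probability over all distinct tokens."""
        tokens = set(text.lower().split())
        log_spam = sum(math.log(self.token_prob(t)) for t in tokens)
        log_ham = sum(math.log(1 - self.token_prob(t)) for t in tokens)
        return 1 / (1 + math.exp(log_ham - log_spam))


f = TinyBayes()
f.train("meeting notes attached for the wireless project", is_spam=False)
f.train("lunch at noon with the project team", is_spam=False)
f.train("cheap pills buy now", is_spam=True)

body = "exclusive offer inside today"          # the part that never changes
first_copy = body + " meeting wireless lunch"   # padded with "good" words
second_copy = body + " notes noon team"         # same body, new padding

before = f.score(first_copy)       # body words are unknown, padding looks like ham
f.train(first_copy, is_spam=True)  # the user marks the first copy as spam
after = f.score(second_copy)       # body words now count as spam evidence
```

In this toy setup the first copy scores close to 0 and the second copy, after a single round of training, scores close to 1, even though its padding words are still pure ham. Real filters combine tokens differently, so the size of the effect will vary.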

  • by Sique ( 173459 ) on Wednesday February 04, 2004 @12:12PM (#8180104) Homepage
    Second, as the article said, it's a lot of work on the part of the spammer. They'd have to send out thousands of messages to each target to "sniff them out", and most of those wouldn't even be effective, since most of them would be caught by filters and, of those few that got through, very few would load the HTML bugs to identify themselves.

    This is exactly the point. Most of the spam examples will die out because they have an ineffective collection of non-spam words. But a few will survive, and you can now train your own Bayesian filter, which collects the versions of spam that generated webbug hits. After a while, some words will shine prominently in your Bayesian filter database for being very effective at slipping through Bayesian spam filters.

    Basically you are fighting the dote with itself. And yes, you can automate the process. Just take your everyday spam (penis enlargement, unsecured credit, Nigerian business opportunities...), take a dictionary and then randomly mix dictionary words into your spam messages and send them out to your email database. Create a website to get the webbug hits and associate every spam message with a hash of the random dictionary words to identify successful sets of anti-spam words.
  • by Winkhorst ( 743546 ) on Wednesday February 04, 2004 @12:14PM (#8180133)
    The best solution I have found so far is to have your own domain and generate specific email addresses for specific types of communications. You keep your actual ISP email address totally secret and don't give it to anybody except your domain registrar. You then generate an address for your best friends and acquaintances you can trust and keep it separate from everything else so you don't have to change it but once every few years if that. You have a specific Shopping and Registration address you kill and replace after it becomes spammy. And you have an address for things like newsletters and email groups you can also change and reregister if they leak out to the spam boobs. There are all kinds of variations on this theme, but that's the basic gist of the matter: Secrecy and flexibility.
  • by djrogers ( 153854 ) on Wednesday February 04, 2004 @12:33PM (#8180311)
    • 1) Register a domain (come on, they're cheap now)
    • 2) Get an email address from your ISP or other provider (yahoo, fastmail.fm etc) that is complex and convoluted - no names or words
    • 3) set up mail redirection with Zoneedit, redirection.net etc. with a catchall to your new mailbox.
    • 4) Use a different email address every time you must sign up for anything (ie amazon.com@newdomain.com)
    • 5) Filter on sent to headers at first sign of compromised id, or if the volume for a particular id gets too heavy and you're tired of client side filtering, set a specific redirection for it to sample@sample.com (do a whois on sample.com if you're curious).
    • 6) Enjoy the same spam free mailbox I've had for 2 years...
    Also helpful is to change your reply-to address every few months and give your friends different addresses based on how clueful they are
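The per-signup address scheme in steps 4 and 5 can be sketched as follows. The helper names `alias_for`, `source_of` and `accept` are hypothetical, `newdomain.com` is the placeholder domain from the example above, and burned addresses are simply rejected here rather than redirected as the poster describes.

```python
DOMAIN = "newdomain.com"  # placeholder domain from the example above


def alias_for(service: str) -> str:
    """Step 4: a distinct address for each place you sign up."""
    return f"{service.lower()}@{DOMAIN}"


def source_of(recipient: str):
    """Which signup an incoming address was given to, or None if not ours."""
    local, sep, domain = recipient.lower().rpartition("@")
    if not sep or domain != DOMAIN:
        return None
    return local


def accept(recipient: str, burned: set) -> bool:
    """Step 5: drop mail sent to aliases that have started attracting spam."""
    source = source_of(recipient)
    return source is not None and source not in burned
```

Because the local part names the signup, the address a spam arrives on also shows which site leaked or sold it.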
  • by oohp ( 657224 ) on Wednesday February 04, 2004 @12:35PM (#8180334) Homepage
    The whole idea of spam filtering is flawed in the long term. It's a vicious circle. Anti-spammers make new innovations like Bayesian filtering, spammers pay Russian and Eastern European hackers with questionable ethics to develop new spam filter evading techniques and viruses that open up mail relays, etc. We should instead focus on developing alternatives to SMTP like NGMP and such, which make mail storage the sender's responsibility.

    I think I've seen something about NGMP at the Jabber Software Foundation, and if I recall accurately there already is some implementation.
  • by Nightlight3 ( 248096 ) on Wednesday February 04, 2004 @01:16PM (#8180691)
    How about going after the people who own the links in the body of the spam?

    You are starting with a heretical premise that government, or rather, the large corporations which pull the strings, have the same objective as the end user (the end of spam). Of course, it could be stopped (by cracking down hard on those contracting the spammers). But it is much more useful for them if the "war on spam" goes on and on, while the measures with side-effects (on your wallet, your freedom and your privacy) are gradually introduced to "combat" the spam. Just recall other such "wars" such as "war on drugs" or "war on poverty/racism" or "war on smoking" or "war on guns" or the most recent "war on terrorism". This is an ancient recipe of control and enslavement, perfected by churches and priesthoods over millennia (war on sin/devil, war on death), merely translated into modern jargon and current circumstances.

  • Perhaps not! (Score:2, Insightful)

    by The_DOD_player ( 640135 ) on Wednesday February 04, 2004 @01:21PM (#8180732)
    I don't feel that would be an effective spamming technique. A person's outgoing e-mail is such low-volume that a spammer isn't really spreading the word.

    It doesn't take very much volume to defeat the function of spam-blocking.
    I have a very effective spam filter on my server (customised spamassassin + some procmail scripts): 95-98% catch rate, virtually no false positives. The remaining spam is just nonsense; the mails make no sense, and the spammers are unable to sell anything from these spam mails. Their primary purpose seems to be to defeat the filter, so if I set up the filter to block them, it will also generate false positives.

    Not to mention that it'd have to include a mechanism for the spammer to get paid for the victim sending the message.

    They don't need to get paid for the "contaminated" emails. The purpose would be to create false positives, thereby forcing the operator to loosen the filter, and THEN get the real spam through.

    I'd lose my patience quickly if someone I knew sent me spam a second time after I alerted them to their problem. Fortunately, I don't know that many clueless people.

    I don't see how that will stop spammers trying to contaminate legit emails. A few clueless users is all it takes.
  • by pclminion ( 145572 ) on Wednesday February 04, 2004 @01:44PM (#8180914)
    The stuff you're talking about is all fine, but it will fail because the spammers will evolve to defeat it.

    I think you overestimate the intelligence of these creeps. The fact that spammers are using more and more of these garbage terms, randomizers, and other hacks to get around the filters actually encourages me -- it demonstrates that they really don't have the slightest clue how statistical content based filtering actually works. Currently, they are taking advantage of the extremely bad decision to assign a 0.4 score to unknown words. The spammers are exploiting a crack in the armor, which means the armor needs to be fixed.

    A human can filter spam. A spammer can't weasel his way around human intelligence, so this sets an upper bound on how advanced the spammer techniques can get. All we have to do is get document classification up to the point of competitiveness with human performance, and the problem is solved. And research into these directions isn't wasted, because the motivation for the research is for actual important document organization tasks. The effect of stomping out spam will be a cool side effect.

    If a spammer was ever actually intelligent enough to get around serious, well-constructed classifiers, I highly doubt he would be in the business of spamming. To suggest that spammers could intellectually compete with people who have spent years specializing in statistical language processing is a tad bit ridiculous.

    At some point, to sell something, the spammer has to say something intelligible which is an advertisement. They can't hide this. Techniques which are foiled by bogus terms at the bottom of the email are broken. It's not a valid reason to believe that spammers are actually getting smart.
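To see why a 0.4 default for unknown words could matter, here is a small numeric sketch. It assumes a naive filter that multiplies together the probabilities of every token in the message; filters that only look at the most extreme tokens would behave differently. The function names and the three 0.99 "spammy" tokens are invented for the example.

```python
import math


def combined_score(probs):
    """Combine per-token spam probabilities into one score via summed log-odds."""
    log_odds = sum(math.log(p) - math.log(1 - p) for p in probs)
    return 1 / (1 + math.exp(-log_odds))


def padded_score(n_padding: int, unknown_prob: float, spammy=(0.99, 0.99, 0.99)):
    """Score of a message with a few strongly spammy words plus unknown padding."""
    return combined_score(list(spammy) + [unknown_prob] * n_padding)


no_padding = padded_score(0, 0.4)         # three spammy words alone
padded_at_0_4 = padded_score(40, 0.4)     # 40 unknown words, each leaning ham
padded_at_0_5 = padded_score(40, 0.5)     # same padding, neutral default
```

With a 0.4 default, each unknown word pulls the odds toward ham by a factor of 2/3, so in this sketch 40 nonsense words are enough to drag three strongly spammy tokens below 0.5. With a neutral 0.5 default the padding changes nothing.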

  • by duck_prime ( 585628 ) on Wednesday February 04, 2004 @01:57PM (#8181034)
    ... The internet essentially carries with it a stupid-user tax. Worms, virii [sic, heh], spam, et al are the by-products of stupidity, but as with most taxes, it is just something that you have to deal with.
    With respect to spam, let's take a step back. Obviously somebody out there is gleefully munching handfuls of Viagra and (ahem) "enhancement" pills to psych himself up to (ahem) r0x0r his wife until her weight-loss pills kick in.

    It is silly to assume that all these people are just morons. After all, Viagra is proven to work, it is a legitimate product of sorts. The internet is there for hefty short limp (ahem ahem) non-digerati as well as for propeller heads, God bless 'em.

    It seems to me that spam is the runaway bastard-child of something which actually is good and useful -- that is, targeted marketing to the willing. Don't throw out the baby with the bathwater. There is a huge legitimate market out there, just begging to be flee^wmarketed.

    The anti-spam people are fighting against the Invisible Hand. Good luck.
  • by FireBreathingDog ( 559649 ) on Wednesday February 04, 2004 @02:13PM (#8181191)
    It's much easier than that to defeat Bayesian filtering. Ever \/\/0|\|D3R why you're getting so much spam with obfuscated words? Or why you're getting so much spam where the text content is contained primarily in images rather than plaintext? Those things bypass Bayesian filters, that's why!

    Bayesian filters rely on words. That means they are dependent upon word breaks and certain spellings. Well, spammers have been avoiding word breaks (either by removing spaces or introducing unnecessary ones) and obvious "spam words" by mangling the word or introducing "1337"-type spelling.

    And Bayesian filters can't parse graphics, so a lot of spammers are careful to put words likely to trigger spam filters into graphics.

    BTW, this article [brain-terminal.com] explains why there will never be a filtering-based solution to solving spam until SMTP itself is made more secure.
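One possible countermeasure to the mangled spellings described above is to normalise tokens before they reach the word-based filter. The sketch below is an assumption-laden illustration, not a feature of any particular filter: the substitution tables are a small hand-picked sample, and mapping "1" to "i" is a guess that will be wrong for some words.

```python
# Multi-character substitutions must run before single-character ones,
# and the longer "\/\/" (w) before the shorter "\/" (v).
MULTI_CHAR = [(r"\/\/", "w"), (r"|\|", "n"), (r"\/", "v")]
SINGLE_CHAR = str.maketrans(
    {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "7": "t", "@": "a", "$": "s"}
)


def normalize_token(token: str) -> str:
    """Map common '1337'-style substitutions back to plain letters."""
    token = token.lower()
    for pattern, letter in MULTI_CHAR:
        token = token.replace(pattern, letter)
    return token.translate(SINGLE_CHAR)
```

The obvious cost is that legitimate digits get rewritten too (a year such as 2004 would be mangled), so a real filter would probably keep both the raw and the normalised token. This does nothing about text hidden in images.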

  • easily combatable (Score:3, Insightful)

    by CAIMLAS ( 41445 ) on Wednesday February 04, 2004 @07:59PM (#8184948)
    This is easily defeated by an intelligent spellcheck built into antispam filters. It'd be able to recognize things such as commonly misspelled words, PGP/GPG keys, and file signatures, but would then create a rating based on number or percentage of non-words.

    It could then mark it with a spam rating and be combined with spamassassin or such.

    plus, wouldn't the spamassassin logic be able to say, "hey, we're getting a lot of non-word stuff - our filters tell us it's spam" and defeat this spam already?
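The "percentage of non-words" rating suggested above might look something like this. It is a bare sketch: `non_word_ratio` is a made-up name, the dictionary is whatever word set the caller supplies, and the exemptions for PGP/GPG keys and file signatures mentioned in the comment are not implemented.

```python
import re


def non_word_ratio(text: str, dictionary: set) -> float:
    """Fraction of tokens in the text that are not in the given dictionary."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    if not tokens:
        return 0.0
    unknown = sum(1 for token in tokens if token not in dictionary)
    return unknown / len(tokens)
```

The ratio could then be fed into an overall spam score as one signal among several. It only helps against gibberish or mangled words; padding made of real dictionary words would score as clean.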

