WhyNot?

Anti-spam idea

Category: Spam
Responses: 4 (2 in support, 0 neutral, 2 in opposition)
Number of views: 361

It seems to me that the reason spam is so objectionable is not that they're continually trying to get me to download porn or increase the size of my willy, but because after I've decided I'm not interested they keep pestering me with the same messages, day after day. I don't get any more interested for being offered it ten thousand times!

The emails all have slightly different titles, and slightly different body copy, but fundamentally the messages are the same each time. So, how would it be if we created a metric which represented how similar two messages are to each other?

Now, if you saw a whole cluster of identical messages, that's a "hotspot", and possibly represents spam. A very large number of nearly identical messages would also represent a hotspot. And, because we're measuring similarity, once we've got a hotspot we can save it and use it to look for other messages days or weeks later.
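
A minimal sketch of the hotspot idea, assuming Python's standard `difflib.SequenceMatcher` as a stand-in for the similarity metric. The function names, the 0.8 threshold and the minimum group size of 3 are illustrative assumptions, not part of the original proposal.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a score in [0, 1]; 1.0 means the two texts are identical."""
    return SequenceMatcher(None, a, b).ratio()


def find_hotspots(messages, threshold=0.8, min_size=3):
    """Greedily group messages; a group of min_size or more is a hotspot.

    Each message joins the first group whose representative (its first
    member) is at least `threshold` similar; otherwise it starts a new group.
    The threshold and min_size defaults are assumed values for illustration.
    """
    groups = []  # each group is a list of messages
    for msg in messages:
        for group in groups:
            if similarity(msg, group[0]) >= threshold:
                group.append(msg)
                break
        else:
            groups.append([msg])
    return [g for g in groups if len(g) >= min_size]
```

A saved hotspot could then be reused later by keeping one representative message per group and comparing new arrivals against it with `similarity`. Comparing every message against every group is slow at router scale, so this only shows the shape of the idea.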

The hotter the hotspot, the more likely the hotspot is to be spam. Routers and ISPs could dump the messages without even passing them on. Not very responsible routers could pass on the messages, but levy a charge for doing so - and there lies the possibility of creating real opt-out lists.

The point about this idea is that, once the message has arrived in my inbox, it's hard to find any mechanical way of detecting it as spam, and the damage has already been done (the message has clogged up the telecom links). But, when the messages are still moving around the world in a herd, they're a much bigger target - easier to spot, and easier to exterminate.

It all lies in the similarity metric. Surely, some kind of "bad" hash function would work as a good marker, and then a co-compression would tell us exactly how similar they are. Any Information Theorists out there?

Jules May, Nov 08 2003
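
One way to read the "co-compression" suggestion is the normalized compression distance: compress each message alone, compress the two joined together, and see how much the second message adds. The sketch below assumes `zlib` as the compressor; the function names are hypothetical and nothing here is claimed to be what the poster had in mind.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed data (maximum compression level)."""
    return len(zlib.compress(data, 9))


def ncd(a: str, b: str) -> float:
    """Normalized compression distance between two texts.

    Values near 0 mean the texts share most of their content;
    values near 1 mean compressing them together saves almost nothing.
    """
    x, y = a.encode("utf-8"), b.encode("utf-8")
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)
```

For very short messages the compressor's fixed overhead dominates, so the score is only meaningful on texts of a few hundred bytes or more.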

Comments from other members:

This has been done, several times. The better ones use Bayes' theorem, which I'm both too ignorant and too lazy to explain. Fortunately you can read all about it at http://spamprobe.sourceforge.net/

psb777, Nov 08 2003
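
For readers who want the gist of the Bayes approach mentioned above, here is a toy word-count naive Bayes classifier. It is a generic textbook sketch with assumed names (`TinyBayes`, `train`, `spam_probability`) and add-one smoothing; it is not the implementation used by SpamProbe or any other tool named in this thread.

```python
import math
from collections import Counter


class TinyBayes:
    """Toy naive Bayes spam scorer over whitespace-separated words."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        """Record one message as 'spam' or 'ham'."""
        self.docs[label] += 1
        self.words[label].update(text.lower().split())

    def spam_probability(self, text: str) -> float:
        """Estimate P(spam | text) from the trained word counts."""
        vocab = len(set(self.words["spam"]) | set(self.words["ham"]))
        total_docs = sum(self.docs.values())
        log_score = {}
        for label in ("spam", "ham"):
            # Smoothed prior, so an untrained class never gives log(0).
            score = math.log((self.docs[label] + 1) / (total_docs + 2))
            n = sum(self.words[label].values())
            for w in text.lower().split():
                # Add-one smoothing, with one extra slot for unseen words.
                score += math.log((self.words[label][w] + 1) / (n + vocab + 1))
            log_score[label] = score
        diff = log_score["ham"] - log_score["spam"]
        if diff > 700:  # avoid overflow in exp(); the message is clearly ham
            return 0.0
        return 1.0 / (1.0 + math.exp(diff))
```

Usage would be a few calls to `train` with known spam and known good mail, then `spam_probability(new_message)` compared against a cutoff of the user's choosing.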

It is better that you get all the mail and sort it at your computer. There is always going to be something you want that looks like spam. Even a really good spam sorter will make a mistake now and then. Here are the stats on mine:

Messages classified: 17,201
Classification errors: 30
Accuracy: 99.82%

An excellent, free spam sorter is available at http://popfile.sourceforge.net/

It uses the same idea as mentioned in the comment above, but on the reader's own PC, not at the server.

holymakeral, Nov 08 2003

I use IE and have set up message filter rules to do some screening. Basically, all mail that does not have my address in the "To" field is sent to the trash box.

Every now and then I check the trash box for mistakes (about 1 in 10).

tonymak, Nov 08 2003
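
The "To" field rule described above can be sketched with Python's standard `email` package. The function name `route` and the folder names are assumptions for illustration; the commenter's actual mail client rules are not shown in the thread.

```python
from email import message_from_string
from email.utils import getaddresses


def route(raw_message: str, my_address: str) -> str:
    """Return 'inbox' if my_address appears in the To header, else 'trash'."""
    msg = message_from_string(raw_message)
    to_headers = msg.get_all("To", [])  # empty list if there is no To header
    recipients = {addr.lower() for _name, addr in getaddresses(to_headers)}
    return "inbox" if my_address.lower() in recipients else "trash"
```

Mail sent via Bcc or through a mailing list usually does not carry the reader's address in "To", which is one plausible source of the mistakes the commenter checks the trash for.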

I'd be insulted if someone were filtering my mail without my permission, which is what it sounds like you are proposing. I suggest getting an email client with decent mail filtering (Thunderbird works well for me, and it's free) so that you can tell it how stringent you want it to be and you can double-check it every once in a while to see if it made any mistakes.

An anti-mass-mailing algorithm can also trash legitimate mail (newsletters, etc.).

dumllama, Jun 30 2005