Unsolicited spam filter training?

Recently I have been getting comment spam that does not look like spam at all. It does not contain any links to pr0n or poker sites, but the text is obviously autogenerated junk. The only purpose I can think of is an attempt to pollute Bayesian spam filters.

The comments contain some generic "hey, what a nice blog" text and some links to sites that I would not suspect of blog spamming: sun.com, altavista.com, ... But the link text sounds pretty much like spam, e.g. "my parents didnt told me about it".

Some time ago I saw a similar phenomenon with email spam: junk mail that contained nothing but autogenerated text, without any sign of advertising. I guess someone is trying to teach spam filters around the world that words usually associated with spam are not bad at all, so that the next sweep of real spam will pass through the filters.
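To make that guess a bit more concrete, here is a minimal sketch of how such pollution could work against a word-counting Bayesian filter. It is a toy, not the algorithm of any real filter: the `TinyBayes` class, its method names, and the sample messages are all made up for illustration, and it assumes the harmless-looking junk ends up being learned as legitimate text (for example because the blog owner approves it or the filter auto-learns from it).

```python
from collections import Counter


class TinyBayes:
    """Toy per-word Bayesian scorer (hypothetical, for illustration only)."""

    def __init__(self):
        self.spam_words = Counter()  # number of spam messages containing each word
        self.ham_words = Counter()   # number of legitimate messages containing each word
        self.spam_msgs = 0
        self.ham_msgs = 0

    def train(self, text, is_spam):
        # Count each word at most once per message.
        words = set(text.lower().split())
        if is_spam:
            self.spam_words.update(words)
            self.spam_msgs += 1
        else:
            self.ham_words.update(words)
            self.ham_msgs += 1

    def word_spam_probability(self, word):
        # Laplace-smoothed frequency of the word in each class.
        p_word_spam = (self.spam_words[word] + 1) / (self.spam_msgs + 2)
        p_word_ham = (self.ham_words[word] + 1) / (self.ham_msgs + 2)
        # Probability that a message containing the word is spam,
        # assuming both classes are equally likely beforehand.
        return p_word_spam / (p_word_spam + p_word_ham)


f = TinyBayes()
for text in ["win big at poker", "poker bonus now", "free poker chips"]:
    f.train(text, is_spam=True)
for text in ["nice article thanks", "what about python", "see you tomorrow"]:
    f.train(text, is_spam=False)

before = f.word_spam_probability("poker")

# Assumption: the junk comments look harmless and are learned as legitimate.
for text in ["hey nice blog poker", "poker my parents", "great poker site"]:
    f.train(text, is_spam=False)

after = f.word_spam_probability("poker")
print(before, after)
```

In this sketch the score for "poker" drops after the junk has been learned as legitimate, which is the effect I suspect the spammers are after. Real filters combine many words and use more careful statistics, so how well this works in practice is an open question.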

You are reading the (archived) weblog of Benjamin Niemann. This weblog has been closed; no new articles will be posted here.
If you can read German, you may have a look at my new weblog.
