[BlueOnyx:01657] Re: filter email on language
Michael Stauber
mstauber at blueonyx.it
Mon Jul 13 05:29:48 -05 2009
Hi Steffan,
> Does anybody know a way (say procmail) to filter e-mail on language?
> I'd like to separate Dutch email from other languages.
That is something that can't really be done reliably enough. The problem is
that you can't trust email clients to encode emails correctly or to leave
indications about what language may be found in the message body. In the end
you really need something that evaluates the text in the message body to make
guesstimates (and they're really guesses!) about what language that is.
Which means those checks may sometimes be wrong. Chances are they're more
often wrong on shorter emails (with less text to parse).
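To illustrate why such guesses get shakier on short messages, here is a
minimal sketch in Python. It is not how SpamAssassin or any other tool
actually does it. The word lists, the function name guess_language and the
min_hits threshold are all made up for this example; a real detector would
use much larger word lists or character n-gram profiles.

```python
import re

# Tiny hand-picked stopword lists. These are assumptions for illustration
# only and far too small for real-world use.
DUTCH = {"de", "het", "een", "en", "van", "ik", "is", "dat", "niet",
         "op", "te", "voor", "met", "zijn"}
ENGLISH = {"the", "a", "an", "and", "of", "i", "is", "that", "not",
           "on", "to", "for", "with", "are"}


def guess_language(text, min_hits=3):
    """Guess 'nl', 'en' or 'unknown' by counting common stopwords."""
    # Split the lowercased text into runs of letters (digits and
    # punctuation are dropped).
    words = re.findall(r"[^\W\d_]+", text.lower())
    nl_hits = sum(word in DUTCH for word in words)
    en_hits = sum(word in ENGLISH for word in words)
    # Too little evidence, or a tie: refuse to guess.
    if max(nl_hits, en_hits) < min_hits or nl_hits == en_hits:
        return "unknown"
    return "nl" if nl_hits > en_hits else "en"
```

A one-word reply such as "Bedankt!" contains no stopwords at all, so the
sketch returns "unknown" for it, which is exactly the short-message problem
described above.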
With Procmail alone this will therefore be rather complex, tricky and
possibly unreliable.
With SpamAssassin you could possibly get functionality somewhere near what
you want to achieve.
In SpamAssassin you can define so called "ok_locales" (locales that you want
to accept email in). If an email arrives for that user (or you can also set it
globally) in a locale that is not OK'ed, then the mail will trigger the rule
CHARSET_FARAWAY. It may be possible to set ok_locales in SpamAssassin so that
Dutch mail is accepted and use a procmail rule to move anything not having
CHARSET_FARAWAY to a specific folder destined for Dutch messages. Note that
this check looks at the character set of the message, not at the words in it,
and Dutch shares its character sets with English and other Western European
languages. How reliably SpamAssassin can detect Dutch messages I can't say.
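The sorting step itself boils down to "look at the rule names SpamAssassin
recorded and pick a folder". In practice that would be a procmail recipe; the
sketch below only shows the logic in Python. It assumes the default
X-Spam-Status header layout with a "tests=" field, and the folder names
"Dutch", "Other" and "INBOX" as well as both function names are made up for
the example.

```python
import re
from email import message_from_string


def spamassassin_tests(raw_message):
    """Return the set of rule names listed in the X-Spam-Status header."""
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status", "")
    # The header is usually folded over several lines, so remove any
    # whitespace that follows a comma before picking out the tests= field.
    status = re.sub(r",\s+", ",", status)
    match = re.search(r"tests=(\S*)", status)
    if not match:
        return set()
    # "none" is what SpamAssassin writes when no rule matched.
    return {name for name in match.group(1).split(",")
            if name and name != "none"}


def choose_folder(raw_message):
    """Pick a (hypothetical) folder name for one raw message."""
    msg = message_from_string(raw_message)
    if msg.get("X-Spam-Status") is None:
        # Never scanned by SpamAssassin: leave it where it is.
        return "INBOX"
    if "CHARSET_FARAWAY" in spamassassin_tests(raw_message):
        return "Other"
    return "Dutch"
```

Keep in mind that a message landing in "Dutch" here only means that
CHARSET_FARAWAY did not fire, not that the text was positively identified
as Dutch.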
Another option is to use the SpamAssassin plugin RelayCountry. Our AV-SPAM v5
and v5.1 use this since the latest update (3-4 days ago) to assign scores to
emails from China, Korea, Russia, Romania and a few other countries. If you
have our AV-SPAM (I think you do), then check
/etc/mail/spamassassin/country.cf for the existing syntax. It'll give you some
ideas. You can add scores for additional countries, too, such as a negative
score for messages originating in NL. But that method won't simply catch
messages written in Dutch. It'll catch *any* message that originated on a
mailserver in the Netherlands, regardless of whether the message is in Dutch,
English or any other language.
But maybe someone else has another idea that helps you along.
--
With best regards
Michael Stauber