Fighting the Good Fight against spam deluge (was: Senatorial (Senescent?) reflective pause) - nntp.perl.org
Front page | perl.perl5.porters |
Postings from July 2008
nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About

Fighting the Good Fight against spam deluge (was: Senatorial (Senescent?) reflective pause)
From: Tom Christiansen
Date: July 31, 2008 15:16
Subject: Fighting the Good Fight against spam deluge (was: Senatorial (Senescent?) reflective pause)
Message ID: 27231.1217542549@chthon

In-Reply-To: Message from Johan Vromans <jvromans@squirrel.nl>
   of "31 Jul 2008 13:31:41 +0200." <m2hca6waqq.fsf@phoenix.squirrel.nl>

NB: Those uninterested in iterator classes or how to survive 50,000
    pieces of mail a day should please skip to /Perl-spin near the
    bottom for another semi-strange perl-related spin-death issue.

>Tom Christiansen <tchrist@perl.com> writes:

>> I still have a vague hunch like a module, or here even a pragma,
>> might be a good idea.

> I'd go for a nice iterator class instead of <<<<>>>> weirdness.

Hi Johan,

Agreed on all those icky bbbracketsss!

But now I'm curious--a condition for which, per Dorothy Parker, there
is no cure :).  Still, I'll try to cure it by asking whether you
might mean:

 (1) A class that has some sort of:

        use overload "<>" => sub { ... };

 (2) A tied-filehandle class that provides a special OPEN and maybe
     READLINE function?

 (3) Something that uses one of the Iterator:: CPAN modules as a base?

 (4) Something else entirely, maybe like hand-rolled iterators like

        use feature "state";    # state variables need Perl 5.10 or later

        # Pass -10, 1, 42, or even "az" if you please.
        # Defaults to 0.
        sub new_iterator(;$) {
            my $start = @_ ? shift() : 0;
            return sub {
                state $count = $start;
                return $count++;
            };
        }

I know we once talked about an ITERATE method, kinda like the
PROPAGATE one, but I think 1 and 2 covered that need, and both seem
to do so while allowing the old <> look-and-feel.  Still, that didn't
stop people from still writing 3 and even 4.

> I don't mind typing a few more characters, especially since (as
> pointed out several times now) it is functionality that often occurs
> only once in a program -- if at all.

I really do use this sort of thing a lot in sysadminny work.  I have
a pair of little perl scripts to temporarily blacklist imposters, one
that trails maillog for sendmail complaints (of my own devising), and
the other daemonlog for spamd(8) [not spamd(1)!] messages.  Sometimes
I run them on /var/log/{daemon,mail}log* respectively, which includes
gz files.
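The gz handling mentioned above can be tried on its own before reading the
full scripts.  What follows is only a minimal sketch: the helper name
argv_for_diamond and the sample file names are made up for illustration and
appear nowhere in the scripts in this message.  It leans on the two-argument
"magic open" behind <>, where an @ARGV entry ending in "|" is run as a
command and its output is read.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the scripts in this message):
# rewrite gzipped names into "command |" form so that the magic
# two-argument open behind <> reads the decompressed stream.
sub argv_for_diamond {
    return map { /\.gz$/ ? "gzip -dc < $_ |" : $_ } @_;
}

# Made-up file names, purely for illustration.
my @rewritten = argv_for_diamond("maillog", "maillog.0.gz");
print "$_\n" for @rewritten;

# In a real script one would then do something like:
#   @ARGV = argv_for_diamond(@ARGV);
#   while (<>) { ... }
# Caveat: names are handed to the shell unquoted, so this assumes
# trusted file names such as those under /var/log.
```

Plain names pass through untouched, so the same loop can read a mix of live
and archived logs without caring which is which.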
    #!/usr/local/bin/perl
    # blacklist-imposters-smtp

    use File::Tail;

    $ML = "/var/log/maillog";

    die "need to run as superuser" unless $> == 0;

    # tie a glob named after the logfile so reads follow it as it grows
    tie(*$ML, "File::Tail", "name" => $ML)
        || die "tie failed to $ML: $!";

    # read any files named on the command line first, then tail the live log
    @handles = ( @ARGV ? *ARGV : (), *$ML );

    # let magic open decompress gzipped logs through a pipe
    @ARGV = map { /\.gz$/ ? "gzip -dc < $_ |" : $_ } @ARGV;

    foreach $fh (@handles) {

        if (@ARGV && $fh eq "*main::ARGV") {
            warn "reading from ", join(", " => @ARGV), "\n";
        } else {
            warn "reading from $fh\n";
        }

        while (<$fh>) {
            # eg:
            # Jul 31 03:04:49 chthon sm-mta[8808]: m6V94csp008808: ruleset=check_mail, arg1=<root@perl.com>, relay=mail.myebroadband.com [58.26.29.142], reject=553 5.3.0 <root@perl.com>... Imposter!
            #
            # Jul 31 04:22:30 chthon sm-mta[8799]: m6VAMLn5008799: ruleset=check_rcpt, arg1=<cech@jhereg.perl.com>, relay=imr-d01.mx.aol.com [205.188.157.39], reject=553 5.3.0 <cech@jhereg.perl.com>... Defunct host spam rejected

            if (   /(Defunct)/      # sent to eg jhereg.perl.com
                || /(Imposter)/     # sent from eg perl.com but rcvd on ext if
               )
            {
                print "$0: Found $1 mail ";
                # the relay's IP address sits in square brackets
                unless ( / \[ ( \d+\.\d+\.\d+\.\d+ ) \] /x ) {
                    warn "malformed line tailed: $_";
                    next;
                }
                print " blacklisting $1\n";
                system("spamdb -t -a $1") == 0
                    || warn "spamdb command failed: $?";
            }
        }
    }

or in blacklist-imposters-spamd:

    #!/usr/local/bin/perl
    # blacklist-imposters-spamd

    use File::Tail;

    $ML = "/var/log/daemon";

    die "need to run as superuser" unless $> == 0;

    # tie a glob named after the logfile so reads follow it as it grows
    tie(*$ML, "File::Tail", "name" => $ML)
        || die "tie failed to $ML: $!";

    # read any files named on the command line first, then tail the live log
    @handles = ( @ARGV ? *ARGV : (), *$ML );

    # let magic open decompress gzipped logs through a pipe
    @ARGV = map { /\.gz$/ ? "gzip -dc < $_ |" : $_ } @ARGV;

    foreach $fh (@handles) {

        if (@ARGV && $fh eq "*main::ARGV") {
            warn "reading from ", join(", " => @ARGV), "\n";
        } else {
            warn "reading from $fh\n";
        }

        while (<$fh>) {
            # eg:
            # Jul 30 23:07:42 chthon spamd[8872]: 121.131.215.187: To: sam@mox.perl.com
            # Jul 30 23:00:36 chthon spamd[8872]: 189.19.251.140: connected (54/46), lists: uatraps
            # Jul 31 06:58:55 chthon spamd[8872]: 122.166.2.69: From: "Mail Delivery Subsystem" <noreply@perl.com>
            # Jul 31 06:58:55 chthon spamd[8872]: 122.166.2.69: To: sales@perl.com
            # Jul 31 06:58:55 chthon spamd[8872]: 122.166.2.69: Subject: Message could not be delivered

            # mail claiming to come from one of our own (or paypal's) addresses
            if ( /spamd.*: (\d+\.\d+\.\d+\.\d+): (From:.*(perl|paypal)\.com)/ ) {
                print "$0: Found imposter poser in spamd log $2, blacklisting $1\n";
                system("spamdb -t -a $1") == 0
                    || warn "spamdb command failed: $?";
            }

            # mail addressed to hosts that no longer exist
            if ( /spamd.*: (\d+\.\d+\.\d+\.\d+): (To:.*(mox|jhereg|wraeththu)\.perl\.com)/ ) {
                print "$0: Found defunct host poser in spamd log $2, blacklisting $1\n";
                system("spamdb -t -a $1") == 0
                    || warn "spamdb command failed: $?";
            }
        }
    }

I also have permanent spamdb entries with choice honeypot users--like
john or michael, sandra or susan--people who've never had accounts
here but are common names for so-called directory-scans.  But that's
something that takes care of itself.  These are from daemonlog:

    Jul 31 01:16:06 chthon spamd[8872]: (BLACK) 220.225.252.243: <brent@maginfo.fr> -> <john@perl.com>
    Jul 31 01:16:34 chthon spamd[8872]: 220.225.252.243: disconnected after 384 seconds. lists: spamd-greytrap
    Jul 31 01:16:53 chthon spamd[8872]: 220.225.252.243: connected (31/29), lists: spamd-greytrap
    Jul 31 01:17:49 chthon spamd[8872]: 220.225.252.243: From: brent@maginfo.fr
    Jul 31 01:17:49 chthon spamd[8872]: 220.225.252.243: To: john@perl.com
    Jul 31 01:17:50 chthon spamd[8872]: 220.225.252.243: Subject: hello
    Jul 31 01:17:51 chthon spamd[8872]: (BLACK) 220.225.252.243: <wjm@best.com> -> <michael@perl.com>
    Jul 31 01:18:55 chthon spamd[8872]: 220.225.252.243: disconnected after 387 seconds. lists: spamd-greytrap
    Jul 31 01:19:06 chthon spamd[8872]: 220.225.252.243: connected (22/21), lists: spamd-greytrap
    Jul 31 01:19:36 chthon spamd[8872]: 220.225.252.243: From: wjm@best.com
    Jul 31 01:19:36 chthon spamd[8872]: 220.225.252.243: To: michael@perl.com
    Jul 31 01:19:36 chthon spamd[8872]: 220.225.252.243: Subject:
    Jul 31 01:20:28 chthon spamd[8872]: (BLACK) 220.225.252.243: <britney@teleport.com> -> <sandra@perl.com>
    Jul 31 01:20:39 chthon spamd[8872]: 220.225.252.243: disconnected after 387 seconds. lists: spamd-greytrap
    Jul 31 01:20:48 chthon spamd[8872]: 220.225.252.243: connected (21/21), lists: spamd-greytrap
    Jul 31 01:22:17 chthon spamd[8872]: 220.225.252.243: From: britney@teleport.com
    Jul 31 01:22:17 chthon spamd[8872]: 220.225.252.243: To: sandra@perl.com
    Jul 31 01:22:17 chthon spamd[8872]: 220.225.252.243: Subject: hello
    Jul 31 01:22:43 chthon spamd[8872]: (BLACK) 220.225.252.243: <brent@maginfo.fr> -> <john@perl.com>

For those that don't take care of themselves, Perl helps a lot with
the ad-hoc ones, and it really does simplify coding to run a simple
map on @ARGV to convert gzipped archived logs into parsable text.

I rely on pf and spamd(8) for my front line of defense (#1), with
its very clever interaction with pf(4) and persistent tables for
packet redirection.  For my second line (#2) of defense, I have
sendmail configured pretty aggressively.
Besides that, I've split up the duties into a somewhat elaborate
separation of external-mta (listens externally; load-limited;
time-limited, etc) from internal-mta (listens internally, not
load-limited) from internal queue-deliverer (which runs only a few
jobs at a time).  It catches things like:

    Jul 31 06:44:57 chthon sm-mta[8204]: ruleset=check_relay, arg1=imsar.bu.edu.ro, arg2=127.0.0.4, relay=imsar.bu.edu.ro [217.73.165.147], reject=553 5.3.0 Spam blocked - see https://www.spamhaus.org/

I also have sendmail prime the log with its noises about imposters
or defunct hosts, for later processing by blacklist-imposters-smtp.

The message only makes it to spamassassin, slow as it is, after
that, as stage #3.  This is from maillog, not daemon, and shows it
processing your message to me:

    Jul 31 06:45:36 chthon sm-mta[26698]: m6VCjXVl026698: from=<perl5-porters-return-139016-tchrist=perl.com@perl.org>, size=1756, class=-60, nrcpts=1, msgid=<m2d4kuw7c4.fsf@phoenix.squirrel.nl>, proto=SMTP, daemon=MTA, relay=x6.develooper.com [63.251.223.186]
    Jul 31 06:45:41 chthon spamd[2463]: spamd: connection from localhost [127.0.0.1] at port 21901
    Jul 31 06:45:41 chthon spamd[2463]: spamd: setuid to tchrist succeeded
    Jul 31 06:45:41 chthon spamd[2463]: spamd: processing message <m2d4kuw7c4.fsf@phoenix.squirrel.nl> for tchrist:101
    Jul 31 06:45:47 chthon spamd[2463]: spamd: clean message (-10.6/4.5) for tchrist:101 in 6.3 seconds, 2040 bytes.
    Jul 31 06:45:47 chthon spamd[2463]: spamd: result: . -10 - BAYES_00,RCVD_IN_DNSWL_HI scantime=6.3,size=2040,user=tchrist,uid=101,required_score=4.5,rhost=localhost,raddr=127.0.0.1,rport=21901,mid=<m2d4kuw7c4.fsf@phoenix.squirrel.nl>,bayes=0.000000,autolearn=ham
    Jul 31 06:45:48 chthon sm-queue[3539]: m6VCjXVl026698: to="|/home/tchrist/.audit_mail tchrist", ctladdr=<tchrist@perl.com> (101/10), delay=00:00:14, xdelay=00:00:08, mailer=prog, pri=229756, dsn=2.0.0, stat=Sent

There's also a stage #4 (|.audit_mail) and even a stage #5 (sorting
into incoming folders, eg: direct, personal, p5p, etc).

Every day, senders attempt to deliver between 30,000 and 60,000 or so
pieces of mail to me, nearly *ALL* of it spam.  But my load stays
around 0.42 and the machine is nimble to the interactive touch, even
though it's only an old 300 MHz Pentium-2 (686) with 128M of real
memory and 512K of L2 cache.

It took some doing to get me to that state, but I think it's amazing
it works at all, let alone with nearly no visible impact on its 2-4
interactive users.

I do have another, related Perl-spin bug/problem, but I am pretty
sure this is some pessimal combo of input data and processing code.
Here's an example of it:

      UID   PID  PPID CPU PRI  NI   VSZ   RSS WCHAN  STAT TT       TIME COMMAND
        0  2463  3795   0   2  20 44920 10020 poll   I    ??   14:42.02 perl5.10.0: spamd child (perl5.10.0)

Nearly 15 minutes CPU time to process one message?!

That's SpamAssassin's spamd(1) servicing a spamc(1) client.  I figure
that somewhere there must be a regex in SpamAssassin that could use
nonbacktracking, whether (?>...) or possessive quantifiers or perhaps
the new "backtracking control verbs".  But I don't know where it is,
and haven't the patience to track it down.  So I just kill those off
when they take too long.

Is this something others here have seen?  I'd at first thought that
5.10.0 had fixed it, but I was mistaken.

--tom
--
    "What Orwell feared were those who would ban books.  What Huxley
     feared was that there would be no reason to ban a book, for there
     would be no one who wanted to read one.  Orwell feared those who
     would deprive us of information.  Huxley feared those who would
     give us so much that we would be reduced to passivity and egoism.
     Orwell feared that the truth would be concealed from us.  Huxley
     feared the truth would be drowned in a sea of irrelevance.  Orwell
     feared we would become a captive culture.  Huxley feared we would
     become a trivial culture, ... "
        --Neil Postman, foreword to "Amusing Ourselves to Death"
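As a hedged illustration of the nonbacktracking constructs named in the
message above: the pattern below is invented for the example and is not
taken from SpamAssassin's rules, whose offending regex remains unidentified.
It writes one nested-quantifier pattern three ways: plain, with an atomic
group (?>...), and with possessive quantifiers (both possessive quantifiers
and backtracking control verbs arrived in Perl 5.10).  For these two sample
strings all three agree on what matches; the difference lies only in how
many ways the engine is allowed to retry after a failure.

```perl
use strict;
use warnings;

# Invented pattern: a run of words followed by a colon.
my %variant = (
    plain      => qr/^(?:\w+\s*)+:/,       # nested quantifiers may backtrack
    atomic     => qr/^(?>(?:\w+\s*)+):/,   # group never gives back what it matched
    possessive => qr/^(?:\w++\s*+)++:/,    # each quantifier keeps what it matched
);

my @samples = (
    "Subject line here: body",          # words followed by a colon
    "no colon anywhere in this line",   # words only, so no variant can match
);

for my $name (sort keys %variant) {
    for my $text (@samples) {
        my $verdict = $text =~ $variant{$name} ? "match" : "no match";
        print "$name: $verdict: $text\n";
    }
}
```

Atomic groups and possessive quantifiers are not drop-in replacements in
general: a pattern that relies on giving characters back will stop matching,
so any such rewrite needs checking against known-good input.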
- Re: Iterator::Diamond (Was: Re: Fighting the Good Fight against spam deluge) by Tom Christiansen
- Re: Fighting the Good Fight against spam deluge by Johan Vromans
- Re: Fighting the Good Fight against spam deluge by chromatic
- Fighting the Good Fight against spam deluge (was: Senatorial (Senescent?) reflective pause) by Tom Christiansen
- Re: Fighting the Good Fight against spam deluge (was: Senatorial(Senescent?) reflective pause) by Tim Bunce
- Re: Fighting the Good Fight against spam deluge (was: Senatorial (Senescent?) reflective pause) by jm
- Re: Fighting the Good Fight against spam deluge (was: Senatorial (Senescent?) reflective pause) by jvromans
- Diamond iteration (was: Fighting the Good Fight against spam deluge) by Tom Christiansen
- Re: Fighting the Good Fight against spam deluge by Roland Giersig
- Re: Fighting the Good Fight against spam deluge by Johan Vromans
- Re: Fighting the Good Fight against spam deluge by chromatic
- Re: Fighting the Good Fight against spam deluge by H.Merijn Brand
- Re: Fighting the Good Fight against spam deluge by Roland Giersig
- Re: Fighting the Good Fight against spam deluge by jvromans
- Re: Fighting the Good Fight against spam deluge by Tom Christiansen
- Re: Fighting the Good Fight against spam deluge by Johan Vromans
- Re: Fighting the Good Fight against spam deluge by Roland Giersig
- Re: Fighting the Good Fight against spam deluge by H.Merijn Brand
- Re: Diamond iteration (was: Fighting the Good Fight against spam deluge) by Tom Christiansen
- Re: Diamond iteration (was: Fighting the Good Fight against spamdeluge) by H.Merijn Brand