ports/162218: SpamAssassin's sa-learn can't parse mbox of CommuniGate Pro

Alexey Markov redrat at mail.ru
Tue Nov 1 08:20:09 UTC 2011


>Number:         162218
>Category:       ports
>Synopsis:       SpamAssassin's sa-learn can't parse mbox of CommuniGate Pro
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-ports-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Nov 01 08:20:07 UTC 2011
>Closed-Date:
>Last-Modified:
>Originator:     Alexey Markov
>Release:        8.2-RELEASE-p4
>Organization:
JSC Complitex
>Environment:
FreeBSD meson.complitex.ru 8.2-RELEASE-p4 FreeBSD 8.2-RELEASE-p4 #0: Mon Oct 17 11:44:31 MSD 2011     redrat at meson.complitex.ru:/arc/obj/arc/src/sys/MESON amd64
>Description:
In recent versions of CommuniGate Pro, the date format in the
From_ line changed. The old format looked like "From <>(________-000000000007) Wed Feb 20 20:28:23 2008"; the new one looks like "From <>(S_____________-000000085573) 16-04-2010_08:55:34_".

As a result, sa-learn can no longer parse CGP mbox files, and users get the message "Learned tokens from 0 message(s) (0 message(s) examined)".
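
The mismatch can be reproduced outside SpamAssassin with a short script. The following is only a sketch: the two patterns are copied from the patch in this report, the sample lines are the two From_ lines quoted above, and the helper name is_separator is made up for illustration and is not part of ArchiveIterator.pm.

```perl
use strict;
use warnings;

# Pattern used before the patch: accepts only a ctime-style date.
my $old_re = qr/^From \S+  ?\S\S\S \S\S\S .\d .\d:\d\d:\d\d \d{4}/;

# Pattern from the patch: also accepts the newer CGP date, DD-MM-YYYY_HH:MM:SS_.
my $new_re = qr/^From \S+  ?(\S\S\S \S\S\S .\d .\d:\d\d:\d\d \d{4}|.\d-\d\d-\d{4}_\d\d:\d\d:\d\d_)/;

# The two sample From_ lines quoted in this report.
my $old_line = "From <>(________-000000000007) Wed Feb 20 20:28:23 2008\n";
my $new_line = "From <>(S_____________-000000085573) 16-04-2010_08:55:34_\n";

# Hypothetical helper: returns 1 if $line would be treated as a message
# separator under pattern $re, 0 otherwise. It mirrors the substr() guard
# that precedes the regex in the patched code.
sub is_separator {
    my ($re, $line) = @_;
    return (substr($line, 0, 5) eq "From " && $line =~ $re) ? 1 : 0;
}

for my $line ($old_line, $new_line) {
    printf "old=%d new=%d  %s",
        is_separator($old_re, $line),
        is_separator($new_re, $line),
        $line;
}
```

With these inputs, the old pattern matches only the first line, while the patched pattern matches both.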
>How-To-Repeat:
Install SpamAssassin and CommuniGate Pro, then run sa-learn on a CGP mbox that contains spam.
>Fix:
The attached patch fixes this problem.

Patch attached with submission follows:

Index: lib/Mail/SpamAssassin/ArchiveIterator.pm
===================================================================
--- lib/Mail/SpamAssassin/ArchiveIterator.pm	(revision 1190346)
+++ lib/Mail/SpamAssassin/ArchiveIterator.pm	(working copy)
@@ -396,7 +396,8 @@
   }
   seek(INPUT,$offset,0)  or die "cannot reposition file to $offset: $!";
   for ($!=0; <INPUT>; $!=0) {
-    last if (substr($_,0,5) eq "From " && @msg && /^From \S+  ?\S\S\S \S\S\S .\d .\d:\d\d:\d\d \d{4}/);
+    #Changed Regex to include boundaries for Communigate Pro versions (5.2.x and later). per Bug 6413
+    last if (substr($_,0,5) eq "From " && @msg && /^From \S+  ?(\S\S\S \S\S\S .\d .\d:\d\d:\d\d \d{4}|.\d-\d\d-\d{4}_\d\d:\d\d:\d\d_)/);
     push (@msg, $_);
 
     # skip too-big mails
@@ -908,8 +909,9 @@
 	      $header .= $_;
 	    }
 	  }
+          #Changed Regex to include boundaries for Communigate Pro versions (5.2.x and later). per Bug 6413
 	  if (substr($_,0,5) eq "From " &&
-	      /^From \S+  ?\S\S\S \S\S\S .\d .\d:\d\d:\d\d \d{4}/) {
+	      /^From \S+  ?(\S\S\S \S\S\S .\d .\d:\d\d:\d\d \d{4}|.\d-\d\d-\d{4}_\d\d:\d\d:\d\d_)/) {
 	    $in_header = 1;
 	    $first = $_;
 	    $start = $where;


>Release-Note:
>Audit-Trail:
>Unformatted:


