ports/103082: [patch][/usr/ports/mail/elm+ME] hdrdecode and add chinese Big5

pasear ©¬¿üº¸ wchunhao at csie.nctu.edu.tw
Sat Sep 9 21:50:28 UTC 2006


>Number:         103082
>Category:       ports
>Synopsis:       [patch][/usr/ports/mail/elm+ME] hdrdecode and add chinese Big5
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    freebsd-ports-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Sep 09 21:50:21 GMT 2006
>Closed-Date:
>Last-Modified:
>Originator:     pasear ©¬¿üº¸
>Release:        FreeBSD 6.1-STABLE amd64
>Organization:
NCTU CSIE
>Environment:
System: FreeBSD ccbsd12 6.1-STABLE FreeBSD 6.1-STABLE #0: Mon Jun 5 22:14:28 CST 2006 root at ccbsd12:/usr/obj/usr/src/sys/AMD64_BSD6 amd64


	
>Description:
    Two patches.

    patch_precompiled_sets.c adds simple support for Chinese Big5.
	Under LC_ALL=en_US.ISO8859-1,
	Chinese Big5 displays correctly if elm treats it as ISO8859-1, that is, passes the bytes through untouched.
	This patch does not work if the user specifies LC_ALL=zh_TW.Big5,
	but elm does not display well under zh_TW.Big5 in any case.

	patch_hdrdecode.c:
	Many MUAs send encoded From:, To:, and Subject: headers wrapped in double quotes,
	but elm just passes them through undecoded.

	For example,
	From: pasear  "=?big5?B?qay//Lq4?="  <wchunhao at csie.nctu.edu.tw>
	To: pasear =?Big5?B?qay//Lq4?= <wchunhao at csie.nctu.edu.tw>
	Subject: abc""=?Big5?B?UmU6IKuixW9+?="def

	The To: line works well,
	but elm cannot decode the From: line, even though quoting encoded text is very common.
	The Subject: line should also be decoded with the prefix and suffix strings intact,
	though this usage is not common.

	Most users just assume elm cannot handle Chinese Big5 when they see the undecoded
	text.
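
	The splitting described above can be sketched in isolation. This is
	not elm code: split_encoded_word, struct ew_parts and copy_span are
	hypothetical names used only for illustration, and the buffer sizes
	are arbitrary. Like the patch, it drops at most one double quote on
	each side of the encoded word and keeps the rest of the prefix and
	suffix as they are.

```c
#include <string.h>

/* Hypothetical helper, not part of elm: locate an RFC 2047 encoded word
 * ("=?charset?encoding?payload?=") inside a token and report the text
 * around it.  One double quote directly before and one directly after
 * the encoded word are dropped.  Buffer sizes are arbitrary. */
struct ew_parts {
    char prefix[64], charset[32], payload[128], suffix[64];
    char encoding;                      /* 'B' or 'Q', as found in the token */
};

/* Copy the bytes in [s, e) into dst, truncating to fit cap. */
static void copy_span(char *dst, size_t cap, const char *s, const char *e)
{
    size_t n = (size_t)(e - s);
    if (n >= cap)
        n = cap - 1;
    memcpy(dst, s, n);
    dst[n] = '\0';
}

/* Returns 1 and fills *out when an encoded word is found, 0 otherwise. */
static int split_encoded_word(const char *tok, struct ew_parts *out)
{
    const char *start = strstr(tok, "=?");
    const char *q1, *q2, *stop, *pre_end, *suf;

    if (!start)
        return 0;
    q1 = strchr(start + 2, '?');        /* '?' that ends the charset */
    if (!q1 || !q1[1] || q1[2] != '?')
        return 0;
    q2 = q1 + 2;                        /* '?' after the encoding letter */
    stop = strstr(q2 + 1, "?=");        /* "?=" that ends the payload */
    if (!stop)
        return 0;

    pre_end = start;
    if (pre_end > tok && pre_end[-1] == '"')
        pre_end--;                      /* drop the opening quote */
    suf = stop + 2;
    if (*suf == '"')
        suf++;                          /* drop the closing quote */

    copy_span(out->prefix, sizeof out->prefix, tok, pre_end);
    copy_span(out->charset, sizeof out->charset, start + 2, q1);
    out->encoding = q1[1];
    copy_span(out->payload, sizeof out->payload, q2 + 1, stop);
    copy_span(out->suffix, sizeof out->suffix, suf, suf + strlen(suf));
    return 1;
}
```

	For the Subject: example, this yields the prefix abc", the charset
	Big5, the encoding B, the payload UmU6IKuixW9+ and the suffix def.
	Decoding the payload and converting the charset are left out.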

	
>How-To-Repeat:

    Copy the following mail to /var/mail/$USER, and run elm.


    From wchunhao at ccbsd12.csie.nctu.edu.tw Sat Sep  9 23:55:57 2006
    Received: from ccbsd12.csie.nctu.edu.tw (wchunhao at ccbsd12.csie.nctu.edu.tw [140.113.209.72])
	    by mailgate.csie.nctu.edu.tw (8.13.4/8.13.4) with ESMTP id k89FtuXe013926
		    for <wchunhao at csie.nctu.edu.tw>; Sat, 9 Sep 2006 23:55:56 +0800 (CST)
			    (envelope-from wchunhao at ccbsd12.csie.nctu.edu.tw)
	Received: (from wchunhao at localhost)
	    by ccbsd12.csie.nctu.edu.tw (8.13.6/8.13.6/Submit) id k89Ftvtd090499
		    for wchunhao at csie.nctu.edu.tw; Sat, 9 Sep 2006 23:55:57 +0800 (CST)
			    (envelope-from wchunhao)
	Date: Sat, 9 Sep 2006 23:55:57 +0800
	From: pasear  "=?big5?B?qay//Lq4?="  <wchunhao at csie.nctu.edu.tw>
	To: pasear ""=?Big5?B?qay//Lq4?="" <wchunhao at csie.nctu.edu.tw>
	Subject: abc""=?Big5?B?UmU6IKuixW9+?="def
	Message-ID: <20060909155557.GA90491 at csie.nctu.edu.tw>
	MIME-Version: 1.0
	Content-Type: text/plain; charset=Big5
	Content-Disposition: inline
	Content-Transfer-Encoding: 8bit
	User-Agent: Mutt/1.5.12-2006-07-14
	Status: RO

	¤j®a¦n¡A§Ú¬O¥¿Å餤¤å

	
>Fix:

	

--- patch_hdrdecode.c begins here ---
--- work/elm2.4.ME+.122/lib/hdrdecode.c	Sat Jul  9 18:03:15 2005
+++ work.bak/elm2.4.ME+.122/lib/hdrdecode.c	Sun Sep 10 04:57:10 2006
@@ -173,9 +173,16 @@
     char *encoded = NULL;
     struct string *ret = NULL;
     charset_t set;
+	char *front, *end;
+	struct string *fstr, *estr;
 
-    if ('=' != *p++)
+	/* Pasear: front, end are used to solve buffer: abc""=?...?=" problem */
+	front = p;
+	while (*p && '=' != *p) ++p;
+	if (front != p && '=' == *p && '"' == *(p-1)) *(p-1) = '\0';
+    if ('=' != *p)
 	goto fail;
+	*p = '\0'; ++p;
     if ('?' != *p++)
 	goto fail;
     sn = p;
@@ -209,8 +216,8 @@
     p++;
     if ('=' != *p++)
 	goto fail;
-    if (*p)
-	goto fail;
+	if ('"' == *p) ++p;
+	end = p;
 
     set = MIME_name_to_charset(sn,CHARSET_create);
 
@@ -225,6 +232,18 @@
 	break;
     }
 
+	/* Pasear: re-attach the text before and after the encoded word */
+	if (ret){
+		fstr = new_string2(system_charset,us_str(front));
+		estr = cat_strings(fstr, ret, 0);
+		free_string(&fstr);
+		free_string(&ret);
+		fstr = new_string2(system_charset,us_str(end));
+		ret = cat_strings(estr, fstr, 0);
+		free_string(&estr);
+		free_string(&fstr);
+	}
+
  fail:
     if (!ret) {
 	DPRINT(Debug,20,(&Debug, 
@@ -341,20 +360,31 @@
     struct string * ret = new_string(defcharset);
     char **tokenized = rfc822_tokenize(buffer);
     unsigned char * last_char = NULL;
-    int i;
+    int i, encoded;
+	char* p;
 
     for (i = 0; tokenized[i]; i++) {
 
 	struct string * ok = NULL;
 	int nostore = 0;
 
+	/* Pasear: detect if it is an encoded string */
+	encoded = 0;
+	if ('"' == tokenized[i][0]){
+		p = tokenized[i];
+		while (*p && *p != '=') ++p;
+		if (*p && *p == '=' && *(p+1) && *(p+1) == '?' )
+			encoded = 1;
+	}
+
+
 	if ('(' == tokenized[i][0]) {
 	    /* we need add last space */
 	    if (last_char) 
 		add_ascii_to_string(ret,last_char);
 	    ok = hdr_comment(tokenized[i],defcharset,demime);
 	    nostore = 1;
-	} else if ('"' == tokenized[i][0]) {
+	} else if (!encoded && '"' == tokenized[i][0]) {
 	    /* we need add last space */
 	    if (last_char) 
 		add_ascii_to_string(ret,last_char);
--- patch_hdrdecode.c ends here ---

--- patch_precompiled_sets.c begins here ---
--- work/elm2.4.ME+.122/lib/precompiled_sets.c	Sat Jul  9 18:03:15 2005
+++ work.bak/elm2.4.ME+.122/lib/precompiled_sets.c	Sun Sep 10 03:29:48 2006
@@ -400,7 +400,8 @@
     { &cs_euc,      &map_EUC_ascii,  SET_valid,  "GB2312",  NULL, 
       &set_EUCCN,         2025,  "GB2312-1980" }, /* ASCII + GB 2312-80 */
 
-    { &cs_unknown,  NULL,  SET_valid,  "Big5",  NULL, NULL,           2026,  NULL },
+    { &cs_ascii, &map_latin1, SET_valid,  "Big5", 
+      ASCII, &(sets_iso_8859_X[1]),                                      2026,   "Big5" },
     { &cs_ascii,    NULL,  SET_valid, "windows-1250", ASCII, NULL,     2250,  NULL },
     { &cs_ascii,    NULL,  SET_valid, "windows-1253", ASCII, NULL,     2253,  NULL },
     { &cs_ascii,    NULL,  SET_valid, "windows-1254", ASCII ,NULL,     2254,  NULL },
--- patch_precompiled_sets.c ends here ---
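
The quoted-token check that patch_hdrdecode.c adds to the tokenizer loop can be read on its own as the sketch below. The function name quoted_token_is_encoded is hypothetical and does not appear in elm; the logic is meant to mirror the patch, not replace it.

```c
#include <stddef.h>

/* Hypothetical standalone version of the check in the patch: a token is
 * treated as encoded when it starts with a double quote and the first
 * '=' found in it is immediately followed by '?'. */
static int quoted_token_is_encoded(const char *tok)
{
    const char *p;

    if (tok == NULL || tok[0] != '"')
        return 0;
    for (p = tok; *p && *p != '='; ++p)
        ;                               /* advance to the first '=' or the end */
    return *p == '=' && p[1] == '?';
}
```

Only the first '=' is inspected, so a quoted token such as "a=b =?Big5?B?...?=" would not be detected. Tokens without a leading double quote are left to the existing code path.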


>Release-Note:
>Audit-Trail:
>Unformatted:


