misc/100212: UTF-8 zero-width character patch

J.R. Oldroyd fbsd at opal.com
Thu Jul 13 15:00:35 UTC 2006


>Number:         100212
>Category:       misc
>Synopsis:       UTF-8 zero-width character patch
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          change-request
>Submitter-Id:   current-users
>Arrival-Date:   Thu Jul 13 15:00:30 GMT 2006
>Closed-Date:
>Last-Modified:
>Originator:     J.R. Oldroyd
>Release:        FreeBSD 6.1-STABLE i386
>Organization:
>Environment:
System: FreeBSD linwhf.opal.com 6.1-STABLE FreeBSD 6.1-STABLE #1: Thu May 18 16:03:24 EDT 2006 xxx at linwhf.opal.com:/usr/obj/usr/src/sys/LINWHF i386
>Description:
This patch gives the so-called zero-width (non-spacing, or
overstriking) characters in the UTF-8 locale definition a
width of 0.  At present these characters are assigned a width
of 1, which is wrong.
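
As a rough illustration of the effect (this is not part of the patch),
the sketch below hard-codes the ranges that the Fix section tags as
SWIDTH0 and counts display columns for a sequence of code points.  The
names cp_range, is_zero_width and display_columns are hypothetical, and
the sketch assumes every other code point occupies one column, so it
ignores the SWIDTH2 ranges entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive code point range. */
struct cp_range {
	uint32_t first;
	uint32_t last;
};

/* Ranges copied from the SWIDTH0 lines of the patch. */
static const struct cp_range zero_width_ranges[] = {
	{ 0x0300, 0x036f },	/* Combining Diacritical Marks */
	{ 0x0483, 0x0486 },	/* Cyrillic combining marks */
	{ 0x0488, 0x0489 },
	{ 0x0e31, 0x0e31 },	/* Thai */
	{ 0x0e34, 0x0e3a },
	{ 0x0e47, 0x0e4e },
	{ 0x20d0, 0x20ff },	/* Combining Diacritical Marks for Symbols */
	{ 0x3099, 0x309a },	/* Hiragana combining voiced sound marks */
	{ 0xfe20, 0xfe2f },	/* Combining Half Marks */
	{ 0x1d165, 0x1d169 },	/* Musical Symbols */
	{ 0x1d17b, 0x1d182 },
	{ 0x1d185, 0x1d18b },
	{ 0x1d1aa, 0x1d1ad },
};

/* Hypothetical helper: 1 if cp lies in a SWIDTH0 range, else 0. */
static int
is_zero_width(uint32_t cp)
{
	size_t i;
	size_t n = sizeof(zero_width_ranges) / sizeof(zero_width_ranges[0]);

	for (i = 0; i < n; i++) {
		if (cp >= zero_width_ranges[i].first &&
		    cp <= zero_width_ranges[i].last)
			return (1);
	}
	return (0);
}

/*
 * Hypothetical helper: columns used by len code points, assuming
 * (for this sketch only) that everything outside the SWIDTH0
 * ranges is exactly one column wide.
 */
static int
display_columns(const uint32_t *cps, size_t len)
{
	size_t i;
	int cols = 0;

	assert(cps != NULL || len == 0);
	for (i = 0; i < len; i++) {
		if (!is_zero_width(cps[i]))
			cols++;
	}
	return (cols);
}
```

Under these assumptions, the sequence U+0065 U+0301 (e followed by a
combining acute accent) counts as one column, whereas a width of 1 for
U+0301 would make it two.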

>How-To-Repeat:
Save this file:
	http://opal.com/freebsd/unicode/utf8demo.txt

In an xterm, cat the file and examine the "Combining characters"
and the "Thai (UCS Level 2)" sections.  Without the patch, the
non-spacing characters do not overstrike the preceding character.
With the patch, they do.

This patch was posted to -current and has been downloaded and
reviewed many times since that posting:
	http://lists.freebsd.org/pipermail/freebsd-current/2006-June/064218.html

>Fix:
--- /usr/src/share/mklocale/UTF-8.src.orig	Sat Mar 27 03:14:14 2004
+++ /usr/src/share/mklocale/UTF-8.src	Mon Jun 26 23:15:34 2006
@@ -487,9 +487,9 @@
  * U+0300 - U+036F : Combining Diacritical Marks
  */
 
-GRAPH     0x0300 - 0x034f  0x0360 - 0x036f
-PRINT     0x0300 - 0x034f  0x0360 - 0x036f
-SWIDTH1   0x0300 - 0x034f  0x0360 - 0x036f
+GRAPH     0x0300 - 0x036f
+PRINT     0x0300 - 0x036f
+SWIDTH0   0x0300 - 0x036f
 
 MAPUPPER  < 0x0345 0x0399 >
 
@@ -593,7 +593,8 @@
 UPPER     0x04e2  0x04e4  0x04e6  0x04e8  0x04ea  0x04ec  0x04ee
 UPPER     0x04f0  0x04f2  0x04f4  0x04f8
 PRINT     0x0400 - 0x0486  0x0488 - 0x04ce  0x04d0 - 0x04f5  0x04f8  0x04f9
-SWIDTH1   0x0400 - 0x0486  0x0488 - 0x04ce  0x04d0 - 0x04f5  0x04f8  0x04f9
+SWIDTH1   0x0400 - 0x0482  0x048a - 0x04ce  0x04d0 - 0x04f5  0x04f8  0x04f9
+SWIDTH0   0x0483 - 0x0486  0x0488 - 0x0489
 
 MAPUPPER  < 0x0430 - 0x044f : 0x0410 >
 MAPUPPER  < 0x0450 - 0x045f : 0x0400 >
@@ -1016,7 +1017,8 @@
 GRAPH     0x0e01 - 0x0e3a  0x0e3f - 0x0e5b
 PUNCT     0x0e3f  0x0e4f  0x0e5a  0x0e5b
 PRINT     0x0e01 - 0x0e3a  0x0e3f - 0x0e5b
-SWIDTH1   0x0e01 - 0x0e3a  0x0e3f - 0x0e5b
+SWIDTH0   0x0e31  0x0e34 - 0x0e3a  0x0e47 - 0x0e4e
+SWIDTH1   0x0e01 - 0x0e30  0x0e32 - 0x0e33  0x0e3f - 0x0e46  0x0e4f - 0x0e5b
 
 
 /*
@@ -1647,9 +1649,9 @@
  * U+20D0 - U+20FF : Combining Diacritical Marks for Symbols
  */
 
-GRAPH     0x20d0 - 0x20ea
-PRINT     0x20d0 - 0x20ea
-SWIDTH1   0x20d0 - 0x20ea
+GRAPH     0x20d0 - 0x20ff
+PRINT     0x20d0 - 0x20ff
+SWIDTH0   0x20d0 - 0x20ff
 
 
 /*
@@ -1927,7 +1929,8 @@
 PUNCT     0x309b  0x309c
 PRINT     0x3041 - 0x3096  0x3099 - 0x309f
 PHONOGRAM 0x3041 - 0x3096  0x309f
-SWIDTH2   0x3041 - 0x3096  0x3099 - 0x309f
+SWIDTH2   0x3041 - 0x3096  0x309b - 0x309f
+SWIDTH0   0x3099 - 0x309a
 
 
 /*
@@ -2149,9 +2152,9 @@
  * U+FE20 - U+FE2F : Combining Half Marks
  */
 
-GRAPH     0xfe20 - 0xfe23
-PRINT     0xfe20 - 0xfe23
-SWIDTH1   0xfe20 - 0xfe23
+GRAPH     0xfe20 - 0xfe2f
+PRINT     0xfe20 - 0xfe2f
+SWIDTH0   0xfe20 - 0xfe2f
 
 
 /*
@@ -2272,7 +2275,8 @@
 PUNCT     0x1d100 - 0x1d126  0x1d12a - 0x1d164  0x1d16a - 0x1d16c
 PUNCT     0x1d183  0x1d184  0x1d18c - 0x1d1a9  0x1d1ae - 0x1d1dd
 PRINT     0x1d100 - 0x1d126  0x1d12a - 0x1d172  0x1d17b - 0x1d1dd
-SWIDTH1   0x1d100 - 0x1d126  0x1d12a - 0x1d172  0x1d17b - 0x1d1dd
+SWIDTH1   0x1d100 - 0x1d126  0x1d12a - 0x1d164  0x1d16a - 0x1d172  0x1d183  0x1d184  0x1d18c - 0x1d1a9  0x1d1ae - 0x1d1dd
+SWIDTH0   0x1d165 - 0x1d169  0x1d17b - 0x1d182  0x1d185 - 0x1d18b  0x1d1aa - 0x1d1ad
 
 
 /*
>Release-Note:
>Audit-Trail:
>Unformatted:

