Par must exclude non-breaking space from the class of space chars

Jean-Baptiste Quenot jb.quenot at
Sun Mar 28 06:19:49 PST 2004

>Submitter-Id:	current-users
>Originator:	Jean-Baptiste Quenot
>Confidential:	no
>Synopsis:	Par must exclude non-breaking space from the class of space chars
>Severity:	serious
>Priority:	medium
>Category:	ports
>Class:		update
>Release:	FreeBSD 5.1-CURRENT i386
System: FreeBSD 5.1-CURRENT FreeBSD 5.1-CURRENT #6: Tue Oct 14 19:03:28 CEST 2003 jbq at i386
Par 1.52 on FreeBSD does not behave as the upstream author intended.  On
FreeBSD, the isspace() library function returns true for the non-breaking
space character 0xA0, but this is an unintended side effect of the
setlocale() call added in Par 1.52.

Quoting a message from the upstream author:
From: "Adam M. Costello" <amc+0zjyiz+ at>
Date: Tue, 2 Dec 2003 21:19:10 +0000
To: Jean-Baptiste Quenot <jb.quenot at>
User-Agent: Mutt/1.5.4i
> on FreeBSD, the locales definitions include non-breaking space in the
> list of spaces, thus isspace(160) is true, and as a result all my
> nbsps are filtered out, and lines are broken on them.
> I noticed that the GNU libc has removed 0xA0 from spaces on purpose.
> But the BSD guys seem to have another approach, as this kind of stuff
> is "implementation specific".
That's interesting.  This was not an issue in Par 1.51, because it
didn't call setlocale(), so only ASCII characters were recognized
by isspace(), isalnum(), islower(), etc.  In par 1.52, a call to
setlocale() was added so that non-ASCII letters and digits would be
recognized for the purpose of the g,B,P,Q options.
An unforeseen side effect is that non-ASCII white-space characters are
now recognized.
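
The effect described above can be checked with a few lines of C.  This is
only a sketch: the helper name nbsp_is_space_in() is made up for
illustration and is not part of Par, and it assumes the caller was running
in the default "C" locale beforehand.

```c
#include <ctype.h>
#include <locale.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of Par: report whether isspace()
 * classifies 0xA0 (the non-breaking space in ISO 8859-1) as white
 * space under the named locale.
 * Returns 1 or 0, or -1 if the locale is not available.
 */
int nbsp_is_space_in(const char *locale_name)
{
    int result;

    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;
    result = isspace(0xA0) ? 1 : 0;
    setlocale(LC_CTYPE, "C");   /* switch back to the "C" locale */
    return result;
}
```

nbsp_is_space_in("C") is 0 on any conforming implementation, which matches
the Par 1.51 behaviour.  Going by the report above, an ISO 8859-1 locale on
FreeBSD should give 1 and the same locale on GNU libc should give 0; the
exact locale name to pass (for instance "en_US.ISO8859-1") varies between
systems.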

Here is the fragment declaring SPACE and BLANK for the ISO Latin 1 locale on
FreeBSD:
 * Standard LOCALE_CTYPE for the ISO 8859-1 Locale
 * $FreeBSD: src/share/mklocale/la_LN.ISO8859-1.src,v 1.3 2001/11/30 05:05:53 ache Exp $
SPACE           0x09 - 0x0d ' ' 0xa0
UPPER           'A' - 'Z' 0xc0 - 0xd6 0xd8 - 0xde
XDIGIT          '0' - '9' 'a' - 'f' 'A' - 'F'
BLANK           ' ' '\t' 0xa0

Set your locale to an 8-bit character set such as ISO8859-1.  Insert
non-breaking spaces into a text and run it through par: it converts them to
plain spaces and even wraps lines on them.
Apply the following patch:
--- par.c.orig	Sun Mar 28 16:00:15 2004
+++ par.c	Sun Mar 28 16:04:00 2004
@@ -403,7 +403,8 @@
-      if (isspace(c)) ch = ' ';
+      // Exclude non-breaking space from the class of space chars
+      if (isspace(c) && c != 0xA0) ch = ' ';
       else blank = 0;
       additem(cbuf, &ch, errmsg);
       if (*errmsg) goto rlcleanup;
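
The test used in the patch can also be tried outside of par.c.  The
following is a minimal sketch, not code from Par: the names NBSP,
is_breaking_space() and normalize_spaces() are invented here, and the value
0xA0 assumes an ISO 8859-1 style single-byte character set.

```c
#include <ctype.h>
#include <stddef.h>

#define NBSP 0xA0   /* non-breaking space in ISO 8859-1 (assumption) */

/*
 * Hypothetical helper: true for white space on which a line may be
 * broken, i.e. anything isspace() accepts except the non-breaking space.
 * c must be representable as unsigned char or be EOF, as for isspace().
 */
int is_breaking_space(int c)
{
    return isspace(c) && c != NBSP;
}

/*
 * Replace every breaking white-space character in buf[0..len-1] with a
 * plain space, leaving non-breaking spaces untouched.
 */
void normalize_spaces(unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        if (is_breaking_space(buf[i]))
            buf[i] = ' ';
}
```

With this predicate a tab or newline is still turned into a space, while a
0xA0 byte is kept regardless of what the current locale says about it.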

Thanks in advance,
Jean-Baptiste Quenot
