How to delete non-ASCII chars in file

Mark B. mkbucc at gmail.com
Fri Sep 5 16:27:14 UTC 2008


On Fri, Sep 5, 2008 at 10:58 AM, Giorgos Keramidas
<keramida at ceid.upatras.gr> wrote:

> $ echo '^Fhello^F' | sed -e 's/[^[:print:]]*//' | hd
> 00000000  68 65 6c 6c 6f 06 0a                              |hello..|
> 00000007
> $

Thanks.

> The matching pattern is wrong.  You need `[^[:print:]]'.  The character
> class of printable characters is `[:print:]', and you can negate the
> pattern with `[^xxxx]' where `xxxx' is the character class; hence the
> extra pair of brackets in `[^[:print:]]'.

In case you are interested, I've patched the re_format man page with this
example.  I had read it, and it says :print: is the "name of the character
class."  I think the concrete example helps clarify things.
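
For anyone trying this out, here is a minimal sketch of the negated class in a sed command. The function name strip_nonprint is made up for illustration, and forcing the C locale is an assumption so that [:print:] covers only printable ASCII. Note that the quoted command above has no `g' flag, so it removes only the first run of non-printable characters on each line, which is why the trailing 06 byte is still in the hd output.

```shell
# Hypothetical helper (name is illustrative): delete every character
# that is not printable, reading stdin and writing stdout.
# LC_ALL=C is an assumption: it limits [:print:] to printable ASCII.
strip_nonprint() {
    # The trailing `g' makes sed replace every match on the line,
    # not just the first one.
    LC_ALL=C sed -e 's/[^[:print:]]//g'
}

# \006 is the octal escape for ^F (ACK).
printf '\006hello\006\n' | strip_nonprint    # prints: hello
```

In the C locale a tab is not in [:print:] either, so this also strips tabs; spaces are kept.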

A follow-up question: is it possible to use that statement in a (BSD) Makefile?
A straight cut-and-paste didn't work, and I couldn't figure out the escaping
needed to make it work.

Thanks,

m

cd to /usr/src/lib/libc/regex/ and apply this patch.

--- /dev/null Fri Sep  5 12:12:21 2008
+++ re_format.7        Fri Sep  5 12:18:29 2008
@@ -288,6 +288,10 @@
 A locale may provide others.
 A character class may not be used as an endpoint of a range.
 .Pp
+To match all characters not in a class, use a bracket expression
+like this:
+.Ql [^[:print:]] .
+.Pp
 There are two special cases\(dd of bracket expressions:
 the bracket expressions
 .Ql [[:<:]]

