How to delete non-ASCII chars in file
keramida at ceid.upatras.gr
Fri Sep 5 14:58:37 UTC 2008
On Fri, 5 Sep 2008 10:14:08 -0400, "Mark B." <mkbucc at gmail.com> wrote:
> I have a text file that includes some non-ASCII characters.
> For example, opening the file in vi shows lines like this:
> 'easth_0.541716776378' 0 \xe2\x80\x98dire' 2
> Is there a command-line tool I can use to delete these
> characters? I tried:
> cat f | tr -cd [:print:]
> but this removes the newlines.
It may be more useful to run the file through sed(1). sed works on one
line at a time, and the trailing newline is not part of the text it
matches, so the newlines are never deleted:
$ echo '^Fhello^F' | sed -e 's/[^[:print:]]//g' | hd
00000000 68 65 6c 6c 6f 0a |hello.|
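To run the same substitution over a whole file, something like the
following sketch might do. The function names are made up for
illustration, and forcing LC_ALL=C is an assumption on my part: in a
UTF-8 locale a sequence such as \xe2\x80\x98 may count as printable and
would be left alone. Note that tabs are not in [:print:] either, so
both variants remove them.

```sh
# strip_nonprint is a hypothetical helper name, not from the thread.
# LC_ALL=C (an assumption) makes every byte above 0x7f non-printable,
# so multibyte sequences like \xe2\x80\x98 are deleted too.
strip_nonprint() {
    LC_ALL=C sed -e 's/[^[:print:]]//g'
}

# The tr(1) variant: -c complements the set, so adding \n to the set
# tells tr to keep newlines along with the printable characters.
strip_nonprint_tr() {
    LC_ALL=C tr -cd '[:print:]\n'
}

# Input is: a ^F b \xe2\x80\x98 c newline
printf 'a\006b\342\200\230c\n' | strip_nonprint    # prints "abc"
```

For a real file, `strip_nonprint < f > f.clean' writes the cleaned copy
to a new file and leaves the original untouched.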
> I also tried
> cat f | sed "s/[^:print:]//g"
> but it didn't remove the characters.
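The matching pattern is wrong. You need `[^[:print:]]'. The character
class of printable characters is `[:print:]', and it is only recognized
inside a bracket expression. Written as `[^:print:]', the pattern is an
ordinary bracket expression that matches any character other than `:',
`p', `r', `i', `n' and `t'. You negate a bracket expression with
`[^xxxx]', where `xxxx' is the character class; hence the extra pair of
brackets in `[^[:print:]]'.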
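Because the class name sits inside an ordinary bracket expression, more
than one class can share the same brackets. As a rough sketch (the
function name is hypothetical, and LC_ALL=C is again my assumption),
this keeps tabs and spaces while still deleting other control
characters:

```sh
# keep_print_and_blank is a hypothetical helper name.
# Both classes sit inside one negated bracket expression, so a character
# is deleted only if it is neither printable nor a space or tab.
keep_print_and_blank() {
    LC_ALL=C sed -e 's/[^[:print:][:blank:]]//g'
}

# Input is: a TAB b ^F c newline. The ^F is removed, the tab is kept.
printf 'a\tb\006c\n' | keep_print_and_blank
```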