Sed, shell and hexadecimal character codes

Oliver Fromme olli at lurza.secnetix.de
Fri May 23 15:23:26 UTC 2008


Karel Miklav wrote:
 > There's a tip in the FreeBSD fortunes database that says:
 > 
 > > Want to strip UTF-8 BOM (Byte Order Mark) from given files?
 > > 
 > > sed -e '1s/^\xef\xbb\xbf//' < bomfile > newfile

FreeBSD's sed(1) doesn't support hexadecimal or octal
sequences.  GNU sed does accept \xHH escapes, so the tip
should work there (/usr/ports/textproc/gsed).

I don't know why that fortunes entry exists.  As written,
it's wrong for the base system's sed.
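
If you want to stay with sed anyway, one possible workaround is to
let printf(1) produce the raw BOM bytes from octal escapes and to
hand them to sed as literal bytes.  This is only a sketch: the file
contents are made up for the test, "bom" is just a variable name I
picked, and I'm assuming a sed that copes with bytes above 127 when
running in the C locale.

```sh
# Create a small test file that starts with a UTF-8 BOM
# (octal 357 273 277 is hexadecimal ef bb bf).
printf '\357\273\277hello\nworld\n' > bomfile

# printf expands the octal escapes, so the variable holds the
# three raw bytes and sed never has to parse an escape sequence.
bom=$(printf '\357\273\277')
LC_ALL=C sed "1s/^${bom}//" < bomfile > newfile

# Look at the first bytes of the result to check that the BOM is gone.
od -c newfile | head -n 1
```

The double quotes around the sed script matter here, because the
shell has to expand ${bom} before sed sees the command.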

 > I can't make it work, and I can't find any other method to
 > work with hexa codes in scripts or on the command line so
 > I'm kind-a depressed :) I help myself with xxd now, but if
 > it is possible to avoid it, I'd like to hear about it.

There is no standard for handling octal and hexadecimal
sequences, unfortunately, so you have to consult the
manual page to find out.  For example, tr(1) supports
octal sequences only (no hexadecimal), while awk(1)
supports both.  So the above line could be rewritten
with awk:

awk '{if(NR==1)sub(/^\xef\xbb\xbf/, "");print}' < bomfile > newfile

Basically that's exactly the same instruction as the sed
one above, but awk is a little more verbose:

"1" in sed means that the following command should only
affect the first line.  That's what "if(NR==1)" does in
awk.

"s/OLD/NEW/" is the replacement command in sed.  In awk
it looks like "sub(/OLD/, "NEW")".

Finally, sed prints all resulting lines by default, while
awk has to be told with an explicit "print" command.
(awk prints a line automatically only when a pattern has
no action attached to it at all.)
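
The automatic printing can be used to shorten the awk line a bit.
In the sketch below, the trailing "1" is a pattern that is always
true and has no action, so awk prints every line.  I've used octal
escapes here instead of hexadecimal ones, on the assumption that
more awk implementations accept them, and LC_ALL=C so that the
bytes are not interpreted as a multibyte character.  The test file
is made up for the example.

```sh
# Create a small test file that starts with a UTF-8 BOM.
printf '\357\273\277hello\nworld\n' > bomfile

# On line 1, remove a leading BOM; the bare "1" pattern then
# prints every line, modified or not.
LC_ALL=C awk 'NR==1 { sub(/^\357\273\277/, "") } 1' < bomfile > newfile

# Look at the first bytes of the result.
od -c newfile | head -n 1
```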

Best regards
   Oliver


