Sed, shell and hexadecimal character codes
Karel Miklav
karel at inetis.com
Tue May 27 06:27:29 UTC 2008
Oliver Fromme wrote:
> Karel Miklav wrote:
> > There's a tip in the FreeBSD fortunes database that says:
> >
> > > Want to strip UTF-8 BOM (Byte Order Mark) from given files?
> > >
> > > sed -e '1s/^\xef\xbb\xbf//' < bomfile > newfile
>
> FreeBSD's sed(1) doesn't support hexadecimal or octal
> sequences. I think even GNU sed doesn't support them, but
> you might try it yourself (/usr/ports/textproc/gsed).
>
> I don't know why that fortunes entry exists. It's wrong.
That's what I thought. Maybe we should replace the recipe with
the awk version Oliver proposed below?
> > I can't make it work, and I can't find any other method to
> > work with hexadecimal codes in scripts or on the command line,
> > so I'm kind of depressed :) I get by with xxd for now, but if
> > it is possible to avoid it, I'd like to hear about it.
>
> There is no standard for handling octal and hexadecimal
> sequences, unfortunately, so you have to consult the
> manual page to find out. For example, tr(1) supports
> octal sequences only (no hexadecimal), while awk(1)
> supports both. So the above line could be rewritten
> with awk:
>
> awk '{if(NR==1)sub(/^\xef\xbb\xbf/, "");print}' < bomfile > newfile
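
Since octal is the more widely supported notation, another option is to let the shell build the three BOM bytes (EF BB BF, octal 357 273 277) and hand them to sed as literal characters. The sketch below assumes a printf(1) that accepts octal escapes (POSIX requires this) and a sed that matches raw bytes in the C locale; strip_bom is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# Build the UTF-8 BOM (bytes EF BB BF) from octal escapes. POSIX printf(1)
# understands \ooo; \x escapes are not portable.
bom=$(printf '\357\273\277')

# strip_bom (hypothetical name): read stdin, write stdout, removing a
# leading BOM from the first line only. LC_ALL=C makes sed treat the
# pattern and the input as plain bytes.
strip_bom() {
    LC_ALL=C sed -e "1s/^${bom}//"
}

# Usage: strip_bom < bomfile > newfile
```

The pattern is interpolated by the shell before sed sees it, so sed itself never has to parse an escape sequence. Input without a BOM should pass through unchanged.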