for perl wizards.

Warren Block wblock at wonkity.com
Fri Oct 9 18:06:30 UTC 2009


On Fri, 9 Oct 2009, Oliver Fromme wrote:
> Warren Block wrote:
> > Oliver Fromme wrote:
> > > Gary Kline wrote:
> > > >
> > > > Whenever I save a word processor file [OOo, say] into a
> > > > text file, I get a slew of hex codes to indicate the char to be
> > > > used.  I'm looking for a perl one-liner or script to translate
> > > > hex back into ', ", -- [that's a dash], and so forth.  Why does
> > > > this fail to translate the hex code to an apostrophe?
> > > >
> > > > perl -pi.bak -e 's/\xe2\x80\x99/'/g'
> > >
> > > You need to escape the inner quote character, of course.
> > > I think sed is better suited for this task than perl.
> >
> > That's twice now people have suggested sed instead of perl.  Why?  For
> > many uses, perl is a better sed than sed.  The regex engine is far more
> > powerful and escapes are much simpler.
>
> Neither powerful regexes nor escapes will help in this case.

Certainly \x will not help in sed; sed doesn't have it.

> A simple basic regex is more than sufficient (in fact this
> isn't even a regex, it's a fixed string).  And the escaping
> is a problem of the shell, not perl or sed.  And by the way,
> I strongly disagree that perl's escapes are much simpler.
> In my opinion perl has the most complex escaping and quoting
> I have seen in any language so far.

I was thinking of the escapes that sed needs and should not need.
Some of those are shell problems, but many are due to the regex library.
More basic things than \x are missing: \t, for instance, or a useful \s,
instead of spelling out spaces and tabs or trying to navigate using | in
sed expressions.

> The basic UNIX philosophy is to use the smallest or simplest
> tool that does the job. In this case that's clearly sed.

Since sed doesn't have \x, it would appear that sed does not do the job. 
Maybe I just don't see it.  And in most cases, the external simplicity 
of a tool is more important to the user than its internals.  Put another 
way, if you have it, and it does a better/easier/faster job, why *not* 
use it?

> (Not to mention the fact that perl isn't even in FreeBSD's
> base system, so might not be available at all.)

But the OP is using it, so that's clearly not the case here, nor in most
FreeBSD installations.

It's possible "Mastering Regular Expressions" has influenced my thinking 
on this.

-Warren Block * Rapid City, South Dakota USA


More information about the freebsd-questions mailing list