sed html tags

An anmichel at gmail.com
Tue Aug 26 11:34:58 UTC 2008


Well, thanks, Yuri!

That worked much better than anything I had tried! But my problem is
that I don't know what characters to expect (accents, ñ, etc.), so I
really need a "get everything between the <span xxxx> and the first
</span>"...
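
One way to approximate "everything up to the first </span>" in plain sed is a negated bracket expression. The sketch below is only an illustration: the function name strip_span is made up here, and it assumes the span contains no nested tags (no '<' before its closing tag).

```shell
# Hypothetical helper (name invented for this example): remove each
# <span xxxx>...</span> block together with any spaces that follow it.
# [^<]* matches any run of characters other than '<', so the match stops
# at the first closing tag regardless of which letters, digits or
# accented characters appear in between.
# Assumption: the span body contains no nested tags.
strip_span() {
    sed 's#<span xxxx>[^<]*</span> *##g'
}

printf '%s\n' '<span xxxx> añó 111 </span> 2222 <span yyyy> 3333 </span>' | strip_span
# prints: 2222 <span yyyy> 3333 </span>
```

Because the character class excludes only '<', there is no need to list the allowed characters in advance.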

Regarding Perl, it is perfect! Thanks!

The ? is critical! Is that what makes the .* non-greedy?
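
In Perl, a ? placed directly after a quantifier such as * does make it non-greedy. A small sketch of the difference, using the sample line from this thread; the variable names are only for illustration, and the Perl half runs only if perl is installed:

```shell
line='<span xxxx> 111 </span> 2222 <span yyyy> 3333 </span>'

# Greedy: in sed, .* extends to the LAST </span> on the line,
# so the whole sample line is matched and removed.
greedy=$(printf '%s\n' "$line" | sed 's#<span xxxx>.*</span>##')
printf 'greedy: [%s]\n' "$greedy"

# Non-greedy: in Perl, .*? stops at the FIRST </span> it can reach.
if command -v perl >/dev/null 2>&1; then
    lazy=$(printf '%s\n' "$line" | perl -pe 's#<span xxxx>.*?</span>\s*##g')
    printf 'lazy:   [%s]\n' "$lazy"
fi
```

With the greedy pattern nothing is left of the line, while the non-greedy one keeps everything after the first closing tag.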


Thanks,

An M


On Tue, Aug 26, 2008 at 1:53 AM, Yuri Pankov <yuri.pankov at gmail.com> wrote:

> An wrote:
> > unfortunately not... see:
> >
> > # cat file
> > <span xxxx> 111 </span> 2222 <span yyyy> 3333 </span>
> >
> > # sed -e 's/<\/?span[^>]*>//g' file
> > <span xxxx> 111 </span> 2222 <span yyyy> 3333 </span>
> >
> > (...nothing happens, the file is returned with no substitutions done)
> >
> >
> > I could do it with a perl script, which basically does what I would expect
> > sed would do:
> >
> > # cat pscript.pl
> > #!/usr/bin/perl -w
> > $text = "<span xxxx> 111 </span>   2222 <span yyyy> 3333 </span> <span xxxx> 111 </span>    2222    <span yyyy> 3333 </span>";
> > $text =~ s/<span x[^>]*>[^\(<\/span>\)]*[\s]*<\/span>[\s]*//g;
> > print $text . "\n"
>
> $text =~ s#<span xxxx>.*?</span>\s*##g;
>
> > # perl pscript.pl
> > 2222 <span yyyy> 3333 </span> 2222    <span yyyy> 3333 </span>
> >
> > " <span xxx> ..... </span> " is removed... but I don't seem to be able to do
> > it with sed... :(
>
> Regexps in sed are greedy and, sadly, you can't use *? as a quantifier.
> Try the following (adding whatever characters can appear inside your
> 'xxxx' span, of course):
> sed 's#<span xxxx>[ a-zA-Z0-9]*</span>[ ]*##g'
>
> > I'm on fedora c9, maybe that's the problem?
> >
> > siran
> >
> >
> > On Mon, Aug 25, 2008 at 8:35 PM, Paul A. Procacci <pprocacci at datapipe.com> wrote:
> >
> >> siran wrote:
> >>
> >>> Hi, I have the string
> >>>
> >>> <span xxxx> 111 </span> 2222 <span yyyy> 3333 </span>
> >>>
> >>> And I wish to use sed to strip *only* the "<span xxxx>" tag and its
> >>> contents... is this possible ? I'm trying this expression, but it
> >>> doesn't work...
> >>>
> >>> sed 's/<span xxxx[^\(</span>\)]+<\/span>//g' file
> >>>
> >>> is there anything like it ?
> >>>
> >>> I would like to obtain
> >>>
> >>> 2222
> >>>
> >>>
> >>>
> >>> I hope someone can help,
> >>>
> >>> thank you,
> >>>
> >>> siran
> >>> _______________________________________________
> >>> freebsd-questions at freebsd.org mailing list
> >>> http://lists.freebsd.org/mailman/listinfo/freebsd-questions
> >>> To unsubscribe, send any mail to "
> >>> freebsd-questions-unsubscribe at freebsd.org"
> >>>
> >>>
> >> sed -E 's/<\/?span[^>]*>//g'
> >>
> >> Maybe that's what you want?
> >>
>
>
> HTH,
> Yuri
>
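
On the report quoted above that the suggested substitution changed nothing: the command was run with -e rather than -E. In sed's default basic regular expressions a bare ? is a literal question mark, so the pattern never matches. A short sketch of the two behaviours (variable names are only for illustration):

```shell
line='<span xxxx> 111 </span> 2222 <span yyyy> 3333 </span>'

# Basic regex (no -E): '?' is taken literally, so the pattern looks for
# the text "</?span" and the line comes back unchanged.
bre=$(printf '%s\n' "$line" | sed -e 's/<\/?span[^>]*>//g')
printf 'BRE: [%s]\n' "$bre"

# Extended regex (-E): '?' makes the preceding '/' optional, so both the
# opening and the closing span tags are removed. The contents stay.
ere=$(printf '%s\n' "$line" | sed -E 's/<\/?span[^>]*>//g')
printf 'ERE: [%s]\n' "$ere"
```

Even with -E this only strips the tags themselves, not the text between them, so it does not by itself give the "2222" result asked for in the original question.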

