sed html tags

Yuri Pankov yuri.pankov at gmail.com
Tue Aug 26 06:20:05 UTC 2008


An wrote:
> unfortunately not... see:
> 
> # cat file
> <span xxxx> 111 </span> 2222 <span yyyy> 3333 </span>
> 
> # sed -e 's/<\/?span[^>]*>//g' file
> <span xxxx> 111 </span> 2222 <span yyyy> 3333 </span>
> 
> (...nothing happens, the file is returned with no substitutions done)
> 
> 
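
Note that the command quoted above was run with -e, not the -E from the
earlier suggestion. Without -E, sed uses basic regular expressions, where
"?" is an ordinary character, so the pattern looks for a literal "?" and
matches nothing. A minimal sketch of the difference, assuming a
POSIX-style sed; the variable names are only for illustration. On a GNU
sed that rejects -E, the extended-regex switch is probably -r, but that
is an assumption about the installed version, so the sketch uses the
basic-regex interval \{0,1\} instead:

```shell
line='<span xxxx> 111 </span> 2222 <span yyyy> 3333 </span>'

# Basic regex: "?" is a literal character here, so nothing is substituted.
bre_plain=$(printf '%s\n' "$line" | sed -e 's/<\/?span[^>]*>//g')

# Basic regex spelling of "zero or one slash": \{0,1\}
bre_interval=$(printf '%s\n' "$line" | sed -e 's/<\/\{0,1\}span[^>]*>//g')

printf '%s\n' "$bre_plain" "$bre_interval"
```

The second form strips every span tag but keeps the text between them,
which is what the -E command was meant to do.
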
> I could do it with a perl script, which basically does what i would expect
> sed would do:
> 
> # cat pscript.pl
> #!/usr/bin/perl -w
> $text = "<span xxxx> 111 </span>   2222 <span yyyy> 3333 </span> <span xxxx>
> 111 </span>    2222    <span yyyy> 3333 </span>";
> $text =~ s/<span x[^>]*>[^\(<\/span>\)]*[\s]*<\/span>[\s]*//g;
> print $text . "\n"

$text =~ s#<span xxxx>.*?</span>\s*##g;

> # perl pscript.pl
> 2222 <span yyyy> 3333 </span> 2222    <span yyyy> 3333 </span>
> 
> " <span xxx> ..... </span> " is removed... but i don't seem to be able to do
> it with sed... : (

Regexps in sed are greedy and, sadly, you can't use *? as a non-greedy
quantifier. Try the following (adding any other characters that can
appear between your 'xxxx' tags, of course):
sed 's#<span xxxx>[ a-zA-Z0-9]*</span>[ ]*##g'
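
The character class in the command above has to list every character that
may appear between the tags. A common way around the missing non-greedy
quantifier, sketched here as an alternative, is the negated class [^<]*,
which stops at the next "<". It assumes there are no nested tags inside
the span, and the variable names are only for illustration:

```shell
line='<span xxxx> 111 </span> 2222 <span yyyy> 3333 </span>'

# [^<]* cannot run past the next "<", so the match ends at the first </span>.
stripped=$(printf '%s\n' "$line" | sed 's#<span xxxx>[^<]*</span>[ ]*##g')

printf '%s\n' "$stripped"
```

With the sample line this leaves the "yyyy" span and the text before it
untouched.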

> I'm on Fedora c9, maybe that's the problem?
> 
> siran
> 
> 
> On Mon, Aug 25, 2008 at 8:35 PM, Paul A. Procacci <pprocacci at datapipe.com> wrote:
> 
>> siran wrote:
>>
>>> Hi, I have the string
>>>
>>> <span xxxx> 111 </span> 2222 <span yyyy> 3333 </span>
>>>
>>> And i wish to use sed to strip *only* the "<span xxxx>" tag and its
>>> contents... is this possible ? I'm trying this expression, but it
>>> doesn't work...
>>>
>>> sed 's/<span xxxx[^\(</span>\)]+<\/span>//g' file
>>>
>>> is there anything like it ?
>>>
>>> I would like to obtain
>>>
>>> 2222
>>>
>>>
>>>
>>> I hope someone can help,
>>>
>>> thank you,
>>>
>>> siran
>>>
>>>
>> sed -E 's/<\/?span[^>]*>//g'
>>
>> Maybe that's what you want?
>>


HTH,
Yuri

