sed html tags

Thu Aug 28 19:04:23 UTC 2008

Yes, it does work perfectly with the example I gave... but the actual
file is something like

... <span xxxx> 111 <span www> 1111no </span> </span> 2222 <span yyy>
3333 </span>  5555 <span yyy> 6666 </span> ...

your command only returns

]# sed 's/\(<span .*>.*<\/span>\)\(.*\)\(<span .*>.*<\/span>\)/\2/' file

 5555
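
A minimal sketch of why only 5555 survives, assuming the sample sits on a single line and leaving out the elided "..." parts. The variable names line and greedy_out are only illustrative. Each .* is greedy, so the first group appears to swallow everything up to the </span> after 3333, and \2 keeps only what lies between that point and the last span:

```shell
# Sample line from the message, with the elided "..." parts left out.
line='<span xxxx> 111 <span www> 1111no </span> </span> 2222 <span yyy> 3333 </span>  5555 <span yyy> 6666 </span>'

# Group 1 is greedy: it runs from the first "<span " to the "</span>" after
# 3333, the last point that still leaves one complete span for group 3.
# Only the text between those two matches (group 2) is kept.
greedy_out=$(printf '%s\n' "$line" |
  sed 's/\(<span .*>.*<\/span>\)\(.*\)\(<span .*>.*<\/span>\)/\2/')

# Brackets make the surrounding whitespace visible.
printf '[%s]\n' "$greedy_out"
```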

I wish to strip every <span xxxx> .* </span>, including any span nested inside it, and obtain

... 2222 <span yyy> 3333 </span> 5555 <span yyy> 6666 </span>...

I think sed should be able to do it, but the operator [ ^ (  ) ]* is
not behaving as I thought it would... perl does it all right, though :s
thanks,

siran

On Thu, Aug 28, 2008 at 12:49 PM, Joseph Olatt <joji at eskimo.com> wrote:
> <snip>
>
>> > >>> Hi, I have the string
>> > >>>
>> > >>> <span xxxx> 111 </span> 2222 <span yyyy> 3333 </span>
>> > >>>
>> > >>> And i wish to use sed to strip *only* the "<span xxxx>" tag and its
>> > >>> contents... is this possible ? I'm trying this expression, but it
>> > >>> doesn't work...
>> > >>>
>> > >>> sed 's/<span xxxx[^\(</span>\)]+<\/span>//g' file
>> > >>>
>> > >>> is there anything like it ?
>> > >>>
>> > >>> I would like to obtain
>> > >>>
>> > >>> 2222
>> > >>>
>> > >>>
>> > >>>
>> > >>> I hope someone can help,
>> > >>>
>> > >>> thank you,
>> > >>>
>> > >>> siran
>
> If you haven't yet solved the above problem, give the following a try:
>
> sed 's/\(<span .*>.*<\/span>\)\(.*\)\(<span .*>.*<\/span>\)/\2/'
>
>
> regards,
> joseph
>
> <snip>
>