sed html tags

Thu Aug 28 21:29:18 UTC 2008

On Thu, Aug 28, 2008 at 03:04:22PM -0400, An wrote:
> yes, it does work perfectly with the example I gave... the actual file
> is something like
> 
> ... <span xxxx> 111 <span www> 1111no </span> </span> 2222 <span yyy>
> 3333 </span>  5555 <span yyy> 6666 </span> ...
> 
> 
> your command only returns
> 
> ]# sed 's/\(<span .*>.*<\/span>\)\(.*\)\(<span .*>.*<\/span>\)/\2/' file
> 
>  5555
> 
> 
> I wish to rip all <span xxx> .* </span> and obtain

If you wish to rip out all "<span xxx> .* </span>" then the output would
be:

  2222   5555

If that is what you want, then try the following:

sed 's/<span [a-z]*>[ 0-9a-z<>]*<\/span>//g; s/<\/span>//g' file
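As a quick check, here is a minimal sketch that runs the same two
expressions against the sample quoted above. It assumes the whole sample
sits on one line (sed works line by line, so a span that wraps onto the
next line would not be matched), and the variable names "line" and
"result" are only for illustration.

```shell
# Sample modelled on the quoted input, assumed to be a single line.
line='<span xxxx> 111 <span www> 1111no </span> </span> 2222 <span yyy> 3333 </span>  5555 <span yyy> 6666 </span>'

# 1st expression: delete every "<span attr> ... </span>" run. The bracket
#   class has no "/", so each match stops at the nearest closing tag.
# 2nd expression: delete closing tags left behind by nested spans.
result=$(printf '%s\n' "$line" |
  sed 's/<span [a-z]*>[ 0-9a-z<>]*<\/span>//g; s/<\/span>//g')

# Only "2222" and "5555" remain, with the surrounding spaces preserved.
printf '%s\n' "$result"
```

Because the bracket expression only allows spaces, digits, lowercase
letters, "<" and ">", text between the tags that contains anything else
(uppercase, punctuation, quotes in attributes) would need the class
widened.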

But if Perl is already doing the job for you, I think this can be put to
rest.

regards,
joseph

> ... 2222 <span yyy> 3333 </span> 5555 <span yyy> 6666 </span>...
> 
> 
> i think sed should be able to do it, but the operator [ ^ (  ) ]* is
> not behaving as i think it would... perl does it alright, though : s

<snip>