Perl Help For Newbie

Tue Apr 27 15:09:19 PDT 2004

>> Can someone explain to me why people are suggesting parsing
>> markup languages manually? There are modules -- dozens -- for
>> this. Use CPAN.
>
> Because he is a Perl beginner and doesn't know about CPAN and modules
> and stuff.
>
> How about being a bit more specific? Try:
>
> cd /usr/ports/www/p5-HTML-Parser && make install clean
>
> perldoc HTML::Parser (see the EXAMPLES section) or, as a
> starter:
>
>   use HTML::TokeParser::Simple;   # subclass of HTML::TokeParser, which it loads
>   my $p = HTML::TokeParser->new(shift||"index.html");
>
>   while (my $token = $p->get_tag("a")) {
>       my $url = $token->[1]{href} || "-";
>       my $text = $p->get_trimmed_text("/a");
>       print "$url\t$text\n";
>   }
>
> (HTML::TokeParser::Simple is not in the ports tree yet, but it
> will be once the current port freeze is over. Until then,
>
> perl -MCPAN -e shell
> cpan> install HTML::TokeParser::Simple
> Running install for module HTML::TokeParser::
>
> will perform the necessary magic.)

Perhaps I missed something, but the one thing I strongly discouraged was
trying to parse markup languages like HTML and XML by hand. I haven't
seen anyone else make any suggestions at all in this thread. Besides, if
you read the original post:

http://lists.freebsd.org/pipermail/freebsd-questions/2004-April/044899.html

what he really wants is to search for some values that just happen to be
in an HTML document, manipulate them, and then replace them with new
values. As best I can tell, the values he's searching for have little to
do with the surrounding markup.
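
As a rough sketch of that kind of job, something along these lines treats
the document as plain text and never looks at the tags. The "MB" pattern,
the doubling, and the name double_sizes are all made up for illustration;
the real values are whatever the original post is after.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical example: double every "<number> MB" figure in a chunk of
# text. The HTML is treated purely as a string; the pattern and the
# doubling are assumptions, not taken from the original post.
sub double_sizes {
    my ($text) = @_;
    # /e evaluates the replacement as Perl code, /g repeats for every match.
    $text =~ s/(\d+(?:\.\d+)?) MB/sprintf("%.2f MB", $1 * 2)/ge;
    return $text;
}

my $html = '<td>Disk: 10.00 MB</td><td>Cache: 2.50 MB</td>';
print double_sizes($html), "\n";
# prints: <td>Disk: 20.00 MB</td><td>Cache: 5.00 MB</td>
```

The same substitution could probably be run over a file in place with
perl -pi.bak -e '...' page.html, which leaves a page.html.bak backup
(page.html being a placeholder name).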

None of this is to say I have offered the best possible solution. I don't
claim to be wise, expert, or 1337 when it comes to Perl.

aaron