Perl Help For Newbie -- SOLVED

Wed Apr 28 08:57:43 PDT 2004

> On 4/26/2004 9:50 AM Drew Tomlinson wrote:

> I'm trying to write a perl script to modify a web page.  The source 
> page is full of lines
> such as:
>
> <a href="../../catalog/books/html/amagicianamongthespirits.html">A 
> Magician Among the Spirits - Houdini </a>$75.00 $67.50 $125.00<br>
> <a href="../../catalog/books/html/amaterial.html">&quot;A&quot; 
> Material - Jim Pace</a> $18.00 $16.20 $29.95<br>
> <a href="../../catalog/books/html/absolutemagic.html">Absolute Magic - 
> Derren Brown</a> $24.00 $22.80 $39.95<br>
>
> I want to take the first amount and multiply it by 1.5 and replace it, 
> remove the second amount, and keep the third
> amount the same.  So for example, the first line would be converted to:
>
>  <a href="../../catalog/books/html/amagicianamongthespirits.html">A 
> Magician Among the Spirits - Houdini </a>$112.50 $125.00<br>
>
> I am brand new to Perl but have been reading and experimenting for the 
> past two weeks.
> I've managed to open my file and read the contents into an array 
> called "@page":
>
> open(DATA, "< $input")       or die "Couldn't read from datafile: $!\n";
> my @page = (<DATA>);
>
> Now I am trying to use the s/// operator to perform the math and 
> substitution.  I get
> close to what I want but I'm not quite there.  This code
>
> foreach (@page) {
>    $_=~ s/^\s+//gm;             #removes leading whitespace
>    $_=~ s/\d+\.\d\d/$&*1.5/e;   #finds 1st $ amount and adds 50%
> }
>
> produces this output:
>
>  <a href="../../catalog/books/html/amagicianamongthespirits.html">A 
> Magician Among the Spirits - Houdini </a>$112.5 $67.50 $125.00<br>
>
> How can I format the converted amount back to US dollars ($112.50)?  
> I've seen
> subroutines to format US currency but can those be used with my 
> current approach? Would "printf" be a possible choice? Should I use 
> the "split" function to separate the
> data in fields such as link, description, price1, price2, price3 and 
> then rebuild each
> line with concatenation?  Is there some other way?
>
> Any guidance as to the best way to approach this task would be most 
> appreciated.  I've
> done lots of reading but haven't found anything that teaches me how to 
> "think" about
> building this script.

Thank you for your responses.  The further I get into this the more I 
find I need to learn.  For the archives, the code that solves my initial 
question is this:

# loop over the page; $line is an alias for each array element in turn
foreach my $line (@inputpage)
{

  # discard leading and trailing and collapse internal whitespace.
  $line = join(" ", split " ", $line);

  # Add newline to end of each line as previous statement removes it.
  $line .= "\n";

  # assign dollar values to separate scalars
  if ( my ($val1,$val2,$val3) = $line =~
        /\$\s*(\d+\.\d\d)\s+\$\s*(\d+\.\d\d)\s+\$\s*(\d+\.\d\d)/)
    {

        # Perform math on $val1 and format to 2 decimals
        my $price = sprintf "%.2f",$val1 * 1.5;

        # Search $line for dollar values and replace with new
        $line =~
            s/\$\s*(\d+\.\d\d)\s+\$\s*(\d+\.\d\d)\s+\$\s*(\d+\.\d\d)/\$$price \$$val3/;
        # Store $line in array
        push(@outputpage, $line);
    }
}

close DATA;

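open(my $out, '>', $outputfile) or die "Couldn't open $outputfile: $!\n";
# Print the list unquoted so no separator is inserted between lines.
print $out @outputpage;
close $out                      or die "Couldn't close $outputfile: $!\n";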

exit;
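
For the archives as well: the match-then-substitute steps above can
probably be folded into a single s/// by using the /e modifier together
with sprintf, which is what I was originally attempting. This is only a
minimal sketch, assuming each price line carries exactly three dollar
amounts; the sub name reprice_line and the sample line are hypothetical,
made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: rewrites one catalog line, assuming it carries
# exactly three dollar amounts in the form "$75.00 $67.50 $125.00".
sub reprice_line {
    my ($line) = @_;

    # /e evaluates the replacement as Perl code, so sprintf can format
    # the new first price. The second amount is matched but not kept.
    $line =~ s{\$\s*(\d+\.\d\d)\s+\$\s*\d+\.\d\d\s+\$\s*(\d+\.\d\d)}
              {sprintf('$%.2f $%s', $1 * 1.5, $2)}e;

    return $line;
}

# Made-up sample line in the same shape as the catalog page.
print reprice_line(qq{<a href="x.html">Title</a>\$75.00 \$67.50 \$125.00<br>\n});
```

This prints <a href="x.html">Title</a>$112.50 $125.00<br>.  A line
with no prices on it does not match and is returned unchanged.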

Now that I'm getting into it, though, I can see that I will want to 
manipulate the HTML itself as well.  So I will start reading about Perl 
modules, HTML::Parser in particular, and see where that takes me.  Any 
links to beginner material about modules, especially HTML::Parser, would 
be most appreciated.

Thanks for your help!!!

Drew