From:Rob Coops Date:Wed May 14 10:24:16 2008
Subject:Re: perl and web
This has been asked at least as often as it has been answered. The most
flexible solution I have found so far, though not the simplest, is:

WWW::Mechanize<http://search.cpan.org/~petdance/WWW-Mechanize-1.34/lib/WWW/Mechanize.pm>

There are a lot of others out there, and many of them have been built on top
of this little gem. If you are looking to scrape a major website like
Hotmail, Yahoo, IMDb, Gmail, etc., you will most likely find modules dedicated
to that website that do all the dirty work for you.

Be careful though: the more specific the implementation, the more likely it is
that a small layout change on the website will break your scraping module.
But I assume that you figured that one out already.
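
For what it is worth, here is a rough sketch of how the loop from the
question could look with WWW::Mechanize. It has not been tested against any
particular site. The URL is taken from the command line, and the sub names
extract_from and fetch_page are made up for illustration; the matching
logic simply mirrors the quoted code below.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the capture from the last line starting with "From: ",
# skipping any line that contains "bleh" (same rules as the question).
# extract_from is a hypothetical helper name, not part of any module.
sub extract_from {
    my ($text) = @_;
    my $code;
    for my $line (split /\r?\n/, $text) {
        next if $line =~ /bleh/;
        if ($line =~ /^From: (.*)/) {
            $code = $1;
        }
    }
    return $code;    # undef when nothing matched
}

# Fetch a page and return its source as one string.
# fetch_page is also a hypothetical name. The module is loaded here
# so the parsing code above can be used without it.
sub fetch_page {
    my ($url) = @_;
    require WWW::Mechanize;
    my $mech = WWW::Mechanize->new(autocheck => 1);   # die on HTTP errors
    $mech->get($url);
    return $mech->content;
}

# Only go to the network when a URL is given on the command line.
if (@ARGV) {
    my $from = extract_from(fetch_page($ARGV[0]));
    print defined $from ? "$from\n" : "no match\n";
}
```

Keeping the fetch and the matching in separate subs means that a layout
change on the site should only require touching the matching part.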

Regards,

Rob Coops

On Wed, May 14, 2008 at 6:05 PM, Richard Lee <rich.japh@gmail.com> wrote:

> Hi guys again!
>
> I am sure this question has been around for a while, but I am not sure where to
> begin.
>
> I am trying to grep an HTML page given a URL and then extract some
> information from the source code.
> So something like
>
> open FH, "www.example.com/index.html | " or die "no way : $!\n";
> @array = <FH>;
>
> my $code;
> for (@array) {
>       next if /bleh/;
>       if ( /^From: (.*)/ ) {
>           $code = $1;
>       }
> }
>
> You get the idea.. so anyway I did the search
> on google
>
> 'how to grep a web page source using perl'   -- no luck
> web perl modules   ---> reading
> http://www.perl.com/pub/a/2002/08/20/perlandlwp.html
>
> I guess the reason I wrote this out is to see if anyone else beginning with
> Perl and web pages can use my search, or perhaps someone can tell me I am doing
> something stupid,
> as Perl and the web seems to be a pretty common combination but this is the only way I
> know how to do it at this point.
>
> anyway, just sharing on this... and also looking for feedback
