Tek-Tips is the largest IT community on the Internet today!

Download authorized pages via LWP

Status
Not open for further replies.

yerman01

Programmer
Mar 12, 2005
3
IT
Hi everyone!
I've developed a small Perl program that uses the LWP module to download Italian stock pages,
so that I can compute real-time statistics.
The program works for delayed data, but it doesn't work for real-time data because those pages
require authentication.
The authentication is done at session level, so I need to log in
from within the same program.
I've tried $ua->credentials and $req->authorization_basic, but without success.
I admit I'm a novice at Perl programming and at using LWP, so the fault is surely mine.
Can anyone help me?

My simple code is:

$protected_page="
use LWP::UserAgent;
use HTTP::Request;

my $ua = LWP::UserAgent->new;

#
# Authentication block ???
#

my $req = HTTP::Request->new(GET => $protected_page);

my $res = $ua->request($req);
if ($res->is_success)
{
    # Save the downloaded page to a local file
    my $out = "c:/mypage.txt";
    open(my $fh, '>', $out) or die "Cannot open $out for write: $!";
    print $fh $res->content;
    close $fh;
}
else
{
    # Report the actual HTTP status rather than assuming a timeout
    print "Download failed: ", $res->status_line, "\n";
}

If you would like to test whether it works, I can leave the login credentials.

Site login page: username: yer
password: man

I thank you in advance for whatever help you can give me.
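
The two calls mentioned in this post apply to HTTP authentication, where the server answers with a 401 challenge; they have no effect on a login form that sets a session cookie. A minimal sketch of what they look like when they do apply, assuming a hypothetical host, realm, and URL (only the user name and password come from the post). Nothing is sent over the network here; the request is only built and inspected.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

my $ua = LWP::UserAgent->new;

# Option 1: register credentials for a host:port and realm.
# 'www.example.com:80' and 'Members Area' are hypothetical placeholders.
$ua->credentials('www.example.com:80', 'Members Area', 'yer', 'man');

# Option 2: put a Basic Authorization header on a single request.
# The URL is a hypothetical placeholder.
my $req = HTTP::Request->new(GET => 'http://www.example.com/protected.html');
$req->authorization_basic('yer', 'man');

# Show the header that would accompany the request.
print $req->header('Authorization'), "\n";
```

If the site never sends a 401 challenge and instead shows its own login page, neither option will help, and the login form has to be submitted as an ordinary POST.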
 
I recently wrote a script to access a site I have a membership to. It was a secure site with a login, but it didn't use the simple HTTP authentication that credentials would work for. The login page sent a session cookie, which has to be appended to the login form action when you log in. So, every time I access the site, I get the login page first, parse the cookie out of the headers and add it manually to the login form action, then I make a post to get the info I really want. I activated a cookie jar file and connection caching, and all of that together let me access secure pages once I had logged in.

Here is my code:
[tt]
use LWP 5.803;
use LWP::ConnCache;
use HTTP::Cookies;

# Create a new UserAgent
my $browser = LWP::UserAgent->new;
$browser->agent('Name your user agent whatever you want');
$browser->conn_cache(LWP::ConnCache->new()); # Set up the cache
$browser->conn_cache->total_capacity(undef); # cache unlimited
# create a file for cookiejar
$browser->cookie_jar(HTTP::Cookies->new(
'file' => 'e:/web/whatever_folder/cookies.lpw', 'autosave' => 1
));

# Get the login page
my $resp = $browser->get($login_url);

# Pull the session cookie out of the headers (this gets any and all cookies)
my $headers = $resp->headers_as_string;
my %cookies;
while($headers =~ m/Set-Cookie: (.*?)=(.*?);/g) {
my $cname = $1;
$cookies{$cname} = $2;
}
# Add the session id to the login action URL (login action set previously)
$login_action .= ';jsessionid=' . $cookies{'jsessionid'};

# Log in
sleep 3;
$resp = $browser->post($login_action,[%login_data]); # %login_data set up previously

# Now, access whatever protected content with $browser
sleep 3;
$resp = $browser->post($get_report_url,[%form_data]); # %form_data set up previously
my $report = $resp->content;
[/tt]
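
The cookie-parsing step above can be tried in isolation, without contacting any site. The sketch below is a variation rather than the code from the post: it reads each Set-Cookie header separately instead of running a regex over the whole header string, so a cookie without a trailing semicolon is also picked up. The response, cookie values, and login URL are hypothetical and built by hand.

```perl
use strict;
use warnings;
use HTTP::Response;

# Collect name => value pairs from every Set-Cookie header of a response.
sub cookies_from_response {
    my ($resp) = @_;
    my %cookies;
    for my $set_cookie ($resp->header('Set-Cookie')) {
        # The first "name=value" pair is the cookie; attributes follow after ";".
        my ($pair) = split /;/, $set_cookie, 2;
        next unless defined $pair;
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && defined $value;
        $name =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
        $cookies{$name} = $value;
    }
    return %cookies;
}

# Append the session id to a login action URL, as in the post above.
sub with_session_id {
    my ($action, $session_id) = @_;
    return $action . ';jsessionid=' . $session_id;
}

# Offline demonstration with a hand-built response (hypothetical values).
my $resp = HTTP::Response->new(200, 'OK');
$resp->push_header('Set-Cookie' => 'jsessionid=ABC123; Path=/');
$resp->push_header('Set-Cookie' => 'lang=it');

my %cookies = cookies_from_response($resp);
my $login_action = with_session_id('http://www.example.com/login.do',
                                   $cookies{jsessionid});
print "$login_action\n";
```

Cookie names are case-sensitive in the hash lookup, so the key has to match whatever the site actually sends.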
 
Dear threetone,
thank you for your contribution.
I've mostly understood your approach (unfortunately programming is not my field; I'm mainly interested in statistics and economics), and I've customized my URL.
What I haven't understood is how to specify the variables [%login_data] and [%form_data].
I know I should debug and dig deeper into LWP, but that isn't so simple when I know Perl only at a novice level.

If it's easy for you, I would be very grateful if you could write a few lines for my case.
As I wrote:
- the protected url is
- the login page is
- user = yer
- pass = man

In any case I thank you very much for your support.
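
For a form login like the one described in this thread, %login_data and %form_data would be ordinary Perl hashes that map form field names to the values to submit. A minimal sketch, assuming hypothetical field names (username, password) and a hypothetical action URL; the real field names are the name attributes of the input elements in the site's login form. The request is only built and printed, not sent.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

# Hypothetical field names: replace them with the name="..." attributes
# found in the HTML source of the site's login form.
my %login_data = (
    username => 'yer',
    password => 'man',
);

# Hypothetical login action URL.
my $login_action = 'http://www.example.com/login.do';

# Build (without sending) the kind of form-encoded POST request that
# $browser->post($login_action, [%login_data]) would submit.
my $req = POST $login_action, [%login_data];

print $req->method, ' ', $req->uri, "\n";
print $req->content, "\n";
```

%form_data follows the same pattern, with the field names taken from the form that requests the protected page. The order of the fields in the encoded body can vary between runs, because a Perl hash has no fixed order.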
 