azzazzello
Technical User
Greetings everyone,
I have to connect to an HTTPS site, log in, browse to a particular page, fill in some form fields, and extract certain information from the result. I want to automate this process, but I am running into a problem. Here is what I have so far:
Code:
#!/usr/bin/perl
use strict;
use URI::URL;
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Request::Common;
use HTTP::Request::Form;
use HTML::TreeBuilder 3.0;
use HTTP::Cookies;
my $cookie_file = 'cookies.txt';
my $ua = LWP::UserAgent->new;
my $url = url 'https://some.page.com/UILogon.aspx?ReturnUrl=https%3a%2f%2fsome.page.com%2blah%2fFolder.aspx';
$ua->cookie_jar( HTTP::Cookies->new( file => $cookie_file ));
my $res = $ua->request(GET $url);
my $tree = HTML::TreeBuilder->new;
$tree->parse($res->content);
$tree->eof();
my @forms = $tree->find_by_tag_name('FORM');
die "No forms found" unless @forms;
my $f = HTTP::Request::Form->new($forms[0], $url);
$f->field("user", 'my_username');
$f->field("pass", 'my_password');
my $response = $ua->request($f->press("submitButton"));
print $response->status_line;
print $response->content;
When I log in with a browser, I see a Folder page. When I look at the output of the last line ($response->content), however, I see the HTML of the initial login page. It's as if the server simply doesn't see the credentials I'm passing. If it did, it would give me the HTML of an "Invalid credentials" page. Also, the status code is 302 Found, and the headers start with a redirect message of "Object moved to...". The script isn't generating the cookies.txt file either, and I assume it needs one, since I see a lot of authentication info in my Firefox cookies after I visit the website. What am I doing wrong here? Thank you for any help or pointers.
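A possible explanation, offered as a sketch under assumptions rather than a confirmed fix: by default LWP::UserAgent follows redirects only for GET and HEAD requests, so a 302 returned in answer to the login POST is handed back to the script as-is. HTTP::Cookies also writes its file only when save is called or autosave is enabled. The example below configures both. The helper name build_agent is hypothetical, and nothing in the post tells us whether this particular site needs anything more.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Cookies;

# Hypothetical helper: build a user agent that keeps cookies on disk
# and follows redirects that come back from a POST.
sub build_agent {
    my ($cookie_file) = @_;
    my $ua = LWP::UserAgent->new;

    # autosave => 1 writes the jar to $cookie_file when the jar object
    # is destroyed; ignore_discard => 1 also saves session cookies,
    # which login pages commonly use (an assumption about this site).
    $ua->cookie_jar( HTTP::Cookies->new(
        file           => $cookie_file,
        autosave       => 1,
        ignore_discard => 1,
    ) );

    # The default list is GET and HEAD only, so add POST to let the
    # agent follow the 302 that answers the login form.
    push @{ $ua->requests_redirectable }, 'POST';

    return $ua;
}

my $ua = build_agent('cookies.txt');

# Show which request methods will now have their redirects followed.
print join(' ', @{ $ua->requests_redirectable }), "\n";
```

The resulting $ua could be used in place of the one in the script above, with the rest of the form handling left unchanged.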