Getting Text off of a Website

LaneM1234 (Programmer)
21 Jun 12 14:56
Perl Monks!

I am very new to Perl and am trying to write a script that downloads my homework assignment for a specific day from my teacher's website. He puts our HW on his website, http://staweb.sta.cathedral.org/departments/math/m.... I would like the script, when given a date, to find the corresponding assignment and write it to a blank text file. I have all of the mechanics working except for copying the assignment.

I have been able to use LWP::Simple to fetch the page text, but I don't know how to make the script pick out the corresponding assignment, nor how to print it into a blank text file. I don't think this is very complicated, but I'm really bad at Perl, so any/all help would be appreciated!
Annihilannic (MIS)
4 Jul 12 2:02
Are you still stuck on this? What is your code so far?

Annihilannic
tgmlify - code syntax highlighting for your tek-tips posts

MrCBofBCinTX (TechnicalUser)
4 Jul 12 12:39
I looked at the web page's source.
Pulling the sections out of this one with regexes looks like a real chore.

Do not worry about getting it into a file until you get it to work. print "$blah"; will let you debug without having to peek inside your new file.

This page is "unique" in a sense, since it follows a strict pattern.
One (of many) ways might be to read the web page line by line.
If a line starts with <tr, start appending lines to a variable ($cool .= $line) until you reach a line that starts with </tr. Then push $cool onto an array, or skip straight to the next step below.

Then you can pull out (with a regex) the date section and the HW section.

If date is correct, print that into your file. Done.
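
A minimal sketch of the steps above, using only core Perl. The sample HTML, the date format (6/21), the sub name find_assignment, and the output file name homework.txt are all assumptions for illustration; the real page's markup may differ, and in the real script $html would come from LWP::Simple's get() with the teacher's URL.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample standing in for the fetched page.
# The real markup and date format may differ.
my $html = <<'HTML';
<table>
<tr>
<td width="10%">6/20</td>
<td width="85%">Read section 4.1</td>
</tr>
<tr>
<td width="10%">6/21</td>
<td width="85%">Problems 1-15 odd, p. 212</td>
</tr>
</table>
HTML

# Return the assignment text for the given date, or undef if no row matches.
sub find_assignment {
    my ($page, $wanted) = @_;
    my @rows;
    my $cool;    # undef while we are outside a <tr> ... </tr> block

    # Read line by line, collecting everything from <tr to </tr.
    for my $line (split /\n/, $page) {
        $cool = '' if $line =~ /^\s*<tr/i;
        next unless defined $cool;
        $cool .= $line;
        if ($line =~ m{^\s*</tr}i) {
            push @rows, $cool;
            undef $cool;
        }
    }

    # Pull the date cell and the HW cell out of each collected row.
    for my $row (@rows) {
        my @cells = $row =~ m{<td[^>]*>(.*?)</td>}gis;
        next unless @cells >= 2;
        s/<[^>]+>//g    for @cells;    # drop any nested tags
        s/^\s+|\s+$//g  for @cells;    # trim surrounding whitespace
        return $cells[1] if $cells[0] eq $wanted;
    }
    return;
}

my $date = @ARGV ? $ARGV[0] : '6/21';
my $hw   = find_assignment($html, $date);

if (defined $hw) {
    print "$hw\n";    # check it on screen first
    open my $out, '>', 'homework.txt'
        or die "Cannot write homework.txt: $!";
    print {$out} "$date: $hw\n";
    close $out;
}
else {
    print "No assignment found for $date\n";
}
```

This only works while every <tr and </tr sits at the start of its own line, which is the strict pattern mentioned above; if the page layout changes, the regexes will need adjusting.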

Look at:
perldoc perlrequick
perldoc perlretut
perldoc perlfaq6
perldoc perlre
perldoc perlrebackslash
perldoc perlrecharclass
perldoc perlreref

and
perldoc -f open
Zhris (Programmer)
4 Jul 12 20:10
I would probably use HTML::TableExtract to break the HTML up before considering other methods, i.e. regexes, to extract the specific elements. An alternative, or something to combine with it, would be HTML::TreeBuilder / HTML::Element, which have look_down and address methods. From the supplied webpage I can immediately see common groups, e.g. each date's container cell has a width of 10% and each description's container cell has a width of 85%.
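
A rough sketch of the HTML::TableExtract route, assuming the module is installed from CPAN. The sample HTML, the date format, and the sub name assignment_for are made up for illustration; the real page may need constraints such as headers or depth passed to new() to pick the right table.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TableExtract;

# Hypothetical sample; the real page would be fetched with LWP::Simple.
my $html = <<'HTML';
<table>
<tr><td width="10%">6/20</td><td width="85%">Read section 4.1</td></tr>
<tr><td width="10%">6/21</td><td width="85%">Problems 1-15 odd, p. 212</td></tr>
</table>
HTML

# Return the assignment text for the given date, or undef if not found.
sub assignment_for {
    my ($page, $wanted) = @_;

    # No constraints given, so every table in the page is considered.
    my $te = HTML::TableExtract->new;
    $te->parse($page);

    for my $table ($te->tables) {
        for my $row ($table->rows) {
            next unless @$row >= 2;
            # Empty cells come back undefined, so default them to ''.
            my ($date, $hw) = map { defined $_ ? $_ : '' } @$row;
            s/^\s+|\s+$//g for $date, $hw;
            return $hw if $date eq $wanted;
        }
    }
    return;
}

my $hw = assignment_for($html, '6/21');
print defined $hw ? "$hw\n" : "No assignment found\n";
```

The parser deals with rows and cells that span or share lines, so this does not depend on how the HTML happens to be laid out.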

Chris
