Getting certain information from html files.

Status
Not open for further replies.

JBF89

Technical User
Oct 10, 2007
2
GB
Okay, I have only recently started learning Perl, and what I am aiming to do is update an old Perl program I used to use so that it works again.
The old program is H2SA, or Halo 2 Stats Aggregator, which basically would download all the game results pages from Bungie's site, extract your personal stats and present them as a website. But the owner has not updated it in ages, and since the Bungie website has changed, it has become obsolete. What I am trying to do is get it to work with Halo 3.
I've got some of it to work, in that it downloads the new pages, but I am struggling to find a way for it to pull out the GameIds and put them into a text file.
I know how to check for a string appearing in the files, but I don't know how to transfer only part of that string into a text file. I can't put the whole string there, as the whole string is:
"gameid=27968128&amp"
All I want returned is the number, in this case 27968128. Is there a way to do this? If so can someone please tell me how?
Thanks for any help.
Tom
 
You could use the split operator with = and & as the separator, as in:

my @number = split /[=&]/, $GameId;
print "$number[0] = $number[1]\n";
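
For a single string like the one in the question, the split approach can be sketched roughly as below. Note that the pattern needs a character class, [=&], to split on either character; a pattern of /=&/ would only match the two characters sitting next to each other. The sub name gameid_from_string is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the number out of a string like "gameid=27968128&amp".
sub gameid_from_string {
    my ($string) = @_;
    # Split on either "=" or "&", giving ("gameid", "27968128", "amp").
    my @parts = split /[=&]/, $string;
    return $parts[1];
}

print gameid_from_string("gameid=27968128&amp"), "\n";   # prints 27968128
```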


I hope that helps.

Mike.
 
Okay, that looks like it would work, cheers. But how would I get this to work over a large number of ids? What I have is a web page downloaded by wget which contains a long list of links to games, so I use:
Code:
while($i<=$totalpages)
{
        open FILE, "page$i.$user.html" or die "Invalid page0 file.  Is wget installed?\n";
        @lines = <FILE>;
        close FILE;
        foreach $line (@lines)
        {
                if($line =~ m/gameid\=(.+)\&amp/){
                        

                        $numgameids++;
                }
        }
        $i++;
}

This checks for the gameid=...&amp string on each line so how would I integrate split into that?
Cheers.
Tom
 
Code:
my @gameids;    # added: collects every id found
while($i<=$totalpages)
{
        open FILE, "page$i.$user.html" or die "Invalid page0 file.  Is wget installed?\n";
        @lines = <FILE>;
        close FILE;
        foreach $line (@lines)
        {
                # changed: \d+ instead of .+ so only digits are captured
                if($line =~ m/gameid\=(\d+)\&amp/){

                        push @gameids, $1;    # added: $1 holds the captured id
                        $numgameids++;
                }
        }
        $i++;
}
print join("\n", @gameids);    # added: one id per line
The parentheses in the regex already capture the gameid from the line. I've just tightened up the regex to only capture numeric game ids, and stuffed them into an array for later use.
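
To get the collected ids into a text file, as the original question asked, something along these lines might do. It is only a sketch: the sub names extract_gameids and save_gameids, the file name gameids.txt and the sample link are all invented here. It also uses a global match in list context, so a line holding several links yields every id rather than just the first.

```perl
use strict;
use warnings;

# Hypothetical helper: return every game id found in a chunk of HTML.
sub extract_gameids {
    my ($html) = @_;
    # In list context, a /g match returns all captured groups, one per match.
    return $html =~ /gameid=(\d+)&amp/g;
}

# Hypothetical helper: write one id per line to the given file.
sub save_gameids {
    my ($filename, @ids) = @_;
    open my $out, '>', $filename or die "Cannot write $filename: $!\n";
    print {$out} "$_\n" for @ids;
    close $out or die "Cannot close $filename: $!\n";
}

# Made-up sample line standing in for a downloaded page.
my $sample = '<a href="GameStats.aspx?gameid=27968128&amp;player=Tom">game</a>';
save_gameids('gameids.txt', extract_gameids($sample));
```

In the loop above, the same save step could be run once after the while loop has finished, passing it @gameids.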

Steve

"Every program can be reduced by one instruction, and every program has at least one bug. Therefore, any program can be reduced to one instruction which doesn't work." (Object::perlDesignPatterns)
 