FETCH & STRIP via PERL FROM HTML

Status
Not open for further replies.

vynum

Technical User
Joined
May 26, 2001
Messages
4
Location
US
FETCH & STRIP via PERL FROM AN HTML PAGE TABLE

CONNECT AND INSERT THE SELECTED RESULTS INTO A "MySQL" DATABASE.

I HAVE A QUESTION. I WANT TO FETCH A WEB PAGE BUT FILTER/DISREGARD MOST OF THE
CONTENTS, AND ONLY KEEP, INSERT INTO A DATABASE & DISPLAY THE RESULTS THAT I
NEED.

There are many more things I will need to do with this data in "real time" and
after I place it in my database, but the most important thing right now is
finding out how to fetch & strip from an HTML data source. After I get this
down pat I would like to insert the results into a MySQL database so I can
further query the results through another EXTERNAL program/script such as PHP
or Perl. What would be really great is if I could do all of this at once: 1.)
Fetch Page 2.) Strip 3.) Display Results 4.) Insert/Add to MySQL Database. Then
on top of this do some comparisons and display them.

An EXACT example: let's say I have a guy who wants to join my clan, but the
rule is he has to have a minimum number of kills/frags in order to become a
full-fledged clan member. Let's say 500 frags. He can still apply, but when he
signs up to become a member I want to ensure he is a real multiplayer gamer and
verify the NICK NAME he is applying with & his FRAGS. I ask him to INPUT his
'NICK NAME' and 'CLAN LOGO'. Then when he presses 'SUBMIT' I will "FETCH" his
'NICK NAME' AND 'CLAN LOGO' from the stats site and check his current status.
IF he does not meet certain criteria, he will be labeled as a "pending" member,
so to speak. Every day I would like a Perl or some other type of script to
fetch & check how he's doing (I do not have cron). Once he reaches 500 frags
under the particular name he signed up under, he will become a FULL-FLEDGED
clan member. I hope I'm making sense here. Later on I would like to get kind of
fancy with my HTML, PHP or whatever and have a status page of all members and
member wannabes.

Does anyone know of a Perl, Tcl, Python or PHP program/script that someone has
made that would allow me to do this? I mean the "FETCHING & STRIPPING" part.
I know stripping is not the proper term here, but it's just used so you will
understand me. Any code or help would be appreciated. Having someone create a
script from scratch to do all the things I want to do is asking way too much, I
know. At this point I'm mainly concerned with learning the "technique" to
$fetch and $strip.

WHAT I WANT TO DO AND WHY?

I want to fetch web stats for online multiplayer gamers from the website, but
strip out or filter in some way all of their layout and garbage stuff and only
"retain" or "keep" the info in the |frags| column. If I could just learn how to
&fetch and then strip the HTML output down to only what I need, this would be
great. If I had a script of some sort that could also at the same time connect
to my MySQL database and perform an INSERT into the correct table, I think I
could handle the rest from there. The main thing is that the script needs to do
a little checking, verifying and response output on what it found.

PRIME EXAMPLE:


Take the URL above for a real-world example. If you go to this URL it will
bring you to my own personal stats page. Notice that my info is in the URL: my
'nickname' and 'clanlogo' after <name=..> VyNuM+%3DVyNuM%3D

In the HTML submit form I typed: VyNuM =VyNuM= but it looks like the .asp put
"%3D" in place of each '=' character and '+' in place of the space. Which is
OK. Now, if I created a regular HTML form with a SUBMIT with a URL like this:


and placed it on my own page, it would work perfectly. If you go to the first
original URL under PRIME EXAMPLE you will see a nice TABLE display of my
stats. The only data I'm really concerned with capturing after a '$fetch' for
now is the |frags| column. And of course I want to verify the NICK & LOGO also.
Yes, I want the NICK & LOGO to be an exact match.
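
A note on the "%3D" seen above: it is the standard URL encoding of '=' (hex 3D), and '+' stands for the space, so the page is not dropping anything. A minimal sketch of building such a query string by hand follows. The base URL is a made-up placeholder, since the real address is not shown in the post; only the "name" parameter comes from the thread.

```perl
use strict;
use warnings;

# Encode a form value the way a browser does for a GET request:
# spaces become '+', other special characters become %XX.
# Assumes plain ASCII input, which is enough for nicknames like this one.
sub form_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-_.~ ])/sprintf('%%%02X', ord($1))/ge;
    $text =~ tr/ /+/;
    return $text;
}

# Hypothetical base URL; replace with the real stats page address.
my $base_url = 'http://stats.example.com/player.asp';

sub stats_url {
    my ($nick_and_logo) = @_;
    return $base_url . '?name=' . form_encode($nick_and_logo);
}

print stats_url('VyNuM =VyNuM='), "\n";
# prints http://stats.example.com/player.asp?name=VyNuM+%3DVyNuM%3D
```

The URI::Escape module from CPAN covers the same ground more thoroughly; the hand-rolled version is only here to show what the encoding does.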

I know I spent a lot of time typing here for something that may really be
simple. Simple meaning &fetching and stripping. I read somewhere today that you
can actually fetch and strip into plain text. Well, I don't think this would
work, because then I lose any TAGs or markers that signify what I want to
strip/disregard and what I want to keep or save into my database, so to speak.

WHAT I AM NOT CONCERNED ABOUT:

I am not really concerned at this point with actually keeping a STATISTICAL
record. I simply want to verify an applicant's NICKNAME & CLANLOGO and
categorize him based on the response.


<td align=left><b><a href="ranks.asp?gt=236&per=176&ofs=89&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>100</b></td><td><b>5,103</b></td><td>309</td><td>801</td><td>105</td><td colspan=2>2.59</td>

##### CAPTURING SPECIFIC DATA FROM A FETCHED HTML SOURCE ##### FETCH-N-STRIP V
1.0 - Is it possible a new script is in the making?

IN ADDITION TO MY POST I WOULD LIKE TO ADD THAT I HAVE NOTICED SOMETHING VERY
UNIQUE IN THE HTML CODE OUTPUT OF MY OWN PERSONAL CLAN STATS. THUS, I MIGHT BE
ABLE TO GET BY WITH SIMPLY ASSIGNING HTML <tags> AS MARKER POINTS IN A PERL
SCRIPT, THEN DISREGARD THE JUNK, OUTPUT THE RESULTS TO HTML AND INSERT THEM
INTO MySQL IN REAL TIME.

I went to the URL:
I encourage you to do the same, so you can walk along with me. You can also
quickly save the HTML page above and open it in an HTML/text editor. I
preferred a text editor because I can quickly do 'search' & 'count' functions.
Also, the color-coded syntax makes it fun and easier to work with.

I saved the page as an HTML page to my disk. I opened it in Dreamweaver
UltraDev and also in both of my favorite text editors, "Texturizer & Ultra
Edit 32". I wanted to see if I could find and use a particular HEADER
REFERENCE (<a href="">) as a starting point and a '<td' as an ending point or
marker in Perl, as in the two lines below. (NOTE: After saving to text I
conclude that format would probably cause more code to be written, but I'm not
sure. Simply save the page as "text only" and you will see what I'm talking
about.)

$beginmark = 'href=';   # Start point for the script's search.
$endmark   = '<td';     # End point where the script stops its search.

Furthermore, to use this as a starting point, I also wanted to make sure that
the site's output was consistent and always the same. Well, lo and behold, it
is. This helps me greatly. Now I can narrow down the parts that interest me the
most and possibly create a script to &fetch and maybe save the HTML to a
/temp.txt file for parsing, then parse, copy or output only the data I need
into memory, or better yet just CONNECT to MySQL and 'INSERT INTO' <table> the
specified data found and marked to be saved in real time via Perl.
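
For the 'INSERT INTO' step, Perl's usual route is the DBI module with the DBD::mysql driver. The sketch below is only an illustration: the table name, column names and database name are invented for the example, and the connect call is shown in a comment so the snippet compiles even where DBI is not installed.

```perl
use strict;
use warnings;

# Hypothetical table and column names; adjust to your own schema.
my $insert_sql =
    'INSERT INTO applicants (nickname, clanlogo, frags) VALUES (?, ?, ?)';

# $dbh is a connected DBI database handle, obtained for example with:
#   use DBI;
#   my $dbh = DBI->connect('DBI:mysql:database=clan;host=localhost',
#                          $user, $password, { RaiseError => 1 });
sub save_frags {
    my ($dbh, $nick, $logo, $frags) = @_;
    # The ? placeholders let DBI quote the values safely.
    return $dbh->do($insert_sql, undef, $nick, $logo, $frags);
}
```

Passing the values as bind parameters rather than pasting them into the SQL string matters here, because the nickname comes straight from a web form.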

FIRST THING - I noticed the very first HEADER REFERENCE found in a SEARCH:

<a href="ranks.asp?gt=236&per=176&ofs=89&pid=6228929">

This ( <a href=""> ) could represent the starting point of the stats that I
need. There are repeating HEADER REFERENCES of exactly the same form, each
followed, in <td> |TABLE FORMAT| </td> cells, by the following:

(TYPE|RANK|RATING|MINUTES|*FRAG*|AVERAGE PING|AVERAGE FRAGS PER/MIN|)

IN THE SNIPPET BELOW YOU WILL SEE THIS along with the proper HTML <td></td> &
<b></b> tags.

* NOTE: AT THE CURRENT TIME I AM ONLY INTERESTED IN THE *FRAG* PART.

* ALSO NOTE:

Since I do also want to verify the 'NICK NAME' & 'CLAN LOGO', I would like to
mention that the "very first" <INPUT TYPE=TEXT NAME="name" VALUE="VyNuM
=VyNuM="> found in a SEARCH gives me the 'NICKNAME' & 'CLANLOGO'. This could
also be the mark in Perl for a TRUE or FALSE existence test: does the
applicant's name exist EXACTLY?
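
That TRUE/FALSE test could be sketched roughly as follows, assuming the INPUT tag always looks like the one quoted above (NAME before VALUE, double-quoted values). The sub names are made up for the example.

```perl
use strict;
use warnings;

# Return the VALUE of the first <INPUT ... NAME="name" ...> tag, or undef.
sub page_nickname {
    my ($html) = @_;
    my ($value) = $html =~
        /<input\b[^>]*\bname\s*=\s*"name"[^>]*\bvalue\s*=\s*"([^"]*)"/i;
    return $value;
}

# True only when the page shows exactly the name the applicant typed.
sub nickname_matches {
    my ($html, $applied_as) = @_;
    my $found = page_nickname($html);
    return defined $found && $found eq $applied_as;
}

my $sample = '<INPUT TYPE=TEXT NAME="name" VALUE="VyNuM =VyNuM=">';
print nickname_matches($sample, 'VyNuM =VyNuM=') ? "match\n" : "no match\n";
# prints match
```

The comparison uses eq, so case and spacing must agree exactly, which is what the post asks for.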

HERE IS A SNIPPET OF THE PAGE WHICH INCLUDES ALL OF THE STATS FROM <BEGINNING
TO END>. ANYTHING ELSE ABOVE OR BELOW THE SNIPPET WAS THROWN AWAY/DUMPED,
EXCEPT FOR THE "Very First" INPUT statement:

#BEGIN-SNIPPET

<INPUT TYPE=TEXT NAME="name" VALUE="VyNuM =VyNuM=">

#SNIPPET

<td align=left><b><a href="ranks.asp?gt=236&per=176&ofs=89&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>100</b></td><td><b>5,103</b></td><td>309</td><td>801</td><td>105</td><td colspan=2>2.59</td></tr>
<tr align=right><td align=left><b>OFFICIAL</b></td><td align=left><b>May 13 - May 19 2001</b></td><td align=left><b><a href="ranks.asp?gt=236&per=175&ofs=66&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>77</b></td><td><b>5,696</b></td><td>585</td><td>1,228</td><td>627</td><td colspan=2>2.10</td></tr>
<tr align=right><td align=left><b>history</b></td><td align=left><b>May 6 - May 12 2001</b></td><td align=left><b><a href="ranks.asp?gt=236&per=174&ofs=27&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>38</b></td><td><b>7,258</b></td><td>1,175</td><td>2,371</td><td>130</td><td colspan=2>2.02</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Apr 29 - May 5 2001</b></td><td align=left><b><a href="ranks.asp?gt=236&per=173&ofs=91&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>102</b></td><td><b>4,978</b></td><td>974</td><td>1,631</td><td>188</td><td colspan=2>1.67</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Apr 22 - Apr 28 2001</b></td><td align=left><b><a href="ranks.asp?gt=236&per=172&ofs=111&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>122</b></td><td><b>4,474</b></td><td>548</td><td>951</td><td>113</td><td colspan=2>1.74</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Apr 15 - Apr 21 2001</b></td><td align=left><b><a href="ranks.asp?gt=236&per=171&ofs=63&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>74</b></td><td><b>5,547</b></td><td>945</td><td>1,548</td><td>104</td><td colspan=2>1.64</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Apr 8 - Apr 14 2001</b></td><td align=left><b><a href="ranks.asp?gt=236&per=170&ofs=46&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>57</b></td><td><b>5,863</b></td><td>669</td><td>1,356</td><td>86</td><td colspan=2>2.03</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Apr 1 - Apr 7 2001</b></td><td align=left><b><a href="ranks.asp?gt=236&per=169&ofs=53&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>64</b></td><td><b>5,716</b></td><td>268</td><td>744</td><td>72</td><td colspan=2>2.78</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Mrt 25 - Mrt 31 2001</b></td><td align=left><b><a href="ranks.asp?gt=4526&per=168&ofs=1242&pid=6228929">Half-Life Frontline</a></b></td><td><b>1,253</b></td><td><b>3,552</b></td><td>338</td><td>584</td><td>0</td><td colspan=2>1.73</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b> </b></td><td align=left><b><a href="ranks.asp?gt=236&per=168&ofs=522&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>533</b></td><td><b>2,733</b></td><td>391</td><td>472</td><td>96</td><td colspan=2>1.21</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Mrt 18 - Mrt 24 2001</b></td><td align=left><b><a href="ranks.asp?gt=4526&per=167&ofs=1544&pid=6228929">Half-Life Frontline</a></b></td><td><b>1,555</b></td><td><b>2,348</b></td><td>306</td><td>360</td><td>0</td><td colspan=2>1.18</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b> </b></td><td align=left><b><a href="ranks.asp?gt=236&per=167&ofs=65&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>76</b></td><td><b>6,870</b></td><td>408</td><td>1,184</td><td>103</td><td colspan=2>2.90</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Mrt 11 - Mrt 17 2001</b></td><td align=left><b><a href="ranks.asp?gt=236&per=166&ofs=13&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>24</b></td><td><b>8,492</b></td><td>833</td><td>2,657</td><td>178</td><td colspan=2>3.19</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Mrt 4 - Mrt 10 2001</b></td><td align=left><b><a href="ranks.asp?gt=236&per=165&ofs=56&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>67</b></td><td><b>6,058</b></td><td>713</td><td>1,495</td><td>117</td><td colspan=2>2.10</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Feb 25 - Mrt 3 2001</b></td><td align=left><b><a href="ranks.asp?gt=236&per=164&ofs=1214&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>1,225</b></td><td><b>664</b></td><td>74</td><td>206</td><td>77</td><td colspan=2>2.78</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Feb 18 - Feb 24 2001</b></td><td align=left><b><a href="ranks.asp?gt=236&per=163&ofs=74&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>85</b></td><td><b>5,768</b></td><td>812</td><td>1,407</td><td>86</td><td colspan=2>1.73</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Feb 11 - Feb 17 2001</b></td><td align=left><b><a href="ranks.asp?gt=236&per=162&ofs=123&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>134</b></td><td><b>4,381</b></td><td>504</td><td>859</td><td>76</td><td colspan=2>1.70</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Feb 4 - Feb 10 2001</b></td><td align=left><b><a href="ranks.asp?gt=236&per=161&ofs=109&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>120</b></td><td><b>4,838</b></td><td>497</td><td>963</td><td>118</td><td colspan=2>1.94</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Jan 28 - Feb 3 2001</b></td><td align=left><b><a href="ranks.asp?gt=236&per=160&ofs=141&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>152</b></td><td><b>4,494</b></td><td>440</td><td>849</td><td>99</td><td colspan=2>1.93</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Jan 14 - Jan 20 2001</b></td><td align=left><b><a href="ranks.asp?gt=236&per=158&ofs=102&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>113</b></td><td><b>5,232</b></td><td>560</td><td>1,137</td><td>106</td><td colspan=2>2.03</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Jan 7 - Jan 13 2001</b></td><td align=left><b><a href="ranks.asp?gt=236&per=157&ofs=63&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>74</b></td><td><b>5,786</b></td><td>775</td><td>1,612</td><td>147</td><td colspan=2>2.08</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Dec 31 - Jan 6 2001</b></td><td align=left><b><a href="ranks.asp?gt=236&per=156&ofs=63&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>74</b></td><td><b>5,719</b></td><td>777</td><td>1,696</td><td>111</td><td colspan=2>2.18</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Dec 24 - Dec 30 2000</b></td><td align=left><b><a href="ranks.asp?gt=236&per=155&ofs=288&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>299</b></td><td><b>3,318</b></td><td>302</td><td>506</td><td>83</td><td colspan=2>1.68</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Dec 17 - Dec 23 2000</b></td><td align=left><b><a href="ranks.asp?gt=236&per=154&ofs=52&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>63</b></td><td><b>6,176</b></td><td>535</td><td>1,332</td><td>132</td><td colspan=2>2.49</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Dec 10 - Dec 16 2000</b></td><td align=left><b><a href="ranks.asp?gt=236&per=153&ofs=284&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>295</b></td><td><b>3,584</b></td><td>480</td><td>747</td><td>117</td><td colspan=2>1.56</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Dec 3 - Dec 9 2000</b></td><td align=left><b><a href="ranks.asp?gt=236&per=152&ofs=118&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>129</b></td><td><b>4,858</b></td><td>645</td><td>1,255</td><td>102</td><td colspan=2>1.95</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Nov 26 - Dec 2 2000</b></td><td align=left><b><a href="ranks.asp?gt=236&per=151&ofs=19&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>30</b></td><td><b>7,368</b></td><td>652</td><td>1,829</td><td>116</td><td colspan=2>2.81</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Nov 19 - Nov 25 2000</b></td><td align=left><b><a href="ranks.asp?gt=236&per=150&ofs=45&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>56</b></td><td><b>6,392</b></td><td>746</td><td>1,739</td><td>113</td><td colspan=2>2.33</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Nov 12 - Nov 18 2000</b></td><td align=left><b><a href="ranks.asp?gt=236&per=149&ofs=33&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>44</b></td><td><b>6,766</b></td><td>824</td><td>1,921</td><td>107</td><td colspan=2>2.33</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Nov 5 - Nov 11 2000</b></td><td align=left><b><a href="ranks.asp?gt=236&per=148&ofs=86&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>97</b></td><td><b>5,557</b></td><td>914</td><td>1,776</td><td>174</td><td colspan=2>1.94</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Oct 29 - Nov 4 2000</b></td><td align=left><b><a href="ranks.asp?gt=236&per=147&ofs=66&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>77</b></td><td><b>5,744</b></td><td>577</td><td>1,263</td><td>112</td><td colspan=2>2.19</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Oct 22 - Oct 28 2000</b></td><td align=left><b><a href="ranks.asp?gt=236&per=146&ofs=178&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>189</b></td><td><b>4,653</b></td><td>534</td><td>909</td><td>113</td><td colspan=2>1.70</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Oct 15 - Oct 21 2000</b></td><td align=left><b><a href="ranks.asp?gt=236&per=145&ofs=171&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>182</b></td><td><b>4,774</b></td><td>453</td><td>954</td><td>121</td><td colspan=2>2.11</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Oct 8 - Oct 14 2000</b></td><td align=left><b><a href="ranks.asp?gt=236&per=144&ofs=312&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>323</b></td><td><b>3,758</b></td><td>166</td><td>447</td><td>121</td><td colspan=2>2.69</td></tr>
<tr align=right><td align=left><b> </b></td><td align=left><b>Oct 1 - Oct 7 2000</b></td><td align=left><b><a href="ranks.asp?gt=236&per=143&ofs=19&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>30</b></td><td><b>7,368</b></td><td>722</td><td>1,997</td><td>134</td><td colspan=2>2.77</td></tr>

## END SNIPPET

Now let's take a closer look at the first <a href=""> from the snippet above.

<td align=left><b><a href="ranks.asp?gt=236&per=176&ofs=89&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>100</b></td><td><b>5,103</b></td><td>309</td><td>801</td><td>105</td><td colspan=2>2.59</td>

IMPORTANT NOTE: I need to 'skip' this HEADER REFERENCE. Why? Because if you
take a look at the web page in your browser you will notice that, in the first
column marked |CLQ|, this first "ranks.asp?gt=" row is (IN PROGRESS). Its value
can still change, so it could end up either HIGHER or LOWER than the 500 frags
needed for the applicant, or it might never reach the required minimum of 500
while it is still considered "IN PROGRESS". So, with this in mind, I need to
have the script skip to the next (SECOND) '<a href="">' HEADER REFERENCE. The
second HEADER REFERENCE and ALL following ones have fixed values that will not
change. I need to use these to verify credentials. Furthermore, I think it is
possible, and an even better idea, to have the script go through and "add up"
the |FRAGS| for each consecutive HEADER REFERENCE after that. Now it shouldn't
take long for anyone to kill 500 people. In the game, of course. :)

Let me explain a little further. Exactly which number field do I need to get
to and add up? Well, we know it is under the |FRAG| column. So, from looking at
the SECOND "header reference" below, taken out of the snippet, I can easily
count the number of <td>'s and come to my |FRAG| column. But before we go to
the second <a href="">, let's look at the first and see how many frags I have
'IN PROGRESS'. From looking at it below, my IN PROGRESS FRAGS = 801.

First let's act like a Perl script would hopefully be designed to, and count
up the <td>'s to see how many it takes to get to the 801 figure (THE |FRAG|
COLUMN), starting with the very first <td. Let's count to see how many licks it
will take to get to the |FRAG| numerical value.

The answer is: 5 # EXCLUDING ANY CLOSING TAGS </td>. THE MAGIC NUMBER = 801

<td align=left><b><a href="ranks.asp?gt=236&per=176&ofs=89&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>100</b></td><td><b>5,103</b></td><td>309</td><td>801</td><td>105</td><td colspan=2>2.59</td>

If you count over to the fifth <td> you will end up at the frag column, and the
total amount thus far equals 801. I did not include the closing tags </td>.
Please remember that this is still the first "IN PROGRESS" frag count. So let's
tell Perl to skip it and go on to the second "ROW" under the column |FRAG|.
Here it is, taken from the SNIPPET. If you open up your browser again you will
notice we are now looking at the "OFFICIAL" row. This is the row where the
script needs to begin its work. The following rows are "HISTORY". Along with
OFFICIAL, the HISTORY frags are all permanent values. WHOOPEE :).

<tr align=right><td align=left><b>OFFICIAL</b></td><td align=left><b>May 13 - May 19 2001</b></td><td align=left><b><a href="ranks.asp?gt=236&per=175&ofs=66&pid=6228929">Q2 Weapons of Destruction</a></b></td><td><b>77</b></td><td><b>5,696</b></td><td>585</td><td>1,228</td><td>627</td><td colspan=2>2.10</td>
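
The "count five <td>'s from the href" idea can be sketched like this. It is only one way to do it: the sub name is made up, and it assumes the frag cell is always the fifth cell counting the one that holds the ranks.asp link as the first, as in the rows quoted above.

```perl
use strict;
use warnings;

# Return the frag count from one table row: find the cell holding the
# ranks.asp link, then take the 5th cell counting that one as the 1st.
sub frags_from_row {
    my ($row) = @_;
    # Capture every cell body; </\s*td> tolerates a stray space as in "</ td>".
    my @cells = $row =~ m{<td[^>]*>(.*?)</\s*td>}gis;
    for my $i (0 .. $#cells) {
        next unless $cells[$i] =~ /ranks\.asp/;
        my $frags = $cells[$i + 4];
        return unless defined $frags;   # row is too short
        $frags =~ s/<[^>]+>//g;         # drop inner tags such as <b>
        $frags =~ s/[,\s]//g;           # "1,228" -> "1228"
        return $frags;
    }
    return;                             # no ranks.asp link in this row
}

my $in_progress =
      '<td align=left><b><a href="ranks.asp?gt=236&per=176&ofs=89&pid=6228929">'
    . 'Q2 Weapons of Destruction</a></b></td><td><b>100</b></td>'
    . '<td><b>5,103</b></td><td>309</td><td>801</td><td>105</td>'
    . '<td colspan=2>2.59</td>';
print frags_from_row($in_progress), "\n";   # prints 801
```

Anchoring on the link cell rather than on the start of the row means the same count works for the OFFICIAL and history rows, which carry two extra cells in front.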



QUESTION: If I applied for membership in the =VyNuM= clan, from looking at
the above, would I qualify to become a FULL-FLEDGED MEMBER? Or just a
"pending" member? You betcha. I fragged 1,228 dudes from May 13 - May 19 of
this year. I know, where do I find the time to goof off? HEHE :)

QUESTION: What if the applicant just got started and didn't play as many
days as I did in this time period? Say his first week he got past 500, but
this week he only fragged 499? Well, if the script was designed to look only at
this figure and this "ROW" alone, he would get pretty pissed and email me bad
junk mail. Nah, but this is the reason I think the script needs to go ahead
with the rest of the |FRAG|s and add them all up to get a TRUE GRAND TOTAL. Now
if the applicant still doesn't have at least 500 frags... well, he needs to get
to work and earn his rewards.
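
The grand-total idea might look something like the sketch below. It assumes the frag cells have already been pulled out of the page in page order, so the first one is the "in progress" week; the sub and variable names are invented for the example.

```perl
use strict;
use warnings;

my $MIN_FRAGS = 500;   # the clan's minimum, from the post

# Sum the frag column over all rows except the first ("in progress") one.
# Each argument is the raw cell text, e.g. "1,228".
sub total_completed_frags {
    my @frag_cells = @_;
    shift @frag_cells;                 # skip the in-progress week
    my $total = 0;
    for my $cell (@frag_cells) {
        (my $n = $cell) =~ tr/,//d;    # remove thousands separators
        $total += $n;
    }
    return $total;
}

# First four frag values from the snippet:
# in progress, OFFICIAL, then two history rows.
my @frags = ('801', '1,228', '2,371', '1,631');
my $total = total_completed_frags(@frags);
printf "%d frags, %s\n", $total, $total >= $MIN_FRAGS ? 'qualifies' : 'pending';
# prints 5230 frags, qualifies
```

Stripping the commas first matters: Perl would otherwise read "1,228" as the number 1.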

One last thing. I'm sure you've noticed by now that there appear to be some
more things in the HTML that could probably be of great use to me. Well, there
are. The member's ID # from the stats site is very important. If I had the
script also place it ("INSERT INTO") in MySQL, I could create a form field on a
web page and have my clan members quickly look up their stats from this ID
alone. Well, I don't wanna get too fancy right now, but maybe I should. I could
always give the script configuration variables to turn on "1" or turn off "0"
the fields I wish to fetch, include, add, edit, display etc.

By now, if you have read this far, I want to thank you. Hopefully you can now
see exactly what I have been striving for. And yes, maybe it does look like I
need to create a script from scratch. I just hope there are some Perl, PHP etc.
experts out there who would love to donate some time to this and help me out. I
have the idea, I just don't have the PERL knowledge to do it all by myself.
Maybe we can create a generic script and make it configurable to fetch and
insert into SQL databases etc. Sell it, get rich and go fishing. I'm game for
anything that makes money. But for now I need to get this thing a crankin.

Any help from anyone is and will be greatly appreciated.

VyNuM =VyNuM= "If it works, sumthenz wrong"

P.S. Is it possible to use HTML <tags>X-MARKS-THE-SPOT</tags> as markers in
Perl? Maybe stripping is not what I want to do. Capturing marked points in a
web page sounds like a winner to me.
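
For what it's worth, capturing the text between two markers is a one-line regular expression in Perl. A minimal sketch, with a made-up sub name and a shortened sample link:

```perl
use strict;
use warnings;

# Return the text between the first $begin marker and the next $end marker,
# or undef if the markers are not found.
sub between_markers {
    my ($text, $begin, $end) = @_;
    # \Q...\E makes the markers literal; /s lets . cross line breaks.
    my ($inside) = $text =~ /\Q$begin\E(.*?)\Q$end\E/s;
    return $inside;
}

my $html = '<td><b><a href="ranks.asp?gt=236">Q2 Weapons of Destruction</a></b></td>';
print between_markers($html, '">', '</a>'), "\n";
# prints Q2 Weapons of Destruction
```

The non-greedy (.*?) stops at the first end marker, so the capture does not run on to a later </a> further down the page.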
 
wow. that's a lot of questions there. i'll try to get to it soon, but i don't have time while i'm at work. one question: have you looked into XML for the encapsulation of your data? XML is much easier to strip off of the data to get at its juicy innards. it's not too much different from HTML, just much more rigorously defined (and the better defined the wrapper is, the easier it is to take off).

regards,
stillflame "If you think you're too small to make a difference, try spending a night in a closed tent with a mosquito."
 
Yes, it is a bit long and I'm sorry for that, but I just wanted to make sure someone completely understood my needs.
Trying not to repeat myself too much, let's take an example:

You come to my website and want to join and become a member. You click join button. You are presented with a somewhat standard form to fill out.

#FORM FIELDS - Not all listed but to show just as example.
'firstname'
'clanname' (example VyNuM)
'clanlogo' (example =VyNuM=)
'email'
'homepage'
'icq'
'country'
'picture'
'comments'

When you press submit, you are shown an intermission window saying, "Please wait while we verify your statistics". In the background the script fetches your 'clanname' and 'clanlogo' and performs a true/false test on the number of frags found. If the exact 'clanname' and 'clanlogo' have been found but the |FRAGS| test returns false (below 500), you are accepted into the clan, but as a pending member.

If all info returned is true you become a 'fullmember'.
If the script retrieves no information then you will have to go into the game and frag some people and build up a |FRAG| count first. This is to show that you are and will be LOYAL to the clan. You see, the =VyNuM= clan will be offering some special benefits in the future and I want to make sure that not just any Tom, Dick & Harry can simply fill out a signup form and reap all the benefits without actually playing online multiplayer.
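
The three outcomes described above could be decided in one small sub. This is only a sketch: the status strings and names are placeholders, and it assumes the name check and the frag total have already been computed elsewhere.

```perl
use strict;
use warnings;

my $MIN_FRAGS = 500;

# $name_found:  true if the exact nickname/logo was on the stats page.
# $total_frags: summed frags, or undef if no stats were retrieved.
sub membership_status {
    my ($name_found, $total_frags) = @_;
    return 'no stats yet' unless $name_found && defined $total_frags;
    return $total_frags >= $MIN_FRAGS ? 'full member' : 'pending member';
}

print membership_status(1, 1228), "\n";   # prints full member
print membership_status(1, 499), "\n";    # prints pending member
```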

It would be great if someone could take my idea and cleverly create a script in a GENERIC sense, one that is easily modifiable and configurable and performs this type of fetch and strip. Such a script could actually be used for many purposes.

XML:

At present I'm cramming my brain with MySQL and PHP4. I have read a little about XML. I will study some more on its capabilities. Thank you for sharing that with me. I'll try to learn about "encapsulation" and get a better understanding of how XML utilizes it and how it can be incorporated with MySQL.

VyNuM =VyNuM=
"If it works, sumthenz wrong"
 
okay, disregard my post about XML. i hadn't actually read all of your questions, and as you have no control over the format of the data you're reading, you can't very well convert its format beforehand.

now, for the path to completing this. either of the modules LWP or LWP::Simple is what you need to "fetch" the html. download one of these (you will only really need the simple one, but you can get the full one for potential later use), and read through the documents that come with it. your server may already have it installed, otherwise, you may have to just have a copy of it in your cgi-bin (i think that would work...). these modules are really easy to use for what you're doing. you can search this forum for examples, or ask another question if you can't get it to work.
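
A minimal sketch of the fetch step with LWP::Simple, whose get() function returns the page body as one string, or undef if the request fails. The sub name is made up, and the module is loaded with require inside the sub only so the script still compiles on a machine without LWP; normally you would simply write "use LWP::Simple;" at the top.

```perl
use strict;
use warnings;

# Fetch a page and return its HTML as one scalar, or undef on failure.
sub fetch_page {
    my ($url) = @_;
    require LWP::Simple;               # loaded on first use
    return LWP::Simple::get($url);
}

# Only fetch when a URL is given on the command line.
if (@ARGV) {
    my $html = fetch_page($ARGV[0]);
    die "Could not fetch $ARGV[0]\n" unless defined $html;
    print length($html), " characters fetched\n";
}
```

Run it as, for example, `perl fetch.pl http://www.example.com/` (the script name and URL are placeholders).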

then, for the "stripping" process, you'll use regular expressions. after you've read the html of the page into a single scalar variable (don't worry, this will make sense once you have LWP working), you'll do exactly what you thought you would: look for a certain html tag that signifies the beginning of the data you want, then separate it out. from here, the best idea i can come up with is to turn it into a two-dimensional array using the split function. basically, since it's just a simple table, first you'll split it up into rows by looking for "</tr><tr>", then you'll split it up by columns on the "</td><td>" parts. here's how that would look:[tt]
# keep only what sits between the marker tags (/s lets . span newlines)
my ($table) = $html_page =~ m~<tag>(.*?)</tag>~is;
# [^>]* allows attributes, as in <tr align=right> or <td colspan=2>
my @rows = split(m~\s*</tr>\s*<tr[^>]*>\s*~i, $table);
@rows = map {[split(m~\s*</\s*td>\s*<td[^>]*>\s*~i, $_)]} @rows;

#@rows is now a 2D array
#you can then do things like:

my $total_frags = 0;
for my $row (@rows) {
    next unless defined $row->[6];          # skip short or partial rows
    (my $frags = $row->[6]) =~ tr/,//d;     # "1,228" -> "1228"
    $total_frags += $frags;
}[/tt]

the most important part will be that first regular expression: you need to make sure that it only matches the data inside the correct table, not including any "<table>" tags or anything. it may be that since the table in question is the last table on the page, you could use a regex more like "s~.*<table[^>]*>(.*?)</table>.*~$1~is", but you'll have to play with it to get it to work correctly and in all circumstances.
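
Here is a small self-contained sketch of the "last table on the page" trick, using a match instead of a substitution. The sub name and the sample page are made up, and it assumes the tables are not nested inside one another.

```perl
use strict;
use warnings;

# Return the contents of the last <table>...</table> on the page, or undef.
sub last_table {
    my ($html) = @_;
    # The greedy .* runs to the end of the page and then backs up
    # until it finds the final <table ...> opening tag.
    my ($inner) = $html =~ m~.*<table\b[^>]*>(.*?)</table>~is;
    return $inner;
}

my $page = '<table><tr><td>menu</td></tr></table>'
         . '<p>text</p>'
         . '<table border=1><tr><td>stats</td></tr></table>';
print last_table($page), "\n";
# prints <tr><td>stats</td></tr>
```

For pages with nested or badly formed tables, a real parser such as HTML::TableExtract or HTML::Parser from CPAN is likely to hold up better than a regex.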

also, if the nickname or whatever doesn't match an exact one, it should fail when you try to retrieve the page, yes? won't the webpage not even exist in the case of a fake user? you can just look for that: some sort of error message like "user does not exist"... this would be easy to do right off the bat after you've grabbed the page, as it'll likely be easy to notice (a simple regex like m~user does not exist~im, or whatever other unique message the page displays in this case).

for simulating the cron effects, do a search in this forum for "cron". it was a recent thread.
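
That check could be wrapped up like this. The error wording is a guess, as in the post above; look at what the stats page really prints for an unknown player and put that in the pattern.

```perl
use strict;
use warnings;

# Assumed error text; replace with the page's actual message.
my $NOT_FOUND = qr/user does not exist/i;

# True if a page was fetched and it does not carry the error message.
sub player_exists {
    my ($html) = @_;
    return 0 unless defined $html;          # the fetch itself failed
    return $html =~ $NOT_FOUND ? 0 : 1;
}

print player_exists('<p>User does not exist</p>') ? "found\n" : "not found\n";
# prints not found
```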

i hope this puts you on the right track. if you do want to ask more questions that you can't find answers to in the archives, i'd suggest starting new threads. the size of your original question may scare most posters away.

regards,
stillflame "If you think you're too small to make a difference, try spending a night in a closed tent with a mosquito."
 
Stillflame,

Thank you very much for your help. You've given me a good starting point. I'll work with the information you've given me and see what I can do. I will also take your suggestion on starting a new thread. I won't start one for a while, though, since I have a lot of tinkering to do.

This is my LWP:
#
# $Id: LWP.pm,v 1.97 2001/03/14 21:25:28 gisle Exp $

package LWP;

$VERSION = "5.51";
sub Version { $VERSION; }

require 5.004;
require LWP::UserAgent; # this should load everything you need

1;

# I Also have the LWPsimple.pl

I'll keep you informed from time to time and let you know
what I come up with.

VyNuM =VyNuM=
"If it works, sumthenz wrong"
 