Using <CFFILE> to manipulate a text file

Status
Not open for further replies.

aleci

Programmer
May 22, 2001
40
GB
Hi,
I have the source code of a simple web page in a text file.

I need to 'pick' certain areas out
- such as the text between <a href='>...TEXT I NEED.. </a> -
for entry into a DB field.

Is there a way CFFILE could do this, or is there a function that can recognise various strings?

Thanks
 
Hi Aleci,

CFFILE will read the contents of the file into a variable for you. You'll then need to use the string functions to strip out the text you need. In these types of problems, the structure of the HTML page, and whether it will be changed by others, determines the best way to strip out the text.
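As a rough sketch of that first step, something like the following should read the file. The file name page.txt and the variable names are assumptions made up for illustration, so adjust them to your own setup.

```cfml
<!--- Hypothetical file name; expandPath() resolves it relative to the current template --->
<cfset myFile = expandPath("page.txt")>

<cfif fileExists(myFile)>
  <!--- Read the whole file into the variable myPageVar --->
  <cffile action="read" file="#myFile#" variable="myPageVar">
  <cfoutput>Read #len(myPageVar)# characters.</cfoutput>
<cfelse>
  <cfoutput>File not found: #myFile#</cfoutput>
</cfif>
```

Once the contents are in myPageVar, the string functions can be applied to it like any other string.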

If the text you want is a link and you know the link's URL, you can use the findnocase() function to find the occurrence of the text, like this:

<cfset x=findnocase("<a href=""index.html"">",myPageVar)>
<cfset y=findnocase("</a>",myPageVar,x)>
<cfset myNewText=mid(myPageVar,x+21,y-x-21)>
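The 21 in that snippet is the length of the opening tag being searched for. Here is a small self-contained sketch of the same idea that uses len() instead of a hard-coded number and checks that both searches succeeded. The sample page text and the variable names marker and textStart are made up for illustration.

```cfml
<!--- Hypothetical sample text standing in for the page contents --->
<cfset myPageVar = "<p>See <a href=""index.html"">Home page</a> for details.</p>">

<!--- The opening tag to look for (21 characters long) --->
<cfset marker = "<a href=""index.html"">">
<cfset myNewText = "">

<cfset x = findNoCase(marker, myPageVar)>
<cfif x GT 0>
  <!--- The link text starts right after the opening tag --->
  <cfset textStart = x + len(marker)>
  <cfset y = findNoCase("</a>", myPageVar, textStart)>
  <cfif y GT 0>
    <cfset myNewText = mid(myPageVar, textStart, y - textStart)>
  </cfif>
</cfif>

<!--- Displays "Home page" for the sample text above --->
<cfoutput>#myNewText#</cfoutput>
```

If either search returns 0, myNewText stays empty rather than mid() being called with a bad position.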

Without being able to see the entire page you're working with, it's hard to tell you the best way to strip it out.

Hope this helps,
GJ
 
Hi GJ,
Thanks for your reply, it's pointing me in the right direction, but I'm still a bit stumped.

I understand I will need some sort of recognition function as well as a loop to go through several 'headlines', but how to actually extract the text into a variable... well.

I have a text file called guardian.txt and need to extract the highlighted text from it, for example:

<A HREF='Cricket rocked by corruption report</A>

Thanks
 
Hey Aleci,

In that particular case, I think this will work best.

<cfset myPageVar="<A HREF=' rocked by corruption report</A>">

<cfset x=findnocase("A HREF='",myPageVar)>
<cfset x=findnocase(".html",myPageVar,x)>
<cfset y=findnocase("</a>",myPageVar,x)>
<cfset myNewText=mid(myPageVar,x+7,y-x-7)>

<cfoutput>#myNewText#</cfoutput>

I modified it to look for the occurrence of the start of a link, then skip to the end of the link's URL, and then strip out the text. This way, links to various parts of the site will all pass through fine, rather than just one specific link, since this appears to be a regularly changing link.

Let me know if you still have trouble,
GJ
 
Hi GJ,
Many thanks for your help, it works fine now.
The next step for me is to modify the code to create a loop to run through 3 or 4 headlines at a time. It shouldn't be a problem, but if I get stuck I know who to ask!

Thanks once again
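
For anyone following the same path, here is one possible sketch of that loop. It is not from the posts above: the sample page text, the variable names and the limit of 4 are assumptions, and it presumes every link's URL ends in .html'> as in the single-headline case.

```cfml
<!--- Hypothetical sample; in practice myPageVar would come from <cffile action="read"> --->
<cfset myPageVar = "<A HREF='one.html'>First headline</A><br><A HREF='two.html'>Second headline</A><br><A HREF='three.html'>Third headline</A>">
<cfset maxHeadlines = 4>
<cfset headlines = arrayNew(1)>
<cfset pos = 1>

<cfloop condition="arrayLen(headlines) LT maxHeadlines">
  <!--- Start of the next link, searching from the current position --->
  <cfset x = findNoCase("A HREF='", myPageVar, pos)>
  <cfif x EQ 0><cfbreak></cfif>
  <!--- End of the link's URL; ".html'>" is 7 characters long --->
  <cfset x = findNoCase(".html'>", myPageVar, x)>
  <cfif x EQ 0><cfbreak></cfif>
  <!--- Closing tag of the same link --->
  <cfset y = findNoCase("</a>", myPageVar, x)>
  <cfif y EQ 0><cfbreak></cfif>
  <cfset added = arrayAppend(headlines, mid(myPageVar, x + 7, y - x - 7))>
  <!--- Carry on searching after this closing tag (4 characters) --->
  <cfset pos = y + 4>
</cfloop>

<cfoutput>
<cfloop index="i" from="1" to="#arrayLen(headlines)#">
#i#: #headlines[i]#<br>
</cfloop>
</cfoutput>
```

The third argument to findNoCase() is what makes this work: each pass starts searching where the previous headline ended, and the loop stops as soon as any search returns 0 or the limit is reached.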

 