
Extracting text from a bundle of files.

Status
Not open for further replies.

jimoblak (Instructor; joined Oct 23, 2001; 3,620 messages; US)
I am confused about how to handle this: I have a bunch of individual text files that I want to extract text from. The text that I want to extract is conveniently packaged between lines that say '*** START OF TEXT ***' and '*** END OF TEXT ***'. Should I do some sort of strtok trick with the asterisk-laden lines, or is there another approach that someone can recommend?

An example of one of the original files follows:

//////////
Junk text appears at the beginning of the file. It is of variable length.
*** START OF TEXT ***
This is the text that I want to extract. It is of variable length
*** END OF TEXT ***
Junk text appears at the end of the file. It is of variable length.

//////////


- - picklefish - -
 
It sounds like the text terminator is guaranteed to begin a line.

If so, then open the file and read it line by line, discarding the input until a line matches your start token ("*** START OF TEXT ***").

Then continue reading lines, keeping track of the content until you get to the text termination token ("*** END OF TEXT ***").

If there is only supposed to be one block of text to pay attention to, when you get to the terminator token, close the file.

Otherwise, start reading and discarding again.
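
For what it's worth, the steps above could be sketched roughly as follows. This is only an illustration in Python, since the thread does not say which language is in use; the function name `extract_blocks` and the `first_only` flag are made up for the example, and it assumes each marker sits on a line of its own (surrounding whitespace is ignored when comparing).

```python
START = "*** START OF TEXT ***"
END = "*** END OF TEXT ***"


def extract_blocks(lines, first_only=True):
    """Collect the text found between START and END marker lines.

    `lines` is any iterable of strings, for example an open file.
    Returns a list of blocks, each block being one string.
    """
    blocks = []
    current = None  # None: discarding input; a list: collecting a block
    for line in lines:
        marker = line.strip()  # tolerate stray whitespace around markers
        if current is None:
            # Still in the junk section: wait for the start token.
            if marker == START:
                current = []
        elif marker == END:
            # Terminator reached: store the block, go back to discarding.
            blocks.append("\n".join(current))
            current = None
            if first_only:
                break  # only one block expected, so stop reading
        else:
            # Inside a block: keep the line without its line ending.
            current.append(line.rstrip("\r\n"))
    return blocks
```

To run it over one file, something like `with open(path) as f: blocks = extract_blocks(f)` should do, where `path` is whatever file you are processing. Note that in this sketch a block with no end marker before the end of the file is silently dropped; whether that is the right behaviour depends on your data.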

Want the best answers? Ask the best questions: TANSTAAFL!
 