read contents of a pdf file -- is it doable with perl?

Status
Not open for further replies.

max1x

Programmer
Jan 12, 2005
366
US
Can I search the contents of a PDF file in a *nix environment without first converting it to doc, RTF, HTML, text, etc.?

I've looked here and, based on the README files, thought I might be able to do this with a couple of modules, but with no success.
 
I have never done it, but the modules say you can do it, so I'd have to assume it is possible.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[noevil]
Travis - Those who say it cannot be done are usually interrupted by someone else doing it; Give the wrong symptoms, get the wrong solutions;
 
Hey Travis,

Utilizing modules like PDF and PDF::FDF::Simple, I was able to extract the header, title, footer, etc., but not the actual contents. I'll keep working through this and wait for feedback [bigears]
 
I've just started writing PDFs using PDF::Create, but I haven't done any reading.
 
You could use a utility program to convert the file to something more readable, then read that instead. For example, there's pdftohtml, which works on Linux and probably other systems.
 
Sorry, I see you don't want to do conversion first. So forget that solution.
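
If a converted file on disk is the real objection, one possible middle ground is to pipe the converter's output straight into Perl and search it in memory. This is only a rough sketch: it assumes pdftotext (a sibling of pdftohtml from the same Xpdf/Poppler tools, not something mentioned above) is installed and on the PATH, and the script name pdfgrep.pl and the sub names matching_lines and pdf_text are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return [line_number, line] pairs for every line of $text matching $pattern.
sub matching_lines {
    my ($text, $pattern) = @_;
    my @hits;
    my $n = 0;
    for my $line (split /\n/, $text) {
        $n++;
        push @hits, [$n, $line] if $line =~ $pattern;
    }
    return @hits;
}

# Run pdftotext and capture its output in memory; the trailing "-" sends
# the extracted text to stdout instead of a file.
# Assumption: pdftotext is installed and on the PATH.
sub pdf_text {
    my ($file) = @_;
    open(my $fh, '-|', 'pdftotext', '-layout', $file, '-')
        or die "Cannot run pdftotext: $!";
    local $/;    # slurp mode: read everything in one go
    my $text = <$fh>;
    close($fh) or die "pdftotext failed on $file (exit status $?)";
    return defined $text ? $text : '';
}

# Usage (hypothetical script name): perl pdfgrep.pl file.pdf 'pattern'
if (@ARGV == 2) {
    my ($file, $pattern) = @ARGV;
    printf "%d: %s\n", @$_ for matching_lines(pdf_text($file), qr/$pattern/i);
}
```

The line numbers it prints refer to the extracted text, not to anything inside the PDF itself, and scanned (image-only) PDFs will most likely give no text at all.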
 