pulling pages from PDFs

goBoating · Nov 6, 2000

I am working on publishing a large number of public domain documents on the web. We have them as PDFs and as OCR'd text files. The OCR files serve as a means to search the documents for keywords, but they are fairly ugly. The PDFs give an accurate picture of each page, but unfortunately each document is a single file and the files are large (10 megs up to about 60 megs). Needless to say, that takes a while to download over a 56k modem. So, does anyone know if any of the existing Perl PDF modules will pull a specific page from a PDF? Are there any examples of this being done? Any clues would be appreciated.

Thanks,

keep the rudder amid ship and beware the odd typo

MikeLacey · Nov 6, 2000

No luck on CPAN then, I take it?
Mike
michael.j.lacey@ntlworld.com

goBoating · Nov 7, 2000

I looked and did not see any methods listed for any module that looked like they would pull a single specific page from a PDF. I'm hoping that someone has played with this before and knows of some undocumented or poorly documented opportunities. If no one knows, then I guess I'll have to get the modules that are on CPAN, install them, and see if any of them will do the trick. I was hoping to avoid the hunting expedition.

keep the rudder amid ship and beware the odd typo
