Import PDF into Word Document

(OP)
Well, I am stumped so I am back with another one. I have manually imported a PDF successfully using the Insert -> Object function in Word, but I don't really want to go with this method because of the long wait involved, and it just doesn't format very nicely. Rather, if I open a PDF, use the "take a snapshot" tool in any of my PDF reader programs and paste that into Word as an image, it looks great. Is there anything you guys can point me to so that I can have VBA do this process for me? Thank you for anything! :]

RE: Import PDF into Word Document

(OP)
So, I mostly have it.. I just need help with what reference to use to open a file, take a snapshot and close the PDF file pretty please!! :]

RE: Import PDF into Word Document

I don't believe you have the libraries you need to do it unless you have the full version of Acrobat installed.

Enjoy,
Tony

RE: Import PDF into Word Document

(OP)
I have made a bit of progress with being able to use this concept via Adobe Reader within the Internet Explorer window. Can anyone help me finesse this a bit please?? I use a time delay procedure hence the WaitSeconds(10) and when this file opens, I can't manually hit ctrl+c and ctrl+a. I have to left click on the document and then I can ctrl+c. Any help here would be GREATLY appreciated guys. Thanks!

CODE --> VBA

Sub PDFcopy()
    Dim ie As Object
    Dim sPath As String
    Dim fso, fls
    
    sPath = "C:\Folder Path\"
    
    Set ie = CreateObject("InternetExplorer.Application")
    
    ie.Visible = True
    
    Set fso = CreateObject("Scripting.FileSystemObject")
    Set fls = fso.GetFolder(sPath)

            ie.navigate "file://" & sPath & "M.pdf"
            
            WaitSeconds (10)
            'SendKeys "^(a)", True
            
            WaitSeconds (5)
            
            Err.Clear
            Do
                On Error Resume Next
                SendKeys "^(c)", True
            Loop While Err.Number <> 0 'And Len(ClipBoard_GetText) = 0
            
    
    ie.Quit
    
    Set ie = Nothing
    Set fls = Nothing
    Set fso = Nothing
End Sub 

RE: Import PDF into Word Document

(OP)
Well, I came across these commands and they don't seem to be triggering. I have stepped through the code with no success. Anyone here have any experience using these commands? I can open the PDF via Reader in Internet Explorer but just don't seem to be able to pass these commands to the explorer window. An error message pops up saying that "the object invoked has disconnected from its clients".. any pointers would be amazing!! :]

CODE --> VBA

With ie
    .ExecWB OLECMDID_SELECTALL, OLECMDEXECOPT_DONTPROMPTUSER
    .ExecWB OLECMDID_COPY, OLECMDEXECOPT_DODEFAULT
End With

RE: Import PDF into Word Document

Given that the entire purpose of a PDF is to maintain a fixed layout no matter where/how it is displayed or printed, what do you mean by "it just doesn't format very nicely"?
http://www.tek-tips.com/viewthread.cfm?qid=1767687
>I came across these commands and they don't seem to be triggering.

Well, you seem to be missing some key code, where your constants are defined (e.g. OLECMDID_SELECTALL). This also suggests that you do not have Option Explicit set.

So, you might like to add to your code

CODE

Option Explicit
Const OLECMDID_SELECTALL = 17
Const OLECMDID_COPY = 12
Const OLECMDEXECOPT_DODEFAULT = 0
Const OLECMDEXECOPT_DONTPROMPTUSER = 2 'also used in the ExecWB call, so it needs defining too

Having said that, I'm not certain this is going to work quite the way you want it to work - neither your original PDFcopy procedure nor your OLECMDID_SELECTALL variant will copy/snapshot the PDF; they will simply try to copy the text in the PDF, if it is accessible (some PDFs are protected against this).

RE: Import PDF into Word Document

(OP)
Well, I was able to use your recommendation and the code is operational now!.. It opens the PDF and saves it as a word doc now. I am thinking this code can be used in a different way without creating a new document.. can anyone see this process happening above in a different order?? It would be

1) use dialog box to select a PDF
2) open the pdf with a separate instance of Word (I believe this function is available only in 2013 or after?)
3) copy the content of the PDF as an image
4) paste in the active document
5) close the new instance of word

I will work on this, but you guys always seem to come up with better solutions than I. Thank you for your time and help!! :]

RE: Import PDF into Word Document

(OP)
Hey everyone.. sorry to keep beating this dead horse, but what if I used this in my code??..

ActiveDocument.CommandBars.ExecuteMso ("ObjectSaveAsPicture")

I step through the code and it's not pasting, but if I run this command manually either by right clicking the mouse or adding it to the ribbon bar, it will actually paste the image!!.. I just don't seem to be able to get this to work out. Maybe I have the focus on the wrong instance of Word??.. getting super close, I can feel it! Anyone able to help connect the dots here?.. here is the code I have running!..

CODE --> VBA

Sub convertToWord()
   Dim MyObj As Object, MySource As Object, file As Variant
   file = Dir("C:\Users\FilePathGoesHere\" & "examplefile.pdf") 'pdf path
   If file = "" Then Exit Sub 'stop here if the PDF was not found
   ChangeFileOpenDirectory "C:\Users\FilePathGoesHere\"
          Documents.Open FileName:=file, ConfirmConversions:=True, ReadOnly:= _
        False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate:= _
        "", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="", _
        Format:=wdOpenFormatAuto, XMLTransform:=""
    
    ActiveDocument.SelectAllEditableRanges
    ActiveDocument.CommandBars.ExecuteMso ("ObjectSaveAsPicture")
    ActiveDocument.Close

'paste into original document (running this code instance)
selection.paste

End Sub 

RE: Import PDF into Word Document

(OP)
Can anyone figure out what type of shape/range/object is being created in the below mentioned code? It is meant to convert a SCANNED PDF file by opening it in Word 2013. If I select it (the resulting image) with the mouse I am then able to manually copy & paste it (the image) into the word document that I am trying to get it in to!.. I can't figure out how to select the image in VBA. That is all that I need and this code is done! Help please anyone!! :]

CODE --> VBA

Sub convertToWord()
   Dim MyObj As Object, MySource As Object, file As Variant
   file = Dir("C:\Users\FilePathGoesHere\" & "examplefile.pdf") 'pdf path

   ChangeFileOpenDirectory "C:\Users\FilePathGoesHere\"
          Documents.Open FileName:=file, ConfirmConversions:=True, ReadOnly:= _
        False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate:= _
        "", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="", _
        Format:=wdOpenFormatAuto, XMLTransform:=""
    
    'THIS IS WHERE I NEED TO SELECT THE IMAGE
    'THIS IS WHERE I NEED TO COPY THE IMAGE
    ActiveDocument.Close

'paste in to original document I am working with (running this code)
selection.paste

End Sub 

RE: Import PDF into Word Document

(OP)
Well.. I have it working. I will incorporate a File Dialog box into this code to select which file/s to select for the end user, but here is a working rough draft. Any of the higher up thinkers on this forum I would love a little help in case you see anything I could be doing better here. Thanks.

CODE --> VBA

Sub convertToWord()
   Dim MyObj As Object, MySource As Object, file As Variant
   Dim lHwnd As Long
   Dim wordApp
   
   wordApp = ActiveDocument
   file = Dir("FolderPathHere" & "FileNameHere.pdf") 'pdf path
   
   ChangeFileOpenDirectory "FolderPathHere"
          Documents.Open FileName:=file, ConfirmConversions:=False, ReadOnly:= _
        False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate:= _
        "", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="", _
        Format:=wdOpenFormatAuto, XMLTransform:=""
        
        'convert/select/copy/close PDF
        Selection.WholeStory 'Select whole document
        Selection.Expand wdParagraph 'Expands your selection to current paragraph
        Selection.Copy 'Copy your selection
        
        'close converted file
        ActiveDocument.Close (Word.WdSaveOptions.wdDoNotSaveChanges)
        
        'paste into active document
        Selection.EndKey wdStory 'Move to end of document
        Selection.PasteAndFormat wdPasteDefault 'Pastes in the content
        
    'clear clipboard
    ClearClipboard
    
    'save
    SaveDoc

End Sub 

RE: Import PDF into Word Document

I'm guessing that you must be using Word 2013, since previous versions do not have the ability to directly open PDFs (previous versions treat PDFs as text documents, and import all the contents as text ...)
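
As a hedged illustration of that version dependency: Word 2013 reports itself as version 15.0, so a macro could check Application.Version before it tries to open a PDF. The names SupportsPdfOpen and CheckPdfSupport are made up for this sketch, and it assumes the PDF conversion feature first appeared in version 15.

```vba
Option Explicit

' Hypothetical helper: True when the version string is 15.0 (Word 2013) or later.
Function SupportsPdfOpen(ByVal versionText As String) As Boolean
    ' Val reads the leading number and always treats "." as the decimal separator.
    SupportsPdfOpen = (Val(versionText) >= 15)
End Function

' Hypothetical caller: tells the user whether direct PDF opening is expected to work.
Sub CheckPdfSupport()
    If SupportsPdfOpen(Application.Version) Then
        MsgBox "This version of Word should be able to open PDFs directly."
    Else
        MsgBox "Opening PDFs directly needs Word 2013 or later."
    End If
End Sub
```

A check like this could sit at the top of any of the import macros in this thread so that older versions fail with a clear message instead of importing the PDF as raw text.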

RE: Import PDF into Word Document

Strongm: The OP mentions Word 2013 in his second most-recent post.

KristianDude: The code you posted copies the entire contents of the file, not just images. For images in PDFs opened in Word 2013, you should be able to access them as inlineshape objects. Regardless, your code is quite inefficient. You should be able to do what you're now doing without selecting anything and without using copy & paste. Furthermore, apart from the 'file' and 'wordApp' variable declarations, the rest seem redundant since you never use them. SaveDoc is not a valid Word VBA command. As for 'wordApp', that's a particularly poor choice for a variable assigned to a document rather than an application.

Assuming you're trying to convert an entire PDF to Word, try:

CODE

Sub ConvertPDF2Word()
  Dim StrFile As String
  With Application.FileDialog(msoFileDialogOpen)
    .Filters.Clear
    .Filters.Add "PDF Files", "*.pdf"
    .AllowMultiSelect = False
    .Show
    If .SelectedItems.Count = 0 Then Exit Sub
    StrFile = .SelectedItems(1)
  End With
  Documents.Open FileName:=StrFile, AddToRecentFiles:=False
  ActiveDocument.SaveAs2 FileName:=Split(StrFile, ".pdf")(0) & ".docx", _
    FileFormat:=wdFormatXMLDocument, AddToRecentFiles:=False
  ActiveDocument.Close False
End Sub

Assuming you're trying to import an entire PDF to the end of an existing Word document, try:

CODE

Sub ImportPDF2Word()
  Dim StrFile As String, Rng As Range, DocSrc As Document
  With Application.FileDialog(msoFileDialogOpen)
    .Filters.Clear
    .Filters.Add "PDF Files", "*.pdf"
    .AllowMultiSelect = False
    .Show
    If .SelectedItems.Count = 0 Then Exit Sub
    StrFile = .SelectedItems(1)
  End With
  Set Rng = ActiveDocument.Range.Characters.Last
  Rng.InsertAfter vbCr
  Rng.Collapse wdCollapseEnd
  Set DocSrc = Documents.Open(FileName:=StrFile, AddToRecentFiles:=False)
  With DocSrc
    Rng.FormattedText = .Range.FormattedText
    .Close False
  End With
  Set Rng = Nothing: Set DocSrc = Nothing
End Sub 
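
One caveat about the output name built in ConvertPDF2Word: Split compares case-sensitively by default, so a file named "Report.PDF" would not be split and would end up as "Report.PDF.docx", and a folder name containing ".pdf" would cut the path short. A minimal sketch of an alternative, using a hypothetical helper named DocxPathFor that swaps whatever extension follows the last dot in the file name:

```vba
Option Explicit

' Hypothetical helper: returns pdfPath with its extension replaced by ".docx".
Function DocxPathFor(ByVal pdfPath As String) As String
    Dim dotPos As Long, slashPos As Long
    dotPos = InStrRev(pdfPath, ".")     ' last dot anywhere in the path
    slashPos = InStrRev(pdfPath, "\")   ' last path separator
    If dotPos > slashPos Then
        ' The last dot belongs to the file name, so cut the extension off there.
        DocxPathFor = Left$(pdfPath, dotPos - 1) & ".docx"
    Else
        ' No extension on the file name: just append one.
        DocxPathFor = pdfPath & ".docx"
    End If
End Function
```

With that in place, the SaveAs2 call could use FileName:=DocxPathFor(StrFile) instead of the Split expression.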

Cheers
Paul Edstein
[MS MVP - Word]

RE: Import PDF into Word Document

(OP)
Strongm & Paul,

Sorry.. been over-busy these past few weeks!! Thank you very much for your insight. Due to a lack of coding skills on my end and this just being a side hobby of mine, the way I was looking to make this work (until now that is) is to only import scanned PDF documents. The code I posted will copy and paste the entire page as an image. Importing/Pasting the PDF into my document as an image is ideal as I need to size it to a 7" width while keeping the aspect ratio. My request for the copy/paste in the other post was directly related to this PDF import, but it seemed to be a dead post, but here we are!! :D ... I will definitely be trying out this ImportPDF2Word function!! Thank you guys!!

RE: Import PDF into Word Document

Quote:

The code I posted will copy and paste the entire page as an image.

Sorry to have to pop your balloon, but that's not what your code does. It merely copies & pastes the entire PDF - images & text alike. Of course, if the PDF only contains scanned images that haven't been OCR'd, all you'll get is a set of images. Conversely, if your PDF has a mix of text and images, including PDFs that contain page images that have been OCR'd, you'll get both. In the latter case, the OCR'd text will still be sitting behind the pasted images.

Cheers
Paul Edstein
[MS MVP - Word]

RE: Import PDF into Word Document

>not what your code does

Quite

RE: Import PDF into Word Document

(OP)
Good catch.. yes this is intended for non-OCR scanned PDF documents only. I tested it with other PDF versions with layers and text and it is not good at all for that. For my purposes, I don't use OCR so this is actually a great fit for what I am in need of. Thank you gentlemen for all the help. I hope to get some time to run through your latest post Paul and will report back!

A note to anyone else that may be reading this post.. there is a ton more functionality when using references to say, the full paid version of Adobe Acrobat. I like to make my stuff usable with default and/or free software.. ideally, packaged entirely within the Word Document that I am working in at the time that I am running the code (ie: how I am using this code). This is just a workaround for other easy to use features that do require upgraded PDF software and/or 3rd party software. The hope is that one day, the office suite will make it easier somehow... but until then. :]


-Kristian

RE: Import PDF into Word Document

(OP)
Paul,

This tidbit of code is working pretty great. Do you know what this import is? I combined it with the code from the other post (which is this one Link) and when I step through it sees "0" inlineshapes. I realize this was in reference to pasting an image and this is not exactly what we are doing here. Do I need to convert it to an inlineshape to be able to lock the aspect ratio and then resize it to a desired size?... Here's how I combined them if it helps better to see it?... thanks!

CODE --> VBA

Sub ImportPDF2Word()
  Dim dlgOpen As FileDialog, StrFile As String
  Dim Rng As Range, DocSrc As Document
  With Application.FileDialog(msoFileDialogOpen)
    .Filters.Clear
    .Filters.Add "PDF Files", "*.pdf"
    .AllowMultiSelect = False
    .Show
    If .SelectedItems.Count = 0 Then Exit Sub
    StrFile = .SelectedItems(1)
  End With
  Set Rng = ActiveDocument.Range.Characters.Last
  Rng.InsertAfter vbCr
  Rng.Collapse wdCollapseEnd
  Set DocSrc = Documents.Open(FileName:=StrFile, AddToRecentFiles:=False)
  With DocSrc
    Rng.FormattedText = .Range.FormattedText
    .Close False
  End With
  With Selection
  .Start = .Start - 1 'move the start back one position to include the image
  If .InlineShapes.Count = 1 Then
    'resize the image
    With .InlineShapes(1)
    .LockAspectRatio = True
    .width = InchesToPoints(3)
    '.Height = InchesToPoints(2)
    End With
  End If
End With

  Set Rng = Nothing: Set DocSrc = Nothing
End Sub 

RE: Import PDF into Word Document

(OP)
Paul.. I should also followup on the comment regarding the "ClearClipboard" and "SaveDoc" items I posted in the code a few posts ago.. those are references to a public sub I use to do those very things. :]

RE: Import PDF into Word Document

I doubt your "ClearClipboard" and "SaveDoc" references are of much benefit here. "ClearClipboard", especially, is irrelevant since the code I posted never uses the clipboard. "SaveDoc" may do something useful, but I doubt it's doing much that ActiveDocument.Save wouldn't. You also wouldn't use:

CODE

With Selection
  .Start = .Start - 1 'move the start back one position to include the image
  If .InlineShapes.Count = 1 Then
    'resize the image
    With .InlineShapes(1)
    .LockAspectRatio = True
    .width = InchesToPoints(3)
    '.Height = InchesToPoints(2)
    End With
  End If
End With

since nothing is being selected. Besides which, that code would only cope with a single scanned PDF page. Instead, you'd use code like:

CODE

Sub ImportPDF2Word()
  Dim StrFile As String, Rng As Range, DocSrc As Document, i As Long
  With Application.FileDialog(msoFileDialogOpen)
    .Filters.Clear
    .Filters.Add "PDF Files", "*.pdf"
    .AllowMultiSelect = False
    .Show
    If .SelectedItems.Count = 0 Then Exit Sub
    StrFile = .SelectedItems(1)
  End With
  Set Rng = ActiveDocument.Range.Characters.Last
  Rng.InsertAfter vbCr
  Rng.Collapse wdCollapseEnd
  Set DocSrc = Documents.Open(FileName:=StrFile, AddToRecentFiles:=False)
  With DocSrc
    Rng.FormattedText = .Range.FormattedText
    .Close False
  End With
  With Rng
    For i = 1 To .InlineShapes.Count
      With .InlineShapes(i)
        .LockAspectRatio = True
        .Width = InchesToPoints(3)
        '.Height = InchesToPoints(2)
      End With
    Next
  End With
  Set Rng = Nothing: Set DocSrc = Nothing
End Sub

Indeed, if I were doing this, I'd probably modify the code to automatically get the page dimensions and fit the PDF images to that, so they always fill, or at least scale to, the page size.
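
A minimal sketch of that idea, not tested against real scanned PDFs: it reads the usable area (page size minus margins) from the first section of the imported range and scales each inline picture to fit inside it. FitScale and FitInlineShapesToPage are hypothetical names, the sketch ignores any gutter setting, and it assumes the pages arrive as InlineShapes.

```vba
Option Explicit

' Hypothetical helper: largest factor that keeps a w x h picture inside maxW x maxH.
Function FitScale(ByVal w As Double, ByVal h As Double, _
                  ByVal maxW As Double, ByVal maxH As Double) As Double
    Dim sw As Double, sh As Double
    sw = maxW / w
    sh = maxH / h
    If sw < sh Then FitScale = sw Else FitScale = sh
End Function

' Resizes every inline picture in Rng to fit within the margins of its first section.
Sub FitInlineShapesToPage(ByVal Rng As Range)
    Dim maxW As Double, maxH As Double
    Dim w As Double, h As Double, f As Double, i As Long
    With Rng.Sections(1).PageSetup
        maxW = .PageWidth - .LeftMargin - .RightMargin
        maxH = .PageHeight - .TopMargin - .BottomMargin
    End With
    For i = 1 To Rng.InlineShapes.Count
        With Rng.InlineShapes(i)
            w = .Width
            h = .Height
            If w > 0 And h > 0 Then
                f = FitScale(w, h, maxW, maxH)
                ' Both sides are set explicitly from the same factor,
                ' so the proportions are kept without relying on the lock.
                .LockAspectRatio = msoFalse
                .Width = w * f
                .Height = h * f
                .LockAspectRatio = msoTrue
            End If
        End With
    Next i
End Sub
```

In ImportPDF2Word this would be called as FitInlineShapesToPage Rng in place of the fixed 3-inch resize loop. A picture that fills the text area exactly may still push its paragraph mark onto the next page, so scaling to slightly less than the full height could be worth trying.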

Cheers
Paul Edstein
[MS MVP - Word]

RE: Import PDF into Word Document

(OP)
Thanks Paul. I stepped through it and it is still not recognizing the inserted image.. might you know what this object is so that I can use a simple resize like the example in this latest code... or how I can find it out? When I double click on it, it acts like a picture as the Format picture tab pops up on the ribbon.

Also, just curious of your opinion on these subs that I run. I use them in multiple functions which is why I ended up just putting them out there on their own and referring to them.

CODE --> VBA

Public Sub SaveDoc()
'Save Document No Prompt Original Format
    Documents.Save NoPrompt:=True, _
     OriginalFormat:=wdOriginalDocumentFormat
End Sub 

The ClearClipboard is a pretty robust function created or referred to by Strongm. I find it a must when working with any clipboard events... but as this topic doesn't currently have a clipboard function in it, it really is not necessary for this one! :]

RE: Import PDF into Word Document

In that case, the pages may not be getting inserted 'inline', in which case you might use:

CODE

With Rng
  For i = 1 To .ShapeRange.Count
    With .ShapeRange(i)
      .LockAspectRatio = True
      .Width = InchesToPoints(3)
      '.Height = InchesToPoints(2)
    End With
  Next
End With

Your 'SaveDoc' code saves all open documents, not just the one you're working on (including newly-created documents you haven't even named yet). This may or may not be desirable. Use with care.
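
If the aim is to save only the document being worked on, something along these lines may be closer to it. This is a sketch under assumptions: SaveDocOnly is a made-up name, and it deliberately skips documents that have never been saved, since saving those would bring up a Save As prompt.

```vba
Option Explicit

' Hypothetical replacement for SaveDoc: saves one document only, without prompting.
Public Sub SaveDocOnly(ByVal Doc As Document)
    ' A document that has never been saved has an empty Path; leave it alone.
    If Len(Doc.Path) = 0 Then Exit Sub
    ' Only write to disk when there are unsaved changes.
    If Not Doc.Saved Then Doc.Save
End Sub
```

It would be called as SaveDocOnly ActiveDocument, or with a specific Document variable when more than one document is open.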

Cheers
Paul Edstein
[MS MVP - Word]

RE: Import PDF into Word Document

(OP)
Paul,

That is exactly what I was looking for. Thank you so much!!



.. on another topic.. I think I remember you helping out with a few old projects of mine about ten years ago on another site called UtterAccess. Does that ring a bell at all?


-Kristian

RE: Import PDF into Word Document

I've never posted at a site named UtterAccess or anything like it...

Cheers
Paul Edstein
[MS MVP - Word]

RE: Import PDF into Word Document

(OP)
Ok thanks for the help Paul and Strongm!
