Forum Discussion

scsvel
Frequent Contributor
14 years ago

Need to Read PDF file

Hi all,

I want to read a PDF file using the MSAA objects technique, but I can't get it to work. Actually, I got the object hierarchy as

Sys.Process("AcroRd32").Window("AcrobatSDIWindow", "Adobe Reader", 1).Window("AVL_AVView", "AVToolBarHostView", 1).Window("AVL_AVView", "AVSplitterView", 4).Window("AVL_AVView", "AVSplitationPageView", 3).Window("AVL_AVView", "AVSplitterView", 1).Window("AVL_AVView", "AVScrolledPageView", 1).Window("AVL_AVView", "AVScrollView", 1).Window("AVL_AVView", "AVPageView", 4)



But after that I have to get the child object Document and its child object Grouping to get the content, and I am not able to get them.

Can anybody help me with this?

Are there any other methods to read my PDF file?

7 Replies


  • Hi Shanmugavel,





    Right-click your project in the "Project Explorer" panel and select "Edit | Properties" from the context menu. On the project's "Properties" page that opens, select "Open Applications | MSAA", take a screenshot and post it here.


  • scsvel
    Frequent Contributor
    Hi Allen,



    Here is the screenshot you requested. Awaiting your reply.

    Thanks.

  • Hi Shanmugavel,





    You need to add "AVL_AVView" to the list of accepted windows, not "AcrobatSDIWindow".
  • tempname
    New Contributor
    Hi David,



    I configured MSAA as suggested (please find the properties snapshot attached), but I am still unable to find the Document object.

    Could you please suggest how to read the PDF contents?



  • Hi,





    Please try enabling the '*' item. Does this help? If it doesn't, select 'Document | Accessibility Quick Check' from the Adobe Reader main menu, take a screenshot of the window that appears and post it here.
  • tempname
    New Contributor
    I chose "*" in MSAA, but I am still unable to find the 'Document' object.

    Please find the snapshot of 'Document | Accessibility Quick Check' from the Adobe Reader main menu.



    Thanks,

    Chaitanya
  • Hi,



    Your PDF document is "untagged". In other words, it doesn't contain a description of its structure. That's why TC cannot "see" the internal objects of the document. You need to tag your PDF document so that TC can recognize the document's elements. Let me quote the "Tag the PDF" article by Adobe:





    For best results, tag a document when converting it to PDF from an authoring application. Examples of these applications include Adobe FrameMaker®, Adobe InDesign®, Microsoft Word, or OpenOffice Writer. If you do not have access to an authoring application that can generate a tagged PDF, you can tag a PDF any time by using Acrobat.





    You can find more information here.



    Note that to access the data of the document, you can save the PDF document as a text file and retrieve the data from the file. Please see the Working With Files From Scripts help topic for more information.
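
    As a rough illustration of the text-file route mentioned above, here is a minimal Python sketch. It assumes the PDF has already been saved as a plain-text file from Adobe Reader. The function names and the "report.txt" file name are hypothetical, and the code uses only the Python standard library rather than any TestComplete-specific file API.

```python
from pathlib import Path


def read_exported_text(path, encoding="utf-8"):
    """Return the non-empty, stripped lines of a text file exported from a PDF."""
    # errors="replace" keeps the read from failing on odd bytes in the export.
    text = Path(path).read_text(encoding=encoding, errors="replace")
    return [line.strip() for line in text.splitlines() if line.strip()]


def contains_phrase(lines, phrase):
    """Case-insensitive check for a phrase in any of the exported lines."""
    needle = phrase.casefold()
    return any(needle in line.casefold() for line in lines)


# Hypothetical usage (the file name is an assumption):
# lines = read_exported_text("report.txt")
# found = contains_phrase(lines, "Invoice total")
```

    A check like this only sees the text Adobe Reader writes to the file, so layout and formatting are lost. If you need the document's structure, tagging the PDF is still the way to go.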