Forum Discussion

petra
Contributor
8 years ago

Workaround for converting PDF to TXT

Hi Community,

I have to convert a PDF/DOCX file into TXT.

I've tried Foxit Reader, ABBYY FineReader, and Word, but as soon as the "Open File" window appears while TestComplete is running (whether recording or running the test), each program freezes. The only thing I can do is shut down my PC.

The scripting language is JavaScript, but I don't "speak" JavaScript (I only learned PHP). And yesterday I read that JavaScript can't open and save documents.

  • tristaanogre
    Esteemed Contributor
    JavaScript shouldn't have any restrictions on opening or saving Word documents. You should be able to do so using Sys.OleObject('Word.Application'), I believe.
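A minimal sketch of the Sys.OleObject approach, assuming Microsoft Word (2010 or later, for SaveAs2) is installed on the test machine. The function names and file paths are hypothetical placeholders. Documents.Open, SaveAs2, Close, and Quit are Word automation calls, and the numeric values stand in for Word's wdFormatText and wdDoNotSaveChanges constants. The Word object is passed in as a parameter so the conversion logic does not depend on how it was created.

```javascript
// Word automation constants, written out as plain numbers.
const WD_FORMAT_TEXT = 2;          // WdSaveFormat.wdFormatText
const WD_DO_NOT_SAVE_CHANGES = 0;  // WdSaveOptions.wdDoNotSaveChanges

// Opens sourcePath in the given Word.Application object and saves it as plain text.
// `word` is expected to be a Word automation object, e.g. from Sys.OleObject.
function saveWordDocAsText(word, sourcePath, targetPath) {
  word.Visible = false; // keep Word in the background
  const doc = word.Documents.Open(sourcePath);
  try {
    doc.SaveAs2(targetPath, WD_FORMAT_TEXT);
  } finally {
    // Always close the document, even if saving failed.
    doc.Close(WD_DO_NOT_SAVE_CHANGES);
  }
  return targetPath;
}

// Hypothetical TestComplete entry point; the paths are placeholders.
function convertDocxExample() {
  const word = Sys.OleObject("Word.Application");
  try {
    saveWordDocAsText(word, "C:\\temp\\input.docx", "C:\\temp\\output.txt");
  } finally {
    word.Quit(); // do not leave a hidden Word process behind
  }
}
```

Only convertDocxExample touches TestComplete's Sys object, so saveWordDocAsText can be reused with any Word automation object.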

    PDF files, in general, are going to be tricky to work with because of their nature but there are methods of doing so as well.

    Generally speaking, though, TestComplete can interact with the dialogs within any application, so opening Word and selecting "Save As" is another option.

    Can you post what you have attempted to date?

    Also, I'm curious where you read that JavaScript can't open or save documents.
    • Colin_McCrae
      Community Hero

      This assumes you can load the PDF OK, TC can "see" it (so it can at least interact with it on the most basic level), and it's not a protected PDF (you'll need a third-party bolt-on for those).

      You can simply click it to give it focus, then send it keyboard shortcuts (Ctrl+A, then Ctrl+C), which will select all the text and copy it to the clipboard. Then just read the clipboard content into a string in your script and do whatever you need to do with it.

      This only gets raw text. No images. No formatting. But if it's just the text you need, it's a simple and effective solution. I've used it many times for validating document content, in the same way you'd screen scrape from an application.
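The clipboard approach could look roughly like this in a TestComplete JavaScript unit. This is only a sketch: the function names, the Aliases path, and the output path are hypothetical, while Click, Keys (where "^" stands for Ctrl), Sys.Clipboard, and aqFile.WriteToTextFile are TestComplete members. The window and Sys objects are passed in as parameters so the copy step stays independent of how the viewer window is located.

```javascript
// Selects all text in the given window, copies it, and returns the clipboard text.
// `viewerWindow` is the onscreen object showing the PDF; `sys` is TestComplete's Sys object.
function copyAllText(viewerWindow, sys) {
  viewerWindow.Click();     // give the document focus
  viewerWindow.Keys("^a");  // Ctrl+A: select all
  viewerWindow.Keys("^c");  // Ctrl+C: copy the selection
  const text = sys.Clipboard;
  // Non-text clipboard content (e.g. an image) is treated as empty.
  return typeof text === "string" ? text : "";
}

// Hypothetical TestComplete entry point.
function savePdfTextExample() {
  // Assumption: replace this with the object TestComplete maps for your PDF viewer.
  const viewerWindow = Aliases.pdfViewer.mainWindow;
  const text = copyAllText(viewerWindow, Sys);
  // Write the text to a placeholder path as UTF-8, overwriting any existing file.
  aqFile.WriteToTextFile("C:\\temp\\output.txt", text, aqFile.ctUTF8, true);
}
```

Depending on the viewer, a short pause between the copy and the clipboard read may be needed for large documents.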

    • AlexKaras
      Champion Level 3

      > Also, I'm curious where you read that JavaScript can't open or save documents. 

      I think the article was about JScript in a web page: JScript running in a web page is restricted from directly reading and/or writing files on the local hard drive.

      • petra
        Contributor

        AlexKaras, I can't find anything about JavaScript outside of a browser.

        Creating an ActiveXObject on a remote server is not supported in Internet Explorer 9 standards mode, Internet Explorer 10 standards mode, Internet Explorer 11 standards mode, and Windows Store apps or later.

        https://msdn.microsoft.com/de-de/library/7sw4ddf8(v=vs.94).aspx
  • petra
    Contributor

    Colin_McCrae and tristaanogre, many thanks. I will try this on Tuesday.

    tristaanogre, I'll look for the link where I read that it's not possible.

    Have a nice Easter weekend.