Forum Discussion

MulgraveTester
Frequent Contributor
8 years ago
Solved

Reading PDF documents

I am trying to follow the steps here for reading the contents of a pdf document and have converted the code to VBS. I have done the setup successfully and now I am trying to get the pdf file object.

When I execute the following I get java.lang.IllegalArgumentException: argument type mismatch.

 

I have tried all of the 'load' overload options but can't get the code to work. What am I doing wrong?

 

For future reference - where can I get a list of each of the overload options, that expands on "Param1 as Object", for this or any other object?

 

VBScript

function loadDocument(byval fileName)
  dim docObj

  'Load the PDF file to the PDDocument object
  set docObj = JavaClasses.org_apache_pdfbox_pdmodel.PDDocument.load_3(fileName) 'FAILS HERE
  'Return the resulting PDDocument object
  set loadDocument = docObj
end function

 

sub testPDF
  set docObj = loadDocument("C:\\Temp\\Document.pdf")
end sub

 

16 Replies

  • Hi tsan123,

     

    try using double backslashes like this

    docObj = JavaClasses.org_apache_pdfbox_pdmodel.PDDocument.load_3("C:\\Users\\Downloads\\Test.pdf")

    instead of single ones

    docObj = JavaClasses.org_apache_pdfbox_pdmodel.PDDocument.load_3("C:\Users\Downloads\Test.pdf") 

    • HKosova
      SmartBear Alumni (Retired)

      PhilippeVDB wrote:

      Hi tsan123,

       

      try using double backslashes like this

      docObj = JavaClasses.org_apache_pdfbox_pdmodel.PDDocument.load_3("C:\\Users\\Downloads\\Test.pdf")

      instead of single ones

      docObj = JavaClasses.org_apache_pdfbox_pdmodel.PDDocument.load_3("C:\Users\Downloads\Test.pdf") 


      tsan123 uses VBScript, so this suggestion does not apply.
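
      For anyone wondering why this advice depends on the scripting language: in JScript/JavaScript a backslash inside a string literal starts an escape sequence, so a Windows path has to be written with doubled backslashes. A minimal sketch in plain JavaScript (no TestComplete objects involved; the path is just the example one from this thread):

```javascript
// In JavaScript/JScript, "\\" in a string literal produces ONE backslash.
var escaped = "C:\\Users\\Downloads\\Test.pdf";

// With single backslashes, "\U", "\D" and "\T" are not recognized escape
// sequences, so the backslash is silently dropped and the path is corrupted.
var unescaped = "C:\Users\Downloads\Test.pdf";

console.log(escaped);   // C:\Users\Downloads\Test.pdf
console.log(unescaped); // C:UsersDownloadsTest.pdf
```

      VBScript has no backslash escape sequences, so there a doubled backslash stays as two characters in the string. That is why the suggestion does not carry over to VBScript as-is.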

      • MulgraveTester
        Frequent Contributor

        Looks like my original thread has expanded a bit.

        I use VBScript and the code does apply but it looks like the "set" statement has been omitted

         

        set docObj = JavaClasses.org_apache_pdfbox_pdmodel.PDDocument.load_3("C:\\Users\\Downloads\\Test.pdf")

  • MulgraveTester
    Frequent Contributor

    Fixed thanks to this post by AlexKaras. I was using the latest version, pdfbox-app-2.0.3.jar, and needed to use the older pdfbox-app-1.8.12.jar.

    • tsan123
      Occasional Contributor

      I tried PDFBox 1.8.4, 1.8.9, and 1.8.12, but I still get the error. Please help.

      • tristaanogre
        Esteemed Contributor

        You're getting a different error than the one this was originally posted under.  It seems like you haven't actually included the pdfbox jar in your java bridge.  Could you post a screenshot of Tools -> Current Project Properties -> Java Bridge?

         

        Also, a post of the code you're using would be helpful as well.

  • m_essaid
    Valued Contributor

    If you have correctly added the DLLs to the CLR Bridge, did you unblock them (Windows file Properties)? You have to do that file by file.

  • manjeetku
    New Contributor
    • Read a PDF to verify a report in TestComplete.
    • You have to import pdfbox.dll.
    1. Read the report PDF data and check that it contains a given string to verify the test functionality.

    // Entry point: checks that the PDF contains the expected text.
    function verifyPDFTextValue() {
      var status = verifyPDFText("50 plus_ element");
      return status;
    }

    // Extracts all text from the PDF at strReportFilePath with PDFBox (via the CLR bridge).
    function readPDFData(strReportFilePath) {
      // Use the path passed in by the caller instead of calling callPDF() a second time
      var doc = dotNET.org_apache_pdfbox_pdmodel.PDDocument.load(strReportFilePath);
      var pdfStripper = dotNET.org_apache_pdfbox_util.PDFTextStripper.zctor_2();
      var str = pdfStripper.getText_2(doc);
      Log.Message("See Additional Info", str);

      doc["close"]();

      return str;
    }

    // Returns "Success" if the PDF text contains strTextToVerify, "Failure" otherwise.
    function verifyPDFText(strTextToVerify) {
      var status;
      var pdfFileFullPath = callPDF();
      var pdfText = readPDFData(pdfFileFullPath);

      // aqString.Find returns -1 when the substring is not found; it is used here
      // because a native script string has no .NET-style Contains method.
      if (aqString["Find"](pdfText, strTextToVerify) > -1) {
        Log["Message"]("Verification successful. Text in PDF file is equal!");
        status = "Success";
      }
      else {
        Log["Error"]("Text in PDF file is different from the parameter passed!");
        status = "Failure";
      }

      return status;
    }

    // In a dynamic environment, use the last modified file instead of the fixed path below.
    function callPDF() {
      var sPath = "../IrisProject/TestData/TestDataPDF/TestPDF.pdf";
      var sExpPath = aqFileSystem["ExpandFileName"](sPath);
      var fileName = aqFileSystem["GetFileNameWithoutExtension"](sExpPath);

      // Posts the file name (without extension) to the test log
      Log["Message"]("The file name is " + fileName);

      Files["TestPDF"]["Check"](fileName);

      Log.Message(sExpPath);

      return sExpPath;
    }
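
    One thing to watch with a plain substring check: text extracted from a PDF often contains line breaks or extra spaces where the printed page shows a single space, so an exact match can fail even though the text is there. A minimal sketch of a whitespace-tolerant check follows. Both helper names are hypothetical (they are not part of TestComplete or PDFBox), and the sample string is made up for illustration:

```javascript
// Hypothetical helper: collapses runs of whitespace (including line breaks)
// into single spaces and trims a leading/trailing space.
function normalizeWhitespace(text) {
  return String(text).replace(/\s+/g, " ").replace(/^ | $/g, "");
}

// Hypothetical helper: true when expected occurs in pdfText,
// ignoring differences in whitespace.
function containsNormalized(pdfText, expected) {
  return normalizeWhitespace(pdfText).indexOf(normalizeWhitespace(expected)) > -1;
}

// Made-up sample of extracted text with line breaks inside the phrase
var extracted = "Report summary\r\n50 plus_\r\n  element\r\nTotal: 3";
console.log(containsNormalized(extracted, "50 plus_ element")); // true
console.log(containsNormalized(extracted, "60 plus_ element")); // false
```

    In a TestComplete script, the string returned by readPDFData could be passed as pdfText.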

     

     

    • If the report or PDF file path/name is dynamic, the functions below help find the most recently modified file in a folder. Pass the path returned by FindLastModifiedFileInFolder to readPDFData instead of calling callPDF().

     

    // Looks up the most recently modified file in the report folder and logs the result.
    function getLastModifiedFileName() {
      var status;
      Delay(5000);
      var reportsPath = CreateReportFolder();
      Delay(5000);
      var fileToSearch = FindLastModifiedFileInFolder(reportsPath, ".*");

      Log["Message"](fileToSearch);
      if (fileToSearch != null) {
        Log["Message"]("File found " + fileToSearch);
        status = "Success";
      }
      else {
        Log["Error"]("File not found!");
        status = "Failure";
      }
      return status;
    }

    // Creates the report folder under the project root folder if it does not exist yet.
    function CreateReportFolder() {
      var reportsFolderRelativePath = "../IrisProject/PTPReports";
      var reportsFolderFullPath = aqFileSystem["ExpandFileName"](reportsFolderRelativePath);
      if (!aqFileSystem["Exists"](reportsFolderFullPath)) {
        // CreateFolder returns a status code rather than a path,
        // so its result must not overwrite reportsFolderFullPath
        aqFileSystem["CreateFolder"](reportsFolderFullPath);
      }

      return reportsFolderFullPath;
    }

     

    // Returns the path of the most recently modified file in FolderPath whose name
    // matches the FileNameContains regular expression, or null if no file matches.
    function FindLastModifiedFileInFolder(FolderPath, FileNameContains) {
      var FolderObject = aqFileSystem.GetFolderInfo(FolderPath); // The folder to look in
      var FileItems = FolderObject.Files;   // Collection of all the files in the folder
      var FileDateModified = new Array();   // Modification dates of the matching files
      var FileNumber = new Array();         // Indexes of the matching files in FileItems
      var LatestFileNumber;                 // Index of the most recently modified file
      var NewestTime;                       // Newest modification time found so far

      // Guard: the Files collection may be empty (null) if the folder has no files
      if (FileItems == null) return null;

      // Build up the arrays with the files whose names match the FileNameContains regexp
      for (var i = 0; i < FileItems.Count; i++) {
        if (FileItems.Item(i).Name.search(FileNameContains) > -1) {
          FileNumber.push(i);
          FileDateModified.push(FileItems.Item(i).DateLastModified);
        }
      }

      // Guard: no file name matched the pattern
      if (FileNumber.length == 0) return null;

      // Find the most recently modified file
      NewestTime = FileDateModified[0];
      Log["Message"](NewestTime);
      LatestFileNumber = FileNumber[0];
      Log["Message"](FileDateModified);

      for (var i = 0; i < FileNumber.length; i++) {
        if (aqDateTime["Compare"](NewestTime, FileDateModified[i]) < 0) {
          // Store the current element, not the whole array
          NewestTime = FileDateModified[i];
          LatestFileNumber = FileNumber[i];
        }
      }
      return FileItems.Item(LatestFileNumber).Path;
    }
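
    The newest-file search boils down to tracking a single "best so far" entry while looping over the matching files, and storing the current element (index i) each time a newer one turns up. A minimal sketch of just that logic in plain JavaScript follows. The function name and the sample objects are hypothetical stand-ins for the TestComplete file collection, with modification times given as plain numbers:

```javascript
// Hypothetical stand-in for the file collection: plain objects with a name,
// a path and a modification time (larger number = more recent).
function findLastModified(files, namePattern) {
  var latest = null;
  for (var i = 0; i < files.length; i++) {
    // Skip files whose names do not match the pattern
    if (files[i].name.search(namePattern) === -1) continue;
    if (latest === null || files[i].modified > latest.modified) {
      latest = files[i]; // keep the single newest element
    }
  }
  return latest === null ? null : latest.path;
}

var sample = [
  { name: "report_1.pdf", path: "C:\\Reports\\report_1.pdf", modified: 1000 },
  { name: "report_2.pdf", path: "C:\\Reports\\report_2.pdf", modified: 3000 },
  { name: "notes.txt",    path: "C:\\Reports\\notes.txt",    modified: 5000 }
];
console.log(findLastModified(sample, "\\.pdf$"));  // C:\Reports\report_2.pdf
console.log(findLastModified(sample, "\\.docx$")); // null
```

    Returning null when nothing matches lets the caller distinguish "no file found" from a real path with a simple null check.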

     

    • See the attached screenshot for a clearer idea.

    For any further queries, reply back :)