Getting an exception while trying to read a PDF file using the PDFBox DLL
I found a solution for reading PDF files using the PDFBox DLLs:
- Add IKVM.GNU.Classpath.dll and PDFBox-0.7.3.dll as references
- Copy FontBox-0.1.0-dev.dll and IKVM.Runtime.dll into TestComplete’s bin directory
I added the DLLs as described above and used the code below to get the text.
var filename = "C:\\iarf-carepricer.pdf";
var doc = dotNET.org_pdfbox_pdmodel.PDDocument.load(filename);
var pdfStripper = dotNET.org_pdfbox_util.PDFTextStripper.zctor();
var str = pdfStripper.getText_2(doc);
When I run it, I get the following exception at the last line:
System.NullReferenceException: Object reference not set to an instance of an object.
at org.pdfbox.pdmodel.PDPageNode.getAllKids(List , COSDictionary , Boolean )
at org.pdfbox.pdmodel.PDPageNode.getAllKids(List result)
at org.pdfbox.pdmodel.PDDocumentCatalog.getAllPages()
at org.pdfbox.util.PDFTextStripper.writeText(PDDocument doc, Writer outputStream)
at org.pdfbox.util.PDFTextStripper.getText(PDDocument doc)
Can someone please help?
Thanks,
Vikas
Same here - Windows 7, 64-bit. Here's my working configuration (TestComplete 11.20):
Usually when downloaded DLLs don't show up under dotNET, it's because they're blocked by Windows and need to be unblocked in the file properties:
http://blog.codingoutloud.com/2010/03/05/the-project-location-is-not-trusted-dealing-with-the-dreaded-unblock/
http://superuser.com/questions/38476/this-file-came-from-another-computer-how-can-i-unblock-all-the-files-in-a
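Once the DLLs are visible under dotNET, it may also help to wrap the extraction from the original post in a small helper so the document is closed even when getText_2 throws. This is only a sketch, not a fix for the NullReferenceException itself. The helper name extractPdfText and its bridge parameter are made up for illustration, and it assumes that PDDocument's close() method is exposed through the bridge under that name.

```javascript
// Sketch: extract text from a PDF through the PDFBox .NET bridge.
// `bridge` is expected to be TestComplete's dotNET object; it is passed in
// as a parameter (hypothetical design) so the helper can be tried with a stub.
function extractPdfText(bridge, filename) {
  var doc = bridge.org_pdfbox_pdmodel.PDDocument.load(filename);
  try {
    var stripper = bridge.org_pdfbox_util.PDFTextStripper.zctor();
    return stripper.getText_2(doc);
  } finally {
    // Assumption: close() is available under this name; it should release
    // the file so a failed run does not leave the PDF locked.
    doc.close();
  }
}
```

In a TestComplete script it would be called as var str = extractPdfText(dotNET, "C:\\iarf-carepricer.pdf"); any exception from getText_2 still propagates to the caller, only after the document has been closed.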