
Does PDF.ConvertToText support PDFs which have protections to prevent text being "Copy and pasted"

SOLVED
mdang
New Contributor


Hi,

 

I have a requirement to validate the contents of an invoice PDF. The PDF has a protection mechanism that prevents its contents from being copied and pasted (the text comes out garbled). This means I can't use libraries like Apache PDFBox: I tried it with TestComplete, and the extracted text is garbled too.

Can anyone confirm whether TestComplete's PDF.ConvertToText method supports this type of PDF, since it uses OCR to extract the text? My organization has a secured network where port 443 is blocked, and the process to get the port opened is lengthy and requires numerous approvals. I would hate to go through that process only to find out that the functionality can't extract the text from my PDF.

 

 

Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions
hkim5
Staff

Re: Does PDF.ConvertToText support PDFs which have protections to prevent text being "Copy and

If you were to get port 443 opened, PDF.ConvertToText should work, since it does not copy and paste the text but uses OCR, as you mentioned.
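
For the validation step itself, here is a rough sketch in Python of how the extracted text could be checked against expected invoice values. It assumes the text is already available as a plain string (in a TestComplete script that would be whatever the extraction call returns). The helper names `normalize` and `missing_fields` and the sample invoice values are made up for illustration and are not part of TestComplete.

```python
import re


def normalize(text):
    """Collapse whitespace runs and lowercase, so line breaks in extracted text don't matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def missing_fields(extracted_text, expected_values):
    """Return the expected values that do not appear in the extracted text."""
    haystack = normalize(extracted_text)
    return [value for value in expected_values if normalize(value) not in haystack]


if __name__ == "__main__":
    # Hypothetical sample standing in for the string produced by the extraction step.
    sample = "INVOICE  No. 10423\nTotal due:\n  1,250.00 USD"
    # An empty list means every expected value was found.
    print(missing_fields(sample, ["Invoice No. 10423", "Total due: 1,250.00 USD"]))
    print(missing_fields(sample, ["Purchase order 77"]))
```

A plain substring match like this is strict, so a single misread character from OCR would cause a false failure. If that turns out to be a problem, a fuzzier comparison may be needed.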

Best,
Justin Kim
