Forum Discussion

rusantos
Contributor
8 years ago

What is the best way to compare PDF / Excel files using TestComplete?

What is the best way to compare the content, format, size and timestamp of two PDF/Excel files?

Please include a screenshot of the reports generated by your solution if possible.

 

Thanks,

2 Replies

  • AlexKaras
    Champion Level 3

    Hi,

     

    Both file formats you mentioned are containers for different kinds of data (text, pictures, formulas, etc.), so there is no single best way to compare them. Everything depends on your actual needs.

    In the simplest case, when you only need to compare the text, you can extract the text from the file (for example, using the PDFBox library for PDF files, or Excel's Save As functionality to save the workbook as a CSV file), pre-process the extracted text to exclude dynamic information you are not interested in (e.g. the date the document was generated), and compare the results.
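
    The pre-process and compare steps could be sketched in Python roughly as follows. This assumes the text has already been extracted into plain strings (the extraction itself is not shown). The patterns in DYNAMIC_PATTERNS and the names normalize and compare_text are hypothetical and only illustrate the idea; adjust them to whatever dynamic values your documents contain.

```python
import difflib
import re

# Hypothetical patterns for dynamic content; adjust to the documents at hand.
DYNAMIC_PATTERNS = [
    re.compile(r"\d{4}-\d{2}-\d{2}"),       # ISO dates such as 2024-01-31
    re.compile(r"\d{1,2}:\d{2}(:\d{2})?"),  # times such as 14:05 or 14:05:59
]


def normalize(text):
    """Mask dynamic values and collapse whitespace, line by line."""
    lines = []
    for line in text.splitlines():
        for pattern in DYNAMIC_PATTERNS:
            line = pattern.sub("<MASKED>", line)
        line = " ".join(line.split())
        if line:  # drop lines that are empty after normalization
            lines.append(line)
    return lines


def compare_text(expected, actual):
    """Return unified-diff lines; an empty list means no differences."""
    return list(difflib.unified_diff(
        normalize(expected), normalize(actual),
        fromfile="expected", tofile="actual", lineterm="",
    ))
```

    An empty result means the two texts match once the masked values and whitespace differences are ignored; otherwise the returned diff lines can be written to the test log.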

    Alternatively, you can open the document in its native application (a PDF reader or Excel), take screenshots, and compare them with an expected baseline using a third-party provider (e.g. https://applitools.com/).

    In more complex cases, for example when you need to compare formulas in an Excel file, you will need to use the object model of the given file format (e.g. the Excel object model) to access the elements you need to work with.
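
    As a rough illustration of comparing formulas, the Python sketch below does not use the Excel object model itself. It swaps in a simpler technique: reading the formula elements straight from the .xlsx package, which is a zip archive of XML parts. The function names and the default part name xl/worksheets/sheet1.xml are assumptions; a real workbook may name its sheet parts differently, and shared formulas that carry no text on follower cells are skipped here.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by SpreadsheetML worksheet parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def read_formulas(xlsx_file, sheet_part="xl/worksheets/sheet1.xml"):
    """Map cell references (e.g. 'B1') to formula text for one worksheet part.

    xlsx_file may be a path or a binary file-like object.
    sheet_part is an assumed default, not guaranteed for every workbook.
    """
    with zipfile.ZipFile(xlsx_file) as package:
        root = ET.fromstring(package.read(sheet_part))
    formulas = {}
    for cell in root.iterfind(".//m:c", NS):
        formula = cell.find("m:f", NS)
        # Cells without a formula, or with an empty formula element, are skipped.
        if formula is not None and formula.text:
            formulas[cell.get("r")] = formula.text
    return formulas


def diff_formulas(expected, actual):
    """Return {cell: (expected_formula, actual_formula)} for cells that differ."""
    cells = sorted(set(expected) | set(actual))
    return {
        cell: (expected.get(cell), actual.get(cell))
        for cell in cells
        if expected.get(cell) != actual.get(cell)
    }
```

    A cell that has a formula in only one of the two files shows up with None on the other side, so added and removed formulas are reported along with changed ones.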

    All of the above topics have been discussed here several times, so you can search the forum for more ideas and approaches.