Forum Discussion

jeff2tom
New Contributor
13 years ago

extracting strikeout text from image

I'm using VBScript in TestComplete and need to read some text from an
image. OCR works for most of it, including text in a bold font, but one
piece of text has the strikeout style, and for that text OCR returns
gibberish. Is there a way to recognize struck-out text in an image?
The following is the code I've used, which is not working:


 Set Rect = Aliases.npApp.wndNP6Class.Picture.GetRect(5,47,234,520)
 Set OCRObj = OCR.CreateObject(Rect)
 Set ocrOpt = OCRObj.CreateOptions
 Set Font = ocrOpt.Fonts.Add
 Font.Name = "Arial"
 Font.Styles.Add 2   ' Bold font style
 Font.Styles.Add 8   ' Strikeout font style
 ocrOpt.ExactSearch = False
 ocrOpt.GrayScaleBinarization = True
 ocrOpt.SkipFragmentation = False
 ocrOpt.BinarizationThreshold = 100   ' Low threshold value to capture text from the image

 vTempStr1 = OCRObj.GetText(ocrOpt)

Is there an alternative to OCR, i.e. some other way of extracting text from an image without using OCR?


Thank you.




2 Replies

  • HKosova
    SmartBear Alumni (Retired)
    Hi Jefferson,



    I'm afraid the TestComplete OCR engine currently doesn't support strikeout fonts. It supports only the regular, bold, italic and underlined font styles and their combinations.



    "Is there an alternative to OCR, i.e. some other way of extracting text from image without using OCR"


    From your code, I see that you want to get text from an object in an application. If this object isn't actually a graphic image but rather a custom control (e.g. a rich text editor) that TestComplete doesn't identify out of the box, there are more efficient alternatives to OCR.



    For example, if your tested application is Open to TestComplete so that you have access to native properties and methods of the application objects, you should be able to get the object text using a native property or method. Examine your object in the Object Browser to see what methods and properties are available for it.
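
    As a rough sketch of the native-property approach, the routine below checks whether the object exposes a text property and logs its value. The property name "Text" is only a guess used for illustration, and the alias is taken from the code in the question; which members actually exist depends on the control, so check the Object Browser first.

    ```vbscript
    ' Sketch only: "Text" is a hypothetical native property name.
    ' Replace it with whatever the Object Browser shows for your control.
    Sub LogObjectText
      Dim obj
      Set obj = Aliases.npApp.wndNP6Class   ' alias from the original post

      If aqObject.IsSupported(obj, "Text") Then
        ' Native property exposed by an Open application (assumed name)
        Log.Message "Native text: " & aqConvert.VarToStr(aqObject.GetPropertyValue(obj, "Text"))
      ElseIf aqObject.IsSupported(obj, "wText") Then
        ' wText is the property TestComplete adds to many text controls
        Log.Message "wText: " & aqConvert.VarToStr(obj.wText)
      Else
        Log.Warning "No text property found; inspect the object in the Object Browser."
      End If
    End Sub
    ```

    If neither branch matches, that suggests the control is not exposed to TestComplete, and the non-Open alternatives are the next thing to try.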



    If your tested application is non-Open, Text Recognition, MSAA and UI Automation are the recommended alternatives that can provide better object identification. You can find more info and tips in the following articles:

    OCR vs. Text Recognition

    Ways to Interact With Application Objects

    About Improving Object Recognition





    If the object in question is an actual image/picture, the workaround I could suggest is to use a third-party OCR utility to get the text from the image and integrate this utility into your test.
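
    One possible way to wire in an external OCR tool is sketched below. It assumes the Tesseract command-line utility is installed and on PATH; the reply does not name a specific tool, so this choice, the function name and the file paths are all illustrative. There is also no guarantee that an external engine handles struck-out text any better, so treat the result as something to verify.

    ```vbscript
    ' Sketch only: assumes the "tesseract" executable is installed and on PATH.
    ' The file paths below are hypothetical.
    Function GetTextViaExternalOcr
      Dim pic, shell, imgPath, outBase, exitCode
      imgPath = "C:\Temp\region.png"
      outBase = "C:\Temp\region_ocr"   ' Tesseract appends ".txt" to this base name

      ' Capture the same region as in the original post and save it to disk
      Set pic = Aliases.npApp.wndNP6Class.Picture.GetRect(5, 47, 234, 520)
      pic.SaveToFile imgPath

      ' Run the tool with a hidden window (0) and wait until it exits (True)
      Set shell = CreateObject("WScript.Shell")
      exitCode = shell.Run("tesseract """ & imgPath & """ """ & outBase & """", 0, True)

      If exitCode <> 0 Then
        Log.Error "External OCR failed with exit code " & exitCode
        GetTextViaExternalOcr = ""
        Exit Function
      End If

      ' Read the recognized text back into the test
      GetTextViaExternalOcr = aqFile.ReadWholeTextFile(outBase & ".txt", aqFile.ctUTF8)
    End Function
    ```

    Calling `vTempStr1 = GetTextViaExternalOcr` would then take the place of the `OCRObj.GetText(ocrOpt)` call in the original script.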



    I hope this answers your question. Let me know if I should give you more details or if you have additional questions about this issue.