
More OCR questions?

Colin_McCrae
Community Hero


I'm continuing to mess about with this.

 

What I'm trying to determine is the best "common" values to use in the settings, i.e. the ones that give me the highest percentage of accurate results.

 

The image I have to detect text on uses two sizes of font. One small, one large. The font in question is derived from the Windows Tahoma font, and then rendered onto a canvas, and coloured with some aliasing/dithering applied to smooth it. So it's a font the TC OCR engine can handle out of the box. Pretty much. It will never be perfect, I know that. But in smaller manual trials, I did manage to get it around 90% accurate on the larger text. Using the default available fonts.

 

But I wanted to see which size of font gave me the best results, as OCR can be pretty slow. Reducing it to a smaller set of fonts to attempt to match against gives a pretty considerable performance gain.

 

So, I set up a looped test. It went through all my stored images (40 in all - 22 large text and 18 small) and ran through the full image set using only a single font size.

 

I got best results on the large text using sizes 14, 16 & 24. On the small text, oddly enough, I got the best results with font size 30. Which seemed a little odd. But whatever, it produced the best results so I'll roll with that.

 

So then I re-ran the whole image set. But this time I gave it the four best font sizes from the previous single font trials. So it got 14, 16, 24 & 30 to work with.

 

Everything else the same. Same images. Same OCR options. The ONLY thing that changed was the font sizes available to the OCR engine as it ran through.

 

On all these runs, I'm using greyscale binarization. It loops through, increasing the binarization threshold by 25 on each loop. So I get a full set of results for all 40 images with binarization increasing by 25 on each run through. As I say, this has not changed through any of these test runs.
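For readers who want to picture the sweep, here is a minimal sketch in plain Python. It is not the TestComplete implementation: `binarize`, `threshold_sweep`, the starting value of 0 and the tiny sample image are all assumptions made for illustration only.

```python
def binarize(grey, threshold):
    """Map each greyscale pixel (0-255) to black (0) or white (255)."""
    return [[255 if px >= threshold else 0 for px in row] for row in grey]


def threshold_sweep(start=0, stop=255, step=25):
    """Thresholds visited when the value goes up by `step` on each pass.

    The start and stop values are assumed; only the step of 25 is from the post.
    """
    return list(range(start, stop + 1, step))


# Hypothetical 2x3 greyscale image standing in for a saved button capture.
sample = [[10, 120, 240],
          [60, 130, 200]]

for t in threshold_sweep():
    # In a real run, each binarized image would be handed to the OCR step.
    print(t, binarize(sample, t))
```

Pixels close to the threshold flip between black and white from one pass to the next, which is one reason anti-aliased text can read differently at neighbouring threshold values.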

 

And yet, when I switch from single font, to four best fonts (from the single font runs), my results change completely?!?!?

 

Some of the large text results are better. Some are worse. The small text results are pretty much universally worse?!?!?

 

How is this possible?

 

If I matched 5 out of 18 with ONLY font size 30 available, why does this drop to only 2 out of 18 when it has font sizes 14/16/24/30 available? Surely the 5 at font size 30 should still produce matches?

 

I'll do two separate checks if I have to (one with optimal single font setting for each size) but this will come with a performance hit so I'd rather not.

 

Any ideas why it behaves this way?

 

I still have more (much more!) settings to play with. Both with the image files themselves and the OCR options applied scanning them. But I'm already seeing unexplained inconsistency just with adding font sizes ....

4 REPLIES
EnergizerBunny
Contributor

The application that we test uses the Tahoma font also.

 

I do not have a definitive list of the best common values, but I can relate to you some methods I use in helping the OCR function to work faster.

 

Our projects are in VBScript.

 

I use code that works with the individual window objects, not the entire form or the entire desktop.

 

I have found that if you try to limit the area inside the window object that you are trying to read the text from, you will speed up the OCR process. Also, try to avoid horizontal and vertical lines as this will add time to the OCR process.

Since we are working with images on window objects, I decided to create a class to hold not only the window object, but also the position and dimensions inside that object that I want to do OCR on (Top, Left, Height, Width).
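As a rough illustration of that idea (in Python rather than VBScript, and not the actual class from our project), a region holder could look something like the sketch below. The name `OcrRegion`, the `target` field and the `crop` helper are hypothetical; a plain 2D list stands in for the captured picture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrRegion:
    """A target object plus the rectangle inside it to run OCR on."""
    target: object   # stand-in for the window object (assumption)
    top: int
    left: int
    height: int
    width: int

    def crop(self, pixels):
        """Return only the rows and columns that fall inside the rectangle."""
        rows = pixels[self.top:self.top + self.height]
        return [row[self.left:self.left + self.width] for row in rows]


# Hypothetical 4x4 "image" whose values encode row and column (row*10 + col).
grid = [[r * 10 + c for c in range(4)] for r in range(4)]
region = OcrRegion(target="button_panel", top=1, left=1, height=2, width=2)
print(region.crop(grid))  # only the cropped area would go to the OCR step
```

Keeping the rectangle next to the object means each test can pass one value around and the OCR step never sees more pixels than it needs.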

 

I also found out that you can use the Picture.Stretch method on the window object before you pass it to your OCR code. This allows you to work with fonts outside of the range of the OCR engine. You can stretch larger or ‘stretch’ smaller. It can also be used to resize the images to the font sizes that your tests show work the fastest. You mentioned that font 30 was the best in your trials. Now you can limit the number of font sizes that you load into the OCR engine.

 

Hope this helps a little.

Thanks for the suggestions.

 

I'm already cutting out small areas of the image. It's a touchscreen panel built up of multiple buttons. Which are in a grid. So I can extract only the button I want.

 

So already using a small area. Running it on the entire image was WAY too slow!

 

As I say, the font starts life as Tahoma, but it gets scaled and rendered by the software that produces the touch panel images so it's not 100% the same as standard system Tahoma by the time it gets there. The colours used also have a big effect. Foreground and background colours on the buttons are user configurable - but there will be limits around the colours you can use if you plan to use OCR. White text on a yellow background = not good!

 

So I've built myself a little test harness that loops through a lot of the possible options and tries each of my 40 saved button images so I can establish the most reliable settings for us. There are a ton of settings. Besides the actual OCR ones, there are also quite a few you can apply to the image file (saving as greyscale, re-sizing, compressing etc etc) before you run the OCR over it. Way too many permutations to figure it out manually as it's by no means an exact science.
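A harness of this kind can be sketched roughly as follows. This is an illustration in Python, not my actual harness: `run_trials`, the option names and `fake_recognize` are all made up, and `recognize` is a stand-in for whatever call performs the real OCR.

```python
from itertools import product


def run_trials(images, option_grid, recognize):
    """Try every combination of options and count exact text matches.

    images:      list of (image, expected_text) pairs
    option_grid: dict mapping an option name to the values to try
    recognize:   callable(image, **options) -> str (hypothetical OCR stand-in)
    """
    names = list(option_grid)
    results = []
    for values in product(*(option_grid[name] for name in names)):
        options = dict(zip(names, values))
        matches = sum(1 for image, expected in images
                      if recognize(image, **options) == expected)
        results.append((matches, options))
    # Best-scoring combination first; ties keep their original order.
    results.sort(key=lambda item: item[0], reverse=True)
    return results


def fake_recognize(image, threshold, greyscale):
    """Toy recognizer: only 'reads' the text for one particular setting."""
    return image if threshold == 125 and greyscale else ""


images = [("OK", "OK"), ("STOP", "STOP")]
grid = {"threshold": [100, 125], "greyscale": [True, False]}
results = run_trials(images, grid, fake_recognize)
print(results[0])
```

The number of combinations is the product of the list lengths, so it grows quickly as options are added.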

 

The bit I can't figure out is how using:

 

font size 14 = 13 matches on large text

font size 16 = 13 matches on large text

font size 24 = 13 matches on large text

 

font size 30 = 5 matches on small text (?!?!? no idea how 30 is most effective - the text is TINY!)

 

But ....

 

With font sizes 14/16/24/30 all in one go, you would expect to match at least 13 large and 5 small. But it doesn't?!?!? Instead I get 12 large matches and only 3 small. No idea how that's happening. The font size(s) available to the OCR engine is the only parameter changing between runs ...
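One way to dig into this is to compare which images matched in each run rather than only the totals. The sketch below is in Python, with hypothetical image names and a made-up `compare_runs` helper. It does not explain the engine's behaviour; it only shows which images were lost or gained when the sizes were combined.

```python
def compare_runs(single_runs, combined):
    """Compare per-image matches from single-size runs with a combined run.

    single_runs: dict mapping a font size to the set of image names it matched
    combined:    set of image names matched with all sizes loaded together
    """
    matched_alone = set().union(*single_runs.values())
    return {
        "lost": sorted(matched_alone - combined),    # matched alone, not combined
        "gained": sorted(combined - matched_alone),  # matched only when combined
        "kept": sorted(matched_alone & combined),
    }


# Hypothetical results; real image names and matches would come from the logs.
single = {14: {"btn01", "btn02"}, 30: {"btn20"}}
combined = {"btn01", "btn05", "btn20"}
report = compare_runs(single, combined)
print(report)
```

If the "lost" list is dominated by one text size, that at least narrows down where the combined run goes wrong.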

One thing you could try is to turn OFF Clear Type Text on the test machines.

 

When Clear Type Text is set for your display device, it ‘smooths’ the pixels of the LCD monitor so the image looks better to the human eye, but it ‘skews’ the bitmap image on the screen and can interfere with the OCR function. It can also interfere with the TextObject identification.

 

On your machine, go to Control Panel\Display\Adjust Clear Type text to make the change. You cannot do this via remote desktop; you must be on the console.

 

We do this as the standard setup on all our test machines.

Thanks for that. Produced some interesting results.

 

I hadn't even looked at the Clear Type settings as I wasn't aware they could affect images. But I tried it, and you're right, it does.

 

With it off, the aliasing was less pronounced. And it caused fewer letters to appear "joined" together by the aliasing artifacts.

 

Unfortunately, it didn't translate into a big gain in the accuracy of the OCR. Maybe 10%? If you're lucky?

 

I think I'm simply going to have to give a few caveats around the colours used (stick to white text on dark backgrounds - which most of them are anyway) and keep the text short and simple. We can control the colours and text used so it's not a problem. Just need to make sure people are aware of it. But, follow a few simple rules, and it should be accurate enough for my purposes.
