

Great! We can clearly see the continued blurring of the image due to the application of our kernel. But what if you needed to blur the image and retain the color? Let us first try to apply the convolutions per color channel.

```python
import numpy as np
import matplotlib.pyplot as plt

def convolver_rgb(image, kernel, iterations=1):
    # Convolve each color channel independently.
    convolved_image_r = multi_convolver(image[:, :, 0], kernel, iterations)
    convolved_image_g = multi_convolver(image[:, :, 1], kernel, iterations)
    convolved_image_b = multi_convolver(image[:, :, 2], kernel, iterations)

    # Stack the channels back into a single RGB image.
    reformed_image = np.dstack((np.rint(abs(convolved_image_r)),
                                np.rint(abs(convolved_image_g)),
                                np.rint(abs(convolved_image_b))))

    # Show each convolved channel on its own axis.
    fig, ax = plt.subplots(1, 3, figsize=(17, 10))
    ax[0].imshow(abs(convolved_image_r), cmap='Reds')
    ax[0].set_title('Red', fontsize=15)
    ax[1].imshow(abs(convolved_image_g), cmap='Greens')
    ax[1].set_title('Green', fontsize=15)
    ax[2].imshow(abs(convolved_image_b), cmap='Blues')
    ax[2].set_title('Blue', fontsize=15)

    return np.clip(reformed_image, 0, 255).astype(np.uint8)

convolved_rgb_gauss = convolver_rgb(dog, gaussian, 2)
```

The image has been reformed, but we now see that there are some slight distortions. Remember that the RGB color space implicitly mixes the luminance of the pixels with the colors. This means that it is practically impossible to apply convolutions to the lighting of an image without changing the colors.
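As an aside (this is my own sketch, not part of the walkthrough above): one way around that limitation is to convert to a color space such as HSV, where the value channel is separate from the hue, blur only that channel, and convert back. Here `scipy.ndimage.gaussian_filter` stands in for the `multi_convolver` helper used above, and the function name is mine.

```python
# Sketch (not from the article): blur lighting without shifting colors by
# working in HSV, where the value channel is separate from the hue.
import matplotlib.pyplot as plt
from scipy.ndimage import gaussian_filter  # stand-in for multi_convolver
from skimage.color import rgb2hsv, hsv2rgb

def blur_value_channel(image, sigma=3):
    hsv = rgb2hsv(image)                 # floats in [0, 1]
    hsv[:, :, 2] = gaussian_filter(hsv[:, :, 2], sigma=sigma)  # blur V only
    return hsv2rgb(hsv)                  # colors intact, lighting blurred

plt.imshow(blur_value_channel(dog))      # 'dog' is the sample image from above
plt.show()
```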
# Blur image cleaner trial
Stretching the image (in 2) above) on the x or y axis increased the accuracy by 10-20%. So apparently PyTesser does not take care of font dimension or image stretching. Although there is much theory to be read about image processing and OCR, are there any standard procedures of image cleanup (apart from erasing icons and images) that need to be done before applying PyTesser or other libraries, irrespective of the language?
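To make the stretching experiment concrete (a sketch of my own, not code from the post; Pillow and the file names are assumptions):

```python
# Sketch of the x-axis stretching experiment described above.
# Pillow is assumed; 'scan.png' is a placeholder file name.
from PIL import Image

img = Image.open('scan.png')
w, h = img.size
# Stretch 2x along the x axis only; LANCZOS is a high-quality resampler.
stretched = img.resize((w * 2, h), Image.LANCZOS)
stretched.save('scan_stretched.png')
```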

I started my research again on OCR these last couple of days. This time I chucked PyTesser and used the Tesseract Engine with ImageMagick instead. Coming straight to the point, this is what I found:

1) You can increase the resolution with ImageMagick (there are a bunch of simple shell commands you can use).
2) After increasing the resolution, the accuracy went up by 80-90%.

So the Tesseract Engine is without doubt the best open-source OCR engine on the market. No prior image cleaning was required here. The caveat is that it does not work on files with a lot of embedded images, and I couldn't figure out a way to train Tesseract to ignore them. Also, the text layout and formatting in the image make a big difference. It works great with images with just text.
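A sketch of the resolution trick above (my own wording, not the author's exact commands; the 300% factor and file names are assumptions):

```python
# Upscale with ImageMagick's `convert` before handing the image to Tesseract.
import subprocess

subprocess.run(
    ["convert", "scan.png", "-resize", "300%", "scan_big.png"],
    check=True,
)
# The tesseract CLI writes its result to out.txt.
subprocess.run(["tesseract", "scan_big.png", "out"], check=True)
```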

I haven't used PyTesser, but I have done some experiments with tesseract (version: 3.02.02). If you invoke tesseract on a colored image, it first applies global Otsu's method to binarize it, and then the actual character recognition is run on the binary (black and white) image. As can be seen, 'global Otsu' may not always produce a desirable result. A good way to better understand what tesseract 'sees' is to apply Otsu's method to your image yourself and then look at the resulting image. As it turns out, the tesseract wiki has an article that answers this question in the best way I can imagine: the illustrated guide about "Improving the quality of the output". The question "image processing to improve tesseract OCR accuracy" may also be of interest. In conclusion: the most straightforward method to improve the recognition ratio is to binarize images yourself (most likely you will have to find a good threshold by trial and error) and then pass those binarized images to tesseract. Hope this helped.
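A minimal sketch of that binarize-it-yourself advice, assuming OpenCV and pytesseract (neither is named in the answer, which predates pytesseract; they just make the idea concrete):

```python
# Binarize with Otsu's method yourself, then OCR the result.
import cv2
import pytesseract

gray = cv2.imread('scan.png', cv2.IMREAD_GRAYSCALE)
# Otsu picks the global threshold automatically (the chosen value is also returned).
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
cv2.imwrite('scan_otsu.png', binary)  # inspect this to see what tesseract 'sees'
# For trial-and-error thresholding, swap in e.g.
# cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY)
print(pytesseract.image_to_string(binary))
```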
