Do you want to build an application with OCR capability in VB6? I did, and I almost gave up because I could not find an OCR ActiveX control. Then I found Tesseract.
Although it is not an ActiveX control and seems to be usable only from Visual Basic 2008 or higher, we can still make use of its capability from VB6.
Download Tesseract from here and install it. Then insert this line in your VB code:
Shell "tesseract <image file> <output file>"
For example:
Shell "tesseract c:\cap.bmp c:\output", vbHide
Tesseract will convert the image ("cap.bmp") into text and save it in a text file ("output.txt"); it appends the .txt extension to the output name itself. Then all you have to do is read the output file.
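Reading the output file can be sketched as follows. This is only a minimal sketch, not the code from the downloadable project: RunOcr is a hypothetical helper name, and it assumes that tesseract is on the PATH and that the Windows Script Host is available. It uses WScript.Shell's Run method instead of Shell because Shell returns immediately, so the output file may not exist yet when the next line runs.

```vb
' Sketch only: RunOcr is a hypothetical helper, not part of the original project.
' Assumes tesseract is on the PATH and both paths are readable/writable.
Public Function RunOcr(ByVal imagePath As String, ByVal outputBase As String) As String
    Dim wsh As Object
    Dim f As Integer
    Dim txt As String

    ' WScript.Shell.Run with bWaitOnReturn = True blocks until tesseract exits.
    ' Window style 0 hides the console window.
    Set wsh = CreateObject("WScript.Shell")
    wsh.Run "tesseract """ & imagePath & """ """ & outputBase & """", 0, True

    ' Tesseract appends ".txt" to the output base name.
    If Len(Dir$(outputBase & ".txt")) = 0 Then Exit Function

    ' Read the whole file in one go. Bytes are converted from the ANSI code page,
    ' so non-ASCII characters may come out wrong if the file is UTF-8.
    f = FreeFile
    Open outputBase & ".txt" For Binary Access Read As #f
    If LOF(f) > 0 Then
        txt = Space$(LOF(f))
        Get #f, , txt
    End If
    Close #f

    RunOcr = txt
End Function
```

It could then be called as Text1.Text = RunOcr("c:\cap.bmp", "c:\output"), where Text1 stands for whatever text box is on your form. An empty result means that no output file was found or that it was empty.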
Here is a short demo video:
The VB6 Project can be downloaded here.
UPDATE
Shelling tesseract directly doesn't work in Windows 7. The quick solution is to create a *.bat file and shell that instead.
You can create the *.bat file in Notepad with the following line:
tesseract c:\cap.bmp c:\output
For example, if you create "c:\ocr.bat", the Shell command in VB6 will look like this:
Shell "c:\ocr.bat", vbHide
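If you would rather not create the batch file by hand, it can also be written from VB6 just before it is shelled. The following is a rough sketch under the same assumptions as above (tesseract on the PATH); RunOcrViaBatch and its parameter names are made up for illustration, and writing to the root of c:\ may need elevated rights on Windows 7.

```vb
' Sketch only: RunOcrViaBatch and its parameters are hypothetical names.
Public Sub RunOcrViaBatch(ByVal batPath As String, _
                          ByVal imagePath As String, _
                          ByVal outputBase As String)
    Dim f As Integer

    ' Write a one-line batch file containing the tesseract command.
    f = FreeFile
    Open batPath For Output As #f
    Print #f, "tesseract """ & imagePath & """ """ & outputBase & """"
    Close #f

    ' Shell returns immediately; the output file appears a moment later.
    Shell """" & batPath & """", vbHide
End Sub
```

For example: RunOcrViaBatch "c:\ocr.bat", "c:\cap.bmp", "c:\output". Because Shell does not wait for the batch file to finish, check that c:\output.txt exists before reading it.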
How do I get it to work? I have tesseract installed and I downloaded your project. It runs fine when I press F5, but it just doesn't seem to recognize any text at all.
I went through the code, and the Shell line is already in it. What could I be doing wrong?
Hope you respond :)
PS: I am not in this for profit, just interested because this is something new with VB6.
In the command prompt (Start > Run > cmd), type: tesseract c:\cap.bmp c:\output (make sure that cap.bmp, an image that contains text, is in c:\).
If output.txt then exists in c:\ (presumably there was no output.txt there before), tesseract is running as expected, and I have no idea what goes wrong with my code.
However, if DOS complains that 'tesseract' is not recognized as an internal or external command, then reinstall tesseract. In the 'Choose Components' dialog, make sure that you leave all components checked as they are (the default).
I hope this will help.
Thanks for your response. The CMD part works fine; I get the output in the text file in c:\. However, when I run your project in VB6 and place the form (window) over any text (like you did in your demo video), it doesn't seem to capture any text at all.
It just captures its own borders, colored pink. Where am I going wrong?
And I am running Win 7 Ultimate, if that helps.
This is my cap.bmp after I run your project in VB6:
http://imageshack.us/scaled/landing/191/capou.png
Hope it helps. :)
Your 'cap.bmp' shows that the screen capture behaves strangely. My code contains API calls for capturing the screen, so maybe that is the problem. However, since I'm running Windows XP, I cannot verify this.
I suggest that you not use my screen capture code, since it was originally intended for showing the capture and the OCR result in real time in the demo video. I don't think it is useful in real-life applications :)
A common usage would be OCR-ing a scanned image. You can use Windows Image Acquisition (WIA) or EZTWAIN for accessing a scanner.
I'm sorry that I cannot help you make my code work on your machine.
I have been kicking the code around for about two days now. I have got everything working except for the screen capture part: I can capture the whole screen, just not the PictureBox part.
Also, VB6 has no TransparencyKey property, so I had to get a snippet from the internet. Still no luck. :P
Thanks anyway! This has got me interested. :)
Just make sure that you did not cut down on your sleep during those two days :)
Happy coding!
Hello iMan, could you tell me how you made it work in Win7? I'm having the same problem: tesseract works from cmd, but the text has nothing to do with the text in the original image.
Regarding the transparency, you could put the following code in the form's Load event:
yourpicturebox.BackColor = vbTransparent
Obviously, where it says yourpicturebox you have to write the name of the PictureBox you are using.
Best regards
Hi. How can I change the font?
khalid.sami
Hi, I am interested in the VB6 tesseract tutorial. It is not available anymore.