
Friday, February 13, 2009

OCR work...Google Open source announcement

Tesseract OCR:

I have started working on OCR again. After some googling, I came to the conclusion that I should use the Tesseract library. It is open source and available for both Windows and Linux, and it ships with a Visual Studio project. I have compiled it with both Visual Studio .NET and Visual Studio 6.0. The build produces tesseract.exe, which runs successfully on my PC and converts a TIFF image into text. The output is not perfect, but it is good enough to start exploring this open source code.
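To try the TIFF-to-text step from a script, something like the following rough Python sketch could work. It assumes the compiled executable is reachable on PATH under the name tesseract and is called in the form `tesseract <image> <output base>`, which writes the text to `<output base>.txt`. The function names and file names here are my own placeholders, not part of Tesseract.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, exe="tesseract"):
    """Return the argument list for: tesseract <image> <output_base>."""
    return [exe, str(image_path), str(output_base)]


def ocr_image(image_path, output_base, exe="tesseract"):
    """Run Tesseract on one image and return the recognized text.

    Returns None when the executable cannot be found on PATH
    (an assumption of this sketch; adjust `exe` to your build output).
    """
    if shutil.which(exe) is None:
        return None
    # check=True raises CalledProcessError if Tesseract exits with an error.
    subprocess.run(build_tesseract_command(image_path, output_base, exe), check=True)
    # Tesseract appends ".txt" to the output base name.
    result_file = Path(f"{output_base}.txt")
    return result_file.read_text(encoding="utf-8", errors="replace")
```

Calling `ocr_image("scan.tif", "scan_out")` would then run the engine on a hypothetical scan.tif and return the contents of scan_out.txt, or None if the executable is not found.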
To use it, we need to follow the Apache 2.0 license: http://www.apache.org/licenses/LICENSE-2.0

A commercial quality OCR engine originally developed at HP between 1985 and 1995. In 1995, this engine was among the top 3 evaluated by UNLV. It was open-sourced by HP and UNLV in 2005. (NOTE: We're migrating to code.google.com. Please see the forums.)
For more details about Google's open source OCR announcement, see:
http://sourceforge.net/projects/tesseract-ocr
http://google-code-updates.blogspot.com/2006/08/announcing-tesseract-ocr.html
