Key Attributes and Limitations
- Data is in human-readable form
- Steadily advancing area of optical-read technology, characterised by
systems capable of high-speed, accurate recognition that can handle
multiple fonts and distorted characters, with the level of
performance naturally reflected in the cost
- Low-cost, PC-based systems available for document management
applications
- Variety of character formation techniques, primarily
printer-based technology, with costs determined by the type and
quality of printer.
- Close-proximity scanning generally required to capture images
- No encoded error control (error detection and correction); accuracy
relies on the processing capability used to recognize characters
Forty years ago, before bar code technology was a gleam in the
grocery industry’s eye, OCR was being used in commercial
applications. The technology was initially designed to read highly
stylized human-readable fonts, such as OCR-A, which encodes the
alphanumeric character set as well as 60 other shapes.
In 1975, OCR-A was adopted by the National Retail Merchants
Association (now known as the National Retail Federation, or NRF) as
the standard font for merchandise identification, credit
authorization, and inventory control. However, poor supplier source
marking and unreliable scanning equipment prompted a shift in
the 1980s to bar code source marking of general merchandise, which
proved to be much more successful.
The evolution of high-powered desktop computing has benefited OCR
reading technology over the last few years, allowing for the
development of more powerful recognition software that can read a
variety of common printer fonts. High-end systems use sophisticated
neural networks, which enable the system to improve its read accuracy
over time by learning the nuances of a particular font and even
varying styles of unconstrained handwriting. Most OCR systems today
are font-independent and are available in three different
configurations: page readers, transaction readers (usually numerical
only), and handheld readers.
Characters are illuminated with a light source and scanned, producing
an image that is interpreted by the recognition software. The software
uses one of two approaches to character analysis: template matching
(comparing the character against a database of known character
shapes) or feature extraction (analyzing the structural elements of
the character).
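
The two approaches to character analysis can be illustrated with a
minimal Python sketch. Everything in it is an assumption made for
illustration: the 5x5 glyph bitmaps, the names TEMPLATES, NOISY_T,
match_template and extract_features, and the particular features
chosen are hypothetical and are not taken from any real OCR product,
which would work on much larger images with far richer models.

```python
# Hypothetical 5x5 glyph bitmaps ('#' = ink, '.' = background).
# Real OCR systems use far larger images; these are illustrative only.
TEMPLATES = {
    "I": ["..#..",
          "..#..",
          "..#..",
          "..#..",
          "..#.."],
    "L": ["#....",
          "#....",
          "#....",
          "#....",
          "#####"],
    "T": ["#####",
          "..#..",
          "..#..",
          "..#..",
          "..#.."],
}

# A scanned "T" with one stray ink pixel, standing in for a distorted character.
NOISY_T = ["#####",
           "..#..",
           "..#.#",
           "..#..",
           "..#.."]


def match_template(glyph):
    """Template matching: return (best_char, score), where score is the
    fraction of pixels that agree with the best-matching template."""
    def score(a, b):
        total = sum(len(row) for row in a)
        same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
        return same / total

    best = max(TEMPLATES, key=lambda ch: score(glyph, TEMPLATES[ch]))
    return best, score(glyph, TEMPLATES[best])


def extract_features(glyph):
    """Feature extraction: describe a glyph by simple structural
    elements (strokes and bars) rather than by its raw pixels."""
    height, width = len(glyph), len(glyph[0])
    rows = [row.count("#") for row in glyph]
    cols = [sum(row[c] == "#" for row in glyph) for c in range(width)]
    return {
        "full_rows": sum(n == width for n in rows),   # full-width horizontal strokes
        "full_cols": sum(n == height for n in cols),  # full-height vertical strokes
        "top_bar": rows[0] == width,
        "bottom_bar": rows[-1] == width,
    }


if __name__ == "__main__":
    print(match_template(NOISY_T))
    print(extract_features(TEMPLATES["T"]))
```

In this sketch the template matcher tolerates the stray pixel because
it picks the closest bitmap rather than demanding an exact match,
while the feature extractor reduces each glyph to a handful of
structural properties that a classifier could then compare.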
OCR shines in applications where human readability is required, in
electronic document processing and management, and in high-volume
scanning of numerical transaction data. Neural network-based OCR
systems are making headway in reading unconstrained handwriting, but
such systems are still far from accurate and prohibitively expensive.
For large commercial applications, font-independent OCR systems are
considerably less accurate than those dedicated to an OCR font, and
even a dedicated OCR system is less accurate than a bar code-based
system. For this reason, OCR technology is not likely to have a great
impact on AIDC applications. However, OCR will continue to serve and
grow in its established niches, particularly in electronic document
processing and management applications, and in those industrial
applications where bar code symbology is ruled out because it is not
human-readable.
Acknowledgement: Some of the information on AIDC pages is based on
the information in AIMGlobal's website. We would like to thank
AIMGlobal for this.