Insurance & Technology is part of the Informa Tech Division of Informa PLC


Data & Analytics


Teaching Your Computer To Read

Does optical character recognition really work?

Extending the Value

Having enjoyed the benefits of a Dakota Transform OCR system for processing health claims forms, The Guardian Life Insurance Co. (New York, $32.4 billion in assets) has moved on to other uses, according to Scott Schupbach, application engineer. "In 1994 we set up a medical application, ran that for 18 months, and then began developing a dental forms scanning process," he says. Since all forms were previously entered manually, "it was a tremendous turnaround," he adds. "Character recognition accuracy has been right around 99 percent."

Guardian is now in the process of developing OCR-capable life enrollment forms, but Schupbach sees that as only a beginning. "They just happen to be the next forms in line," he says. "We're looking at where we have the highest volume of forms done by data entry and eliminating that need by creating an OCR process."

As long as the insurance industry continues to receive hard copy, "the future is unlimited," Schupbach says. "Any time you can standardize a form or piece of data you have the possibility of an OCR process."

That future also includes more economical solutions than in the past. "Volume of forms processed is a factor, but type of form and process are more important," Schupbach says. "One of the most important things for any business trying to adopt this kind of process is to determine whether a basic off-the-shelf software is sufficient for its needs. One of our critical items for each is how it's captured in the mainframe system and how it is identified by the user. We didn't find an off-the-shelf package to handle those needs."

If a company chooses to work with one of the major OCR solutions companies, such as RRI, Dakota and Captiva, it can expect "a great deal of robust integration, finely tuned specifically to its operation," says Cardiff's Clerke. However, insurers will pay for this level of customization. "The ratio of professional services to software investment could be as high as three dollars to one," Clerke adds.

Such systems cost enough that companies may not be able to afford them unless they are running a minimum of 50,000 to 100,000 documents a day, Clerke adds. "With Cardiff you can get into applications with 2,000 to 3,000 documents that can justify automation, because it's tailored to a more general audience."

Another advantage Clerke believes Cardiff offers insurers is its progress in Web-based processing applications. Cardiff partnered with Adobe two-and-a-half years ago to develop PDF-based forms solutions, and its FormDesigner currently ships in every box of Adobe Acrobat, Clerke says. "PDF has the inherent beauty of the document looking identical online to offline," he says. It also lets companies use documents that have already been approved from a legal standpoint.

Noting that the e-signature act is helping pave the way for online document processing, Doculab's Turocy asks, "Why not use the Web as an input device rather than having the paper forms come in? You can then get positioned to reap the benefits not only of inputting data, but also more enhanced customer service and self-service capabilities."

Besides Web-enablement, several other trends are likely to make OCR a more usable technology in the insurance industry. Clerke identifies three. First, the symbologies of OCR (the sets of characters it can read) have expanded to include not only machine print but also hand print, high bar codes and even cursive writing.

Second, "voting" technologies allow the combined use of different OCR engines. "Instead of saying, 'Pick one engine,'" Clerke explains, "we say, 'Let's take all of these and leverage their strengths.'"
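One simple way to picture voting is a per-character majority across engines. The sketch below is only an illustration, not a description of Cardiff's product: the function name `vote` is hypothetical, and it assumes every engine returns a reading of the same length for the same field, which real engines do not guarantee.

```python
from collections import Counter


def vote(readings):
    """Combine several engines' readings of one field, character by character.

    Assumption for this sketch: all readings have the same length, so the
    characters line up position by position.
    """
    if not readings:
        raise ValueError("need at least one reading")
    length = len(readings[0])
    if any(len(reading) != length for reading in readings):
        raise ValueError("this sketch assumes readings of equal length")

    result = []
    for chars in zip(*readings):
        # Keep the character most engines agree on; on a tie, the
        # first-seen character wins.
        winner, _count = Counter(chars).most_common(1)[0]
        result.append(winner)
    return "".join(result)


# Three hypothetical engines read the same numeric field; two of them
# confuse the letter O with the digit 0 in different positions.
combined = vote(["10O45", "1O045", "10045"])
```

Each engine's individual mistake is outvoted by the other two, which is the sense in which the combination leverages their strengths.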

Lastly, information captured through OCR can now be interfaced with other information for a variety of purposes. "We have assist programs that will do validation of every valid address in the US against an incoming address field on a form. In the insurance space, you can tie your existing back-end systems, which have all the logic with respect to claims, with valid coverage, which can be built as an additional assist or automation tool as these forms roll through," he says.
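The address check Clerke describes can be imagined as normalizing the captured field and looking it up in a reference list. The following is a minimal sketch under stated assumptions: the abbreviation table, the function names and the single reference address are all invented for illustration, and a production system would rely on a full postal database rather than an in-memory set.

```python
import re

# Hypothetical, deliberately tiny abbreviation table.
ABBREVIATIONS = {"street": "st", "avenue": "ave", "road": "rd", "suite": "ste"}


def normalize(address):
    """Lowercase, drop punctuation, and abbreviate common street words."""
    words = re.sub(r"[^\w\s]", " ", address.lower()).split()
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


def is_known_address(address, reference):
    """Return True if the normalized address appears in the reference set."""
    return normalize(address) in reference


# Invented reference data standing in for a real address database.
REFERENCE = {normalize("100 Example Avenue, Springfield, IL 62701")}

captured = "100 EXAMPLE AVE  Springfield IL 62701"  # as read from a form
valid = is_known_address(captured, REFERENCE)
```

Normalizing both sides before comparing lets differences in case, punctuation and spacing pass, while a genuinely different address still fails the lookup.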


Fuzzy OCR

If a computer can read one kind of insurance document, such as a claim form, how hard can it be to read another, such as an invoice? The answer, according to Reynolds Bish, president, Captiva Software (San Diego), is, "Pretty hard.

"Historically it has been difficult or impossible to apply this technology to these documents because we have to design a template for each document to know where to look for the relevant information," Bish says. "For invoices, there are literally thousands of different formats flowing into an organization every day and they're very unpredictable."

To address this, Captiva has released InvoicePack, an application that uses what might be called "fuzzy OCR" to overcome the need to create a unique template for every invoice. InvoicePack produces a set of loosely designed templates intended to cover the more common invoices in a customer's business. "It recognizes all the information on the forms, and then begins to parse through the results looking for the desired information" with user-defined rules and keyword searches, Bish says.
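The idea of parsing recognized text with rules and keywords, instead of looking at fixed positions on the page, can be sketched as follows. This is not InvoicePack's implementation: the rule set, field names and sample text are assumptions made for illustration, with regular expressions standing in for user-defined rules.

```python
import re

# Hypothetical rules: each one finds a keyword label and captures the
# value that follows it, wherever the label appears on the page.
RULES = {
    "invoice_number": re.compile(
        r"invoice\s*(?:no\.?|number|#)\s*:?\s*([A-Z0-9-]+)", re.IGNORECASE
    ),
    "total": re.compile(
        r"(?:total|amount)\s+due\s*:?\s*\$?\s*([\d,]+\.\d{2})", re.IGNORECASE
    ),
}


def extract_fields(ocr_text, rules=RULES):
    """Search the full recognized text for each field; None if not found."""
    found = {}
    for name, pattern in rules.items():
        match = pattern.search(ocr_text)
        found[name] = match.group(1) if match else None
    return found


# Invented OCR output for an invoice whose layout was never templated.
sample = "ACME Corp\nInvoice No: A-1042\nTotal Due: $1,250.00\n"
fields = extract_fields(sample)
```

Because the rules key on labels and not on coordinates, the same rule set can apply to invoices from different vendors, at the cost of missing any field whose label the rules do not anticipate.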

Learning Curve

Results falling below a specified confidence level cause the image to be routed to an operator, who can refine the template to yield optimal results, automatically storing the newly created template for future use, according to Bish. The system thus acts like a neural network, constantly growing more intelligent and producing better results, with fewer exceptions, he says.
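The routing step can be pictured as a threshold test followed by saving whatever the operator refines. The sketch below is an assumption-laden illustration: the 0.90 threshold, the function names and the dictionary used as a template store are all hypothetical, and it is a plain lookup table, not a neural network.

```python
def route(fields, threshold=0.90):
    """Decide whether one recognized document needs an operator.

    fields maps a field name to a (value, confidence) pair, with
    confidence between 0 and 1. Returns the route and the names of the
    fields that fell below the threshold.
    """
    low = sorted(
        name for name, (_value, confidence) in fields.items()
        if confidence < threshold
    )
    return ("operator", low) if low else ("auto", [])


def save_refined_template(templates, layout_key, template):
    """Store the operator's refined template so later documents reuse it."""
    templates[layout_key] = template
    return templates


# Invented recognition results: one field was read with low confidence.
decision = route({
    "total": ("1,250.00", 0.97),
    "invoice_number": ("A-1O42", 0.62),
})

templates = {}
save_refined_template(templates, "acme-invoice", {"invoice_number": "top right"})
```

Each saved template removes one source of future exceptions, which is how the operator workload would be expected to shrink over time.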

Anthony O'Donnell has covered technology in the insurance industry since 2000, when he joined the editorial staff of Insurance & Technology. As an editor and reporter for I&T and the InformationWeek Financial Services of TechWeb he has written on all areas of information ...
