Teaching Your Computer To Read

Does optical character recognition really work?

Optical character recognition (OCR) is one of those inventions that seem to embody the magical possibilities of technology: you put something ordinary in at one end and it comes out marvelously transformed at the other. The fact that its process mimics the use of our most sophisticated sense, sight, makes it all the more appealing. It also seems a perfect fit for insurance, an industry swimming in paper as it struggles to automate.

But OCR has had a checkered past in the insurance industry, and to the extent it has succeeded, it has been largely limited to processing health insurance claim forms. Its promise seems great, but many IT professionals remain skeptical—and not without some justification.

"The most common sentence you'll ever hear with the term OCR in it is, 'We tried OCR and it doesn't work,'" says Sandeep Goel, president and CEO of Dakota Imaging, a Columbia, MD-based provider of OCR solutions. "The reason is that, historically, too many projects have failed."

Those failures stem from many factors, all rooted in misunderstandings of what goes into a successful solution and what one ought to expect to come out of it, according to Goel. The first thing to get clear is where OCR technology fits in with insurance systems, he says. OCR engines, while they differ in capabilities, all basically work and are commonly used by all the industry solution providers.

The OCR engines themselves, in other words, are not the problem. "You don't want to deploy OCR, per se; you want to deploy a business solution that automates transaction processing for your market and your specific application," Goel says.

Such solutions have been in place for a couple of decades, according to Chris Thompson, executive vice president, Recognition Research, Inc. (RRI, Blacksburg, VA). "The old systems were very expensive, difficult to deploy and they could generally process only a third to a half of the claims they were fed. If the claim forms weren't close to perfect, they were rejected," he says. "They didn't have the same kind of accuracy and payback as the current systems do, and a number of the installations failed."

That state of affairs prevailed all the way to the mid-1990s, Thompson says. "But," he adds, "people who haven't looked at OCR solutions in a few years should look again."

The potential for OCR use across the insurance industry is great, Thompson says, and "in virtually every case it will pay for itself in less than a year." The technology is suitable for any area of insurance, he adds, but "payoff is generally greatest in health, because we've spent so much time tweaking it."

Dan Elam, president of content and process management consulting firm eVisory (Richmond, VA), says, "I think there are a lot of managers, particularly older ones, that don't realize this technology can work for them, so they're not even looking at it." The technology has changed dramatically since its early days, he adds, and perceptions of the technology do not always reflect how it has evolved.

Disappointments still occur, "but it's not that the technology doesn't work," Elam says. "It's either that it wasn't implemented properly, or it's been oversold. Companies need to understand what OCR can actually do."

What it cannot do is provide 100 percent automation of forms processing, according to Dennis Clerke, president and CEO of Cardiff (Vista, CA), which provides off-the-shelf OCR solutions. "If you have 10 or 20 people keying in data from claims forms, with OCR you'll bring it down to two. You'll take 80 to 90 percent of the manual entry away, but there will still be human intervention," he adds.

Rocket Science

Sometimes success or failure is only a matter of setting realistic expectations, RRI's Thompson says. But that is certainly not always the case. Constructing an OCR solution "is very difficult to do, and a lot depends on the skill of the engineer configuring for a given insurer. There can be a lot of complications based on how a company's legacy systems need to see the data," he comments, adding, "There's a lot to these systems. Ours has over 3 million lines of code in it; that's 60 times as much code as runs the space shuttle."

That complexity is what makes the experience logged in the health insurance forms area so valuable, Thompson claims. "We've got tens of thousands of man-hours into just optimizing our system to do the standard HCFA 1500 and UB92 claim forms," he says.

The absence of that expertise can lead to the kind of thing that gives OCR a bad name, judging by the experience of Jeanne Smith, manager, data capture, GHS Data Management (Augusta, ME), a service bureau that captures health insurance form information for the state of Maine's Medicaid program. "Two years ago I was one of those people who would ask why anybody would want to use OCR and would say, 'It does not work: the process is slow and the information we're sending back to the client is incorrect.'"

GHS worked with a reseller of a software package that sent its own personnel to install it. "We almost lost our biggest contract because of it," Smith says. Though the project was scheduled to last six weeks, "it was pure hell for two years," she adds.

GHS subsequently worked with RRI on a new system. "From the beginning, to where we passed the hurdle of where the client expected the accuracy to be, took about five months," she says. It took a good deal of tweaking, but the system now runs with an error rate of less than one percent, Smith reports.

While vendor experience has tended to keep OCR concentrated in health insurance, it is the standardization of health claims forms that made that area an attractive target for the technology in the first place. "OCR will be more viable for other areas of insurance once they get more control of the forms they use," says Pat Turocy, principal analyst, Doculabs (Chicago). "Anyone can take advantage of OCR where someone is filling out a standard form to apply for coverage, be it life, health, property and casualty, or whatever."

