What's The Big Deal About Big Data?

With the continuing growth in the volume of data and the variety of information collected, IT departments face many challenges and costs in ensuring the accuracy, security and storage of that information.

Big data has become a hot-button topic in recent months. The term describes the multitude of information entering and being stored by businesses. There is an unprecedented amount of data currently available to organizations. In fact, IBM estimates that 2.5 quintillion bytes of data are created every day around the world.

Insurance organizations are also part of this movement. Insurers store a lot of information on their policyholders, including unstructured information from social media conversations, customer and product information, transactional information, video, location-based information – the list goes on and on.

Many insurers believe in collecting as much information as possible on the policyholder and are allocating budget and resources to do just that. But what is the point of all this information? Are the hassle and cost worth it?

Insurers need to remember that the raw information contained in a database isn’t valuable in and of itself – it’s the insight that can be gleaned from the data that is important. Therefore, stakeholders need to ensure the quality and structure of the information they are collecting so data can be easily segmented and analyzed. Without the ability to gain business knowledge from the multitude of information, the massive big data effort is simply a waste of time.

To optimize the data for analysis, insurance organizations need to first take a step back and think about what they want to accomplish. Most will say they want to become more efficient, provide a better policyholder experience or make smarter business decisions. While these are all worthwhile initiatives, how will insurers achieve those goals through data?

The answer isn’t simple. Not all data sets are created equal, and each has its own challenges. In addition, the value of one piece of information will be different to each organization, department or even individual. There isn’t a golden rule that says that if you collect this type of information, you gain an understanding of X, Y and Z and achieve your business objectives.

Business leaders need to take some time to think about what information is valuable and how that data will help them achieve their goals and objectives. From there, they need to figure out how to mine that information and put it into a digestible summary that can be utilized.

Part of mining that information is ensuring the standardization and accuracy of a core set of data elements. These elements can then be used as unique identifiers to help segment, merge and aggregate records.

For some insurers, this may be a policy number or a Social Security number. However, there is another set of basic records that needs to be kept accurate. Contact information is unique to each policyholder and helps identify households, allowing insurers to communicate with their customers.

While talking about contact data seems rudimentary when discussing a sophisticated and complex topic like big data, it is something insurers are still struggling to keep accurate. According to a recent Experian QAS study, 92% of organizations suspect their customer and prospect data might have inaccuracies. On average, respondents suspect that as much as 25% of information is inaccurate.

To make sure data can be effectively accessed, insurers need to start by cleaning and standardizing the basic contact data within their database. Contact information can easily be converted to a set of standards through software tools. For instance, a mailing address can be reformatted to match the standards set by the country’s national postal authority. Then stakeholders can use matching techniques to remove duplicate records, which are another common problem for databases.
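The standardize-then-match sequence described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any particular software tool: the abbreviation table, the field names (`name`, `address`) and the helper functions are all hypothetical, and a production system would rely on the postal authority's published standard and on fuzzier matching than the exact-key comparison used here.

```python
import re

# Hypothetical abbreviation table; a real one would follow the
# national postal authority's published addressing standard.
STREET_ABBREVIATIONS = {
    "street": "ST",
    "avenue": "AVE",
    "road": "RD",
    "drive": "DR",
    "boulevard": "BLVD",
}


def standardize_address(address):
    """Strip punctuation, collapse whitespace, uppercase, abbreviate street types."""
    cleaned = re.sub(r"[^\w\s]", "", address)
    words = cleaned.split()
    return " ".join(STREET_ABBREVIATIONS.get(w.lower(), w.upper()) for w in words)


def match_key(record):
    """Build a comparison key from the standardized name and address (assumed fields)."""
    name = " ".join(record["name"].lower().split())
    return (name, standardize_address(record["address"]))


def deduplicate(records):
    """Keep the first record seen for each match key and drop later duplicates."""
    unique = {}
    for record in records:
        unique.setdefault(match_key(record), record)
    return list(unique.values())


policyholders = [
    {"name": "Jane  Doe", "address": "12 Main Street."},
    {"name": "jane doe", "address": "12 main st"},
    {"name": "John Roe", "address": "5 Oak Avenue"},
]
clean = deduplicate(policyholders)
```

In this sketch the first two entries collapse into one record because they standardize to the same name and address, leaving two policyholders in `clean`.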

By ensuring that data is properly formatted and duplicates are removed, insurance organizations can guarantee that each policyholder only has one entry in a central database. This singular, clean customer view allows for accurate advanced analysis and gives one central record for each contact to append new information.

Once insurers identify and eliminate inaccuracies within their existing data, they need to identify their data entry methods and channels and put software or processes in place to prevent new errors or duplicates from entering the database.
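As a rough illustration of a point-of-entry check, the sketch below screens a new contact record before it is saved. Everything in it is an assumption made for the example: the required fields, the simplified email pattern and the use of email as the duplicate check are hypothetical choices, and a real entry process would typically verify addresses and other details against reference data as well.

```python
import re

# Deliberately simple pattern, assumed for illustration only.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Hypothetical set of fields every new contact record must carry.
REQUIRED_FIELDS = ("name", "address", "email")


def entry_problems(record, existing_emails):
    """Return a list of problems that should block the record from being saved."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not record.get(field, "").strip():
            problems.append(f"missing {field}")
    email = record.get("email", "").strip().lower()
    if email and not EMAIL_PATTERN.match(email):
        problems.append("malformed email")
    elif email in existing_emails:
        problems.append("duplicate email")
    return problems


existing = {"jane@example.com"}
new_entry = {"name": "Jane Doe", "address": "12 Main St", "email": "JANE@example.com"}
issues = entry_problems(new_entry, existing)
```

Here `issues` flags the entry as a duplicate, because the email matches an existing record once letter case is ignored. An empty list would mean the record is safe to add.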

By ensuring the accuracy of basic contact information and a single record for each policyholder, insurance organizations can more easily aggregate and analyze the information in their big data initiative.

Insurers will still need to deal with large amounts of information, and big data will continue to be a challenge. But by identifying what information is actually useful and ensuring that information can be used for specific business goals, insurers will avoid collecting, storing and securing data simply for the sake of having a multitude of information. And with accurate contact data, insurance organizations can ensure they are able to better aggregate that information to gain the business knowledge that makes this initiative worthwhile.

About the author: Thomas Schutz is SVP and GM for Experian QAS North America, serving as the company's top executive for all strategic business decisions in the United States and Canada. He can be reached at [email protected].