Sometimes Good Is Good Enough

Though design is important, with the ever-decreasing cost of storage and increasing processor speeds, it can make more sense to sacrifice some storage efficiency if query efficiency is not compromised and the availability of the BI tools to the business can be expedited.

Step 2 - Getting Enterprise-Wide Information Ready for Use

A notable author once said, "If I worked on my book until it was perfect, I'd never get it published." One could say the same about how companies should organize their data and ready it for business intelligence (BI) applications. In Part 1 of this article series, we discussed some basic tenets for why and how an enterprise-wide information inventory should be performed. If we roll the clock forward and assume that the inventory has been completed, it is now time to organize all that data into a form suitable for data exploration, data mining, predictive modeling and other BI efforts.

Data schema design exercises are important, but they should only be a means to an end. History shows that many BI efforts have stalled or dragged on too long as database architects sought the perfect data warehouse design, pursued ideal relational data normalization, or embellished the business or technical scope of the effort with features that would rarely, if ever, be used. As an analogy, just think about how many features are in modern PC-based word processing programs that 95 percent of users will never use. With that in mind, be careful not to spend time designing information repository structures and features that include more "nice to haves" than "gotta haves." Focus the database schema effort on practicality and ease of use, and on capturing and storing the data the business most needs to help answer its most pressing tactical and strategic questions.

Clearly, design is important for technical data retrieval reasons. But with the ever-decreasing cost of storage and increasing processor speeds, it can make more sense to sacrifice some storage efficiency if query efficiency is not compromised and the availability of the BI tools to the business can be expedited.

With the schema built, the databases now need to be populated. Nearly every company says that its data needs extensive cleaning and repair. Frequently the lament is that enterprise data is too dirty, too disjointed or too incomplete to be rapidly useful without extensive data quality efforts. However, businesses should challenge themselves in this thinking. Remember, most of the internal data didn't come from some remote source; this data is the company's own data and has been used to run and manage various aspects of the business in the past. The data should already have basic credibility. Can the data really be that bad? Is disparate data that needs to be standardized into a common form really that hard to aggregate? Does the standardization of the data need to be perfect? The common reality is that most enterprise data is quite usable, since BI data typically doesn't need to be used in the same way accounting data is used (i.e., it doesn't necessarily have to balance to the penny or provide a perfect view of things). Often such data can provide enormous business value, despite imperfection, because much of BI is about looking for patterns and signals rather than precise accounting or auditing outcomes.

So given these inevitable data imperfections, how does a business know whether it has enough data, or the right data, for its BI objectives? In the same way an inventory of available internal and external data should be performed at the outset of the effort, an inventory of BI needs and questions needing answers should also be prepared. By then carefully analyzing the information needs of these questions, the majority of necessary data can be assembled and included in the BI repository design. Again, history shows that a lot of BI projects have not taken this pragmatic approach to planning - that is, making available at the start only the information most necessary to provide incremental business value, so that the project can succeed early on by delivering ROI.
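
As a rough illustration of how the two inventories can be lined up, the Python sketch below maps each BI question to the data elements it needs and compares the result with the data inventory. The questions, the data element names and the data_requirements helper are all hypothetical, invented here for illustration only.

```python
# Hypothetical inventory of BI questions, each mapped to the data elements it needs.
bi_questions = {
    "Which customer segments are growing fastest?": {"sales", "customers"},
    "Which products are most often bought together?": {"sales", "products"},
    "Which campaigns drive repeat purchases?": {"sales", "customers", "campaigns"},
}

# Hypothetical result of the enterprise-wide data inventory.
available_data = {"sales", "customers", "products", "hr_records"}


def data_requirements(questions, available):
    """Compare what the BI questions need with what the inventory found."""
    needed = set().union(*questions.values())
    return {
        "ready": needed & available,           # needed and already inventoried
        "gaps": needed - available,            # needed but not yet available
        "not_yet_needed": available - needed,  # available, but no question asks for it
    }


requirements = data_requirements(bi_questions, available_data)
for label, elements in requirements.items():
    print(f"{label}: {sorted(elements)}")
```

In this made-up case, three data elements are ready to load, one is a gap to be sourced, and one can be left out of the initial repository design because no current question depends on it.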

The BI repository doesn't need every bell and whistle, every data feature or every potential data source. Rather, the 80/20 rule should heavily influence the initial design and rollout of the BI effort - the design solution should respond to 80 percent of the needs and be prepared to leave the other 20 percent unaddressed until a later date. Use the success of each phase of the project to feed and fund future phases.
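
One possible way to make the 80/20 rule concrete during planning is sketched below in Python. It is only a simple heuristic under stated assumptions, not a prescribed method: the questions, the data source names and the plan_first_phase function are hypothetical, and every question is treated as equally valuable. The sketch repeatedly picks the unanswered question that needs the fewest additional data sources until the target share of questions can be answered, and defers the rest to a later phase.

```python
def plan_first_phase(question_needs, target=0.8):
    """Pick data sources until `target` share of the questions is answerable.

    Heuristic: at each step, take the unanswered question that requires the
    fewest additional sources (ties broken alphabetically) and add its sources.
    Returns the chosen sources and the questions deferred to a later phase.
    """
    chosen = set()
    total = len(question_needs)

    def answerable():
        return [q for q, needs in question_needs.items() if needs <= chosen]

    while total and len(answerable()) / total < target:
        pending = [q for q, needs in question_needs.items() if not (needs <= chosen)]
        if not pending:
            break
        cheapest = min(pending, key=lambda q: (len(question_needs[q] - chosen), q))
        chosen |= question_needs[cheapest]

    deferred = [q for q, needs in question_needs.items() if not (needs <= chosen)]
    return chosen, deferred


# Hypothetical BI questions and the data sources each one depends on.
question_needs = {
    "A: monthly revenue trend": {"sales"},
    "B: revenue by customer segment": {"sales", "customers"},
    "C: customer counts by region": {"customers"},
    "D: online browsing vs. purchases": {"sales", "customers", "web_logs"},
    "E: supplier risk by area demographics": {"supplier_feeds", "external_demographics"},
}

first_phase_sources, deferred_questions = plan_first_phase(question_needs)
print("Load first:", sorted(first_phase_sources))
print("Defer:", deferred_questions)
```

Here three sources are enough to answer four of the five questions, and the one question that would require two more sources is left for a later phase. In practice the questions would likely be weighted by business value rather than counted equally.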

A lot of this discussion seems like common sense; however, many BI projects struggle with these very issues. When preparing data for enterprise-wide BI use, focus on broad needs that will provide value to the many rather than the few. This is the heart of the 80/20 rule and a very useful principle for successful BI project planning. By following this rule, the organization will challenge itself to avoid scope creep and recognize that, in many cases, project success can best be achieved when planners and managers say that what's been done and what is ready for use is good enough to start generating business value.

John Lucker is principal, Deloitte Consulting LLP, Practice Leader - Advanced Quantitative Services (Data Mining & Predictive Modeling). He can be reached at [email protected], 860-543-7322.
