What is your data trying to tell you?
When I start work on a project with a new client, I often hear the same anecdotes repeated again and again. The reason is that these examples are symptoms of the underlying problems; they demonstrate the pain that can result from poor data quality. The normal situation is that no detailed analysis of the data has been performed and there are no metrics showing the cost of non-quality.
You can learn so much from your data if you just make the effort to understand it. The anecdotes are high-profile examples of problems that exist, but you need to open your eyes to the full extent of your exposure to poor quality data. Data Profiling has a role to play, whether you choose to use a tool or write your own queries, but there's a lot more to understanding your data. The interpretation of the Data Profiling results is critical: what do the results tell you about your data, your procedures and your systems?
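To give a feel for what "writing your own queries" can mean in practice, here is a minimal profiling sketch in Python. It is only an illustration: the `profile_column` helper, the sample `customers` records and the choice of statistics are all assumptions made for this example, not part of any particular tool.

```python
from collections import Counter


def profile_column(rows, column):
    """Return basic profiling statistics for one column of a list of dicts."""
    values = [row.get(column) for row in rows]
    # Assumption: None and blank strings both count as missing.
    present = [v for v in values if v is not None and str(v).strip() != ""]
    counts = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "most_common": counts.most_common(3),
    }


# Hypothetical sample records, invented for illustration only.
customers = [
    {"id": 1, "country": "UK", "postcode": "SW1A 1AA"},
    {"id": 2, "country": "UK", "postcode": ""},
    {"id": 3, "country": "United Kingdom", "postcode": None},
    {"id": 4, "country": "FR", "postcode": "75001"},
]

country_profile = profile_column(customers, "country")
postcode_profile = profile_column(customers, "postcode")
print(country_profile)
print(postcode_profile)
```

The numbers are only the starting point. Here the country column has no missing values but holds both "UK" and "United Kingdom", which hints at a missing reference list or uncontrolled data entry, and that is exactly the kind of interpretation the profiling results need.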
The method I use to understand data has three main steps. The first step is to identify who the key stakeholders are and understand which data entities and attributes are critical to the success of the business. This allows me to define what work needs to be done and plan what resources are needed. The second step is to Measure & Analyse the data quality to understand what impact non-quality data is having. This is an iterative process involving Business Impact Workshops, where we define key metrics and rules, and Information Quality Assessments, where we apply those rules and measure the data quality.
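The idea of defining rules and then applying them to measure quality can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the assessment method itself: the rule names, the `assess` function and the sample `orders` records are all hypothetical, and real rules would come out of the workshops with the business.

```python
import re

# Hypothetical rules, each a function that returns True when a record passes.
RULES = {
    "email_present": lambda r: bool((r.get("email") or "").strip()),
    "email_format": lambda r: re.fullmatch(
        r"[^@\s]+@[^@\s]+\.[^@\s]+", r.get("email") or ""
    ) is not None,
    "order_value_non_negative": lambda r: (
        isinstance(r.get("order_value"), (int, float)) and r["order_value"] >= 0
    ),
}


def assess(rows, rules):
    """Return the fraction of rows that pass each rule."""
    results = {}
    for name, rule in rules.items():
        passed = sum(1 for row in rows if rule(row))
        results[name] = passed / len(rows) if rows else 0.0
    return results


# Hypothetical sample records, invented for illustration only.
orders = [
    {"email": "a@example.com", "order_value": 120.0},
    {"email": "", "order_value": 35.5},
    {"email": "not-an-email", "order_value": -10},
    {"email": "b@example.com", "order_value": 0},
]

scores = assess(orders, RULES)
for name, score in scores.items():
    print(f"{name}: {score:.0%} of records pass")
```

Keeping the rules separate from the code that applies them suits the iterative nature of the process: when a workshop changes or adds a rule, the same assessment can be run again and the scores compared over time.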
The final step is to present the findings. I insist on doing this formally and involving all of the stakeholders; if someone has claimed executive responsibility for data quality, this is when they prove that they meant it. I've never gone through this process without unearthing something new and of interest to the sponsor. The usual response is for them to ask what they can do to improve the data quality, but I also point out the importance of protecting good data and keeping control of their data quality on an ongoing basis.

Steve, thanks for this post. I've been trying to explain the importance of a more rigorous approach to data quality to my boss (more than just anecdotal evidence). Your explanation and the diagram were both useful in convincing him of the case.
Cheers,
Andy
Posted by: Andrew Morecambe | 22 December 2005 at 00:46