Data Quality – Cause or Symptom?
It seems to me that if you work hard enough, you could make a pitch for just about any problem being a data quality problem. But you don't have to work very hard with this example, reported by BBC News under the headline "Probe into Japan share sale error": instead of selling one share for 610,000 yen, a trader at Mizuho Securities entered the figures the wrong way around and tried to sell 610,000 shares at just 1 yen each.
Of course this is a data quality problem: the figures are clearly wrong. But the bad data is only a symptom of the original problem, not the cause. If the trader had checked the price for these shares, he would have seen that the figure he was using was totally at odds with the normal trading range. If I were a process expert rather than a data quality one, I'd be claiming this as an example of poor business process.
So is that it? Should we quit wasting time trying to resolve data quality issues and put all of our efforts into process improvement instead? My answer is an emphatic no. Poor data quality is often a symptom of poor business processes, but improving and protecting the quality of an organization's information asset requires a rigorous data-centric approach.
Data validation should form part of any business process, not be regarded as something completely separate. We're used to systems validating data on a field-by-field basis when we enter it, but this rarely goes beyond making sure that the correct fields are populated and in a valid format. This is not always enough, as Mizuho Securities discovered.
Imagine if the trading system had checked the share price, spotted the inconsistency and prevented the erroneous sale from proceeding. Now is that a process improvement or a data quality validation?
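
To make that question concrete, here is a minimal sketch of what such a check might look like. It is not how any real trading system works: the `SellOrder` type, the `validate_order` function, the `EXAMPLE` symbol, the 50% tolerance and the 610,000 yen reference price are all assumptions made up for illustration. The point is only that the swapped figures pass every field-level check and fail as soon as the price is compared with a reference value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SellOrder:
    """Hypothetical order record, invented for this illustration."""
    symbol: str
    quantity: int
    price_yen: int


def validate_order(order, reference_price_yen, max_deviation=0.5):
    """Return a list of problems; an empty list means the order passes.

    reference_price_yen stands in for the "normal trading range" and
    max_deviation (50% here) is an arbitrary illustrative tolerance.
    """
    problems = []

    # Field-by-field checks: each value is populated and well formed.
    if order.quantity <= 0:
        problems.append("quantity must be positive")
    if order.price_yen <= 0:
        problems.append("price must be positive")

    # Cross-check: is the price plausible compared with the reference?
    low = reference_price_yen * (1 - max_deviation)
    high = reference_price_yen * (1 + max_deviation)
    if not low <= order.price_yen <= high:
        problems.append(
            f"price {order.price_yen} yen is outside the expected range "
            f"{low:.0f} to {high:.0f} yen"
        )
    return problems


# Assumed reference price, taken from the intended sale price in the story.
REFERENCE_PRICE_YEN = 610_000

intended = SellOrder("EXAMPLE", quantity=1, price_yen=610_000)
mistyped = SellOrder("EXAMPLE", quantity=610_000, price_yen=1)

intended_problems = validate_order(intended, REFERENCE_PRICE_YEN)
mistyped_problems = validate_order(mistyped, REFERENCE_PRICE_YEN)
```

Both orders are valid field by field: positive quantity, positive price. Only the comparison against the reference price separates them, which is exactly the kind of check that sits on the border between process and data quality.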
