dq:View has moved...

Hi folks,


Thanks to everyone who has been following dq:View, but I've now taken the decision to move the blog to Datanomic's corporate site. I hope you'll continue to follow my witterings at http://www.datanomic.com/category/resources/blog/ or via an appropriate feed (RSS or Atom).

You can also find me online on LinkedIn as http://www.linkedin.com/in/SteveTuck, on Plaxo as http://stevetuck.myplaxo.com and if you're into Twitter, you can follow my Tweets at http://twitter.com/SteveTuck.

All the best,
Steve

P.S. There are already two new entries on data quality topics for you to read at http://www.datanomic.com/category/resources/blog/.

Data quality carrots

When I want to teach my dog to do something, I generally find it helps to offer her something in return. A small piece of cheese, or other tasty morsel, generally does the trick. It doesn't have to be anything big or expensive and, after a short while, when she's learnt what it is I want, she'll respond without the need for anything more than a "good girl" as a thank you.

I'd say the same is pretty much true for my kids (although they respond better to cash than cheese and can generally understand more complex requests). So why is it that some data governance regimes think that everything will be alright if they issue an edict and back it up only with strong-arm tactics - "do it this way, or else"?

If you want to encourage the right behaviour from your front-line staff, who collect and enter information that other knowledge workers consume, why not start by offering them some incentive to do it? If you only measure their performance by crude measures, such as call volumes or numbers of records entered, you cannot expect them to worry too much about the quality of the data they're actually typing in.

By measuring the quality of the information they're entering, and rewarding them for doing it right, you'll increase the value of that information, remove costly scrap and re-work and improve the output of the downstream processes that use the data. Just like my dog, the reward doesn't have to be big or expensive and, after a short while, you'll find that the good behaviour becomes second nature, which can be positively reinforced by regular monitoring and a polite "thank you." There's a place for the stick, but it's better to lead with the carrot.
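As a minimal sketch of what "measuring the quality of the information they're entering" might look like, the Python below scores each person by the share of their records that pass a simple completeness check. The field names (`entered_by`, `name`, `postcode`, `phone`) and the check itself are assumptions made for illustration; a real scheme would use the organisation's own rules.

```python
from collections import defaultdict

# Hypothetical required fields, chosen only for this example.
REQUIRED_FIELDS = ("name", "postcode", "phone")


def record_is_complete(record):
    """Return True if every required field is present and non-blank."""
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            return False
    return True


def quality_by_operator(records):
    """Return {operator: fraction of that operator's records passing the check}."""
    passed = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        operator = record["entered_by"]
        total[operator] += 1
        if record_is_complete(record):
            passed[operator] += 1
    return {operator: passed[operator] / total[operator] for operator in total}


# Made-up sample data: alice leaves one postcode blank.
sample = [
    {"entered_by": "alice", "name": "J. Smith", "postcode": "AB1 2CD", "phone": "01234 000000"},
    {"entered_by": "alice", "name": "A. Jones", "postcode": "", "phone": "01234 000001"},
    {"entered_by": "bob", "name": "P. Brown", "postcode": "EF3 4GH", "phone": "01234 000002"},
]
scores = quality_by_operator(sample)
```

A score like this could sit alongside call volumes or record counts, so that doing it right is visible and can be rewarded.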

Please note, the author does not recommend the offering of either carrots or cheese as a reward for good data quality.

Dell's hot technology

Dell has today announced the recall of more than 4 million laptop batteries over fears that they could overheat and start a fire.

Dell recalls 4m laptop batteries
Dell recalls batteries over fears of explosions

There may well have been some data quality issues involved in the manufacture of these batteries (by Sony), but what concerns me is the opportunity for error when checking whether a battery is potentially dangerous or not.  Here's a shot of the label on my own laptop battery:

[Image: Battery_label]

Can anybody tell me how many times the number 0 and the letter O appear in the code?


I was so uncertain that I tried a number of different permutations, with the following results:

[Image: Battery_recall]

Based on past experience, there is enormous potential for a significant number of these 4 million batteries to be left in circulation.  The website appears to offer no validation whatsoever of the data entered - you can type in anything and it won't complain.  This is no way to handle a safety recall, especially when the product labelling is so ambiguous.
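To illustrate the kind of checking that seems to be missing, here is a rough Python sketch. It is not Dell's actual logic, and the code format is an assumption (letters and digits only, with spaces and hyphens ignored). It rejects obviously malformed input and lists every reading of a code in which each 0 or O could be the other character, which is the set of permutations I tried by hand.

```python
import re
from itertools import product

# Assumption: a code consists of upper-case letters and digits only.
VALID_CODE = re.compile(r"[A-Z0-9]+")


def normalise(raw):
    """Strip spaces and hyphens and upper-case what the user typed."""
    return re.sub(r"[\s-]", "", raw).upper()


def is_well_formed(raw):
    """Return True if the normalised input is non-empty and alphanumeric."""
    return bool(VALID_CODE.fullmatch(normalise(raw)))


def zero_o_variants(raw):
    """Return every reading of the code with each 0/O taken either way."""
    code = normalise(raw)
    choices = [("0", "O") if ch in "0O" else (ch,) for ch in code]
    return ["".join(combo) for combo in product(*choices)]
```

A recall page built along these lines could check all the variants against the list of affected batteries and warn the user if any of them match, rather than leaving the reader to guess which character is printed on the label.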

For anyone who isn't squeamish, you can read a true story about how a laptop can get too hot to handle at The Register.

dn:Director - a fresh approach to data quality

Why do so many organisations turn a blind eye to data quality?  One thing is for sure: the legacy data quality software providers have done little to help address this crucial business issue, delivering products that require years of expertise to successfully leverage all of the functionality available (and, just as importantly, to know when to use something else instead).  After a dozen years of working in the field, and having built a highly profitable consultancy business to help clients address this shortfall, I decided a year or so ago to join Datanomic.  I'm delighted to say that, last month, we celebrated the launch of dn:Director, a data quality product that is setting new standards for data quality management in the 21st Century.

I've been privileged to work on data quality projects with many leading, blue-chip companies over the years, but one of the things that struck me was that I was being asked the same questions by clients in 2004 as I was asking myself more than a decade earlier; they were identifying the same old deficiencies in data quality products and having to employ the same workarounds to resolve them.  Sure, the vendors have done something to smarten up the look of their software but, under the covers, sits essentially the same code that was initially developed for mail-room efficiency in the 1980s.

Two more things struck me:

  1. All of the software vendors talked about delivering a tool for "business users" but the reality was that just about every project relied on the IT department to develop the business rules.
  2. Because of the complexity of using the software to good effect, the cost and duration of projects were prohibitive; the reason I was working with so many blue-chip companies was that they were the only ones that could afford to undertake such major projects!

These were the things that motivated me to create Tranato and subsequently to join Datanomic in 2005 and bring together the two technologies under a shared approach.  Put simply, we feel that a data quality product needs to be much more accessible - you shouldn't need to be a software guru to get value from it.

dn:Director is the result of many years' experience in data quality and data management; not just my own, but that of people like Gerry Kelley (Datanomic's VP of Professional Services) and his team, and the shared experiences of our clients and partners.  Taking Datanomic's approach (The Four Cornerstones) and methodology as its foundations, dn:Director has been built from the ground up, using the best-available modern technology.

Developing dn:Director in Java and using standards-based interfaces (such as JDBC, JMS and XML) has enabled us to deliver a technically advanced and extensible data quality product that supports both batch and real-time processes (providing data quality services through SOA).  But the thing that everybody notices first is just how easy it is to use - you should hear what our customers and partners have had to say about it:

"This is great - it's so easy to understand and configure business rules"

"I love the way that you can build rules from the data - it's so quick and intuitive"

"This will halve the time it takes to deliver a project"

For more information visit Datanomic's website or call on +44 (0)1223 228400.

Note: I know this is very commercial for a blog entry, but given the amount of personal time, energy (and money) I've committed to making dn:Director a success, I hope you'll forgive me.

The resurrection of Mr. Smith

"Nothing in life is certain except death and taxes" - Benjamin Franklin

The truth of the second of these is undeniable, but you could be forgiven for doubting the first if you worked in the branch office of some banks.  How would you react if, as a bank teller, your computer records showed that the customer standing in front of you supposedly died a year ago?  “Um,… how are you feeling today Mr. Smith, you’re looking a little pale?”

What leads to this situation is often a muddle of processes, and people using workarounds to beat the system.  For instance, I’ve discovered that a common practice in some banks is to flag a favoured customer as deceased so that they can close a savings account and withdraw money without a penalty.

In other cases the confusion has come as the result of genuine bereavement.  Rather than comply with documented procedures, following the death of a customer somebody has decided that it is more expedient to over-type the original customer’s details with the name of the person who is granted probate.  The one field that can’t be changed by anyone once it has been entered is the date of death; so there it sits, alongside someone else's details.

Given that this is such a sensitive topic, I am, on the one hand, astonished at how some people are willing to act so flippantly, but I also understand why people find these workarounds so useful.  The consequence of their action may be something that can be regarded as a data quality problem, but unless the inadequacies of the underlying processes are resolved, any fix of the data will not be sustainable.

Does the Pope have a Dangerous Dog?

That may seem a strange question to ask, but it's one that I remember an IT Project Manager once boasting that their system could answer.  What she meant was that their customer management system could support the data because "His Holiness The Pope" was included in the drop-down list of personal titles and they had a check box to flag customers who had a vicious canine.

The project team in this case had spent a lot of time researching personal titles and come up with a list of several hundred; but while they had included "Her Majesty The Queen" as well as the Pope, they'd not thought about those people who have composite titles such as "Rev. Dr."  I can state with confidence that this particular company does not count either Her Majesty or His Holiness as customers, but they do have customers with composite titles.

They also recognised that it would be useful for staff to know if a customer's pet might pose a threat should they have cause to visit.  But how exactly did they expect to collect this information in the first place?

It's all very well to try to capture data in a structured way, but why list hundreds of titles when a handful cover the vast majority of the population?  My advice is to list the common ones (Mr, Mrs, Miss, Ms, Dr, Rev) and then allow for Others through a free-text field.  Using the list reduces the number of typographical errors made in entering standard titles and the free-text field will allow for anything else to be entered exactly as the customer wants it.

This isn't rocket science; it's just a pragmatic approach to dealing with data entry screens and validation.
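For what it's worth, the short-list-plus-free-text approach can be sketched in a few lines of Python. The function name `resolve_title` and the "Other" option label are my own inventions for illustration, not taken from any particular system.

```python
# The common titles suggested above; everything else goes through free text.
COMMON_TITLES = ("Mr", "Mrs", "Miss", "Ms", "Dr", "Rev")
OTHER = "Other"  # hypothetical label for the free-text option


def resolve_title(selected, other_text=""):
    """Return the title to store, given the list choice and any free text."""
    if selected in COMMON_TITLES:
        # Picked from the list, so no chance of a typing error.
        return selected
    if selected == OTHER:
        # Keep the customer's wording; only trim surrounding whitespace.
        title = other_text.strip()
        if not title:
            raise ValueError("Please type a title when 'Other' is selected")
        return title
    raise ValueError(f"Unknown title option: {selected!r}")
```

With this, `resolve_title("Other", "Rev. Dr.")` stores the composite title as typed, while the everyday cases never touch the keyboard.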

Copyright © 2005-2006 Steve Tuck and Datanomic Ltd. All Rights Reserved.