
  Data Quality Disasters

 

Warning - Graphic content may be upsetting to some CIOs.


Poor data quality has the potential to bring down a company, or at least do significant harm to its reputation and bottom line. The stories below are just some of the hundreds of published examples we've heard about. With poor data quality costing US industry an estimated $600B each year (TDWI, 2002 study), there must be tens of thousands of similar examples that are never made public.


 A Presidential Election

What has data quality got to do with politics? Plenty!   Poor data (or information) quality may have decided the winner of the 2000 election for the President of the United States of America - and if so, it means that poor data quality completely changed world history! Many will remember how the Florida voting issues and subsequent recounts delayed the result for many weeks, resulting in stock market fluctuations and lawsuits. What role did data quality play in this?

  • Voter confusion: The design of the 'butterfly' ballot papers in Palm Beach County was so confusing to voters that many marked them incorrectly, and as a result a significant number of votes did not count.

  • Process errors: The infamous 'hanging chads'. If a ballot card was improperly punched, leaving the chad not completely detached, the automated counting equipment did not count the vote correctly.

 Did the resulting errors affect the outcome?

It was only one bad data value. How much harm could it cause?

Watch this video to find out. He tells it much better than we can!

Fraudsters only need half a chance...

  • A New Jersey man used 1,630 aliases to buy CDs at special introductory rates from on-line music clubs and then sold them at flea markets for 4 times the price he paid for them. He used fictitious apartment numbers and other variations of his name and mailing addresses to fool the inadequate filtering software that was supposed to identify duplicate customers. His fraud totaled $250,000 before he was caught.
  • A family-owned manufacturing business in South Carolina was a small-time supplier of nuts and bolts to the Pentagon. A new procurement system was set up to reduce overheads in processing invoices from small suppliers like this one. It validated that items were correctly priced as per contracts etc., but had no checks on expedited shipping charges. Once they discovered that they could charge whatever they liked for shipping, this supplier went to town:   $403,436.00 shipping for six machine screws costing $60.  $445,641.00 to send a tube-elbow costing $8.75. $998,798.00 to ship two washers valued at 19 cents each. They did eventually get caught and were prosecuted, but not before overcharging by $20 million over 9 years.
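
The procurement system in the second story checked item prices but not shipping. A minimal sketch of the kind of rule that was missing might look like this in Python. The `InvoiceLine` type, the function name and both thresholds are our own illustrative assumptions, not details of the real system.

```python
from dataclasses import dataclass


@dataclass
class InvoiceLine:
    description: str
    item_total: float  # contract-validated value of the goods, in dollars
    shipping: float    # shipping charge claimed by the supplier, in dollars


def shipping_is_suspicious(line, max_ratio=0.5, floor=50.0):
    """Flag a line whose shipping charge exceeds a flat floor AND
    a fraction of the goods value. Both limits are hypothetical."""
    return line.shipping > floor and line.shipping > max_ratio * line.item_total


# Figures taken from the story above: two washers at 19 cents each.
washers = InvoiceLine("two washers", 0.38, 998_798.00)
if shipping_is_suspicious(washers):
    print(f"Hold for review: {washers.description}")
```

A rule this crude would still have routed every one of the invoices quoted above to a human reviewer.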

Sometimes it’s not the data, but your understanding of it

The 1999 NASA Mars Climate Orbiter mission failed – not because of a data quality problem, but rather an information quality issue.  Thruster impulse data was supplied in the US customary unit of pound-force seconds, but the group that computed the flight path mistakenly interpreted the numbers as metric newton-seconds.

The data was correct. The understanding and usage of it was not. As a result the thrusters fired way too long, the orbiter went missing (it probably crashed on Mars) and a $300M mission failed!
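
One common defence against this kind of misreading is to never pass a bare number between teams, but to carry the unit along with the value. The sketch below is our own illustration in Python, not anything NASA used; it assumes the quantities involved were impulse values in pound-force seconds and newton-seconds, and the `Impulse` class and its unit labels are hypothetical.

```python
from dataclasses import dataclass

# 1 pound-force is defined as exactly 4.4482216152605 newtons,
# so the same factor converts lbf*s to N*s.
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    value: float
    unit: str  # "N*s" or "lbf*s" (labels chosen for this sketch)

    def to_newton_seconds(self) -> float:
        """Return the impulse in newton-seconds, refusing unknown units."""
        if self.unit == "N*s":
            return self.value
        if self.unit == "lbf*s":
            return self.value * LBF_S_TO_N_S
        raise ValueError(f"unknown impulse unit: {self.unit!r}")


reading = Impulse(10.0, "lbf*s")
print(reading.to_newton_seconds())  # the consumer always works in N*s
```

Because the consumer has to call `to_newton_seconds()`, a value supplied in the wrong unit is converted rather than silently misread, and a value with no recognisable unit is rejected outright.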

Another thrust problem – but this time lives were at risk

On 20th March 2009, poor data quality nearly resulted in the worst air traffic disaster in Australian history as an Airbus A340 narrowly avoided crashing on take-off into a residential area of Melbourne.  The provisional report found that the root cause of the incident was the entry of an incorrect figure of 262 tonnes for the weight of the aircraft, whereas the plane actually weighed 362 tonnes. This affected the calculations for the airspeed required for take-off and the thrust needed to reach that speed.

The end result was that the plane failed to lift off and gain height as required: its tail struck the runway, and the aircraft then ploughed through a lighting array and airport instruments at the end of the runway.

Keep the regulators happy – or else

In 2009, Barclays Bank in the UK was fined £2.45 million for “failing to provide accurate transaction reports to the FSA and for serious weaknesses in systems and controls in relation to transaction reporting”.

Giving and taking

  • In 2008, a schoolboy in the UK found himself £300 in debt after his bank accidentally lodged £2 million to his account (he's a schoolboy - of course he tried to spend it).
  • Last we heard, police were still hunting for a couple who took advantage of a poor unsuspecting bank which deposited NZ$10 million in their account instead of the NZ$10,000 that they had requested. They withdrew most of the money and left the country.
  • In 2007 a man in Georgia USA was shocked to receive a demand from his bank for an outstanding debt that exceeded the total national debt of the USA!

Try to be sensitive to those who've lost loved ones

  • In 2008 and 2009, the US Government sent out Stimulus Checks to people to help stimulate consumer spending in the US economy. One such check was sent to Mrs. Rose Hagner. Her 83-year-old son found it in the mail and was a bit surprised when he saw it: his mother had been dead for over 40 years. Social Security officials gave the following explanation: of the roughly 52 million checks mailed out, about 10,000 were sent to people who are deceased. The agency blamed the error on the strict mid-June deadline for mailing out all of the checks, which didn’t leave officials much time to clean up all of their records.
  • A UK travel agency kept regular contact with its past customers by telephone, hoping to sell another vacation (we should probably say 'holiday' since this occurred in the UK).  On occasion, it happened that a customer had passed away, and of course the phone call could be upsetting to the family. Their system would not let them delete the customer, since there were transaction records tied to it. Some bright spark hit on the idea of appending the customer name with "** IS DEAD **", so operators would not call in the future and upset the family of the deceased. This worked fine until the company decided to use mailouts instead of costly phone operators. Imagine the grief caused to Mrs Jones, when she received a letter addressed to "Mr A. Jones ** IS DEAD **".  

Pets need credit too!

  • An Australian woman decided to test the identity screening processes that her bank uses for credit card applications. So she applied for a card in the name of her cat, which is 2 years old. The bank asked for ID documents, but hadn’t received them before they issued the card. Furthermore the cat’s owner wasn’t notified that a second card had been issued on her account. Her cat now has a $4,200 credit limit.
  • A bank in the UK sent out hundreds (or even thousands) of credit card offers to their customers' pets. The error occurred because the bank had previously offered pet insurance to its customers. The insurance application forms were confusing and many owners had entered their pet's name instead of their own. Policies got issued, premiums got paid - no harm no fowl (sorry) - until the bank mined their insurance database to send out the credit card offers! 

 You need to look in how many places?

When there are just too many data sources, don't expect them to agree:

  • An insurance company had a list of twelve "sacred data elements" that were considered so important that if the data was wrong, the company could fail. They undertook a data inventory and discovered that one of these data elements was maintained in 43 separate databases by 43 independent applications with data entered by 43 different data producers.
  • A manufacturer discovered it had over 90 "parts" files, most using different part numbers, such that the same part in different files could not even be cross-referenced.
  • A major bank found it needed to analyze data from 250 different customer files in order to answer the question, "Who is our best customer?"
  • A consumer goods company discovered they had over 400 "brand" files containing information about their products!

US accidentally bombs China

In May 1999, during the Kosovo War, the United States inadvertently bombed the Chinese Embassy in Belgrade (which is technically Chinese 'soil').  The bombing stemmed directly from a data error. The information used to determine what was located at the intended target was out of date. Instead of a legitimate target, the Chinese Embassy was bombed and three Chinese citizens were killed.

 Cheap Flights

In 2005, while operating under bankruptcy protection, US Airways mistakenly advertised return flights on their website for $1.86. Before the error was discovered and the price corrected to $186.00, over 1,000 tickets had been sold.  The airline decided the cheapest option was to honor the tickets, rather than cancel them, argue with 1,000-plus customers and try to rebook them!

OK - That's enough for now

Some of these were outright scary and others were public relations nightmares – so we threw in a few humorous ones to help you out.

The point we are trying to make is that no-one is immune to data quality and information quality errors. They will happen. Guaranteed.  And when they do, they have the potential to cost your organization huge sums of money. Maybe even send you broke.

 

Conservative estimates suggest that poor data quality costs the typical company 10% of revenue and 20% of earnings. If you think that sounds too high, take another look at the examples above.

 

Many of the above situations occurred because the organizations involved believed that their data was trustworthy, when in fact it was not, and then blindly used this data in critical calculations, reports and other business processes. 

 

From a business intelligence perspective, it is crucial to recognize that when you use a front-end query tool to report directly against your operational data, it provides you with no opportunity to validate the data before use.

 

It's simple really:  untrustworthy data = untrustworthy reports = questionable or bad decisions  

 

When you build a data warehouse or data mart environment you have an opportunity to implement data validation based on your own business rules. Simple sensibility checks can set aside data that falls outside an expected range, missing values can be identified, multiple disparate databases can be merged, and so on.  To do this, you need a good ETL tool.
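
To make the idea concrete, here is a minimal sketch in Python of a validation step that sets aside rows with missing values or out-of-range numbers before they are loaded. It is an illustration only, not how any particular ETL tool works; the function name, the field names and the fare range are all assumptions we made up for the example.

```python
def validate_rows(rows, required, ranges):
    """Split rows into (accepted, rejected).

    required: field names that must be present and non-empty.
    ranges:   {field: (low, high)} inclusive limits for numeric fields.
    Each rejected entry is a (row, reasons) pair.
    """
    accepted, rejected = [], []
    for row in rows:
        reasons = []
        for field in required:
            if row.get(field) in (None, ""):
                reasons.append(f"missing {field}")
        for field, (low, high) in ranges.items():
            value = row.get(field)
            if value in (None, ""):
                continue  # already reported as missing, if required
            if not (low <= value <= high):
                reasons.append(f"{field} out of range")
        if reasons:
            rejected.append((row, reasons))
        else:
            accepted.append(row)
    return accepted, rejected


# Hypothetical fare data; the 50-5000 range is an invented business rule.
fares = [
    {"route": "A-B", "fare": 186.00},
    {"route": "A-B", "fare": 1.86},
    {"route": "", "fare": 210.00},
]
good, bad = validate_rows(fares, required=["route", "fare"],
                          ranges={"fare": (50.0, 5000.0)})
for row, reasons in bad:
    print(row, reasons)
```

Rejected rows are kept together with the reasons they failed, so someone can review and correct them instead of the bad values flowing straight into a report.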

 

Read how RODIN can help