Copyright Real-Time Technology Solutions, Inc. Updated January 2015
Enterprise Business Intelligence
& Data Warehousing:
The Data Quality Conundrum
A study by RTTS
For Business and IT Professionals
By Bill Hayduk
Enterprise BI & Data Warehousing: the Data Quality Conundrum
Poor data quality is a huge and rapidly growing problem. Big Data is causing massive increases in the volume (amount of data), velocity (speed of data in and out), and variety (range of data types and sources) of data, so poor data quality, or ‘bad data’, is now a critical issue.
According to Gartner, the average organization loses $8.2 million annually through poor data quality; 22% of organizations estimated their annual losses from bad data at $20 million, and 4% put that figure as high as an astounding $100 million. And InformationWeek found that “46% of companies cite data quality as a barrier for adopting Business Intelligence products”.
We recently performed a study that included responses from over 200 companies interested in
improving the data quality in their Business Intelligence, ETL software and Enterprise Data Warehouse
implementations. Below are our findings, along with context around these results.
SECTION I.
Our first section polled customers on their architecture: specifically on their data warehouse, ETL and
business intelligence software and vendors.
Enterprise Data Warehouse Software
The top data warehouse vendors are Oracle (including MySQL and Exadata) at number one and Teradata (including Aster Data) at number two, with every other vendor far behind. This is in line with research from most of the analyst firms that track this platform: Oracle and Teradata dominate the space through both their loyal customer bases and their innovation.
Analyst firm Gartner, in its 2013 Magic Quadrant for Data Warehouse Database Management Systems report, projected 10% growth in the database management system market and pinpointed a significant increase in organizations seeking to deploy data warehouses for the first time.
Business Intelligence Software
IBM leads the BI space with Cognos, followed by a surprisingly strong showing by Microsoft in second place and Oracle’s combined offering in third. Unexpectedly, ‘other vendors’ were chosen 18% of the time, meaning there is still room for growth in the BI space.
Analyst firm IDC stated that the market is
now forecast to continue to grow at a 9.8%
compound annual growth rate through
2016. One key observation they made was
“the media attention on Big Data has put
broader business analytics on the agenda of
more senior executives.”
ETL (Extract/Transform/Load) Software
Here we see another surprising result of our survey: Microsoft finished first among the named vendors in the ETL software category. Informatica PowerCenter, the most widely known offering, finished second, closely followed by IBM’s combined offering in third. It is interesting to note that the ‘other’ category actually drew more responses than Microsoft; there is still a large contingent of companies using home-grown and open source software in the marketplace.
According to analyst firm Forrester, “The
enterprise ETL market continues to grow at a
healthy pace as more enterprises replace
manual scripts with packaged ETL solutions.
This migration…toward ETL tools is driven by
the need to support growing and increasingly
complex data management requirements.”
Current Data Warehouse Size
We inquired as to the current size of respondents’ data warehouse implementations. We discovered that 91% were less than 100 terabytes and 52% were less than 500 gigabytes. Interestingly, the largest segment of implementations (33%) was between 1 terabyte and 100 terabytes, a significant increase in data size from our poll two years ago, when the largest segment was measured in gigabytes.
SECTION II.
In Section II, we analyzed firms’ current data quality and testing practices to gauge their effectiveness.
Current Testing Strategy
We inquired as to which test strategy was being implemented: (1) testing across ETL legs (source-to-target DWH, DWH to data mart, etc.), (2) utilizing minus queries (see a full definition of minus queries at http://bit.ly/13lmp8N), and/or (3) comparing row counts from source to target.
The preferred approach is to utilize a combination of the three, depending upon the role (tester, ETL developer, operations), but if one were to choose a single strategy to check data quality, it would be (1). We found that 30% of companies polled verify only row counts and that only 7% implemented all three as their preferred strategy. Many singled out a lack of automated testing and a lack of testing resources as the reasons they did not deploy a more rigorous testing strategy.
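To make these strategies concrete, here is a minimal, hypothetical sketch of (3) a row-count check and (2) a minus-query style comparison. It uses an in-memory SQLite database as a stand-in for the real source and warehouse systems; the table and column names are illustrative assumptions, not taken from the survey.

    # Minimal sketch: row-count check and minus-style comparison between a
    # source table and its ETL target. SQLite stands in for the real databases;
    # table and column names are hypothetical.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Stand-ins for a source table and the corresponding warehouse table.
    cur.executescript("""
        CREATE TABLE src_customers (id INTEGER, name TEXT, region TEXT);
        CREATE TABLE dwh_customers (id INTEGER, name TEXT, region TEXT);
        INSERT INTO src_customers VALUES (1, 'Acme', 'East'), (2, 'Beta', 'West');
        INSERT INTO dwh_customers VALUES (1, 'Acme', 'East'), (2, 'Beta', 'WEST');
    """)

    # Strategy (3): compare row counts from source to target.
    src_count = cur.execute("SELECT COUNT(*) FROM src_customers").fetchone()[0]
    dwh_count = cur.execute("SELECT COUNT(*) FROM dwh_customers").fetchone()[0]
    print("row counts match:", src_count == dwh_count)

    # Strategy (2): a minus-style query -- rows present in the source but
    # missing or altered in the target (SQLite uses EXCEPT where Oracle and
    # Teradata use MINUS).
    diff = cur.execute("""
        SELECT id, name, region FROM src_customers
        EXCEPT
        SELECT id, name, region FROM dwh_customers
    """).fetchall()
    print("rows in source but not in target:", diff)

Note that the row counts match in this sketch even though one value was loaded incorrectly, which illustrates why relying on row counts alone, as 30% of respondents do, can miss bad data that a minus-style comparison catches.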
Current Test Execution Method
When we surveyed customers on their current test execution method, we found that 60% of testing is currently performed manually. Manual testing consists of extracting data from the source databases, files and XML, extracting data from the data warehouse after it has gone through the ETL process, and then comparing the two data sets by eye. This is quite extraordinary considering that the average data warehouse is measured in gigabytes and typical tests return millions of rows across hundreds of columns, meaning millions of values to compare. For practical purposes, testers can therefore only sample the comparisons. Vendor tools came in second, and home-grown solutions finished third.
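As a contrast to reconciling two extracts by eye, the following sketch (hypothetical data and column names, using the pandas library) shows the kind of automated comparison that can reconcile entire result sets and report every mismatched row.

    # Minimal sketch of automating the comparison described above: load the
    # source extract and the post-ETL warehouse extract, align them on a
    # business key, and report mismatches programmatically. The data frames
    # here are hard-coded stand-ins; in practice they would come from
    # pd.read_csv() or SQL queries against the source and the warehouse.
    import pandas as pd

    source = pd.DataFrame({"id": [1, 2, 3], "amount": [10.0, 20.0, 30.0]})
    target = pd.DataFrame({"id": [1, 2, 3], "amount": [10.0, 25.0, 30.0]})

    # Align on the business key and flag every row whose values differ.
    merged = source.merge(target, on="id", how="outer", suffixes=("_src", "_dwh"))
    mismatches = merged[merged["amount_src"] != merged["amount_dwh"]]

    print(f"{len(mismatches)} mismatched rows out of {len(merged)}")
    print(mismatches)

Even this small sketch checks every row rather than a sample, which is the core advantage an automated approach has over manual, by-eye reconciliation.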
Data Quality and Testing Challenges
When customers were asked about their biggest challenges in testing data for accuracy, the clear top choice was ‘no automation’. This goes hand-in-hand with the second-biggest challenge: manual testing is very time consuming. These were followed by the lack of a test management process and/or tool, insufficient data coverage, and no reporting on the testing effort.
Percent of Data Coverage by Current Test Process
When we examined the amount of data coverage that companies’ current test processes provide, it was clear that coverage is low across the board. Of the companies surveyed, 84% had less than 50 percent data coverage, 58% had less than 25 percent, 33% had less than 5 percent, and 29% had less than 1 percent. The 14% who reported 100 percent coverage all had data warehouse implementations under 500 GB in size and were interested in finding a data warehouse testing tool that could speed up the testing cycle. Note also that this coverage represents the amount of data brought back by SQL queries, not the amount actually compared. Since comparisons (as noted above) are typically performed by visually reconciling the two data sets, it is impractical for more than 5-10% of the data to be compared.
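The distinction between data returned and data actually compared can be made concrete with a small, hypothetical calculation; the row counts below are illustrative only and do not come from the survey.

    # Illustrative only: "coverage" as reported in the survey counts rows
    # returned by the test SQL queries, not rows that were actually reconciled.
    rows_in_scope = 10_000_000            # rows in the warehouse subject to testing
    rows_returned_by_queries = 4_500_000  # rows the test queries bring back
    rows_actually_compared = 300_000      # rows a team can realistically check by eye

    print(f"coverage by query results:     {rows_returned_by_queries / rows_in_scope:.0%}")
    print(f"coverage by actual comparison: {rows_actually_compared / rows_in_scope:.0%}")

Reported coverage in this example would be 45%, while the data genuinely verified is closer to 3%, which is why the 5-10% figure above acts as a practical ceiling for manual comparison.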
Ratio of Developers to Testers
Average ratio: 2.1 to 1
Median ratio: 1.7 to 1
Highest ratio: 20 to 1
Lowest ratio: 2 to 3
We found that, for the most part, the ratio of developers to testers was in line with industry norms. The issue most firms cited was that they could not achieve enough data coverage. The reason: ETL developers use some form of tooling to make their jobs faster and more efficient, while most testers do not.
Effects of Bad Data
We surveyed customers on the impact of bad data on their organizations. Of those who answered, 100 percent said that they had experienced some form of bad data in their data warehouses. Below is a sampling of their free-form answers on the effects bad data caused them.
• Incorrect business intelligence reports
• Poor delivery quality and customer dissatisfaction resulting in re-work
• Missed revenue opportunities
• Critical business decisions relying on underlying bad data
• Poor quality of projects
• Negative business impact
• SLA issues with our customers
• Major embarrassments to our team
• Long working hours to fix bad data
Conclusion
Many companies are using Business Intelligence (BI) to make strategic decisions in the hope of gaining a competitive advantage in a tough business landscape. But bad data will cause them to make decisions that cost their firms millions of dollars. It is clear from the results of this survey that companies are not providing the level of data quality that C-level executives need to make strategic decisions. Most firms test far less than 10% of their data, sampling both the data and the comparisons; therefore, at least 90% of their data remains untested. Since bad data exists in all databases, firms need to test closer to 100% of their data to guarantee that this critical information is accurate.
There is no practical way for testers to verify this level of coverage without automated testing tools. An automated testing solution speeds up the process, provides far more data coverage, performs comparisons automatically, and produces reports for audit trails. It is clear that as data grows exponentially, a more complete solution is needed to keep enterprise-level data clean.
About The Author
Bill Hayduk
Founder, CEO, President
Bill founded software and services firm RTTS in 1996. Under Bill's
guidance, RTTS has supported over 600 projects at over 400
corporations, from Fortune 500 to midsize firms. Bill is also the
business leader on QuerySurge, RTTS’ industry-leading data warehouse
testing tool.
Bill holds an MS in Computer Information Systems from the Zicklin
School of Business (Baruch College) and a BA in Economics from
Villanova University.
References
• Gartner: Magic Quadrant for Data Warehouse Database Management Systems (January 31, 2013)
• IDC: Worldwide Business Analytics Software 2012-2016 Forecast and 2011 Vendor Shares (June 2012)
• Forrester: The Forrester Wave™: Enterprise ETL, Q1 2012 (February 27, 2012)
• InformationWeek: 2012 BI and Information Management Trends (November 2011)
• RTTS: 2013 Client Survey on BI and Data Warehouse Quality