Top 40 Data Quality Improvement Initiatives and Metrics
As you plan your Data Quality improvement strategy and roadmap, you'll have to determine whether your organization is achieving the intended results.
To get you started, here is a list of data quality improvement initiatives that organizations commonly undertake, along with a starting set of measures you can track to see whether you're having an impact.
Data Quality Improvement Initiatives
This list will help you evaluate current and target capabilities for your data quality practice and related sub-practices, understand your current environment and plan how to achieve target-level performance.
You can use this list to create initiatives that address recommended action items and performance gaps. It also serves as a resource for planning and roadmapping data quality improvement projects based on business unit priority.
Develop a plan to increase awareness of data quality value and integrate data quality within the business process.
Align data quality vision to business vision, mission, and strategies.
Different business units will have different information needs. Address the needs one business unit at a time.
Implement a data quality plan to improve data quality capabilities.
Define and measure data quality metrics. For example, there might be quality metrics at the corporate level and metrics at the business unit level.
Define the roles and responsibilities between the data governance program and the data quality program.
Define and formalize the data ownership role. Train data owners on data quality so they are aware of common data issues and know how to engage IT to fix them.
Define and formalize the data steward role. Train data stewards on data quality so they can fix some data issues in their respective domains.
Define overarching data principles and introduce the principles in data-related projects.
Develop data quality policies, standards, and guidelines, and communicate throughout the enterprise.
Formalize a business data glossary to define data elements consistently.
Develop processes and techniques to effectively perform data analysis, modeling, specification, design, and standardization.
Develop data quality reports using the above-defined metrics. Use them to guide planning, solutions, and change management.
Formalize data quality organizational structure using business relationship managers (BRMs), data owners, and data stewards.
Define and support Data Architect roles and responsibilities. Evaluate the proficiency of this role over time.
Define data quality skills and proficiency levels for the data architect role.
Develop processes and tools to regularly assess staff performance against pre-defined criteria and enact development plans.
Define and implement work and capacity management processes to ensure the data quality team can support current and planned data quality work.
Institute an architecture review governing body and a formal review process to ensure that architecture is reviewed, assessed, and approved.
Implement controls and audit mechanisms for overseeing adherence to data quality policies and procedures.
Formalize a repeatable escalation process for managing and resolving issues.
Educate the organization on the benefits of data quality by promoting the value and services of the data quality program.
Formulate a communication plan to facilitate frequent and ongoing communications on data quality.
Ensure that referential integrity is well-defined. There should be no gaps or overlaps in referential integrity.
Review existing data backup plans. Develop or improve the plans if necessary.
Review existing retention schedule. Ensure the retention schedule balances the need for historical data and storage costs.
Determine the need for historical data. Define a system of record (SOR) for storing historical data. Ensure a complete audit trail is achievable in the construct.
Evaluate whether business logic and calculations are performed correctly.
Determine if disparate datasets and subject areas can be integrated and how.
Review the refresh frequency of your data warehouse. Review the ETL approaches in terms of batch, micro-batches, change data capture, and real-time streaming.
Ensure analytical data is organized and presented in a consumable format to simplify self-service.
Explore if you can centralize all metadata on one platform to visualize end-to-end data lineage.
Document all data quality logic and transformations so they can be referred to in the future.
Ensure master data is controlled throughout its lifecycle.
Identify and acquire the necessary data profiling tools. Check if your existing ETL platform provides data profiling functions.
Identify and acquire the necessary data matching tools. Ensure the data matching tools can support fuzzy matching.
Identify and acquire the necessary standardization tools.
Identify and acquire the necessary metadata tools. Reinforce documentation in all future data projects. Share the metadata.
Identify and acquire the necessary collaboration tools.
Include Data Quality in your Data Literacy program.
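Two of the more technical items in this list, checking referential integrity and fuzzy data matching, can be illustrated with a short sketch. This is a minimal illustration in Python using only the standard library, not a substitute for a dedicated tool. The sample tables, the function names (find_orphans, fuzzy_matches), and the 0.85 similarity threshold are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def find_orphans(child_rows, parent_rows, fk, pk):
    """Return child rows whose foreign key has no matching parent key."""
    parent_keys = {row[pk] for row in parent_rows}
    return [row for row in child_rows if row[fk] not in parent_keys]


def fuzzy_matches(names, threshold=0.85):
    """Return pairs of names whose similarity ratio meets the threshold.

    The threshold is an illustrative assumption; tune it on real data.
    """
    normalized = [n.strip().lower() for n in names]
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            score = SequenceMatcher(None, normalized[i], normalized[j]).ratio()
            if score >= threshold:
                pairs.append((names[i], names[j], round(score, 2)))
    return pairs


# Hypothetical sample data
customers = [{"customer_id": 1}, {"customer_id": 2}]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 3},  # no such customer: an orphan
]

orphans = find_orphans(orders, customers, fk="customer_id", pk="customer_id")
candidates = fuzzy_matches(["Acme Corp", "ACME Corp.", "Globex"])
print(orphans)
print(candidates)
```

The pairwise comparison is quadratic in the number of names, so it only suits small samples. Purpose-built matching tools typically group records into blocks first to avoid comparing every pair.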
Baseline Data Quality Metrics
You will want to create practice-level metrics to monitor your data quality practice. Make sure to establish metrics for both the business and IT that will be used to determine if the data quality practice development is effective.
Next, collect current data to calculate the metrics and establish a baseline. Also, set targets for each metric. Finally, and this is critical: assign an owner for tracking each metric to be accountable for performance.
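One way to keep baseline, target, and owner together is a small record per metric. The sketch below is a hypothetical structure, assuming each metric is a single number; the Metric class, its field names, and the sample values are illustrative and not taken from any particular tool.

```python
from dataclasses import dataclass


@dataclass
class Metric:
    name: str
    owner: str       # person accountable for tracking the metric
    baseline: float  # value measured when tracking started
    target: float    # value the practice is aiming for

    def progress(self, current: float) -> float:
        """Fraction of the baseline-to-target gap closed so far (0.0 to 1.0).

        Works whether the target is below or above the baseline.
        """
        gap = self.baseline - self.target
        if gap == 0:
            return 1.0
        closed = (self.baseline - current) / gap
        return max(0.0, min(1.0, closed))


# Hypothetical example: missing fields should drop from 12% to 2%
missing_fields = Metric(
    name="% fields missing",
    owner="Customer data steward",
    baseline=12.0,
    target=2.0,
)
print(missing_fields.owner, missing_fields.progress(current=7.0))
```

Because the owner is a required field, a metric cannot be registered without someone accountable for it.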
Relating data governance success metrics to overall business benefits keeps executive management and executive sponsors engaged because they see actionable results. Review metrics on an ongoing basis with those data owners/stewards who are accountable, the data governance steering committee, and the executive sponsors.
There are many lists of metrics for data quality and various frameworks with multiple "dimensions of data quality," but the focus of the list below is improving data quality practices.
Usage (% of trained users using the data warehouse)
Performance (response time)
Availability (hours/day accessible by users)
Resource utilization (memory usage, number of machine cycles)
User satisfaction (quarterly user surveys)
% values outside valid values
% fields missing
% wrong data type
% data outside an acceptable range
% data that violates business rules
% data not making it in time for SLAs
Initial cost of installation and ongoing costs
Total Cost of Ownership including servers, software licenses, support staff
Security violations detected, where violations are coming from, breaches
Patterns that are used
Reduction in time to market for the data
Available completeness of data
How many "standard" data models are being used
Time spent on data prep by the BI & analytics team
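Several of the percentage metrics in this list (missing fields, wrong data type, values outside the valid set, data outside an acceptable range) can be computed with a simple profiling pass. The following is a rough sketch in Python, assuming records arrive as dictionaries and that every percentage uses the total row count as its denominator. The profile function, its parameters, and the sample rows are hypothetical.

```python
def pct(count, total):
    """Percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1) if total else 0.0


def profile(rows, field, expected_type, valid_values=None, value_range=None):
    """Compute basic quality percentages for one field across all rows."""
    total = len(rows)
    missing = wrong_type = invalid = out_of_range = 0
    for row in rows:
        value = row.get(field)
        if value is None or value == "":
            missing += 1
            continue  # nothing further to check on a missing value
        if not isinstance(value, expected_type):
            wrong_type += 1
            continue  # range and validity checks need the right type
        if valid_values is not None and value not in valid_values:
            invalid += 1
        if value_range is not None:
            low, high = value_range
            if not (low <= value <= high):
                out_of_range += 1
    return {
        "pct_missing": pct(missing, total),
        "pct_wrong_type": pct(wrong_type, total),
        "pct_outside_valid_values": pct(invalid, total),
        "pct_outside_range": pct(out_of_range, total),
    }


# Hypothetical sample records
rows = [
    {"age": 34, "country": "US"},
    {"age": None, "country": "CA"},   # missing age
    {"age": "41", "country": "XX"},   # age stored as text, unknown country
    {"age": 212, "country": "US"},    # age outside the plausible range
]

age_report = profile(rows, "age", int, value_range=(0, 120))
country_report = profile(rows, "country", str, valid_values={"US", "CA"})
print(age_report)
print(country_report)
```

Running a pass like this on a schedule and storing the results gives you the trend line needed to compare current values against the baseline.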
Conclusion
Data Quality capabilities and competencies are foundational to improving data quality. Use these lists to assess your organization's current practice state, and plan where you need to go. Incept can, of course, help you out with our Data Quality (Assessment and) Strategy service offering.