Data Quality Assessment Frameworks for Large-Scale Health Information Systems
Keywords:
Data Quality, Health Information Systems, Data Quality Assessment, Large-Scale Systems, Frameworks, Healthcare Data, Data Governance, Artificial Intelligence, Health Outcomes, Electronic Health Records
Abstract
Large-scale health information systems (HIS) are foundational to clinical decision-making, public health surveillance, and evidence-based research; however, their effectiveness is contingent upon the quality of underlying data. This study presents a structured integrative review of data quality assessment frameworks applied to large-scale HIS, critically synthesizing theoretical models, operational methodologies, and emerging technological approaches. The analysis identifies five dominant data quality dimensions: completeness, accuracy, timeliness, consistency, and reliability. It also evaluates widely adopted frameworks, including Total Data Quality Management (TDQM), the Kahn harmonized model, and CIHI methodologies. The findings reveal that existing frameworks exhibit limitations in scalability, interoperability, and real-time applicability, particularly within distributed and multi-institutional systems. A key contribution of this study is the identification of systemic fragmentation in data quality definitions, metrics, and implementation practices across heterogeneous healthcare environments. Furthermore, the study highlights the misalignment of stakeholder priorities (clinicians, data stewards, and policymakers) as a critical barrier to unified data quality governance. To address these gaps, this paper proposes a conceptual direction toward a unified, governance-integrated, and AI-augmented data quality assessment paradigm. The discussion further examines the implications of data quality deficiencies on clinical outcomes, research validity, and system-level decision-making. Finally, the study outlines actionable recommendations and future research directions focused on interoperability, automation, and privacy-preserving data quality assessment mechanisms. These contributions advance the discourse on scalable and resilient data quality management in modern health information ecosystems.


