Ensuring data quality is one of the main challenges faced by clinical database designers. Data quality in clinical applications built specifically for research purposes is protected by safeguards in the form of administrative policies and procedures along with software and database functions. EHR systems are primarily patient care tools and often lack such safeguards. As EHR adoption increases and the burgeoning pool of EHR data becomes more attractive to clinical researchers, EHR data quality is emerging as an important scientific and public policy issue.
In Methods and Dimensions of Electronic Health Record Data Quality Assessment: Enabling Reuse for Clinical Research, Weiskopf and Weng provide a sobering analysis of the current state of EHR data quality management. The authors conducted a literature review with the goal of understanding the methods used to assess EHR data quality. This paper is a major contribution to the field of informatics because, unlike papers that purport to directly measure data quality, it analyzes the strengths and weaknesses of both the definitions of EHR data quality and the methods used to measure it. After reading this paper, one can clearly see that more needs to be done in this critical area–much more.
Weiskopf and Weng comment on measures that are commonly used to assess EHR data quality in published reports. They state:
The strategies used to assess the dimensions of data quality fell into seven broad categories of methods, many of which were used to assess multiple dimensions. These general methods are listed and defined below.
- Gold standard: A dataset drawn from another source or multiple sources, with or without information from the EHR, is used as a gold standard.
- Data element agreement: Two or more elements within an EHR are compared to see if they report the same or compatible information.
- Element presence: A determination is made as to whether or not desired or expected data elements are present.
- Data source agreement: Data from the EHR are compared with data from another source to determine if they are in agreement.
- Distribution comparison: Distributions or summary statistics of aggregated data from the EHR are compared with the expected distributions for the clinical concepts of interest.
- Validity check: Data in the EHR are assessed using various techniques that determine if values ‘make sense’.
- Log review: Information on the actual data entry practices (eg, dates, times, edits) is examined.
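To make two of these methods concrete, here is a minimal sketch of element presence and data element agreement applied to a single record. The record layout and field names (`birth_date`, `age_at_visit`, `systolic_bp`, and so on) are hypothetical and chosen only for illustration; they do not come from Weiskopf and Weng or from any particular EHR schema.

```python
from datetime import date

# Hypothetical record; field names are illustrative, not from a real EHR schema.
record = {
    "birth_date": date(1960, 5, 17),
    "visit_date": date(2013, 1, 10),
    "age_at_visit": 52,
    "systolic_bp": None,  # not recorded at this visit
}


def element_presence(rec, expected):
    """Element presence: return the expected fields that are missing or empty."""
    return [name for name in expected if rec.get(name) is None]


def age_agrees(rec):
    """Data element agreement: does the recorded age match the age
    implied by the birth date and the visit date?"""
    born, seen = rec["birth_date"], rec["visit_date"]
    # Subtract one year if the birthday has not yet occurred in the visit year.
    implied = seen.year - born.year - ((seen.month, seen.day) < (born.month, born.day))
    return implied == rec["age_at_visit"]


missing = element_presence(record, ["birth_date", "visit_date", "systolic_bp"])
agrees = age_agrees(record)
print("Missing elements:", missing)
print("Age agrees with birth date:", agrees)
```

Note that both checks need someone to state in advance what is expected: a list of required fields in the first case, a rule relating two fields in the second.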
The authors are appropriately critical of these approaches. In commenting on the reliability of paper records as a gold standard, they offer:
Paper records, for example, may sometimes be more trusted than electronic records, but they should not be considered entirely correct or complete. Perhaps, more importantly, a gold standard for EHR data is simply not available in most cases.
Their final statement deserves additional emphasis because, once EHR adoption is substantially complete, there will be fewer external data sources (other than patients) to use for comparison. In addition, external data sources themselves raise new issues. For example, if two or more data sources disagree, one has to deal with the problem of determining which is correct. Finally, without the ability to ensure semantic homogeneity among data sources, comparing data from disparate sources is difficult, especially when working with large datasets.
Having reviewed frequently used quality assessment methods, the authors note the looseness of the terminology used to describe aspects of EHR data quality and turn their attention to arriving at more satisfactory definitions for key terms. Based on their analysis, they suggest the following definitions.
Overall, we empirically derived five substantively different dimensions of data quality from the literature. The dimensions are defined below.
- Completeness: Is a truth about a patient present in the EHR?
- Correctness: Is an element that is present in the EHR true?
- Concordance: Is there agreement between elements in the EHR, or between the EHR and another data source?
- Plausibility: Does an element in the EHR make sense in light of other knowledge about what that element is measuring?
- Currency: Is an element in the EHR a relevant representation of the patient state at a given point in time?
From these five dimensions, they settled on completeness, correctness, and currency as the cornerstones of data quality, with the remaining two properties being derivable from these three. This paper strikes a chord with me.
I recall reading Accuracy of Data in Computer-based Patient Records by Hogan and Wagner, and stressing over how to address data quality management issues while working on the 1917 Clinic EHR Project. Data quality was a real concern because a major portion of the project consisted of migrating data from existing clinical databases. A post is too short a medium to share all that I learned about data quality management from that project and others over the years, so I will focus on just one aspect, completeness.
Weiskopf and Weng define completeness as the degree to which the truth about a patient is present in the EHR. This is an apparently reasonable definition. The problem, of course, is the pesky notion of truth. The whole truth, one might say, depends on the circumstances. More precisely, I would say it depends, more often than not, on both the diagnosis and the patient’s clinical state.
For example, the number of BP measurements required to yield a clinically valid picture of a patient differs by diagnosis and current clinical state. In an outpatient setting, one measurement per visit might do for a patient known to be normotensive. However, there are times when the diagnosis being entertained or the disease being treated requires sitting, standing, and supine BPs or readings from both arms. A single accurate BP reading is therefore sufficient in some circumstances and insufficient in others.
Complete requires a denominator. Without a denominator, complete is impossible to define objectively. Since clinical care standards do not dictate all data elements that should be collected per diagnosis or clinical state, it is not possible to say, other than by some local or personal preference, what a complete dataset necessarily contains. Without specific data element requirements, completeness is in the eye of the beholder.
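The denominator problem can be sketched in a few lines. In the example below, the same recorded data scores differently depending on which set of required elements is used as the denominator. The context names and the required sets in `REQUIRED_BP` are assumptions made up for illustration, not clinical guidance.

```python
# Hypothetical requirement sets: which BP readings a "complete" record needs
# depends on the clinical context. Illustrative assumptions, not guidance.
REQUIRED_BP = {
    "routine_normotensive": {"sitting"},
    "orthostatic_workup": {"sitting", "standing", "supine"},
}


def completeness(recorded, context):
    """Fraction of the elements required for this context that are present."""
    required = REQUIRED_BP[context]      # the denominator
    present = required & set(recorded)   # the numerator
    return len(present) / len(required)


# One sitting BP reading, stored as (systolic, diastolic).
readings = {"sitting": (128, 82)}

routine = completeness(readings, "routine_normotensive")
workup = completeness(readings, "orthostatic_workup")
print(f"Routine visit: {routine:.2f}")
print(f"Orthostatic workup: {workup:.2f}")
```

The record has not changed between the two calls; only the denominator has. Without an agreed `REQUIRED_BP`-style list, the function has nothing to divide by.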
Data granularity should not be overlooked as an aspect of completeness. Granularity issues call attention to the differences between data capture for clinical needs and data capture for research requirements. Clinicians may be comfortable with less detail than researchers need and might resent the extra time required for detailed data collection. Smoking history capture is a good example of this potential problem. Consider the difference in detail between the following smoking histories.
- Hx #1 — Current every day smoker—Yes (MU compliant)
- Hx #2 — Current every day smoker—Yes
  - Cigarettes only, 1 pack/day
  - Started smoking in 2001
  - Has never used chewing tobacco or snuff
The first is MU compliant, but lacks detail. The second offers more detail, but it takes longer to record. Is either of the two complete? It depends on the research question. The same problem arises for the HPI, ROS, PMH, etc., when trying to determine the level of detail required for routine data capture.
Weiskopf and Weng encourage researchers to adopt stricter criteria for assessing quality and to employ more precisely defined terms for the dimensions of data quality. They do not offer specific suggestions for addressing these issues, which is easy to understand as these are difficult challenges with no simple solutions.
Establishing truth in an EHR requires much more than controlled terminologies, ontologies, etc. Data quality is an issue that touches on everything from the user interface to the database. Therefore, in addition to these tools, ensuring data quality also requires a clear statement of the data required for specific circumstances along with software and database mechanisms to enforce those requirements. I think the best approach is one that begins with a standard dataset.
The 2003 IOM report, Key Capabilities of an Electronic Health Record System, mentions the notion of an EHR having a defined dataset. The report’s concept of an EHR dataset is more general than what I am advocating; nevertheless, I think the basic concept is worth adopting and expanding. Why? First, it immediately solves the denominator problem. Second, disease registries already feature standard datasets for common ailments that can be used as prototypes.
Standard datasets derived from disease registries offer the immediate benefit of testable data standards that can be quickly proposed, refined, and tested in applications. Completed dataset specifications would provide guidance for validation requirements (normal ranges, gender, permissible values, etc.), data elements, data types, formats, and terms, which would decrease variations in EHR data and make mapping between EHRs easier. Further, dataset specifications would provide the information required for designing database-level controls and software validation routines. Since we would not be starting from scratch, dataset specifications should not take years to complete. Once defined, yearly reviews could be used to refine and update specifications. Using such an approach, the value/validity of standard datasets could be tested quickly, eliminating the need to wait years in order to determine if they, in fact, aid in improving EHR data quality.
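As a rough sketch of how a dataset specification could feed a software validation routine, the example below declares a few data elements with types, permissible values, and ranges, then checks a record against them. Everything here is hypothetical: the `ElementSpec` class, the `HYPERTENSION_DATASET` list, the element names, and the numeric limits are placeholders and are not taken from any registry or standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ElementSpec:
    """One data element in a hypothetical standard dataset."""
    name: str
    dtype: type
    required: bool = True
    permitted: Optional[frozenset] = None
    min_value: Optional[float] = None
    max_value: Optional[float] = None


# Illustrative specification; limits are placeholders, not clinical ranges.
HYPERTENSION_DATASET = [
    ElementSpec("systolic_bp", int, min_value=50, max_value=300),
    ElementSpec("diastolic_bp", int, min_value=20, max_value=200),
    ElementSpec("smoking_status", str,
                permitted=frozenset({"current", "former", "never"})),
    ElementSpec("packs_per_day", float, required=False,
                min_value=0, max_value=10),
]


def validate(record, specs):
    """Return a list of problems found when checking record against specs."""
    problems = []
    for spec in specs:
        value = record.get(spec.name)
        if value is None:
            if spec.required:
                problems.append(f"{spec.name}: missing")
            continue
        if not isinstance(value, spec.dtype):
            problems.append(f"{spec.name}: expected {spec.dtype.__name__}")
            continue
        if spec.permitted is not None and value not in spec.permitted:
            problems.append(f"{spec.name}: value not permitted")
        if spec.min_value is not None and value < spec.min_value:
            problems.append(f"{spec.name}: below minimum")
        if spec.max_value is not None and value > spec.max_value:
            problems.append(f"{spec.name}: above maximum")
    return problems


problems = validate(
    {"systolic_bp": 400, "smoking_status": "sometimes"},
    HYPERTENSION_DATASET,
)
for problem in problems:
    print(problem)
```

Because the specification is plain data, the same list could in principle drive user-interface checks, database constraints, and completeness reports, which is the appeal of agreeing on the dataset first.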
The solution to the EHR data quality conundrum requires, among other things, the creation of objective measures of completeness. My struggles with data quality issues have convinced me that standard datasets are a good place to start. Detailed clinical models appear to be a move in this general direction. Are they the solution? Only time will tell…