We gain understanding by asking good questions. Questions about even the most common, mundane things can yield wisdom and point the way to important discoveries. According to William Stukeley, Isaac Newton began his research into gravity because he wondered why apples always fell down and never sideways or up (1). How many people had seen apples fall, but never bothered to consider why they always fell down? Solving problems requires asking questions that challenge the conventional wisdom. This series of posts is about asking questions concerning clinical software design and looking for new pathways for clinical information system research, not offering a specific set of approaches or solutions. Yes, I have a few ideas and suggestions to share, but not in a dogmatic way. I have, by far, many more questions than answers.
Are the computational and informational properties of patient data identical?
Patient information, it seems to me, has always been taken for granted when it comes to clinical software design. Electronic health record systems are the successors of paper charts, and paper-based thinking is evident in their design (see Is the Electronic Health Record Defunct?). Having paper charts as a source of inspiration makes some software design decisions seem as obvious and uninteresting as apples falling.
The informational aspects of patient data are familiar to everyone because those aspects are well known from paper charts. Problem lists, medication lists, SOAP notes, lab reports, etc. provide the information needed for clinical care. And when building electronic systems that replace paper charts, it is obvious that the informational aspects of patient data must be maintained. No argument there. However, when building an electronic system to replicate the informational aspects of paper charts, it is easy to overlook the fact that patient data in electronic form should be structured in a way that allows their computational properties to be fully expressed.
The recent JASON report from AHRQ (2) mentions “atomic” data primarily as a means of promoting data exchange. However, I find the notion of atomic data intriguing because it acknowledges the computational aspects of patient data. The report suggests that atomic data be accompanied by “…metadata, context, and provenance information…” Given that data may be used for direct patient care and/or computationally, is there a way to determine the minimum collection of metadata, context, and provenance information that yields a viable information unit? For example, a serum calcium level must come with a minimal set of properties to make it useful (informational) to a clinician: the date, time, patient ID, numerical value, units, and normal-range values. However, what if an automated care pathway were to use the calcium level? What else might be required?
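To make the calcium example concrete, here is a minimal Python sketch of what such an information unit might look like. Everything in it is an assumption made for illustration: the class names, the field names, the reference range, and the rule that an automated pathway would insist on provenance are mine, not anything specified by the JASON report.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, Optional


@dataclass(frozen=True)
class Provenance:
    """Where a value came from (hypothetical fields, for illustration only)."""
    source_system: str   # e.g., the lab system that produced the result
    recorded_by: str     # the person or device that entered the value


@dataclass(frozen=True)
class AtomicObservation:
    """One lab value plus the properties a clinician needs to read it."""
    patient_id: str
    name: str
    value: float
    units: str
    normal_low: float
    normal_high: float
    observed_at: datetime
    # Extended properties that software, rather than a human reader, might need.
    provenance: Optional[Provenance] = None
    context: Dict[str, str] = field(default_factory=dict)

    def in_normal_range(self) -> bool:
        """Informational use: is the value inside the stated normal range?"""
        return self.normal_low <= self.value <= self.normal_high

    def is_computable(self) -> bool:
        """Assumed rule: an automated pathway refuses values without provenance."""
        return self.provenance is not None


# Illustrative values only; the range shown is not taken from any source above.
calcium = AtomicObservation(
    patient_id="P001",
    name="serum calcium",
    value=9.4,
    units="mg/dL",
    normal_low=8.5,
    normal_high=10.5,
    observed_at=datetime(2014, 6, 1, 8, 30),
)
```

In this sketch the value is perfectly readable by a clinician (`calcium.in_normal_range()` is true) yet would be rejected by the hypothetical pathway (`calcium.is_computable()` is false) until provenance is attached. That gap between "readable" and "computable" is the question I am raising.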
This topic is not as esoteric as it sounds. While working on the CNICS project at UAB, we had to create a protocol for pooling data from different clinics. We created codes for a range of missing-data concepts, which, among other things, denoted why values were missing and how likely they were to become available in the future. We had reliability codes for how trustworthy a diagnosis might be, and codes that indicated the source of specific pieces of information. Much of this was designed into the EHR we built.
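As a rough illustration of the idea (not the actual CNICS code set, which I am not reproducing here), missing-data, reliability, and source codes could travel with each pooled value along these lines. All names and code values below are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class MissingReason(Enum):
    """Hypothetical reasons a value is absent."""
    NOT_COLLECTED = "not collected at this site"
    PENDING = "ordered, result pending"
    NOT_APPLICABLE = "not applicable to this patient"


class Reliability(Enum):
    """Hypothetical scale for how trustworthy a diagnosis is."""
    SELF_REPORTED = 1
    PROBABLE = 2
    CONFIRMED = 3


@dataclass(frozen=True)
class PooledValue:
    """A data element together with its quality and source codes."""
    value: Optional[str]
    missing_reason: Optional[MissingReason] = None
    reliability: Optional[Reliability] = None
    source: Optional[str] = None   # e.g., "chart review" or "lab feed"

    def __post_init__(self) -> None:
        # A value is either present or explained: never both, never neither.
        if (self.value is None) == (self.missing_reason is None):
            raise ValueError("provide either a value or a missing_reason")

    def may_arrive_later(self) -> bool:
        """True when the value is missing now but could become available."""
        return self.missing_reason is MissingReason.PENDING


diagnosis = PooledValue("hypertension", reliability=Reliability.CONFIRMED,
                        source="chart review")
awaiting = PooledValue(None, missing_reason=MissingReason.PENDING)
```

The point of the check in `__post_init__` is that a missing value is never silently blank; software consuming the pooled data always knows why something is absent.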
I have mentioned openEHR before (3). It is an effort to create a formal information model for electronic health records. Using archetypes, openEHR captures many aspects of the extended properties alluded to by the JASON report (4). Archetypes are interesting, and may be a viable solution for managing the computational aspect of patient data—time will tell.
Is it time for a standard patient data set?
Determining the computational aspects of patient data requires research. At present, this is being done mostly at individual companies. As a result, every EHR system has its own database design with proprietary names, formats, properties, and groupings for data elements. The same freedom exists for representing higher concepts such as the problem list, medication list, or family history. Currently, it is difficult for clinical software designers to build on the work of others because so little is discussed publicly concerning clinical system design and construction. What if clinical software designers setting out to build a clinical system did not have to reinvent the patient-data-model wheel?
Wouldn’t life be easier for everyone if a standard data set specification existed that provided a data model that included the metadata, context, and provenance information suggested by JASON? The patient data set specification could be managed at a central location (e.g., the National Library of Medicine), and protocols would be in place for adding, naming, and revising data elements and their extended properties. The latest version of the data set definition and a working implementation example would be available for download.
Such an approach might actually stimulate innovation because modeling for patient data is a huge headache for anyone designing clinical software. I know this from personal experience… When designing the EHR at UAB, I spent over eight months tweaking the data model. I would have welcomed a “drop in” patient data specification—especially one that had been vetted by a community of researchers and software companies.
Conventional wisdom suggests that commercial systems must maintain proprietary data models in order to be competitive. But is this true? Patient data are not the only type of information found in clinical systems. The user interface, algorithms, data structures, features, price, service quality and other characteristics are sufficient to differentiate one product from another even if the patient data were standardized.
Would data exchange benefit from a standard patient data set?
Consider what the lack of standardized patient data means for sharing data. In order to move patient data from one EHR to another, one must map patient data properties in the originating system to something acceptable to the receiving system. The absence of constraints on originating and receiving systems makes exchange harder than it need be. If there were a single standard that everyone could use for patient data, no one would have to waste time creating their own, and this would still leave plenty of room for product differentiation and innovation.
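The mapping burden can be sketched in a few lines of Python. The vendor field names and the "standard" names below are invented for illustration; no real EHR or specification is being described.

```python
from typing import Any, Dict


def to_standard(record: Dict[str, Any], field_map: Dict[str, str]) -> Dict[str, Any]:
    """Rename a vendor record's fields to the shared standard names."""
    return {field_map[k]: v for k, v in record.items() if k in field_map}


def from_standard(record: Dict[str, Any], field_map: Dict[str, str]) -> Dict[str, Any]:
    """Rename standard fields back to one vendor's local names."""
    inverse = {std: local for local, std in field_map.items()}
    return {inverse[k]: v for k, v in record.items() if k in inverse}


# Hypothetical local-name -> standard-name maps for two imaginary vendors.
VENDOR_A_MAP = {"pt_id": "patient_id", "ca_serum": "serum_calcium", "ca_units": "units"}
VENDOR_B_MAP = {"PatientKey": "patient_id", "CalciumLevel": "serum_calcium",
                "UnitOfMeasure": "units"}

record_a = {"pt_id": "P001", "ca_serum": 9.4, "ca_units": "mg/dL"}
record_b = from_standard(to_standard(record_a, VENDOR_A_MAP), VENDOR_B_MAP)


def pairwise_mappings(n_systems: int) -> int:
    """Directed system-to-system mappings needed without a shared standard."""
    return n_systems * (n_systems - 1)


def standard_mappings(n_systems: int) -> int:
    """Mappings needed when each system maps only to and from the standard."""
    return 2 * n_systems
```

With ten systems, direct exchange needs 90 one-way mappings, while going through a shared data set needs 20. If vendors adopted the standard names natively, even those would disappear.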
Would a standard patient data set be a boon for clinical software research?
A standard patient data set would also make research on clinical systems more comparable across studies. The availability of a reference implementation that could be populated with local data would allow informatics researchers and software designers to work from the same model. This would encourage an open dialog about data modeling for clinical systems that, for all intents and purposes, does not currently exist. Terminologies could be incorporated into the reference implementation, reducing headaches for local implementers.
Work on standard data sets is already underway at both the ACC (5) and the ACOG (6). MU stage 2 also describes a common data set (7). Why not pool these and similar efforts and share methods, knowledge and best practices? The creation of a standard patient data set does not have to start from scratch. LOINC would be a great starting point for testing atomic data concepts.
Would a reference set of relations be a useful part of a patient information component?
One area that I think is ripe for research is the mathematical representation of clinical concepts. The series Mathematics and Clinical Concepts demonstrated a number of areas in which discrete math could be used for clinical concepts. One math structure that is readily used this way is a relation (see Mathematics and Clinical Concepts, Part V: Using Relations to Organize Clinical Information). Everyone who has used a relational database knows about relations. Unfortunately, relations are rarely discussed outside of their database usage. Relations existed before relational databases, and they can be used without a database.
Relations can be represented using structures called tuples (8). Tuples allow for grouping of related values in ways that are very familiar to anyone who has read a medical chart, and are supported by most modern programming languages. The problem list, medication list, a prescription, etc. can all be represented using relations. It seems that just as a standard patient data set would lower the barrier to creating clinical software and promote innovation, a reference set of clinical relations would do the same.
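As a small sketch of what I mean, here is a problem list represented as a relation (a set of tuples) in Python, with no database involved. The attribute names, the sample rows, and the helper functions `select` and `project` are my own illustrative choices, not a proposed reference set.

```python
from datetime import date
from typing import Callable, NamedTuple, Set, Tuple


class Problem(NamedTuple):
    """One tuple of the problem-list relation."""
    patient_id: str
    problem: str
    onset: date
    status: str


# A relation is simply a set of tuples that share the same attributes.
problem_list: Set[Problem] = {
    Problem("P001", "hypertension", date(2009, 3, 14), "active"),
    Problem("P001", "appendicitis", date(2001, 7, 2), "resolved"),
    Problem("P002", "type 2 diabetes", date(2012, 11, 5), "active"),
}


def select(relation: Set[Problem], predicate: Callable[[Problem], bool]) -> Set[Problem]:
    """Relational selection: keep the tuples that satisfy the predicate."""
    return {row for row in relation if predicate(row)}


def project(relation: Set[Problem], *attributes: str) -> Set[Tuple]:
    """Relational projection: keep only the named attributes."""
    return {tuple(getattr(row, a) for a in attributes) for row in relation}


active = select(problem_list, lambda row: row.status == "active")
active_names = project(active, "patient_id", "problem")
```

Here `active_names` contains the pairs ("P001", "hypertension") and ("P002", "type 2 diabetes"). The same two operations would work unchanged on a medication list or a set of prescriptions, which is what makes a shared set of relation definitions appealing.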
Can data quality be made more objective?
In Methods and Dimensions of Electronic Health Record Data Quality Assessment: Enabling Reuse for Clinical Research, Weiskopf and Weng provide a sobering view of data quality as it concerns current EHR systems. There is no current measure of data quality that can be applied across all clinical systems (or, for that matter, any two). A standard patient data set would make data quality determinations easier and more straightforward. A standard data set could also lead to sharable routines that address a range of common issues such as validation, security, care quality, and decision support. It is worth considering…
Patient information as a service?
Current software design practices use a data access layer through which all data requests are served. This makes for a clean design by preventing database accesses from arbitrary points in the code. In its simplest incarnation, the data access layer contains SQL commands; more complex versions contain detailed business rules. Since the data access layer could be easily accessed via an API, it could be remote from software systems that use it. With this in mind, is it feasible for patient information to be managed as a physically separate component from the rest of a clinical care system? Stated differently, might it be possible for companies to exist that do nothing but sell patient information components to other companies? There are companies that sell workflow technology and those that sell terminology services; why not a patient information component? I have more questions, but this post is already way too long.
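For readers who think in code, the separation described above might be sketched like this in Python. The interface name, its two methods, and the in-memory stand-in are all hypothetical; a real component would sit behind a network API and cover far more than medications.

```python
from typing import Dict, List, Protocol


class PatientInformationService(Protocol):
    """What the rest of a clinical system is allowed to ask for (hypothetical)."""

    def get_medications(self, patient_id: str) -> List[str]: ...

    def add_medication(self, patient_id: str, medication: str) -> None: ...


class InMemoryPatientInformation:
    """Local stand-in; a vendor-supplied remote component could replace it."""

    def __init__(self) -> None:
        self._medications: Dict[str, List[str]] = {}

    def get_medications(self, patient_id: str) -> List[str]:
        # Return a copy so callers cannot alter stored data directly.
        return list(self._medications.get(patient_id, []))

    def add_medication(self, patient_id: str, medication: str) -> None:
        self._medications.setdefault(patient_id, []).append(medication)


def medication_count(service: PatientInformationService, patient_id: str) -> int:
    """Application code depends only on the interface, not on the storage."""
    return len(service.get_medications(patient_id))


service = InMemoryPatientInformation()
service.add_medication("P001", "lisinopril")
service.add_medication("P001", "metformin")
```

Because `medication_count` knows nothing about where the data live, swapping the in-memory class for a remote patient information component would not require touching the clinical logic.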
We have been living with the idea of the electronic health record and what it encompasses for a few decades, and doing so has colored how we think about clinical care systems. Now is as good a time as any to rethink all of those cherished assumptions and start asking questions. What are the computational properties of clinical data? Is there a simpler way to enable data exchange using standard data sets? Can a theory tie clinical care systems to clinical practice and aid in software design? Are there engineering principles for clinical care systems awaiting discovery? Progress starts with asking questions, and falling apples are everywhere…