Exploring EHR Design with Python

Finally, I have had time to play with Python. I have been trying to find time since last spring, when I got my shiny new MacBook Pro. After spending recent years with C-inspired languages that are compiled and strictly typed, I am finding Python a refreshing change. Python can be used interactively, which makes experimentation easy and fun. Even better, it has built-in support for sets, lists, and tuples (think relations), which makes it easy to try new ways of organizing data for computational use. This makes it great for experimenting with EHR concepts.

Getting started with Python can be confusing because there are currently two versions available: 2.7 and 3.3. Though the two versions are similar, they are different enough to trip up someone new to the language. I have chosen to use 2.7 because, being older, it has wider support in terms of libraries and other add-ons. Python is a big topic, so I am going to focus on just a few data types in demonstrating why I like the language so much. All examples use the IDLE IDE, which permits one to type in Python code and immediately see its effects.

Sets, lists, and tuples
Support for sets, lists, and tuples is present in most modern programming languages.   However, as I have discovered, Python implements them in a way that retains their discrete mathematics flavor.

Sets behave as one would expect.    Standard operations such as membership, intersection, subset, and union are available.

Creating a set is simple:

>>> pneumonia_SS = set(['fever', 'cough', 'sputum', 'rales'])

This is a set containing signs and symptoms of pneumonia.   Testing whether a given sign is typical in pneumonia is easy as well:

>>> 'fever' in pneumonia_SS

This query returns True. Sets do not permit duplicates, and they are not ordered.
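The other set operations mentioned above work the same way. Here is a minimal sketch, written to run under either 2.7 or 3.x. The second set, bronchitis_SS, and the observed findings are made-up values for illustration only, not a clinical reference.

```python
# Signs and symptoms of pneumonia, as in the example above.
pneumonia_SS = set(['fever', 'cough', 'sputum', 'rales'])

# Hypothetical second set, for illustration only (not a clinical reference).
bronchitis_SS = set(['cough', 'sputum', 'wheezing'])

# Findings observed in a hypothetical patient.
observed = set(['fever', 'cough'])

shared = pneumonia_SS & bronchitis_SS          # intersection
combined = pneumonia_SS | bronchitis_SS        # union
only_pneumonia = pneumonia_SS - bronchitis_SS  # difference
all_typical = observed <= pneumonia_SS         # subset test

# Duplicates collapse: adding 'fever' twice still yields one element.
no_duplicates = set(['fever', 'fever'])
```

Here shared holds 'cough' and 'sputum', all_typical is True, and no_duplicates has a single element.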

Tuples are ordered structures that may hold any type of information. They are analogous to the tuples that make up relations in discrete mathematics, which makes it easy to represent common EHR constructs, many of which are relations. Here is a tuple illustrating a medication history entry.

>>> newMed = ('Jones', 'Lasix', 25, 'mg', 'po', 'qid', '12/14/2013')

Tuples in Python are immutable. This means that values cannot be changed once a tuple has been created.  This is not as limiting as it sounds.  Tuples are useful for representing data that one does not want to be changed during use.
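A short sketch of what immutability means in practice: assigning to an element raises an error, so a change is recorded by building a new tuple and leaving the original entry alone. The revised dose and date below are invented for the example.

```python
newMed = ('Jones', 'Lasix', 25, 'mg', 'po', 'qid', '12/14/2013')

# Attempting to change an element in place raises TypeError.
try:
    newMed[2] = 40
    changed = True
except TypeError:
    changed = False

# A dose change is recorded as a new tuple; the original entry is untouched.
# The new dose and date are made-up values for illustration.
revisedMed = newMed[:2] + (40,) + newMed[3:6] + ('12/21/2013',)
```

After this runs, changed is False, newMed still carries the dose of 25, and revisedMed is a separate entry with the new dose.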

Like tuples, lists may contain values of different types; however, they are more flexible because they can be changed after they are created. Lists are similar to arrays in other programming languages. Here is a statement that creates a list of medications.

>>> Meds = ["lasix", "aspirin", "nifedipine"]

Items are easily added to a list using append. Adding penicillin to the list takes one statement:

>>> Meds.append("penicillin")

This results in the list ["lasix", "aspirin", "nifedipine", "penicillin"].

Lists may be manipulated in a variety of ways: insertions, deletions, sorts, etc. What makes Python so compelling for experimentation is how easily one can combine its data types to represent concepts. For example, lists can contain other Python data types, so one can create a list of tuples, which means it is possible to create list representations of EHR structures. A medication list can be created by adding medication tuples to a list.

>>> Med1 = ('Jones', 'Lasix', 20, 'mg', 'po', 'qd', '12/14/2013')

>>> Med2 = ('Jones', 'aspirin', 325, 'mg', 'po', 'prn', '12/14/2013')




Adding both tuples to a list, for example with append, would create the following medication list.

[('Jones', 'Lasix', 20, 'mg', 'po', 'qd', '12/14/2013'), ('Jones', 'aspirin', 325, 'mg', 'po', 'prn', '12/14/2013')]
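One way to build and query such a list is sketched below. The name med_list is hypothetical, and the two queries are just examples of what becomes possible once the entries are tuples in a list.

```python
Med1 = ('Jones', 'Lasix', 20, 'mg', 'po', 'qd', '12/14/2013')
Med2 = ('Jones', 'aspirin', 325, 'mg', 'po', 'prn', '12/14/2013')

# med_list is a hypothetical name chosen for this sketch.
med_list = []
med_list.append(Med1)
med_list.append(Med2)

# Because each entry is a tuple, fields are read by position:
# index 1 is the drug name and index 5 is the frequency.
drug_names = [med[1] for med in med_list]
prn_meds = [med for med in med_list if med[5] == 'prn']
```

Here drug_names is ['Lasix', 'aspirin'] and prn_meds contains only the aspirin entry. Reading fields by position is fragile, which is itself a design question worth exploring.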

Having the ability to experiment in this manner makes it much easier to explore the properties of key EHR structures such as problem lists, medication lists, notes, provider profiles, patient demographics, etc.

The EHR as an object worthy of study
In prior posts, I have touched on the idea that an EHR should be considered as more than a front-end to a database because building systems that support care processes requires more than supplying users with data.   The most extreme version of “database front-end” thinking results in information displays that are little more than relational database views.    Creating systems that intimately support care processes requires considering EHR structures as information units that have specific properties.

There seem to be at least four categories of properties worth investigating: user interface, computational (includes data structures), semantic, and storage.  To date, semantic properties have received the most attention in the form of research focused on terminologies and interoperability protocols.  The growing interest in usability is drawing attention to user interfaces, but even here, the focus is more on what appears on the screen than it is on the intricacies of how those screens are generated and the intrinsic properties of what is displayed. Excluding semantics, let’s take a look at the types of issues worth exploring in each of these categories using the problem list as an example.

User Interface
The problem list is a major user interface component providing important clinical information to EHR users.  As such, the ability to manipulate a problem list is important. For example, a user may wish to sort a list chronologically, highlight urgent issues, flag a problem for follow-up, or hide inactive problems.   How much of the way problems are displayed should be under user control?  What care functions should users be able to initiate from inside the problem list?
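To make these questions concrete, here is a rough sketch of the list manipulation behind such a display. The Problem record, its field names, and the sample entries are all assumptions made for the example, not a proposed design.

```python
from collections import namedtuple
from datetime import date

# Hypothetical problem record; the field names are assumptions.
Problem = namedtuple('Problem', ['name', 'onset', 'active', 'urgent'])

problems = [
    Problem('hypertension', date(2009, 3, 2), True, False),
    Problem('pneumonia', date(2013, 12, 10), True, True),
    Problem('ankle sprain', date(2011, 7, 19), False, False),
]

def visible_problems(problem_list, show_inactive=False, urgent_first=True):
    """Return the problems to display, newest first, per user preferences."""
    shown = [p for p in problem_list if p.active or show_inactive]
    shown.sort(key=lambda p: p.onset, reverse=True)
    if urgent_first:
        # list.sort is stable, so chronological order is kept within each group.
        shown.sort(key=lambda p: not p.urgent)
    return shown

default_view = [p.name for p in visible_problems(problems)]
full_view = [p.name for p in
             visible_problems(problems, show_inactive=True, urgent_first=False)]
```

With this sample data, default_view is ['pneumonia', 'hypertension'] and full_view is ['pneumonia', 'ankle sprain', 'hypertension']. Each keyword argument corresponds to one display choice that could be placed under user control.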

Computational Properties
There are multiple ways in which problem list elements can be moved from a database and presented to the user. The easiest approach is executing a database query and sending the results directly to a grid, or a similar user interface component, for presentation to the user. One major downside to this approach is that it can couple the presentation of the data too tightly to the database design. However, for me the larger issue is that this design approach treats a problem list as nothing more than text from a database, nothing more than a list. This is a serious failing, because it assumes that ALL that one should do with a problem list is look at it. While this might be true for paper, there is no reason to port this limitation to an electronic medium.

There is something to be gained by considering the problem list as a computational unit with properties that can be exploited for the benefit of EHR users and patients.   For example, what if a problem list was a smart component that could automatically cross-check its elements against a patient’s medications or test results?
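As a thought experiment, a cross-checking problem list might look something like the sketch below. The ProblemList class, the INDICATIONS table, and its entries are hypothetical and illustrative only; a real system would draw on a maintained drug knowledge base.

```python
# Hypothetical mapping from a medication to the problems that would justify it.
# The entries are illustrative only, not clinical guidance.
INDICATIONS = {
    'lasix': set(['heart failure', 'edema']),
    'nifedipine': set(['hypertension', 'angina']),
    'metformin': set(['diabetes mellitus']),
}

class ProblemList(object):
    """A problem list that can answer questions about itself."""

    def __init__(self, problems):
        self.problems = set(problems)

    def unexplained_medications(self, medications):
        """Return medications with known indications, none of which is listed."""
        unexplained = []
        for med in medications:
            indications = INDICATIONS.get(med)
            # Medications missing from the table are skipped, not flagged.
            if indications and not (indications & self.problems):
                unexplained.append(med)
        return unexplained

problem_list = ProblemList(['hypertension'])
gaps = problem_list.unexplained_medications(['lasix', 'aspirin', 'nifedipine'])
```

Here gaps is ['lasix']: the patient takes a medication for which no matching problem is recorded, which is the kind of discrepancy a smart component could bring to a clinician's attention.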

This paper refers to a means of improving problem list accuracy by using algorithms that search through patient records for specific disease criteria.  Should this not be a standard capability of all problem lists?  Making a problem list smarter requires imbuing it with the ability to ask questions, answer queries, and gather data.  A problem list with such capabilities could then interact with lower-level EHR functions/components (e.g., security protocols, workflow engines, decision support routines, etc.), enabling better process support.
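A very small sketch of the criteria-search idea follows. It is not the method from the paper mentioned above; the record layout, the rule list, and the single lab threshold are assumptions chosen for illustration, and real criteria would have to come from validated clinical sources.

```python
# Each rule pairs a candidate problem with a test applied to a patient record.
# The record layout and the single rule below are assumptions for illustration.
def has_elevated_hba1c(record):
    """True if any HbA1c result meets the illustrative threshold of 6.5."""
    return any(name == 'HbA1c' and value >= 6.5
               for name, value in record['labs'])

RULES = [('diabetes mellitus', has_elevated_hba1c)]

def suggest_problems(record, rules=RULES):
    """Return problems whose criteria are met but that are not yet listed."""
    listed = set(record['problems'])
    return [problem for problem, criterion in rules
            if problem not in listed and criterion(record)]

# Hypothetical patient record with made-up values.
record = {
    'problems': ['hypertension'],
    'labs': [('HbA1c', 7.2), ('LDL', 110)],
}
suggestions = suggest_problems(record)
```

For this record, suggestions is ['diabetes mellitus']. The output is a suggestion for a clinician to review, not an automatic addition to the list.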

Storage format
The third aspect of a problem list is its storage format. There is no standard for storing problems; it is up to the EHR designer. In a relational database, the simplest approach is a table with a patient ID, problem/diagnosis code, start date, and stop date, which is what appears in a paper chart. More complex concepts such as pending diagnoses, after-the-fact revisions, and ties to specific actions (e.g., tests ordered or additional provider visits) can of course be accommodated by any modern database. The question is not whether such information can be stored, but rather how explicit the linkages are and how easily new links and concepts can be added. If the problem list is considered to be a passive list, these questions are moot.
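The difference between the simple table and explicit linkages can be sketched with the sqlite3 module from Python's standard library. The table names, column names, status values, and the placeholder code 'pneumonia' are all assumptions; this is one possible layout, not a standard.

```python
import sqlite3

conn = sqlite3.connect(':memory:')  # in-memory database, for experimentation only

# Table and column names are assumptions, not a standard.
conn.executescript("""
CREATE TABLE problem (
    problem_id INTEGER PRIMARY KEY,
    patient_id INTEGER NOT NULL,
    code       TEXT NOT NULL,
    status     TEXT NOT NULL DEFAULT 'active',  -- e.g. 'active', 'pending', 'resolved'
    start_date TEXT,
    stop_date  TEXT
);
-- Explicit link from a problem to an action taken because of it.
CREATE TABLE problem_action (
    problem_id  INTEGER NOT NULL REFERENCES problem(problem_id),
    action_type TEXT NOT NULL,  -- e.g. 'test_ordered', 'follow_up_visit'
    detail      TEXT
);
""")

# 'pneumonia' stands in for a real diagnosis code.
conn.execute("INSERT INTO problem (problem_id, patient_id, code, status, start_date) "
             "VALUES (1, 100, 'pneumonia', 'pending', '2013-12-14')")
conn.execute("INSERT INTO problem_action VALUES (1, 'test_ordered', 'chest x-ray')")

rows = conn.execute(
    "SELECT p.code, p.status, a.action_type, a.detail "
    "FROM problem p JOIN problem_action a ON a.problem_id = p.problem_id"
).fetchall()
```

The query returns one row tying the pending pneumonia entry to the chest x-ray ordered because of it. Adding a new kind of link means adding a new action_type value rather than changing the schema, which is one answer to the question of how easily new links can be added.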

Back to Python.  Python is allowing me to explore EHR design by making it easy to build complex data structures interactively and experiment with them.  Emulating EHR concepts with sets, lists, and tuples is a snap. While it is possible to explore these ideas in any language, the time from idea to working code is much shorter using Python.   The idea of what constitutes an EHR system is due for a major overhaul.   I’ll let you know what I come up with…



  1. Interesting. Any update? Despite the broad popularity of Python I haven’t found much Python development in EHRs.

    1. Author

      EHR design is receiving much more attention than when this post was written. Workflow, decision support, and interoperability are seen as critical, so programming language matters are less interesting for the moment. The same holds for me. My work has focused on theory, architecture and design issues for the last year, away from coding and languages. Thanks for reading!
