Database management systems are central to healthcare information technology. That fact was a source of frustration in the mid-’80s, when I began creating applications. My first real healthcare application was written in Apple BASIC and stored Swan-Ganz catheter readings. Data were stored using file commands (e.g., READ, WRITE, POSITION); it was messy, but it worked. Next came Turbo Pascal. The Turbo Pascal Database ToolBox was supposed to solve my database needs. Unfortunately, I could never quite get it to work. Later, using Turbo Prolog, I created a B+ tree-based filing system that worked well. However, as with the Apple BASIC application, every new feature had to be written and tested by hand, which was not an optimal solution.
Having to create my own file systems taught me how storage format complements data type and usage. Healthcare data come in a variety of types and must be organized correctly to preserve their meaning. Clinical systems may contain many types of information aside from patient-related data. For example, there may be a knowledge base of reference materials, rules for guidelines and workflow, and ontologies and/or terminologies. While each of these may be stored in a relational database, relational systems are not necessarily the best storage option in every situation. The flat, two-dimensional table metaphor sometimes makes modeling concepts, relationships, and interactions too much work, and it can lead to overly complex queries. Consequently, though relational databases are obviously quite useful, I have found that the relational model sometimes gets in the way. Fortunately, over the last decade the number of data management options has exploded.
Last fall I began collecting information on database options while reviewing development environments. As of today, I have reviewed information on the following types of database systems: relational, object-oriented, in-memory, and NoSQL. The astonishing fact is that there are free open-source versions in each category.
Object-oriented database management systems (OODBMS) have been around since the ’80s. They have not had much commercial success, for a number of reasons. Compared to relational systems, they are often said to underperform in transactions per second. I have never researched this and have no idea whether it is true today (or ever was). My main reason for not experimenting with them was cost. About 10 years ago, I was quoted a price in the neighborhood of $20,000 for a basic license, which quickly cooled my ardor. Today, there are affordable commercial versions and free open-source systems that can be used to develop applications. OODBMS find their greatest value in modeling complex data (e.g., geographic/spatial, computer-aided design).
In-memory databases liberate data management performance from disk drives by keeping the working data in RAM. Speed is the major advantage of this approach. Once upon a time, computer memory and disk storage were very expensive; today, not so much. As we move to 64-bit computers with multicore processors and gigabytes of RAM, in-memory database applications become more feasible and attractive.
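To make the in-memory idea concrete, here is a minimal sketch using the in-memory mode of SQLite from Python's standard library. SQLite is a relational engine that can run entirely in RAM, not one of the dedicated in-memory products I reviewed, and the table, column names, and readings are made up for illustration.

```python
import sqlite3

# ":memory:" asks SQLite to keep the whole database in RAM; nothing is
# written to disk, and the data disappear when the connection is closed.
conn = sqlite3.connect(":memory:")

# Hypothetical table of pulmonary artery pressure readings.
conn.execute("CREATE TABLE reading (patient_id INTEGER, pap_systolic INTEGER)")
conn.executemany(
    "INSERT INTO reading VALUES (?, ?)",
    [(42, 25), (42, 27), (7, 30)],  # illustrative values only
)

# Ordinary SQL works as usual; only the storage location has changed.
rows = conn.execute(
    "SELECT patient_id, AVG(pap_systolic) "
    "FROM reading GROUP BY patient_id ORDER BY patient_id"
).fetchall()
print(rows)
```

The trade-off is visible in the first comment: the speed comes from skipping the disk, so durability has to be handled some other way if the data matter.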
NoSQL databases have piqued my curiosity more than any other database system. Confusingly, the term “NoSQL” covers a lot of territory. As best I can determine, there are four major categories of NoSQL systems based on how they store information: document stores, key-value stores, wide-column stores, and graph databases. Many NoSQL databases were designed to run high-traffic internet sites. They integrate well with programming languages used for web development and are considered easier to scale when demand increases. Systems of this kind are behind sites such as Amazon and Google. I became interested in NoSQL because of my interest in Petri nets, which led me to graph databases. (Petri nets are based on graph theory, as are graph databases.)
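The four categories are easier to tell apart when the same record is shaped each way. The sketch below uses plain Python structures as stand-ins for the stores; no real database or vendor API is involved, and every key, field name, and value is hypothetical.

```python
import json

# 1. Key-value store: one opaque value looked up by one key.
#    The store does not look inside the value; the application decodes it.
key_value = {
    "patient:42:reading:1": json.dumps({"pap_systolic": 25, "pap_diastolic": 10}),
}

# 2. Document store: a self-describing, nested record the store can index.
document = {
    "_id": "patient:42",
    "name": "Example Patient",
    "readings": [{"seq": 1, "pap_systolic": 25, "pap_diastolic": 10}],
}

# 3. Wide-column store: row key -> column family -> column -> value.
#    Rows in the same family need not share the same columns.
wide_column = {
    "patient:42": {
        "demographics": {"name": "Example Patient"},
        "readings": {"1:pap_systolic": 25, "1:pap_diastolic": 10},
    },
}

# 4. Graph database: nodes with properties, plus labeled edges between them.
nodes = {
    "patient:42": {"name": "Example Patient"},
    "reading:1": {"pap_systolic": 25, "pap_diastolic": 10},
}
edges = [("patient:42", "HAS_READING", "reading:1")]


def neighbors(node_id, label):
    """Return the ids of nodes reached from node_id by edges with this label."""
    return [dst for src, lbl, dst in edges if src == node_id and lbl == label]


# The same question ("what is the systolic value?") asked of each shape.
from_kv = json.loads(key_value["patient:42:reading:1"])["pap_systolic"]
from_doc = document["readings"][0]["pap_systolic"]
from_wide = wide_column["patient:42"]["readings"]["1:pap_systolic"]
from_graph = nodes[neighbors("patient:42", "HAS_READING")[0]]["pap_systolic"]
```

The data are identical in all four; what changes is how much structure the store itself understands, from none (key-value) to explicit relationships (graph).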
As you can see, nowadays there are many choices for data management. The challenge is choosing the right tool for the job. Considering the clunky file access routines I had to live with in the ’80s, the current smorgasbord of data management options makes me positively giddy. I will be evaluating key-value and document stores for my startup. I’ll let you know what I find.