We saw in the last post that taking the Cartesian product of two sets results in a collection of ordered pairs. Now, we are going to explore how ordered pairs and larger groupings can be used to organize information using *relations*.

Here is the definition of a relation taken from *Discrete Mathematics with Applications*, by Epp:

Let

AandBbe sets. A relationfromRAtoBis a subset ofA x B. Given an ordered pairin(x, y)A x B, x is related to y by, writtenR, if, and only if, (x, y) is inx R y. The setRAis called the domain ofand setRBis called its co-domain. (1)

This definition makes it clear that a relation ** R** is a subset of the Cartesian product of two sets. It further clarifies that the subset of ordered pairs contained in the relation is determined by whatever

**is. Here is an example. Suppose we have two sets:**

*R***A= {1, 2, 3}, B = {2,3,4}.**The Cartesian product of these two sets is the set of ordered pairs:

**A x B = {(1, 2), (1,3), (1,4), (2,2), (2,3), (2,4), (3,2), (3,3), (3,4)}**

From here, we can define a relationship on **A x B** that selects out the type of ordered pairs we are interested in. Let’s say we are interested in a relation **x R y **where

**Analyzing**

*R*= x < y.**A x B**, we can see that the ordered pairs that fulfill the

**x < y**relation are:

**x R y = {1,2), (1,3), (1,4), (2,3), (2,4), (3,4)}**

**Using relations**

Even though relations are defined as a subset of a Cartesian product, there is no requirement that they be generated in that way. Here is a clinical example. Suppose we are building a clinical app and the goal is to track patients and their diagnoses. We can began by defining a relation from the set of patients to the set of diagnoses – specifically that a patient **x **has diagnosis **y**. We will call this relation Problem List (** PL**). (The ellipses indicates that there are many more pairs than are listed.)

*PL* = {(Doe, headache), (Smith, chest pain), (Green, asthma), …}

This relation could have been created by taking a list of patients and a list of diagnoses, generating the Cartesian product, then checking each pair to see if it actually represents a real situation (i.e., Mr. Doe actually has complained of headaches). However, it could just as easily have been generated by clinicians adding patient problems into our app. The important thing is that relations obey certain mathematical rules, and we can make use of those rules for any relation regardless of how it was created.

**Databases and relations**

Since we know that relations are derived from sets, we can use a relation to generate a set. For example, we know that ordered pairs in the relation ** PL **have a patient’s name for the first value and a diagnosis for the second. With this knowledge, we could reconstruct the set of all known ailments (at least among this patient population) by simply collecting into a set all of the diagnosis elements from the ordered pairs. We could do the same with the patient elements and find everyone who has at least one problem. Assuming the pairs exist in our software as a collection of

**objects, we could generate the list of all ailments using code similar to this:**

*PL***FOREACH** PL Object

Get_SecondValue

Add_SecondValue to Ailment_List

**End FOREACH**

Of course, there is no reason to limit the relation to ordered pairs (referred as a binary relation); relations may have an arbitrary number of elements. Such relations are referred to as *n-ary* relations, and they are the starting point for relational databases.

We can expand our ** PL** relation to contain more elements by adding DOB, diagnosis date and doctor.

Expanded PL = (Name, DOB, Diagnosis, Diagnosis_Date, Doctor)

*Instances*

**(Doe, 12/15/1967, headache, 03/15/2013, Jones)
(Smith, 06/07/1949, chest pain, 05/11/2013, Brown)
(Green, 10/09/1988, asthma, 09/25/2013, Smythe)**

Using this relation, we can ask more complex questions such as: “How many doctors are there?”; “Who has asthma”; “When was Mr. Doe diagnosed with a headache?” Relational algebra, the mathematical basis of relational databases, provides the formal rules for manipulating n-ary relations (queries, joins, selections, projections, etc.).

**Equivalence relations
**Thus far, I have discussed relations between two or more sets. However, relations on a single set are interesting and useful as well. Using relations on a single set, we can explore general relation properties such as

*reflexivity, symmetry, and transivity*(1). Let’s start with a clinical example. Suppose we have a set of terms for medical ailments.

**A = {MI, myocardial infarction, hypertension, HBP, high blood pressure, …}**

We will apply the relation: **x R y** such that

**=**

*R***x is a synonym for y**. A set of ordered pairs that meet this requirement is listed below.

**x R y = x is a synonym for y = **{(MI, MI), (myocardial infarction, myocardial infarction), (MI, myocardial infarction) , (myocardial infarction, MI), (hypertension, HBP) , (HBP, high blood pressure), (HBP, hypertension), (hypertension, high blood pressure), (high blood pressure, hypertension), high blood pressure, HBP), (HBP, HBP), (hypertension, hypertension), (high blood pressure, high blood pressure) }

*Reflexivity*

Pairs such as **(MI, MI) **and** (high blood pressure, high blood pressure) **demonstrate reflexivity because each term is synonymous with itself. This is the same as saying **x** is a synonym for **x**.

*Symmetry*

Symmetry is illustrated by pairs such as **(MI, myocardial infarction), (myocardial infarction, MI**), which demonstrate that the relation is bi-directional. In other words, it shows that just as **x** is related to **y**,** y** is related to **x**. In plain English: MI is a synonym for myocardial infarction **(MI, myocardial infarction)** and myocardial infarction is a synonym for MI **(myocardial infarction, MI**).

*Transivity*

The pairs **(hypertension, HBP), (HBP, high blood pressure), (hypertension, high blood pressure) **demonstrate the link between terms. Namely, hypertension is a synonym for HBP; HBP is a synonym for high blood pressure; therefore, hypertension is a synonym for high blood pressure.

Any relation that exhibits reflexivity, symmetry, and transivity is referred to as an *equivalence relation* (1).

If a relation is an equivalence relation, it can be used to generate *equivalence classes*. An equivalence class is the set of all terms that are related to a specific term in the relation (1). An equivalence class is represented symbolically as **[a] **(1).** **Thus the equivalence classes for hypertension and myocardial infarction are represented as:

**[hypertension] = {HBP, high blood pressure, hypertension}**

**[myocardial infarction] = {MI, myocardial infarction}**

Equivalence classes can be useful in a number of situations. For example, using the above synonym relation, we could create a thesaurus that could automatically map synonyms to a preferred term. One possible use case would be mapping various terms in an EHR to the preferred SNOMED term for that ailment.

**Partial-order relations**

Mathematically speaking, partial order relations are relations on a single set that are reflexive, transitive, and *anti-symmetric.* Partial order relations are useful for organizing things into hierarchies. A good example of a partial order relation for real numbers is “**<=**” . Using this relation, numbers can be sorted in ascending order: **1<=2<=3<=4…**

Partial order relations are everywhere. Courses for a college major have prerequisites and must be taken in a specific order. If we say that for a biology degree, the courses that must be completed are the set **{101, 102, 115, 201, 202, 234, 267, 301, 305, 410, 421, 431}**, and lower-numbered courses must be completed before those with higher numbers, this represents a type of *less-than* relation.

This prerequisite relation will contain pairs such as **(101,101), (202,202)**, thereby demonstrating reflexivity. (Of course, we can ignore this property when advising students.)

Anti-symmetry enters the picture because pairs such as **(101, 201)** and **(201, 301)** are included, indicating that 101 must be taken before 201 and 201 must be taken before 301. However, it is impossible to take 201 before 101 or 301 prior to 201. Therefore, these relations are not symmetric; they go only one way—from lower to higher.

Finally, we can see that this relation is transitive, because if 101 is required before taking 201, and 201 is a prerequisite for 301, obviously, 101 is a prerequisite for 301. The hierarchal nature of partial order relations means that they can describe many types of workflows and process-based activities.

The value of uncovering mathematical constructs in everyday activities becomes apparent when creating software. Entire books are dedicated to algorithms and data structures, both of which are based on well-defined mathematical concepts. Therefore, if a real-world informatics problem can be mapped to a mathematical construct, then all of the math and computer science materials available for that construct can be used to solve the problem. Why reinvent the wheel?

The next post will discuss functions, one of my favorite topics.

1. Epp SS. *Discrete Mathematics with Application, **Fourth Edition.* Belmont, CA: Thomson-Brooks/Cole; 2010

{ 0 comments… add one now }