We examined the hypothesis of an association between allergic conditions and uveitis in JIA patients by querying a clinical data warehouse. To validate our methods for this use case, we first reproduced known positive and negative associations with uveitis in JIA. Arthritis onset has been demonstrated to be earlier in patients who develop uveitis, as the current study also shows. The established associations between uveitis and positive ANA status [22–25], oligoarticular onset pattern of JIA [3, 22–25, 27], and psoriasis (in the patient or a first-degree family member) are supported by our data. An inverse relationship between RF positivity and uveitis has been established [22, 23, 28] and is also confirmed in our study. The ability to identify negative associations as well provides an internal validation of the study approach.
After confirming these associations, we present evidence supporting a clinically formed hypothesis of an association between allergic conditions and uveitis. This study highlights the potential of text analytics on unstructured data in clinical data warehouses to examine hypotheses formed during clinical care using practice-based evidence. Without a computational approach of this kind, such hypotheses might be impractical to test with traditional chart review studies.
We argue that data-mining of electronic medical records, which researchers currently use to inform therapy decisions and enable phase IV surveillance [9, 18, 19], should be extended to learn associations and predictors of hard-to-detect, yet severe, disease complications. Such an approach also allows a spectrum of variables to be assessed, and it can be used to investigate understudied subgroups such as children, the elderly, underrepresented ethnic groups, and pregnant women.
Despite the efficacy of such text-based analyses demonstrated in pharmacovigilance, off-label drug use, and the study of chronic conditions [7, 9, 18–21], this study has several potential limitations that warrant discussion. Although text analysis techniques achieve 97% accuracy in detecting negated terms, 93% accuracy in detecting drug mentions, and 86% accuracy in recognizing disease conditions in validation studies [7, 9, 19, 20], events occurring outside of the hospital are not captured in the record and can lead to false negatives. Additionally, allergic conditions and allergy medications may be reported more often in the chronic uveitis subgroup because of a higher level of concern given the eye disease complication. For these reasons, we feel that a prospective study must validate our findings before allergic conditions can be used as a clinically useful predictor.
To account for under-reporting bias arising from fewer visits in patients with less co-morbidity (i.e., the non-uveitis cohort), cases and controls were matched on the number of clinical dictations. The unbalanced JIA cohort without uveitis had approximately half as many notes as the cohort with uveitis, and an initial analysis that included patients with fewer notes revealed stronger positive associations in the uveitis cohort. We interpreted these associations as falsely strong, since we could not determine whether patients with less clinical record content were truly negative for a given factor or only appeared negative because of less thorough documentation. However, it is possible that patients with fewer notes have less severe disease, so that matching biases the study population toward patients with more severe JIA. This tradeoff, made to ensure similar medical record information content, must be recognized.
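The matching step described above can be illustrated with a minimal sketch. The study does not specify its matching algorithm, so the greedy nearest-neighbor pairing below is an assumption, as are the `Patient` record, the function name `match_on_note_count`, the 20% tolerance, and the toy cohort; none of these come from the study itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    # Hypothetical record; field names are illustrative only.
    patient_id: str
    has_uveitis: bool
    note_count: int


def match_on_note_count(patients, tolerance=0.2):
    """Greedy 1:1 matching of cases (uveitis) to controls (no uveitis).

    Each case is paired with the unused control whose note count is
    closest, and the pair is kept only if the difference is within
    `tolerance` (a fraction of the case's note count). The tolerance
    value is an assumption, not a parameter reported by the study.
    """
    cases = sorted((p for p in patients if p.has_uveitis),
                   key=lambda p: p.note_count)
    controls = [p for p in patients if not p.has_uveitis]
    pairs = []
    for case in cases:
        if not controls:
            break
        best = min(controls, key=lambda c: abs(c.note_count - case.note_count))
        if abs(best.note_count - case.note_count) <= tolerance * case.note_count:
            pairs.append((case, best))
            controls.remove(best)  # each control is used at most once
    return pairs


# Toy cohort (fabricated purely for illustration).
cohort = [
    Patient("A", True, 40),
    Patient("B", True, 100),
    Patient("C", False, 38),
    Patient("D", False, 10),
    Patient("E", False, 95),
]
pairs = match_on_note_count(cohort)
matched_ids = [(case.patient_id, control.patient_id) for case, control in pairs]
```

In this toy cohort the control with only 10 notes is left unmatched, which mirrors the tradeoff noted above: sparsely documented patients drop out of the comparison, at the possible cost of excluding milder disease.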
Finally, this study does not address causation. Indeed, the association between allergy and uveitis is unanticipated, since autoimmune disorders are thought to have a Th1/Th17 bias while allergic disorders tend to be associated with a Th2 cytokine profile. However, this immune classification may be an oversimplification in patients with complex and overlapping diseases, such as those with both uveitis and JIA. A recent investigation using cluster analysis methods suggests that Th2 cytokines in the anterior chamber of the eye distinguish patients with idiopathic uveitis from controls. Furthermore, we argue that allergies may be a surrogate risk factor, possibly reflecting a heightened immune response to antigens in certain tissues or a predisposition to both symptomatic and asymptomatic sinusitis, in which associated bacterial antigens may drive an immune reaction.
If this type of research is performed widely and reliably, it will become a key aspect of meaningfully using electronic medical records and summarizing practice-based evidence, and it can help prioritize prospective clinical trials [6, 11, 30].