OJPHI: Vol. 5
Journal Information
Journal ID (publisher-id): OJPHI
ISSN: 1947-2579
Publisher: University of Illinois at Chicago Library
Article Information
©2013 the author(s)
open-access: This is an Open Access article. Authors own copyright of their articles appearing in the Online Journal of Public Health Informatics. Readers may copy articles without permission of the copyright owner(s), as long as the author and OJPHI are acknowledged in the copy and the copy is used for educational, not-for-profit purposes.
Electronic publication date: Day: 4 Month: 4 Year: 2013
collection publication date: Year: 2013
Volume: 5
E-location ID: e74
Publisher Id: ojphi-05-74

Computerized Text Analysis to Enhance Automated Pneumonia Detection
Sylvain DeLisle*1,2
Tariq Siddiqui1,2
Adi Gundlapalli3,4
Matthew Samore3,4
Leonard D’Avolio5,6
1VA Maryland Health Care System, Baltimore, MD, USA;
2Medicine, University of Maryland, Baltimore, MD, USA;
3VA Salt Lake City Health Care System, Salt Lake City, UT, USA;
4University of Utah, Salt Lake City, UT, USA;
5VA Boston Health Care System, Boston, MA, USA;
6Harvard Medical School, Boston, MA, USA
*Sylvain DeLisle, E-mail: sdelisle@umaryland.edu


To improve the surveillance for pneumonia using the free-text of electronic medical records (EMR).


Information about disease severity could help with both detection and situational awareness during outbreaks of acute respiratory infections (ARI). In this work, we use data from the EMR to identify patients with pneumonia, a key landmark of ARI severity. We asked if computerized analysis of the free-text of clinical notes or imaging reports could complement structured EMR data to uncover pneumonia cases.


A previously validated ARI case-detection algorithm (CDA) (sensitivity, 99%; PPV, 14%) [1] flagged VA Maryland Health Care System (VAMHCS) outpatient visits with associated chest imaging (n = 2737). Manually categorized imaging reports (Non-Negative if they could support the diagnosis of pneumonia, Negative otherwise; kappa = 0.88) served as a reference for the development of an automated report classifier through machine learning [2]. EMR entries related to visits with Non-Negative chest imaging were manually reviewed to identify cases with Possible Pneumonia (new symptom(s) of cough, sputum, fever/chills/night sweats, dyspnea, pleuritic chest pain) or with Pneumonia-in-Plan (pneumonia listed as one of the two most likely diagnoses in a physician’s note). These cases served as the reference for the development of the EMR-based CDAs. CDA components included ICD-9 codes for the full spectrum of ARI [1] or for the pneumonia subset, text analysis aimed at non-negated ARI symptoms in the clinical note [1], and the above-mentioned imaging report text classifier.
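The way such components might be assembled into a single CDA can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say how components were combined, so the logical AND used here is an assumption, and the `Visit` fields and the `cda_flags_visit` function are hypothetical names, not part of the authors' software.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    """One outpatient visit; all field names are hypothetical."""
    has_ari_icd9: bool          # any ICD-9 code from the broad ARI codeset
    has_pneumonia_icd9: bool    # any ICD-9 code from the pneumonia subset
    note_has_ari_symptom: bool  # non-negated ARI symptom found in the clinical note
    imaging_non_negative: bool  # imaging report classified as Non-Negative


def cda_flags_visit(visit, pneumonia_codes_only=False,
                    require_note_symptom=False, require_imaging_text=False):
    """Return True if the configured CDA flags the visit.

    Assumption: components are combined with logical AND.
    """
    coded = visit.has_pneumonia_icd9 if pneumonia_codes_only else visit.has_ari_icd9
    if not coded:
        return False
    if require_note_symptom and not visit.note_has_ari_symptom:
        return False
    if require_imaging_text and not visit.imaging_non_negative:
        return False
    return True


# Made-up visits, for illustration only.
visits = [
    Visit(True, False, True, False),  # ARI code and symptoms, Negative imaging
    Visit(True, True, True, True),    # coded pneumonia with Non-Negative imaging
    Visit(False, False, True, True),  # no qualifying ICD-9 code
]
broad = [cda_flags_visit(v) for v in visits]
with_imaging = [cda_flags_visit(v, require_imaging_text=True) for v in visits]
print(broad)         # [True, True, False]
print(with_imaging)  # [False, True, False]
```

Under this assumed AND logic, each added component can only remove flagged visits, which is why adding a text component would be expected to trade sensitivity for PPV.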


The manual review identified 370 reference cases with Possible Pneumonia and 250 with Pneumonia-in-Plan. Statistical performance for illustrative CDAs that combined structured EMR parameters with or without text analyses is shown in the Table. Addition of the “Text of Imaging Report” analyses increased PPV by 38–70% in absolute terms. Despite attendant losses in sensitivity, this classifier increased the F-measure of all CDAs based on a broad ARI ICD-9 codeset. With the possible exception of CDA 6, whose F-measure was the highest achieved in this study, the text analysis seeking ARI symptoms in the clinical note did not add further value to those CDAs that also included analyses of the chest imaging reports.
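For readers less familiar with the metrics, the relationship between sensitivity, PPV, and the F-measure can be sketched as follows. The abstract does not define its F-measure, so the balanced form (harmonic mean of PPV and sensitivity) is assumed here, and the function names are illustrative. The only figures taken from the abstract are the sensitivity (99%) and PPV (14%) of the previously validated ARI CDA, not those of the pneumonia CDAs.

```python
def sensitivity(tp, fn):
    """Fraction of reference cases that the CDA flags."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def ppv(tp, fp):
    """Fraction of flagged visits that are reference cases."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of PPV and sensitivity (assumed form)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# ARI CDA figures quoted in the abstract: PPV 14%, sensitivity 99%.
baseline_f = f_measure(0.14, 0.99)
print(round(baseline_f, 3))  # 0.245
```

Because the harmonic mean is dominated by the smaller of its two inputs, a large gain in a low PPV can raise the F-measure even when sensitivity falls.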


Automated text analysis of chest imaging reports can improve our ability to separate outpatients with pneumonia from those with a milder form of ARI.

[1]. DeLisle S, South B, Anthony JA, Kalp E, Gundlapalli A, et al. Combining Free Text and Structured Electronic Medical Record Entries to Detect Acute Respiratory Infections. PLoS ONE 2010;5(10):e13377.
[2]. D’Avolio L, Nguyen T, Goryachev S, Fiore L. Automated Concept-Level Information Extraction to Reduce the Need for Custom Software and Rules Development. Journal of the American Medical Informatics Association 2011;18(5):607.

Figure 1: 

Article Categories:
  • ISDS 2012 Conference Abstracts

Keywords: situational awareness, influenza, surveillance, electronic medical record, pneumonia.

Online Journal of Public Health Informatics * ISSN 1947-2579 * http://ojphi.org