20851 Natural Language Processing for Lyme Disease Reporting

Wednesday, September 2, 2009: 2:10 PM
Courtland
Frances Morrison, MD, MPH, MA, Department of Biomedical Informatics, Columbia University, New York, NY
Albert M. Lai, PhD, Department of Biomedical Informatics, The Ohio State University, Columbus, OH
George Hripcsak, MD, MS, Department of Biomedical Informatics, Columbia University, New York, NY
Reporting of notifiable conditions to health departments has been inconsistent, which has led researchers to focus on automated electronic reporting systems. Much of this research centers on structured data, such as laboratory results, to improve reporting rates without increasing the burden on clinicians. However, local public health practitioners still spend a large amount of time on case follow-up to collect additional data elements, often risk factors and exposures that are not captured in structured fields. We are developing a system that could ease this burden by extracting the required data from free text using natural language processing (NLP). We created a lexicon to capture the terms required for a complete NYC DOHMH Lyme disease (LD) case report and added further data elements identified through literature review and inspection of clinical notes. We selected 46 local outpatient notes from July 2006 to June 2007 that mentioned LD, processed them with MedLEE, and queried the output for the relevant terms. The system recognized 73 distinct concepts in 31 notes, comprising 60 condition and 13 exposure instances, with a positive predictive value of 65%. We then applied the system to 2594 Institute for Family Health outpatient clinical notes on patients with positive LD tests from the same period, which identified 2598 LD concepts in 1206 notes. Error analysis uncovered a high false positive rate with two main causes: difficulty in handling symptoms that appear in the past history or family history, and the ordering of LD tests to “rule out” the disease when it was not seriously considered as a diagnosis. With ongoing improvements in both MedLEE’s handling of disease status and our query logic, we plan to improve the system's performance so that it becomes a feasible way to assist public health practitioners and clinicians by pre-filling disease reporting forms. We will report our efforts, demonstrate the resources potentially saved, and describe other possible uses for the system.
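The error analysis above can be made concrete with a small sketch. It is not the authors' query logic and does not use MedLEE's actual output format: the `Concept` record, its `status` values, the `correct` review label, and the sample rows are all hypothetical, chosen only to illustrate how positive predictive value (true positives divided by all concepts the system flagged) changes when past-history, family-history, and "rule out" mentions are filtered out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """One extracted concept. Every field here is a hypothetical stand-in."""
    note_id: str
    term: str
    kind: str      # "condition" or "exposure"
    status: str    # assumed modifier: "current", "past", "family", "rule_out"
    correct: bool  # assumed manual-review label: is this truly reportable?


# Statuses the abstract names as sources of false positives.
EXCLUDED_STATUSES = {"past", "family", "rule_out"}


def reportable(concepts):
    """Keep only concepts whose status suggests a current finding."""
    return [c for c in concepts if c.status not in EXCLUDED_STATUSES]


def ppv(concepts):
    """Positive predictive value: true positives / all flagged concepts."""
    if not concepts:
        raise ValueError("PPV is undefined for an empty result set")
    true_positives = sum(c.correct for c in concepts)
    return true_positives / len(concepts)


# Made-up sample data, not taken from the study.
sample = [
    Concept("n1", "erythema migrans", "condition", "current", True),
    Concept("n1", "tick bite", "exposure", "current", True),
    Concept("n2", "arthritis", "condition", "family", False),
    Concept("n3", "lyme disease", "condition", "rule_out", False),
    Concept("n4", "facial palsy", "condition", "past", False),
    Concept("n4", "fatigue", "condition", "current", False),
]

if __name__ == "__main__":
    print(f"PPV, all flagged concepts: {ppv(sample):.2f}")
    print(f"PPV, after status filter:  {ppv(reportable(sample)):.2f}")
```

In this toy data the filter removes three false positives and leaves one, which mirrors the abstract's point that status handling alone would not eliminate every error.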