6th Annual Public Health Information Network Conference: Validation of Clinical Data Submitted to Biosense with Whole Record Surveillance Using Natural Language Processing

Validation of Clinical Data Submitted to Biosense with Whole Record Surveillance Using Natural Language Processing

Sunday, August 24, 2008
South/West Halls
Gail A. Welsh, MD, General Internal Medicine, Mayo Clinic, Rochester, MN
Peter L. Elkin, MD, Internal Medicine, Mayo Clinic College of Medicine, Rochester, MN
Brett Trusko, Informatics and Quality Research, Mayo Clinic, Rochester, MN
Katherine A Skeen-Morris, MPH, General Internal Medicine, Mayo Clinic, Rochester, MN
David A Froehling, MD, General Internal Medicine, Mayo Clinic, Rochester, MN
Dietlind Wahner-Roedler, MD, General Internal Medicine, Mayo Clinic, Rochester, MN

Gail A Welsh MD, Peter L Elkin MD, Brett E Trusko PhD, David A Froehling MD, Dietlind Wahner-Roedler MD

Aim

Our study is a collaborative effort to validate the accuracy of data submitted from patient clinical records at the Johns Hopkins University Biosense site to the CDC Biosense database. We will assess whether data is coded and transmitted correctly and whether it accurately reflects the information in the source clinical records. We will also compare how accurately ICD9 codes and SNOMED CT codes represent the data.

Background

Volunteer institutions submit clinical data, including chief complaint, from patient records to the CDC Biosense database for biosurveillance of disease outbreaks. It is not clear how accurately the submitted data represents the source patient record. The submission process may omit important data or report false positives.

Method

Two reviewers will examine 1000 randomly selected patient records from the Johns Hopkins University (JHU) Biosense site. The Mayo Clinic Vocabulary Server (MCVS), a natural language processor, will independently assign ICD9 codes, and the reviewers will compare these codes with the Biosense data for accuracy. The full text of each patient record from JHU Biosense will then be parsed and coded in SNOMED CT. A reviewer will compare the ICD9 diagnoses with the SNOMED CT mappings for accuracy using our web-review software and the MCVS browser. We will analyze the data and report statistics to confirm that the data accepted and stored by the system is the same as that transmitted by the source, and that the derived terms accurately reflect the concepts in the source record.
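The abstract does not specify which statistics will be reported. As one possible illustration only, the sketch below draws a random sample of record identifiers and tallies, per record, how the submitted codes agree with a reference code set (for example, reviewer-confirmed MCVS codes). The function names, the choice of precision and recall, and the example codes are assumptions made for illustration and are not taken from the study.

```python
import random


def sample_record_ids(record_ids, k=1000, seed=0):
    """Draw k record IDs uniformly at random, without replacement.

    The fixed seed is an assumption that makes the sample reproducible.
    """
    rng = random.Random(seed)
    return rng.sample(list(record_ids), k)


def code_agreement(submitted, reference):
    """Compare the code sets for a single record.

    Returns (true_positives, false_positives, false_negatives), where the
    reference set is treated as correct.
    """
    submitted, reference = set(submitted), set(reference)
    tp = len(submitted & reference)   # codes present in both sets
    fp = len(submitted - reference)   # submitted but not supported
    fn = len(reference - submitted)   # supported but not submitted
    return tp, fp, fn


def summarize(records):
    """Aggregate agreement over (submitted_codes, reference_codes) pairs."""
    tp = fp = fn = 0
    for submitted, reference in records:
        a, b, c = code_agreement(submitted, reference)
        tp, fp, fn = tp + a, fp + b, fn + c
    precision = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall}


if __name__ == "__main__":
    # Made-up example records, not study data.
    example = [
        ({"486", "780.6"}, {"486"}),
        ({"786.2"}, {"786.2", "780.6"}),
    ]
    print(summarize(example))
```

In this framing, a false positive is a submitted code that the reference does not support, and a false negative is a supported code that was not submitted, which corresponds to the two failure modes named in the Background. A chance-corrected agreement measure could be substituted if two reviewers code the same records.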

Results

Our project is in progress and has been approved by the IRB at both institutions. We plan to have interim data before the PHIN conference.

Conclusion

We are actively collaborating on the evaluation of Biosense data.

Session: Poster Session