20854 Construction and Validation of Completely Synthetic Background Electronic Medical Records

Sunday, August 30, 2009
Grand Hall/Exhibit Hall
Linda Moniz, PhD, NSTD-STH, Johns Hopkins University Applied Physics Laboratory, Laurel, MD
Anna L. Buczak, PhD, Johns Hopkins University Applied Physics Laboratory, Laurel, MD
Using timelines, frequencies, data mining attributes, and summary measures from authentic electronic medical records (EMRs), we construct a set of fully synthetic EMRs for a subset of background patients from the 4-11 age group. These synthetic records replicate both seasonal and non-seasonal timelines in the occurrences of illnesses and injuries, as well as the demographics of the patient pool. Each synthetic patient's EMR replicates summary care patterns for particular illnesses and injuries in the original EMRs. These background EMRs can be used as a canvas for the injection of artificial outbreaks, or for testing new algorithms that use the entire EMR (e.g., for disease surveillance or for the development and testing of diagnostic tools and decision aids). The synthetic EMRs as a whole replicate background disease levels, patient distributions, timelines, and seasonal and day-of-week effects as they occur in the authentic data, yet each patient is synthetic and no single record can be matched to an individual patient from the authentic data set. The synthetic records include a visit summary, a detailed clinical activity record, laboratory and radiology orders, and laboratory and radiology results. We present a sample of the electronic records and describe the procedure for downloading the records from the proposed Health Informatics Grid. We outline the synthesis procedure and the validation steps that guarantee the statistical and dynamical properties of the synthetic records.
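The abstract does not spell out the synthesis procedure. As a purely illustrative sketch, and not the authors' method, one simple way to make synthetic visit dates follow the seasonal and day-of-week pattern of authentic data is to sample calendar days in proportion to the authentic visit counts per (month, weekday) cell. The function names `fit_weights` and `synthesize_dates` and the toy input below are assumptions made for this example only.

```python
import random
from collections import Counter
from datetime import date, timedelta


def fit_weights(visit_dates):
    """Count authentic visits in each (month, weekday) cell."""
    return Counter((d.month, d.weekday()) for d in visit_dates)


def synthesize_dates(weights, year, n_visits, rng):
    """Draw n_visits synthetic dates in `year` so that the expected share of
    visits in each (month, weekday) cell matches the authentic share."""
    # Enumerate every calendar day of the target year.
    days = []
    d = date(year, 1, 1)
    while d.year == year:
        days.append(d)
        d += timedelta(days=1)
    # A cell may contain 4 or 5 days; divide the cell's count evenly among them.
    days_per_cell = Counter((d.month, d.weekday()) for d in days)
    day_weights = [
        weights.get((d.month, d.weekday()), 0) / days_per_cell[(d.month, d.weekday())]
        for d in days
    ]
    return rng.choices(days, weights=day_weights, k=n_visits)


if __name__ == "__main__":
    # Toy "authentic" data (invented for illustration): three visits on one
    # January day and one visit on one July day.
    authentic = [date(2008, 1, 7)] * 3 + [date(2008, 7, 4)]
    synthetic = synthesize_dates(fit_weights(authentic), 2009, 1000, random.Random(0))
    january_share = sum(d.month == 1 for d in synthetic) / len(synthetic)
    print(f"January share of synthetic visits: {january_share:.2f}")
```

A basic validation step in the same spirit would compare the month and weekday shares of the synthetic dates against those of the authentic dates. A full EMR synthesizer would also need to model diagnoses, care patterns, orders, and results, which this sketch does not attempt.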