21123 Optimizing SaTScan for Spatio-Temporal Analysis of National Biosurveillance Data

Monday, August 31, 2009: 11:10 AM
Courtland
Jerome I. Tokars, MD, MPH, National Center for Public Health Informatics, Centers for Disease Control and Prevention, Atlanta, GA
Jian Xing, PhD, MS, National Center for Public Health Informatics, Centers for Disease Control and Prevention, Atlanta, GA
Howard S. Burkom, PhD , Johns Hopkins University Applied Physics Laboratory, Laurel, MD
Background: Detecting multi-state and trans-border disease clusters is an important function of BioSense, a national automated surveillance system. SaTScan, software widely used to identify statistically significant clusters, offers various estimation and randomization options, and judicious choice among them is critical to finding appropriate clusters. We present preliminary results on the performance of 5 SaTScan option combinations.
Methods: The study data were records of hospital emergency department (ED) chief complaints for 4 sub-syndromes (asthma, cough, fever, and nausea/vomiting) from 750 facilities during January through December 2008. Treatment facility zip code centroids were used as point locations for clustering. First, the number of one-day clusters was determined for the original data; SaTScan was run separately for each sub-syndrome and calendar month. Next, 244 one-day synthetic signals, consisting of additional counts centered at randomly chosen facilities, were added and cluster determination was repeated. The 5 methods were: the space-time permutation model (STP); the Poisson model with expected values based on the facility-specific average sub-syndrome count (Poi-C); the Poisson model with expected values based on the facility-specific sub-syndrome rate per total ED visits (Poi-R); Poi-C with non-parametric temporal adjustment (Poi-CT); and Poi-R with non-parametric temporal adjustment (Poi-RT).
Results: Baseline mean counts per facility per day (without injection) were: asthma, 1.2; cough, 3.7; fever, 5.5; and nausea/vomiting, 6.2. Total baseline clusters (p<0.001) were: STP, 121; Poi-C, 250; Poi-R, 88; Poi-CT, 133; Poi-RT, 19. The number of synthetic signals detected (p<0.001), out of 244 injected, was: STP, 132 (sensitivity 54.1%); Poi-C, 147 (60.2%); Poi-R, 124 (50.8%); Poi-CT, 144 (59.0%); and Poi-RT, 121 (49.6%). For Poi-RT with p<0.01, there were 37 baseline clusters and 152 detected signals (sensitivity 62.3%).
Conclusions: At p<0.001, the 5 methods varied more in the number of baseline clusters than in sensitivity. Detailed comparisons of the clusters found by the various methods will be presented. Non-parametric temporal adjustment may be an attractive option to control the number of baseline clusters while preserving sensitivity.
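As a rough check on the arithmetic behind these comparisons, the sketch below recomputes sensitivity as detected signals divided by the 244 injected signals, and compares how widely the methods spread in baseline clusters versus detections. The counts are those reported at p<0.001; the dictionary layout and the function names (`sensitivity`, `spread`) are illustrative assumptions and are not part of SaTScan or BioSense.

```python
# Counts reported in the Results at p<0.001.
INJECTED = 244  # number of one-day synthetic signals added

results = {
    "STP":    {"baseline": 121, "detected": 132},
    "Poi-C":  {"baseline": 250, "detected": 147},
    "Poi-R":  {"baseline": 88,  "detected": 124},
    "Poi-CT": {"baseline": 133, "detected": 144},
    "Poi-RT": {"baseline": 19,  "detected": 121},
}


def sensitivity(detected, injected=INJECTED):
    """Fraction of injected synthetic signals that were detected."""
    return detected / injected


def spread(values):
    """Ratio of the largest to the smallest value (a crude measure of variation)."""
    values = list(values)
    return max(values) / min(values)


if __name__ == "__main__":
    for method, r in results.items():
        pct = 100 * sensitivity(r["detected"])
        print(f"{method:7s} baseline={r['baseline']:4d}  sensitivity={pct:.1f}%")

    baseline_spread = spread(r["baseline"] for r in results.values())
    detected_spread = spread(r["detected"] for r in results.values())
    print(f"baseline clusters, max/min: {baseline_spread:.1f}x")
    print(f"detected signals,  max/min: {detected_spread:.1f}x")
```

The max/min ratio is only one simple way to summarize variation across methods; it is used here because it needs nothing beyond the reported counts.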