STD Infection Link Data Mining and Visualization

Tuesday, March 11, 2008
Continental Ballroom
Joy Alamgir, BS, Research and Development, Consilience Software, Austin, TX

Background:
STD infection investigation and containment require data mining and visualization to find high-risk contacts and sources.

Objective:
Describe computational methods for mining infection data and for determining links and patterns of infection using various “house-holding” and speciation detection algorithms.

Method:
De-identified test data (2,000 infections) were generated pseudo-stochastically. House-holding and speciation algorithms were applied to visualize and predict infection networks. Geo-coding was also applied to correlate exposure sites and so augment pattern detection.
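
One way to picture the house-holding step is as a grouping of case records that share a populated trait value. The following is a minimal sketch under stated assumptions, not the algorithm used in this study: the record fields (address, exposure_site, species), the function name link_cases, the sample data, and the union-find approach are all hypothetical illustrations.

```python
from collections import defaultdict

# Hypothetical trait fields; the abstract names addresses, exposure sites,
# and speciation as linking traits but does not give a schema.
TRAITS = ("address", "exposure_site", "species")


def link_cases(cases):
    """Group case records into networks when they share any populated trait.

    `cases` maps a case id to a dict of trait values. Empty or missing
    values are skipped, so unpopulated records never link to anything.
    Returns a list of sets of case ids, one set per network.
    """
    parent = {cid: cid for cid in cases}

    def find(x):
        # Follow parent pointers to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Index cases by (trait, normalized value).
    by_value = defaultdict(list)
    for cid, record in cases.items():
        for trait in TRAITS:
            value = (record.get(trait) or "").strip().lower()
            if value:
                by_value[(trait, value)].append(cid)

    # All cases sharing a trait value are merged into one network.
    for members in by_value.values():
        for other in members[1:]:
            union(members[0], other)

    networks = defaultdict(set)
    for cid in cases:
        networks[find(cid)].add(cid)
    return list(networks.values())


# Made-up sample records for illustration only.
cases = {
    "c1": {"address": "12 Elm St", "species": "A"},
    "c2": {"address": "12 elm st ", "exposure_site": "Club X"},
    "c3": {"exposure_site": "club x"},
    "c4": {"address": "", "exposure_site": None, "species": ""},
}
networks = link_cases(cases)
```

In this made-up sample, c1 and c2 link through a shared address, c2 and c3 link through a shared exposure site, and c4 stays unlinked because none of its traits are populated. A production system would likely weight or combine traits rather than link on any single match.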

Result:
The algorithms correctly created networks that could be identified by shared traits (addresses, exposure sites, and speciation). They failed to create networks when these traits were not properly populated.

Conclusion:
Computation-assisted data mining and visualization can increase the efficiency of STD epidemiologists by making such patterns visible immediately.

Implications:
Productivity-improving tools for data mining and visualization can make epidemiologists more efficient and effective across program areas, including STD. Further research is warranted on expanding the “similar traits” mentioned above to include additional demographic and clinical information (race, clinical findings, and symptoms).
See more of: Poster Session 1