Centers for Disease Control and Prevention

Tuesday, October 19, 2004 - 10:35 AM
1

Smart Search: A Probabilistic Matching Tool for Data Extraction and Exchange

Vikki Papadouka, New York City Department of Health, 2 Lafayette Street, 19th Floor, New York, USA; David Lyons, HLN Consulting, LLC, 105 Peabody Lane, Marlton, NJ, USA; Rezaul Kabir, New York City Department of Health and Mental Hygiene, 2 Lafayette Street, 19th Floor, New York, USA; Alexandra Ternier, Citywide Immunization Registry, New York City Dept of Health, 2 Lafayette Street, 19th Floor, New York, NY, USA; and Paul S. Schaeffer, NYC DOHMH, 2 Lafayette St., 19th Floor, New York, NY, USA.


BACKGROUND:
Every year, the New York Citywide Immunization Registry (CIR) receives requests for large volumes of children’s immunization records from Managed Care Organizations, the Bureau of School Health, and a local immunization registry for public health purposes. The CIR provides over 70,000 records in batch electronic format to these organizations. To fulfill these requests, the CIR has been using Smart Search, a program that matches the requested list of children against the database and extracts their immunizations into a file. Recently, Smart Search was enhanced to use the CIR’s internal probabilistic matching algorithm, which enables approximate matching on large files and increases the match rate.

OBJECTIVE:
To demonstrate how a probabilistic matching tool, which requires minimal or no human review to find records in batch, is more effective than an exact matching method.

METHOD:
The names of one hundred children were randomly selected from a file submitted by the Bureau of School Health. The list was passed through both the old and the enhanced versions of Smart Search. The old program matched the input file exactly on first name, last name, date of birth (DOB), and gender. In the new program, records whose match probability exceeded a selected threshold were considered a match. The match rates and accuracy of the two programs were compared.
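The difference between the two matching rules can be illustrated with a minimal sketch. It is not the CIR's algorithm, which the abstract does not describe: the field weights, the threshold value, the use of `difflib.SequenceMatcher` as the similarity measure, and the sample records are all assumptions made for illustration, and the score it computes is a weighted similarity rather than a calibrated probability.

```python
from difflib import SequenceMatcher

FIELDS = ("first", "last", "dob", "gender")

# Hypothetical weights (summing to 1) and threshold, chosen only for illustration.
WEIGHTS = {"first": 0.3, "last": 0.3, "dob": 0.3, "gender": 0.1}
THRESHOLD = 0.85


def exact_match(a, b):
    """Old-style rule: every field must be identical (ignoring case)."""
    return all(a[f].lower() == b[f].lower() for f in FIELDS)


def match_score(a, b):
    """Weighted average of per-field string similarity, between 0 and 1."""
    return sum(
        WEIGHTS[f] * SequenceMatcher(None, a[f].lower(), b[f].lower()).ratio()
        for f in FIELDS
    )


def approx_match(a, b, threshold=THRESHOLD):
    """Threshold rule: accept the pair when the score reaches the threshold."""
    return match_score(a, b) >= threshold


# Made-up records: the first two differ only in the spelling of the first name.
requested = {"first": "Katherine", "last": "Smith", "dob": "1999-03-14", "gender": "F"}
on_file = {"first": "Catherine", "last": "Smith", "dob": "1999-03-14", "gender": "F"}
other = {"first": "Robert", "last": "Jones", "dob": "2001-11-02", "gender": "M"}

print(exact_match(requested, on_file))   # False: one letter differs
print(approx_match(requested, on_file))  # True: score is above the threshold
print(approx_match(requested, other))    # False: a different child
```

In a sketch like this, the threshold controls the trade-off the study measured: lowering it recovers more records with small discrepancies but raises the risk of false matches.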

RESULT:
The enhanced Smart Search recovered 17% more records than the old version, with no false matches. Records with a slightly different first name, last name, DOB, or gender were found because the probabilistic matching algorithm correctly assigned them a probability above the match threshold.

CONCLUSION:
A probabilistic matching tool is more effective than exact matching at finding records in a children’s database.

LEARNING OBJECTIVES:
To understand the advantages of a probabilistic matching tool for finding records in batch.

[ Recorded presentation ]

See more of Methods of Providing School Access to Immunization Registries
See more of The 2004 Immunization Registry Conference