The 36th National Immunization Conference of CDC

Tuesday, April 30, 2002 - 11:10 AM
481

Accurate, Customizable Matching: The Heart of the NYC Master Child Index

Martin Buechi (1), Paul Schaeffer (2), Deborah Walker (2), Alexandra Ternier (2), and Andrew Borthwick (1). (1) ChoiceMaker Technologies, 41 East 11th Street, 11th Floor, New York, NY, USA, (2) Citywide Immunization Registry, New York City Dept of Health, 2 Lafayette Street, 19th floor, New York, NY, USA


KEYWORDS:
Immunization Registry, Record matching, Customization, Architecture, Record matching language

BACKGROUND:
The New York City Department of Health is creating a Master Child Index (MCI) as a first step toward a comprehensive child health record that facilitates surveillance, information sharing with providers, and identification of children in need of immunizations and/or blood lead level screening tests. The MCI links the Citywide Immunization Registry (CIR) and the Lead Poisoning and Prevention database (LeadQuest) and will contain all children from these linked databases, primarily children ages 0-7 years. Approximate record matching finds a record in spite of spelling variations, errors, and changes (for example, of address). ChoiceMaker Technologies’ MEDD (Maximum-Entropy De-Duper) system, an earlier version of which was used successfully by the CIR, provides this capability in the MCI.

OBJECTIVE(S):
To discuss the requirements for accurate, customizable transactional approximate record matching between linked databases, and to describe the MEDD record matching process developed for the MCI project.

METHOD(S):
The Java-based modular software architecture of MEDD provides for easy integration into a wide variety of computing environments. Customization for specific database properties and quirks is easy with ChoiceMaker’s own Java-based ClueMaker language. Both generic errors, such as misspellings, and specific errors, e.g., a provider consistently reporting the wrong day in the date of birth, can easily be captured with ClueMaker. ClueMaker allows any Java library, including ChoiceMaker’s matching library, to be used in record comparison.
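To make the distinction between generic and specific errors concrete, here is a minimal sketch in plain Java. It is not ClueMaker syntax and does not use ChoiceMaker's matching library; the class ClueSketch, the Child fields, the provider IDs, and both clue methods are hypothetical names chosen for illustration. Each clue is modeled as a yes/no test on a pair of records: one tolerates a one-letter misspelling, the other fires when two dates of birth differ only in the day and one record comes from a provider assumed to be known for that error.

```java
import java.time.LocalDate;
import java.util.Collections;
import java.util.Set;

// Hypothetical illustration only: none of these names come from ClueMaker or MEDD.
final class ClueSketch {

    /** Minimal child record; the field set is an assumption for this sketch. */
    static final class Child {
        final String firstName;
        final String lastName;
        final LocalDate dob;
        final String providerId;

        Child(String firstName, String lastName, LocalDate dob, String providerId) {
            this.firstName = firstName;
            this.lastName = lastName;
            this.dob = dob;
            this.providerId = providerId;
        }
    }

    /** Levenshtein edit distance, computed with two rolling rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Generic clue: two names differ by at most one edit, ignoring case. */
    static boolean namesNearlyEqual(String a, String b) {
        return editDistance(a.toLowerCase(), b.toLowerCase()) <= 1;
    }

    /**
     * Specific clue: same year and month of birth but a different day, where
     * at least one record comes from a provider flagged for day-of-birth errors.
     */
    static boolean dobDayMismatchFromFlaggedProvider(Child x, Child y, Set<String> flaggedProviders) {
        boolean sameYearAndMonth = x.dob.getYear() == y.dob.getYear()
                && x.dob.getMonthValue() == y.dob.getMonthValue();
        boolean differentDay = x.dob.getDayOfMonth() != y.dob.getDayOfMonth();
        boolean flaggedSource = flaggedProviders.contains(x.providerId)
                || flaggedProviders.contains(y.providerId);
        return sameYearAndMonth && differentDay && flaggedSource;
    }

    public static void main(String[] args) {
        Child a = new Child("Jon", "Smith", LocalDate.of(1999, 3, 12), "P17");
        Child b = new Child("John", "Smith", LocalDate.of(1999, 3, 21), "P02");
        Set<String> flagged = Collections.singleton("P17"); // made-up provider ID
        System.out.println("first names nearly equal: " + namesNearlyEqual(a.firstName, b.firstName));
        System.out.println("day-of-birth clue fires: " + dobDayMismatchFromFlaggedProvider(a, b, flagged));
    }
}
```

In a maximum-entropy matcher, tests of this kind would typically serve as features whose weights are learned from labeled record pairs rather than as hard rules; how MEDD actually combines its clues is not described here.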

RESULT(S):
Accurate record matching increases the data quality of NYC DOH childhood data by minimizing duplication of records and aiding in linking records across multiple databases.

CONCLUSION(S):
Accurate record matching is the centerpiece of the NYC MCI database linkage project. The modular architecture and its own matching language allow MEDD to be efficiently customized.

LEARNING OBJECTIVES:
To understand that successful record matching requires a customizable architecture and matching language built around a sound core.


Web Page: www.choicemaker.com

See more of Improving Registry Data Quality