6th Annual Public Health Information Network Conference: Building a knowledge base for human genome epidemiology: information retrieval, text mining, knowledge synthesis and data visualization

Tuesday, August 26, 2008: 3:30 PM
International C
Wei Yu, PhD, MS, National Office of Public Health Genomics, CDC, Atlanta, GA
Ajay Yesupriya, MPH, National Office of Public Health Genomics, CDC, Atlanta, GA
Marta Gwinn, MD, National Office of Public Health Genomics, CDC, Atlanta, GA
Muin Khoury, PhD, MD, National Office of Public Health Genomics, CDC, Atlanta, GA
Advances in human genomics are stimulating translation research that seeks to apply gene discoveries to public health and clinical practice (1). Informatics is key to managing massive amounts of genomic data and exploring them for information and knowledge. Human genome epidemiology (HuGE) is an evolving multidisciplinary field that systematically applies epidemiologic methods to population-based studies of genetic variation in relation to health and disease. To serve the human genome epidemiology research community, we developed a web-based knowledge base system, HuGE Navigator, built on the principles of open source, standardization, interoperability, and extensibility (2). The system includes multiple applications that serve different purposes, such as searching the published literature, finding investigators or collaborators, looking for candidate genes, and tracking the evolution of the field. Standard vocabularies (e.g., UMLS/MeSH) are used for indexing, so users can navigate and search the database seamlessly across applications. We developed robust text mining software to facilitate continuous updating of the database from PubMed (3). We also developed a data mining algorithm to perform information extraction automatically (4). The basic infrastructure of this system is an open source project that can be freely downloaded (5). The system can be accessed at http://www.hugenavigator.net.
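As an illustration of the retrieval step behind continuous updating from PubMed, the following is a minimal sketch and not the authors' software, which is described in reference 3. It assumes the public NCBI E-utilities ESearch endpoint and its JSON output; the function names and the example query are hypothetical and chosen only for illustration.

```python
import json
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint (assumed here; the abstract
# does not say which interface the HuGE Navigator software uses).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, days_back=7, max_results=100):
    """Build an ESearch URL for PubMed records added in the last `days_back` days."""
    params = {
        "db": "pubmed",
        "term": term,
        "reldate": days_back,   # restrict to the most recent N days
        "datetype": "edat",     # filter on the date the record entered PubMed
        "retmax": max_results,
        "retmode": "json",
    }
    return ESEARCH_URL + "?" + urlencode(params)


def extract_pmids(esearch_json):
    """Return the list of PubMed IDs from an ESearch JSON response body."""
    payload = json.loads(esearch_json)
    return payload.get("esearchresult", {}).get("idlist", [])


# Hypothetical query; a real screening query would be far more selective.
url = build_esearch_url("polymorphism AND asthma", days_back=7, max_results=5)
```

Fetching `url` (for example with `urllib.request.urlopen`) and passing the response body to `extract_pmids` would yield candidate article IDs, which a screening or classification step could then filter before they are added to the database.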
  1. Khoury MJ, et al. Genet Med. 2007; 9(10): 665-674.
  2. Yu W, et al. Nat Genet. 2008; 40: 124-125.
  3. Yu W, et al. BMC Bioinformatics. 2008; 9: 205.
  4. Yu W, et al. BMC Med Inform Decis Mak. 2007; 7: 17.
  5. Yu W, et al. BMC Bioinformatics. 2007; 8: 436.