Yale Genetics

Cheung, Kei-Hoi


Assistant Professor, Yale Center for Medical Informatics

* B.S. Southern Connecticut State University, 1989
* Ph.D. University of Connecticut, 1998

Research Interests:

Genetic database and tool interoperation

As a growing number of databases and tools become accessible to the life sciences research community over the Internet, we have realized the importance of database and tool interoperation in facilitating genome-wide research. Beyond the sheer quantity of data, the growing number, heterogeneity, and variety of the available databases and tools pose a major interoperability challenge. We tackle this challenge by exploring efficient and innovative approaches to data and tool interoperation that use XML, the semantic web, metadata, distributed computing, and high-performance computing.

We collaborate with many faculty members in different departments and core facilities, including Genetics, Biology, Computer Science, Biostatistics, the Yale Keck Microarray Facility, and the Yale Keck Protein Profiling Facility. Our research is carried out in the context of integrating and analyzing: a) microarray data, b) proteomics data, including mass spectrometry (MS) data, c) single nucleotide polymorphism (SNP) data, and d) yeast genome data. Related projects include: a) Yale Microarray Database (YMD), b) Yale Protein Expression Database (YPED), c) Allele Frequency Database (ALFRED), and d) YeastHub.

Ongoing Projects:

Yale Microarray Database: an institution-wide database for use by microarray researchers at Yale and elsewhere
Yeast transposon-insertion genome project
PhenoDB: a database program that manages and analyzes genotype/phenotype data in support of population and pedigree genetics studies
ALFRED: ALlele FREquency Database

YCMI lectures (2002):

Introduction to the issues of multilevel approaches to interoperation (2/28) [paper 1, paper 2] [ppt slides]
Data model interoperation (3/7) [paper will be distributed separately] [ppt slides]
Emerging genome standards: GO, MIAME, etc. (3/14) [paper 1, paper 2]
     Gene Ontology; MIAME; MGED; MAGE-ML; YMD paper (submitted to AMIA 2002)
Flexible linking of distributed objects: SOAP, Turbogenomics, etc. (3/28) [paper 1, paper 2]
     Apache SOAP

Biomedical Informatics Course, Spring 2003 (Bioinformatics Database/Tool Interoperability):

Interoperation of heterogeneous bioinformatics databases [Overview paper 1]
Interoperation of heterogeneous bioinformatics software tools [Overview paper 1]
XML and its related technologies [Overview paper 1]
Applying XML-based standards to interoperation of bioinformatics databases/tools [Overview paper 1, paper 2]

Representative Publications

Pedrioli PG, Eng JK, Hubley R, Vogelzang M, Deutsch EW, Raught B, Pratt B, Nilsson E, Angeletti RH, Apweiler R, Cheung K, Costello CE, Hermjakob H, Huang S, Julian RK, Kapp E, McComb ME, Oliver SG, Omenn G, Paton NW, Simpson R, Smith R, Taylor CF, Zhu W, Aebersold R. (2004). A common open representation of mass spectrometry data and its application to proteomics research. Nature Biotechnology 22(11):1459-66.

Cheung KH, deKnikker R, Guo Y, Zhong G, Hager J, Yip KY, Kwan AKH, Li P, Cheung DW. (2004). Biosphere: the interoperation of web services in microarray cluster analysis. Applied Bioinformatics 3(4):253-6.

Cheung KH, et al. (2005). YeastHub: a semantic web use case for integrating data in the life sciences domain. Bioinformatics 21(Suppl. 1):i85-i96.