Genomics Workshop: Identification of Functional Elements in Mammalian Genomes
Cold Spring Harbor Laboratory, New York, U.S.
November 12, 2004
Poster presentation

Identification and characterization of novel genes and transcriptional units on human chromosome 7 using custom oligonucleotide microarrays
Kohji Okamura1, Joseph Cheung1, Terence Tang1, Junjun Zhang1, Jeffery R. MacDonald1, Naveed Mohammad2, Quaid D. Morris3, Brendan J. Frey3, Timothy R. Hughes2, and Stephen W. Scherer1
1The Centre for Applied Genomics, The Hospital for Sick Children, and Department of Molecular and Medical Genetics, 2Banting and Best Department of Medical Research, 3Department of Electrical and Computer Engineering, University of Toronto, Canada
To discover new genes on human chromosome 7, we have initiated chromosome-wide transcript profiling experiments. We analyzed the sequence of chromosome 7 and extracted 55,705 putative exons predicted by five gene prediction programs, as well as 15,045 exons annotated in our previous work. We then eliminated redundant candidates and those that seemed unsuitable for hybridization experiments, such as those with high G+C content or repetitive sequences. We designed a custom microarray on the Agilent platform. Because space remained on the array, we added 818 mouse syntenic anchors on the human chromosome. In the end, we selected 21,605 targets. We then selected 20 human tissues for testing, including brain subsections and frequently examined organs such as heart, liver, kidney, and spleen. Poly(A)-tailed RNAs were converted to cDNAs in the presence of aminoallyl-dUTP so that they could be labeled with fluorescent dye. The labeled cDNAs were hybridized to the probes, and the arrays were scanned. The quantified data were statistically analyzed and functionally classified using a support vector machine, a machine learning approach. We will report not only the transcript profiles, some of which suggest the existence of tissue-specific splicing isoforms, but also a characterization of all detected genes based on quantitative transcriptional co-expression. The results of this study provide a model for transcriptional analysis of the remainder of the genome.
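The candidate-filtering step described above (removing redundant exons and those with high G+C content or repetitive sequence) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names, the 0.65 G+C cutoff, and the 0.2 repeat cutoff are hypothetical. The sketch also assumes that repetitive bases are soft-masked in lowercase and that redundancy means an exact sequence match.

```python
from typing import List, Tuple

Candidate = Tuple[str, str]  # (hypothetical exon name, sequence)


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence, ignoring case."""
    if not seq:
        return 0.0
    upper = seq.upper()
    return (upper.count("G") + upper.count("C")) / len(upper)


def repeat_fraction(seq: str) -> float:
    """Fraction of lowercase bases (assumed to be soft-masked repeats)."""
    if not seq:
        return 0.0
    return sum(1 for base in seq if base.islower()) / len(seq)


def filter_candidates(
    candidates: List[Candidate],
    max_gc: float = 0.65,      # assumed cutoff, not from the abstract
    max_repeat: float = 0.2,   # assumed cutoff, not from the abstract
) -> List[Candidate]:
    """Drop exact duplicates, G+C-rich sequences, and repeat-rich sequences."""
    seen = set()
    kept = []
    for name, seq in candidates:
        key = seq.upper()
        if key in seen:
            continue  # redundant: same sequence already kept
        if gc_fraction(seq) > max_gc:
            continue  # G+C content too high for hybridization
        if repeat_fraction(seq) > max_repeat:
            continue  # too much soft-masked repetitive sequence
        seen.add(key)
        kept.append((name, seq))
    return kept
```

Calling `filter_candidates` on a list of `(name, sequence)` pairs returns the surviving candidates in their original order. In practice the cutoffs would be tuned to the array platform and probe length.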