James Russo, Ph.D.
High-throughput DNA sequencing has become an essential component of gene discovery, and the gold standard for identifying mutations that are responsible for the development of diseases. In our laboratory, we use DNA sequencing for both of these goals. Projects in the laboratory fall within four major areas: (1) whole genome sequencing; (2) disease gene discovery utilizing the positional candidate strategy; (3) identification of polymorphisms and mutations that directly cause diseases or increase the likelihood of developing disease phenotypes; and (4) designing biological applications of new sequencing and mutation analysis approaches developed by our colleagues in Dr. Jingyue Ju’s laboratory. Examples from our work that typify each of these areas are presented below. Investigators in our laboratory work closely with collaborators in a large number of basic science and clinical laboratories to translate these ideas into practical applications. Moreover, many of these projects require cooperation among several sections of the Genome Center, including physical mapping, molecular genetics, functional genomics and proteomics, and bioinformatics.
Whole genome sequencing. Propelled by the Human Genome Initiative, sequencing of whole genomes has become commonplace. Several years ago, in collaboration with Drs. Roy Bohenzky, Yuan Chang and Patrick S. Moore in the Department of Pathology, our laboratory obtained the 140 kb sequence of human herpesvirus 8, the causative agent of Kaposi sarcoma, the most common malignancy in AIDS patients. Currently, we are sequencing the nearly 4-million-base genome of Legionella pneumophila. This bacterium is responsible for Legionnaires’ disease and other pneumonia-like illnesses. It is found in most standing water supplies (water towers, plumbing systems, whirlpool baths, swimming pools), where it is part of biofilms and can survive within protozoa. When aerosols containing these bacteria are inhaled, the bacteria can enter lung macrophages. Rather than destroying these invaders, the macrophages serve as a reservoir for their replication. While many of the genes responsible for their pathogenicity are known from genetic analyses, having the complete genome sequence should aid greatly in identifying candidate genes to target with antibiotics and vaccines. Our approach to obtaining the complete sequence combines whole-genome and BAC-based shotgun sequencing. The details and the current status of the Legionella genome project, complete with graphics on all sequence contigs available to date and identification of homologues to known genes in other organisms, are available at http://genome3.cpmc.columbia.edu/~legion/. This project is a collaboration with Dr. Howard Shuman’s laboratory in the Department of Microbiology, and members of the physical mapping and informatics groups of the Human Genome Center.
We and our collaborators are involved in comparative genomics (looking for differences in potential virulence genes between pathogenic and non-pathogenic strains and species of Legionella by PCR/sequencing and computational approaches) and functional genomics (knocking out genes of interest in Legionella to determine their role in the organism’s life cycle; developing Legionella gene microarrays for tracking changes in expression patterns during infection and following treatment with drugs). In the future, we expect to participate in projects to sequence larger invertebrate genomes. For instance, we have just embarked upon a project to obtain a partial genome sequence of the Mediterranean fruit fly, a major agricultural pest, with the aim of identifying some of its odorant and taste receptors.
Disease gene discovery. Sequencing support provided by our laboratory has been a major part of several gene discovery projects initiated by the Genome Center in collaboration with Dr. Riccardo Dalla-Favera of the Department of Pathology. Dr. Dalla-Favera’s group is interested in cancers that involve lymphocytes at various stages of development. These include lymphomas, leukemias, and multiple myelomas. Our approach has been to use the positional cloning strategy. Once a locus has been delineated by deletion mapping or loss of heterozygosity analysis (in the case of putative tumor suppressor genes) or by the chance occurrence of a translocation breakpoint, and a physical map has been developed for the region, sequencing plays many important roles. Low-pass sequencing (2- to 3-fold coverage) of the area is accomplished following shotgun cloning of a tiling path of large clones (PACs, BACs or cosmids) within the region into plasmid sequencing vectors. At this level of coverage, there is a high likelihood of obtaining substantial sequence from most exons and from all genes. Identification of genes is then possible by using various prediction algorithms, as well as from searches of the public EST databases. Other approaches for determining candidate genes include cDNA selection or exon trapping with the regional clones as templates. The products of these reactions need to be sequenced, not only to confirm that they represent true coding sequences, but also to try to ascertain their function. Sequencing of 3′ and 5′ RACE products is used to obtain full-length genes and to discover differentially spliced or polyadenylated gene variants. Candidate genes are prioritized based on their expression in lymphocytes and on available information about known homologues, after which mutation analysis is carried out by direct sequencing of cDNAs or exons in normal and tumor tissue.
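The statement that 2- to 3-fold coverage already captures most of a region can be illustrated with the standard Poisson (Lander-Waterman) approximation. The sketch below is not taken from the laboratory's own pipeline; it assumes reads fall uniformly at random across the region and ignores cloning bias and edge effects, and the function name is chosen only for illustration.

```python
import math


def expected_fraction_covered(coverage: float) -> float:
    """Expected fraction of bases hit by at least one read.

    Assumes (hypothetically) that reads land uniformly at random, so the
    number of reads over any base is Poisson-distributed with mean
    equal to the fold coverage.
    """
    if coverage < 0:
        raise ValueError("coverage must be non-negative")
    # P(base is missed) = exp(-coverage), so P(base is seen) = 1 - exp(-coverage)
    return 1.0 - math.exp(-coverage)


if __name__ == "__main__":
    for fold in (2.0, 3.0):
        frac = expected_fraction_covered(fold)
        print(f"{fold:.0f}-fold coverage: about {frac:.1%} of bases sequenced at least once")
```

Under these assumptions roughly 86% of bases are sampled at 2-fold coverage and roughly 95% at 3-fold, which is why a low-pass survey is usually enough to find at least part of most exons.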
In addition to the cancer gene projects, this approach is also being brought to bear in the Genome Center on several projects involving loci potentially involved in complex diseases, such as diabetes, obesity and neuropsychiatric disorders.
Another route to the discovery of tumor suppressor genes is to apply subtractive methods at different pre-cancer stages. In one such project, Dr. Ramon Parson, another member of the Pathology Department, is using representational difference analysis (RDA) to identify potential new tumor suppressors in breast cancer. By sequencing many members of these RDA libraries, it should be possible to identify deleted regions of chromosomes that may harbor such tumor suppressors. With its collection of slab-gel-based and capillary-based automated sequencers, our laboratory is in an excellent position to rapidly sequence such RDA, SAGE, cDNA and other libraries.
Identification of polymorphisms and mutations by direct sequencing. Certain alleles of genes can lead directly to phenotypic abnormalities, because base replacements, deletions or insertions within coding regions lead to altered amino acid sequences or to truncation of the protein product, with loss or modification of its function. Even when they occur in non-coding regions, some variations may affect gene expression, leading to inappropriate amounts of proteins with pathological consequences. In other cases, particular alleles by themselves may not cause disease, but in combination with alleles of other genes, may increase relative risk for developing a disease or may affect the severity of the disease. In our laboratory, we have been involved in projects of both kinds.
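As a minimal illustration of how a single base replacement in a coding region can leave the protein unchanged, alter an amino acid, or truncate the product, the following sketch classifies a substitution using the standard genetic code. The function name, the 0-based position convention, and the toy sequence are assumptions made for this example and are not drawn from the laboratory's software.

```python
from itertools import product

# Standard genetic code, built in TCAG order (a common compact encoding).
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def classify_substitution(cds: str, pos: int, new_base: str) -> str:
    """Classify a single-base substitution in an in-frame coding sequence.

    cds      : coding sequence starting at the first base of a codon
    pos      : 0-based position of the substituted base (illustrative convention)
    new_base : replacement base (A, C, G or T)
    """
    cds = cds.upper()
    new_base = new_base.upper()
    start = pos - (pos % 3)          # first base of the affected codon
    ref_codon = cds[start:start + 3]
    if len(ref_codon) != 3 or new_base not in _BASES:
        raise ValueError("position must fall in a complete codon; base must be A/C/G/T")
    offset = pos - start
    alt_codon = ref_codon[:offset] + new_base + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"          # protein sequence unchanged
    if alt_aa == "*":
        return "nonsense"            # premature stop: truncated product
    return "missense"                # altered amino acid


if __name__ == "__main__":
    toy = "ATGTGGGAA"                # Met-Trp-Glu, a made-up example
    print(classify_substitution(toy, 3, "C"))   # TGG -> CGG
    print(classify_substitution(toy, 5, "A"))   # TGG -> TGA
    print(classify_substitution(toy, 8, "G"))   # GAA -> GAG
```

Insertions and deletions, which can shift the reading frame, and variants in non-coding regions are outside the scope of this sketch.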
The first category includes the mutation analyses that are part of the disease gene discovery projects described above. As an additional example, we were recently involved in a project with Dr. Timothy Bestor from the Department of Genetics and Development. He and his colleagues were interested in the rare ICF syndrome, which presents with immunodeficiency, centromere instability and facial abnormalities. We provided evidence that recessive mutations in a DNA methyltransferase gene (DNMT3B) were responsible for this disease state (Xu et al., 1999). This was the first time that methylation defects were clearly shown to be involved in a genetic disease.
In the second category, polymorphisms that may predispose toward disease in combination with other genes and environmental factors, the lab has considered such diseases as asthma, schizophrenia, and diabetes, and in separate projects, such genes as Rb1 (retinoblastoma) and ATM (ataxia telangiectasia). In one ongoing project, we have been interested in various gene interactions that may culminate in the different presentations of the long QT syndrome (LQTS). This condition, which includes changes in the properties of the surface electrocardiogram that increase the risk for fatal cardiac arrhythmias, is known to be caused by mutations in various sodium and potassium channel subunit genes in the heart. Because people often suffer severe episodes following stimulation of the sympathetic nervous system (e.g., by exercise or heightened emotions), we have undertaken a genotyping project to consider whether certain alleles of critical genes in this signal transduction pathway play a role in disease progression.
Methods development. Drs. Jingyue Ju and Jim Russo, as head and associate head of the Sequencing and Chemical Biology section of the Genome Center, and their teams work very closely with each other to develop new methods and then test them on biological problems. These methods include improved sequencing and alternative mutation analysis technologies that take advantage of time-of-flight mass spectrometry, fluorescence energy transfer, sequencing by synthesis, or modified surface chemistries. The goal is to produce high-throughput diagnostic tools for the assessment of major categories of simple and complex diseases.
Russo, J.J., Bohenzky, R.A., Chien, M.C., Chen, J., Yan, M., Maddalena, D., Parry, J.P., Peruzzi, D., Edelman, I.S., Chang, Y. and Moore, P.S. (1996) Nucleotide sequence of the Kaposi sarcoma-associated herpesvirus (HHV8). Proc. Natl. Acad. Sci. U.S.A. 93: 14862-14867.
Segal, G., Russo, J.J. and Shuman, H.A. (1999) Relationships between a new type IV secretion system and the icm/dot virulence system of Legionella pneumophila. Molec. Microbiol. 34: 799-809.
Qu, X., Morozova, I., Chien, M., Kalachikov, S., Segal, G., Chen, J., Park, H., Georghiou, A., Asamani, G., Feder, M., Rineer, J., Greenberg, J.J., Goldsberry, C., Rzhetsky, A., Fischer, S.G., DeJong, P., Zhang, P., Cayanis, E., Shuman, H.A. and Russo, J.J. (2001) The Legionella pneumophila sequencing project. Invited chapter in “Legionella: Proceedings of the 5th International Symposium” ASM Press. In press.
Qu, X., Hauptschein, R.S., Rzhetsky, A., Scotto, L., Chien, M., Ye, X., Frigeri, F., Rao, P.H., Pasqualucci, L., Gamberi, B., Zhang, P., Chaganti, R.S.K., Dalla-Favera, R. and Russo, J.J. (1998) Analysis of a 69 kb contiguous genomic sequence at a putative tumor suppressor gene locus on human chromosome 6q27. DNA Seq. 9: 189-204.
Kalachikov, S., Migliazza, A., Cayanis, E., Fracchiolla, N.S., Bonaldo, M.F., Lawton, L., Jelenc, P., Ye, X., Qu, X., Chien, M., Hauptschein, R., Gaidano, G., Vitolo, U., Saglio, G., Resegotti, L., Brodjansky, V., Yankovsky, N., Zhang, P., Soares, M.B., Russo, J., Edelman, I.S., Efstratiadis, A., Dalla-Favera, R. and Fischer, S.G. (1997) Cloning and gene mapping of the chromosome 13q14 region deleted in chronic lymphocytic leukemia. Genomics 42: 369-377.
Hatzivassiliou, G., Miller, I., Takizawa, J., Palanisamy, N., Rao, P.H., Iida, S., Tagawa, S., Taniwaki, M., Russo, J., Neri, A., Cattoretti, G., Clynes, R., Mendelsohn, C., Chaganti, R.S. and Dalla-Favera, R. (2001) IRTA-1 and IRTA-2, novel Immunoglobulin Superfamily Receptors expressed in B cells and involved in chromosome 1q21 abnormalities in B cell malignancy. Immunity 14: 277-289.
Migliazza, A., Bosch, F., Komatsu, H., Cayanis, E., Martinotti, S., Toniato, E., Guccione, E., Qu, X., Chien, M., Murty, V.V.V., Gaidano, G., Inghirami, G., Zhang, P., Fischer, S., Kalachikov, S.M., Russo, J., Edelman, I., Efstratiadis, A. and Dalla-Favera, R. (2001) Nucleotide sequence, transcription map and mutation analysis of the 13q14 chromosomal region deleted in B-cell chronic lymphocytic leukemia. Blood 97: 2098-2104.
Xu, G-L., Bestor, T.H., Bourc’his, D., Hsieh, C-L., Tommerup, N., Bugge, M., Hulten, M., Qu, X., Russo, J.J. and Viegas-Péquignot, E. (1999) Chromosome instability and immunodeficiency syndrome caused by mutations in a DNA methyltransferase gene. Nature 402: 187-191.
Hall, E.J., Schiff, P.B., Hanks, G.E., Brenner, D.J., Russo, J., Chen, J., Sawant, S.G. and Pandita, T.K. (1998) A preliminary report: frequency of AT heterozygotes among prostate cancer patients with severe late responses to radiation therapy. Cancer J. Sci. Am. 4: 385-389.
Tong, A.K., Li, Z., Jones, G.S., Russo, J.J. and Ju, J. (2001) Combinatorial fluorescent energy transfer tags for multiplex biological assays. Nature Biotechnology 19: 756-759.