The Evolution of Genomics

This year’s Genome Issue goes one step beyond the genome project. In its broadest sense, the issue is concerned with taking the sequence information that is the output of the project to the next level: making it useful in studies ranging from understanding the function associated with the sequence to making the transition from gene discovery to the clinic. Two review articles as well as a special fold-out chart are concerned with analyzing the data and using structural and functional criteria to elaborate a universal system in which genes and proteins can be described and classified in a logical and efficient way. These complex and exciting challenges are a consequence of progress made in ascertaining the full genomic sequences of model organisms, as well as advances in sequencing the human genome. Although only 2 percent of our genetic material has been sequenced so far, more than 50 percent of the human gene complement is likely to be represented in collections of expressed sequence tags. It is believed that some improvement of the technology might still allow the genome project to meet its 2005 deadline [see Rowen et al. (p. 605) for a discussion of the state of the art]. A full description of our genome will not be sufficient for understanding its functional organization, either for individual units or at a more integrated level. Hence, novel technologies and conceptual tools must be designed that promote a systematic approach to gene function: a transition from “structural genomics” to “functional genomics,” as discussed by Hieter (p. 601). Functional genomics will no doubt increase the number of cases in which genes can be associated with particular phenotypes, either normal or pathological. The question, as discussed by Holtzman et al. ([p. 602][1]), becomes how to successfully make the transition from genomics to clinical practice in a way that fulfills scientific criteria and respects ethical as well as social concerns.
However, whether or not they are linked to genetic diseases, it is imperative that newly isolated gene products be classified so that the future exhaustive list of our genes can be based on rational criteria and thus reflect some internal properties of our genome. Two articles in this issue address this key question (Henikoff et al., [p. 609][2], and Tatusov et al., [p. 631][3]) and discuss the bases for the elaboration of a system of classification. The past 15 years have revealed that genes can be organized as families, as defined by a discrete number of functional protein building blocks. It has become equally clear that animals tend to have the same complement of genes, although copy number may vary as the result of large-scale genome duplications. Although these links among proteins should facilitate their classification, Henikoff et al. emphasize that local duplication, rearrangement of protein-coding segments and combinations of modules, as well as unequal expansion of some subfamilies at the expense of others, have made this challenge much more complex than anticipated. These evolutionary relationships between genes may nevertheless be used as a driving principle in a system of classification. Tatusov et al. report that comparison between seven complete genomes led to the definition of more than 700 rows of orthologous groups, that is, groups of genes that show orthologous relationships based on the presence of an ancient conserved protein domain. However, the combination of various protein motifs in the course of evolution did not follow any particular organizing principle, and new chimeric proteins (containing more than one motif) of all kinds are being reported. The challenge may well be to build up a logical classification system to account for a fundamentally illogical process.
This precise issue was addressed in these columns 20 years ago by Jacob in his seminal paper “Evolution and tinkering.”[*][4] Jacob suggested that recycling of preexisting material, rather than the design of new players, was a source of molecular and regulatory innovations. By sequencing genomes, we can now contemplate the result of evolutionary tinkering and realize how important it has been in producing genetic novelties. Most importantly, the mere description of our genetic material may turn out to be decisive in our understanding of evolutionary mechanisms. The combination of protein motifs may have allowed genes to become highly pleiotropic, which in turn may have introduced important constraints on their future potential variations. Structural genomics is under way, functional genomics is coming; let us embark for “evolutionary genomics,” as it is surely on the horizon.