Chapter 99 Genomic and Proteomic Medicine in Critical Care

David Jardine, Mary K. Dahmer, Michael Quasney

• The DNA of two unrelated humans is more than 99.9% identical.

• Human genetic variation most commonly comes from single-nucleotide polymorphisms or copy-number variations. Copy-number variations are stretches of DNA of greater than 1 kb that show differences in the expected number of copies of the DNA in greater than 1% of the human population. One single-nucleotide polymorphism is believed to occur in every 100 to 300 bases.

• Although high-throughput screening technology is not commonly utilized for individual patients, in the future these technologies will likely be used to provide information on an individual patient’s disease or response to therapy.

• Gene expression arrays are providing new insights into sepsis and acute lung injury. Studies of genetic variation are helping us to better understand response to medications such as opiates and β-agonists.

• Genomic medicine will provide us with tremendous benefits and challenges. Perhaps one of the greatest benefits will be the understanding that human similarities and differences transcend the racial and ethnic categories that have proved so contentious in the past.

The recently developed disciplines of genomics, proteomics, and metabolomics are producing significant change in the biologic sciences. In the next decade, these disciplines are projected to have a major effect on clinical medicine, both in speeding the development of new therapies and in helping to create individualized therapies that are specifically tailored to the disease and drug metabolism characteristics of each patient.

One of the hallmarks of these new technologies is that they produce enormous quantities of data, so they often are described as “high-throughput” technologies. These data simultaneously reveal details about many components of a biologic system rather than focusing on a single pathway or product. To meet the challenge of extracting meaningful information from such large quantities of data, the new discipline of systems biology is being developed. The goals of systems biology are to integrate information from a variety of sources and to develop a comprehensive picture of the relationships and interactions between the components of a biologic system. This chapter will focus on these disciplines and describe their potential impact on patients in the intensive care unit.

Genomics

From the Discovery of the Double Helix to the Human Genome Project

In 1953, in a manuscript that scarcely exceeded one page,¹ the double helical structure of deoxyribonucleic acid (DNA) was described. This brief report opened the door for a new understanding of heredity and gene function. As a direct result of this discovery, the field of molecular biology emerged and the task of deciphering the genetic code began. Initially, progress was slow, depending mostly on methodical detective work and a certain measure of luck. It was not until 1983, with the localization of the mutation for Huntington disease to chromosome 4, that a gene was unequivocally linked to a physical location within the human genome (Figure 99-1). Another 10 years would elapse before the sequence of this gene was known and the molecular abnormality causing Huntington disease was identified. The cystic fibrosis gene, sequenced in 1989, was among the earliest disease-causing genes to be characterized in this way. Among the insights gained was that the disease was genetically heterogeneous. Only 70% of patients with this illness had the most common mutation (ΔF508); the remaining patients could have any of hundreds of mutations in the chloride-channel protein encoded by this gene. Knowledge of variation in the gene’s sequence helped to explain the tremendous diversity in the clinical manifestations of this disease.

Figure 99–1 Milestones in molecular biology and sequencing of the human genome.

Gene Expression and Microarrays

High-throughput automated DNA sequencing was developed in the late 1980s, allowing rapid determination of DNA sequences. This development has led to new challenges, because managing the large volumes of sequence data demanded new technologies. Fortunately, the rapid increase in computing power in mainframe and desktop computers and the availability of the Internet to link investigators to public databases provided a solution to this problem. A fusion of these technologies greatly accelerated the pace of gene sequencing.

In the late 1980s and early 1990s, leaders from the National Institutes of Health and the Department of Energy began to create the infrastructure necessary for large-scale sequencing of the human genome. Although the Human Genome Project began in the early 1990s, the international effort to sequence the entire genome did not begin until 1998.² Remarkably, this massive project was completed just 5 years later.³ Scientists throughout the world who participated in the Human Genome Project contributed DNA sequence data to public databases so that the entire sequence of the human genome is freely available to the scientific community.

In the early 1990s, as the DNA sequence of an increasing number of genes became available, investigators began to experiment with gene expression microarrays. These investigators made microarrays from slides with a series of spots, in which each spot contained the DNA from a single gene. As the technology improved, the investigators were able to examine the expression of an increasingly large number of genes in a single experiment. In the earliest published experiments, the investigators examined expression patterns of 45 yeast genes.⁴ Less than a decade later, commercially produced gene expression microarrays were available with more than 30,000 human genes on each array.⁵ The field of functional genomics emerged when gene expression microarrays made it feasible to study the expression of thousands of genes at once. Gene expression microarray use has increased rapidly since 1995, when the first gene expression microarray publications appeared in the literature. These powerful tools are readily available to laboratory investigators and are beginning to find their way into clinical practice. Within the next few years, gene expression microarrays and other high-throughput technologies will find increasing numbers of applications in clinical medicine.

Gene Expression Microarrays

All gene expression microarrays start with known gene sequences bound to some kind of a platform. These DNA sequences are obtained from public databases. As new information about the sequence and function of genes becomes available, it is added to the database so that the latest information is readily available to investigators who are using microarrays. Gene expression microarrays depend on the property of a single strand of DNA in solution to hybridize (bind) with a complementary strand of DNA that is bound to the microarray slide.
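The base-pairing rule that underlies hybridization can be illustrated with a minimal Python sketch. The function names and the example probe sequence are hypothetical, chosen only for illustration. The sketch tests for a perfect match only, whereas real probe design must also account for factors such as partial matches and hybridization conditions, which are ignored here.

```python
# Watson-Crick pairing: A pairs with T, and C pairs with G.
PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the strand that would pair with seq, read in the opposite direction."""
    return "".join(PAIRING[base] for base in reversed(seq.upper()))


def can_hybridize(target: str, probe: str) -> bool:
    """True when target is the exact reverse complement of probe (perfect match only)."""
    return target.upper() == reverse_complement(probe)


probe = "GATTACA"                   # hypothetical probe bound to the slide
target = reverse_complement(probe)  # the strand in solution that would bind it
print(target, can_hybridize(target, probe))
```

In this simplified picture, a labeled strand binds a spot only if its sequence is complementary to the probe printed there, which is what makes each spot selective for one gene.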

The oligonucleotides chosen for the microarray are selected based upon their ability to bind tightly with the DNA from a single gene, thus making each oligonucleotide highly selective for one gene with little cross-reactivity for other genes. Each oligonucleotide is printed onto a known location (spot) on the surface of the slide such that the various genes are arrayed in rows and columns. This spot is the “address” of the oligonucleotide and is an important piece of information in collecting and interpreting data from microarrays.

Quantifying Gene Expression

Microarray experiments usually are designed so that gene expression from two tissue samples may be compared either directly or indirectly.⁶ This allows investigators to learn how different conditions can alter gene expression. For example, much has been learned about the biology of cancer by comparing gene expression from cancer cells to gene expression from normal cells in the same tissue.

When a gene is expressed in a cell, the DNA from that gene is transcribed into messenger ribonucleic acid (mRNA). This is transported out of the nucleus and then used as a template to guide the synthesis of proteins in the cell. At any given time, the mRNA content of a cell represents a snapshot of the genes being expressed and the proteins being made in the cell. If a gene is turned on (upregulated), more mRNA will be produced from that gene. Conversely, if a gene is turned off (downregulated), less mRNA will be produced from that gene. To quantify gene expression, mRNA is extracted from the cell and reverse transcribed to make cDNA that is labeled with a fluorescent dye. The labeled cDNA is then incubated on a microarray slide to permit the hybridization with complementary DNA oligonucleotide probes that are bound to the surface of the slide (Figure 99-2). The amount of labeled cDNA hybridized to each oligonucleotide spot will be proportional to the quantity of mRNA that was expressed from the target gene.

Figure 99–2 Making of a microarray. Ribonucleic acid (RNA) is isolated from a cell and reverse transcribed into labeled complementary deoxyribonucleic acid (cDNA). This is placed on the surface of a slide, which is covered with spots of oligonucleotides that are complementary to the cDNA. Under carefully regulated conditions, the labeled cDNA specifically hybridizes with a complementary oligonucleotide. Under a fluorescent laser, the brightness of each oligonucleotide spot is proportional to the quantity of cDNA bound to it. The brightness is measured to determine the relative amounts of RNA from each gene in the cells.

When hybridization is complete and the labeled target DNA is bound to the oligonucleotide probes, the microarray is ready for analysis. The relative amount of mRNA produced from each gene is quantitated by exposing the microarray to blue laser light and capturing the image. A strong fluorescence signal from an oligonucleotide spot indicates a large quantity of mRNA was present from that gene. The strength of signals can be compared under different experimental conditions to determine how the conditions affect gene regulation. The signal intensity for each spot is quantitated and saved in a database. Because the position of each spot corresponds to a specific oligonucleotide, the database links the information about the image intensity to information about the gene. It now is possible to assemble a profile of the gene expression in the tissue.
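As a rough sketch of how spot addresses and signal intensities might be combined, the following Python example compares two conditions. The gene names, spot layout, and intensity values are invented for illustration, and the log2 ratio is one common way to express relative expression rather than a method specified in this chapter. Real analyses also involve background correction and normalization, which are omitted here.

```python
import math

# Hypothetical layout: (row, column) address of each spot -> gene it probes.
spot_to_gene = {(0, 0): "GENE_A", (0, 1): "GENE_B", (1, 0): "GENE_C"}

# Hypothetical signal intensities measured at each address for two conditions.
control = {(0, 0): 500.0, (0, 1): 1200.0, (1, 0): 800.0}
treated = {(0, 0): 2000.0, (0, 1): 1200.0, (1, 0): 200.0}


def log2_ratios(treated_signal, control_signal, layout):
    """Map each gene to log2(treated / control) using its spot address."""
    return {
        gene: math.log2(treated_signal[addr] / control_signal[addr])
        for addr, gene in layout.items()
    }


for gene, ratio in log2_ratios(treated, control, spot_to_gene).items():
    print(f"{gene}: {ratio:+.1f}")
```

A positive value suggests the gene is upregulated in the treated sample, a negative value suggests downregulation, and a value near zero suggests no change.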

Large, high-density microarrays contain thousands of spots, each of which corresponds to a different gene. Managing and analyzing these data would be nearly impossible if not for the substantial power available from desktop computers and the use of specialized software packages designed for microarray data analysis. Using this technology, it is possible to track the expression patterns of sets of genes and observe how these patterns change under different conditions. Analysis of gene expression patterns can provide new insights into how tissues and organs respond to a disease or a therapy.

Knowledge gained from expression microarray studies has greatly enhanced our understanding of the mechanisms organisms employ in response to environmental stresses, diseases, metabolic challenges, and therapeutic interventions. At the cellular level, maintenance of homeostasis often means altering gene expression to compensate for changes in a cell’s environment. Learning which genes are activated or suppressed by changes caused by disease can suggest new targets for therapies directed specifically at correcting the molecular changes created by a disease state.

Genes and Human Variation

When contemplating the great diversity among humans, it is somewhat surprising to realize that the DNA of two unrelated humans is more than 99.9% identical.⁷ Although the vast majority of nuclear DNA is identical from one person to the next, a small fraction of DNA sequence (~0.1%) varies between individuals and is responsible for the genetically determined variation in our physical characteristics and physiology. Genetic variability also appears to be involved with susceptibility to some diseases, as well as therapeutic responses to treatment.

The sequencing of the human genome (actually now several individuals’ genomes) and the advent of high-throughput sequencing and genotyping technologies have revolutionized the understanding of gene structure and genetic variation. Many genes are polymorphic; that is, there are small differences in DNA sequence between individuals. Polymorphisms are sites in DNA in which variation at a specific nucleotide or DNA region is found in greater than 1% of the general population, and in some instances in as much as 50% of the population. (Mutations are considered to be sites at which variation occurs in 1% or less of the population.) Polymorphisms may alter protein level or function in several ways. For example, altering a single base can alter an amino acid in a protein, which may lead to a change in protein function. Polymorphisms can also have significant effects without altering the protein itself. A polymorphism occurring in a promoter region, which controls gene expression by controlling mRNA synthesis, may lead to increased or decreased synthesis of that protein. Although polymorphisms are still being mapped and the function of most polymorphisms is still being defined, it is clear that these genetic variations account for the vast majority of inherited human phenotypes, from differences in hair color to differences in response to medications.
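The 1% convention described above can be written as a one-line rule. This Python sketch is illustrative only; the function name is hypothetical, and the input is assumed to be the fraction of the population in which the variation is found.

```python
def classify_variant(population_frequency: float) -> str:
    """Label a variant site using the 1% population-frequency convention."""
    if not 0.0 <= population_frequency <= 1.0:
        raise ValueError("population_frequency must be between 0 and 1")
    # Greater than 1% of the population: polymorphism; 1% or less: mutation.
    return "polymorphism" if population_frequency > 0.01 else "mutation"


print(classify_variant(0.30))   # found in 30% of the population
print(classify_variant(0.002))  # found in 0.2% of the population
```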

Polymorphic sites within a gene do not necessarily affect the expression or the function of the gene product. However, polymorphisms are of interest not only because a subset of them is responsible for genetic variability, but also because they occur fairly frequently in the human genome and can be used as markers to map genes involved with disease to specific regions of the genome. To be used as a marker, a polymorphism does not have to change the expression or function of a protein product but rather only needs to be linked to the gene involved with the disease of interest.

Single-Nucleotide Polymorphisms

Polymorphisms in DNA sequence may exist in several forms with the most common form due to the substitution of one nucleotide for another. These single nucleotide substitutions are called single-nucleotide polymorphisms (SNPs). For example, the sequences GATCACA and GATTACA differ because one of the cytosines (C) in the first sequence has been replaced by a thymine (T) in the second sequence. This example represents the most common human SNP, which involves the substitution of T for C.
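The comparison in this example can be reproduced with a short Python sketch that reports where two aligned sequences of equal length differ. The function name is hypothetical, and the sketch assumes the sequences are already aligned position by position.

```python
def differing_positions(seq_a: str, seq_b: str):
    """Return (1-based position, base in seq_a, base in seq_b) for each mismatch."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


print(differing_positions("GATCACA", "GATTACA"))  # [(4, 'C', 'T')]
```

The single reported difference is at the fourth base, where C in the first sequence corresponds to T in the second.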

Although such substitutions may occur spontaneously and represent a new mutation, the vast majority of the observed substitutions are stable variations in the human gene pool. SNPs are the most common type of polymorphism, and are thought to account for approximately 90% of human variation.⁸ One SNP is believed to occur in every 100 to 300 bases. Although most of the SNPs in the human genome remain to be identified, if this figure holds true for the entire genome of roughly 3 billion bases, then on the order of 10 to 30 million SNPs exist in our genome and constitute an enormous source of variation.
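The arithmetic behind a genome-wide SNP estimate can be checked with a brief Python sketch. The genome size of 3 billion bases is an assumed round figure, and the function name is hypothetical.

```python
GENOME_BASES = 3_000_000_000  # assumed approximate size of the human genome


def estimated_snp_count(bases_per_snp: int, genome_bases: int = GENOME_BASES) -> int:
    """Estimate the number of SNPs given one SNP per bases_per_snp bases."""
    return genome_bases // bases_per_snp


low = estimated_snp_count(300)   # one SNP in every 300 bases
high = estimated_snp_count(100)  # one SNP in every 100 bases
print(f"{low:,} to {high:,} SNPs")
```

Under these assumptions, a density of one SNP in every 100 to 300 bases corresponds to roughly 10 to 30 million SNPs.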

Copy-Number Variations

In addition, polymorphisms within genes may be due to insertions or deletions of fragments of DNA, or to the presence of a variable number of tandem repeats (VNTRs) of short, repetitive DNA sequences. Some of these insertions or deletions, although submicroscopic, can be relatively large, resulting in gene copy-number variations (CNVs). CNVs are generally defined as stretches of DNA of greater than 1 kb that show differences in the expected number of copies of the DNA (which generally would be two because of the presence of one copy on each chromosome) in greater than 1% of the human population.⁹,¹⁰ Very recently it has become clear that CNVs are common in human genomes and contribute significantly to human genetic variation.¹¹,¹² In addition, CNVs have been demonstrated to be associated with a number of diseases including Williams-Beuren syndrome, DiGeorge syndrome, mental retardation, and autism.⁹,¹⁰ It is thought that alterations in phenotypes due to CNVs are due to differences in gene dosage or to gene disruption that may be caused during the duplication or deletion. This is a new and rapidly expanding field and the next 10 years will likely determine how large a role CNVs play in human variation and disease.

Genotyping and Microarrays

DNA microarrays have been designed for both SNP and CNV genotyping. SNP microarrays are similar to those used for gene expression studies, but the oligonucleotides on SNP arrays have each been designed to selectively hybridize with one form of an SNP. Some of these microarrays have probes for almost a million different SNPs from throughout the genome and can quickly reveal the genotypes of an individual at these sites. Many of these same arrays also have the ability to genotype the individual for almost a million CNVs. Such arrays have been used for genome-wide association studies to identify genes associated with complex diseases such as asthma¹³–¹⁶ and diabetes.¹⁷–¹⁹ The ability to define the genetic components of variation with speed and precision is more than a valuable research tool. In the near future, this information will be used to identify disease and to help select therapies for illness, taking into consideration the individual variations that can be predicted on the basis of a patient’s SNP and CNV genotype. As more polymorphisms are identified and their function understood, this technology will probably become an integral part of clinical practice. In the future, this new technology likely will permit physicians to plan highly individualized therapy for each patient, taking into account issues such as individual disease susceptibility and variations in drug metabolism.

Proteomics

Genes are the main sites of biologic information, but proteins are the main centers of biologic activity, which gives proteins a unique importance. The discipline of proteomics encompasses the study of all the proteins encoded by the genome present in specific tissues, cells, or fluids. Proteomics encompasses not only the study of differences in protein levels but also the study of the modifications that occur after protein synthesis. Study of the proteome is particularly important because levels of mRNA often do not correspond to levels of the protein product.²⁰ In addition, it is estimated that there are over a million proteins encoded by only about 30,000 genes, suggesting that there is substantial protein processing and modification involved in generating the proteome.²¹ In clinical medicine, proteomic studies often examine differences in proteins between normal and diseased—or between untreated and treated—cells, tissues, or body fluids. Often the goal is to identify biomarkers associated with disease, or to identify novel targets for drug development.

Because the protein complement within a cell can vary widely over time in response to intracellular and extracellular influences, any picture of the proteome must consider these influences. Knowing which genes are expressed or suppressed by a given disease state is important but is only part of the picture. After proteins are synthesized, they can be modified in a number of ways that can dramatically alter their function. Such alterations are generally due to activation of a signaling cascade or enzyme pathway. Because activation of these cascades and pathways can occur without the activation of gene expression, gaining a full picture of the functioning of a cell, particularly during differentiation, development of disease, or response to extracellular signals or drugs, will require determination of protein levels, protein modifications, and protein interactions.

Unfortunately, proteins are much more complex than DNA and RNA in a variety of ways. Proteins are composed of 20 amino acids rather than the four nucleotides that constitute DNA and RNA. The three-dimensional structure of proteins, which is critical to their function, usually is much more complicated than the three-dimensional structure of DNA. Finally, after proteins are synthesized, they undergo a variety of modifications (e.g., cleavage, phosphorylation, and glycosylation) termed posttranslational modifications. In contrast, DNA undergoes relatively little modification after synthesis. Proteomic methods are being developed to detect and measure posttranslational modifications.²²,²³ The complexity of proteins has slowed the development of high-throughput methods for examining large numbers of proteins simultaneously. Nevertheless, great progress has been made in this field and a number of new techniques are being developed to enhance the use of proteomics for studying disease.²¹ Some of the techniques bear a resemblance to DNA-based microarrays, except that proteins or ligands rather than oligonucleotides are spotted on a slide.²⁴,²⁵ Other approaches, including mass spectrometry-based analysis of proteins, are also showing considerable promise.²¹,²⁵,²⁶

Metabolomics

Perhaps the newest of the “omic” fields is metabolomics, the study of all the small molecules, primarily metabolites, within cells, tissues, organs, or biological fluids. The goal is to provide a comprehensive picture of the metabolic state of a cell or tissue by measuring the full suite of metabolites and small molecules. The metabolites produced in a cell can vary widely depending on external influences, including the environment, and can reflect the health of the cell. In a sense, metabolites are at the end of the cellular information chain (because they are dependent on genes, gene expression, and proteins), but the metabolic state of the cell often drives mRNA and protein synthesis through feedback loops.

Because metabolites are a heterogeneous group of small molecules, many of which are structurally unrelated, this field presents great challenges. Substantial progress is being made in measuring the metabolic state of a cell using the technologies of gas and liquid chromatography coupled to mass spectrometry.²⁷–³⁰ Recently, relatively high-throughput metabolomics approaches have been utilized to identify metabolomic signatures for a number of disease processes including cancer, motor neuron disease, and type 2 diabetes, and have demonstrated that metabolomic signatures are also useful in identifying drug-response phenotypes.²⁷,²⁹ Metabolomics studies in cancer patients have resulted in the use of this technology for tests for diagnosis of breast and prostate cancer that are paid for by insurance providers.²⁹ Future metabolomic studies will likely help elucidate disease processes, identify biomarkers, and identify new drug targets.

Systems Biology

Together, the fields of genomics, proteomics, and metabolomics characterize biologic processes at a level of detail that was almost unimaginable 20 years ago. However, amalgamating these details into a meaningful narrative is one of the greatest challenges facing scientists (the genomic data alone can have more than 30,000 data points per experiment). The ambitious goal of systems biologists is to integrate data from all the “omic” disciplines, identify important components, and assemble the knowledge into a meaningful whole that can be validated (Figure 99-3).³¹ Integrating information about system structures, system dynamics, the control method, and the design method can lead to a systems-level understanding of an organism.³² Because of the obvious technical challenges in such an undertaking, this field is still relatively new. Nevertheless, a systems biology approach has already produced novel biologic insights regarding the regulation of innate immunity³³,³⁴ and protein kinases.³³ As expertise in this field continues to grow, there is optimism that it will lead to new insights in such diverse areas as drug discovery,³⁵,³⁶ synthetic transgene control networks,³⁷ and neurologic diseases,³⁸ to name a few.

Figure 99–3 The goal of systems biology is to provide an integrated picture of all cellular functions. Each of the “omic” fields provides a comprehensive picture of one aspect of cellular function.

(Modified from Minie ME: Module 7, expression resources, NCBI Advanced Workshop for Bioinformatics Information Specialists. Available at http://www.ncbi.nlm.nih.gov/Class/NAWBIS/.)

Metabolic control analysis is another discipline dedicated to the development of an integrated overview of genetic, enzymatic, and substrate control mechanisms in biologic systems.³⁸,³⁹ When a metabolic control analysis is fully developed, a control coefficient is assigned to each step in an enzymatic pathway.⁴⁰ These coefficients reflect the magnitude of change that is induced in a pathway compared with the change in the state or level of an enzyme. Enzymes with high coefficients are logical targets for therapeutic intervention (drug design). Identification of important regulatory points also can help with the understanding of carcinogenesis and can provide new insights into genetic disorders.
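To make the idea of a control coefficient concrete, the Python sketch below estimates one numerically as the fractional change in pathway flux divided by the fractional change in the level of a single enzyme. The two-step flux model and the enzyme levels are hypothetical and were chosen only to keep the arithmetic simple; they do not describe any real pathway.

```python
def pathway_flux(e1: float, e2: float) -> float:
    """Hypothetical steady-state flux through two enzyme steps in series."""
    return 1.0 / (1.0 / e1 + 1.0 / e2)


def control_coefficient(flux, levels, index, rel_step=1e-6):
    """Estimate (dJ/J) / (dE/E) for one enzyme using a small relative perturbation."""
    base = flux(*levels)
    perturbed = list(levels)
    perturbed[index] *= 1.0 + rel_step
    return ((flux(*perturbed) - base) / base) / rel_step


levels = [1.0, 4.0]  # hypothetical enzyme levels; the first enzyme is the scarcer one
coeffs = [control_coefficient(pathway_flux, levels, i) for i in range(len(levels))]
print([round(c, 3) for c in coeffs])
```

In this toy model the scarcer enzyme has a coefficient of about 0.8 and the more abundant enzyme about 0.2, so changing the level of the first enzyme would be expected to alter flux far more than changing the second.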

Clinical Applications

The new technologies described above offer great promise for clinical medicine. At the research level, new biologic insights already are becoming available as a result of high-throughput technologies. Clearly, genomic screening of tumors can aid in diagnostic classification, and SNP analysis of enzymes involved in drug metabolism can characterize an individual’s response to some drugs. Although high-throughput screening technology has not yet been implemented routinely for individual patients, almost certainly in the near future high-throughput technologies will provide information about an individual patient’s disease or response to therapy.

Cancer

Oncologists have made extensive use of gene expression profiling to revise and more accurately classify the prognostic categories of malignancies.⁴¹–⁴⁴ One of the earliest uses of microarray technology for prognostic purposes was to study lymphomas using a specialized microarray, the “Lymphochip.” Using this microarray, investigators were able to identify different histologic classes of lymphoma by their gene expression patterns, indicating that histologically distinguishable tumors differ in their gene expression, so these tumors can be distinguished at a molecular level. More importantly, the investigators found patients with gene expression patterns that permitted them to separate B-cell lymphomas into two groups: one group that resembled germinal center B cells and another that resembled activated B lymphocytes. Although these subgroups had identical histology, patient survival was significantly altered by the gene expression patterns. The lymphomas that exhibited gene expression patterns similar to germinal center B cells had a 5-year survival of 76%, whereas lymphomas that demonstrated gene expression patterns similar to activated B cells had a 5-year survival of 16%.

The promise of expression arrays to help define the prognosis of tumor types has also been used effectively in classifying breast cancers according to risk of metastasis. Use of microarray analysis of gene expression patterns permitted grouping of patients into high- and low-risk groups with greater accuracy than currently used clinical parameters.⁴⁵,⁴⁶ Although these investigations were performed with high-density microarrays containing thousands of genes, the investigators found that expression levels of just 70 genes were sufficient to distinguish risk groups.⁴⁵ The authors believed that these results indicate the propensity to metastasize was an inherent genetic property of certain tumors and that this was not necessarily something that developed late in tumorigenesis. In the near future, these tools may be used to select patients who will require adjuvant therapy and to spare patients in the low-risk group who will not benefit from therapy.

Proteomic and metabolomic studies have also been utilized to characterize cancers. As described earlier (in the Metabolomics section), a specific metabolomic signature has been identified that is now used diagnostically to identify patients with breast and prostate cancer. Proteomic approaches have also been utilized in cancer research. Such studies have generally focused either on better understanding the cancer process with the goal of identifying novel therapeutic targets or on identifying biomarkers that can be used clinically. Proteomics studies have provided significant insight into the mechanism of cancer development, and currently much effort is targeting the use of proteomics to identify biomarkers that can be utilized in clinical oncology.⁴⁷,⁴⁸
