Melbourne Brain Genome Project : Introduction

The background to the Melbourne Brain Genome Project (MBGP)
The aims of the MBGP
Why choose to study the brain?
How valid is the mouse as an experimental model for human?
How does Serial Analysis of Gene Expression (SAGE) work?
What are the advantages of SAGE?
Which groups comprise the MBGP?
Syndromes, disorders, diseases and models.
A systematic comparison with SAGE and microarrays
Flow-on benefits to the research community
Future plans

The background of the MBGP.


Gene expression analyses quantify global temporal and spatial patterns in gene expression. Understanding these patterns, how they relate to those of other genes, and how they respond to, for example, behavioural alterations or the onset of disease is of enormous value, hence the great interest in "global" gene expression technologies such as cDNA microarrays and SAGE. With approximately 90-96% of the mouse genome available in the current mouse assembly, it is possible to exploit genome-scale approaches to the study of the brain. The Melbourne Brain Genome Project (MBGP) uses murine models for quantifying gene expression in the brain, enabling correlation of expression with different developmental, pathological, or functional states. The MBGP's principal gene expression technology is Serial Analysis of Gene Expression (SAGE).

MBGP: neural development, Down syndrome, neurodegeneration.

The MBGP has five main aims:

to amass catalogues of genes representing developmental stages, functional regions and stem-cell derived neurospheres of normal and mutant mouse brain
to reveal gene expression profiles characteristic of Down syndrome mouse models
to build and compare databases of expressed genes in mouse models of neurodegenerative diseases
to systematically compare SAGE and microarray data and utilize SAGE data to validate microarray analyses
to systematically describe the molecular anatomy of previously undiscovered genes during brain development and to test the function of a subset of these genes using a variety of techniques including gene knockout technology

Why choose to study the brain?

Our brains make us what we are, both as individuals and as a species. The brain presents unique challenges to the study of how gene expression is manifested in specialized cell form and function. The brain possesses a complex array of cell types falling into two broad groups (neurons and glia). Importantly, groups of neurons that appear morphologically similar can have very different functional properties and molecular phenotypes, providing an opportunity to study the problem of how gene expression regulates the development, structure and function of individual cells. In addition, the brain displays unparalleled sophistication of function, being responsible for sensory processing, learning and memory, cognition and reasoning, emotion and coordination of motor and autonomic activity. Mounting evidence suggests that these higher-order functions are dependent on gene expression alterations.

An understanding of brain development is unlikely to advance without consideration of the multiple gene sets that are activated during complex morphogenetic events including long-distance migration from germinal zones into specific layers and regions; interactions with the radial glial scaffold, other migrating neurons and the surrounding extracellular matrix; reciprocal innervation with specific targets; and juggling of the above with progressive lineage restriction and cellular differentiation programs. The MBGP aims to provide a molecular inventory of genes that are expressed in developing and adult brain tissues in defined space and time. Expression profiles derived from different sources of neural tissue will reveal molecular asymmetry, enabling detection of new genes present in one domain but not in another and uncovering different complexities of gene expression in different brain territories. SAGE libraries, containing information on sequence abundance and complexity, can be further analysed for higher order correlations using appropriate algorithms. The MBGP plans to address the molecular bases of brain regionalization by comparing libraries from various time points including embryonic (ganglionic eminences), E12, and adult brain regions. Comparison of regional gene expression will provide a foundation for higher resolution analysis of particular brain nuclei or cell types and pave the way for system level interpretations.

SAGE technology can also be applied to the study of the transcriptional networks activated in response to particular developmental signals. The MBGP plans to conduct such studies with two mutant mouse models: p75 and disabled-1. The p75 neurotrophin receptor mediates the death of neural cells, including stem cells, during development and in the damaged and diseased nervous system (Coulson 1999, Coulson 2000). Preliminary studies indicate that p75 is likely to be involved in maintaining the crucial balance between neural precursor proliferation and apoptosis; however, the components that make up the downstream signaling pathway are unknown. The consequences of inactivating the p75 gene on the molecular profile of stem cell neurospheres (Rietze 2001) will be assessed. Disabled-1 (Dab1) is part of the Reelin signaling pathway, important for directing neuronal migration and positioning in the developing cortex (Rice 1999). Mutations in Dab1, or in other members of the Reelin pathway, result in inversion of cortical layers; however, the transcriptional consequences of disrupting the Reelin signaling pathway have not been analyzed. Recent data from the Tan laboratory (Hammond 2001) suggest that Dab1 activation leads to increased neuron-neuron adhesion, and the MBGP aims to test this hypothesis by comparing the cortical transcriptomes of Dab1 -/- mutants with those of wild-type mice. Similar SAGE strategies have been spectacularly successful, revealing previously unknown components of tumor suppressor signaling pathways (Polyak 1997, He 1998).

How valid is the mouse as an experimental model for human?

Differences in gene expression between developmentally regulated mouse and human orthologues have been demonstrated (Ross 2000); however, preliminary comparisons of the mouse and human brain transcriptomes show that there is good correlation for highly expressed genes in both transcript identity and abundance (Fougerousse 2000). With the completion of both the human and mouse genomes, it will be possible to capitalise enormously on SAGE data, both existing and yet to be generated. Detailed comparisons will be possible and the absolutely quantitative nature of SAGE data will be important in determining the validity of mouse global gene expression studies for extrapolation to human. A longer term goal is that the genes identified may be possible targets for therapeutic intervention. Translation and comparison of our mouse studies to human neurological diseases will bring our analyses closer to direct application.

How does Serial Analysis of Gene Expression (SAGE) work?

SAGE rests on two principles (Velculescu 1995): first, a short tag of 10 base pairs acts like a bar-code, containing sufficient information to distinguish between up to 1,048,576 (4^10) transcripts, provided the tag is taken from a defined position in the mRNA; second, by joining the tags together for sequence analysis, SAGE provides for rapid and automatable data collection and analysis.
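The bar-code idea can be illustrated with a minimal in silico sketch. It assumes that the "defined position" is the anchoring-enzyme site closest to the 3' end of the transcript and that the anchoring enzyme is NlaIII (recognition site CATG). Real SAGE data come from sequenced tag concatemers rather than from full transcript sequences, so the function names and input sequences here are hypothetical and serve only to show the tag arithmetic and counting.

```python
from collections import Counter

ANCHOR_SITE = "CATG"  # NlaIII recognition site (assumed anchoring enzyme)
TAG_LENGTH = 10       # tag length in base pairs


def tag_space(tag_length=TAG_LENGTH):
    """Number of distinct tags of the given length over A, C, G, T."""
    return 4 ** tag_length


def extract_tag(cdna, anchor=ANCHOR_SITE, tag_length=TAG_LENGTH):
    """Return the tag immediately 3' of the 3'-most anchor site.

    Returns None if the sequence has no anchor site or if fewer than
    tag_length bases follow it.
    """
    seq = cdna.upper()
    pos = seq.rfind(anchor)  # last (3'-most) occurrence
    if pos == -1:
        return None
    start = pos + len(anchor)
    tag = seq[start:start + tag_length]
    return tag if len(tag) == tag_length else None


def count_tags(cdnas):
    """Build a digital expression profile: tag -> number of occurrences."""
    tags = (extract_tag(s) for s in cdnas)
    return Counter(t for t in tags if t is not None)
```

Because each transcript contributes one tag from one fixed position, the resulting counts are directly comparable between libraries built with the same enzymes.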

What are the advantages of SAGE?

As SAGE relies on sequencing to identify genes, it may be considered a variant of expressed sequence tag (EST) analysis. While large-scale EST sequencing is "somewhat" quantitative and also an effective approach to gene discovery, it is laborious due to the length of the clones and the high level of redundancy (more than 2 million human ESTs have been found to collapse by UniGene clustering to only approximately 86,000 unique genes; dbEST release, October 2000). In comparison, by reducing the sequencing effort to the minimum sequence length required for unambiguous transcript identification, SAGE results in an approximately 40-fold increase in efficiency over EST sequencing. Techniques such as cDNA microarrays (Schena 1995) or oligonucleotide arrays (Lockhart 1996) have been used to compare the expression of thousands of genes in a variety of tissues but are limited to analysing only previously identified transcripts. New methods and ever-increasing numbers of molecules to array make microarrays very much an evolving technology, and it can therefore be difficult, if not impossible, to compare results between microarray experiments. Microarrays are also a "closed system" in that it is necessary to know what is on the microarray. It is possible that even the developing microarray technologies will not have the sensitivity to detect and quantitate the preponderance of low abundance transcripts, given the limitations of hybridization kinetics.

The power of SAGE:

SAGE does not depend on the prior availability of transcript information (Velculescu 1995).
SAGE is an absolutely quantitative methodology.
SAGE can be considered to be "global".
There is a high probability that transcripts detected by SAGE that cannot be assigned an identity (e.g. rare transcripts) will correspond to previously unknown genes. [A SAGE tag sequence has sufficient information to generate longer cDNA fragments for gene identification (Polyak 1997)].
SAGE data is in a standard, digital format allowing quick and easy comparison of data sets from within lab experiments and from external groups.
The data files do not have large storage requirements.
Data can be stored and reanalysed at any stage in the future as new information is acquired.
A number of SAGE data sets are freely available and a system is in place to enable all future SAGE data to be deposited in the public domain (NCBI SAGEmap).
SAGE is high-throughput, something that EST sequencing lacks. The MBGP is aided in this respect by the close proximity of the Australian Genome Research Facility (AGRF).
SAGE provides the ability to uncover higher order organization patterns of chromosomal arrangement (e.g. RIDGEs, Regions of IncreaseD Gene Expression, in the Human Transcriptome Map; Caron 2000).

Current issues with SAGE, for example the identifying of unknown tags, will be reduced as genome sequencing nears completion. Many genes not currently in the databases will be predicted from accumulating genomic sequence and corresponding cDNAs will eventually be arrayed.

Which groups comprise the MBGP?

The MBGP is a collaboration between the laboratory of Dr. Hamish Scott at The Walter and Eliza Hall Institute of Medical Research, the Brain Development team under Associate Professor Seong-Seng Tan at the Howard Florey Institute of Experimental Physiology and Medicine, and Professor Colin Masters' group at the Department of Pathology at The University of Melbourne.
Dr. Scott has contributed extensively to the genetic, physical, and transcription maps of human chromosome 21 (HC21), including participation in the cloning and functional characterization of 24 HC21 genes, leading to the identification of genes for two HC21 monogenic disorders. The identification of the genes on HC21 is essential in order to fully understand the molecular pathogenesis of Down Syndrome.
The Tan group has demonstrated expertise in the SAGE technique, having published the results of a small-scale study identifying molecules potentially involved in the growth and migration of rat C6 glioma cells (Gunnersen 2000). In line with the major research focus of the Tan laboratory, to understand the molecular processes underlying patterning and regionalization of the developing mouse brain (Tan 1998), SAGE analyses of genes expressed in the developing neocortex are currently underway. A total of more than 40,000 SAGE tags have been sequenced from two libraries prepared from neocortex at two different time points (embryonic day 15 [E15] and post-natal day 1 [P1]). Comparison of these SAGE libraries has revealed 153 differentially expressed genes (p-chance <0.01), one third of which represent unknown genes. Of the known genes, transcription factors of the bHLH, zinc finger, forkhead box and Sox gene families featured prominently, several of which (e.g. neuroD2, Id2 and brain factor 1) have recognized roles in neurogenesis and patterning of the cortex. The majority of differentially expressed genes tested have been confirmed by Northern blot, and the detailed expression patterns of the complete set of differentially expressed genes are being characterized by in situ hybridization on developing brain tissue.
The Masters laboratory has worked on the mechanisms of neurodegeneration for more than 20 years. The focus has been on Alzheimer's and Creutzfeldt-Jakob diseases, with analyses of the amyloidogenic proteins which accumulate in the extracellular space (as plaques). More recently, the laboratory has extended its studies into the proteins which aggregate in the other major neurodegenerative conditions. Facilities within the laboratory are wide-ranging, from basic molecular genetic techniques through to human brain banking and clinical trials. Thus, results from the proposed gene expression profiles of transgenic models of these disorders can be rapidly translated into experiments on human tissues and systems.
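The comparison of two SAGE libraries described above for the E15 and P1 neocortex can be sketched as a per-tag significance test. The text does not describe how its "p-chance" values were computed, so the sketch below substitutes a two-sided Fisher's exact test as a stand-in; it is not the project's own method. The function names are hypothetical, and no correction for multiple testing is applied.

```python
from math import exp, lgamma


def _log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(count_a, total_a, count_b, total_b):
    """Two-sided Fisher's exact p-value for one tag in two libraries.

    count_a / count_b: occurrences of the tag in library A / B.
    total_a / total_b: total number of tags sequenced in each library.
    """
    k = count_a + count_b  # all observations of this tag
    n = total_a + total_b  # all tags in both libraries

    def log_p(x):
        # Hypergeometric probability that x of the k observations fall in A.
        return _log_comb(total_a, x) + _log_comb(total_b, k - x) - _log_comb(n, k)

    observed = log_p(count_a)
    lo, hi = max(0, k - total_b), min(k, total_a)
    # Sum the probabilities of all outcomes no more likely than the observed one.
    p = sum(exp(log_p(x)) for x in range(lo, hi + 1) if log_p(x) <= observed + 1e-9)
    return min(1.0, p)


def differential_tags(lib_a, lib_b, alpha=0.01):
    """Return {tag: p-value} for tags whose counts differ at the given threshold."""
    total_a, total_b = sum(lib_a.values()), sum(lib_b.values())
    hits = {}
    for tag in set(lib_a) | set(lib_b):
        p = fisher_exact_two_sided(lib_a.get(tag, 0), total_a,
                                   lib_b.get(tag, 0), total_b)
        if p < alpha:
            hits[tag] = p
    return hits
```

With two libraries of 20,000 tags each, a tag seen 30 times in one library and twice in the other falls well below a 0.01 threshold, whereas 5 versus 6 occurrences does not.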

Syndromes and diseases under study

Mouse models of neurodegenerative disorders

Over the past five years, the molecular bases of the major neurodegenerative diseases have become more clearly defined, and for each disease, the corresponding mouse model has been created based on gene targeting or transgenic technologies. The major disease categories encompass the commonly recognized sub-types of degenerations associated with the aging human brain. Yet despite the discovery of the molecular lesion in each category, no satisfactory therapeutics have emerged. A strategy aimed at identifying key genes or pathways in these diseases is to examine the downstream effects of the primary molecular lesion at a very early stage of preclinical evolution using SAGE, a technique well-suited to this approach. A primary outcome of this approach would be to define the common elements of down-stream effects across all neurodegenerative phenotypes, thus indicating a possible therapeutic strategy applicable across the whole spectrum of disorders.

The MBGP is working with a number of mouse models of neurodegenerative disorders. The main models are:
Alzheimer's disease: several transgenic lines now exist which reproducibly form abundant extracellular Aβ amyloid deposits, although none of the models fully replicates the neurofibrillary tangles and neuritic changes which are also an integral part of the human condition. A double transgenic line (APP x PS), in which over-expression of the substrate (APP) combined with abnormal processing (PS) results in earlier phenotypic onset, is established in our colony. By 16 weeks of age, doubly transgenic mice display both increased brain Aβ and behavioural abnormalities (Holcomb 1998). Cortical tissue will be harvested from these mice at 8 weeks for SAGE analysis.
Creutzfeldt-Jakob Disease: the various forms of experimental transmissible spongiform encephalopathy (TSE) represent very authentic models for this infectious neurodegenerative and neurogenetic disease (Prusiner 1998). The Masters lab uses routine inoculation of a mouse-adapted strain of a Japanese Gerstmann-Straussler-Scheinker syndrome isolate (Fujisaki strain) to generate a reproducible incubation period of 120 days. Mouse cortices will be taken for SAGE 60 days after inoculation, when the abnormal isoforms of PrP are first detectable by Western blot.
Parkinson's disease: transgenic mice over-expressing either wild type or mutant forms of human α-synuclein (van der Putten 2000, Masliah 2000) develop aggregates of α-SN in association with degenerative changes in the dopaminergic systems. These mice provide valuable models of idiopathic Parkinson's disease and multiple system atrophy, although they do not display a full phenotype. The downstream consequences of α-SN aggregation are poorly defined; it will be of immense interest to determine whether they mirror effects seen in the other types of neurodegeneration. SAGE libraries will be constructed from the striatum of 6-month-old mice.
Motor neuron disease (amyotrophic lateral sclerosis): motor neuron disease is far more restricted in its bulbar/spinal cord phenotype. The mouse models which over-express mutant SOD-1 closely match the human disease in this respect (Gurney 1994, Dal Canto 1997). Aggregation of SOD-1 with ensuing abnormal redox chemical reactions is expected to generate a defined sequence of downstream events to be analysed by SAGE. To encompass the restricted topographic distribution of the lesions in this model, we will collect spinal cord and lower brain stem from 4-month-old mice.
Huntington's disease and selected forms of spinocerebellar ataxias: the discovery of polyglutamine aggregates in this class of neurogenetic disorder immediately allowed a synthesis of ideas underlying the neurodegeneration in these conditions. Although initially thought to be primarily nuclear in their distribution, polyglutamine aggregates are also formed in the neuronal cytoplasm. Mice transgenic for exon 1 of the huntingtin gene containing an expanded CAG sequence display inclusions which may affect the expression of genes important for neuronal function and appear to result in non-apoptotic neurodegeneration (Turmaine 2000). Striatal tissue for SAGE from appropriately aged animals (depending on the transgenic mouse line used) will be available through collaborators.
Fronto-temporal dementias: this group of illnesses is still being categorised according to the patterns of Tau protein aggregation. In the first instance, however, the model in which four-repeat Tau is overexpressed (Goedert 1999) provides an approximation for the Tau aggregates seen as Pick bodies and straight filaments in Pick's disease and progressive supranuclear palsy, respectively. Mice of this line are also available for SAGE studies through collaborators.

Trisomy 21 or Down Syndrome (DS) (OMIM 190685) occurs at a frequency of approximately 1/700 live births. In contrast to the variable phenotypic penetrance in many organs, mental retardation (MR) and early onset Alzheimer disease (AD) are invariably present in brains of all DS patients such that DS is the most common genetic cause of mental retardation (Epstein). Changes in the neuropathology, neurochemistry, neurophysiology, and neuropharmacology of DS patient brains indicate that there is abnormal development and maintenance of CNS structure and function. DS is thus a model for mental retardation with abnormal development and neurodegeneration or premature aging. Importantly, two of the genes known to be involved in neurodegenerative disorders, APP and SOD1, map to human chromosome 21 (HC21). To understand fully the molecular pathogenesis of DS, it is necessary to identify all HC21 genes. The DS SAGE studies have provided the first global analysis of gene expression differences in aneuploid versus normal cells, implicating certain genes as "directly" involved in several DS phenotypes (Chrast 2000) and providing pointers to diagnostic or prognostic markers and possible targets for therapeutic intervention.
A number of viable mouse models of DS trisomic for different parts of mouse chromosome 16 (MMU16 syntenic to HC21) have been produced. While all show learning and behavioural abnormalities, these phenotypes are the most pronounced in the Ts65Dn mice which are trisomic for the largest region of MMU16 (App to Mx1, Reeves 1995) (Hernandez 1999, Baxter 2000). Ts65Dn is better characterized phenotypically than the human syndrome due to availability of tissues for testing. An increasingly long list of Ts65Dn phenotypes can be directly related to the human syndrome including skeletal defects, developmental delay, learning and behavioural deficits, and age-related degeneration of neurons (Reeves 1995). Imaging and neuropathological studies have shown few differences between the brains of DS and normal neonates and similar observations have been made for the Ts65Dn mouse. However, both DS and Ts65Dn brains show a reduction in cell number and volume in the hippocampal dentate gyrus (Insausti 1998) and the cerebellar internal granule and molecular layers (Baxter 2000) and a reduction in excitatory (asymmetric) synapses in the temporal cortex at more advanced ages (Kurt 2000, Wisniewski 1990). These neuropathological changes are likely to be associated with the impaired memory, sensory and motor function seen in both DS and Ts65Dn mice (Escorihuela 1998, Martinez-Cue, Holtzman 1996). Additionally, age-related degeneration of septohippocampal cholinergic neurons and astrocytic hypertrophy, markers of Alzheimer’s disease pathology, are seen in elderly DS individuals (Holtzman 1996). Thus Ts65Dn mice may be used to study both developmental and degenerative abnormalities in the DS brain.

Systematic comparison of SAGE and microarrays and generation of a control for microarrays

The MBGP plans to use the complementary gene expression technique of microarrays in combination with SAGE, which will enhance the utility of both techniques. A systematic comparison of SAGE results to those generated using AGRF microarrays will be performed using the same samples used in the SAGE analyses. Recent comparisons of SAGE with Affymetrix chips (Ishii 2000), filter arrays (Nacht 1999; Lyle, Chrast, Antonarakis and Scott, unpublished data) and microarrays (Chrast, Antonarakis and Scott, unpublished; Blackshaw 2000) have shown that the techniques have similar sensitivity, although microarrays tended to underestimate the fold difference in expression determined by SAGE or Northern blot (Blackshaw 2000). The MBGP aims to make microarray expression data absolutely quantitative and comparable between different experiments by employing SAGE data. Reliable and precise microarray gene expression profiling relies on comparison of hybridization efficiency between an experimental and a reference RNA sample. Differences in hybridization intensity between these RNA targets reflect relative differences in gene expression levels. Using the same reference RNA in different microarray experiments provides a common denominator for accurate and reproducible comparison of gene expression data (Eisen 1999, Lash 2000). When multiple experimental RNA samples (hybridization experiments) are compared, a common reference RNA sample provides an essential internal control, allows comparisons to be made among large numbers of samples (experiments), and can thus dramatically increase the power of a microarray experiment. The MBGP will generate large quantities of whole C57BL/6J adult mouse brain RNA, pooled from routinely sacrificed mice, as a reference RNA sample for microarray gene expression profiling. The mouse reference RNA will be made available as a reference sample through the AGRF, allowing inter-laboratory comparisons of microarray data in Australia.
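The "common denominator" role of a shared reference RNA can be shown with a small worked sketch. It assumes an idealised case in which each array yields one background-corrected intensity per gene for the experimental sample and one for the reference, and it ignores normalisation and dye effects; the function names and intensity values are hypothetical.

```python
from math import log2


def log_ratio(sample_intensity, reference_intensity):
    """log2 expression ratio of a sample relative to the shared reference."""
    return log2(sample_intensity / reference_intensity)


def compare_via_reference(log_ratio_a, log_ratio_b):
    """log2 ratio of sample A to sample B, each measured on its own array.

    Because both arrays use the same reference, it cancels out:
    log2(A/ref) - log2(B/ref) = log2(A/B).
    """
    return log_ratio_a - log_ratio_b


# Sample A on array 1, sample B on array 2, same reference RNA on both.
a_vs_ref = log_ratio(800.0, 200.0)  # A is 4-fold above the reference
b_vs_ref = log_ratio(100.0, 200.0)  # B is 2-fold below the reference
a_vs_b = compare_via_reference(a_vs_ref, b_vs_ref)  # log2 of an 8-fold difference
```

The two samples were never hybridized together, yet their relative expression is recoverable because each was measured against the same reference.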

Velculescu et al. (14) showed that the number of new unique transcripts identified approached zero at the level of 600,000 tags. We will generate a "saturated" transcriptome of 600,000 tags from the whole C57BL/6J adult mouse brain reference RNA. Microarray elements representing a subset of genes representative of different expression levels in the SAGE transcriptome (600,000 tags) will then act as additional microarray controls. Use of the reference RNA sample should allow conversion of microarray data to absolute expression levels as well as thorough monitoring of sensitivity. With the advent of multiple colour microarray scanners and additional fluorophores, it may become possible to include three (or more) RNA samples in a microarray experiment: the reference RNA and the two (or more) samples for which an immediate comparison is desired. Additionally, the saturated whole mouse brain transcriptome will allow ready identification of transcripts that are specific to the more defined areas of the brain to be studied.
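One way the conversion to absolute expression levels might work is sketched below. The text does not specify the calculation, so this assumes the simplest possible model: the SAGE tag count of a gene in the reference transcriptome is proportional to its transcript abundance, and the microarray sample-to-reference ratio is linear. The function names and numbers are hypothetical.

```python
def tags_per_million(tag_count, library_size):
    """Normalise a raw SAGE tag count to tags per million sequenced tags."""
    return tag_count * 1_000_000 / library_size


def absolute_expression(sample_to_reference_ratio, reference_tag_count,
                        reference_library_size=600_000):
    """Estimate absolute expression of a gene in an experimental sample.

    sample_to_reference_ratio: linear (not log) microarray ratio of the
        sample to the reference RNA for this gene.
    reference_tag_count: count of the gene's tag in the reference SAGE
        library (assumed here to be the 600,000-tag transcriptome).
    Returns an estimate in tags per million.
    """
    reference_level = tags_per_million(reference_tag_count, reference_library_size)
    return sample_to_reference_ratio * reference_level
```

For example, a gene with 60 tags in the 600,000-tag reference library corresponds to 100 tags per million, so a microarray ratio of 2.5 against the reference would translate to an estimated 250 tags per million in the experimental sample.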

Flow-on benefits to the research community

Upon publication, all SAGE data will be made available to the public. These data will provide a crucial stimulus for hypothesis-driven research on genes of neurological importance. In keeping with the vast majority of SAGE libraries constructed to date (SAGE 2000, Baltimore, MD, USA, Sept. 18-20), we will continue to use the anchoring enzyme NlaIII and the tagging enzyme BsmFI during SAGE library construction. In this way, the value of newly generated SAGE data is enhanced by access to other large data sets for mouse. The MBGP is pleased to offer assistance to other investigators in the production of SAGE libraries by supplying proven reagents and expertise.

Future plans

In order to more precisely define the gene expression changes associated with the neurodevelopmental defects seen in DS/Ts65Dn, we propose to conduct SAGE analyses of discrete brain regions, namely hippocampus, cortex and cerebellum, at three developmental stages (P1, P15 and P30). In addition, we will follow the subsequent neurodegeneration seen in DS/Ts65Dn by comparison of these libraries with those generated from equivalent regions of adult mice. Ts1Cje mice, which are trisomic for a region of MMU16 from Sod1 to Mx1, may be an appropriate model mainly for the neurodevelopmental defects in DS, while Ms1Ts65 mice (3 copies of MMU16 from App to Sod1, both implicated in neurodegenerative diseases; see above) may be used to study the neurodegenerative aspects of DS. To dissect the contribution of these two chromosomal segments to the DS/Ts65Dn phenotypes, gene expression profiles of hippocampus, cortex and cerebellum from these two mouse models will be analysed at the same time points as for Ts65Dn using microarrays from the AGRF. The SAGE data from the Ts65Dn mouse line will serve as a framework for the interpretation of the microarray results.


Last modified on the 28th January 2004.