Greenbaum et al
and secondary structures in the vegetative yeast cell. Other comparisons included
average biomasses, subcellular localizations and a direct comparison of mRNA
expression vs. protein abundance.
Overall Transcriptome and Translatome Similarity: Outliers Against Trend
The overall similarity we find between transcriptome and translatome contrasts somewhat with the
weak correlation between mRNA expression and protein abundance shown in Figure 2 and reported
previously (Futcher et al. 1999; Gygi et al. 1999). This reflects the way our system of overall
categories collects many proteins into robust averages. It shows that variation between proteins is
not systematic with respect to the categories. For example, individual transcription factors might
have higher or lower protein abundance than one expects from their mRNA expression, but the
category transcription factors as a whole has a similar representation in the transcriptome and
translatome.
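The averaging argument above can be illustrated with a minimal sketch. The gene names, category labels and abundance values are made up for illustration and are not data from this study; the function name category_fractions is likewise hypothetical. The sketch only shows how per-gene protein-to-mRNA ratios can differ widely while each category keeps the same share of the transcriptome and translatome.

```python
from collections import defaultdict

def category_fractions(levels, categories):
    """Return each category's share of the summed abundance levels.

    levels: dict mapping gene -> abundance (mRNA or protein copies per cell)
    categories: dict mapping gene -> category label
    """
    totals = defaultdict(float)
    for gene, level in levels.items():
        totals[categories[gene]] += level
    grand_total = sum(totals.values())
    return {cat: value / grand_total for cat, value in totals.items()}

# Made-up values for illustration only; not data from the paper.
categories = {
    "TF1": "transcription factor",
    "TF2": "transcription factor",
    "ENZ1": "enzyme",
    "ENZ2": "enzyme",
}
mrna = {"TF1": 2.0, "TF2": 6.0, "ENZ1": 20.0, "ENZ2": 12.0}
protein = {"TF1": 900.0, "TF2": 1100.0, "ENZ1": 3000.0, "ENZ2": 5000.0}

# Per-gene protein/mRNA ratios vary widely within each category...
ratios = {gene: protein[gene] / mrna[gene] for gene in mrna}

# ...yet the category-level composition is the same in both populations.
transcriptome = category_fractions(mrna, categories)
translatome = category_fractions(protein, categories)
print(ratios)
print(transcriptome)
print(translatome)
```

In this toy example the per-gene ratios range from 150 to 450, while transcription factors account for one fifth of both the transcriptome and the translatome.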
We used the reference data sets to compare mRNA expression and protein abundance for the 181
genes shared between the two sets -- the largest such comparison. While we found an overall
correlation between the two data sets, indicating that mRNA expression may be closely related to
protein abundance, we found some genes that bucked the trend. We offer possible explanations for the
aberrant behavior of some of these outliers. The outliers with higher levels of
protein abundance than expected from their mRNA expression are dominated by alcohol
dehydrogenases and glyceraldehyde-3-phosphate (G3P) dehydrogenases. G3P
dehydrogenase is known to form a bienzyme complex with alcohol dehydrogenase, so the similar
abundance pattern of these two enzymes can be rationalized (Batke et al. 1992). Alcohol
dehydrogenase is also a stress-induced protein in many organisms (Matton et al. 1990; An et al.
1991; Millar et al. 1994), induced into action when the cell undergoes trauma, and is thus perhaps
translated to a higher degree prophylactically, although the expression pattern of another
stress-induced protein, HSP70, shows that this is not always the case. Translation-related proteins are
more prominent among the outliers with lower protein abundance than expected from mRNA
expression.
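One way such outliers could be identified is sketched here. This is not necessarily the procedure used in this study: the log-log least-squares fit, the residual threshold of 1.5 standard deviations, the function names and the example values are all assumptions made for illustration.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def find_outliers(mrna, protein, threshold=1.5):
    """Flag genes whose log10 protein abundance deviates from a log-log fit.

    Returns (high, low): genes with more, and with less, protein than the
    fit predicts. The threshold is in units of the residual standard
    deviation and is an arbitrary choice for this sketch.
    """
    genes = sorted(set(mrna) & set(protein))  # genes shared by both data sets
    xs = [math.log10(mrna[g]) for g in genes]
    ys = [math.log10(protein[g]) for g in genes]
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    sd = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    high = [g for g, r in zip(genes, residuals) if r > threshold * sd]
    low = [g for g, r in zip(genes, residuals) if r < -threshold * sd]
    return high, low

# Made-up abundances for illustration only; not data from the paper.
mrna = {"geneA": 1, "geneB": 10, "geneC": 100, "geneD": 1, "geneE": 100,
        "geneF": 10, "high_protein_gene": 10, "low_protein_gene": 10}
protein = {"geneA": 1e3, "geneB": 1e4, "geneC": 1e5, "geneD": 1e3, "geneE": 1e5,
           "geneF": 1e4, "high_protein_gene": 1e6, "low_protein_gene": 1e2}

high, low = find_outliers(mrna, protein)
print("More protein than expected:", high)
print("Less protein than expected:", low)
```

Working in log space keeps a few highly abundant genes from dominating the fit; a rank-based measure would be a reasonable alternative.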
While multiple features of an individual mRNA are known to influence its expression and
regulation, how they do so is not yet clearly understood (Morris & Geballe 2000). Many non-coding
features of each mRNA species contribute to this regulation, including upstream AUG codons
(uAUGs), both the 3' and 5' untranslated regions, upstream open reading frames (uORFs) and the
overall secondary structure of the mRNA.
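As a concrete illustration of two of these features, the following minimal sketch locates uAUGs and uORFs in a 5' untranslated region. The example sequence is invented, the function names are hypothetical, and the working definition of a uORF (an AUG followed in-frame by a stop codon before the end of the UTR) is a simplifying assumption.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def upstream_augs(five_prime_utr):
    """Return the 0-based positions of every AUG in a 5' UTR sequence."""
    seq = five_prime_utr.upper().replace("T", "U")  # accept DNA or RNA letters
    return [i for i in range(len(seq) - 2) if seq[i:i + 3] == "AUG"]

def upstream_orfs(five_prime_utr):
    """Return (start, end) spans of uORFs that start and stop inside the UTR.

    A uORF is taken here to be an AUG followed in-frame by a stop codon
    before the UTR ends; end is the index just past the stop codon.
    """
    seq = five_prime_utr.upper().replace("T", "U")
    spans = []
    for start in upstream_augs(seq):
        # Walk codon by codon from the AUG until the first in-frame stop.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                spans.append((start, pos + 3))
                break
    return spans

# Invented sequence for illustration only.
utr = "GGAUGCCCUAAGGAUGGC"
print(upstream_augs(utr))  # positions of uAUGs
print(upstream_orfs(utr))  # spans of complete uORFs
```

For the invented sequence, the first AUG opens a short uORF that terminates within the UTR, while the second AUG has no in-frame stop codon before the UTR ends and so is counted only as a uAUG.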
One might conceive of using "outliers" with significantly different transcriptional and translational
behavior to find consensus regulatory sequences. One possible method would involve using
predicted mRNA structures (Jaeger et al. 1990; Zuker 2000) to find consensus structural elements
in these outliers. In particular, it might be worthwhile to investigate mRNA secondary
structure, to which the yeast translational machinery is known to be sensitive (McCarthy 1998).
The regulation of mRNA stability is certainly an additional factor causing strong disparities
between gene expression and protein abundance. Presently, there are many structures within