Greenbaum et al 
and secondary structures in the vegetative yeast cell.  Other comparisons covered average 
biomasses, subcellular localizations and a direct comparison of mRNA expression vs. protein 
abundance. 
Overall Transcriptome and Translatome Similarity: Outliers Against Trend  
 
The overall similarity we find between transcriptome and translatome contrasts somewhat with the 
weak correlation between mRNA expression and protein abundance shown in Figure 2 and reported 
previously (Futcher et al. 1999; Gygi et al. 1999). This reflects the way our system of overall 
categories collects many proteins into robust averages. It shows that variation between proteins is 
not systematic with respect to the categories. For example, individual transcription factors might 
have higher or lower protein abundance than one expects from their mRNA expression, but the 
category transcription factors as a whole has a similar representation in the transcriptome and 
translatome. 
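
The averaging effect described above can be illustrated with a minimal sketch. The gene names, category labels and abundance values are hypothetical toy numbers chosen for illustration, not values from the reference data sets, and `category_shares` is an assumed helper name rather than part of our analysis pipeline.

```python
from collections import defaultdict


def category_shares(abundance, category_of):
    """Return each category's share of the total abundance."""
    totals = defaultdict(float)
    for gene, value in abundance.items():
        totals[category_of[gene]] += value
    grand_total = sum(totals.values())
    return {cat: total / grand_total for cat, total in totals.items()}


# Hypothetical toy data: two categories with two genes each.
category_of = {"g1": "TF", "g2": "TF", "g3": "metabolism", "g4": "metabolism"}
mrna = {"g1": 2.0, "g2": 6.0, "g3": 10.0, "g4": 22.0}
protein = {"g1": 30.0, "g2": 10.0, "g3": 100.0, "g4": 60.0}

mrna_shares = category_shares(mrna, category_of)
protein_shares = category_shares(protein, category_of)

# Per-gene protein/mRNA ratios differ widely within each category ...
ratios = {gene: protein[gene] / mrna[gene] for gene in mrna}
# ... yet each category holds the same share of both populations.
print(ratios)
print(mrna_shares, protein_shares)
```

In this toy case the protein-to-mRNA ratio of individual genes ranges from under 2 to 15, while each category accounts for the same fraction of the transcriptome and of the translatome.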
 
We used the reference data sets to compare mRNA expression and protein abundance for the 181 
genes shared between the two sets -- the largest such comparison.  While we found an overall 
correlation between the two data sets, indicating that mRNA expression may be closely related to 
protein abundance, we found some genes that bucked the trend.  We offer possible explanations for 
the aberrant behavior of some of these outliers. Those outliers that have higher levels of 
protein abundance than expected from their mRNA expression are dominated by alcohol 
dehydrogenases and glyceraldehyde-3-phosphate (G3P) dehydrogenases.  G3P dehydrogenase is 
known to form a bienzyme complex with alcohol dehydrogenase, so the similar 
abundance pattern of these two enzymes can be rationalized (Batke et al. 1992). Alcohol 
dehydrogenase is also a stress-induced protein in many organisms (Matton et al. 1990; An et al. 
1991; Millar et al. 1994). It is induced into action when the cell undergoes trauma and is thus 
perhaps translated to a higher degree prophylactically (although the expression pattern of another 
stress-induced protein, HSP70, shows that this is not always the case). Translation-related 
proteins are more prominent among the outliers with lower protein abundance than expected from 
mRNA expression. 
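
One way such outliers could be flagged is sketched below. This is an assumption for illustration, not a description of the procedure used here: it fits a least-squares line to log10 protein abundance against log10 mRNA expression and reports genes whose residual exceeds a chosen multiple of the residual standard deviation. The function names, the threshold of 2 and the toy values are all hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def find_outliers(mrna, protein, threshold=2.0):
    """Return {gene: standardized residual} for genes far from the trend.

    A positive value means more protein than the fitted trend predicts
    from mRNA expression; a negative value means less.
    """
    genes = sorted(set(mrna) & set(protein))  # genes shared by both sets
    xs = [math.log10(mrna[g]) for g in genes]
    ys = [math.log10(protein[g]) for g in genes]
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    # Residual standard deviation with n - 2 degrees of freedom.
    sd = math.sqrt(sum(r * r for r in residuals) / (len(residuals) - 2))
    return {g: r / sd for g, r in zip(genes, residuals) if abs(r) > threshold * sd}


# Hypothetical toy data: protein is 10x mRNA except for gene "h".
mrna = {"a": 1, "b": 2, "c": 4, "d": 8, "e": 16, "f": 32, "g": 64, "h": 8}
protein = {g: 10 * v for g, v in mrna.items()}
protein["h"] = 8000  # 100x more protein than the common trend suggests

outliers = find_outliers(mrna, protein)
print(outliers)
```

Working in log space keeps highly expressed genes from dominating the fit. The sign of the standardized residual separates the two groups discussed above, those with more protein than expected and those with less.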
 
While it is known that multiple features of an individual mRNA influence its expression and 
regulation, how they do so is not yet clearly understood.  Many of the elements responsible for 
this regulation lie in the non-coding regions of each mRNA species.  These include upstream AUG 
codons (uAUGs), both 3′ and 5′ untranslated regions, upstream open reading frames (uORFs) and the 
overall secondary structure of mRNA.  Presently it is unclear how these act to exert their control 
(Morris & Geballe 2000).   
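
Two of these features, uAUGs and uORFs, can be located directly from sequence. The sketch below is a minimal illustration under stated assumptions: it takes only the 5′ untranslated region as input, treats every AUG in it as a uAUG, and counts a uORF only when an in-frame stop codon also falls inside that region. The function names and the example sequence are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def normalize(seq):
    """Upper-case the sequence and convert DNA-style T to RNA-style U."""
    return seq.upper().replace("T", "U")


def find_uaugs(utr5):
    """Return 0-based start positions of every AUG in the 5' UTR."""
    utr5 = normalize(utr5)
    return [i for i in range(len(utr5) - 2) if utr5[i:i + 3] == "AUG"]


def find_uorfs(utr5):
    """Return (start, end) pairs for uAUGs followed by an in-frame stop.

    `end` is exclusive and includes the stop codon. uAUGs without an
    in-frame stop inside the 5' UTR are not reported.
    """
    utr5 = normalize(utr5)
    uorfs = []
    for start in find_uaugs(utr5):
        # Step codon by codon in the reading frame set by this uAUG.
        for pos in range(start + 3, len(utr5) - 2, 3):
            if utr5[pos:pos + 3] in STOP_CODONS:
                uorfs.append((start, pos + 3))
                break
    return uorfs


# Hypothetical 5' UTR: the first AUG opens a short uORF, the second does not.
utr = "GGAUGCCCUAAGGAUGGC"
uaugs = find_uaugs(utr)
uorfs = find_uorfs(utr)
print(uaugs, uorfs)
```

Counts of this kind could be tabulated per gene and compared between outliers and genes that follow the trend. Secondary structure, by contrast, requires a dedicated folding program and is not covered by this sketch.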
 
One might conceive of using "outliers" with significantly different transcriptional and translational 
behavior to find consensus regulatory sequences.  One possible method would involve using 
predicted mRNA structures (Jaeger et al. 1990; Zuker 2000) to find consensus structural elements 
in these outliers.  In particular, it might be worthwhile to investigate mRNA secondary 
structure, to which the yeast translational machinery is known to be sensitive (McCarthy 1998). 
 
The regulation of mRNA stability is certainly an additional factor causing strong disparities 
between gene expression and protein abundance.  Presently, there are many structures within