Proteins are elaborate biopolymers balancing between contradicting intrinsic propensities to fold, aggregate, or remain disordered. Assessing their primary structural preferences observable without evolutionary optimization has been reinforced by the recent identification of de novo proteins that have emerged from previously non-coding sequences. In this paper we investigate structural preferences of hypothetical proteins translated from random DNA segments using the standard genetic code and three of its proposed evolutionarily predecessor models encoding 10, 6, and 4 amino acids, respectively. Our only main assumption is that the disorder, aggregation, and transmembrane helix predictions used are able to reflect the differences in the trends of the protein sets investigated. We found that the 10-residue code encodes proteins that resemble modern proteins in their predicted structural properties. All of the investigated early genetic codes give rise to proteins with enhanced disorder and diminished aggregation propensities. Our results suggest that an ancestral genetic code similar to the proposed 10-residue one is capable of encoding functionally diverse proteins but these might have existed under conditions different from today's common physiological ones. The existence of a protein functional repertoire for the investigated earlier stages which is quite distinct as it is today can be deduced from the presented results.
ASJC Scopus subject areas
- Ecology, Evolution, Behavior and Systematics
- Molecular Biology