1. Random sequences and unique protein structures

Alexei Finkelstein

Physical theory shows that a substantial fraction (~10^-4 to 10^-8 %) of
random amino acid sequences can form unique stable 3-D structures
under physiological conditions, fold rapidly, and have architectures
typical of protein molecules. The main peculiarity of a "protein-like"
chain is a considerable gap separating the energy of its lowest-energy
fold from the energies of the other folds; this gap plays an
essential role in the thermodynamics and kinetics of folding.


2. How can a protein chain find its unique fold?

The problem of how a protein chain can find its most stable structure
without exhaustive sorting out of all its possible conformations is known
as the "Levinthal paradox". I shall show in the lecture that the
lowest-energy fold is attained rapidly when folding occurs in the vicinity of a
thermodynamic "all-or-none" transition from the coil to the lowest-
energy fold. Such a transition requires an "edited" chain with an
enhanced stability of its lowest-energy fold. In the vicinity of the
mid-transition point, the misfolded and semi-folded states cannot "trap"
the folding chain since, even taken together, these states are less stable
than both the initial coil and the final stable fold of the chain. Therefore,
a stable fold can be reached rather rapidly here via the "nucleation and
growth" folding pathway, which provides continuous compensation of entropy
by energy in the course of folding and thus a low transition-state
free energy. At the mid-transition point, an N-residue chain
folds normally in ~exp(N^(2/3)) nsec. Therefore, a 100-residue chain
normally finds its most stable fold within minutes rather than in
10^100 psec ~ 10^80 years, according to the famous paradoxical estimate of Levinthal.
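
The two time scales quoted above can be checked with a few lines of
arithmetic. The sketch below is only an order-of-magnitude illustration: it
assumes a coefficient of exactly 1 in the exponent of exp(N^(2/3)) nsec, which
the abstract does not state (it gives the estimate only up to "~"), and the
function names are hypothetical. With that assumption a 100-residue chain
comes out at a few seconds, which is consistent with the "within minutes"
bound, while the Levinthal figure of 10^100 psec converts to roughly 10^80
years.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, about 3.156e7 s


def folding_time_ns(n_residues: int) -> float:
    """Order-of-magnitude mid-transition folding time, ~exp(N^(2/3)) nsec.

    Assumption: the coefficient in the exponent is taken as exactly 1.
    """
    return math.exp(n_residues ** (2.0 / 3.0))


def levinthal_time_years(log10_psec: float = 100.0) -> float:
    """Convert an exhaustive-search time of 10^log10_psec psec to years."""
    seconds = 10.0 ** (log10_psec - 12.0)  # 1 psec = 1e-12 s
    return seconds / SECONDS_PER_YEAR


if __name__ == "__main__":
    n = 100
    fold_seconds = folding_time_ns(n) * 1e-9  # nsec -> s
    print(f"N = {n}: nucleation estimate ~ {fold_seconds:.1e} s")
    print(f"Levinthal estimate ~ {levinthal_time_years():.1e} years")
```

Because the estimate is exponential in N^(2/3), even a modest change in the
assumed coefficient shifts the result by orders of magnitude, so the output
should be read as a scale, not a prediction.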



3. Introduction to protein structure prediction

This is a review of the state of the art in the recognition and
prediction of protein folds from their sequences. I pay special
attention to the physical background of the predictive methods. In
particular, I review the secondary structure predictions and the
"threading" methods used for recognition of protein folds. It is shown
that all the predictive methods can use only some part of the
interactions operating in the chain, and that even their energies are
not known precisely. This is the principal source of errors and
uncertainties. The errors can be reduced by using many distant
homologs, but this makes it possible to predict only a secondary
structure and a generalised folding pattern, not the particular
fold of a given chain in full detail.
