R. Wallace / BioSystems 103 (2011) 18–26

...[P]rotein folding speeds – now known to vary over more than eight orders of magnitude – correlate with the topology of the native protein: fast folders usually have mostly local structure, such as helices and tight turns, whereas slow folders usually have more non-local structure, such as $\beta$-sheets (Plaxco et al., 1998)...
A simple groupoid probability argument reproduces something of this result. Assume that protein structure can be characterized by some groupoid representing, at least, the disjoint union of the groups describing the symmetries of component secondary structures, e.g., helices and sheets. Then, in Eq. (4), we take the set $A = \cup \alpha$ as fixed, with increasing $\alpha$ representing increased structural complexity. If channel capacity is also capped by some mechanism, so that $R$ is also fixed, then the log of the folding rate will be given as
$$\log[P[H_\beta]] = \log\left[\frac{\exp[-H_\beta/\kappa R]}{\sum_\alpha \exp[-H_\alpha/\kappa R]}\right] = C(R) - \frac{H_\beta}{\kappa R}, \tag{5}$$
where $C(R)$ is positive. $\beta$ indexes increasing topological complexity, using some appropriate measure.
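The constant term can be read off by taking the log of the normalized exponential directly. A short sketch of that step, writing $\kappa$ for the constant that multiplies $R$ in the exponent:

$$\log P[H_\beta] = -\frac{H_\beta}{\kappa R} - \log \sum_\alpha \exp[-H_\alpha/\kappa R], \qquad \text{so} \qquad C(R) = -\log \sum_\alpha \exp[-H_\alpha/\kappa R].$$

With the set $A$ and the value of $R$ both fixed, this term does not depend on $\beta$.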
The simplest assumption is that $H_\beta \propto \beta$, say $H_\beta = m\beta$ for a constant $m$. Then, using an integral approximation,
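$$P[\beta] = \frac{\exp[-m\beta/\kappa R]}{\int_{0}^{\infty} \exp[-m\alpha/\kappa R]\, d\alpha} = \frac{m}{\kappa R} \exp\left[-\frac{m\beta}{\kappa R}\right], \tag{6}$$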
and
$$\log[P[\beta]] = \log[m/\kappa R] - \frac{m\beta}{\kappa R}. \tag{7}$$
Thus one expects, at a fixed value of $R$ defining a maximum channel capacity, that

$$\log[\text{folding rate}] = C - k\beta, \tag{8}$$

with $C$ and $k$ constant and all values positive.
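The normalization and the resulting linear relation can be checked numerically. The following is a minimal sketch, not the author's code: the function names, the symbol `kappa` (standing for the constant that multiplies $R$ in the exponent), and all numerical values are illustrative assumptions.

```python
import math


def folding_log_prob(beta, m, kappa, R):
    """Eq. (7): log P[beta] = log(m / (kappa * R)) - m * beta / (kappa * R)."""
    scale = kappa * R
    return math.log(m / scale) - m * beta / scale


def normalizer(m, kappa, R, upper=200.0, steps=200_000):
    """Trapezoid-rule estimate of the integral of exp(-m * a / (kappa * R))
    for a in [0, upper]. The closed form on [0, infinity) is kappa * R / m."""
    scale = kappa * R
    h = upper / steps
    total = 0.5 * (1.0 + math.exp(-m * upper / scale))  # endpoint terms
    for i in range(1, steps):
        total += math.exp(-m * i * h / scale)
    return total * h
```

For hypothetical values `m = 1`, `kappa = 1`, `R = 2`, the numerical integral should agree closely with `kappa * R / m`, and `folding_log_prob` should fall by `m / (kappa * R)` for each unit increase in `beta`, which is the constant slope $k$ of Eq. (8).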
As Ivankov et al. (2003) discuss at some length, one standard index of protein complexity is the absolute contact order (Plaxco et al., 1998):
$$\mathrm{ACO} = \frac{1}{N} \sum^{N} L_{ij}, \tag{9}$$
where $N$ is the number of contacts within 6 Å between nonhydrogen atoms in the protein, and $L_{ij}$ is the number of residues separating the interacting pair of nonhydrogen atoms. Adjacent residues are assumed to be separated by one residue.
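The definition can be sketched in a few lines of code. This is an illustrative implementation under stated assumptions, not the procedure used by the cited authors: the input format (a list of nonhydrogen atom coordinates in Å plus a parallel list of residue numbers) is hypothetical, and atom pairs within the same residue are skipped.

```python
import math


def absolute_contact_order(coords, residue_index, cutoff=6.0):
    """Mean sequence separation over all atom pairs closer than `cutoff`.

    coords        -- list of (x, y, z) tuples for nonhydrogen atoms, in angstroms
    residue_index -- residue number of each atom, same length as coords
    Assumption: pairs of atoms in the same residue are not counted as contacts.
    """
    total_separation = 0
    n_contacts = 0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            separation = abs(residue_index[i] - residue_index[j])
            if separation == 0:
                continue  # same residue: skipped by assumption
            if math.dist(coords[i], coords[j]) <= cutoff:
                total_separation += separation
                n_contacts += 1
    if n_contacts == 0:
        raise ValueError("no contacts found within the cutoff")
    return total_separation / n_contacts
```

The double loop is quadratic in the number of atoms, which is adequate for a sketch; a spatial grid or neighbor list would be the usual choice for large structures.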
Fig. 5, adapted from Gruebele (2005), reexpresses data from Ivankov et al. (2003), showing the correlation of the log of the folding rate with fold complexity, measured by the ACO. The upper line estimates folding speed limited only by fold complexity, following Yang and Gruebele (2004), and appears to represent a maximum possible rate distortion function/channel capacity, according to Eq. (8). The molecular species along the lower curve are assumed to be 'frustrated' by an irregular folding funnel, and appear to follow a narrow spectrum of relations like Eq. (7), necessarily below the line defined by maximum channel capacity, and necessarily somewhat scattered, according to the variation in $R$.
It is possible to reproduce something like Fig. 5 by describing 'smooth' and 'rough' folding funnels in terms of a Gaussian channel, that is, one in which the signal transmission $S_0 \to S_f$ is perturbed by Gaussian noise having a squared-error distortion, so that the rate distortion function has the standard form (Wallace, 2010a,b):
$$R(D) = \frac{1}{2} \log\left[\frac{\sigma^2}{D}\right]. \tag{10}$$

Fig. 5. From Gruebele (2005). Correlation of the log of the protein folding rate with fold complexity. The upper line indicates folding speeds that are limited only by fold complexity, without the 'frustration' effects of a rough folding funnel. Frustration, in this model, constrains channel capacity, and hence drives $R$ irregularly lower than the value implied by the relation for the fastest folders.

$R(D)$ is the rate distortion function at average distortion $D$, and $\sigma^2$ represents the amplitude of the imposed random noise. A smooth folding funnel would have little noise.
Plugging Eq. (10) into Eq. (7) gives, over an appropriate range of parameters, the spectrum of linear relations for log folding rate shown in Fig. 6. $D$, $m$ and $\kappa$ are fixed, and $\beta$ and $\sigma^2$ increase, as indicated.
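The substitution can be sketched as follows. This is an illustration only: the function names, the symbol `kappa` (standing for the constant that multiplies $R$ in the exponent), and the parameter values are assumptions, and the sketch is restricted to $0 < D < \sigma^2$ so that $R(D)$ is positive.

```python
import math


def gaussian_rate_distortion(sigma2, D):
    """Eq. (10): R(D) = 0.5 * log(sigma2 / D), restricted here to 0 < D < sigma2."""
    if not 0 < D < sigma2:
        raise ValueError("this sketch requires 0 < D < sigma2")
    return 0.5 * math.log(sigma2 / D)


def log_rate_line(sigma2, D, m, kappa):
    """Return (intercept, slope) of log P[beta] versus beta from Eq. (7),
    with R replaced by the Gaussian-channel R(D) of Eq. (10)."""
    scale = kappa * gaussian_rate_distortion(sigma2, D)
    return math.log(m / scale), -m / scale


# Hypothetical parameter values, chosen only to show the trend.
lines = [log_rate_line(s2, D=1.0, m=1.0, kappa=1.0) for s2 in (2.0, 4.0, 8.0)]
```

Each entry of `lines` is one straight line in the ($\beta$, log folding rate) plane. Under these assumptions, raising `sigma2` with `D`, `m` and `kappa` held fixed lowers the intercept and makes the slope less steep.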
7. Extending the model
As Serdyuk (2007) discusses, many proteins in the cell have no unique tertiary structure in isolation, although they have a distinct function under physiological conditions, that is, in partnership.
Thus their conformation is determined not only by their amino
acid

Fig. 6. Spectrum of linear relations between log folding rate and increasing topological complexity for increasing 'roughness' of the folding funnel, as measured by noise $\sigma^2$ for a Gaussian channel. $\beta$ increases to the right and $\sigma^2$ increases downward.