AMBER Archive (2009)

Subject: Re: [AMBER] Distance-covariance and PCA questions...

From: Hannes Loeffler
Date: Wed Jun 17 2009 - 02:41:42 CDT

On Tue, 2009-06-16 at 00:18 -0400, Cihan Aydin wrote:
> Third, I have read some papers that argued strongly against the validity
> of PCA analysis. The principal concern was that the sampling time
> usually used (the most I have seen was 3ns) was not enough to achieve
> convergence (a nice paper was from Rueda et al.; they ran 100ns
> simulations of about 30 proteins and argued against the reproducibility of
> MD trajectories from only a slice of the timeframe). If you had any
> personal experience with PCA, what is your opinion about this?

I agree with this view. A few nanoseconds of simulation is awfully
short, but I would not necessarily argue that you need hundreds of
nanoseconds in every case. As usual, it depends, and you will have to
check very carefully whether you have obtained convergence.

I have been working with a small protein of around 200 residues.
Comparing the first mode (via the dot product) across 5ns patches of a
30ns simulation showed that the mode could be reproduced quite well on
shorter time scales. However, this was in the open state, where the
first mode clearly dominated all others. (The mode described a
hinge-type motion, as expected from the biological function.) 30ns of
simulation of the closed state, by contrast, showed no clearly
dominating modes, i.e. the eigenvalues were of very similar magnitude.
In this case shorter simulations did not reproduce the results of the
longer simulation. It wasn't clear how much simulation would be
necessary to obtain converged results, or whether convergence could be
expected at all.
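
For illustration, the kind of check described above could look
roughly like the following. This is only a minimal sketch, not the
script used for the analysis: it assumes the trajectory is already
available as a NumPy array of shape (n_frames, n_coords) with all
frames fitted to a common reference, it compares each block's first
mode against the first mode of the full trajectory (comparing blocks
with each other is equally reasonable), and the function names and the
synthetic "open_like"/"closed_like" data are made up for the example.

```python
import numpy as np


def pca_modes(coords):
    """Return (eigenvalues, modes) of the coordinate covariance matrix.

    coords has shape (n_frames, n_coords) and is assumed to be fitted to
    a common reference already. Modes are rows, sorted by decreasing
    eigenvalue.
    """
    centered = coords - coords.mean(axis=0)
    # The right singular vectors of the centered data are the
    # eigenvectors of the covariance matrix.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    eigvals = s**2 / (coords.shape[0] - 1)
    return eigvals, vt


def block_overlaps(coords, n_blocks):
    """Absolute dot product of each block's first mode with the
    first mode of the full trajectory (the sign of a mode is arbitrary)."""
    _, full_modes = pca_modes(coords)
    overlaps = []
    for block in np.array_split(coords, n_blocks):
        _, modes = pca_modes(block)
        overlaps.append(abs(np.dot(modes[0], full_modes[0])))
    return np.array(overlaps)


def top_fraction(coords):
    """Fraction of the total variance carried by the first mode."""
    eigvals, _ = pca_modes(coords)
    return eigvals[0] / eigvals.sum()


# Synthetic stand-ins for real trajectories (hypothetical data).
rng = np.random.default_rng(0)
n_frames, n_coords = 3000, 30
direction = rng.standard_normal(n_coords)
direction /= np.linalg.norm(direction)

# One large-amplitude collective motion plus small isotropic noise.
noise = 0.5 * rng.standard_normal((n_frames, n_coords))
open_like = 5.0 * rng.standard_normal((n_frames, 1)) * direction + noise
# Isotropic noise only: all eigenvalues of similar magnitude.
closed_like = 0.5 * rng.standard_normal((n_frames, n_coords))

for name, traj in [("dominant mode", open_like), ("flat spectrum", closed_like)]:
    print(name, top_fraction(traj), block_overlaps(traj, 6))
```

Overlaps close to 1 in every block, together with a first eigenvalue
that carries most of the variance, would correspond to the open-state
situation; a flat eigenvalue spectrum with small, scattered overlaps
would correspond to the closed-state one. The uncorrelated synthetic
frames are much kinder than real MD data, so this says nothing about
how long a real simulation needs to be.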

Recently, we have been looking into soluble EGFR (about 1300 residues)
with PCA. Our longest simulation was about 100ns, if I recall correctly,
but it was not possible to get any meaningful PCA results out of it.

Hope that helps,

