AMBER Archive (2009)

Subject: Re: [AMBER] 60 giga output file .mdcrd file

From: Jason Swails (jason.swails_at_gmail.com)
Date: Tue Nov 03 2009 - 07:49:01 CST


Hello,

Ptraj can do this for you as well. You should know how many frames are in
your mdcrd (since ptraj told you the first time when it stripped the
waters). Suppose there are 50 000 frames in your trajectory file. To split
this into 5 equal chunks, you would use the following ptraj commands:

trajin YOURMDCRD 1 10000 1
trajout chunk1

trajin YOURMDCRD 10001 20000 1
trajout chunk2

trajin YOURMDCRD 20001 30000 1
trajout chunk3

etc., where each of the above is a new ptraj script. This will take quite
a while with ptraj 1.2 since, as described earlier, it checks through
the entire mdcrd file before processing. However, if you run these 5
scripts, go to lunch, and come back, you should have 5 equally sized, smaller
mdcrd chunks to work with.
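
If you would rather not work out the frame ranges by hand, something like
the following Python sketch could write the script text for you. It is not
part of ptraj; it only generates the trajin/trajout lines. It assumes you
already know the total frame count and that the start and stop frames given
to trajin are both inclusive. The function names chunk_ranges and
chunk_scripts are made up for this illustration.

```python
def chunk_ranges(total_frames, n_chunks):
    """Return inclusive, non-overlapping (start, stop) ranges, 1-based."""
    if n_chunks < 1 or n_chunks > total_frames:
        raise ValueError("n_chunks must be between 1 and total_frames")
    base, extra = divmod(total_frames, n_chunks)
    ranges = []
    start = 1
    for k in range(n_chunks):
        # The first `extra` chunks get one more frame when the split is uneven.
        size = base + (1 if k < extra else 0)
        stop = start + size - 1
        ranges.append((start, stop))
        start = stop + 1
    return ranges


def chunk_scripts(mdcrd, total_frames, n_chunks):
    """Build the text of one ptraj input script per chunk."""
    scripts = []
    for k, (start, stop) in enumerate(chunk_ranges(total_frames, n_chunks), 1):
        scripts.append(f"trajin {mdcrd} {start} {stop} 1\ntrajout chunk{k}\n")
    return scripts


if __name__ == "__main__":
    # Hypothetical values matching the 50 000-frame example above.
    for text in chunk_scripts("YOURMDCRD", 50000, 5):
        print(text)
```

Each string would be saved to its own file and passed to ptraj as a
separate script, exactly as with the hand-written versions.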

Another option is to 'thin out' your mdcrd a bit. With such a large file,
you probably didn't need to save snapshots so frequently. Therefore, if you
wanted to create a new mdcrd that contains every 10th frame (i.e. frames 10,
20, 30, 40, 50, 60, etc.), you probably would not miss much in the
visualization (unless you were trying to visualize events on a timescale
equal to your output frequency). This is easily done by using the ptraj
script

trajin YOURMDCRD 1 9999999999 10
trajout SHORTENEDMDCRD

(This would actually pull frames 1, 11, 21, 31, 41, ... up to your total
number of frames). This is a much less time-consuming option since your
coordinates only have to be read once (rather than one time for each chunk
you want to divide it into).
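
To check which frames a given start/stop/offset triple would select, the
arithmetic can be mimicked in a few lines of Python. This is only an
illustration, not ptraj code. It assumes 1-based frame numbers, an inclusive
stop, and a stop value that is capped at the number of frames actually in
the file. The function name selected_frames is made up for this example.

```python
def selected_frames(total_frames, start=1, stop=None, offset=1):
    """List the 1-based frame numbers picked by a start/stop/offset triple."""
    # Assumption: a stop beyond the end of the file is capped at the last frame.
    last = total_frames if stop is None else min(stop, total_frames)
    return list(range(start, last + 1, offset))


if __name__ == "__main__":
    # Hypothetical 50 000-frame trajectory, keeping every 10th frame.
    frames = selected_frames(50000, 1, 9999999999, 10)
    print(frames[:5], "...", frames[-1], "count:", len(frames))
```

For a 50 000-frame file this starts at frames 1, 11, 21, 31, 41 and keeps
5000 frames in total.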

Good luck!
Jason

2009/11/3 Fernando Martín García <fmgarcia_at_cbm.uam.es>

> Hello Silvia
>
> I had the same problem. My solution is to make a counter for your crd
> files like this:
>
> #!/bin/csh
> set i=1
> # n is the number of your files (my advice is to use an increase of 15)
> while ($i <= n)
>     echo "yourfile$i.coord >> todas_las_trayectorias" $i
>     # append each decompressed file to a single output trajectory
>     zcat yourfile$i.coord.gz >> fileoutput.crd
>     @ i++
> end
>
> Fer
>
>
> On Tue, 03-11-2009 at 11:24 +0100, Silvia Carlotto wrote:
> > Thanks to all,
> >
> > I used the strip command in ptraj and obtained a 5 GB file. But now
> > the new problem is that VMD only opens files of up to 2 GB.
> >
> > How can I split my mdcrd file into 3 smaller files?
> >
> > I can't understand the command in AmberTools.
> >
> > thanks to all
> >
> > 2009/11/3 Ross Walker <ross_at_rosswalker.co.uk>
> >
> > > Hi Silvia,
> > >
> > > This sounds normal to me. Currently ptraj goes through and completely
> > > checks the mdcrd file before it processes it. If you figure that your
> > > system can do maybe 10 MB per second sustained, then just to run through
> > > a 60 GB file without doing any processing will be 1.7 hours. Try just
> > > creating a copy of the file and this will give you an idea of the speed
> > > of your system; it will probably take > 30 mins just to copy. Then figure
> > > at least twice this for ptraj to process it for a basic strip command.
> > >
> > > Options are to 1) Get yourself a faster system, probably with multiple
> > > striped RAID disks that will allow you to approach something like
> > > 120 MB a second.
> > >
> > > 2) rerun your simulation and do not save to the mdcrd file as often.
> > >
> > > 3) break up your mdcrd file into chunks or use ptraj to reduce the
> > > number of frames. This of course will still require you to run through
> > > it once.
> > >
> > > AmberTools 1.3 will be released soon and this includes several
> > > improvements for ptraj when working with large files, such as removing
> > > the initial check etc. It can also do parallel I/O so you can try to take
> > > advantage of parallel file systems such as GPFS or Lustre. In short
> > > though, if you are running this on your desktop, rather than a
> > > well-built supercomputer connected by fiber to a true (and well designed)
> > > parallel file system, then you are unfortunately up against the hardware
> > > limitations of your system.
> > >
> > > If you are using this on some NFS server that you are connecting to
> > > remotely then I would just take a week's vacation while you wait for it
> > > to process.
> > >
> > > Sorry I can't help much more but unfortunately disk speeds have been
> > > largely flat for the last 8 years while capacity has ballooned.
> > >
> > > All the best
> > > Ross
> > >
> > > > -----Original Message-----
> > > > From: amber-bounces_at_ambermd.org [mailto:amber-bounces_at_ambermd.org] On
> > > > Behalf Of Silvia Carlotto
> > > > Sent: Monday, November 02, 2009 9:42 AM
> > > > To: amber_at_ambermd.org
> > > > Subject: [AMBER] 60 giga output file .mdcrd file
> > > >
> > > > Dear users,
> > > >
> > > > I generated a .crd file of 60 GB (a protein in a water box, 10 ns,
> > > > total number of atoms ca. 8000).
> > > >
> > > > I am using ptraj to strip the water, but the ptraj command has been
> > > > sitting for over 30 minutes at
> > > >
> > > > PTRAJ : trajin XXX.mdcrd
> > > >
> > > > checking coordinates : XXX.mdcrd
> > > >
> > > > Is this normal?
> > > >
> > > > Is it possible to manipulate a file of this size with ptraj?
> > > >
> > > > I have no other ideas for stripping the water to generate a movie with
> > > > VMD.
> > > >
> > > > Thanks for your help
> > > > _______________________________________________
> > > > AMBER mailing list
> > > > AMBER_at_ambermd.org
> > > > http://lists.ambermd.org/mailman/listinfo/amber
> > >
> > >
> > >
> >
> >
> >
> --
> Fernando Martín García.
> Centro de Biología Molecular "Severo Ochoa".
> C/ Nicolás Cabrera, 1.
> Campus UAM. Cantoblanco, 28049 Madrid. Spain.
>
>
>

-- 
---------------------------------------
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Graduate Student
352-392-4032