[3dem] Re: creating models from PDB files (Ozan Oktem)

Ozan Öktem ozan at oktem.se
Fri Jul 29 11:57:13 PDT 2011


On 28 Jul 2011, at 21:00, 3dem-request at ncmir.ucsd.edu wrote:
> ----------------------------------------------------------------------
> 
> Message: 1
> Date: Wed, 27 Jul 2011 15:47:51 -0400
> From: David Gene Morgan <dagmorga at indiana.edu>
> Subject: [3dem] creating models from PDB files
> To: 3dem at ncmir.ucsd.edu
> Message-ID: <4E306B67.3070404 at indiana.edu>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> 
> Hi,
> 
> 	Has anyone done any sort of critical evaluation of the different 
> programs that will read PDB files and generate volumes?  I know that 
> there are lots of ways to do this, and that different programs appear to 
> do fairly different things.  But does anyone have good reasons for 
> believing that one program is better than another?
> 
> -- 
>                  David Gene Morgan
>                   cryoEM Facility
>                   320C Simon Hall
>            Indiana University Bloomington
>                812 856 1457 (office)
>                812 856 3221 (EM lab)
>             http://bio.indiana.edu/~cryo

We in Stockholm have studied this problem in the context of TEM simulation, and below is an account of what we came up with. Since this matter is rarely discussed in the EM literature, I thought our experiences could be of interest to the community. More of this discussion can be found in our recent paper [11], which also provides links to TEM simulation software (incl. source code) that, among other things, can be used for creating volumetric models from PDB files in a physically correct manner.

=====
The problem of properly translating a PDB structure into a volume is highly dependent on what one seeks to do with the volume. In my reply I will assume that the volume will be used to represent the scattering properties of the molecules in a specimen when imaged by a TEM. This is relevant if one e.g. seeks to use the volume as input for a TEM simulator or for docking an X-ray structure (given by a PDB file) into an EM-structure obtained from experimental data. 

The basic scientific problem is thus to generate a representation of the scattering properties of the specimen given some specification. Since my focus here is on using the volume in a context relevant for TEM imaging, the way the representation is generated depends on the model one assumes for the electron-specimen interaction. We will here assume that the electron-specimen interaction is modelled by the scalar Schrödinger equation, thus treating the specimen classically and the incident electron quantum mechanically. Within this context, the scattering properties of the specimen are fully defined by a complex-valued function. The real part of this function is proportional to the electrostatic potential and is responsible for the phase contrast in TEM images. The imaginary part is proportional to the absorption potential and is responsible for the amplitude contrast in TEM images.

Type of specifications of a specimen
----------------------------------------------
In the context of TEM imaging of molecules embedded in some background medium, there are two ways of specifying a specimen:

Atomic model: A specification of all atoms (position and type of their nuclei) in the specimen incl. other relevant information, such as Debye–Waller factors (thermal vibration). This information is typically contained in a PDB-file.

Average composition: A specification of the average composition of the involved molecules together with other information, such as pH value and ionic strength, related to the bulk matter. This is typically the way the background is specified.
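As a rough illustration of what the atomic specification looks like in practice, here is a minimal Python sketch that reads element, position and temperature factor from the fixed-column ATOM/HETATM records of a PDB file. The function name parse_pdb_atoms is my own choice for this example; the sketch ignores alternate locations, occupancies and multiple models, and it assumes that the element columns (77-78) are filled in.

```python
def parse_pdb_atoms(lines):
    """Return a list of (element, x, y, z, b_factor) tuples.

    Coordinates are in angstrom, b_factor is the temperature
    (Debye-Waller) factor in angstrom^2. Only ATOM and HETATM
    records are read; everything else is skipped.
    """
    atoms = []
    for line in lines:
        if line[:6] not in ("ATOM  ", "HETATM"):
            continue
        x = float(line[30:38])           # columns 31-38
        y = float(line[38:46])           # columns 39-46
        z = float(line[46:54])           # columns 47-54
        b_text = line[60:66].strip()     # columns 61-66
        b_factor = float(b_text) if b_text else 0.0
        # Columns 77-78; empty string if the file omits the element.
        element = line[76:78].strip().capitalize()
        atoms.append((element, x, y, z, b_factor))
    return atoms
```

Typical use would be parse_pdb_atoms(open(path)) for some PDB file at path.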

Challenges
---------------
Now, there are basically two challenges associated with generating an appropriate volume given a specification of a specimen. 
1. Generate the electrostatic and absorption potentials independently using the same specimen specification.
2. Specimen specification is often mixed, involving both atomic model and average composition. One needs to properly generate and fuse portions of the electrostatic and absorption potentials obtained from atomic specification with portions obtained from a specification of average composition.

Generating the volume from a PDB file
-------------------------------------------------
Let us consider the original question by David, namely how to generate the electrostatic potential from an atomic specification contained in a PDB file. This is essentially the problem of calculating the electrostatic force fields. Due to its wide applicability, a plethora of methods and approaches have been developed for this purpose. From a computational viewpoint there are essentially two types of approach, the "direct" and the "indirect" approach. In the former, the electrostatic potential is modeled directly from the atomic model provided by the PDB-file, whereas in the latter one first models the electron density function given the atomic model provided by the PDB-file, and then obtains the electrostatic potential by solving the Poisson--Boltzmann equation [7]. To properly understand the relation between the direct and the indirect approach, and to determine the appropriateness of either approach, it is necessary to consider the various origins of the electrostatic potential and when they are important to account for. There are essentially three different contributions to the scattering potential:

a) The distribution of charge within each atom. Since the positive charge on the nucleus is very concentrated, while the negative charge on the electrons is more spread out, there is a positive potential within each atom. This potential drops off very rapidly with increasing distance from the atom.

b) Net charges of the atoms in a molecule. Due to the polarization of covalent bonds, the atoms in a molecule generally have a small net charge. The potential due to these charges drops off less rapidly with increasing distance from the atom.

c) Ions in the solvent surrounding the molecule. The solvent contains equal amounts of positively and negatively charged ions, whose electrostatic potentials usually cancel each other. However, in regions where there is, say, a positive potential from a molecule, negative ions will be attracted and positive ions repelled, so there is a surplus of negative ions. The effect of the ions in the solvent is therefore to partially cancel the long-range potential of a molecule.
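To make the difference in range between (b) and (c) concrete, here is a small Python sketch comparing the bare Coulomb potential of a net charge with its Debye-Hückel screened form, i.e. the solution of the linearized Poisson--Boltzmann equation for a point charge in a uniform dielectric. This is a textbook simplification and not the method of [11]. The relative permittivity of 80 and the formula 3.04/sqrt(I) for the Debye length (water near 25 C, monovalent salt, I in mol/l) are assumptions of the sketch, and the function names are my own.

```python
import math

# Approximate value of e / (4*pi*eps0) in volt * angstrom.
COULOMB_V_A = 14.3996


def debye_length_angstrom(ionic_strength_molar):
    """Approximate Debye length (angstrom) in water near 25 C."""
    return 3.04 / math.sqrt(ionic_strength_molar)


def coulomb_potential(q, r, eps_r=80.0):
    """Unscreened potential (volt) of q elementary charges at r angstrom."""
    return COULOMB_V_A * q / (eps_r * r)


def screened_potential(q, r, debye_length, eps_r=80.0):
    """Debye-Hueckel potential: Coulomb term damped by exp(-r / debye_length)."""
    return coulomb_potential(q, r, eps_r) * math.exp(-r / debye_length)


if __name__ == "__main__":
    lam = debye_length_angstrom(0.1)  # 0.1 mol/l is an assumed ionic strength
    for r in (2.0, 5.0, 10.0, 20.0, 40.0):
        print(r, coulomb_potential(0.2, r), screened_potential(0.2, r, lam))
```

The printed columns show the screened potential falling well below the 1/r tail once r exceeds the Debye length, which is the partial cancellation described in (c).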

Inside a molecule, the dominating contribution to the potential is from (a). This is the most important aspect of the potential to include in a simulation of electron-specimen interaction since the electrons in a TEM mainly scatter against the nuclei in the specimen. In fact, on average about 95% of the potential relevant for electron scattering comes from (a), as stated in [1]. Computationally, (a) is most conveniently and accurately handled by the direct approach. Direct approaches for calculation of the scattering potential are quite popular within materials science for modeling crystalline specimens, see sections 2.4--2.7 and 3.1 in [1], as well as [2] and [3]. The starting point is to consider the nuclei as classical objects that are modeled as point-like particles at specific positions. This stems from the Born-Oppenheimer approximation, see p. 54 in [1], which then allows us to model the specimen as in equation (1.19) on p. 7 in [1]. The resulting model is however impractical since it requires that one specifies the position and type of all nuclei and all electrons in the specimen. A useful approximation is given in equation (1.20) on p. 8 in [1]. The contribution from each nucleus can be modeled separately once the singularity at the centre of the nucleus is regularized (truncated potential), see e.g. p. 385 in [4]. One can now parametrize the electrostatic potential and fit it to experimental scattering data (scattering factors). The contribution from the shell electrons can either be handled phenomenologically by multiplying the contribution from the nuclei by a screening function, as in pp. 1464--1468 in [5] and section 3.3.3 in [6], or one can attempt to model the potential contribution from the shell electrons [2]. See also the nice overview provided in chapter 38 in [3]. The absorption potential (the imaginary part) is expressible in terms of the elastic and inelastic cross sections, which are also tabulated for each type of atom.
The details of this part are worked out in [11].
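For readers who want to see the shape of a direct approach, here is a minimal Python sketch of the electrostatic part only. It assumes the common parametrization of the electron scattering factor of each atom as a sum of Gaussians, f(s) = sum_i a_i exp(-b_i s^2) with s = sin(theta)/lambda, whose Fourier transform gives a sum of Gaussians in real space, with the Debye-Waller factor B added to each b_i. The coefficients in PLACEHOLDER_COEFFS are hypothetical values chosen only so that the code runs; real values must be taken from tabulated scattering factors. The sketch samples the potential at grid points, truncates each atom at a cutoff radius, and does not compute the absorption potential, so it should not be read as the implementation described in [11].

```python
import math

# Approximate value of h^2 / (2*pi*m0*e) in volt * angstrom^2.
PREFACTOR_V_A2 = 47.878

# HYPOTHETICAL placeholder coefficients (a_i in angstrom, b_i in angstrom^2).
# They are NOT tabulated data; replace them with real scattering factors.
PLACEHOLDER_COEFFS = {
    "C": [(1.0, 10.0), (1.5, 1.0)],
    "N": [(0.9, 9.0), (1.3, 0.9)],
    "O": [(0.8, 8.0), (1.2, 0.8)],
}


def atom_potential(r2, coeffs, b_factor=0.0):
    """Potential (volt) at squared distance r2 (angstrom^2) from one atom."""
    total = 0.0
    for a, b in coeffs:
        width = b + b_factor  # thermal vibration broadens each Gaussian
        total += (a * (4.0 * math.pi / width) ** 1.5
                  * math.exp(-4.0 * math.pi ** 2 * r2 / width))
    return PREFACTOR_V_A2 * total


def potential_on_grid(atoms, n, spacing, cutoff=6.0, table=PLACEHOLDER_COEFFS):
    """Sum atomic potentials on an n x n x n grid with origin at (0, 0, 0).

    atoms is a list of (element, x, y, z, b_factor) in angstrom units.
    Returns a nested list grid[i][j][k] of potential values in volt.
    """
    grid = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    reach = int(math.ceil(cutoff / spacing))
    for element, x, y, z, b_factor in atoms:
        coeffs = table[element]
        ci, cj, ck = (int(round(c / spacing)) for c in (x, y, z))
        # Visit only the grid points in a box around the atom.
        for i in range(max(0, ci - reach), min(n, ci + reach + 1)):
            for j in range(max(0, cj - reach), min(n, cj + reach + 1)):
                for k in range(max(0, ck - reach), min(n, ck + reach + 1)):
                    r2 = ((i * spacing - x) ** 2 + (j * spacing - y) ** 2
                          + (k * spacing - z) ** 2)
                    if r2 <= cutoff ** 2:
                        grid[i][j][k] += atom_potential(r2, coeffs, b_factor)
    return grid
```

Pure Python loops are slow for anything but tiny volumes. The point here is only the structure: one positive, rapidly decaying contribution per atom, summed over the specimen.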

A remark is in order here. Consider the case where the volume is to be used as input for simulations of molecular interactions, say in-silico protein interactions by means of molecular dynamics. Then one must primarily account for (b) and (c), whereas (a) can be safely neglected. The reason is that the dominating contributions to the potential outside the molecule are from (b) and (c), whereas the contribution from (a) drops off very fast. For modeling (b), the net charges of the atoms are needed. These are specified (along with atomic coordinates) in PQR files, and must therefore be properly computed, e.g. by programs such as PDB2PQR [8]. Poisson--Boltzmann solvers, such as APBS [9], that only take into account (b) and (c) may be useful for simulation of molecular interactions, but are completely unsatisfactory for TEM simulations.
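As an illustration of what a PQR file adds to the atomic coordinates, here is a minimal Python sketch that reads the per-atom charge and radius. It assumes the whitespace-delimited PQR layout in which the last five fields of each ATOM/HETATM record are x, y, z, charge and radius; since the chain ID is optional, the numeric fields are read from the end of the line. The function names are my own.

```python
def parse_pqr_atoms(lines):
    """Return a list of dicts with name, x, y, z, charge and radius.

    Coordinates and radius are in angstrom, charge in elementary charges.
    """
    atoms = []
    for line in lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        fields = line.split()
        # The chain ID may be missing, so count the fields from the end.
        x, y, z, charge, radius = (float(v) for v in fields[-5:])
        atoms.append({"name": fields[2], "x": x, "y": y, "z": z,
                      "charge": charge, "radius": radius})
    return atoms


def net_charge(atoms):
    """Total charge of the parsed atoms, in elementary charges."""
    return sum(atom["charge"] for atom in atoms)
```

The charges read this way are what enters contribution (b); they carry no information about (a).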

On a final note, one possible approach to obtain a general method including (a), (b) and (c) given above is to include (a) in the Poisson--Boltzmann equation. A straightforward approach is to tabulate the electron density of each atom and then compute the potential by solving the Poisson--Boltzmann equation. However, this would be much more computationally expensive and not necessarily more accurate, since it would require a discretization on a very fine mesh in order to resolve the variation of the potential within each atom. The discretization of the Poisson--Boltzmann equation is already a tricky numerical issue, as clearly discussed in [8]. For example, it is inappropriate to use finite differences for solving Poisson--Boltzmann; adaptive finite element methods are to be preferred [10]. In this context it is probably much better to compute the contribution from (a) by a direct method and add it to the solution representing (b) and (c). Contributions from (b) could probably be handled by either the direct or the indirect method, whereas the contribution from (c) can only be determined by solving an equation involving both the electrostatic potential and the concentration of ions, and therefore inherently falls in the category of indirect methods.

Generating the volume from an average composition
-------------------------------------------------------------------
Large parts of a biological specimen are only given by means of an average description, e.g. the buffer in an in-vitro specimen. For these parts one has to generate the volume by means of an averaging procedure based on an atomic model for a small portion. The details are worked out in [11].

Comments about software for generating volumes
--------------------------------------------------------------------
A problem is that there is no method for checking how well the volume represents the scattering properties of a specimen, so one cannot directly validate a method for volume generation. The most common approach is to compare simulated and experimental TEM images, so one validates the software for generating volumes together with the TEM simulator.

If the goal is visualization only, then one probably does not have to generate the volume properly from a physics viewpoint. On the other hand, if the volume is to be used for TEM simulation or for docking, one needs to be more careful. Most of the software we looked at does not properly generate the electrostatic and absorption potentials from a PDB file. The absorption potential is either left out or generated as a multiple of the electrostatic potential. Furthermore, the approaches used to generate the electrostatic potential are often purely phenomenological, leading to a situation where one must tune nuisance parameters. Xmipp has an approach based on [1] that is similar to ours in [11]. A quick look at the source code for Xmipp, however, reveals that it does not generate the absorption potential and that it also has some minor mistakes in the way it generates the electrostatic potential.

References
----------------
[1] Peng, L.-M., Dudarev, S. L. and Whelan, M. J. High-Energy Electron Diffraction and Microscopy, Oxford University Press, 2004.

[2] Le Bris, C. and Lions, P.-L. From Atoms to Crystals: A Mathematical Journey, Bulletin of the American Mathematical Society, 43:3, 291-363, 2005.

[3] Young, D. C. Computational Chemistry: A Practical Guide for Applying Techniques to Real-World Problems, Wiley-Interscience, 2001.

[4] Reinhardt, G. Quantum Electrodynamics, 3rd ed, Springer Verlag, 2003.

[5] Hawkes, P. W. and Kasper, E. Principles of Electron Optics - Volume 3. Wave Optics. Academic Press, 1994.

[6] Fultz, B. and Howe, J. M. Transmission Electron Microscopy and Diffractometry of Materials, Springer Verlag, 2002.

[7] Baker, N. A. Poisson-Boltzmann Methods for Biomolecular Electrostatics, Methods in Enzymology, 383, 94-118, 2004.

[8] Dolinsky, T. J., Czodrowski, P., Li, H., Nielsen, J. E., Jensen, J. H., Klebe, G. and Baker, N. A. PDB2PQR: expanding and upgrading automated preparation of biomolecular structures for molecular simulations, Nucleic Acids Research, 35, W522--525, 2007.

[9] Baker, N. A., Sept, D., Joseph, S., Holst, M. J. and McCammon, J. A. Electrostatics of nanosystems: application to microtubules and the ribosome, PNAS, 98, 10037--10041, 2001.

[10] Chen, L., Holst, M. J. and Xu, J. The Finite Element Approximation of the Nonlinear Poisson-Boltzmann Equation, SIAM Journal on Numerical Analysis, 45:6, 2298-2320, 2007.

[11] Rullgard, H., Ofverstedt, L.-G., Masich, S., Daneholt, B. and Oktem, O. Simulation of transmission electron microscope images of biological specimens, Journal of Microscopy, 2011.



--
Ozan Öktem
Granlidsvägen 20
SE-172 75 Sundbyberg, Sweden
Phone: +46-(0)8-29 58 25
Mobile: +46-(0)733-52 21 85
E-mail: ozan at oktem.se
Web: http://www.oktem.se
