[3dem] PDBe introduces a compound-based browser for the PDB archive
Gerard DVD Kleywegt
gerard at xray.bmc.uu.se
Fri Mar 18 14:48:04 PDT 2011
As part of its recent winter update, the Protein Data Bank in Europe (PDBe;
http://pdbe.org) introduced a new, chemistry-based module of its PDB archive
browser (a.k.a. PDBeXplore). It can be accessed at:
As you may (or may not) know, the PDB browser is an interface that enables you
to retrieve and analyse information on subsets of structures in the PDB using
various biological or chemical classifications. Previously released modules
enable browsing of the archive based on the Enzyme Class (http://pdbe.org/ec),
CATH domains (http://pdbe.org/cath), Pfam families (http://pdbe.org/pfam) or
Fasta-based sequence-similarity searches (http://pdbe.org/fasta).
The new compound-based browser allows you to enter the name of a chemical
compound of interest and analyse all the PDB entries that contain that
compound. Once you start typing the name (or three-letter code, if you happen
to know it) of a compound, a drop-down menu will show you matching compound
names and you can select the compound of interest. For instance, if you are
interested in Sildenafil, just start typing the name and once you get to
"sild", the only remaining matching compound is:
(Note: the auto-complete function uses information about synonyms from the
wwPDB chemical component dictionary.)
Select this compound, click on the "Submit" button and the central panel of the
browser will soon be filled with a table of all PDB entries that contain this
compound (currently there are only five). The right-hand panel will contain
more information about the compound you have selected, including a chemical
diagram, formula and SMILES codes.
Note: if you don't know whether your compound occurs in the PDB or what its name is,
you can use the search options of PDBeChem - at http://pdbe.org/pdbechem -
including an option to draw a (sub)structure (to do this, click on the "edit"
button for the "Non-Stereo SMILES (Has Sub-Structure)" field in the PDBeChem
In order to demonstrate the powerful analysis options in the compound browser,
select a more abundant compound, e.g. ATP, and hit the "Submit" button again
(or click on this link: http://pdbe.org/compounds?ligand=ATP). The central
panel will show a list of the PDB entries containing the compound you selected.
The information here can be sorted by clicking on any of the column headings in
the table (clicking again reverses the sort order).
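Links like the ATP one above can also be put together in a script, e.g. if you
want to bookmark a handful of compounds. Here is a minimal Python sketch that
assumes nothing beyond the URL pattern shown above
(http://pdbe.org/compounds?ligand=ATP). The function name
compound_browser_url is made up for this illustration, and the sketch does not
check whether the given code actually exists in the PDB.

```python
from urllib.parse import urlencode

# Base address taken from the ATP example link in this message.
BASE_URL = "http://pdbe.org/compounds"


def compound_browser_url(code: str) -> str:
    """Return a compound-browser link for a compound code (hypothetical helper)."""
    code = code.strip().upper()  # codes are written in upper case, e.g. ATP
    if not code:
        raise ValueError("empty compound code")
    # urlencode takes care of escaping any unusual characters in the code.
    return BASE_URL + "?" + urlencode({"ligand": code})


print(compound_browser_url("atp"))  # http://pdbe.org/compounds?ligand=ATP
```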
You will notice a number of tabs at the top of the central panel - they are
labelled "PDB entries", "Ligands", etc. Selecting one of these tabs gives you a
new "perspective" on the selected set of PDB entries (in this case, all entries
containing ATP or whichever compound you selected):
* PDB entries: this is the default view that the browser will present once you
have selected a compound. To download the entire table as a text file, use the
link in the right-hand panel. If you move your mouse over the PDB code of an
entry, it will show a miniature image of the structure; clicking the link will
open the PDBe summary page for that entry. Clicking on the "view" link will
load the structure in an interactive viewer so that you can study it in detail.
* Ligands: this view displays a table of information about the additional
compounds found in all the PDB entries that contain your compound of interest.
The table is ordered such that the compounds that occur most often are at the
top. Each row in the table gives information about the three-letter code of the
compound, its chemical structure, chemical formula and systematic name. The
second column contains a link to information about the interaction statistics
of the compound with the standard amino-acid types. The link "Get PDB entries"
generates a list of all PDB entries containing both that compound and your
compound of interest.
* Structure folds: this view displays information about the fold families
(based on the CATH classification) encountered in the PDB entries containing
the selected compound. The tab also shows the distribution of CATH classes and
CATH architectures for the selected PDB entries as a pie chart. If you click on
a pie slice (or in the legend), only the appropriate CATH categories will be
shown in the table. By the way, the pie charts can also be printed or
* Assemblies: this view provides information about the possible quaternary
structure(s) of the selected PDB entries. A small table shows how many entries
are monomeric, homomeric and heteromeric, and two (clickable) pie charts show a
further breakdown of the homomeric and heteromeric structures respectively. The
main table in the tab shows the possible quaternary structure(s) for the
entries, together with (for non-monomeric structures) the accessible and buried
surface areas of the complex and the estimated free energy gain upon formation
of the complex.
* Sequence families: this view lists all Pfam families that are present in the
selected PDB entries.
* Organisms: the source organisms found in all selected PDB entries are shown
in a table. The clickable pie charts show the distribution of these organisms
based on superkingdom (bacteria, archaea, etc.) and genus (homo, rattus,
* Publications: this table contains details about the (primary) publications of
all the PDB entries with the selected compound.
* Authors: this tab lists the names of all the authors of the structures
containing the selected compound in the PDB, sorted by the number of those PDB
entries of which they are an author. This information is useful to biologists
and journal editors who wish to get in touch with, for instance,
crystallographers who have solved many structures containing a particular
The information presented by the browser is taken from the PDBe database, which
means that it is always up to date.
Using this browser, it is now child's play to dig up titbits such as:
- the compound that occurs most commonly in entries that also contain ATP is
- about 1 in 10 entries that contain NAD also contain FAD
- 95% of CATH domains occurring in entries with NAD are of the alpha-beta class
- there is only one hetero-hexameric assembly in all the entries that contain
NAD, namely http://pdbe.org/3ket
- Johan Weigelt has deposited more structures of NAD-containing proteins than
Note: currently, the statistics presented by the browser are based on all the
PDB entries that contain your compound of interest, i.e. not only the
macromolecules to which it is actually bound in those entries.
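To make the entry-level counting concrete, here is a small Python sketch of how
a co-occurrence figure such as "1 in 10 entries with NAD also contain FAD"
could be tallied. It is only an illustration of the idea, not the code used by
the browser: the function co_occurrence, the entry identifiers and the toy
data are all invented, and a compound is counted once per entry regardless of
which macromolecule it is bound to.

```python
from collections import Counter


def co_occurrence(entries, target):
    """Count compounds that share an entry with `target`.

    entries: dict mapping an entry identifier to the set of compound codes
             found anywhere in that entry (hypothetical input format).
    Returns (number of entries containing target, Counter of other compounds).
    """
    selected = [codes for codes in entries.values() if target in codes]
    counts = Counter(
        code for codes in selected for code in codes if code != target
    )
    return len(selected), counts


# Invented toy data, for illustration only; not real PDB entries.
toy_entries = {
    "entry1": {"NAD", "FAD"},
    "entry2": {"NAD"},
    "entry3": {"NAD", "MG"},
    "entry4": {"ATP", "MG"},
}

n_selected, counts = co_occurrence(toy_entries, "NAD")
for code, n in counts.most_common():
    # Fraction of the selected entries that also contain this compound.
    print(code, n, f"{n / n_selected:.0%}")
```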
By the way, all the previously released browser modules have been updated
recently to include clickable pie charts and retrieve results much faster than
We welcome your comments, bug reports and feature requests on the compound
browser (and the other browser modules). Please use the feedback button at the
top of any PDBe web page.
Gerard J. Kleywegt, PDBe, EMBL-EBI, Hinxton, UK
gerard at ebi.ac.uk ..................... pdbe.org
Secretary: Pauline Haslam pdbe_admin at ebi.ac.uk