[3dem] Distributing PDBx/mmCIF-Formatted Assembly Files

Justin Flatt justin at rcsb.rutgers.edu
Thu May 5 05:46:45 PDT 2022


Starting May 3, 2022, the PDB archive distributes assembly files in 
PDBx/mmCIF format, allowing direct access and visualization of the 
curated assemblies for all PDB entries 
(http://www.wwpdb.org/news/news?year=2022#61f7f8bc8f40f9265109d398).

Previously, PDBx/mmCIF-formatted assembly files were provided only for 
structures that were not PDB format compliant, and their coordinates used 
model numbers to differentiate symmetry copies of PDB chain IDs. This method is 
not ideal, nor necessary, for the current archive PDBx/mmCIF format and 
has led to limited use of these files in community software tools. In 
response to this issue and recommendations by the wwPDB advisory 
committee, we are implementing updated, standardized practices for 
generation of assembly files for all PDB entries.

These updated PDBx/mmCIF format assembly files have improved 
organization of assembly data to support usage by the community. These 
files will include all symmetry generated copies of each chain within a 
single model, with distinct chain IDs (_atom_site.auth_asym_id and 
_atom_site.label_asym_id) assigned to each. Generation of distinct chain 
IDs in assembly files is based upon the following rules:

# Chain IDs of the original chains from the atomic coordinate file will 
be retained (e.g., A)
# A unique chain ID (_atom_site.label_asym_id and 
_atom_site.auth_asym_id) is assigned to each symmetry copy within a 
single model. Rules for chain ID assignment:

  * The index of the applied symmetry operator
    (_pdbx_struct_oper_list.id) will be appended to the original chain
    ID, separated by a dash (e.g., A-2, A-3, etc.)
  * If more than one symmetry operator is applied to generate a
    symmetry copy, a dash will be used between the two operator
    indices (e.g., A-12-60, A-60-88, etc.)
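
The naming rules above can be sketched in a few lines of Python. This is 
only an illustration, not wwPDB code: the function names are hypothetical, 
and it assumes that the retained original chain is represented by an empty 
operator list and that original chain IDs contain no dash.

```python
def assembly_chain_id(original_id, operator_ids=()):
    """Build an assembly chain ID from a chain ID and operator indices.

    operator_ids holds the applied symmetry operator indices in order.
    An empty sequence is assumed here to mean the retained original chain.
    """
    if not operator_ids:
        return original_id
    return "-".join([original_id, *(str(op) for op in operator_ids)])


def split_assembly_chain_id(chain_id):
    """Split an assembly chain ID into (original chain ID, operator indices).

    Assumes the original chain ID itself contains no dash.
    """
    original_id, *operator_ids = chain_id.split("-")
    return original_id, operator_ids


print(assembly_chain_id("A"))               # original chain
print(assembly_chain_id("A", [2]))          # one operator applied
print(assembly_chain_id("A", [12, 60]))     # two operators applied
print(split_assembly_chain_id("A-60-88"))
```

Software that previously grouped symmetry copies by model number could use 
a split like this to recover the original chain ID from the new files.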

In addition, entity ID and chain ID mapping categories are provided: 
_pdbx_entity_remapping and _pdbx_chain_remapping.

A new directory (https://ftp.wwpdb.org/pub/pdb/data/assemblies/) was 
created for the distribution of these updated assembly files. The 
directory containing the existing assembly mmCIF files for large entries 
has been removed (https://ftp.wwpdb.org/pub/pdb/data/biounit/mmCIF/).

wwPDB asks all PDB users and software developers to review code and 
address any limitations related to PDB assemblies. Sample files were 
made available for testing purposes and to support community adoption at 
https://github.com/wwpdb/assembly-mmcif-examples.

If you plan to use these assembly files for graphical viewing, check if 
your visualization software (e.g., PyMOL or ChimeraX) supports 
instantiation of assemblies directly from atomic coordinate files 
(_struct_assembly related categories), for improved efficiency.

For any further information please email info at wwpdb.org.