[3dem] Resources for Supporting the Extended PDB ID Format (pdb_00001abc)

Justin Flatt justin at rcsb.rutgers.edu
Tue Jan 9 11:11:54 PST 2024


wwPDB anticipates that all four-character PDB accession codes (PDB 
IDs) will be consumed by 2029.

With the continued growth of the PDB archive, wwPDB has revised the PDB 
accession code format by extending its length and prepending "pdb_" 
(e.g., "1abc" will become "pdb_00001abc"). The new format will enable 
text-mining detection of PDB entries in the published literature and 
allow for more informative and transparent delivery of revised data files.
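
As an illustration of the format change described above, here is a minimal 
Python sketch that converts a four-character ID to the extended form and 
scans free text for extended IDs. The function names and the regular 
expressions are assumptions made for this example (in particular, that an 
extended ID is "pdb_" followed by eight alphanumeric characters and that a 
legacy ID starts with a digit); they are not taken from wwPDB software.

```python
import re

# Assumed pattern: "pdb_" followed by eight alphanumeric characters.
EXTENDED_ID = re.compile(r"\bpdb_[0-9a-z]{8}\b", re.IGNORECASE)
# Assumed pattern: a digit followed by three alphanumeric characters.
LEGACY_ID = re.compile(r"[0-9][0-9a-z]{3}", re.IGNORECASE)


def to_extended(pdb_id: str) -> str:
    """Convert a four-character ID such as '1abc' to 'pdb_00001abc'."""
    pdb_id = pdb_id.strip().lower()
    if EXTENDED_ID.fullmatch(pdb_id):
        return pdb_id  # already in the extended form
    if not LEGACY_ID.fullmatch(pdb_id):
        raise ValueError(f"not a recognizable PDB ID: {pdb_id!r}")
    # Left-pad the legacy ID with zeros to eight characters, then add the prefix.
    return "pdb_" + pdb_id.zfill(8)


def find_extended_ids(text: str) -> list[str]:
    """Return every extended PDB ID found in free text, lower-cased."""
    return [m.group(0).lower() for m in EXTENDED_ID.finditer(text)]
```

Because the prefix makes the identifier unambiguous, a simple pattern like 
this is enough for text mining, which is not the case for bare 
four-character codes.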

Once four-character PDB IDs are consumed, entries with extended PDB IDs 
(12 characters) will not be compatible with the legacy PDB file format. 
wwPDB encourages scientific journals, the PDB community, and users to 
transition to the PDBx/mmCIF format and the extended PDB ID format as 
soon as possible.

Resources are available to help PDB users with this transition through 
the wwPDB resource portal page (Extended PDB ID With 12 Characters) [1]. 
This page links to useful resources for handling this change, including 
an FAQ on PDB ID extension [2], materials to learn more about the 
PDBx/mmCIF format, and links to other PDBx/mmCIF resources and software 
tools. As 
the transition phase progresses, more training resources will be added 
to this page.

Additionally, a PDB "beta" archive will be provided during the 
transition phase in 2026. The directory structure of this "beta" archive 
will mirror the data organization of the PDB Versioned Archive [3], in 
the form 
https://files-beta.org/pub/pdb/data/entries/_two-letter-hash_/_pdb_accession_code_/_entry_data_File_names_ 
The two-letter hash will be taken from the third-to-last and 
second-to-last characters of the accession code (zero-based positions 
n-3 and n-2 of an n-character ID). For example, PDB entry pdb_12345678 
will be under /67/. This maintains consistency with the current PDB 
archive, where, e.g., PDB entry 1abc is under /ab/.
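
The hash rule above can be sketched in a few lines of Python. This is only 
an illustration: the function names are hypothetical, the base URL is 
copied from the link given in this message, and lower-casing the ID in the 
path is an assumption.

```python
# Base URL as written in this announcement; treat it as an assumption.
BETA_BASE = "https://files-beta.org/pub/pdb/data/entries"


def two_letter_hash(pdb_id: str) -> str:
    """Return the third-to-last and second-to-last characters of the ID."""
    pdb_id = pdb_id.strip().lower()
    if len(pdb_id) < 4:
        raise ValueError(f"ID too short to hash: {pdb_id!r}")
    return pdb_id[-3:-1]


def entry_directory(pdb_id: str, base: str = BETA_BASE) -> str:
    """Build the assumed directory URL <base>/<hash>/<id>/ for an entry."""
    pdb_id = pdb_id.strip().lower()
    return f"{base}/{two_letter_hash(pdb_id)}/{pdb_id}/"


print(two_letter_hash("pdb_12345678"))  # 67
print(two_letter_hash("1abc"))          # ab
print(entry_directory("pdb_00001abc"))
```

The same slice works for both four-character and extended IDs, which is 
why an entry keeps its hash directory when its ID is extended (1abc and 
pdb_00001abc both map to "ab").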

Once all four-character PDB accession codes are consumed, this PDB 
"beta" archive will become the main PDB archive and the current PDB 
archive will be removed.

Download example files containing extended PDB IDs for software adoption 
from GitHub [4].

wwPDB recently announced that three-character PDB Chemical Component IDs 
have been consumed [5]. Five-character alphanumeric accession codes for 
Chemical Component Dictionary (CCD) IDs are now issued by the OneDep system.
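
Software that assumes a ligand code is at most three characters long will 
need a similar update. The check below is a hypothetical sketch, assuming 
that legacy CCD IDs are one to three alphanumeric characters, that new 
ones are exactly five, and that four-character codes are not issued; 
verify these assumptions against the wwPDB announcement [5] before relying 
on them.

```python
import re

# Assumption: 1-3 alphanumeric characters (legacy) or exactly 5 (new).
CCD_ID = re.compile(r"[0-9A-Z]{1,3}|[0-9A-Z]{5}")


def is_plausible_ccd_id(code: str) -> bool:
    """Return True if the code has the shape of a legacy or five-character CCD ID."""
    return CCD_ID.fullmatch(code.strip().upper()) is not None
```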

For any further information, please contact us at info at wwpdb.org.

  [Image: Sample extended PDB ID]

Links:
------
[1] http://www.wwpdb.org/documentation/new-format-for-pdb-ids
[2] http://www.wwpdb.org/documentation/pdb-id-extension-faq
[3] http://files-versioned.wwpdb.org/
[4] https://github.com/wwPDB/extended-wwPDB-identifier-examples
[5] http://www.wwpdb.org/news/news?year=2023#656f4404d78e004e766a96c6