<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title></title>
</head>
<body>
<div name="messageBodySection">
<div dir="auto"><span style="color:black;background-color:white;font-family:Times New Roman, serif">Dear colleagues,</span><br />
<span style="color:black;background-color:white;font-family:Times New Roman, serif"> </span><br />
<span style="color:black;background-color:white;font-family:Times New Roman, serif">We are pleased to announce the launch of a new cryo-EM single-particle dataset for training AI-based particle-picking programs, available at </span><a style="background-color:white;font-family:Times New Roman, serif" href="https://github.com/BioinfoMachineLearning/cryoppp" target="_blank">https://github.com/BioinfoMachineLearning/cryoppp</a><span style="color:black;background-color:white;font-family:Times New Roman, serif">.</span><br />
<span style="color:black;background-color:white;font-family:Times New Roman, serif"> </span></div>
<div style="text-align:justify"><span style="color:black;background-color:white;font-family:Times New Roman, serif">Particle picking is a critical step in reconstructing protein structures by single-particle analysis, and machine learning is a promising approach to automating it. However, the development of machine learning methods is hindered by the lack of large, high-quality, manually labelled training data. To address this bottleneck, we created CryoPPP, a large, diverse, expert-curated cryo-EM image dataset for single protein particle picking and analysis. It consists of </span><span style="color:red;background-color:white;font-family:Times New Roman, serif">9,089 manually labelled cryo-EM micrographs from 32 non-redundant, representative protein datasets</span><span style="color:black;background-color:white;font-family:Times New Roman, serif"> selected from the Electron Microscopy Public Image Archive (EMPIAR). The coordinates of the protein particles in these diverse, high-resolution micrographs were labelled by human experts. The labelling process was rigorously validated by both 2D particle class validation and 3D density map validation with the gold standard. CryoPPP is several times larger than any other particle dataset in the field and is expected to greatly facilitate the training and testing of machine learning and artificial intelligence methods for automated cryo-EM protein particle picking.</span></div>
<div style="text-align:justify"><span style="color:black;background-color:white;font-family:Times New Roman, serif"> </span></div>
<div dir="auto"><span style="color:black;background-color:white;font-family:Times New Roman, serif">Please see the manuscript for details at </span><a style="background-color:white;font-family:Times New Roman, serif" href="https://www.biorxiv.org/content/10.1101/2023.02.21.529443v1" target="_blank">https://www.biorxiv.org/content/10.1101/2023.02.21.529443v1</a><span style="color:black;background-color:white;font-family:Times New Roman, serif"> and try the new dataset at </span><a style="background-color:white;font-family:Times New Roman, serif" href="https://github.com/BioinfoMachineLearning/cryoppp" target="_blank">https://github.com/BioinfoMachineLearning/cryoppp</a><span style="color:black;background-color:white;font-family:Times New Roman, serif">.</span></div>
<div style="text-align:justify"><span style="color:black;background-color:white;font-family:Times New Roman, serif"> </span></div>
<div dir="auto"><span style="color:black;background-color:white;font-family:Times New Roman, serif"> </span><br />
<span style="color:black;background-color:white;font-family:Times New Roman, serif">Best,</span><br />
<span style="color:black;background-color:white;font-family:Times New Roman, serif"> </span><br />
<span style="color:black;background-color:white;font-family:Times New Roman, serif">Liguo Wang and Jianlin Cheng</span><br /></div>
</div>
</body>
</html>