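Determining the crystal structure of solid and liquid materials is important for understanding their mechanical, electromagnetic, and thermodynamic properties. Powder X-ray diffraction (XRD) is an important tool for materials characterization: a diffraction pattern encodes information about crystal symmetry, lattice parameters, and the types and packing of atoms at the nanoscale. However, current classification methods require extensive human intervention, because a classification rests on a combined assessment of all of this information. Many variables affect the shape of an XRD pattern, such as the phase or crystal lattice of the material, so it is difficult to characterize a material when no similar structure is already known.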
Fig. 1 Crystal system and space group distributions.
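In addition, small amounts of impurity phases in a sample make classification more difficult, more time-consuming, and less accurate. Recent advances in ultrafast synchrotron X-ray diffraction and spectroscopy measurements produce very large datasets from millions of measurements, far more than humans can analyze manually.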
Fig. 2 Diffraction pattern comparison.
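There is therefore an urgent need for adaptive, automated analysis of XRD data. The deep learning models developed so far perform very differently from one dataset to another, which indicates insufficient robustness. A more robust model is needed, one that can classify dynamic and/or unseen real XRD data from different materials.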
Fig. 3 Model architectures.
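A team led by Prof. Niaz Abdolrahim of the School of Mechanical Engineering at the University of Rochester has developed a generalized deep learning model for crystal system and space group classification. Because the relative peak intensities, distances, and ordering in XRD data characterize symmetry, the researchers examined whether permutation invariance and translation invariance are present, and proposed a no-pool convolutional neural network (NPCNN) that characterizes materials based on relative and local inferences between indexed peaks.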
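As a rough illustration of why pooling can be a poor fit for diffraction data, the sketch below compares a convolution with and without a max-pooling step. It is not the authors' NPCNN: the helper names `conv1d` and `max_pool1d`, the smoothing kernel, and the two toy patterns are assumptions chosen only to show the effect in plain Python.

```python
def conv1d(signal, kernel):
    """Valid 1D cross-correlation with stride 1 and no padding."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def max_pool1d(signal, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [
        max(signal[i:i + size])
        for i in range(0, len(signal) - size + 1, size)
    ]


# Two toy "diffraction patterns" with two peaks each.
# They differ only in the spacing between the peaks (5 bins vs 6 bins).
pattern_a = [0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
pattern_b = [0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]

kernel = [0.5, 1.0, 0.5]  # hypothetical smoothing filter

features_a = conv1d(pattern_a, kernel)
features_b = conv1d(pattern_b, kernel)

# Without pooling, the feature maps keep the difference in peak spacing.
print(features_a != features_b)  # True

# With a pooling window of 4 bins, both patterns collapse to [1.0, 1.0].
print(max_pool1d(features_a, 4), max_pool1d(features_b, 4))
```

In this toy case the pooled outputs are identical even though the peak spacing differs, which is the kind of information loss a no-pool design avoids. Real patterns and learned filters are far richer than this example.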
Fig. 4 RRUFF experimental performance.
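To achieve broad classification capability, the authors also developed a data generation pipeline for building high-quality datasets. The pipeline incorporates experimental effects on diffraction patterns and can simulate materials that undergo alloying and/or dynamic experiments. In the end, the deep learning model achieved state-of-the-art performance.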
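The idea of folding experimental effects into simulated training patterns can be sketched as follows. This is a simplified stand-in, not the authors' pipeline: the function names `simulate_pattern` and `augment`, the choice of effects (Gaussian peak broadening, a rigid 2-theta shift, uniform background noise), the parameter ranges, and the silicon-like peak list are all assumptions made for illustration.

```python
import math
import random


def simulate_pattern(peaks, grid, fwhm=0.2, shift=0.0, noise=0.0, rng=None):
    """Render (two_theta, intensity) peaks onto a 2-theta grid.

    fwhm  : Gaussian peak width in degrees (stand-in for broadening effects)
    shift : rigid 2-theta offset in degrees (stand-in for a zero-shift error)
    noise : amplitude of uniform random background noise
    """
    rng = rng or random.Random(0)
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    pattern = []
    for x in grid:
        y = sum(
            inten * math.exp(-0.5 * ((x - (pos + shift)) / sigma) ** 2)
            for pos, inten in peaks
        )
        pattern.append(y + noise * rng.random())
    top = max(pattern) or 1.0  # avoid dividing by zero for an empty pattern
    return [y / top for y in pattern]  # strongest point is scaled to 1


def augment(peaks, grid, n, seed=0):
    """Draw n randomized variants of the same underlying peak list."""
    rng = random.Random(seed)
    return [
        simulate_pattern(
            peaks,
            grid,
            fwhm=rng.uniform(0.1, 0.5),    # assumed range, degrees
            shift=rng.uniform(-0.2, 0.2),  # assumed range, degrees
            noise=rng.uniform(0.0, 2.0),   # assumed range, intensity units
            rng=rng,
        )
        for _ in range(n)
    ]


# Illustrative peak list (positions in degrees 2-theta, relative intensities).
peaks = [(28.4, 100.0), (47.3, 55.0), (56.1, 30.0)]
grid = [10 + 0.05 * k for k in range(1400)]  # roughly 10 to 80 degrees

variants = augment(peaks, grid, n=3, seed=1)
print(len(variants), len(variants[0]))
```

Each variant shares the same peak list but differs in width, offset, and noise level, so a classifier trained on many such variants sees a spread of plausible measurement conditions rather than one idealized pattern.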
Fig. 5 RRUFF and MP dataset confusion matrix.
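This study also offers a useful approach for developing models for other spectroscopic characterization techniques. The paper was recently published in npj Computational Materials v9: 214 (2023).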
Fig. 6 Materials Project performance.
Editorial Summary
XRD data: An automated deep learning classifier
Determining the crystal structure of solid and liquid materials is important for understanding their mechanical, electromagnetic, and thermodynamic properties. Powder X-ray diffraction (XRD) is an important means of material characterization, encoding information about crystal symmetry, lattice parameters, and the types and packing of atoms at the nanoscale.
Fig. 7 Lattice augmentation performance.
However, current classification methods require a great deal of human intervention, because a classification rests on a combined evaluation of all of this information. Many variables affect the shape of an XRD pattern, such as the phase or crystal lattice of the material, so it is difficult to characterize a material without a known similar structure. In addition, small amounts of impurity phases in the sample can make classification more difficult, more time-consuming, and less accurate.
Fig. 8 F1 score on RRUFF and MP datasets.
Recent advances in ultrafast synchrotron XRD and spectroscopy measurements have generated extremely large data sets from millions of measurements, far exceeding what humans can analyze manually. There is therefore an urgent need for adaptive, automated analysis of XRD data. The performance of currently developed deep learning models varies greatly across data sets, showing insufficient robustness. A more robust model is needed that can classify dynamic and/or unseen real XRD data obtained from different materials.
Fig. 9 Scatterplot of MP performance.
A group led by Prof. Niaz Abdolrahim from the School of Mechanical Engineering, University of Rochester, developed a generalized deep learning model for crystal system and space group classification. Because the relative peak intensity, distance, and order in XRD data indicate symmetry, the researchers investigated whether permutation invariance and translation invariance are present and, on that basis, proposed a no-pool convolutional neural network (NPCNN). Classification is accomplished by characterizing materials based on relative and local inferences between indexed peaks. To enable broad classification capabilities, the authors also developed a data generation pipeline for building high-quality data sets that incorporate experimental effects on diffraction patterns. The pipeline can also simulate materials that undergo alloying and/or dynamic experimentation. With this approach, the deep learning model achieved state-of-the-art performance. The study provides a valuable platform for developing models for other spectral characterization techniques. The article was recently published in npj Computational Materials v9: 214 (2023).
Fig. 10 Model architecture taxonomy.
Original abstract and its translation
Automated classification of big X-ray diffraction data using deep learning models
Jerardo E. Salgado, Samuel Lerman, Zhaotong Du, Chenliang Xu & Niaz Abdolrahim
Abstract: In current in situ X-ray diffraction (XRD) techniques, data generation surpasses human analytical capabilities, potentially leading to the loss of insights. Automated techniques require human intervention, and lack the performance and adaptability required for material exploration. Given the critical need for high-throughput automated XRD pattern analysis, we present a generalized deep learning model to classify a diverse set of materials’ crystal systems and space groups. In our approach, we generate training data with a holistic representation of patterns that emerge from varying experimental conditions and crystal properties. We also employ an expedited learning technique to refine our model’s expertise to experimental conditions. In addition, we optimize model architecture to elicit classification based on Bragg’s law and use evaluation data to interpret our model’s decision-making. We evaluate our models using experimental data, materials unseen in training, and altered cubic crystals, where we observe state-of-the-art performance and even greater advances in space group classification.
Summary:
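In current in situ X-ray diffraction (XRD) techniques, the capacity to generate data exceeds the human capacity to analyze it, which can lead to lost insights. Automated techniques still require human intervention and lack the performance and adaptability that materials research demands. Given the urgent need for high-throughput automated XRD pattern analysis, the authors propose a generalized deep learning model that classifies the crystal systems and space groups of diverse materials.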
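In this approach, training data are generated from a holistic representation of the patterns that arise under different experimental conditions and crystal properties. An expedited learning technique is also used to refine the model's expertise under experimental conditions. In addition, the model architecture is optimized to elicit classification based on Bragg's law, and evaluation data are used to interpret the model's decisions. Evaluated on experimental data, on materials unseen in training, and on altered cubic crystals, the models show state-of-the-art performance and an even larger advance in space group classification.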
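For readers who want a concrete picture of the Bragg's law relationship mentioned in the abstract (n·λ = 2·d·sin θ), the sketch below computes peak positions for a cubic lattice, where d = a / sqrt(h² + k² + l²). The function name `cubic_two_theta`, the Cu K-alpha wavelength default of 1.5406 Å, and the silicon-like lattice parameter are assumptions used only for illustration; none of this is taken from the paper's code.

```python
import math


def cubic_two_theta(a, hkl, wavelength=1.5406):
    """Return 2-theta in degrees for reflection hkl of a cubic lattice.

    a and wavelength are in angstroms; first-order diffraction (n = 1).
    """
    h, k, l = hkl
    d = a / math.sqrt(h * h + k * k + l * l)  # interplanar spacing
    s = wavelength / (2.0 * d)                # sin(theta) from Bragg's law
    if s > 1.0:
        raise ValueError("reflection is not reachable at this wavelength")
    return 2.0 * math.degrees(math.asin(s))


def sin2_ratio(a, hkl_1, hkl_2):
    """Ratio of sin^2(theta) for two reflections of the same cubic lattice."""
    t1 = math.radians(cubic_two_theta(a, hkl_1) / 2.0)
    t2 = math.radians(cubic_two_theta(a, hkl_2) / 2.0)
    return math.sin(t2) ** 2 / math.sin(t1) ** 2


# Changing the lattice parameter moves every peak...
for a in (5.431, 5.600):
    print(a, round(cubic_two_theta(a, (1, 1, 1)), 2))

# ...but the ratio of sin^2(theta) between reflections stays at 8/3,
# because it depends only on the Miller indices.
print(sin2_ratio(5.431, (1, 1, 1), (2, 2, 0)))
print(sin2_ratio(5.600, (1, 1, 1), (2, 2, 0)))
```

This is one way to see why relative peak positions carry symmetry information: stretching or compressing a cubic cell shifts the pattern, while the relationships between peaks are fixed by the indices of the allowed reflections.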