Acta Veterinaria et Zootechnica Sinica ›› 2023, Vol. 54 ›› Issue (8): 3299-3312. doi: 10.11843/j.issn.0366-6964.2023.08.016

• Genetics and Breeding •

Establishment and Application of Prediction Model of Three Amino Acids in Milk Based on Mid-infrared Spectroscopy

CHU Chu1, ZHANG Jingjing1, DING Lei1, FAN Yikai1, BAO Xiangnan2, XIANG Shixin1, LIU Rui1, LUO Xuelu1, REN Xiaoli1, LI Chunfang1, LIU Wenju1, WANG Liang1, LIU Li1, LI Yongqing1, JIANG Han1, LI Weiqi3, SUN Wei2, LI Xihe2, WEN Wan3, ZHOU Jiamin3, ZHANG Shujun1*

    1. Laboratory of Animal Genetics, Breeding and Reproduction of Ministry of Education, College of Animal Science and Technology/College of Animal Medicine, Huazhong Agricultural University, Wuhan 430070, China;
    2. Inner Mongolia National Center of Technology Innovation for Dairy Industry, Hohhot 011517, China;
    3. Ningxia Hui Autonomous Region Animal Husbandry Workstation, Yinchuan 750000, China
  • Received: 2022-11-25 Online: 2023-08-23 Published: 2023-08-22
  • Corresponding author: ZHANG Shujun, mainly engaged in research on molecular genetic breeding for disease resistance in animals, milk MIR fingerprints and dairy cow biomarkers, E-mail: sjxiaozhang@mail.hzau.edu.cn
  • About the authors: CHU Chu (1999-), female, from Zaozhuang, Shandong, master's degree, mainly engaged in research on animal genetics, breeding and reproduction, E-mail: 1346409454@qq.com; ZHANG Jingjing (1996-), female, from Yantai, Shandong, master's degree, mainly engaged in research on animal genetics, breeding and reproduction, E-mail: 1462210902@qq.com.
  • Funding:
    National Key Research and Development Program of China, Intergovernmental International Science and Technology Innovation Cooperation (2021YFE0115500); Project of the National Center of Technology Innovation for Dairy (2022-科研攻关-3); Hubei Province International Cooperation Project (2022EHB043)

Abstract: This study aimed to establish a rapid, high-throughput method for determining free arginine, histidine and isoleucine in milk by mid-infrared (MIR) spectroscopy, and to validate it extensively on external data. A total of 217 milk samples from healthy Chinese Holstein cows in 4 provinces across North, Central and Northwest China were used. Quantitative MIR prediction models for the free arginine, histidine and isoleucine contents of milk were built using 4 spectral preprocessing algorithms (Savitzky-Golay (SG) smoothing, differencing, multiplicative scatter correction and standard normal variate), 4 feature selection algorithms (known informative regions, competitive adaptive reweighted sampling (CARS), genetic algorithm and least angle regression) and 2 modeling algorithms (partial least squares regression (PLSR) and ridge regression). The optimal models were then applied to the MIR spectra of 32 559 milk samples collected from 4 690 cows on 9 other dairy farms to examine the effects of lactation stage, farm, parity and season on the MIR-predicted arginine, histidine and isoleucine contents.
The results showed that: 1) For arginine, the best model combined CARS feature selection, no spectral preprocessing and PLSR (RP² = 0.58, RMSEp = 6.89 nmol·mL⁻¹). For histidine, the best model combined CARS feature selection, SG smoothing (window length 11, 2nd-order polynomial) and PLSR (RP² = 0.56, RMSEp = 0.88 nmol·mL⁻¹). For isoleucine, the best model combined 274 characteristic wave points, SG smoothing (window length 29, 3rd-order polynomial) and PLSR (RP² = 0.49, RMSEp = 1.75 nmol·mL⁻¹). 2) Prediction accuracy decreased when the optimal models were validated externally across regions. 3) When the models were applied to the large-scale spectral database of Province E (which did not contribute to model development) to predict the free arginine, histidine and isoleucine contents of milk, lactation stage, farm and season had highly significant effects on all three contents (P<0.001), whereas parity had no significant effect on arginine content but highly significant effects on histidine and isoleucine contents (P<0.001). In conclusion, predicting free amino acid contents in milk by MIR is feasible; in particular, the models have some predictive ability for analyzing trends in milk amino acid content. The models still need to be optimized with more representative samples to improve their accuracy and generalizability.

Key words: mid-infrared spectroscopy (MIR), milk amino acid, prediction model, milk, machine learning
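
As a rough illustration of two ingredients named in the abstract, the sketch below shows SG smoothing with the settings reported for the histidine model (window length 11, 2nd-order polynomial) and the two statistics used to report prediction accuracy (prediction-set R² and RMSEp). It is not the authors' code. The weights are the standard tabulated 11-point quadratic SG coefficients. Leaving the 5 edge points on each side unsmoothed is an assumption, and so is defining R² as 1 − SS_res/SS_tot, since the abstract does not state which definition was used. All function names and numbers are hypothetical.

```python
import math

# Standard tabulated 11-point quadratic Savitzky-Golay smoothing weights.
SG_11_2 = [-36, 9, 44, 69, 84, 89, 84, 69, 44, 9, -36]
SG_NORM = 429  # sum of the weights


def sg_smooth(signal):
    """Smooth interior points with an 11-point, 2nd-order SG filter.

    Assumption: the first and last 5 points are returned unchanged,
    because the source does not describe any edge handling.
    """
    half = len(SG_11_2) // 2
    out = list(signal)
    for i in range(half, len(signal) - half):
        window = signal[i - half:i + half + 1]
        out[i] = sum(w * x for w, x in zip(SG_11_2, window)) / SG_NORM
    return out


def rmse(y_true, y_pred):
    """Root mean square error, in the units of y (e.g. nmol/mL)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2(y_true, y_pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up numbers for illustration only; not data from the study.
spectrum = [0.01 * k * k - 0.3 * k + 2.0 for k in range(21)]
smoothed = sg_smooth(spectrum)
measured = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 11.0, 15.0, 15.0]
print(rmse(measured, predicted), r2(measured, predicted))
```

A 2nd-order SG filter reproduces any quadratic signal exactly, so in this toy case `smoothed` matches `spectrum`. On real spectra it suppresses high-frequency noise while largely preserving band shape.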
