Acta Veterinaria et Zootechnica Sinica ›› 2025, Vol. 56 ›› Issue (1): 213-221. doi: 10.11843/j.issn.0366-6964.2025.01.020

• Biotechnology and Reproduction •

Research on Genomic Selection of Reproductive Traits in Landrace Pigs Based on Chip Data

YANG Wenpan1,2, LIU Xiangjie1,2, LUO Dongxiang3, CHEN Menghui1, XIE Ying1, FANG Yuexin1, LIN Tingyan1, LI Aimin1, LI Wenjing1, DENG Zheng1,2, DING Nengshui1,2,4,*

  1. Key Laboratory of Swine Breeding for the South China of Ministry of Agriculture and Rural Affairs, Aonong Group, Zhangzhou 363000, China
  2. Fujian Aoxin Biotechnology Group Co., Ltd., Zhangzhou 363000, China
  3. Shanggao County Modern Agricultural Technology Service Center, Yichun 336400, China
  4. State Key Laboratory of Pig Genetic Improvement and Production Technology, Jiangxi Agricultural University, Nanchang 330045, China
  • Received: 2024-05-31 Online: 2025-01-23 Published: 2025-01-18
  • Contact: DING Nengshui E-mail: 945226087@qq.com; 13631698@qq.com
  • About the author: YANG Wenpan (1994-), male, from Xiaogan, Hubei; master's degree; mainly engaged in research on animal genetics, breeding and reproduction. E-mail: 945226087@qq.com
  • Funding:
    Fujian Province Seed Industry Enterprise Excellence Cultivation Project (2120814-Agricultural Production Development Expenditure)

Abstract:

This study compared the prediction accuracy and computational efficiency of different genomic prediction models in order to assess the application value and prospects of support vector machine (SVM) regression and RandomForest regression in genomic prediction. Using the Boruidi (博瑞迪) porcine 50K liquid-phase chip, reproductive traits of 1 001 Landrace pigs were evaluated with GBLUP, BayesB, BayesLASSO, SVM regression and RandomForest regression. Within SVM regression, the radial basis function kernel gave the highest prediction accuracy for total litter size, number born alive and litter weight; accuracy peaked at C = 1 for number born alive and litter weight, and at C = 2 for total litter size. When the RandomForest parameters ntree, mtry and nodesize were varied for the same three traits, prediction accuracy showed a degree of randomness as the parameters changed. Across models, RandomForest regression had the highest accuracy for total litter size, number born alive and litter weight, followed by SVM regression; GBLUP, BayesB and BayesLASSO had lower accuracies that were similar to one another. The cross-validation predictions of the different models were strongly correlated, with correlations ranging from 0.806 to 0.995. Non-parametric machine learning models such as SVM regression and RandomForest regression therefore have certain advantages for genomic selection of pig reproductive traits, although their running time limits their use to some extent. As the algorithms are further optimized, these models should have good application prospects.
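
For readers less familiar with the SVM terms in the abstract, the standard textbook definitions are sketched below. These are the general forms of the radial basis function kernel and the ε-insensitive SVM regression objective, not formulas taken from the paper; the symbols γ, ε, w, b, ξ and φ are notation assumed here for illustration.

```latex
% Radial basis function (RBF) kernel between the marker vectors of animals i and j
K(\mathbf{x}_i, \mathbf{x}_j) = \exp\!\left(-\gamma \,\lVert \mathbf{x}_i - \mathbf{x}_j \rVert^{2}\right), \qquad \gamma > 0

% Epsilon-insensitive SVM regression; C weights the penalty on training errors
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi},\, \boldsymbol{\xi}^{*}} \;
\frac{1}{2}\lVert \mathbf{w} \rVert^{2} + C \sum_{i=1}^{n} \left(\xi_i + \xi_i^{*}\right)
\quad \text{s.t.} \quad
\begin{cases}
y_i - \mathbf{w}^{\top}\phi(\mathbf{x}_i) - b \le \varepsilon + \xi_i, \\
\mathbf{w}^{\top}\phi(\mathbf{x}_i) + b - y_i \le \varepsilon + \xi_i^{*}, \\
\xi_i,\ \xi_i^{*} \ge 0 .
\end{cases}
```

Here x_i is the vector of marker genotypes of animal i and y_i is its phenotype. A larger C penalizes training errors more heavily relative to model smoothness, so the best value of C can differ from trait to trait.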

Key words: Landrace pig, reproductive traits, gene chip, genomic selection, machine learning
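
The abstract reports a prediction accuracy for each model and correlations between models under cross-validation, but it does not state the number of folds or the accuracy metric. As a minimal sketch, assuming 5-fold cross-validation and accuracy measured as the Pearson correlation between out-of-fold predictions and observed phenotypes, the comparison could be organized as follows. The data are simulated, and `fit_marker_score` and `fit_knn` are hypothetical stand-in models, not the GBLUP, Bayesian, SVM or RandomForest implementations used in the study.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def kfold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(fit, X, y, k=5, seed=0):
    """Return one out-of-fold prediction per animal (k = 5 is an assumption)."""
    preds = [None] * len(y)
    for fold in kfold_indices(len(y), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        for i in fold:
            preds[i] = predict(X[i])
    return preds


def fit_marker_score(X, y):
    """Hypothetical parametric stand-in: sum of single-marker regression effects."""
    n, m = len(X), len(X[0])
    my = sum(y) / n
    means = [sum(row[j] for row in X) / n for j in range(m)]
    slopes = []
    for j in range(m):
        var = sum((row[j] - means[j]) ** 2 for row in X)
        cov = sum((row[j] - means[j]) * (yi - my) for row, yi in zip(X, y))
        slopes.append(cov / var if var else 0.0)
    return lambda row: my + sum(b * (g - mu) for b, g, mu in zip(slopes, row, means))


def fit_knn(X, y, k=5):
    """Hypothetical non-parametric stand-in: mean phenotype of the k nearest animals."""
    def predict(row):
        dists = sorted(
            (sum(abs(a - b) for a, b in zip(row, other)), yi)
            for other, yi in zip(X, y)
        )
        nearest = dists[:k]
        return sum(yi for _, yi in nearest) / len(nearest)
    return predict


def simulate(n=120, m=30, seed=1):
    """Simulated 0/1/2 genotypes and phenotypes; not data from the study."""
    rng = random.Random(seed)
    effects = [rng.gauss(0, 1) for _ in range(m)]
    X = [[rng.choice((0, 1, 2)) for _ in range(m)] for _ in range(n)]
    y = [sum(e * g for e, g in zip(effects, row)) + rng.gauss(0, 2) for row in X]
    return X, y


X, y = simulate()
models = {"marker_score": fit_marker_score, "knn": fit_knn}
preds = {name: cross_validate(fit, X, y) for name, fit in models.items()}

# Accuracy per model: correlation of out-of-fold predictions with phenotypes.
accuracy = {name: pearson(p, y) for name, p in preds.items()}
# Agreement between models: correlation of their out-of-fold predictions.
between_models = pearson(preds["marker_score"], preds["knn"])

print(accuracy)
print(between_models)
```

Because every model is scored on the same folds, the per-model accuracies and the between-model correlation are directly comparable; swapping in other models only requires a function with the same `fit(X, y)` signature that returns a predictor.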

CLC Number: