Acta Veterinaria et Zootechnica Sinica, 2024, Vol. 55, Issue 7: 2775-2785. doi: 10.11843/j.issn.0366-6964.2024.07.001
Received: 2023-10-10
Online: 2024-07-23
Published: 2024-07-24
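Corresponding author: Ligang WANG
E-mail: w18439393365@163.com; wangligang01@caas.cn
About the author: Jinbu WANG (b. 2001), male, from Puyang, Henan Province; master's degree; mainly engaged in research on animal genetics and breeding. E-mail: w18439393365@163.com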
Jinbu WANG, Jia LI, Deming REN, Lixian WANG, Ligang WANG*
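Abstract:
The widespread application of genomic selection has greatly accelerated genetic progress in livestock and poultry. With the commercialization of livestock and poultry genotyping chips and the continuing decline in sequencing costs, increasingly rich genomic information has become available. New problems have followed: the number of genotyped markers far exceeds the number of individuals with phenotypic records, and the relationships within the genomic information are more complex. Both severely limit the use of traditional evaluation models such as best linear unbiased prediction (BLUP) and Bayesian methods. Machine learning algorithms do not depend on a predetermined model equation and handle non-linear relationships better, so they offer a solution to these problems and are gradually being applied to genomic selection. This paper reviews the development of genomic selection, explains the principles of several machine learning algorithms commonly used in genomic selection, and summarizes the current applications and implementations of machine learning in livestock and poultry genomic selection. Finally, it discusses the problems that machine learning faces in livestock and poultry breeding and the outlook for its development.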
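The "more markers than phenotyped animals" setting described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the data are simulated, and the sizes, marker effects, shrinkage value `lam`, and kernel `bandwidth` are all hypothetical. It assumes only NumPy. Ridge regression on markers (closely related to marker-based BLUP) is solved through an animal-by-animal kernel system, and a Gaussian kernel stands in for a non-linear machine learning alternative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only: many more markers than phenotyped animals.
n_train, n_test, n_markers = 45, 15, 600
genotypes = rng.integers(0, 3, size=(n_train + n_test, n_markers)).astype(float)

# Simulated phenotype: ten markers with additive effects plus noise (an assumption).
effects = np.zeros(n_markers)
effects[:10] = rng.normal(0.0, 1.0, size=10)
phenotype = genotypes @ effects + rng.normal(0.0, 1.0, size=n_train + n_test)

# Centre markers and phenotypes using the training animals only.
X_train = genotypes[:n_train] - genotypes[:n_train].mean(axis=0)
X_test = genotypes[n_train:] - genotypes[:n_train].mean(axis=0)
y_mean = phenotype[:n_train].mean()
y_train = phenotype[:n_train] - y_mean
y_test = phenotype[n_train:]


def linear_kernel(a, b):
    """Marker-based similarity between two sets of animals."""
    return a @ b.T


def gaussian_kernel(a, b, bandwidth):
    """Non-linear similarity based on squared Euclidean distance."""
    sq_dist = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq_dist, 0.0) / bandwidth)


def kernel_ridge_predict(k_train, k_test_train, y, lam):
    """Solve the n x n system (K + lam*I) alpha = y, then predict new animals."""
    alpha = np.linalg.solve(k_train + lam * np.eye(len(y)), y)
    return k_test_train @ alpha


lam = 100.0  # hypothetical shrinkage value; would normally be tuned
pred_linear = y_mean + kernel_ridge_predict(
    linear_kernel(X_train, X_train), linear_kernel(X_test, X_train), y_train, lam)

bandwidth = 2.0 * n_markers  # hypothetical bandwidth; would normally be tuned
pred_gauss = y_mean + kernel_ridge_predict(
    gaussian_kernel(X_train, X_train, bandwidth),
    gaussian_kernel(X_test, X_train, bandwidth), y_train, 1.0)

# Same ridge model solved in marker space (p x p system) for comparison.
marker_effects = np.linalg.solve(
    X_train.T @ X_train + lam * np.eye(n_markers), X_train.T @ y_train)
pred_primal = y_mean + X_test @ marker_effects

print("linear kernel r:", np.corrcoef(pred_linear, y_test)[0, 1])
print("Gaussian kernel r:", np.corrcoef(pred_gauss, y_test)[0, 1])
```

The linear-kernel predictions coincide with the marker-space ridge predictions, yet only a 45 x 45 system is solved instead of a 600 x 600 one. Swapping in the Gaussian kernel changes the similarity measure without changing the solver. Which kernel predicts better depends on the trait and the data, so the sketch makes no claim on that point.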
Cite this article: Jinbu WANG, Jia LI, Deming REN, Lixian WANG, Ligang WANG. Progress in the Application of Machine Learning in Livestock and Poultry Genomic Selection[J]. Acta Veterinaria et Zootechnica Sinica, 2024, 55(7): 2775-2785.