Acta Veterinaria et Zootechnica Sinica ›› 2024, Vol. 55 ›› Issue (6): 2431-2440. doi: 10.11843/j.issn.0366-6964.2024.06.015
• Animal Genetics and Breeding •
Received: 2023-11-08
Online: 2024-06-23
Published: 2024-06-28
Contact: Zhiqiang DU
E-mail: 2021710855@yangtzeu.edu.cn; zhqdu@yangtzeu.edu.cn
Huaxuan WU, Zhiqiang DU. Methods of Genotype Feature Extraction Affecting the Prediction Accuracy of Genomic Selection[J]. Acta Veterinaria et Zootechnica Sinica, 2024, 55(6): 2431-2440.
Table 3
Different feature extraction methods and their average prediction accuracy
Dataset | Method | Mean PCC | PCC STD | Mean MSE | MSE STD |
Pekin duck, body length/cm | GWAS (P < 0.05) | 0.484 | 0.081 | 8.701 | 2.835 |
Pekin duck, body length/cm | PCA | 0.243 | 0.043 | 5.741 | 0.244 |
Pekin duck, body length/cm | Gene-PCA | 0.132 | 0.026 | 6.094 | 0.080 |
Pekin duck, body length/cm | LD | 0.277 | 0.052 | 5.803 | 0.343 |
Pekin duck, body length/cm | SNP-PCC | 0.290 | 0.058 | 5.916 | 0.556 |
Pekin duck, body length/cm | Random | 0.265 | 0.036 | 6.002 | 0.375 |
Pekin duck, body length/cm | Origin | 0.217 | 0.020 | 6.525 | 0.888 |
Pig, backfat thickness/mm | GWAS (P < 0.05) | 0.366 | 0.037 | 4.927 | 0.595 |
Pig, backfat thickness/mm | PCA | 0.341 | 0.034 | 4.102 | 0.105 |
Pig, backfat thickness/mm | Gene-PCA | 0.186 | 0.025 | 4.560 | 0.118 |
Pig, backfat thickness/mm | LD | 0.358 | 0.020 | 4.259 | 0.211 |
Pig, backfat thickness/mm | SNP-PCC | 0.367 | 0.023 | 4.164 | 0.181 |
Pig, backfat thickness/mm | Random | 0.338 | 0.021 | 4.286 | 0.158 |
Pig, backfat thickness/mm | Origin | 0.336 | 0.026 | 4.612 | 0.552 |
Pig, teat number (count) | GWAS (P < 0.05) | 0.240 | 0.042 | 1.247 | 0.105 |
Pig, teat number (count) | PCA | 0.313 | 0.030 | 1.024 | 0.029 |
Pig, teat number (count) | Gene-PCA | 0.123 | 0.011 | 1.148 | 0.023 |
Pig, teat number (count) | LD | 0.301 | 0.022 | 1.111 | 0.068 |
Pig, teat number (count) | SNP-PCC | 0.312 | 0.018 | 1.090 | 0.043 |
Pig, teat number (count) | Random | 0.299 | 0.020 | 1.088 | 0.053 |
Pig, teat number (count) | Origin | 0.280 | 0.025 | 1.207 | 0.158 |
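The column headings in Table 3 can be illustrated with a minimal sketch of how such summary statistics might be computed. It assumes that PCC denotes the Pearson correlation coefficient between observed and predicted phenotypes, that MSE is the mean squared error, and that each mean and standard deviation is taken over repeated train/test runs. The function names, the `runs` structure, and the use of the sample standard deviation (`ddof=1`) are illustrative assumptions, not details taken from the article.

```python
import numpy as np


def pearson_cc(y_true, y_pred):
    """Pearson correlation between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    yt = y_true - y_true.mean()  # centre the observed values
    yp = y_pred - y_pred.mean()  # centre the predicted values
    return float((yt @ yp) / np.sqrt((yt @ yt) * (yp @ yp)))


def mean_squared_error(y_true, y_pred):
    """Average squared difference between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


def summarize(runs):
    """Summarise accuracy over repeated runs.

    `runs` is a hypothetical list of (y_true, y_pred) pairs, one pair per
    repeat or cross-validation fold.
    """
    pcc = np.array([pearson_cc(t, p) for t, p in runs])
    mse = np.array([mean_squared_error(t, p) for t, p in runs])
    return {
        "mean_pcc": float(pcc.mean()),
        "pcc_std": float(pcc.std(ddof=1)),  # assumption: sample STD
        "mean_mse": float(mse.mean()),
        "mse_std": float(mse.std(ddof=1)),  # assumption: sample STD
    }
```

Calling `summarize` on the per-run predictions of one method and one dataset would yield one row of four values in the layout of Table 3. Whether the article used the sample or population standard deviation is not stated on this page.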
Table 4
Comparison of the computing time required by different feature extraction methods
Dataset | Method | Number of markers | Computing time |
Body length | GWAS (P < 0.05) | 4 282 | 2 s |
Body length | PCA | 359 | 1 s |
Body length | Gene-PCA | 542 | 1 s |
Body length | LD | 31 002 | 10 s |
Body length | SNP-PCC | 26 495 | 6 s |
Body length | Random | 3 993 | 2 s |
Body length | Origin | 39 932 | 15 s |
Backfat thickness, teat number | GWAS (P < 0.05) | 21 760 | 12 s |
Backfat thickness, teat number | PCA | 716 | 1 s |
Backfat thickness, teat number | Gene-PCA | 2 500 | 1.6 s |
Backfat thickness, teat number | LD | 33 101 | 18 s |
Backfat thickness, teat number | SNP-PCC | 20 731 | 11 s |
Backfat thickness, teat number | Random | 23 088 | 12 s |
Backfat thickness, teat number | Origin | 230 884 | 2 min 20 s |