Acta Veterinaria et Zootechnica Sinica ›› 2024, Vol. 55 ›› Issue (10): 4325-4333. doi: 10.11843/j.issn.0366-6964.2024.10.008

• Genetics and Breeding •

Accuracy Analysis of Genotype Imputation from 10K to 50K SNP Loci in Layers

Junfeng WU1,2, Yiyuan YAN3, Ning YANG1,2, Congjiao SUN1,2, Guangqi LI3, Bin WANG3, Guiqin WU3, Ling LIAN1,2,*

  1. College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
  2. Frontier Science Center for Molecular Design Breeding, China Agricultural University, Beijing 100193, China
  3. Beijing Huadu Yukou Poultry Industry Co., Ltd., Beijing 101206, China
  • Received: 2024-03-06 Online: 2024-10-23 Published: 2024-11-04
  • Contact: Ling LIAN E-mail: wjf960428@163.com; lianlinglara@126.com
  • About the author: Junfeng WU (1996-), male, from Xinyang, Henan; PhD student, mainly engaged in poultry genetics and breeding research. E-mail: wjf960428@163.com
  • Funding:
    China Agriculture Research System for the layer industry (CARS-40); National Key Research and Development Program of China (2021YFD1300600); National Natural Science Foundation of China (32272865)

Abstract:

This study assessed how accurately high-density genotypes can be obtained from low-density SNP chip data through genotype imputation. Genotypes from a 10K SNP panel were imputed to 50K, and the imputed genotypes were compared with the true 50K genotypes for concordance. A total of 4 435 healthy pure-line brown laying hens from a maternal line were genotyped with the layer 50K SNP array "凤芯壹号". Individuals were randomly drawn from this population to form the test (imputation target) population and the reference population. In the test population, 10K SNPs evenly sampled from the 50K data were kept as known genotypes, and the remaining 40K loci were left to be imputed. Using the reference population data, Beagle 4.0 imputed the test population from 10K to 50K, and imputation accuracy was evaluated as the concordance between imputed and true genotypes. Three factors were examined for their effect on imputation accuracy: 1) whether pedigree information was used (test population of 100 individuals, reference population of 1 000); 2) kinship between the reference and test populations (test population of 100, reference population of 1 000); and 3) reference population size (test population of 100; reference populations of 500, 1 000, 2 000 and 3 000). In this population, using pedigree information did not change imputation concordance (0.973 vs. 0.973). Concordance varied with the kinship between populations: when 1 000 individuals from the 18th generation served as the reference population for imputing 100 individuals from the 19th generation, concordance was 0.972; when the 1 000 reference individuals were drawn evenly from the 16th, 17th and 18th generations, concordance fell to 0.968. Concordance rose with reference population size, reaching 0.959, 0.973, 0.980 and 0.984 for reference populations of 500, 1 000, 2 000 and 3 000 individuals, respectively. These results indicate that imputing layer SNP array genotypes from 10K to 50K is feasible and can be applied at scale in genomic selection breeding to reduce genotyping costs.

Key words: laying hens, SNP array, genotype imputation, molecular breeding
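
The evaluation described in the abstract (keep an evenly spaced subset of markers as known, impute the rest, then score agreement with the true genotypes) can be illustrated with a minimal Python sketch. It does not reproduce the study's pipeline and does not call Beagle; the function names, the 0/1/2 genotype coding, and the choice to score concordance only at the masked loci are assumptions made for illustration.

```python
from typing import List, Optional, Sequence

# Assumed coding: 0, 1, 2 = copies of the alternate allele; None = missing call.
Genotype = Optional[int]


def evenly_spaced_indices(n_total: int, n_keep: int) -> List[int]:
    """Return n_keep marker indices spread evenly over range(n_total).

    Hypothetical helper standing in for "evenly sampling 10K SNPs from 50K".
    """
    if not 0 < n_keep <= n_total:
        raise ValueError("need 0 < n_keep <= n_total")
    return [(i * n_total) // n_keep for i in range(n_keep)]


def concordance(
    true_geno: Sequence[Sequence[Genotype]],
    imputed_geno: Sequence[Sequence[Genotype]],
    masked: Sequence[int],
) -> float:
    """Fraction of masked genotypes where the imputed call equals the true call.

    Rows are individuals, columns are markers. Positions whose true genotype
    is missing are skipped, since there is nothing to compare against.
    """
    matches = 0
    compared = 0
    for true_row, imputed_row in zip(true_geno, imputed_geno):
        for j in masked:
            if true_row[j] is None:
                continue
            compared += 1
            if true_row[j] == imputed_row[j]:
                matches += 1
    if compared == 0:
        raise ValueError("no genotypes available for comparison")
    return matches / compared


if __name__ == "__main__":
    # Toy data: 2 individuals, 10 markers; keep 2 markers, mask the other 8.
    n_markers = 10
    kept = set(evenly_spaced_indices(n_markers, 2))
    masked = [j for j in range(n_markers) if j not in kept]

    true = [
        [0, 1, 2, 0, 1, 2, 0, 1, 2, 0],
        [2, 2, 1, 1, 0, 0, 1, 1, 2, 2],
    ]
    # Stand-in for imputation output: a copy of the truth with one wrong call.
    imputed = [row[:] for row in true]
    imputed[0][1] = 0

    print(f"concordance at masked loci: {concordance(true, imputed, masked):.4f}")
```

In a real analysis the `imputed` matrix would come from the imputation software's output for the test population, and the same function could be applied separately for each reference population size or kinship scenario.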
