Acta Veterinaria et Zootechnica Sinica ›› 2025, Vol. 56 ›› Issue (10): 4963-4972. doi: 10.11843/j.issn.0366-6964.2025.10.017

• Genetics and Breeding •

Advantages of Using Population-specific Reference Genome for SNP Calling in Chinese Indicine Cattle

LI Aixin1, LI Ziyang1, CHEN Wenjie1, TIAN Yuyang1, LEI Chuzhao1, LI Zhigang2,*, CHEN Ningbo1,*

  1. College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China
  2. Pingdingshan Livestock Husbandry Development Center, Pingdingshan 467000, China
  • Received: 2025-03-06 Online: 2025-10-23 Published: 2025-11-01
  • Contact: LI Zhigang, CHEN Ningbo E-mail: lax2561957992@163.com; long0014@163.com; ningbochen@nwafu.edu.cn
  • About the author: LI Aixin (b. 2002), female, from Ankang, Shaanxi; master's student working mainly on cattle genetic resources. E-mail: lax2561957992@163.com
  • Supported by:
    National Key Research and Development Program of China (2024YFF1000102); National Natural Science Foundation of China (32341054)

Abstract:

This study aimed to systematically evaluate the advantages of using a population-specific reference genome of Chinese indicine cattle for identifying single nucleotide polymorphisms (SNPs). Whole-genome resequencing data from 60 Chinese indicine cattle were used to identify and compare SNPs against both the European taurine cattle reference genome (ARS-UCD1.2) and the population-specific reference genome of Chinese indicine cattle (Leiqiong cattle reference genome: ASM3988116v1). For the multiallelic SNPs detected with ARS-UCD1.2, chain files mapping coordinates between the two genomes were constructed and used to convert these SNPs into biallelic SNPs in ASM3988116v1 coordinates, followed by in-depth annotation. For the analysis of genetic variation in Chinese indicine cattle, the ASM3988116v1 population-specific reference genome showed significant advantages over ARS-UCD1.2: 1) it identified intronic and untranslated region variants more comprehensively and increased the detection sensitivity for low-frequency and rare variants; 2) it reduced false positives in variant identification caused by reference bias; 3) it allowed some multiallelic SNPs that had been filtered out owing to reference bias to be converted into biallelic SNPs. These SNPs were annotated to 8 352 genes, including important genes related to the growth, development, and environmental adaptability of indicine cattle, such as CTNNA1 (muscle development), SIL1 (immunity), VPS13A (blood circulation), and EYA3 (muscle development and photoperiod regulation). A population-specific reference genome for Chinese local cattle can increase variant detection sensitivity, reduce reference bias, and uncover more functionally significant loci, providing a high-confidence data foundation for population genetics research and precision livestock breeding, with important theoretical and practical implications.

Key words: population-specific reference genome, SNP, reference bias, biallelic/multiallelic SNPs, functional genes
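
The third advantage listed in the abstract, a multiallelic SNP becoming biallelic once the reference changes, can be illustrated with a small Python sketch. It is not the authors' pipeline: the function names, the example alleles, and the assumption that the site has already been converted to the new coordinates on the same strand are all hypothetical. The sketch only shows why a site at which the samples carry two alleles, neither of them the taurine reference allele, is reported with two alternate alleles against one reference and with a single alternate allele against the other.

```python
def alt_alleles(ref_allele, sample_alleles):
    """Return the sorted non-reference alleles observed in the samples."""
    return sorted(set(sample_alleles) - {ref_allele})


def classify_site(ref_allele, sample_alleles):
    """Classify a site by how many distinct non-reference alleles it shows."""
    n_alt = len(alt_alleles(ref_allele, sample_alleles))
    if n_alt == 0:
        return "monomorphic"
    return "biallelic" if n_alt == 1 else "multiallelic"


# Hypothetical alleles carried by the sampled animals at one site.
observed = ["G", "G", "T", "G", "T", "T"]

# Assumed reference alleles at the corresponding position (illustrative only).
taurine_ref = "A"    # allele in the taurine reference
indicine_ref = "G"   # allele in the population-specific reference

print(classify_site(taurine_ref, observed))   # multiallelic
print(classify_site(indicine_ref, observed))  # biallelic
```

In a real analysis the coordinate conversion, strand handling, and allele normalization would be done by dedicated liftover and VCF tools; the sketch deliberately leaves those steps out.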

CLC Number: