Acta Veterinaria et Zootechnica Sinica ›› 2025, Vol. 56 ›› Issue (8): 3761-3772. doi: 10.11843/j.issn.0366-6964.2025.08.018

• Genetics and Breeding •

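  • About the author: FU Wei (born 2000), female, from Cangzhou, Hebei; master's student, mainly engaged in research on animal genetics, breeding, and reproduction. E-mail: fw5820804@163.com
  • Supported by:
    Shijiazhuang Taihang Chicken Seed Industry Science and Technology Innovation Team Construction Project (232500462A); Hebei Modern Agricultural Industry Technology System, Layer Poultry Innovation Team, Breeding System and Variety Development Post (HBCT2024260204)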

Screening and Identification of Molecular Markers for Differentiating Taihang Chickens and Bashang Long-tailed Chickens

FU Wei1, ZHANG Ran1, DING Hong2, ZANG Sumin1, LI Xianglong3, CHU Suqiao4, LIU Huage2, ZHOU Rongyan1,*

  1. Hebei Agricultural University, Baoding 071000, China
  2. Institute of Animal Science and Veterinary Medicine of Hebei Province, Baoding 071000, China
  3. Hebei Normal University of Science and Technology, Qinhuangdao 066600, China
  4. Shijiazhuang Animal Husbandry Technology Promotion Station, Shijiazhuang 050000, China
  • Received: 2025-01-22; Online: 2025-08-23; Published: 2025-08-28
  • Contact: ZHOU Rongyan; E-mail: fw5820804@163.com; rongyanzhou@126.com

Abstract:

This study aimed to screen breed-characteristic SNP markers so that Taihang chickens and Bashang long-tailed chickens can be distinguished with a small number of SNPs. Whole-genome resequencing data from 61 Taihang and 56 Bashang long-tailed chickens were used to screen important SNPs through linkage disequilibrium and population differentiation index analyses. Six machine learning classification models were used to predict the breed of each individual, and their performance was evaluated. A random forest model was used to rank SNP importance, and the smallest SNP set was selected according to accuracy, recall, and AUC. The discriminating power of the selected set was then validated by principal component analysis, a phylogenetic tree, and a genomic relationship matrix. In total, 28 SNPs were identified that effectively discriminate between Taihang and Bashang long-tailed chickens. These SNPs were located mainly in intergenic and intronic regions and were significantly enriched in GO biological processes related to immunity, histone acetylation, and metabolism, as well as in the KEGG base excision repair pathway. Genes associated with immunity (BCL11B, AvBD13, KAT7), lipid metabolism (URI1, RREB1, ZBTB20), and adaptability (ZNF536) were also identified. Combining population genetics with machine learning thus yielded a marker combination that distinguishes Taihang from Bashang long-tailed chickens, providing a reference for building "molecular identity cards" for local breeds and for the conservation and identification of germplasm resources.
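
As an illustration of the population differentiation screening step, the sketch below ranks SNPs by a simplified Wright-style fixation index, F_ST = (H_T - H_S) / H_T, computed from allele frequencies in two populations. The abstract does not state which F_ST estimator or software was used, so the estimator, the function names (`allele_freq`, `wright_fst`, `rank_snps`), the 0/1/2 genotype coding, and the toy genotypes are all assumptions made for this example and are not data or code from the study.

```python
from typing import Dict, List, Sequence, Tuple


def allele_freq(genotypes: Sequence[int]) -> float:
    """Alternate-allele frequency from genotypes coded 0/1/2 (alt-allele copies)."""
    return sum(genotypes) / (2 * len(genotypes))


def wright_fst(pop_a: Sequence[int], pop_b: Sequence[int]) -> float:
    """Simplified two-population F_ST = (H_T - H_S) / H_T for one biallelic SNP.

    This is an illustrative estimator (an assumption), not necessarily the one
    used in the paper.
    """
    n_a, n_b = len(pop_a), len(pop_b)
    p_a, p_b = allele_freq(pop_a), allele_freq(pop_b)
    # Pooled allele frequency, weighted by sample size.
    p_bar = (n_a * p_a + n_b * p_b) / (n_a + n_b)
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity in the pooled sample
    if h_t == 0:
        return 0.0  # monomorphic SNP carries no differentiation signal
    # Sample-size-weighted mean of within-population expected heterozygosity.
    h_s = (n_a * 2 * p_a * (1 - p_a) + n_b * 2 * p_b * (1 - p_b)) / (n_a + n_b)
    return (h_t - h_s) / h_t


def rank_snps(
    geno_a: Dict[str, List[int]], geno_b: Dict[str, List[int]]
) -> List[Tuple[str, float]]:
    """Return (snp_id, F_ST) pairs sorted from most to least differentiated."""
    scores = {snp: wright_fst(geno_a[snp], geno_b[snp]) for snp in geno_a}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy genotypes (hypothetical, four birds per breed).
taihang = {"snp1": [0, 0, 0, 0], "snp2": [0, 1, 1, 2], "snp3": [2, 2, 1, 2]}
bashang = {"snp1": [2, 2, 2, 2], "snp2": [1, 0, 2, 1], "snp3": [0, 0, 1, 0]}

ranking = rank_snps(taihang, bashang)
for snp_id, fst in ranking:
    print(snp_id, fst)
```

In a workflow like the one described, highly ranked SNPs from such a filter would then be passed to the classification models, where a random forest importance ranking narrows them to the smallest set that keeps accuracy, recall, and AUC high.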

Key words: Taihang chicken, Bashang long-tailed chicken, molecular marker, machine learning
