Pangenome and pantranscriptome as the new reference for gene family characterisation – a case study of basic helix-loop-helix (bHLH) genes in barley
Journal article   Open access   Peer reviewed

Cen Tong, Yong Jia, Haifei Hu, Zhanghui Zeng, Brett Chapman and Chengdao Li
Plant Communications, Vol. 6(1), 101190
2024
CC BY-NC-ND V4.0 Open Access

Abstract

Keywords: Barley pangenome; Basic helix-loop-helix (bHLH); Core and dispensable genes; Genome-wide gene family evolution; Orthologous gene group; Pantranscriptome
Genome-wide identification and comparative gene family analyses are commonly performed to investigate species-specific evolution linked to various traits and molecular pathways. However, most previous studies were limited to gene screening in a single reference genome and therefore failed to account for gene presence/absence variations (gPAVs) within a species. Here, we propose an innovative pangenome-based approach to gene family analysis built on orthologous gene groups (OGGs). Using the basic helix-loop-helix (bHLH) transcription factor family in barley as an example, we identified 161 to 176 bHLHs per genome across 20 barley genomes, which could be classified into 201 OGGs. These 201 OGGs were further classified into 140 core, 12 soft-core, 29 shell, and 20 line-specific/cloud bHLHs, revealing a complete profile of bHLHs in barley. Using a genome-scan approach, we overcame genome annotation bias and identified, on average, 1.5 unannotated core bHLHs per barley genome. We found that all core bHLHs belong to whole-genome/segmental duplicates, whereas dispensable bHLHs were more likely to result from small-scale duplication events. Interestingly, dispensable bHLHs tended to be enriched in specific subfamilies (SF13, SF27, and SF28), implying a potentially biased expansion of specific bHLHs in barley. We found that 50% of the bHLHs contain at least one intact transposable element (TE) within the 2 kb upstream-to-downstream region. bHLHs with copy number variation (CNV) have 1.48 TEs on average, significantly higher than the 1.36 for core bHLHs without CNV, supporting a potential role for TEs in bHLH expansion. Selection pressure analyses showed that dispensable bHLHs had experienced clearly relaxed selection compared with core bHLHs, consistent with their conservation patterns. We further integrated the pangenome with recently available barley pantranscriptome data from 5 tissues and discovered apparent transcriptional divergence within and across bHLH subfamilies.
We conclude that pangenome-based gene family analyses can better describe the genuine evolutionary status of bHLHs, which had not been captured before, and provide novel insights into bHLH evolution in barley. We expect this study will inspire similar analyses in many other gene families and species.

Short Summary: Traditional gene family analyses were limited and biased toward gene content in a single reference genome, which failed to account for the gene presence/absence variations (gPAVs) in the population. Here, we describe an innovative pangenome-based approach to gene family characterization that takes gPAVs into consideration, identifies all bHLH genes present in the barley pangenome, and better describes the genuine evolutionary status of a gene family in a target species. We are also the first to integrate pangenome gene family analyses with pantranscriptome data. This study provides novel insights into bHLH evolution and will inspire similar analyses in many other gene families and species.
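
The partition of OGGs into core, soft-core, shell, and line-specific/cloud groups described in the abstract can be illustrated with a minimal sketch. The record does not state the cut-offs the authors used, so the thresholds below are assumptions chosen for illustration only: present in all genomes is "core", present in at least 90% but not all is "soft-core", present in exactly one is "cloud", and everything else is "shell". The function names, OGG ids, and genome names are likewise hypothetical.

```python
from collections import Counter


def classify_ogg(n_present, n_genomes, soft_core_min_fraction=0.9):
    """Assign a pangenome category from the number of genomes carrying an OGG.

    The cut-offs are illustrative assumptions, not the study's definitions.
    """
    if not 0 < n_present <= n_genomes:
        raise ValueError("n_present must be between 1 and n_genomes")
    if n_present == n_genomes:
        return "core"        # present in every genome
    if n_present == 1:
        return "cloud"       # line-specific: found in a single genome
    if n_present >= soft_core_min_fraction * n_genomes:
        return "soft-core"   # missing from only a few genomes
    return "shell"           # intermediate presence frequency


def summarise(presence, n_genomes):
    """Count categories for a mapping of OGG id -> set of genome names."""
    return Counter(
        classify_ogg(len(carriers), n_genomes) for carriers in presence.values()
    )


# Toy presence/absence data with hypothetical OGG ids and genome names.
genomes = [f"genome_{i}" for i in range(1, 21)]
presence = {
    "OGG_a": set(genomes),        # in all 20 genomes
    "OGG_b": set(genomes[:19]),   # in 19 of 20 genomes
    "OGG_c": set(genomes[:5]),    # in 5 of 20 genomes
    "OGG_d": {genomes[0]},        # in 1 genome only
}
counts = summarise(presence, n_genomes=len(genomes))
print(dict(counts))
```

Applied to a real presence/absence table, a tally of this kind would yield a breakdown such as the 140 core, 12 soft-core, 29 shell, and 20 line-specific/cloud groups reported above, which together account for all 201 OGGs. The exact numbers depend on the thresholds actually used in the study.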

Details

UN Sustainable Development Goals (SDGs)

This output has contributed to the advancement of the following goals:

#12 Responsible Consumption & Production
