Keywords: DNA sequence grammar; genomic prediction; genomic language model; molecular marker
Genomic prediction based on molecular markers has substantially advanced genomic selection; however, prediction accuracy often plateaus despite continued increases in marker density and methodological refinement. This saturation limits the effective use of available genomic information. The emergence of genomic language models (GLMs) offers a new framework for incorporating richer sequence-based information into genomic prediction, potentially capturing biologically meaningful DNA sequence grammar that is poorly represented by traditional marker-based approaches. We conclude that the future of genomic prediction will be shaped not primarily by algorithmic refinement but by the biological expressivity of genomic representations, and that GLMs offer a principled path toward expanding this representational frontier.
Title: Genomic language model-based genomic prediction in plant breeding