VMGP: A unified variational auto-encoder based multi-task model for multi-phenotype, multi-environment, and cross-population genomic selection in plants

Bibliographic Details
Main Authors: Xiangyu Zhao, Fuzhen Sun, Jinlong Li, Dongfeng Zhang, Qiusi Zhang, Zhongqiang Liu, Changwei Tan, Hongxiang Ma, Kaiyi Wang
Format: Article
Language: English
Published: KeAi Communications Co., Ltd. 2025-12-01
Series: Artificial Intelligence in Agriculture
Online Access: http://www.sciencedirect.com/science/article/pii/S2589721725000704
Description
Summary: Plant breeding stands as a cornerstone for agricultural productivity and the safeguarding of food security. The advent of Genomic Selection heralds a new epoch in breeding, characterized by its capacity to harness whole-genome variation for genomic prediction. This approach transcends the need for prior knowledge of genes associated with specific traits. Nonetheless, the vast dimensionality of genomic data juxtaposed with the relatively limited number of phenotypic samples often leads to the “curse of dimensionality”, where traditional statistical, machine learning, and deep learning methods are prone to overfitting and suboptimal predictive performance. To surmount this challenge, we introduce a unified Variational auto-encoder based Multi-task Genomic Prediction model (VMGP) that integrates self-supervised genomic compression and reconstruction with multiple prediction tasks. This approach provides a robust solution, offering a formidable predictive framework that has been rigorously validated across public datasets for wheat, rice, and maize. Our model demonstrates exceptional capabilities in multi-phenotype and multi-environment genomic prediction, successfully navigating the complexities of cross-population genomic selection and underscoring its unique strengths and utility. Furthermore, by integrating VMGP with model interpretability, we can effectively triage relevant single nucleotide polymorphisms, thereby enhancing prediction performance and proposing potential cost-effective genotyping solutions. The VMGP framework, with its simplicity, stable predictive prowess, and open-source code, is exceptionally well-suited for broad dissemination within plant breeding programs. It is particularly advantageous for breeders who prioritize phenotype prediction yet may not possess extensive knowledge in deep learning or proficiency in parameter tuning.
ISSN: 2589-7217
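
Illustrative note: the summary describes a model that combines self-supervised genomic compression and reconstruction with multiple prediction tasks, but this record does not give the objective function. The following is a minimal sketch of how such a combined objective is commonly written for a variational auto-encoder with prediction heads; it is not taken from the VMGP paper or its code. The function names, the `beta` and `task_weight` parameters, and the use of mean squared error for both reconstruction and phenotype prediction are all assumptions made for illustration.

```python
import math


def kl_divergence(mu, log_var):
    """KL divergence from N(mu, exp(log_var)) to N(0, 1), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))


def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def multitask_vae_loss(x, x_recon, mu, log_var, y_pred, y_true,
                       beta=1.0, task_weight=1.0):
    """Hypothetical combined objective: reconstruction + KL + phenotype prediction.

    x, x_recon     : genotype vector and its reconstruction (assumed numeric coding)
    mu, log_var    : parameters of the approximate posterior over the latent code
    y_pred, y_true : predicted and observed values, one entry per phenotype task
    beta, task_weight : assumed weighting hyperparameters, not from the source
    """
    reconstruction = mse(x_recon, x)   # self-supervised compression/reconstruction term
    kl = kl_divergence(mu, log_var)    # regularizes the latent space toward N(0, 1)
    prediction = mse(y_pred, y_true)   # averaged over the phenotype tasks
    return reconstruction + beta * kl + task_weight * prediction


# Toy example: perfect reconstruction, a standard-normal posterior,
# and two phenotype tasks with small prediction errors.
loss = multitask_vae_loss(
    x=[0.0, 1.0, 2.0, 1.0],
    x_recon=[0.0, 1.0, 2.0, 1.0],
    mu=[0.0, 0.0],
    log_var=[0.0, 0.0],
    y_pred=[1.0, 2.0],
    y_true=[1.5, 2.5],
)
print(loss)
```

In this toy case the reconstruction and KL terms are both zero, so the total equals the prediction error alone. In a real model each term would be computed on mini-batches by a deep learning framework, and the relative weights would need tuning.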