VMGP: A unified variational auto-encoder based multi-task model for multi-phenotype, multi-environment, and cross-population genomic selection in plants

Bibliographic Details
Main Authors: Xiangyu Zhao, Fuzhen Sun, Jinlong Li, Dongfeng Zhang, Qiusi Zhang, Zhongqiang Liu, Changwei Tan, Hongxiang Ma, Kaiyi Wang
Format: Article
Language: English
Published: KeAi Communications Co., Ltd. 2025-12-01
Series: Artificial Intelligence in Agriculture
Subjects: Genomic selection; Variational auto-encoder; Multi-task; Deep learning; Genomic prediction
Online Access: http://www.sciencedirect.com/science/article/pii/S2589721725000704
collection DOAJ
description Plant breeding stands as a cornerstone for agricultural productivity and the safeguarding of food security. The advent of Genomic Selection heralds a new epoch in breeding, characterized by its capacity to harness whole-genome variation for genomic prediction. This approach transcends the need for prior knowledge of genes associated with specific traits. Nonetheless, the vast dimensionality of genomic data juxtaposed with the relatively limited number of phenotypic samples often leads to the “curse of dimensionality”, where traditional statistical, machine learning, and deep learning methods are prone to overfitting and suboptimal predictive performance. To surmount this challenge, we introduce a unified Variational auto-encoder based Multi-task Genomic Prediction model (VMGP) that integrates self-supervised genomic compression and reconstruction with multiple prediction tasks. This approach provides a robust solution, offering a formidable predictive framework that has been rigorously validated across public datasets for wheat, rice, and maize. Our model demonstrates exceptional capabilities in multi-phenotype and multi-environment genomic prediction, successfully navigating the complexities of cross-population genomic selection and underscoring its unique strengths and utility. Furthermore, by integrating VMGP with model interpretability, we can effectively triage relevant single nucleotide polymorphisms, thereby enhancing prediction performance and proposing potential cost-effective genotyping solutions. The VMGP framework, with its simplicity, stable predictive prowess, and open-source code, is exceptionally well-suited for broad dissemination within plant breeding programs. It is particularly advantageous for breeders who prioritize phenotype prediction yet may not possess extensive knowledge in deep learning or proficiency in parameter tuning.
format Article
id doaj-art-f0cf5c90cfe24b6184faae896b52f80c
institution Matheson Library
issn 2589-7217
language English
publishDate 2025-12-01
publisher KeAi Communications Co., Ltd.
record_format Article
series Artificial Intelligence in Agriculture
spelling doaj-art-f0cf5c90cfe24b6184faae896b52f80c, 2025-07-04T04:46:46Z, eng
KeAi Communications Co., Ltd., Artificial Intelligence in Agriculture, ISSN 2589-7217, 2025-12-01, vol. 15, no. 4, pp. 829-842
Author affiliations:
Xiangyu Zhao (0): Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China; National Engineering Research Center for Information Technology in Agriculture, Beijing, China; Beijing Key Laboratory of Crop Molecular Design and Intelligent Breeding, Beijing, China
Fuzhen Sun (1): School of Computer Science and Technology, Shandong University of Technology, Zibo, China
Jinlong Li (2): Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
Dongfeng Zhang (3): Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China; Beijing Key Laboratory of Crop Molecular Design and Intelligent Breeding, Beijing, China
Qiusi Zhang (4): Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
Zhongqiang Liu (5): Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China; Beijing Key Laboratory of Crop Molecular Design and Intelligent Breeding, Beijing, China
Changwei Tan (6): Yangzhou University, Yangzhou, China
Hongxiang Ma (7): Yangzhou University, Yangzhou, China
Kaiyi Wang (8): Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China; National Engineering Research Center for Information Technology in Agriculture, Beijing, China; Beijing Key Laboratory of Crop Molecular Design and Intelligent Breeding, Beijing, China. Corresponding author at: Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China.
title VMGP: A unified variational auto-encoder based multi-task model for multi-phenotype, multi-environment, and cross-population genomic selection in plants
topic Genomic selection
Variational auto-encoder
Multi-task
Deep learning
Genomic prediction
url http://www.sciencedirect.com/science/article/pii/S2589721725000704
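The description above says VMGP combines self-supervised genomic compression and reconstruction with multiple prediction tasks. The record gives no architectural detail, so the following is only a minimal sketch of what such a combined objective might look like. It assumes single linear layers for the encoder, decoder, and task heads, mean-squared-error terms, and NaN-coded missing phenotypes. Every name in it (`init_params`, `multitask_vae_loss`, `beta`, `task_weight`) is hypothetical and not taken from the authors' code.

```python
import numpy as np


def init_params(n_snps, latent_dim, n_tasks, rng, scale=0.01):
    """Random weights for linear stand-ins (assumption: the real model is deeper)."""
    return {
        "W_mu": rng.normal(0.0, scale, (n_snps, latent_dim)),
        "W_logvar": rng.normal(0.0, scale, (n_snps, latent_dim)),
        "W_dec": rng.normal(0.0, scale, (latent_dim, n_snps)),
        "W_task": rng.normal(0.0, scale, (latent_dim, n_tasks)),
    }


def multitask_vae_loss(x, y, params, rng, beta=1.0, task_weight=1.0):
    """Return the loss terms for one batch.

    x: (n_samples, n_snps) genotype matrix, e.g. coded 0/1/2.
    y: (n_samples, n_tasks) phenotypes; NaN marks an unobserved value.
    """
    # Encoder: mean and log-variance of the approximate posterior q(z | x).
    mu = x @ params["W_mu"]
    logvar = x @ params["W_logvar"]

    # Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I).
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps

    # Decoder reconstructs the genotypes; task heads predict each phenotype.
    x_hat = z @ params["W_dec"]
    y_hat = z @ params["W_task"]

    # Self-supervised reconstruction term.
    recon = np.mean((x - x_hat) ** 2)

    # KL divergence between q(z | x) and a standard normal prior.
    kl = -0.5 * np.mean(np.sum(1.0 + logvar - mu**2 - np.exp(logvar), axis=1))

    # Supervised term, averaged over observed phenotype entries only.
    observed = ~np.isnan(y)
    task = np.mean((y_hat[observed] - y[observed]) ** 2)

    total = recon + beta * kl + task_weight * task
    return {"total": total, "recon": recon, "kl": kl, "task": task}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.integers(0, 3, size=(8, 50)).astype(float)  # toy genotypes
    y = rng.normal(size=(8, 3))                         # toy phenotypes
    y[0, 1] = np.nan                                    # one missing value
    params = init_params(n_snps=50, latent_dim=4, n_tasks=3, rng=rng)
    print(multitask_vae_loss(x, y, params, rng))
```

Sharing one latent code `z` between the decoder and all task heads is what makes the sketch multi-task: the reconstruction term can use every genotyped sample, while the task term uses only the phenotype entries that were actually measured.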