A BAC-guided haplotype assembly pipeline increases the resolution of the virus resistance locus CMD2 in cassava

Bibliographic Details
Main Authors: Luc Cornet, Syed Shan-e-Ali Zaidi, Jia Li, Yvan Ngapout, Sara Shakir, Loic Meunier, Caroline Callot, William Marande, Marc Hanikenne, Stephane Rombauts, Yves Van de Peer, Hervé Vanderschuren
Format: Article
Language: English
Published: BMC 2025-06-01
Series: Genome Biology
Online Access: https://doi.org/10.1186/s13059-025-03620-8
Description
Summary: Background: Cassava is an important crop for food security in the tropics, where its production is jeopardized by several viral diseases, including cassava mosaic disease (CMD), which is endemic in Sub-Saharan Africa and the Indian subcontinent. Resistance to CMD is linked to a single dominant locus, CMD2. The cassava genome contains highly repetitive regions, making the accurate assembly of a reference genome challenging. Results: In the present study, we generate BAC libraries of the CMD-susceptible cassava cultivar (cv.) 60444 and the CMD-resistant landrace TME3. We subsequently identify and sequence BACs belonging to the CMD2 region in both cultivars using high-accuracy long-read PacBio circular consensus sequencing (CCS) reads. We then sequence and assemble the complete genomes of cv. 60444 and TME3 using a combination of ONT ultra-long reads and optical mapping. Anchoring the assemblies on cassava genetic maps reveals discrepancies in our CMD2 regions of the cv. 60444 and TME3 genomes, as well as in previously released ones. A BAC-guided approach to assessing cassava genome assemblies significantly improves the synteny between the assembled CMD2 regions of cv. 60444 and TME3 and the CMD2 genetic maps. We then perform repeat-unmasked gene annotation on the CMD2 assemblies and identify 81 stress resistance proteins in the CMD2 region, 31 of which were not previously reported in publicly available CMD2 sequences. Conclusions: The BAC-assessed approach improved the accuracy of the CMD2 region and revealed new sequences linked to virus resistance, advancing our understanding of cassava mosaic disease resistance.
ISSN:1474-760X