A rapid heuristic algorithm to solve the single individual haplotype assembly problem

Bibliographic Details
Main Authors: Melina Bagher, Reza Karimzadeh, Mehran Jahed, Babak Hossein Khalaj
Format: Article
Language: English
Published: Amirkabir University of Technology 2023-12-01
Series: AUT Journal of Electrical Engineering
Online Access: https://eej.aut.ac.ir/article_5196_c206979ddc5b6231b7cfc699bfbdc0ba.pdf
Description
Summary: Haplotype assembly is the computational process of reconstructing the two distinct nucleotide sequences of a chromosome pair from the sequencing reads of an individual. The ability to identify haplotypes benefits genomic studies in many areas, such as drug design, population studies, and disease diagnosis. Although several approaches have been proposed to obtain highly accurate haplotypes, fast and precise haplotype assembly remains a challenging task. Because of the sheer volume of high-throughput sequencing data, algorithm speed plays a crucial role in making haplotype assembly feasible at the scale of the human genome. This study introduces a heuristic technique that enables rapid haplotype reconstruction while maintaining respectable accuracy. The approach is divided into two stages. In the first, a partial haplotype is created and extended over a number of iterations; a novel metric assesses the quality of the reconstructed haplotype in each iteration in order to arrive at the best solution. In the second, the reconstructed haplotypes are refined to increase their accuracy. The results show that the proposed approach reconstructs haplotypes with an acceptable level of accuracy. In terms of speed, the algorithm outperforms competing approaches, especially on high-coverage sequencing data.
ISSN: 2588-2910, 2588-2929