GRAPEVNE - Graphical Analytical Pipeline Development Environment for Infectious Diseases [version 1; peer review: 2 approved]

Bibliographic Details
Main Authors: Samir Bhatt, John-Stuart Brittain, Houriiyah Tegally, Rhys Inward, Joseph Tsui, Gaspary Mwanyika, Bernardo Gutierrez, Sofonias Kifle Tessema, Tuyen Huynh, Abhishek Dasgupta, John T. McCrone, George Githinji, Moritz U.G. Kraemer, Stephen Ratcliffe
Format: Article
Language: English
Published: Wellcome 2025-05-01
Series: Wellcome Open Research
Online Access: https://wellcomeopenresearch.org/articles/10-279/v1
Collection: DOAJ
Description: The increase in volume and diversity of relevant data on infectious diseases and their drivers provides opportunities to generate new scientific insights that can support ‘real-time’ decision-making in public health across outbreak contexts and enhance pandemic preparedness. However, utilising the wide array of clinical, genomic, epidemiological, and spatial data collected globally is difficult due to differences in data preprocessing, data science capacity, and access to hardware and cloud resources. To facilitate large-scale and routine analyses of infectious disease data at the local level (i.e. without sharing data across borders), we developed GRAPEVNE (Graphical Analytical Pipeline Development Environment), a platform for constructing modular pipelines for complex and repetitive data analysis workflows through an intuitive graphical interface. Built on the Snakemake workflow management system, GRAPEVNE streamlines the creation, execution, and sharing of analytical pipelines. Its modular approach already supports a diverse range of scientific applications, including genomic analysis, epidemiological modelling, and large-scale data processing. Each module in GRAPEVNE is a self-contained Snakemake workflow, complete with configurations, scripts, and metadata, enabling interoperability. The platform’s open-source nature supports ongoing community-driven development and scalability. GRAPEVNE empowers researchers and public health institutions by simplifying complex analytical workflows, fostering data-driven discovery, and enhancing reproducibility in computational research. Its user-driven ecosystem encourages continuous innovation in biomedical and epidemiological research, and the platform is applicable beyond these fields. Key use cases include automated phylogenetic analysis of viral sequences, real-time outbreak monitoring, forecasting, and epidemiological data processing. For instance, our dengue virus pipeline demonstrates end-to-end automation from sequence retrieval to phylogeographic inference, leveraging established bioinformatics tools that can be deployed in any geographical context. For more details, see the documentation at: https://grapevne.readthedocs.io
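The modular design described in the abstract can be illustrated with a minimal Python sketch. This is not GRAPEVNE's or Snakemake's API: the `Module` class, the `execution_order` function, and the module and data names (including the intermediate alignment and tree steps) are hypothetical, chosen only to show how self-contained modules with declared inputs and outputs could be chained into a pipeline.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    """Hypothetical stand-in for one self-contained pipeline module."""
    name: str
    config: dict = field(default_factory=dict)  # module-level settings
    inputs: tuple = ()                           # names of data it consumes
    outputs: tuple = ()                          # names of data it produces


def execution_order(modules):
    """Return module names in an order where every input is produced first."""
    produced, ordered, pending = set(), [], list(modules)
    while pending:
        # A module is ready once all of its inputs have been produced.
        ready = [m for m in pending if set(m.inputs) <= produced]
        if not ready:
            missing = {i for m in pending for i in m.inputs} - produced
            raise ValueError(f"unsatisfied inputs: {sorted(missing)}")
        for m in ready:
            ordered.append(m.name)
            produced.update(m.outputs)
            pending.remove(m)
    return ordered


# Assumed example loosely following the dengue use case:
# sequence retrieval through to phylogeographic inference.
pipeline = [
    Module("phylogeography", inputs=("tree",), outputs=("migration_report",)),
    Module("fetch_sequences", config={"virus": "dengue"}, outputs=("sequences",)),
    Module("build_tree", inputs=("alignment",), outputs=("tree",)),
    Module("align", inputs=("sequences",), outputs=("alignment",)),
]

if __name__ == "__main__":
    print(execution_order(pipeline))
```

Because each module declares what it consumes and produces, the modules can be listed in any order and a valid run order can still be derived, which is the kind of interoperability a workflow manager such as Snakemake resolves from its rules' input and output files.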
Record ID: doaj-art-cf46a84d24aa4fa889ffce2a01b187af
Institution: Matheson Library
ISSN: 2398-502X
Author affiliations (listed in author order):
Samir Bhatt: Department of Public Health, University of Copenhagen, 1352 Copenhagen, Denmark
John-Stuart Brittain (https://orcid.org/0000-0002-4172-190X): Oxford Research Software Engineering Group, University of Oxford, Oxford, England, UK
Houriiyah Tegally: Centre for Epidemic Response and Innovation (CERI), Stellenbosch University, Stellenbosch, Western Cape, South Africa
Rhys Inward (https://orcid.org/0000-0003-0016-661X): Pandemic Sciences Institute, University of Oxford, Oxford, England, UK
Joseph Tsui (https://orcid.org/0000-0001-7871-8627): Pandemic Sciences Institute, University of Oxford, Oxford, England, UK
Gaspary Mwanyika: Centre for Epidemic Response and Innovation (CERI), Stellenbosch University, Stellenbosch, Western Cape, South Africa
Bernardo Gutierrez: Pandemic Sciences Institute, University of Oxford, Oxford, England, UK
Sofonias Kifle Tessema: Africa Centres for Disease Control and Prevention (Africa CDC), Addis Ababa, Ethiopia
Tuyen Huynh (https://orcid.org/0009-0009-4878-8953): Oxford University Clinical Research Unit, Ho Chi Minh City, Ho Chi Minh, Vietnam
Abhishek Dasgupta: Oxford Research Software Engineering Group, University of Oxford, Oxford, England, UK
John T. McCrone: Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Seattle, Washington, USA
George Githinji (https://orcid.org/0000-0001-9640-7371): KEMRI-Wellcome Trust Research Programme, Kilifi, Kenya
Moritz U.G. Kraemer (https://orcid.org/0000-0001-8838-7147): Pandemic Sciences Institute, University of Oxford, Oxford, England, UK
Stephen Ratcliffe: Google Inc., Mountain View, USA
Keywords: data science, automated workflows, graphical interface, snakemake, open-source, epidemiology