12  Automating Pipelines: RiFA – Resistance in Plasmodium falciparum Amplicon Sequences

Author: Bethlehem Adnew

RiFA (Resistance in Falciparum Amplicon) is a bioinformatics workflow that detects drug resistance mutations in targeted amplicon sequencing data from Plasmodium falciparum, the primary cause of severe malaria. It focuses on key genes linked to resistance against major antimalarial drugs, including artemisinin, chloroquine, and sulfadoxine-pyrimethamine.

The pipeline automates every step from raw amplicon reads through quality control, alignment, variant calling, and resistance mutation annotation. This makes it efficient, reproducible, and scalable for molecular surveillance of antimalarial resistance.

12.1 What Are Workflow Pipelines?

A workflow pipeline is a sequence of automated computational steps that process and analyze data in a structured, reproducible way.

12.1.1 Why Pipelines Matter in Bioinformatics

  • Efficiently handle large datasets (hundreds/thousands of samples)
  • Minimize manual errors and inconsistencies
  • Enable reproducible research: rerun the same analysis tomorrow or years later and, given the same inputs and software versions, get the same results
  • Automate complex multi-step analyses that would otherwise be time-consuming

12.1.2 Classic Amplicon Sequencing Pipeline Example
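
As a rough illustration of what a classic amplicon workflow chains together, the sketch below prints the commands such a workflow might run for one paired-end sample: read trimming, alignment, and variant calling. The tool choices (fastp, bwa, samtools, bcftools), the file naming scheme, and the `plan_sample` helper are assumptions made for this example and are not taken from RiFA; resistance annotation would follow as a pipeline-specific step.

```shell
#!/usr/bin/env bash
# Illustrative sketch: print (do not execute) the commands for one sample.
# Tool choices and file names are assumptions, not RiFA's actual rules.

plan_sample() {
    sample="$1"   # sample ID, e.g. S1
    ref="$2"      # reference FASTA (must already be indexed with `bwa index`)

    # 1. Quality control and adapter trimming of paired-end reads
    echo "fastp -i ${sample}_R1.fastq.gz -I ${sample}_R2.fastq.gz -o ${sample}_R1.trim.fastq.gz -O ${sample}_R2.trim.fastq.gz"
    # 2. Alignment to the reference, then coordinate sorting and indexing
    echo "bwa mem ${ref} ${sample}_R1.trim.fastq.gz ${sample}_R2.trim.fastq.gz | samtools sort -o ${sample}.bam -"
    echo "samtools index ${sample}.bam"
    # 3. Variant calling into a compressed VCF
    echo "bcftools mpileup -f ${ref} ${sample}.bam | bcftools call -mv -Oz -o ${sample}.vcf.gz"
}

# Show the plan for a hypothetical sample
plan_sample S1 reference.fasta
```

Printing the commands first makes the order of the steps easy to review. A workflow manager such as Snakemake takes over this chaining and reruns only the steps whose inputs have changed.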

12.3 Key Benefits of Workflow Pipelines

Efficiency
Automates repetitive tasks → frees researchers for interpretation rather than clicking.

Reproducibility
Version-controlled pipelines ensure others (or future you) can reproduce results exactly.

Scalability
Process 10 samples or 10,000 samples with minimal changes.

Collaboration & Sharing
Share via GitHub, GitLab, or Zenodo → colleagues worldwide can reuse, adapt, and cite your work.

12.4 Hands-on Exercise: Run the RiFA Pipeline

In this exercise, you will use the RiFA pipeline to identify drug resistance mutations in Plasmodium falciparum amplicon sequences.

12.4.1 Target Genes in RiFA

RiFA focuses on key resistance-associated loci:

  • pfcrt: chloroquine resistance transporter (chloroquine resistance)
  • pfmdr1: multidrug resistance protein 1 (multidrug resistance)
  • pfk13 (kelch13): Kelch13 protein (artemisinin partial resistance)
  • pfdhfr: dihydrofolate reductase (pyrimethamine resistance)
  • pfdhps: dihydropteroate synthase (sulfadoxine resistance)
  • cytb: cytochrome b (atovaquone resistance)
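
The gene-to-drug relationships listed above can be written as a small lookup, which is the kind of mapping an annotation step relies on when labelling a variant. This is a minimal sketch: the `drug_for_gene` function is hypothetical and does not come from the RiFA code base.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a target gene to the drug it is associated with,
# following the list of loci in this section.

drug_for_gene() {
    case "$1" in
        pfcrt)          echo "chloroquine" ;;
        pfmdr1)         echo "multiple drugs" ;;
        pfk13|kelch13)  echo "artemisinin" ;;
        pfdhfr)         echo "pyrimethamine" ;;
        pfdhps)         echo "sulfadoxine" ;;
        cytb)           echo "atovaquone" ;;
        *)              echo "unknown"; return 1 ;;   # not a listed target
    esac
}

# Example: look up the drug linked to pfk13
drug_for_gene pfk13
```

The non-zero return code for unlisted genes lets a calling script detect unexpected loci instead of silently labelling them.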

12.4.2 Pipeline Overview

Figure: Overview of the RiFA pipeline

12.4.3 Getting Started

  1. Visit the official repository:
    https://github.com/BettyAC/RiFA
    → Read the full README, installation guide, and example usage.

  2. Clone the repository:

    git clone https://github.com/BettyAC/RiFA.git
    cd RiFA

  3. Create and activate the environment:

    # File and environment names may vary; check the README
    conda env create -f environment.yml
    conda activate rifa-env

  4. Run the pipeline:

    # Adjust --cores to match your machine
    snakemake --use-conda --cores 8 --rerun-incomplete
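
Before launching a full run, it can help to confirm that the tools used in the steps above are available on the PATH. The sketch below is one way to do that; the `missing_tools` helper is hypothetical, and the list of tools (git, conda, snakemake) is taken from the commands in this section.

```shell
#!/usr/bin/env bash
# Hypothetical preflight check: print the name of every tool not found on PATH.

missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing="$(missing_tools git conda snakemake)"
if [ -n "$missing" ]; then
    # Word splitting is intentional so the names print on one line
    echo "Missing tools:" $missing >&2
else
    echo "All required tools found."
fi
```

Once the tools are in place, adding Snakemake's `--dry-run` (`-n`) flag to the command in step 4 lists the jobs that would run without executing them, which is a cheap way to catch configuration problems early.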

Expected Outputs:

  • Annotated variant table (CSV/TSV) with resistance mutations, allele frequencies, and quality metrics
  • MultiQC report summarizing QC and coverage
  • Visual summaries (coverage plots, mutation heatmaps, frequency bar charts)
  • Resistance profile summary per sample
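
The annotated variant table can be filtered with standard command-line tools. The sketch below keeps only rows at or above a minimum allele frequency. It assumes a tab-separated file with a header row and a column named `allele_frequency`; both the column name and the `filter_by_af` helper are assumptions for illustration, so check the actual output of your run for the real file name and header.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep the header plus rows whose allele frequency
# is at least MIN_AF. Assumes a TSV with an "allele_frequency" column.
# Usage: filter_by_af TABLE.tsv MIN_AF

filter_by_af() {
    awk -F'\t' -v min="$2" -v col="allele_frequency" '
        NR == 1 {
            # Locate the allele frequency column by name
            for (i = 1; i <= NF; i++) if ($i == col) idx = i
            if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
            print
            next
        }
        # Force numeric comparison on both sides
        ($idx + 0) >= (min + 0)
    ' "$1"
}
```

For example, `filter_by_af variants.tsv 0.05` would print the header and every variant with a frequency of at least 5%, where `variants.tsv` stands in for the table produced by the pipeline. Looking the column up by name rather than position keeps the filter working if the column order changes.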

Note: This pipeline is part of surveillance efforts such as the Ethiopian Malaria Genomics Network (EMAGEN). Always check the repository for the latest version, example datasets, and any required reference files.