PIPA - Pipeline for Microbial Genomic Analysis

PIPA is an integrated platform for microbial genomic analysis that supports Illumina, Nanopore, and PacBio sequencing data. It provides a web interface and desktop application with a Flask backend that orchestrates bioinformatics tools for read trimming, genome assembly, gene prediction, and reporting.

Full Documentation | GitHub Repository | Download Desktop App

Pipeline Stages

| Stage | Tools | Description |
|---|---|---|
| Trimming | Trim Galore, Porechop_ABI | Quality trimming of Illumina and Nanopore reads |
| Assembly | SPAdes, Canu, Flye, Unicycler | Genome assembly from short and long reads |
| Annotation | 48 tools across 5 categories (see below) | Gene annotation, typing, resistance, virulence, and more |
| Report | KEGG-decoder | Functional pathway visualization |

Annotation Tools (48 total)

| Category | Tools |
|---|---|
| General Annotation | Prokka, Bakta, MLST, Barrnap, tRNAscan-SE, EggNOG-mapper, KOFAM |
| Assembly Quality | BUSCO, CheckM, QUAST |
| Resistance & Virulence | Abricate (NCBI), Abricate (VFDB), AMRFinderPlus, mcroni |
| Mobile Elements & Defense | PlasmidFinder, MOB-suite, Phigaro, PhiSpy, CRISPRCasFinder, DefenseFinder, ISMapper |
| Organism-Specific Typing | Kleborate, staphtyper, TBProfiler, ClermonTyping, ECTyper, emmtyper, GenoTyphi, hicap, HpSuisSero, legsta, LisSero, meningotype, ngmaster, pasty, pbptyper, PneumoCaT, sccmec, SeqSero2, SeroBA, ShigaPass, ShigaTyper, ShigEiFinder, SISTR, spaTyper, SsuisSero, staphopia-sccmec, STECFinder |

Desktop Application

Download native installers from the Releases page:

| Platform | File | Architecture |
|---|---|---|
| macOS (Apple Silicon) | PIPA_2.0.0_aarch64.dmg | M1, M2, M3, M4 |
| macOS (Intel) | PIPA_2.0.0_x64.dmg | Intel Macs |
| Windows | PIPA_2.0.0_x64-setup.exe | 64-bit Windows 10/11 |
| Linux (Debian/Ubuntu) | pipa_2.0.0_amd64.deb | Debian-based |
| Linux (Universal) | pipa_2.0.0_amd64.AppImage | Any 64-bit Linux |

Prerequisites

The desktop app requires Docker Desktop to run the bioinformatics backend. On first launch, PIPA automatically pulls the lcerdeira/pipa Docker image (~5 GB compressed), which ships with all 48 analysis tools pre-installed. No manual configuration is required.

  1. Install Docker Desktop
  2. Download and install PIPA from Releases
  3. Launch PIPA — the backend starts automatically

Quick Start (Docker)

The PIPA Docker image is available on Docker Hub:

# Pull and run the backend with all 48 bioinformatics tools
docker run -d --name pipa-backend -p 5000:5000 -v pipa-data:/data lcerdeira/pipa:latest

# The API is now available at http://localhost:5000
curl http://localhost:5000/api/jobs

Or clone the repository and build locally:

git clone https://github.com/lcerdeira/Pipa.git
cd Pipa
docker build -t pipa .
docker run -d --name pipa-backend -p 5000:5000 -v pipa-data:/data pipa
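
The container may need a few seconds before the API accepts connections. The following is a minimal sketch, not part of PIPA itself, of a readiness check that polls the documented `GET /api/jobs` endpoint until it answers with HTTP 200. The function name `wait_for_backend` and its `timeout`, `interval`, `opener`, `sleep`, and `clock` parameters are illustrative assumptions.

```python
import time
import urllib.error
import urllib.request


def wait_for_backend(url="http://localhost:5000/api/jobs", timeout=60.0, interval=2.0,
                     opener=urllib.request.urlopen, sleep=time.sleep,
                     clock=time.monotonic):
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds pass.

    `opener`, `sleep`, and `clock` are injectable so the loop can be
    exercised without a running backend.
    """
    deadline = clock() + timeout
    while True:
        try:
            with opener(url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, ConnectionError, TimeoutError):
            pass  # backend is not accepting connections yet
        if clock() >= deadline:
            return False
        sleep(interval)
```

Calling `wait_for_backend()` right after `docker run` returns `True` once the API responds, or `False` if it stays unreachable for the whole timeout.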

Quick Start (Web SPA)

# Install backend dependencies and start the API (run from the repository root)
cd back-end
pip install -r requirements.txt
flask run --host 0.0.0.0 --port 5000

# In another terminal, install and run the frontend
cd UI/DesktopPIPA
yarn install
npx quasar dev

Open http://localhost:8080 in your browser.

Installation (Conda)

For a native installation with all bioinformatics tools:

# Create the conda environment
conda env create -f environment.yml
conda activate pipa

# Start the backend
cd back-end
flask run --host 0.0.0.0 --port 5000

API Endpoints

| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/upload` | Upload sequencing files (multipart form) |
| POST | `/api/run` | Start pipeline with configuration |
| GET | `/api/status/<job_id>` | Get pipeline progress and status |
| GET | `/api/results/<job_id>` | Get pipeline results |
| GET | `/api/results/<job_id>/files/<path>` | Download result file |
| GET | `/api/jobs` | List all jobs |
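
When scripting against these routes, it can help to build the per-job URLs in one place. The sketch below is a hypothetical client-side helper, not code shipped with PIPA: the function names are illustrative, the base URL assumes the port mapping from the Quick Start, and the result path in the usage note is invented for the example.

```python
from urllib.parse import quote

BASE_URL = "http://localhost:5000"  # assumption: default Quick Start port mapping


def status_url(job_id, base=BASE_URL):
    """URL for GET /api/status/<job_id>."""
    return f"{base}/api/status/{quote(job_id, safe='')}"


def results_url(job_id, base=BASE_URL):
    """URL for GET /api/results/<job_id>."""
    return f"{base}/api/results/{quote(job_id, safe='')}"


def result_file_url(job_id, path, base=BASE_URL):
    """URL for GET /api/results/<job_id>/files/<path>."""
    # Keep "/" so nested result paths stay nested; percent-encode everything else.
    return f"{results_url(job_id, base)}/files/{quote(path, safe='/')}"
```

For example, `result_file_url("abc123", "prokka/sample 1.gff")` percent-encodes the space in the file name while leaving the directory separator intact.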

Example API Usage

# Upload files
curl -X POST http://localhost:5000/api/upload \
  -F "illumina=@reads_R1.fastq.gz" \
  -F "illumina=@reads_R2.fastq.gz" \
  -F "illumina_type=paired"

# Start pipeline
curl -X POST http://localhost:5000/api/run \
  -H "Content-Type: application/json" \
  -d '{"job_id": "abc123", "genus": "Klebsiella", "species": "pneumoniae", "genome_size": "5.5m"}'

# Check status
curl http://localhost:5000/api/status/abc123
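
The "Start pipeline" call above can also be issued from Python using only the standard library. This is a sketch that mirrors the curl command. The helper name `build_run_request` is an assumption, and the request is only constructed here, not sent, so the snippet runs without a backend.

```python
import json
import urllib.request


def build_run_request(config, base="http://localhost:5000"):
    """Build (but do not send) the POST /api/run request for `config`."""
    body = json.dumps(config).encode("utf-8")
    return urllib.request.Request(
        f"{base}/api/run",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_run_request({
    "job_id": "abc123",
    "genus": "Klebsiella",
    "species": "pneumoniae",
    "genome_size": "5.5m",
})

# To actually send it (requires a running backend):
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode("utf-8"))
```

Separating construction from sending keeps the payload easy to inspect or log before the pipeline is started.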

Configuration

The pipeline accepts these parameters per run:

| Parameter | Default | Description |
|---|---|---|
| `genus` | `Unknown` | Organism genus (used by Prokka, Kleborate) |
| `species` | `unknown` | Organism species |
| `sample_name` | `sample` | Name for the analysis run |
| `genome_size` | `5m` | Estimated genome size (used by Canu) |
| `input_type` | `reads` | `reads` (full pipeline) or `assembly` (annotation only) |
| `tools` | `[]` | Annotation tools to run (e.g., `["prokka", "mlst", "abricate"]`) |
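
A client can mirror these defaults to catch typos before submitting a run. The sketch below is a hypothetical client-side helper, not the backend's own validation: `build_run_config` and the constant names are assumptions, and including `job_id` in the payload follows the curl example in "Example API Usage".

```python
DEFAULTS = {
    "genus": "Unknown",
    "species": "unknown",
    "sample_name": "sample",
    "genome_size": "5m",
    "input_type": "reads",
    "tools": [],
}
VALID_INPUT_TYPES = {"reads", "assembly"}


def build_run_config(job_id, **overrides):
    """Merge `overrides` onto the documented defaults and sanity-check them."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    config = {**DEFAULTS, **overrides, "job_id": job_id}
    # Copy the list so the shared DEFAULTS["tools"] is never mutated by callers.
    config["tools"] = list(config["tools"])
    if config["input_type"] not in VALID_INPUT_TYPES:
        raise ValueError(f"input_type must be one of {sorted(VALID_INPUT_TYPES)}")
    return config
```

For instance, `build_run_config("abc123", genus="Klebsiella", tools=["prokka", "mlst"])` fills in `species`, `sample_name`, `genome_size`, and `input_type` from the table, while a misspelled parameter name raises `ValueError` before any request is made.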

The PIPA_DATA_DIR environment variable controls where data is stored (default: ./data).
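
As an illustration of that lookup, a script that needs to find the same directory could resolve it as follows. This is a sketch, not PIPA's actual code: the function name `resolve_data_dir` is hypothetical, and treating an empty variable the same as an unset one is an assumption.

```python
import os
from pathlib import Path


def resolve_data_dir(environ=os.environ):
    """Return the data directory: PIPA_DATA_DIR if set, else ./data."""
    # ASSUMPTION: an empty PIPA_DATA_DIR falls back to the default as well.
    return Path(environ.get("PIPA_DATA_DIR") or "./data")
```

Passing a plain dictionary as `environ` makes the behaviour easy to check without touching the real environment.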

Project Structure

Pipa/
  back-end/              # Flask REST API
    app.py               # Application entry point
    blueprints/          # API and view routes
    extensions/          # Pipeline services
      services.py        # Pipeline orchestrator
      services1/         # Individual service modules
    tests/               # API tests
  UI/DesktopPIPA/        # Desktop & web application
    src/                 # Vue.js components and store
    src-tauri/           # Tauri desktop wrapper (Rust)
  docs/                  # Documentation (MkDocs)
  environment.yml        # Conda environment specification
  Dockerfile             # Container build
  docker-compose.yml     # Container orchestration
  examples/              # Example scripts

Running Tests

cd back-end
pip install pytest
python -m pytest tests/ -v

Contact

Dr Louise Cerdeira - [email protected]

License

Copyright 2026. GPL-3.0.

Citation

If you use PIPA in your research, please cite:

Cerdeira, L. PIPA: Pipeline for Microbial Genomic Analysis. https://github.com/lcerdeira/Pipa
