A web-based application that lets users upload microbiome sequencing data (such as 16S rRNA gene sequencing reads), run basic data analysis, and generate visualizations of microbiome diversity.
Microbiome analysis using 16S rRNA sequencing identifies which bacteria are present in a sample by reading the 16S rRNA gene, a genetic "barcode" that all bacteria have. The sequencing machine reads up to millions of these DNA barcodes, and specialized software groups them into distinct sequence variants, assigns each one to a bacterial taxon, and measures how abundant it is. The result describes the diversity of the microbial community: which bacteria are present, how many different types there are, and which ones dominate.
This application was developed as part of the AI Dev Tools Zoomcamp by DataTalks.Club, a free course focused on building AI-powered applications with modern development tools and best practices.
Backend:
🦎 Django • Python 3.12 • Django REST Framework • SQLite
Bioinformatics:
🧬 Nextflow 25.10.2 • nf-core/ampliseq 2.15.0 • DADA2 • Cutadapt • Conda/Mamba
Data & Analysis:
📊 Pandas • Matplotlib
DevOps & Deployment:
🐳 Docker • Docker Compose
☁️ Render
CI/CD:
🔄 GitHub Actions
Frontend:
✨ Vibe-coded with Lovable: React 18 • TypeScript • Vite • shadcn-ui • Tailwind CSS
- 🧬 16S rRNA Sequencing Analysis - Upload FASTQ files for bacterial identification
- 🧪 Test Data Mode - Try the pipeline with built-in sample data
- 📊 Interactive Visualizations - View taxonomy composition and diversity metrics
- 🔄 Real-time Status Updates - Track analysis progress live
- 📈 Comprehensive Reports - Get detailed HTML reports with all results
- 🐳 Dockerized Deployment - Easy local development and production deployment
- ✅ Automated Testing - 42 tests (25 backend + 17 frontend) with CI/CD
- ☁️ Cloud-Ready - Deploy to AWS, Render, or Railway
# Clone repository
git clone https://github.com/katwre/Microbiome-ai-dev.git
cd Microbiome-ai-dev
# Start with Docker Compose
cd docker
docker-compose up -d
# Access application
# Frontend: http://localhost
# Backend API: http://localhost:8000/api/
Try it out locally:
- Open http://localhost
- Click "Start New Analysis"
- Fill in project details
- Check "Use sample data for testing"
- Click "Run Analysis"
- Wait ~5-10 minutes for results
🌐 Live Demo: https://microbiome-frontend.onrender.com
⚠️ Note: The live demo runs on Render's free tier, which puts the service to sleep after 15 minutes of inactivity, so it may be temporarily offline or constrained by resource limits. If it is unavailable, run the application locally with Docker.
Backend - Backend Documentation
- REST API with Django & Django REST Framework
- PostgreSQL (production) / SQLite (development)
- Comprehensive test suite (25 tests)
Frontend - Frontend Documentation
- React SPA with TypeScript
- Component library: shadcn-ui
- Testing with Vitest (17 tests)
Bioinformatics Pipeline
- Nextflow workflow engine
- nf-core/ampliseq v2.15.0
- DADA2 for ASV calling
- GTDB taxonomic classification
Testing - Testing Guide
- 42 total tests (100% passing)
- Unit tests for models and API
- Integration tests for workflows
- CI pipeline with GitHub Actions
Deployment
- Render Deployment Guide - Quick cloud deployment
- CI/CD Documentation - Automated testing and deployment
Create Analysis Job
POST /api/jobs/upload/
Content-Type: multipart/form-data
Parameters:
- project_name: string (required)
- email: string (required)
- data_type: "paired-end" | "single-end" (required)
- files: File[] (optional if use_test_data=true)
- use_test_data: boolean (default: false)
- send_email: boolean (default: true)
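The parameters above can be assembled and checked on the client before sending. The following is a minimal sketch, not part of the project: `UploadRequest` and its methods are hypothetical names, and encoding booleans as the strings "true"/"false" is an assumption about how the backend parses form fields.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Values listed for data_type in the parameter list above.
ALLOWED_DATA_TYPES = {"paired-end", "single-end"}


@dataclass
class UploadRequest:
    """Hypothetical helper that mirrors the documented form fields."""

    project_name: str
    email: str
    data_type: str
    files: List[str] = field(default_factory=list)  # local FASTQ paths
    use_test_data: bool = False  # documented default
    send_email: bool = True      # documented default

    def validate(self) -> None:
        """Raise ValueError if a documented requirement is not met."""
        if not self.project_name:
            raise ValueError("project_name is required")
        if not self.email:
            raise ValueError("email is required")
        if self.data_type not in ALLOWED_DATA_TYPES:
            raise ValueError(f"data_type must be one of {sorted(ALLOWED_DATA_TYPES)}")
        # files may be omitted only when use_test_data is true
        if not self.files and not self.use_test_data:
            raise ValueError("files are required unless use_test_data is true")

    def form_fields(self) -> Dict[str, str]:
        """Return the non-file form fields as strings (assumed encoding)."""
        self.validate()
        return {
            "project_name": self.project_name,
            "email": self.email,
            "data_type": self.data_type,
            "use_test_data": "true" if self.use_test_data else "false",
            "send_email": "true" if self.send_email else "false",
        }


if __name__ == "__main__":
    request = UploadRequest("demo", "user@example.com", "paired-end", use_test_data=True)
    print(request.form_fields())
```

With an HTTP client such as `requests`, the returned dictionary could be passed as `data=` and the FASTQ files as `files=` in a `POST` to `/api/jobs/upload/`; that step is left out here so the sketch runs without a server.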
Response:
{
"job_id": "uuid",
"status": "pending",
"message": "Job created successfully"
}
Get Job Status
GET /api/jobs/{job_id}/status/
Response:
{
"job_id": "uuid",
"status": "pending" | "processing" | "completed" | "failed",
"created_at": "timestamp",
"updated_at": "timestamp",
"completed_at": "timestamp | null",
"error_message": "string | null"
}
Get Job Details
GET /api/jobs/{job_id}/
Response:
{
"job_id": "uuid",
"project_name": "string",
"email": "string",
"status": "string",
"files": [...],
"result": {...}
}
Get Analysis Results
GET /api/jobs/{job_id}/results/
Response:
{
"report_html": "url",
"taxonomy_plot": "url",
"alpha_diversity_plot": "url",
"beta_diversity_plot": "url",
"execution_time": number
}
Get Bacteria Composition
GET /api/jobs/{job_id}/bacteria/
Response:
[
{
"genus": "Lactobacillus",
"family": "Lactobacillaceae",
"phylum": "Firmicutes",
"total_reads": 15234
},
...
]
Microbiome-ai-dev/
├── backend/microbiome-backend/ # Django backend
│ ├── analysis/ # Analysis app
│ ├── mysite/ # Django settings
│ ├── tests.py # Test suite
│ └── README.md # Backend docs
├── frontend/ # React frontend
│ ├── src/ # Source code
│ ├── tests/ # Test files
│ └── README.md # Frontend docs
├── docker/ # Docker configs
│ ├── Dockerfile.backend
│ ├── Dockerfile.frontend
│ └── docker-compose.yml
├── .github/workflows/ # CI/CD pipeline
│ └── ci.yml # GitHub Actions
├── deployment/ # Deployment guides
└── ci_cd/ # CI/CD documentation
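A client of the job status endpoint (`GET /api/jobs/{job_id}/status/`) typically polls until the job reaches a final state. The sketch below is one way to do that, under stated assumptions: `wait_for_job` is a hypothetical helper, the caller supplies `fetch_status` (any function that returns the decoded status response), and the interval and timeout values are illustrative rather than taken from the project.

```python
import time
from typing import Callable, Dict

# "completed" and "failed" are the final states among the documented status values.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(
    fetch_status: Callable[[], Dict],
    interval_s: float = 10.0,    # illustrative polling interval
    timeout_s: float = 1800.0,   # illustrative upper bound on waiting
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict:
    """Poll fetch_status() until the job is completed or failed.

    Returns the final status payload, or raises TimeoutError if the job
    is still pending/processing when the timeout expires.
    """
    deadline = clock() + timeout_s
    while True:
        payload = fetch_status()
        if payload["status"] in TERMINAL_STATUSES:
            return payload
        if clock() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated responses stand in for real HTTP calls.
    fake = iter([
        {"status": "pending"},
        {"status": "processing"},
        {"status": "completed", "error_message": None},
    ])
    print(wait_for_job(lambda: next(fake), interval_s=0.0))
```

Injecting `fetch_status`, `sleep`, and `clock` keeps the loop independent of any HTTP library and makes it easy to test without a running backend.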
Backend Tests (25 tests)
cd backend/microbiome-backend
python manage.py test
Frontend Tests (17 tests)
cd frontend
bun test
All Tests in CI
# Automatically run on every push
# View results: GitHub Actions tab
- Make changes to backend or frontend code
- Run tests locally to verify
- Commit and push to GitHub
- CI runs automatically - tests must pass
- Deploy (manual via Render dashboard or automatic with CD)
Render
- Free tier available
- PostgreSQL included
- Auto-deploy from GitHub
- Full Guide
Docker (local)
cd docker
docker-compose up -d
- Complete control
- No external dependencies
- Perfect for testing
AWS
- EC2 for backend
- S3 for storage
- Batch for pipeline execution
- See detailed AWS guide in README
- ✅ Backend: 25 tests (Models, API, Integration)
- ✅ Frontend: 17 tests (Components, Pages, Utils)
- ✅ Total: 42 tests, 100% passing
Unit Tests
- Model creation and validation
- API endpoint functionality
- Utility functions
Integration Tests
- Complete workflow: upload → process → results
- Database interactions
- Job isolation and concurrency
CI Pipeline
- Runs on every push
- Must pass before merge
- View CI Status
- Quality Control - FastQC on raw reads
- Primer Trimming - Cutadapt removes primers
- Denoising - DADA2 infers ASVs
- Chimera Removal - Filter chimeric sequences
- Taxonomy Assignment - GTDB database classification
- Diversity Analysis - Alpha & beta diversity metrics
- Visualization - Generate plots and reports
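To make the diversity step above more concrete, here is a small sketch of one common alpha metric (Shannon index) and one common beta metric (Bray-Curtis dissimilarity). It illustrates the definitions only: the source does not say which metrics the pipeline reports, and this is not the pipeline's own implementation. The function names are hypothetical.

```python
import math
from typing import Sequence


def shannon_index(counts: Sequence[float]) -> float:
    """Alpha diversity of one sample: H = -sum(p_i * ln(p_i)) over taxa with reads."""
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for count in counts:
        if count > 0:
            p = count / total
            h -= p * math.log(p)
    return h


def bray_curtis(a: Sequence[float], b: Sequence[float]) -> float:
    """Beta diversity between two samples: sum|a_i - b_i| / sum(a_i + b_i)."""
    if len(a) != len(b):
        raise ValueError("both samples must list the same taxa in the same order")
    denominator = sum(a) + sum(b)
    if denominator == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / denominator


if __name__ == "__main__":
    # Toy read counts per taxon, made up for illustration.
    sample_a = [30, 10, 0]
    sample_b = [10, 0, 50]
    print("Shannon A:", shannon_index(sample_a))
    print("Bray-Curtis A vs B:", bray_curtis(sample_a, sample_b))
```

A Shannon index of 0 means a single taxon holds all reads; a Bray-Curtis value of 0 means two samples have identical counts, and 1 means they share no taxa.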
- Default: Paired-end Illumina data
- Customizable via Nextflow config
- Supports single-end mode
- Configurable quality thresholds
- ASV_table.tsv - Abundance matrix
- ASV_tax.gtdb.tsv - Taxonomic assignments
- report.html - MultiQC summary
- Diversity plots (PNG/PDF)
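The abundance matrix and the taxonomy table listed above can be joined to get total reads per genus. The sketch below uses only the standard library and shows one possible approach. The column names (`ASV_ID`, `Genus`, one column per sample) are assumptions about the file layout, and the inline tables are made-up toy data, not real pipeline output.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List

# Toy stand-ins for ASV_table.tsv and ASV_tax.gtdb.tsv (assumed layout, invented values).
ASV_TABLE = (
    "ASV_ID\tsampleA\tsampleB\n"
    "asv1\t30\t10\n"
    "asv2\t10\t0\n"
    "asv3\t0\t50\n"
)
ASV_TAX = (
    "ASV_ID\tPhylum\tFamily\tGenus\n"
    "asv1\tFirmicutes\tLactobacillaceae\tLactobacillus\n"
    "asv2\tFirmicutes\tLactobacillaceae\tLactobacillus\n"
    "asv3\tBacteroidota\tBacteroidaceae\tBacteroides\n"
)


def read_tsv(text: str) -> List[Dict[str, str]]:
    """Parse tab-separated text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def reads_per_genus(table_rows: List[Dict[str, str]],
                    tax_rows: List[Dict[str, str]]) -> Dict[str, int]:
    """Sum read counts across all samples and group them by genus."""
    genus_of = {row["ASV_ID"]: row["Genus"] or "Unclassified" for row in tax_rows}
    totals: Dict[str, int] = defaultdict(int)
    for row in table_rows:
        # Every column except ASV_ID is treated as a per-sample read count.
        reads = sum(int(value) for key, value in row.items() if key != "ASV_ID")
        totals[genus_of.get(row["ASV_ID"], "Unclassified")] += reads
    return dict(totals)


if __name__ == "__main__":
    print(reads_per_genus(read_tsv(ASV_TABLE), read_tsv(ASV_TAX)))
```

To run this on real output, replace the inline strings with the contents of the two files and adjust the column names if they differ.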
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Ensure all tests pass
- Submit a pull request
This project is open source and available under the MIT License.
- nf-core/ampliseq - Nextflow pipeline
- DADA2 - ASV inference algorithm
- GTDB - Taxonomic database
- Lovable - Frontend scaffolding
For questions or support, please open an issue on GitHub.
