shivabioinformatics/variant-core

VariantCore: High-Performance Genomic Data Structures

VariantCore is a lightweight, memory-efficient library for parsing VCF and BED files. It is designed for clinical pipelines where data integrity and memory footprint are critical.

Why does this exist? (Engineering Philosophy)

Most ad-hoc bioinformatics scripts lack type safety and consume excessive memory. I built this library to demonstrate how Domain-Driven Design can improve pipeline reliability.

Installation

You can install VariantCore directly from GitHub using pip:

pip install git+https://github.com/shivabioinformatics/variant-core.git

Usage

Reading VCFs

from variant_core import VCFReader

# Lazy loading with generators keeps memory usage low
reader = VCFReader("data/sample.vcf")

for variant in reader:
    if variant.is_snp():
        print(f"Found SNP: {variant.chrom}:{variant.pos}")
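The lazy-loading idea in the comment above can be illustrated with a short generator. This is only a sketch under stated assumptions, not VariantCore's actual implementation: `SketchVariant` and `iter_variants` are hypothetical names, only the first five VCF columns are parsed, and a record is treated as a SNP when REF and every ALT allele are single bases.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass(frozen=True)
class SketchVariant:
    """Hypothetical immutable record; not VariantCore's actual class."""

    chrom: str
    pos: int  # 1-based position, as in the VCF specification
    ref: str
    alts: Tuple[str, ...]

    def is_snp(self) -> bool:
        # Assumption: a SNP has a single-base REF and only single-base ALTs.
        bases = set("ACGTN")
        return (
            self.ref.upper() in bases
            and len(self.alts) > 0
            and all(alt.upper() in bases for alt in self.alts)
        )


def iter_variants(lines: Iterable[str]) -> Iterator[SketchVariant]:
    """Yield one record at a time so only the current line is held in memory."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta-information lines, and the header row
        fields = line.split("\t")
        if len(fields) < 5:
            raise ValueError(f"Malformed VCF record: {line!r}")
        chrom, pos, _id, ref, alt = fields[:5]
        yield SketchVariant(chrom, int(pos), ref, tuple(alt.split(",")))
```

Because an open file handle is itself an iterable of lines, `iter_variants(open(path))` would stream a file record by record instead of reading it all at once. The frozen dataclass means a parsed record cannot be mutated by accident further down the pipeline.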

Reading BED Files

from variant_core import BEDReader

# Automatically handles whitespace-delimited fields and BED's 0-based, half-open coordinates
bed = BEDReader("data/targets.bed")

for region in bed:
    print(f"Target Region: {region}")
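BED intervals are 0-based and half-open: the start is inclusive and the end is exclusive. The sketch below shows one way that convention and whitespace-tolerant splitting could be handled. `SketchRegion` and `iter_regions` are hypothetical names introduced for illustration and do not describe how `BEDReader` works internally.

```python
from typing import Iterable, Iterator, NamedTuple


class SketchRegion(NamedTuple):
    """Hypothetical BED interval; not VariantCore's actual class."""

    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive

    @property
    def length(self) -> int:
        # Half-open intervals make the length a plain subtraction.
        return self.end - self.start

    def to_one_based(self) -> str:
        # Convert to the 1-based, inclusive form used by VCF positions.
        return f"{self.chrom}:{self.start + 1}-{self.end}"


def iter_regions(lines: Iterable[str]) -> Iterator[SketchRegion]:
    """Yield regions lazily, tolerating tabs or runs of spaces between fields."""
    for line in lines:
        fields = line.split()  # splits on any run of whitespace
        if not fields:
            continue  # blank line
        if fields[0] in {"track", "browser"} or fields[0].startswith("#"):
            continue  # header or comment line
        if len(fields) < 3:
            raise ValueError(f"Malformed BED line: {line!r}")
        start, end = int(fields[1]), int(fields[2])
        if start < 0 or end < start:
            raise ValueError(f"Invalid BED interval: {line!r}")
        yield SketchRegion(fields[0], start, end)
```

Keeping the 0-based values internally and converting only for display, as `to_one_based` does, avoids off-by-one errors when BED targets are compared against 1-based VCF positions.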

Testing

To run the test suite:

pip install -r requirements.txt
pytest

About

A memory-optimized, strictly-typed genomic parser for clinical pipelines
