Benchmarks

This directory contains a Snakemake pipeline for benchmarking BioNumPy against other tools and methods.

A benchmark on small data sets runs on every push, and a larger benchmark on bigger data runs nightly. The latest reports can be found here:

Reproducing the benchmark results

The reports can be generated by running the Snakemake pipeline in this directory. You will need to have Snakemake installed to do this.

Follow these steps to run the benchmarks:

  1. Clone this repository:
git clone git@github.com:bionumpy/bionumpy.git
  2. Install BioNumPy and some other Python packages:
cd bionumpy
pip install -r requirements_dev.txt
pip install .
  3. Run the benchmarks from this directory:
snakemake --use-conda --cores 1 report_small.md

Replace report_small.md with report_big.md to run on bigger data (takes ~15 minutes). The report_small.md and report_big.md files include the plots generated from the various benchmarks.
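
If you want to script the run, the target name can be derived from the data size. The following is a minimal sketch, assuming Snakemake is on the PATH and the command is launched from this directory. The helper name `snakemake_command` is hypothetical and not part of the pipeline; only the flags and target names shown in the steps above are used.

```python
def snakemake_command(size="small", cores=1):
    """Build the Snakemake invocation for the given data size.

    `size` selects the report target: report_small.md or report_big.md.
    This helper is illustrative only; it is not part of the pipeline.
    """
    if size not in ("small", "big"):
        raise ValueError(f"size must be 'small' or 'big', got {size!r}")
    return ["snakemake", "--use-conda", "--cores", str(cores), f"report_{size}.md"]


# To actually run the benchmark (requires Snakemake to be installed):
#   import subprocess
#   subprocess.run(snakemake_command("big"), check=True)
```

Returning the command as a list keeps it usable with `subprocess.run` without going through a shell.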

Adding tools/methods to the benchmark

Feel free to add other tools and methods and make a pull request.

Follow these guidelines/rules:

  • Specify dependencies using an isolated conda environment for each rule you implement. For instance, if you make a rule for running tool X, make a conda env file for that tool and pin the exact version.
  • For Python packages, you may instead add the Python dependencies to requirements_dev.txt if they don't conflict with any other Python dependencies.
  • Edit the config.yml file in this directory when you add a tool or a benchmark (follow the system defined there).
  • After you add a tool or a method to config.yml, it should automatically be included in the resulting report_small/big.md. Make sure that the rules for running a specific tool generate a benchmark file (see the other rules for examples).
  • If an analysis in config.yml has validate_equal: true, the pipeline asserts that the output from all tools is identical, using the diff command. This check is performed when you run snakemake validation_report_small/big.md.
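
Before opening a pull request, it can help to check locally that your tool's output matches the others. The pipeline itself uses the diff command; the sketch below is only a Python analogue of that check, and the function name `outputs_identical` and the example file names are hypothetical.

```python
import filecmp


def outputs_identical(paths):
    """Return True if all files in `paths` have byte-for-byte identical contents.

    Each file is compared against the first one, which is enough to show
    that all of them are equal. An empty or single-element list is
    trivially identical.
    """
    paths = list(paths)
    if len(paths) < 2:
        return True
    reference = paths[0]
    # shallow=False forces a content comparison instead of trusting os.stat().
    return all(filecmp.cmp(reference, other, shallow=False) for other in paths[1:])


# Example with hypothetical output files from two tools:
#   outputs_identical(["results/tool_x.txt", "results/bionumpy.txt"])
```

As with diff, any difference in whitespace or line order counts as a mismatch, so tools must write their output in exactly the same format for the validation to pass.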