This directory contains a Snakemake pipeline for benchmarking BioNumPy against some other tools and methods.
Benchmarking on small data sets is run on every push, and a bigger benchmark is run nightly on larger data sets. The latest reports can be found here:
The reports can be generated by running the Snakemake pipeline in this directory. You will need to have Snakemake installed to do this.
Follow these steps to run the benchmarks:
- Clone this repository:

  ```bash
  git clone git@github.com:bionumpy/bionumpy.git
  ```

- Install BioNumPy and some other Python packages:

  ```bash
  cd bionumpy
  pip install -r requirements_dev.txt
  pip install .
  ```
- Run the benchmarks:
  ```bash
  snakemake --use-conda --cores 1 report_small.md
  ```

  Change `report_small.md` to `report_big.md` to run on bigger data (takes ~15 minutes to run). The `report_small.md` and `report_big.md` files will include the plots generated from the various benchmarks.
Feel free to add other tools and methods and make a pull request.
Follow these guidelines/rules:
- Specify dependencies using isolated conda environments for the rule you implement. For instance, if you make a rule for running tool X, make a conda env file for that tool and pin the exact version.
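  As an illustration, such a conda env file might look like the sketch below. The tool name, version, and file path are assumptions for the example, not part of this repository:

  ```yaml
  # envs/seqkit.yml -- example env file for one rule (name, path and
  # version are illustrative; pick your own tool and pin its version)
  channels:
    - bioconda
    - conda-forge
  dependencies:
    - seqkit=2.3.1  # pin the exact version for reproducibility
  ```

  A rule can then reference this file via its `conda:` directive, and Snakemake will build the isolated environment when run with `--use-conda`.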
- For Python packages, you may instead add the Python dependencies to requirements_dev.txt if they don't conflict with any other Python dependencies.
- Edit the `config.yml` file in this directory when you add a tool or a benchmark (follow the system defined there).
- After adding a tool or a method to `config.yml`, that method should be automatically included in the resulting `report_small/big.md`. Make sure that rules for running a specific tool generate a benchmark file (see the other rules for examples).
- If an analysis in `config.yml` has `validate_equal: true`, it will be asserted that the output from all tools is identical using the `diff` command. This is asserted when you run `snakemake validation_report_small/big.md`.
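A rule following these guidelines might look like the sketch below. The rule name, paths, and shell command are hypothetical; adapt them to the conventions used by the other rules and to the system defined in `config.yml`:

```snakemake
# Hypothetical rule for a tool X: isolated conda env plus a benchmark file,
# so its timings can be picked up by report_small/big.md.
rule run_tool_x:
    input:
        "data/{size}/sequences.fa"
    output:
        "results/tool_x/{size}/output.txt"
    conda:
        "envs/tool_x.yml"  # env file pinning the tool's exact version
    benchmark:
        "benchmarks/tool_x/{size}.txt"  # benchmark file used by the report
    shell:
        "tool_x {input} > {output}"
```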