alihkz94/long-chimeric-reads-project

"Are we throwing away good data? Evaluation of chimera detection algorithms on long-read amplicons reveals high false positive rates across algorithms" (Hakimzadeh et al. 2025)

Structure

This repository contains the data and part of the analysis stack for the paper cited above. It is structured as follows:

Simulated data holds the scripts for the simulated dataset: generating the simulated data, creating chimeric sequences, quality filtering, and chimera filtering. It also holds the statistical analysis scripts used to calculate the F1 score.

Real data holds scripts related to real data analysis.

BlasCh contains the BLAST scripts for alignment and the BlasCh module, which processes the BLAST XML outputs to find false positive and false negative chimeras.

Figures & tables contains the scripts used for generating the graphs and tables.
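For readers unfamiliar with the F1 score mentioned under Simulated data, the following is a minimal sketch of how it is conventionally computed from chimera-detection counts. It is not taken from the repository's scripts; the function names `precision_recall` and `f1_score` and the example counts are hypothetical. Here a true positive is assumed to be a chimeric read flagged as chimeric, a false positive a non-chimeric read flagged as chimeric, and a false negative a chimeric read that was missed.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall); each is 0.0 when its denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical counts for illustration only (not results from the paper).
example_f1 = f1_score(tp=80, fp=20, fn=20)
```

A high false positive count lowers precision and therefore F1, which is why F1 is a reasonable summary when the concern is good reads being discarded as chimeras.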

The workflow we followed for the real dataset is shown in the figure below:

[Figure: workflow for real dataset]
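As an illustration of the kind of BLAST XML processing mentioned for the BlasCh folder, the sketch below reads BLAST XML output (the format written by BLAST+ with `-outfmt 5`) and reports percent identity and query coverage for the first HSP of each query. This is not the BlasCh module and does not reproduce its classification rules; the function name `top_hit_stats` is hypothetical, and only the Python standard library is used.

```python
import xml.etree.ElementTree as ET


def top_hit_stats(xml_text: str) -> dict:
    """Map each query to identity/coverage of its first HSP, or None if it has no hits."""
    root = ET.fromstring(xml_text)
    stats = {}
    for iteration in root.iter("Iteration"):
        query = iteration.findtext("Iteration_query-def")
        query_len = int(iteration.findtext("Iteration_query-len"))
        # First HSP of the first hit that has one, in document order.
        hsp = iteration.find("Iteration_hits/Hit/Hit_hsps/Hsp")
        if hsp is None:
            stats[query] = None
            continue
        identity = int(hsp.findtext("Hsp_identity"))
        align_len = int(hsp.findtext("Hsp_align-len"))
        q_from = int(hsp.findtext("Hsp_query-from"))
        q_to = int(hsp.findtext("Hsp_query-to"))
        stats[query] = {
            "pct_identity": 100.0 * identity / align_len,
            "query_coverage": 100.0 * (abs(q_to - q_from) + 1) / query_len,
        }
    return stats
```

A read whose best hit covers only part of its length is one plausible signal of a chimera, but any thresholds on these values would be assumptions; consult the BlasCh scripts for the criteria actually used.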

About

Scripts for filtering and evaluating chimeric sequences
