Tags: ACEnglish/truvari
Tags
Fixing BNDs. Improving Refine See https://github.com/ACEnglish/truvari/wiki/Updates#truvari-510
- Reference context sequence comparison is now deprecated and sequenc… …e similarity calculation improved by also checking lexicographically minimum rotation's similarity. details - Symbolic variants (<DEL>, <INV>, <DUP>) can now be resolved for sequence comparison when a --reference is provided. The function for resolving the sequences is largely similar to this discussion - Symbolic variants can now match to resolved variants, even with --pctseq 0, with or without the new sequence resolving procedure. - Symbolic variant sub-types are ignored e.g. <DUP:TANDEM> == <DUP> - --sizemax now default to -1, meaning all variant ≥ --sizemin / --sizefilt are compared - Redundant variants which are collapsed into kept (a.k.a. removed) variants now more clearly labeled (--removed-output instead of --collapsed-output) - Fixed 'Unknown error' caused by unset TMPDIR (#229 and #245) - Fixes to minor record keeping bugs in refine/ga4gh better ensure all variants are counted/preserved - BND variants are now compared by bench (details) - Cleaner outputs by not writing matching annotations (e.g. PctSeqSimilarity) that are None - Major refactor of Truvari package API for easy reuse of SV comparison functions (details)
Small bug fixes bench - Correctly filtering ALT=* alleles and monomorphic reference stratify - Default behavior is to count variants within collapse - Faster sub-chunking operations by dropping use of pyintervaltree anno chunks - New command for identifying windows with a high number of SVs
Minor bug fixes * `refine` & `stratify` * Fixed variant and bed boundary overlapping issue * general * Definition of variants within a region now includes replacement style from TR callers having an anchor base 1bp upstream of catalog/includebed regions * Propagating MAFFT errors (#204) * FIPS compliance (#205) * Allow csi-indexed vcf files (#209) * bcftools sort tempfile (#213)
* collapse * Fewer comparisons needed per-chunk on average * Fixed --chain functionality (details) * Fixed --gt consolidation of format fields * bench * Faster result at the cost of less complete annotations with --short flag * refine * Assures variants are sequence resolved before incorporating into consensus * bench --passonly --sizemax parameters are used when building consensus for a region. Useful for refine --use-original-vcfs * When a refined region has more than 5k variants, it is skipped and a warning of the region is written to the log * Flag --use-region-coords now expands --region coordinates by 100bp (phab --buffer default) to allow variants to harmonize out of regions. * general * Dynamic bed/vcf parsing tries to choose faster of streaming/fetching variants
Changes to regions and collapse - Off by one error was mis-classifying some variants at the ends of regions as within the region. - Collapse's handling of genotypes (`--gt` and `--keep common` is faster) - New ga4gh command for reformatting `bench` results
PreviousNext