v0.1.0a1

Pre-release

@maskedsyntax released this 02 Oct 19:31
· 57 commits to main since this release

Improved correlation checks and reduced false positives in missing patterns

Improvements

  • Refined correlation checks in calculate_correlations
    • Fixed type inference errors by iterating over analyzer.column_types instead of analyzer.df
    • Updated mixed-variable (numeric vs. categorical) thresholds to {'warning': 0.5, 'critical': 0.8} for consistency with the Cramér’s V thresholds
    • Ensured seamless integration with run_checks
  • Reduced over-flagging in missing patterns detection
    • Introduced effect size thresholds:
      • Categorical: Cramer’s V > 0.1
      • Numeric: Cohen’s d > 0.2
    • Tightened p-value threshold to 0.01
    • Increased minimum samples per group to 10
    • Replaced ANOVA (f_oneway) with the Mann-Whitney U test, which is rank-based and does not assume normality, so it handles skewed distributions better
    • Added pattern grouping to summarize correlations per missing column (top 3 shown for conciseness)
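
The numeric branch of the missing-patterns check described above could look roughly like the sketch below. It is not the library's actual code: the function and constant names (`cohens_d`, `is_missingness_related`, `P_VALUE_THRESHOLD`, and so on) are hypothetical, and the pooled-standard-deviation form of Cohen's d is an assumption. Only the thresholds (p < 0.01, at least 10 samples per group, d > 0.2) and the use of the Mann-Whitney U test come from these notes.

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Thresholds taken from the release notes; the names are hypothetical.
P_VALUE_THRESHOLD = 0.01
MIN_GROUP_SIZE = 10
COHENS_D_THRESHOLD = 0.2


def cohens_d(a, b):
    """Absolute Cohen's d using the pooled standard deviation (assumed form)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)) / (n_a + n_b - 2)
    if pooled_var == 0:
        return 0.0
    return abs(a.mean() - b.mean()) / np.sqrt(pooled_var)


def is_missingness_related(values, missing_mask):
    """Return True if `values` differ between rows where another column is
    missing and rows where it is present, by both significance and effect size."""
    values = np.asarray(values, dtype=float)
    missing_mask = np.asarray(missing_mask, dtype=bool)
    observed = ~np.isnan(values)  # ignore NaNs in the column being compared

    group_missing = values[missing_mask & observed]
    group_present = values[~missing_mask & observed]

    # Skip the test when either group is too small to be reliable.
    if min(len(group_missing), len(group_present)) < MIN_GROUP_SIZE:
        return False

    _, p_value = mannwhitneyu(group_missing, group_present, alternative="two-sided")
    effect = cohens_d(group_missing, group_present)

    # Require both a small p-value and a non-trivial effect size.
    return bool(p_value < P_VALUE_THRESHOLD and effect > COHENS_D_THRESHOLD)
```

Requiring the effect-size condition in addition to the p-value is what keeps large datasets from flagging statistically significant but practically negligible differences.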

Fixes

  • Corrected correlation dictionary iteration (analyzer.column_types)
  • Prevented spurious warnings by filtering weak associations
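
As an illustration of the two fixes, the sketch below iterates over a column-type mapping (rather than over the DataFrame itself) and drops associations that fall below the warning threshold. It is a simplified stand-in, not the project's implementation: `cramers_v`, `classify`, and `categorical_flags` are hypothetical names, the mapping is assumed to be a plain `{column_name: type_label}` dict, the thresholds are assumed to apply to Cramér's V with `>=` comparisons, and the statistic is computed without bias correction.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Thresholds from the release notes; applying them with >= is an assumption.
THRESHOLDS = {"warning": 0.5, "critical": 0.8}


def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length sequences of categories."""
    n = len(xs)
    x_counts = Counter(xs)
    y_counts = Counter(ys)
    k = min(len(x_counts), len(y_counts))
    if n == 0 or k < 2:
        return 0.0
    joint = Counter(zip(xs, ys))
    chi2 = 0.0
    for x, nx in x_counts.items():
        for y, ny in y_counts.items():
            expected = nx * ny / n
            chi2 += (joint.get((x, y), 0) - expected) ** 2 / expected
    return sqrt(chi2 / (n * (k - 1)))


def classify(strength):
    """Map an association strength to a severity level, or None if weak."""
    if strength >= THRESHOLDS["critical"]:
        return "critical"
    if strength >= THRESHOLDS["warning"]:
        return "warning"
    return None


def categorical_flags(data, column_types):
    """Flag strongly associated categorical pairs.

    `column_types` is a hypothetical {name: "categorical" | "numeric"} mapping;
    iterating over it avoids re-inferring types from the raw data.
    """
    categorical = [name for name, kind in column_types.items() if kind == "categorical"]
    flags = {}
    for a, b in combinations(categorical, 2):
        level = classify(cramers_v(data[a], data[b]))
        if level is not None:  # weak associations are filtered out
            flags[(a, b)] = level
    return flags
```

With this shape, a pair of columns only produces a warning once its association strength reaches 0.5, and numeric columns are never passed to the categorical test.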