Commit 727cb46

chore: release v0.1.0a3 with updated documentation
1 parent f244bf8 commit 727cb46

2 files changed

Lines changed: 31 additions & 5 deletions

README.md

Lines changed: 30 additions & 4 deletions
@@ -45,11 +45,37 @@ Key features include:
 
 - **Intelligent Profiling**: Detect missing values, skewed distributions, outliers, and data type inconsistencies.
 - **ML-Specific Checks**: Identify data leakage, dataset drift, class imbalance, and high-cardinality features.
-- **Automated Preparation**: Get suggestions for encoding, imputation, scaling, and transformations, and optionally apply them automatically.
-- **Rich Reporting**: Generate statistical summaries and exportable reports for collaboration.
-- **Production-Ready Pipelines**: Output reproducible cleaning and preprocessing code that integrates seamlessly with ML workflows.
+- **Automated Preparation**: Get suggestions for encoding, imputation, scaling, and transformations.
+- **Rich Reporting**: Generate statistical summaries and exportable reports (HTML/PDF/Markdown/JSON) with embedded visualizations.
+- **Production-Ready Pipelines**: Output reproducible cleaning and preprocessing code (`fixes.py`) that integrates seamlessly with ML workflows.
+- **Modern Themes**: Choose between "Minimal" (professional) and "Neubrutalism" (bold) report styles.
 
-HashPrep turns dataset debugging into a guided, automated process - saving time, improving model reliability, and standardizing best practices across teams.
+---
+
+## Usage
+
+### 1. Quick Scan
+Get a quick summary of critical issues in your terminal.
+```bash
+hashprep scan dataset.csv
+```
+
+### 2. Generate Report
+Generate a comprehensive HTML report with visualizations.
+```bash
+hashprep report dataset.csv --format html --theme minimal
+```
+
+**Options:**
+- `--theme`: `minimal` (default) or `neubrutalism`
+- `--format`: `html`, `pdf`, `md`, or `json`
+- `--no-visualizations`: Disable plot generation for faster performance.
+
+### 3. Generate Fixes
+Automatically generate a Python script (`dataset_fixes.py`) to apply suggested fixes.
+```bash
+hashprep report dataset.csv --with-code
+```
 
 ---
 
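The `hashprep report` invocations added to the README can also be assembled from a script. The following is a minimal sketch, assuming only the flags and values listed in the new Usage section (`--format`, `--theme`, `--no-visualizations`, `--with-code`). The helper name `build_report_command` is hypothetical and not part of HashPrep. Flags are emitted only when explicitly requested, so nothing is assumed about the CLI's defaults beyond what the README states.

```python
import shlex

# Values listed under "Options" in the README Usage section.
THEMES = ("minimal", "neubrutalism")
FORMATS = ("html", "pdf", "md", "json")


def build_report_command(dataset, fmt=None, theme=None,
                         visualizations=True, with_code=False):
    """Return an argv list for `hashprep report` (hypothetical helper)."""
    if fmt is not None and fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    if theme is not None and theme not in THEMES:
        raise ValueError(f"unsupported theme: {theme!r}")

    argv = ["hashprep", "report", dataset]
    if fmt is not None:
        argv += ["--format", fmt]
    if theme is not None:
        argv += ["--theme", theme]
    if not visualizations:
        argv.append("--no-visualizations")
    if with_code:
        argv.append("--with-code")
    return argv


if __name__ == "__main__":
    # Reproduces the "Generate Report" example from the README.
    cmd = build_report_command("dataset.csv", fmt="html", theme="minimal")
    print(shlex.join(cmd))
```

With HashPrep installed, the returned list could be passed to `subprocess.run(cmd, check=True)`. Building an argv list rather than a shell string avoids quoting problems when dataset paths contain spaces.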

hashprep/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
 from .core.analyzer import DatasetAnalyzer
 
-__version__ = "0.1.0a2"
+__version__ = "0.1.0a3"
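The bump from `0.1.0a2` to `0.1.0a3` moves to the next alpha pre-release of the same `0.1.0` version. As a rough sanity check, the sketch below compares the two strings. It assumes versions of the exact form `X.Y.ZaN` and is not a full PEP 440 parser. The name `alpha_key` is hypothetical. Real release tooling would more likely rely on `packaging.version.Version`.

```python
import re

# Matches only alpha pre-releases such as "0.1.0a3" (a simplifying assumption).
_ALPHA = re.compile(r"^(\d+)\.(\d+)\.(\d+)a(\d+)$")


def alpha_key(version):
    """Return a sortable tuple (major, minor, patch, alpha) for 'X.Y.ZaN'."""
    match = _ALPHA.match(version)
    if match is None:
        raise ValueError(f"not an X.Y.ZaN version: {version!r}")
    return tuple(int(part) for part in match.groups())


if __name__ == "__main__":
    old, new = "0.1.0a2", "0.1.0a3"
    # The new version should sort strictly after the old one.
    print(alpha_key(old) < alpha_key(new))
```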
