An AI-driven CSV data quality analyzer
Upload → Analyze → Fix → Export, all in seconds!
Live Demo • Documentation • Report Bug • Request Feature
Try it now: https://smart-csv-health-checker.streamlit.app/
| Traditional Tools | Smart CSV Health Checker |
|---|---|
| Manual inspection | AI-powered auto-detection |
| Hours of work | Results in seconds |
| Misses hidden issues | Finds complex anomalies |
| No fix suggestions | One-click auto-fix |
| Basic statistics | Deep profiling & PCA |
| No code export | Exports Python code |
| Tab | Description |
|---|---|
| Overview | Quick summary with health score, issue breakdown, and key metrics |
| AI Deep Dive | Advanced ML-powered anomaly detection and insights |
| Fix Data | One-click fixes for missing values, outliers, and formatting |
| Pipeline | Build custom data cleaning pipelines |
| Visualizations | Interactive charts, distributions, and heatmaps |
| PCA Analysis | Dimensionality reduction and component analysis |
| Code Export | Get Python code for all transformations |
| Deep Profile | PII detection and sensitive data scanning |
| Compare | Side-by-side dataset comparison |
| Synthetic Data | Generate realistic test data |
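
As a rough illustration of what the Fix Data tab's fixes for missing values and outliers might look like, here is a minimal sketch using only the Python standard library. The function names `fill_missing_with_median` and `clip_outliers_iqr` are hypothetical, and the median and IQR strategies are assumptions; the app's actual fixes may work differently.

```python
import statistics
from typing import List, Optional


def fill_missing_with_median(values: List[Optional[float]]) -> List[float]:
    """Replace None entries with the median of the values that are present."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("column has no non-missing values")
    median = statistics.median(present)
    return [median if v is None else v for v in values]


def clip_outliers_iqr(values: List[float], k: float = 1.5) -> List[float]:
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR] to the nearest bound."""
    # statistics.quantiles (Python 3.8+) returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


# Hypothetical column with one missing value and one obvious outlier.
column = [10.0, 12.0, None, 11.0, 13.0, 250.0, 12.0]
filled = fill_missing_with_median(column)
cleaned = clip_outliers_iqr(filled)
```

Clipping keeps the row count unchanged, which is often preferable to dropping rows when other columns in the same row are still valid.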
Enterprise-Grade Security
├── Email/Password Authentication
├── User Registration with Verification
├── Secure Password Reset
├── User Profile Management
└── Session Management
Powered by Supabase: enterprise-grade authentication and database.
- Dark Mode: Easy on the eyes
- Glassmorphism Design: Modern and sleek
- Responsive: Works on any device
- Animated Elements: Smooth interactions
- Gradient Accents: Professional look
- Python 3.8+
- pip package manager
# 1. Clone the repository
git clone https://github.com/Prajwal18py/SMART-CSV-HEALTH-CHECKER.git
# 2. Navigate to directory
cd SMART-CSV-HEALTH-CHECKER
# 3. Create virtual environment
python -m venv venv
# 4. Activate virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate
# 5. Install dependencies
pip install -r requirements.txt
# 6. Run the app
streamlit run app.py

Create .streamlit/secrets.toml:
[supabase]
url = "your-supabase-url"
key = "your-supabase-anon-key"
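
Both values must be replaced with the real ones from your Supabase project before authentication can work. The check below is a minimal sketch of how the configuration could be validated at startup. The helper name `validate_supabase_secrets` is hypothetical and not part of the project; it takes a plain mapping so it runs without Streamlit. In a Streamlit app, `st.secrets["supabase"]` exposes the `[supabase]` table as a mapping that could be passed in.

```python
from typing import Mapping

# The placeholder strings shown in the secrets.toml template above.
PLACEHOLDERS = {"your-supabase-url", "your-supabase-anon-key"}


def validate_supabase_secrets(section: Mapping[str, str]) -> None:
    """Raise ValueError if the [supabase] table is incomplete or unedited."""
    for name in ("url", "key"):
        value = (section.get(name) or "").strip()
        if not value:
            raise ValueError(f"missing '{name}' in the [supabase] section")
        if value in PLACEHOLDERS:
            raise ValueError(f"'{name}' still contains the placeholder value")


# Hypothetical values for illustration only.
validate_supabase_secrets({"url": "https://example.supabase.co", "key": "abc123"})
```

Failing fast with a clear message tends to be easier to debug than an authentication error surfacing later on the login page.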
These are just a few highlights. Explore the full app to discover:
- Custom Data Pipelines
- PCA Analysis
- Code Export
- Deep Profiling with PII Detection
- Dataset Comparison
- Synthetic Data Generation
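
To give a flavor of what PII detection during deep profiling can involve, here is a minimal regex-based sketch. The patterns and the `scan_for_pii` helper are illustrative assumptions, not the app's actual detectors; real PII scanning needs much broader coverage.

```python
import re
from typing import Dict, List, Set

# Deliberately simple patterns; they will miss many valid formats.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def scan_for_pii(rows: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Map each column name to the sorted list of PII kinds found in it."""
    found: Dict[str, Set[str]] = {}
    for row in rows:
        for column, value in row.items():
            if not value:
                continue
            for kind, pattern in PII_PATTERNS.items():
                if pattern.search(value):
                    found.setdefault(column, set()).add(kind)
    return {column: sorted(kinds) for column, kinds in found.items()}


# Made-up sample rows for illustration.
rows = [
    {"name": "Ada", "contact": "ada@example.com", "notes": "call +1 555 010 9999"},
    {"name": "Bob", "contact": "bob@example.org", "notes": ""},
]
report = scan_for_pii(rows)
```

Reporting per column rather than per cell keeps the output short enough to review before deciding which columns to mask or drop.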
smart-csv-health-checker/
│
├── app.py                        # Main application entry point
│
├── auth/                         # Authentication module
│   ├── __init__.py
│   ├── auth_functions.py         # Supabase auth functions
│   └── login.py                  # Login page UI
│
├── core/                         # Core functionality
│   ├── analysis.py               # AI analysis engine
│   ├── data_loader.py            # CSV loading & validation
│   └── type_detection.py         # Column type detection
│
├── tabs/                         # Application tabs
│   ├── tab_overview.py           # Overview tab
│   ├── tab_ai_deep_dive.py       # AI analysis tab
│   ├── tab_fix_data.py           # Data fixing tab
│   ├── tab_pipeline.py           # Pipeline builder
│   ├── tab_visualizations.py     # Charts & graphs
│   ├── tab_pca.py                # PCA analysis
│   ├── tab_code.py               # Code export
│   ├── tab_deep_profile.py       # Deep profiling
│   ├── tab_compare.py            # Dataset comparison
│   └── tab_synthetic.py          # Synthetic data
│
├── ui/                           # UI components
│   ├── layout.py                 # Page layout
│   ├── styles.py                 # Custom CSS
│   └── sidebar.py                # Sidebar component
│
├── database/                     # Database module
│   ├── __init__.py
│   ├── db_functions.py           # Database operations
│   └── schema.sql                # Database schema
│
├── config/                       # Configuration
│   └── supabase_config.py        # Supabase client
│
├── .streamlit/                   # Streamlit config
│   └── secrets.toml              # API keys (gitignored)
│
├── requirements.txt              # Python dependencies
├── README.md                     # This file
└── LICENSE                       # MIT License
- Missing Value Detection
- Duplicate Row Detection
- Outlier Identification
- Data Type Validation
- Format Consistency
- Range Validation
- Pattern Anomalies
- Correlation Analysis
- PII Detection
- Statistical Profiling
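
The first two checks in this list, missing values and duplicate rows, can be sketched with the standard library alone. This is an illustrative sketch, not the app's implementation: the helper names and the set of strings treated as "missing" are assumptions.

```python
import csv
import io
from collections import Counter
from typing import Dict, List

# Strings treated as missing (an assumption; adjust for your data).
MISSING_TOKENS = {"", "na", "n/a", "null", "none", "nan"}


def load_rows(csv_text: str) -> List[Dict[str, str]]:
    """Parse CSV text into one dict per data row, keyed by header name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def count_missing(rows: List[Dict[str, str]]) -> Dict[str, int]:
    """Count missing cells per column."""
    counts: Counter = Counter()
    for row in rows:
        for column, value in row.items():
            if column is None:  # extra cells beyond the header; skipped here
                continue
            if value is None or value.strip().lower() in MISSING_TOKENS:
                counts[column] += 1
    return dict(counts)


def find_duplicate_rows(rows: List[Dict[str, str]]) -> List[int]:
    """Return 0-based indices of rows that exactly repeat an earlier row."""
    seen = set()
    duplicates = []
    for index, row in enumerate(rows):
        key = tuple((k, str(v)) for k, v in row.items())
        if key in seen:
            duplicates.append(index)
        else:
            seen.add(key)
    return duplicates


# Made-up sample data for illustration.
sample = """id,name,age
1,Ada,36
2,Bob,
3,,41
1,Ada,36
"""
rows = load_rows(sample)
missing = count_missing(rows)
duplicates = find_duplicate_rows(rows)
```

Returning row indices for duplicates, rather than just a count, makes it straightforward to preview the affected rows before removing them.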
- Numeric: int, float, currency, percentage
- Text: string, categorical, free-text
- DateTime: date, time, datetime, timestamp
- Identifiers: email, phone, ID, UUID
- Web: URL, IP address, domain
- Location: address, coordinates, postal code
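
Column type detection along these lines can be approximated by testing every non-empty value in a column against a series of checks. The sketch below covers only a few of the categories above, and its function names, regular expressions, and the single ISO date format are all assumptions made for illustration; `core/type_detection.py` may use a different approach.

```python
import re
from datetime import datetime
from typing import Iterable, Optional

EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")
URL_RE = re.compile(r"^https?://\S+$")


def _is_number(text: str) -> bool:
    """True if the text parses as a float once thousands separators are removed."""
    try:
        float(text.replace(",", ""))
        return True
    except ValueError:
        return False


def _is_iso_date(text: str) -> bool:
    """True if the text is a date in YYYY-MM-DD form."""
    try:
        datetime.strptime(text, "%Y-%m-%d")
        return True
    except ValueError:
        return False


def detect_column_type(values: Iterable[Optional[str]]) -> str:
    """Return a coarse label when all non-empty values agree, else 'text'."""
    cleaned = [v.strip() for v in values if v and v.strip()]
    if not cleaned:
        return "empty"
    checks = [
        ("numeric", _is_number),
        ("datetime", _is_iso_date),
        ("email", lambda v: bool(EMAIL_RE.match(v))),
        ("url", lambda v: bool(URL_RE.match(v))),
    ]
    for label, check in checks:
        if all(check(v) for v in cleaned):
            return label
    return "text"


# Made-up sample column for illustration.
label = detect_column_type(["1", "2.5", "1,000", ""])
```

Requiring every non-empty value to pass is strict on purpose: a column that is mostly numeric but contains a stray string falls back to `"text"`, which is itself a useful signal of a data quality problem.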
Contributions are what make the open-source community amazing! Any contributions you make are greatly appreciated.
# 1. Fork the Project
# 2. Create your Feature Branch
git checkout -b feature/AmazingFeature
# 3. Commit your Changes
git commit -m 'Add some AmazingFeature'
# 4. Push to the Branch
git push origin feature/AmazingFeature
# 5. Open a Pull Request

Distributed under the MIT License. See LICENSE for more information.
MIT License
Copyright (c) 2026 Prajwal.A
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
- Streamlit: Web framework
- Supabase: Backend as a Service
- Scikit-learn: ML algorithms
- Pandas: Data manipulation
- Plotly: Interactive visualizations