YugynDprodigy10/aiops-data-validator

🚀 AI Ops Data Validator

AI-powered schema validation for scientific mission datasets (NASA PDS4, XML, JSON, CSV)



📖 Overview

AI Ops Data Validator is a Python tool that validates scientific datasets against official schemas (NASA PDS4, JSON Schema, CSV metadata).
It goes beyond schema checks: the built-in AI reasoning layer explains validation errors in plain English and suggests concrete fixes, making it easier for researchers to debug and correct data.

✨ Key Features

  • XML Validation — supports XSD and Schematron (PDS4-compliant).
  • JSON Validation — fully compliant with JSON Schema Draft 2020-12.
  • Human-readable reports — outputs Markdown/HTML summaries for researchers.
  • AI Reasoning Layer — groups issues, explains them in plain English, and suggests fixes.
  • Extensible — designed for anomaly detection and automated fix suggestions.
  • CLI Tool — run validations directly from the terminal.

📸 Demo

Example CLI Run

$ aiops-validate examples/bad_label.xml --kind xml --xsd schemas/pds4.xsd --schematron schemas/pds4.sch

Wrote report.md

Example Report (Markdown Snippet)

# Validation Report — bad_label.xml

**Summary:** FAIL  
Errors: 3 | Warnings: 1  

1. ERROR — schema  
- Path: `/Product_Observational/Observation_Area`  
- Message: Missing child element required by XSD.  
- Suggested fix: Add `<Observation_Area>` element according to schema.  
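A report in this shape can be produced from a flat list of findings. The sketch below is illustrative only: `Issue` and `render_report` are hypothetical names, not the project's actual classes, and a plain hyphen stands in for the dash used in the headings.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Issue:
    """One validation finding (hypothetical model, not the project's real class)."""
    severity: str            # "ERROR" or "WARNING"
    source: str              # e.g. "schema"
    path: str                # location of the problem in the document
    message: str
    suggested_fix: str = ""


def render_report(filename: str, issues: List[Issue]) -> str:
    """Render findings as Markdown, roughly in the layout of the snippet above."""
    errors = sum(1 for i in issues if i.severity == "ERROR")
    warnings = sum(1 for i in issues if i.severity == "WARNING")
    status = "FAIL" if errors else "PASS"  # assumption: any error fails the run
    lines = [
        f"# Validation Report - {filename}",
        "",
        f"**Summary:** {status}  ",
        f"Errors: {errors} | Warnings: {warnings}  ",
        "",
    ]
    for number, issue in enumerate(issues, start=1):
        lines.append(f"{number}. {issue.severity} - {issue.source}  ")
        lines.append(f"- Path: `{issue.path}`  ")
        lines.append(f"- Message: {issue.message}  ")
        if issue.suggested_fix:
            lines.append(f"- Suggested fix: {issue.suggested_fix}  ")
        lines.append("")
    return "\n".join(lines)
```

Keeping the findings as plain data makes it straightforward to emit the same list as HTML or JSON later.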

🖼️ Screenshots

  • Streamlit UI: drag & drop a file and run validation.
  • HTML report: errors, warnings, explanations, and suggested fixes.


⚡ Quick Start

Install

git clone https://github.com/YugynDprodigy10/aiops-data-validator.git
cd aiops-data-validator
pip install -r requirements.txt

Validate XML

python -m aiops_validator.cli validate examples/bad_label.xml --kind xml --xsd schemas/pds4.xsd --schematron schemas/pds4.sch

Validate JSON

python -m aiops_validator.cli validate examples/sample.json --kind json --json-schema schemas/schema.json
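To call the validator from a pipeline script, the same commands can be assembled in Python. This is a minimal sketch: the flags are the ones shown above, while `build_validate_command` and `run_validation` are hypothetical helper names. The CLI's exit-code behavior is not documented here, so the return code is passed through without interpretation.

```python
import subprocess
import sys
from typing import List, Optional


def build_validate_command(
    path: str,
    kind: str,
    xsd: Optional[str] = None,
    schematron: Optional[str] = None,
    json_schema: Optional[str] = None,
) -> List[str]:
    """Build the argument list for `python -m aiops_validator.cli validate`."""
    cmd = [sys.executable, "-m", "aiops_validator.cli", "validate", path, "--kind", kind]
    if xsd:
        cmd += ["--xsd", xsd]
    if schematron:
        cmd += ["--schematron", schematron]
    if json_schema:
        cmd += ["--json-schema", json_schema]
    return cmd


def run_validation(cmd: List[str]) -> int:
    """Run the command and return its exit code unchanged (meaning is an assumption)."""
    return subprocess.run(cmd, check=False).returncode
```

Passing the arguments as a list avoids shell quoting problems when file paths contain spaces.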

🏗️ Architecture

aiops_validator/
  ├── core/         # Models, reasoner, reporting
  ├── validators/   # XML, JSON, CSV validators
  ├── fixes/        # Suggested fix generation
  ├── templates/    # Report templates
  └── cli.py        # Command-line entrypoint

  • Validators: Handle schema-level checks (XSD/JSON Schema).
  • Reasoner: Interprets raw logs, produces plain-English explanations.
  • Reporter: Outputs Markdown/HTML/JSON reports.
  • Fixes: Suggests patches or snippets to resolve issues.
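The hand-off from validator to reasoner can be pictured with a small standard-library sketch. It is not the project's code: `validate_required_children` is a simplified stand-in for real XSD validation, `explain` is a stand-in for the reasoner, and the explanation table is invented for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict, List, Tuple

# (code, path, raw message): a hypothetical shape for one raw finding
Finding = Tuple[str, str, str]


def validate_required_children(xml_text: str, required: List[str]) -> List[Finding]:
    """Stand-in validator: report required child elements missing from the root.

    Real PDS4 labels are namespaced; this toy check assumes plain tag names.
    """
    root = ET.fromstring(xml_text)
    present = {child.tag for child in root}
    return [
        ("missing-element", f"/{root.tag}/{name}", f"element '{name}' not found")
        for name in required
        if name not in present
    ]


# Hypothetical lookup from finding code to a plain-English explanation.
EXPLANATIONS = {
    "missing-element": "A required element is missing from the label.",
}


def explain(findings: List[Finding]) -> Dict[str, List[str]]:
    """Stand-in reasoner: group finding paths under a shared explanation."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for code, path, _raw_message in findings:
        grouped[EXPLANATIONS.get(code, "Unrecognized issue.")].append(path)
    return dict(grouped)
```

Grouping by explanation keeps a report short when one root cause produces many raw findings.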

✅ Roadmap

  • Add CSV validation (via frictionless or pandera).
  • Implement anomaly detection (range checks, statistical outliers).
  • Enable automated fixes with JSON Patch / XML transformations.
  • Dockerize for deployment in research pipelines.
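As a rough illustration of the planned anomaly detection, range checks and outlier flagging need only the standard library. This is a sketch under assumptions, not a committed design: the function names are hypothetical, and Tukey's 1.5 × IQR rule is one common choice of outlier test among several.

```python
import statistics
from typing import List, Sequence


def out_of_range(values: Sequence[float], low: float, high: float) -> List[int]:
    """Return indices of values outside the closed interval [low, high]."""
    return [i for i, v in enumerate(values) if not (low <= v <= high)]


def iqr_outliers(values: Sequence[float], k: float = 1.5) -> List[int]:
    """Return indices of values more than k * IQR beyond the quartiles.

    Uses statistics.quantiles (Python 3.8+), which needs at least two values.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    lower, upper = q1 - k * spread, q3 + k * spread
    return [i for i, v in enumerate(values) if v < lower or v > upper]
```

Both functions return indices rather than values so that a report can point back to the offending rows.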

🤝 Contributing

Pull requests welcome! See CONTRIBUTING.md.


📜 License

MIT © 2025 Eugene Taabazuing


👨‍💻 Author

Eugene Taabazuing (LinkedIn)
