🔍 About the Project
We were inspired by a simple yet often overlooked truth: AI models are only as fair as the data they're trained on. Biased datasets can lead to real-world consequences — from misdiagnoses in healthcare to discriminatory hiring tools. Yet, most teams jump into model development without first assessing the fairness of their data. We wanted to change that by making bias detection an accessible and essential first step.
🤖 What We Built
We created Diversity IQ — an AI-powered fairness analysis agent that helps teams catch imbalance before training begins. Users can upload any CSV dataset and, in just seconds, the tool evaluates whether categorical features (like gender or ethnicity) are fairly represented.
It returns a structured JSON output that includes:
- 📊 Category-wise distribution stats
- ⚖️ A fairness verdict ("Equally distributed" or not)
- 💡 Actionable suggestions to improve dataset representation
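For illustration, a response for a single feature might look like this (field names and values here are hypothetical, not the project's actual schema):

```json
{
  "column": "gender",
  "distribution": { "female": 0.31, "male": 0.69 },
  "verdict": "Not equally distributed",
  "suggestions": ["Add more 'female' samples to balance the split."]
}
```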
The backend uses FastAPI and OpenAI’s API for intelligent reasoning, while the React frontend visualizes the analysis with clear charts and insights — no coding required.
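As a rough illustration of the statistical part of the analysis (the function name `analyze_column` and the 10% tolerance are our own choices for this sketch, not the project's actual code), the distribution check an endpoint might wrap could look like:

```python
from collections import Counter

def analyze_column(values, tolerance=0.10):
    """Compute per-category shares for one categorical column and flag it
    as imbalanced if any share deviates from a uniform split by more
    than `tolerance`."""
    counts = Counter(values)
    total = sum(counts.values())
    expected = 1 / len(counts)  # share each category would hold if uniform
    distribution = {cat: round(n / total, 3) for cat, n in counts.items()}
    fair = all(abs(share - expected) <= tolerance
               for share in distribution.values())
    return {
        "distribution": distribution,
        "verdict": "Equally distributed" if fair else "Imbalanced",
    }

# Example: a 3:1 gender split is flagged as imbalanced.
print(analyze_column(["F", "M", "M", "M"]))
```

In the real tool, the verdict and suggestions come from the OpenAI-backed reasoning step rather than a fixed threshold alone; this sketch only shows the shape of the purely statistical check.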
🧠 What We Learned
Building this tool taught us how often biases go unnoticed, even in familiar datasets. We realized the value of having a system that not only highlights statistical imbalances but also explains them in human terms. By automating fairness checks, we give teams a chance to build more ethical AI from the ground up.
We also saw how valuable immediate visual feedback is: it helps developers and non-technical users alike see where a dataset is falling short.
🚧 Challenges We Faced
- ⚙️ Accurately identifying categorical columns across different kinds of datasets
- 📉 Defining thresholds to decide when a distribution is considered unfair
- 🧾 Formatting outputs to be both machine-parseable (JSON) and human-readable
- ⏱️ Keeping performance fast, especially when dealing with large files or multiple categories
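One simple heuristic for the first challenge (this is our own sketch under assumed parameters, not the project's implementation) is to treat a column as categorical when its number of distinct values is small relative to the number of rows:

```python
import csv
import io

def find_categorical_columns(csv_text, max_unique_ratio=0.2):
    """Guess which columns are categorical by comparing the number of
    distinct values in each column to the number of rows."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return []
    n = len(rows)
    categorical = []
    for col in rows[0]:
        distinct = {row[col] for row in rows}
        if len(distinct) / n <= max_unique_ratio:
            categorical.append(col)
    return categorical

# Example: 'gender' has 2 values over 10 rows and is kept;
# 'id' is unique per row and is skipped.
sample = "id,gender\n" + "\n".join(
    f"{i},{'M' if i % 2 else 'F'}" for i in range(10)
)
print(find_categorical_columns(sample))  # → ['gender']
```

The `max_unique_ratio` cutoff is exactly the kind of threshold the second bullet refers to: too loose and numeric IDs get flagged as categories, too strict and legitimate high-cardinality features (like ethnicity with many groups) are missed.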

