Inspiration

Online platforms are flooded with noisy reviews: ads disguised as feedback, irrelevant stories, and rants from people who never visited. This noise undermines trust and makes it hard for genuine experiences to stand out. We wanted to build a system that could automatically separate useful reviews from noise at scale.

What it does

ReviewShield is an AI-powered moderation pipeline for location-based reviews. It classifies each review into one of four categories:

Advertisement — promo codes, self-promotion, links

Irrelevant — off-topic content

No-visit rant — complaints from people who never visited

None — valid, on-topic review

This helps platforms automatically filter or flag problematic reviews, improving overall trustworthiness.

How we built it

Data ingestion & cleaning: Parsed ~500k Google Reviews (CSV + JSON) into a consistent format, yielding ~265k usable rows. For the hackathon demo, we downsampled to ~1,000 rows for faster iteration.
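
The cleaning step can be sketched roughly as below, using only the standard library. The field names (`text`, `rating`) and the helper functions are illustrative assumptions, not the project's actual schema:

```python
import csv
import io
import json

def normalize_record(raw):
    """Map one raw review (from CSV or JSON) onto a common schema.

    Returns None for unusable rows (e.g. missing review text),
    mirroring the cleaning pass that kept ~265k of ~500k rows.
    """
    text = (raw.get("text") or "").strip()
    if not text:
        return None  # drop reviews with no text
    return {"text": text, "rating": raw.get("rating")}

def load_reviews(csv_str, json_str):
    """Parse CSV and JSON sources into one list of normalized rows."""
    rows = [normalize_record(r) for r in csv.DictReader(io.StringIO(csv_str))]
    rows += [normalize_record(r) for r in json.loads(json_str)]
    return [r for r in rows if r is not None]
```

Downsampling for the demo is then just a random sample over the normalized rows.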

Policy definition: Wrote a clear policy.md describing the violation categories (Advertisement, Irrelevant, No-visit rant) plus “None.”

Pseudo-labeling: Used GPT-4o to generate ~5k pseudo-labeled examples, which served as training data for the baseline classifier.

Classical ML baseline: Built a TF-IDF + Logistic Regression classifier trained on pseudo-labels.
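
The baseline can be sketched with scikit-learn. The tiny inline dataset here is an illustrative stand-in for the ~5k GPT-4o pseudo-labels:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy pseudo-labeled examples standing in for the GPT-4o training set.
texts = [
    "Use promo code SAVE20 at checkout, link in bio!",
    "Visit my channel for daily deals and discounts",
    "My cat learned a new trick yesterday",
    "This post is about my vacation, not this place",
    "Never been here but I heard the owner is rude",
    "I refuse to go, people say it is terrible",
    "Great pasta and friendly staff, will return",
    "Cozy atmosphere and quick service",
]
labels = [
    "Advertisement", "Advertisement",
    "Irrelevant", "Irrelevant",
    "No-visit rant", "No-visit rant",
    "None", "None",
]

# TF-IDF features feeding a logistic regression classifier.
clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
clf.fit(texts, labels)

pred = clf.predict(["Discount code inside, follow the link"])[0]
```

The same `clf.predict` call scales to batches of thousands of reviews, which is what made this baseline fast enough for the full dataset.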

Zero-shot experiments: Tested Hugging Face zero-shot models (facebook/bart-large-mnli, MoritzLaurer/deberta-v3-large-zeroshot-v2.0) for direct classification without training. Useful as a baseline but too slow and noisy for large-scale runs.

Evaluation: Used scikit-learn to compute precision, recall, F1, and plot confusion matrices for validation.
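
The evaluation step can be sketched as below, on hypothetical validation labels and predictions (the real run additionally plotted the confusion matrix):

```python
from sklearn.metrics import confusion_matrix, precision_recall_fscore_support

CATEGORIES = ["Advertisement", "Irrelevant", "No-visit rant", "None"]

# Hypothetical validation labels vs. model predictions.
y_true = ["Advertisement", "Irrelevant", "None", "None", "No-visit rant", "None"]
y_pred = ["Advertisement", "None", "None", "None", "No-visit rant", "Irrelevant"]

precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, labels=CATEGORIES, average="macro", zero_division=0
)
cm = confusion_matrix(y_true, y_pred, labels=CATEGORIES)

print(f"macro P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
print(cm)  # rows: true class, columns: predicted class
```

Passing `labels=CATEGORIES` keeps the confusion matrix rows/columns in a fixed order even when a class is missing from a batch.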

Challenges we ran into

Parsing large JSON files with inconsistent fields (some reviews missing text).

Running zero-shot classification at scale — 265k reviews took hours, and labels were inaccurate.

Prompt engineering: designing prompts and few-shot examples that the LLM interprets consistently.

Accomplishments that we're proud of

Built a full end-to-end NLP moderation pipeline in just a few days.

Successfully processed and cleaned hundreds of thousands of reviews.

Created a clear and reusable policy framework for review moderation.

Achieved working baselines and evaluation metrics without needing human-labeled data.

What we learned

How to combine classical ML + modern LLMs for complementary strengths.

The importance of data cleaning and preprocessing when dealing with real-world datasets.

Practical trade-offs between zero-shot, few-shot, and supervised learning in time-constrained settings.

That prompt design can dramatically affect results even with the same model.

What's next for ReviewShield

Fine-tune transformer models (e.g. DistilBERT) on pseudo-labels for higher accuracy.

Expand categories to detect spam, offensive language, or biased reviews.

Integrate into a dashboard or API for real-time review moderation.

Explore human-in-the-loop validation for critical cases.
