Inspiration

With the rise of user-generated content on platforms like YouTube, comment sections have become a vital space for feedback, discussion, and community building. They are also often cluttered with toxicity, irrelevant chatter, or sheer volume. commenTrix was born out of the need for a semantic dashboard that helps content creators, moderators, and other users understand the emotional tone, relevance, and thematic structure of their comment sections using modern NLP techniques.

What it does

commenTrix is a semantic analysis dashboard designed for YouTube comments. It automatically performs:

Ternary Sentiment Classification (positive, negative, neutral)

Emotion Detection (e.g., joy, anger, fear, surprise)

Aspect-Based Sentiment Analysis to identify sentiments linked to specific topics or features

Toxicity Detection to highlight harmful or offensive content

All insights are visualized in an interactive frontend that enables users to filter, explore, and make sense of their comment ecosystem efficiently.
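To make the four outputs concrete, here is a minimal sketch of what a per-comment record and a dashboard-style filter could look like. The class name, field names, and helper functions are hypothetical and chosen for illustration; they are not taken from the commenTrix codebase.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class CommentAnalysis:
    """Hypothetical per-comment record; field names are illustrative only."""
    text: str
    sentiment: str   # "positive", "negative", or "neutral"
    emotion: str     # e.g. "joy", "anger", "fear", "surprise"
    aspects: dict = field(default_factory=dict)  # aspect -> sentiment label
    toxic: bool = False


def filter_comments(records, sentiment=None, hide_toxic=False):
    """Return the records that match the chosen dashboard filters."""
    return [
        r for r in records
        if (sentiment is None or r.sentiment == sentiment)
        and not (hide_toxic and r.toxic)
    ]


def sentiment_counts(records):
    """Count comments per sentiment label, e.g. to feed a chart."""
    return Counter(r.sentiment for r in records)


# Hand-written sample data, not real model output.
records = [
    CommentAnalysis("Great editing!", "positive", "joy", {"editing": "positive"}),
    CommentAnalysis("Audio is too quiet.", "negative", "anger", {"audio": "negative"}),
    CommentAnalysis("You are an idiot.", "negative", "anger", toxic=True),
]

negative_non_toxic = filter_comments(records, sentiment="negative", hide_toxic=True)
```

Keeping all four analyses on one record means a single filter pass can combine them, for example showing only negative comments that are not toxic.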

How we built it

We designed commenTrix with a modular architecture:

Backend (Python): Built with FastAPI and integrated with NLP models from Hugging Face Transformers for sentiment classification, emotion detection, and aspect extraction.

Frontend (React.js): Displays the dashboard with charts, filters, and keyword highlights using Styled Components and Recharts.

Database: Comment data is stored and queried through a Cloudflare-hosted PostgreSQL setup.

YouTube API: Fetches comments and metadata using video links provided by users.

Multi-task NLP pipeline: We trained and fine-tuned models so that the analyses run together in one pipeline, keeping latency and resource use down.
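Since comments are fetched from video links that users provide, the first step is to turn a link into a video ID. The sketch below shows one way to do that with the Python standard library. The function name `extract_video_id` and the set of URL shapes it accepts are assumptions for illustration, not the project's actual code, and the IDs in the usage lines are made up.

```python
from urllib.parse import parse_qs, urlparse


def extract_video_id(url):
    """Return the video ID from common YouTube URL shapes, or None."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    if host == "youtu.be":
        # Short links: https://youtu.be/<id>
        candidate = parsed.path.lstrip("/").split("/")[0]
        return candidate or None

    if host in {"youtube.com", "m.youtube.com"}:
        if parsed.path == "/watch":
            # Standard links: https://www.youtube.com/watch?v=<id>
            values = parse_qs(parsed.query).get("v")
            return values[0] if values else None
        parts = parsed.path.strip("/").split("/")
        if len(parts) >= 2 and parts[0] in {"shorts", "embed", "live"}:
            return parts[1]

    # Unknown host, missing scheme, or unsupported path shape.
    return None


watch_id = extract_video_id("https://www.youtube.com/watch?v=abc123XYZ_-&t=10s")
short_id = extract_video_id("https://youtu.be/abc123XYZ_-?t=5")
```

The resulting ID is what a comment-fetching call would take as input. Links without a scheme (for example `youtube.com/watch?v=...`) return `None` in this sketch, so a real form handler would probably want to normalize input first.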

Challenges we ran into

Balancing model performance and inference speed for large-scale comment sections

Integrating multi-task learning pipelines without excessive latency

Parsing and cleaning YouTube comment data, which often contains slang, emojis, and mixed languages

Ensuring frontend reactivity and scalability for dynamic filtering and display
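As an illustration of the cleaning challenge, here is a minimal normalization sketch using only the standard library. It assumes raw comments may contain HTML entities, `<br>` tags, links, and stretched characters; the function name and the specific rules are hypothetical choices, not the pipeline commenTrix actually uses. Emojis and non-English text are deliberately kept, since they can carry sentiment.

```python
import html
import re
import unicodedata

URL_RE = re.compile(r"https?://\S+")
BR_RE = re.compile(r"<br\s*/?>", re.IGNORECASE)
REPEAT_RE = re.compile(r"(\D)\1{2,}")  # 3+ repeats of one non-digit character
SPACE_RE = re.compile(r"\s+")


def clean_comment(text):
    """Normalize one raw comment while keeping emojis and non-English text."""
    text = html.unescape(text)                  # "&amp;" -> "&"
    text = BR_RE.sub(" ", text)                 # line-break tags -> spaces
    text = unicodedata.normalize("NFKC", text)  # fold width/compatibility variants
    text = URL_RE.sub(" ", text)                # drop links
    text = REPEAT_RE.sub(r"\1\1", text)         # "Sooooo" -> "Soo", "!!!!" -> "!!"
    return SPACE_RE.sub(" ", text).strip()      # collapse whitespace


raw = "Sooooo good!!!! &amp; fun<br>https://example.com/x  😂😂😂😂"
cleaned = clean_comment(raw)
```

Capping repeats at two characters instead of one keeps some signal of emphasis ("soo" versus "so") while shrinking the vocabulary the models see. Digits are excluded so that numbers such as "1000" are left intact.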

Accomplishments that we're proud of

Successfully integrated multiple NLP tasks into a unified, multi-task inference model

Built a complete end-to-end system with both backend APIs and a clean frontend dashboard

Created a solution designed to scale from small to very large comment sections while still producing meaningful analytics

What we learned

The power of multi-task NLP models in reducing resource overhead

Best practices in building scalable frontend-backend pipelines

Real-world challenges of working with user-generated data and making it interpretable

How to use cloud services and deployment tools such as Cloudflare and Vercel to build a robust architecture

What's next for commenTrix

Add support for multilingual comment analysis

Improve real-time comment monitoring for creators and moderators

Launch a Chrome extension for on-page semantic insights

Expand to other platforms like Twitter and Reddit

Offer custom model fine-tuning for niche domains (e.g., education, finance, gaming)
