Best Design at PennApps XXVI🏆
A project built to collect, deduplicate, cluster, and summarize breaking news stories from across the web, mitigating the biases of individual news sources.
DEMO - https://www.youtube.com/watch?v=rZpY3STlQxs
noogie automatically fetches articles from major outlets, groups similar stories, and generates concise AI-powered headlines.
It helps users see the big picture of what's happening by filtering out the opinions of individual news sources.

- Multi-Source Collection – Pulls stories via RSS and scraping from a variety of news sources.
- Clean Parsing – Uses newspaper4k to extract titles and content, filtering out empty or paywalled pages.
- Smart Deduplication – Computes sentence embeddings (sentence-transformers) and stores them in a FAISS vector index to find near-duplicate articles.
- Clustering & Summarization – Groups similar articles and calls OpenAI/Gemini to generate short, human-readable cluster headlines.
- Scalable Design – Designed for multithreading (planned) so multiple feeds can be fetched concurrently.
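
The deduplication and clustering steps above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: `group_articles`, both thresholds, and the toy 3-d vectors are assumptions made for the example. In the real pipeline the vectors would come from sentence-transformers, and a FAISS index would presumably replace the linear scan used here.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def group_articles(embeddings, dup_threshold=0.95, cluster_threshold=0.75):
    """Drop near-duplicates, then greedily cluster what remains.

    Both thresholds are illustrative guesses, not tuned values.
    Returns (kept, clusters): indices that survived deduplication and
    a list of clusters, each a list of indices.
    """
    kept = []
    for i, vec in enumerate(embeddings):
        # Keep an article only if it is not too close to one already kept.
        if all(cosine(vec, embeddings[j]) < dup_threshold for j in kept):
            kept.append(i)

    clusters = []
    for i in kept:
        for cluster in clusters:
            # Compare against the cluster's first member (its seed).
            if cosine(embeddings[i], embeddings[cluster[0]]) >= cluster_threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])  # no cluster was close enough: start a new one
    return kept, clusters

# Toy stand-ins for sentence embeddings.
vectors = [
    [1.0, 0.0, 0.0],  # story A
    [1.0, 0.0, 0.0],  # verbatim repost of A, dropped as a duplicate
    [0.8, 0.6, 0.0],  # different article about the same event
    [0.0, 0.0, 1.0],  # unrelated story
]
kept, clusters = group_articles(vectors)
print(kept, clusters)  # [0, 2, 3] [[0, 2], [3]]
```

Each resulting cluster would then be passed to the language model to produce one headline.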

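Since concurrent fetching is listed as planned, here is one way it could look using Python's standard `concurrent.futures`. It is a sketch under assumptions: `fetch_all` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for whatever RSS or scraping call the server actually makes.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def fetch_all(feed_urls, fetch_one, max_workers=8):
    """Run fetch_one(url) for every feed on a thread pool.

    Returns (results, errors), both dicts keyed by feed URL.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch_one, url): url for url in feed_urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:
                # One failing feed should not stop the others.
                errors[url] = exc
    return results, errors

def fake_fetch(url):
    """Stand-in for the real fetcher; fails for URLs containing 'broken'."""
    if "broken" in url:
        raise ValueError("feed unavailable")
    return [f"article from {url}"]

results, errors = fetch_all(
    ["https://example.com/a.rss", "https://example.com/broken.rss"],
    fake_fetch,
)
print(sorted(results), sorted(errors))
```

Threads suit this workload because fetching feeds is I/O-bound, so the workers spend most of their time waiting on the network.
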
- TypeScript + React with d3.js for node visualization
- Python newspaper4k to extract news sources & flask for the servers
- OpenAI's GPT 4o-mini for summaries
- Supabase for storage
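
cd front-end
npm install
npm run dev

cd server
python -m venv .venv  # .venv is an assumed directory name; any name works
source .venv/bin/activate  # on Windows: .venv\Scripts\activate
pip install -r requirements.txt
python server.py
python main.py
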
- GET api/clusters
- GET api/articles
- GET api/clusters/:cluster_id/articles
- GET api/clusters/:cluster_id
- GET api/articles/:article_id
- POST api/clusters/batch
- POST api/data/bulk
- POST api/clusters/:cluster_id/articles/batch
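
As a rough sketch of calling the GET routes above from Python with only the standard library: `BASE_URL` assumes the Flask server runs on its default development port (5000), and the helper names and the JSON response format are assumptions, since the response shapes are not documented here.

```python
import json
from urllib.request import urlopen

BASE_URL = "http://localhost:5000"  # assumption: Flask's default dev port

def endpoint(*parts):
    """Join path segments under api/, e.g. endpoint('clusters', 42, 'articles')."""
    return BASE_URL + "/api/" + "/".join(str(p) for p in parts)

def get_json(url):
    """GET a URL and decode the body as JSON (assumes the server returns JSON)."""
    with urlopen(url) as response:
        return json.load(response)

def cluster_articles(cluster_id):
    """Fetch one cluster's articles via GET api/clusters/:cluster_id/articles."""
    return get_json(endpoint("clusters", cluster_id, "articles"))

print(endpoint("clusters"))                  # http://localhost:5000/api/clusters
print(endpoint("clusters", 42, "articles"))  # http://localhost:5000/api/clusters/42/articles
```

With the server running, `cluster_articles(42)` would return whatever the server sends back for that cluster.
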
A special thanks to PennApps 2025.
Copyright (c) 2025 Jatin Punjabi, Andres Lopez, Ruslan Akmyradov
