A Retrieval-Augmented Generation (RAG) system for fetching, indexing, and summarizing news articles using local AI models.
RAG News Summarizer demonstrates a complete RAG pipeline, from data ingestion to semantic search to LLM-powered summarization, all running locally on your machine.
This project was intentionally built as a simple, educational implementation of RAG concepts.
Modern AI applications increasingly rely on RAG architectures, but many tutorials are either:
- Too abstract (theory without working code)
- Too complex (production systems with overwhelming features)
- Cloud-dependent (requiring API keys and paid services)
RAG News Summarizer fills the gap by providing:
- A complete, working implementation you can run locally
- Clean, readable code that's easy to understand and modify
- No API keys required: it uses local models only
- A foundation to build upon for your own projects
This is ideal for learning, portfolio projects, or as a starting point for production systems.
- Semantic Search: find relevant articles using vector similarity
- Multi-Source Ingestion: fetches from BBC, Reuters, TechCrunch, and more
- Local LLM Integration: runs entirely on your machine with Ollama
- Persistent Storage: ChromaDB vector database for efficient retrieval
- Modern Web UI: Streamlit interface with real-time updates
- Privacy-First: no data leaves your machine
```
┌───────────────────────────────────────────────────────────────┐
│                       Streamlit Web UI                        │
└───────────────────────────────┬───────────────────────────────┘
                                │
                                ▼
┌───────────────────────────────────────────────────────────────┐
│                    RAG Pipeline (LangChain)                   │
│  ┌───────────────┐   ┌────────────────┐   ┌────────────────┐  │
│  │ News Fetcher  │   │  Vector Store  │   │   LLM Chain    │  │
│  │ (RSS Parser)  │   │   (ChromaDB)   │   │    (Ollama)    │  │
│  └───────────────┘   └────────────────┘   └────────────────┘  │
└──────────┬───────────────────┬────────────────────┬───────────┘
           ▼                   ▼                    ▼
    ┌────────────┐     ┌──────────────┐      ┌─────────────┐
    │ RSS Feeds  │     │   Sentence   │      │   Ollama    │
    │   (Free)   │     │ Transformers │      │ (Llama 3.2) │
    └────────────┘     └──────────────┘      └─────────────┘
```
For detailed architecture documentation, see docs/ARCHITECTURE.md.
- Python 3.10 or higher
- Ollama installed
```bash
# 1. Clone the repository
git clone https://github.com/charanpool/rag-news-summarizer.git
cd rag-news-summarizer

# 2. Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Set up Ollama (in a separate terminal)
ollama pull llama3.2
ollama serve

# 5. Run the application
streamlit run app/main.py
```

Open your browser at http://localhost:8501.
- Click "Fetch & Index News" in the sidebar
- Wait for articles to be fetched and indexed
- Ask a question like "What's the latest in AI?"
- View the AI-generated summary with source citations
```
RAG_News_Summarizer/
├── app/                    # Application code
│   ├── __init__.py
│   ├── config.py           # Configuration settings
│   ├── embeddings.py       # Embedding generation
│   ├── main.py             # Streamlit web interface
│   ├── news_fetcher.py     # RSS feed parser
│   ├── rag_chain.py        # RAG pipeline logic
│   └── vector_store.py     # ChromaDB operations
├── data/                   # Runtime data (gitignored)
│   └── chroma_db/          # Vector database storage
├── docs/                   # Documentation
│   ├── ARCHITECTURE.md     # Technical architecture
│   └── CONCEPTS.md         # Core RAG concepts
├── tests/                  # Test suite
│   └── test_rag.py
├── .env.example            # Environment template
├── .gitignore
├── CONTRIBUTING.md         # Contribution guidelines
├── LICENSE                 # MIT License
├── README.md               # This file
├── ROADMAP.md              # Future enhancement plans
└── requirements.txt        # Python dependencies
```
| Component | Technology | Purpose |
|---|---|---|
| Orchestration | LangChain | RAG pipeline framework |
| Embeddings | sentence-transformers | Text vectorization (local) |
| Vector DB | ChromaDB | Similarity search storage |
| LLM | Ollama (Llama 3.2) | Text generation (local) |
| Web UI | Streamlit | Interactive interface |
| News Source | RSS Feeds | Free news data |
RSS Feeds → Parse Articles → Chunk Text → Generate Embeddings → Store in ChromaDB

User Query → Embed Query → Semantic Search → Retrieve Top-K Articles

Retrieved Context + User Query → LLM Prompt → Generated Summary → Display
For a deeper dive, see docs/CONCEPTS.md.
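To make those three stages concrete, here is a minimal, dependency-free sketch of the same flow: character chunking with overlap, a toy bag-of-words "embedding", and cosine-similarity retrieval. The real pipeline uses sentence-transformers and ChromaDB instead; everything below is illustrative.

```python
import math
from collections import Counter

def chunk_text(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping character chunks (cf. CHUNK_SIZE / CHUNK_OVERLAP)."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, chunks: list[str], top_k: int = 5) -> list[str]:
    """Semantic search: rank chunks by similarity to the query (cf. TOP_K_RESULTS)."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:top_k]

# Ingestion: articles → chunks → "embeddings" (computed lazily here)
articles = [
    "OpenAI released a new language model with improved reasoning.",
    "The central bank raised interest rates again this quarter.",
    "Researchers trained a model on satellite imagery of crops.",
]
chunks = [c for article in articles for c in chunk_text(article, size=60, overlap=15)]

# Query: embed the question, retrieve the most relevant context for the LLM prompt
context = retrieve("new language model news", chunks, top_k=2)
print(context[0])
```

In the real system, `embed` is a sentence-transformer model, the chunk store is ChromaDB, and the retrieved context is stuffed into an Ollama prompt rather than printed.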
Key settings in `app/config.py`:

| Setting | Default | Description |
|---|---|---|
| `EMBEDDING_MODEL` | `all-MiniLM-L6-v2` | Sentence-transformer model |
| `OLLAMA_MODEL` | `llama3.2` | Local LLM model |
| `CHUNK_SIZE` | `1000` | Text chunk size (characters) |
| `CHUNK_OVERLAP` | `200` | Overlap between chunks |
| `TOP_K_RESULTS` | `5` | Documents to retrieve |
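A settings module along these lines would expose the defaults from the table while letting environment variables (for example, loaded from `.env` by a helper such as python-dotenv) override them. The structure is illustrative, not the actual contents of `app/config.py`:

```python
import os

# Defaults mirror the table above; any can be overridden via the
# environment (e.g. loaded from .env). Illustrative sketch only.
EMBEDDING_MODEL = os.getenv("EMBEDDING_MODEL", "all-MiniLM-L6-v2")
OLLAMA_MODEL = os.getenv("OLLAMA_MODEL", "llama3.2")
CHUNK_SIZE = int(os.getenv("CHUNK_SIZE", "1000"))
CHUNK_OVERLAP = int(os.getenv("CHUNK_OVERLAP", "200"))
TOP_K_RESULTS = int(os.getenv("TOP_K_RESULTS", "5"))
```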
Override via `.env` file:

```bash
cp .env.example .env
# Edit .env with your preferences
```

| Source | Category |
|---|---|
| BBC World | World News |
| Reuters | World News |
| Al Jazeera | World News |
| NPR News | World News |
| The Guardian | World News |
| BBC Tech | Technology |
| TechCrunch | Technology |
| Hacker News | Technology |
| Ars Technica | Technology |
| The Verge | Technology |
| Wired | Technology |
| Science Daily | Science |
| NASA | Science |
| CNBC | Business |
| Bloomberg | Business |
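Ingesting any of these sources boils down to standard RSS 2.0 parsing: pull `title`, `link`, and `description` out of each `<item>`. A dependency-free sketch using only the standard library follows (the project's own fetcher may use a dedicated parser instead, and the sample feed is made up):

```python
import xml.etree.ElementTree as ET

# A tiny hand-written RSS 2.0 document standing in for a live feed.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item>
      <title>AI breakthrough announced</title>
      <link>https://example.com/ai</link>
      <description>Researchers unveil a new model.</description>
    </item>
    <item>
      <title>Markets rally on earnings</title>
      <link>https://example.com/markets</link>
      <description>Stocks climbed broadly.</description>
    </item>
  </channel>
</rss>"""

def parse_rss(xml_text: str) -> list[dict]:
    """Extract title/link/description from each <item> in an RSS 2.0 feed."""
    root = ET.fromstring(xml_text)
    return [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        }
        for item in root.iter("item")
    ]

articles = parse_rss(SAMPLE_RSS)
print([a["title"] for a in articles])
```

Fetching a live feed would replace `SAMPLE_RSS` with the response body from, say, `urllib.request.urlopen(url)`.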
Add custom sources in `app/config.py`:

```python
RSS_FEEDS = {
    "Your Source": "https://example.com/rss/feed.xml",
}
```

```bash
# Run all tests
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=app --cov-report=html
```

This project follows intentionally lean design principles:
| Principle | Implementation |
|---|---|
| Simplicity | Minimal dependencies, clear code structure |
| Local-First | No external API calls required |
| Educational | Well-documented, easy to understand |
| Extensible | Clean abstractions for modification |
| Stateless Core | No database beyond vector store |
This project skips a few things on purpose to stay beginner-friendly:
- User authentication
- Cloud deployments
- Complex caching layers
- Microservices architecture
- Kubernetes configurations
Want to add these? Check out ROADMAP.md for ideas!
```bash
# Start Ollama server
ollama serve

# Verify it's running
curl http://localhost:11434/api/tags
```

```bash
# Pull the required model
ollama pull llama3.2
```

- Use a smaller embedding model in config
- Reduce `CHUNK_SIZE`
- Use a smaller Ollama model: `llama3.2:1b`

Ensure you're running from the project root:

```bash
cd /path/to/RAG_News_Summarizer
source venv/bin/activate
streamlit run app/main.py
```

| Document | Description |
|---|---|
| CONCEPTS.md | Core RAG concepts explained |
| ARCHITECTURE.md | Technical architecture details |
| ROADMAP.md | Future enhancement plans |
| CONTRIBUTING.md | How to contribute |
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
Areas where help is appreciated:
- Improving heuristic parsing
- Adding test fixtures
- Documentation improvements
- UI/UX enhancements
See ROADMAP.md for planned features including:
- Hybrid search (semantic + keyword)
- Multiple LLM provider support
- REST API endpoints
- Docker containerization
- Enhanced UI features
This project is licensed under the MIT License; see the LICENSE file for details.
- Free for personal and commercial use
- Modify and distribute freely
- Attribution required
- LangChain: RAG framework
- Ollama: local LLM inference
- ChromaDB: vector storage
- Sentence-Transformers: embeddings
- Streamlit: web interface
If you find this project useful:
- Star the repository
- Report issues you encounter
- Share ideas for improvements
- Contribute code or documentation

Built with ❤️ for learning and demonstration purposes