
# 📡 Newscrux

AI-powered news aggregator with structured multilingual summaries and push notifications



## What It Does

Newscrux monitors 13 AI/ML RSS feeds, filters articles for relevance with AI, extracts full article content when needed, generates structured summaries in your chosen language, and delivers them as rich push notifications to your phone via Pushover, or prints them to the terminal with `--no-push`.

Every notification tells you what happened, why it matters, and one key detail — in English, Turkish, German, French, or Spanish.

You can choose between OpenRouter (cloud-based AI) or Ollama (local AI) for all AI operations, ensuring privacy and cost control when using local models.


## Notification Examples

**English (`--lang=en`):**

```
Title: OpenAI announces enterprise agent toolkit

📰 TechCrunch AI

What happened: OpenAI released a new suite of tools for building
enterprise-grade autonomous agents, including improved function
calling, a persistent memory API, and a new orchestration layer.

Why it matters: This could significantly accelerate agent-based
automation in large organizations by reducing integration complexity.

💡 Initial access is being rolled out to select enterprise customers.
```

**Turkish (`--lang=tr`):**

```
Title: AGI'ye doğru ilerlemeyi ölçmek: Bilişsel bir çerçeve

📰 Google DeepMind

Ne oldu: Google DeepMind, yapay genel zeka (AGI) yolunda ilerlemeyi
değerlendirmek için bilişsel bilim temelli bir çerçeve yayınladı.
10 temel bilişsel yeteneği tanımlıyor ve AI sistemlerinin yeteneklerini
sınıflandırmaya yönelik bir taksonomi sunuyor.

Neden önemli: Bu çerçeve, AI sistemlerinin genel zeka yeteneklerini
bilişsel perspektiften değerlendirmek için ortak bir temel sağlayabilir.

💡 200.000 dolar ödüllü Kaggle hackathonu başlatıldı.
```

## Features

- 🤖 **Flexible AI providers** — Use OpenRouter (cloud) or Ollama (local) for AI operations
- 🧠 **Structured summaries** — What happened + Why it matters + Key detail, generated by AI
- 📰 **13 RSS sources** — OpenAI, Google AI, DeepMind, TechCrunch, arXiv, and more
- 🔍 **AI relevance filtering** — Only delivers news that matters; irrelevant articles are dropped before summarization
- 📄 **Hybrid content extraction** — RSS snippet first, full-text scraping (via cheerio) when the snippet is too short
- **Article state pipeline** — `discovered → enriched → summarized → sent`, with persistence
- 🔒 **No data loss** — Atomic queue writes, retries on transient failures, articles survive restarts
- 📊 **Operational metrics** — Per-cycle stats logged (discovered, enriched, sent, failed, truncated)
- 🏷️ **Feed typing** — Official blogs (`official_blog`) bypass the relevance filter automatically
- 🔁 **Cross-source deduplication** — Title-similarity check prevents the same story from arriving via multiple sources
- 🖥️ **Terminal-only mode** — Use `--no-push` to skip notifications and display results in the console
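The article state pipeline and atomic queue writes can be sketched like this (a minimal illustration; the `Article` shape, the `NEXT` transition table, and the field names are hypothetical, not the project's actual schema):

```typescript
import { writeFileSync, renameSync } from "fs";

// Hypothetical article shape; the real queue schema may differ.
type ArticleState = "discovered" | "enriched" | "summarized" | "sent" | "failed";

interface Article {
  id: string;
  title: string;
  state: ArticleState;
}

// Only forward transitions (plus "failed") are legal.
const NEXT: Record<ArticleState, ArticleState[]> = {
  discovered: ["enriched", "failed"],
  enriched: ["summarized", "failed"],
  summarized: ["sent", "failed"],
  sent: [],
  failed: ["enriched", "summarized", "sent"], // retried on the next cycle
};

function transition(article: Article, to: ArticleState): Article {
  if (!NEXT[article.state].includes(to)) {
    throw new Error(`illegal transition ${article.state} -> ${to}`);
  }
  return { ...article, state: to };
}

// Atomic write: write to a temp file, then rename over the target,
// so a crash mid-write never leaves a half-written queue file.
function saveQueue(path: string, queue: Article[]): void {
  const tmp = `${path}.tmp`;
  writeFileSync(tmp, JSON.stringify(queue, null, 2));
  renameSync(tmp, path);
}
```

The write-then-rename pattern is what makes "articles survive restarts" hold: readers only ever see the old complete file or the new complete file.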

## Quick Start

```bash
# Clone the repository
git clone https://github.com/alicankiraz1/newscrux.git
cd newscrux
npm install
cp .env.example .env        # Edit with your API keys (optional for --no-push)
npm run build

# Option 1: Run with OpenRouter (requires API key)
npm start -- --lang=en      # or: tr, de, fr, es

# Option 2: Run with Ollama (local, requires Ollama installed)
ollama pull deepseek-qwen-8b:latest
ollama serve
npm start -- -p ollama --lang=tr

# Option 3: Run once, show in terminal (no Pushover needed)
npm start -- --no-push -p ollama --lang=tr
```

**Prerequisites:** Node.js and npm; an OpenRouter API key (or a local Ollama install for `-p ollama`); a Pushover user key and app token unless running with `--no-push`.


## Architecture

```
RSS Feeds (13 sources)
        │
        ▼
  Fetch + Parse
        │
        ▼
  Cross-source Dedup (title similarity)
        │
        ▼
  Discover → Queue (persistent JSON)
        │
        ├─ high priority (official_blog) ────────────────────┐
        │                                                     │
        ▼                                                     │
  Relevance Filter (AI scores 1-10)                          │
  Drop below threshold                                        │
        │                                                     │
        └─────────────────────────────────────────────────── ▼
                                                   Enrich (snippet or scrape)
                                                             │
                                                             ▼
                                                   Summarize (DeepSeek JSON)
                                                             │
                                                             ▼
                                                   Render Notification
                                                   (HTML, smart truncation)
                                                             │
                                                             ▼
                                                   Send via Pushover
                                                             │
                                                             ▼
                                                   Mark Sent in Queue
```
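The render stage's smart truncation targets Pushover's 1,024-character message limit. A minimal word-boundary sketch (illustrative only; the project's actual renderer may trim whole sections before cutting words):

```typescript
// Pushover messages are limited to 1,024 characters. This illustrative
// helper truncates at the last word boundary and appends an ellipsis.
const PUSHOVER_LIMIT = 1024;

function smartTruncate(text: string, limit = PUSHOVER_LIMIT): string {
  if (text.length <= limit) return text;
  const slice = text.slice(0, limit - 1); // leave room for the ellipsis
  const lastSpace = slice.lastIndexOf(" ");
  const cut = lastSpace > 0 ? slice.slice(0, lastSpace) : slice;
  return cut + "…";
}
```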

## Supported Languages

| Code | Language | Notification labels |
|------|----------|---------------------|
| `en` | English | "What happened:" / "Why it matters:" / "Read More" |
| `tr` | Turkish | "Ne oldu:" / "Neden önemli:" / "Devamını Oku" |
| `de` | German | "Was passiert ist:" / "Warum es wichtig ist:" / "Weiterlesen" |
| `fr` | French | "Ce qui s'est passé :" / "Pourquoi c'est important :" / "Lire la suite" |
| `es` | Spanish | "Qué pasó:" / "Por qué importa:" / "Leer más" |

Each language pack includes a full AI system prompt in that language, feed kind labels, and all notification UI strings. The AI model produces translated_title, what_happened, why_it_matters, and key_detail in the selected language.
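A language pack can be pictured as a typed structure along these lines (the interface and field names here are hypothetical illustrations, not the codebase's actual shape):

```typescript
// Hypothetical shape of a language pack; actual keys may differ.
interface LanguagePack {
  code: "en" | "tr" | "de" | "fr" | "es";
  systemPrompt: string;   // full AI system prompt in the target language
  labels: {
    whatHappened: string; // section label, e.g. "Ne oldu:" for Turkish
    whyItMatters: string;
    readMore: string;     // link label in the notification
  };
}

const turkish: LanguagePack = {
  code: "tr",
  systemPrompt: "...", // elided for brevity
  labels: {
    whatHappened: "Ne oldu:",
    whyItMatters: "Neden önemli:",
    readMore: "Devamını Oku",
  },
};
```

Because the system prompt itself is part of the pack, the model is instructed in the target language rather than asked to translate English output afterwards.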


## Configuration

### CLI Options

| Flag | Description | Default |
|------|-------------|---------|
| `--lang <code>`, `-l <code>` | Summary language: `en`, `tr`, `de`, `fr`, `es` | `en` |
| `--provider <type>`, `-p <type>` | AI provider: `openrouter`, `ollama` | `openrouter` |
| `--no-push` | Skip Pushover notifications, show results in terminal only | — |
| `--help`, `-h` | Show help message and exit | — |
| `--version`, `-v` | Show version number and exit | — |

```bash
newscrux --lang=tr      # Start with Turkish summaries
newscrux -l de          # Start with German summaries
newscrux -p ollama      # Use local Ollama model
newscrux --no-push      # Run once, show results in terminal only
newscrux                # Start with English summaries (default)
```


### Environment Variables (`.env`)

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `OPENROUTER_API_KEY` | Yes | — | OpenRouter API key |
| `PUSHOVER_USER_KEY` | Yes | — | Pushover user key |
| `AI_PROVIDER` | No | `openrouter` | AI provider: `openrouter`, `ollama` |
| `OLLAMA_BASE_URL` | No | `http://localhost:11434/v1` | Ollama API URL |
| `OLLAMA_MODEL` | No | `deepseek-qwen-8b:latest` | Ollama model name |
| `OLLAMA_TEMPERATURE` | No | `0.2` | Ollama sampling temperature |
| `OLLAMA_THINK` | No | `false` | Enable/disable reasoning-heavy output |
| `OLLAMA_SUMMARY_MAX_TOKENS` | No | `260` | Max tokens for summary generation |
| `OLLAMA_RELEVANCE_MAX_TOKENS` | No | `1200` | Max tokens for relevance scoring output |
| `OLLAMA_TIMEOUT_MS` | No | `45000` | Ollama request timeout in milliseconds |
| `OLLAMA_MAX_RETRIES` | No | `2` | Retry count for failed/empty Ollama responses |
| `PUSHOVER_APP_TOKEN` | Yes | — | Pushover app token |
| `OPENROUTER_MODEL` | No | `deepseek/deepseek-v3.2-speciale` | AI model for summarization |
| `POLL_INTERVAL_MINUTES` | No | `15` | Minutes between feed polls |
| `MAX_ARTICLES_PER_POLL` | No | `10` | Max regular articles processed per cycle |
| `ARXIV_MAX_PER_POLL` | No | `15` | Max arXiv papers processed per cycle |
| `ENRICH_CONCURRENCY` | No | `4` | Parallel workers for enrichment stage |
| `SUMMARIZE_CONCURRENCY` | No | `2` | Parallel workers for summarization stage |
| `SEND_CONCURRENCY` | No | `3` | Parallel workers for send stage |
| `SUMMARIZE_DELAY_MS` | No | `0` | Delay after each summary (ms) |
| `SEND_DELAY_MS` | No | `0` | Delay after each send (ms) |
| `SNIPPET_MIN_LENGTH` | No | `300` | Skip scraping when snippet length is at least this many chars |
| `ENRICHED_CONTENT_MAX_LENGTH` | No | `3000` | Max content chars passed to summarizer |
| `SCRAPING_ENABLED` | No | `true` | Enable/disable full-page scraping fallback |
| `SCRAPING_TIMEOUT_MS` | No | `10000` | Scraping request timeout in ms |
| `SCRAPING_DOMAIN_DELAY_MS` | No | `2000` | Delay between requests to same domain in ms |
| `RELEVANCE_THRESHOLD` | No | `6` | Minimum AI relevance score (1–10) |
| `RELEVANCE_BATCH_SIZE` | No | `100` | Max discovered entries scored by relevance per cycle |
| `LOG_LEVEL` | No | `info` | Log verbosity: `debug`, `info`, `warn`, `error` |
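A minimal sketch of reading a few of these variables with their documented defaults (illustrative only; `src/config.ts` is the authoritative loader and may work differently):

```typescript
// Parse an integer env var, falling back to the documented default
// when unset or non-numeric.
function intEnv(name: string, fallback: number): number {
  const raw = process.env[name];
  const n = raw === undefined ? NaN : Number(raw);
  return Number.isFinite(n) ? n : fallback;
}

const config = {
  pollIntervalMinutes: intEnv("POLL_INTERVAL_MINUTES", 15),
  relevanceThreshold: intEnv("RELEVANCE_THRESHOLD", 6),
  snippetMinLength: intEnv("SNIPPET_MIN_LENGTH", 300),
  // Anything other than the literal string "false" keeps scraping on.
  scrapingEnabled: (process.env.SCRAPING_ENABLED ?? "true") !== "false",
};
```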

---

## RSS Sources

| Source | Type | Priority |
|--------|------|----------|
| OpenAI News | `official_blog` | high (bypasses filter) |
| Google AI Blog | `official_blog` | high (bypasses filter) |
| Google DeepMind | `official_blog` | high (bypasses filter) |
| Hugging Face Blog | `official_blog` | normal |
| TechCrunch AI | `media` | normal |
| MIT Technology Review AI | `media` | normal |
| The Verge AI | `media` | normal |
| Ars Technica | `media` | normal |
| arXiv cs.CL | `research` | normal |
| arXiv cs.LG | `research` | normal |
| arXiv cs.AI | `research` | normal |
| Import AI | `newsletter` | normal |
| Ahead of AI | `newsletter` | normal |

To add or remove feeds, edit the `feeds` array in `src/config.ts`.

---

## Deployment

### Raspberry Pi / Linux server (systemd)

```bash
# 1. Clone and build
git clone https://github.com/alicankiraz1/newscrux.git ~/newscrux
cd ~/newscrux
npm install
cp .env.example .env
nano .env                                       # fill in your API keys
npm run build

# 2. Install and configure service
cp newscrux.service ~/.config/systemd/user/
nano ~/.config/systemd/user/newscrux.service    # adjust --lang flag if needed

# 3. Enable and start (user-level systemd)
systemctl --user daemon-reload
systemctl --user enable newscrux
systemctl --user start newscrux

# 4. View live logs
journalctl --user -u newscrux -f
```

**Note:** The service file uses `%h` (the systemd home-directory specifier), so paths resolve to your home directory automatically. No root access is needed.
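For orientation, a user-level unit matching the commands above might look like this (illustrative sketch; the repository ships its own `newscrux.service`, and the entry-point path `dist/index.js` here is an assumption):

```ini
[Unit]
Description=Newscrux AI news aggregator
After=network-online.target

[Service]
# %h expands to the user's home directory
WorkingDirectory=%h/newscrux
ExecStart=/usr/bin/node %h/newscrux/dist/index.js --lang=en
Restart=on-failure

[Install]
WantedBy=default.target
```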


## How It Works

1. **Fetch** — Polls all 13 RSS feeds every 15 minutes (configurable) using `rss-parser`
2. **Deduplicate** — A cross-source title-similarity check prevents the same story from appearing twice
3. **Discover** — New articles are added to a persistent JSON queue (`data/article-queue.json`) with state `discovered`
4. **Filter** — AI scores each article's relevance 1–10; articles below the threshold are dropped before any summarization cost is incurred. High-priority (`official_blog`) sources bypass this step entirely.
5. **Enrich** — Checks the RSS snippet length; if it is shorter than 300 characters, scrapes the full article using `cheerio`. Content is capped at 3,000 characters for the summarizer.
6. **Summarize** — Sends article content to the configured AI provider (OpenRouter or Ollama) with a structured JSON prompt in the selected language. Output: `translated_title`, `what_happened`, `why_it_matters`, `key_detail`, `source_type`.
7. **Render** — Unless `--no-push` is set, builds the Pushover notification with HTML formatting and smart truncation to stay within the 1,024-character limit. Otherwise, displays the results in the terminal.
8. **Send** — POSTs the notification to the Pushover API (or skips it when `--no-push` is set). The article is marked `sent` only after a confirmed successful delivery (or after terminal display).
9. **Retry** — Articles that fail enrichment, summarization, or sending remain in the queue as `failed` and are retried on the next cycle.
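The title-similarity check in step 2 could look like the following sketch, using Jaccard similarity over word tokens (one common approach; the project's actual metric and threshold may differ):

```typescript
// Split a title into lowercase word tokens (letters and digits only).
function tokenize(title: string): Set<string> {
  return new Set(title.toLowerCase().match(/[\p{L}\p{N}]+/gu) ?? []);
}

// Jaccard index: |intersection| / |union| of the two token sets.
function titleSimilarity(a: string, b: string): number {
  const ta = tokenize(a);
  const tb = tokenize(b);
  if (ta.size === 0 || tb.size === 0) return 0;
  let overlap = 0;
  ta.forEach((t) => { if (tb.has(t)) overlap++; });
  return overlap / (ta.size + tb.size - overlap);
}

// The 0.8 threshold is an illustrative choice, not the project's value.
function isDuplicate(a: string, b: string, threshold = 0.8): boolean {
  return titleSimilarity(a, b) >= threshold;
}
```

Token-set similarity is robust to small wording differences (reordered words, added punctuation) while still separating genuinely different stories.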

## Contributing

See CONTRIBUTING.md for how to add languages, submit fixes, or suggest features.


## Author

Alican Kiraz



## License

MIT — see LICENSE
