Unmask the Truth with AI - Detect potentially misleading news quickly using advanced Machine Learning and Natural Language Processing.
# 1. Start Backend (Terminal 1)
cd backend
python -m uvicorn main:app --reload --host 127.0.0.1 --port 8000
# 2. Start Frontend (Terminal 2)
cd frontend
npm run dev
# 3. Open Browser
http://localhost:5173
- Lightning Fast - Results in <1 second
- Probabilistic Scoring - Confidence-based output, not absolute certainty
- Beautiful UI - Glassmorphism design with smooth animations
- Privacy First - Zero data collection or tracking
- Versioned API - Stable `/api/v1` contract with legacy route compatibility
- Abuse Protection - Rate-limiting plus optional Turnstile captcha
- CI Browser E2E - Playwright prediction-flow gate with safe captcha bypass in CI
- Responsive - Works on all devices
- Dark Mode - Easy on the eyes
- Educational - Learn how fake news detection works
| Metric | Value |
|---|---|
| Validation Approach | Held-out evaluation + live model monitoring |
| Output Type | Probabilistic classification with confidence score |
| Training Samples | 640 articles |
| Test Samples | 160 articles |
| Features Extracted | 1,360 linguistic markers |
| Prediction Speed | <100ms |
- TruthShield predictions can produce false positives and false negatives.
- Confidence scores indicate model certainty, not factual proof.
- Outputs should be treated as decision support, not final truth.
- Always verify important claims with independent, authoritative sources.
- Use TruthShield for research, education, and editorial triage.
- Do not use it as the sole basis for legal, medical, financial, or safety-critical decisions.
- Do not submit illegal, abusive, or harmful content.
- Do not misrepresent TruthShield output as guaranteed fact.
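One way to treat the score as decision support is to route low-confidence predictions to a human reviewer. The sketch below is only an illustration: the `triage` helper, the `Triage` record, and the 0.75 threshold are assumptions made for this example and are not part of TruthShield's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triage:
    label: str          # label reported by the model, passed through unchanged
    confidence: float   # model certainty in [0, 1], not factual proof
    needs_review: bool  # True when a human should verify before acting


def triage(label: str, confidence: float, review_below: float = 0.75) -> Triage:
    """Flag predictions whose confidence falls below an assumed review threshold."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be within [0, 1]")
    return Triage(label, confidence, needs_review=confidence < review_below)
```

A result with `needs_review=True` would go to an editor rather than being published as a verdict; the right threshold depends on how costly false positives and false negatives are in a given workflow.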
- Structured JSON logging with request and trace IDs.
- `/metrics` endpoint for Prometheus scraping (request count, latency, 5xx, fallback counters).
- Optional Sentry integration via `SENTRY_DSN`.
- Built-in threshold alerts in logs for high latency and elevated 5xx rates.
- Configurable timeout/retries/jitter (`GEMINI_*` env vars).
- Circuit breaker on repeated Gemini failures.
- Explicit Gemini success and local fallback metrics.
- Bounded TTL cache for repeated requests.
- Versioned response field `model_version`.
- Training now emits:
  - `backend/models/model_metadata.json`
  - `backend/models/evaluation_report.json`
- Startup integrity checks validate model/vectorizer interface compatibility and runtime/training version alignment.
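The retry, jitter, and circuit-breaker items above can be pictured with a small standard-library sketch. It is not the backend's implementation: the class and function names, the default thresholds, and the full-jitter backoff formula are all assumptions chosen for illustration, and `remote`/`local` stand in for the Gemini call and the local model.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Opens after `max_failures` consecutive errors; allows trial calls after `reset_after` seconds."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow(self) -> bool:
        if self._opened_at is None:
            return True
        # Half-open: permit trial calls once the cool-down has elapsed.
        return self._clock() - self._opened_at >= self.reset_after

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.max_failures:
            self._opened_at = self._clock()


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 8.0) -> float:
    """Full-jitter exponential backoff: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def call_with_fallback(remote: Callable[[], T], local: Callable[[], T],
                       breaker: CircuitBreaker, retries: int = 2,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Try `remote` with retries; use `local` when the breaker is open or retries run out."""
    if not breaker.allow():
        return local()
    for attempt in range(retries + 1):
        try:
            result = remote()
        except Exception:
            breaker.record_failure()
            if attempt < retries and breaker.allow():
                sleep(backoff_delay(attempt))
                continue
            break
        breaker.record_success()
        return result
    return local()
```

Injecting `clock` and `sleep` keeps the sketch testable without real delays. A production version would also increment the success and fallback counters that the `/metrics` endpoint exposes.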
- Capture weekly aggregate prediction distributions and confidence buckets.
- Compare with baseline ranges from the last stable evaluation report.
- Trigger retraining if drift exceeds threshold for 2 consecutive windows.
- Re-run training and publish updated metadata/evaluation artifacts.
- Promote only after staging smoke checks pass.
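The retraining trigger in the steps above can be sketched as a range check over weekly aggregates. The metric names (`fake_share`, `high_conf_share`) and the baseline format are hypothetical; the real evaluation report may store its ranges differently.

```python
from typing import Dict, List, Tuple

Baseline = Dict[str, Tuple[float, float]]  # metric name -> (low, high) acceptable range


def window_drifted(window: Dict[str, float], baseline: Baseline) -> bool:
    """A window drifts when any tracked metric is missing or outside its baseline range."""
    for name, (low, high) in baseline.items():
        value = window.get(name)
        if value is None or not low <= value <= high:
            return True
    return False


def should_retrain(windows: List[Dict[str, float]], baseline: Baseline,
                   consecutive: int = 2) -> bool:
    """Return True when the most recent `consecutive` windows all drifted."""
    if len(windows) < consecutive:
        return False
    return all(window_drifted(w, baseline) for w in windows[-consecutive:])
```

Requiring two consecutive drifted windows, as the policy states, avoids retraining on a single unusual week.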
- Preferred stable API namespace is `/api/v1`.
- Legacy unversioned routes remain available for backward compatibility.
- Compatibility policy endpoint: `/api/v1/version-policy`.
- OpenAPI schemas include curated examples for request/response models.
- Default submitted-text retention is `0` days (no request body persistence).
- Optional privacy-first mode: set `NO_STORE_MODE=true` to disable API cache and enforce `Cache-Control: no-store` headers.
- Governance details are exposed via `/api/v1/health` under `data_governance`.
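To make the effect of `NO_STORE_MODE` concrete, here is a minimal sketch of how a flag like this could be turned into cache and header settings. The helper names, the returned dictionary shape, and the accepted boolean spellings are assumptions for illustration; only the variable name and the `Cache-Control: no-store` header come from the list above.

```python
import os
from typing import Dict, Mapping


def env_flag(name: str, env: Mapping[str, str] = os.environ, default: bool = False) -> bool:
    """Parse a boolean environment variable; the accepted spellings are an assumption."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def privacy_settings(env: Mapping[str, str] = os.environ) -> Dict[str, object]:
    """Derive cache behaviour and response headers from NO_STORE_MODE."""
    no_store = env_flag("NO_STORE_MODE", env)
    headers = {"Cache-Control": "no-store"} if no_store else {}
    return {"cache_enabled": not no_store, "headers": headers}
```

Passing the environment as a mapping keeps the logic easy to test; in the running service the default `os.environ` would be used.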
- Incident response, backup/rollback strategy, and post-deploy verification are defined in docs/operations-runbook.md.
- CI includes browser E2E prediction-flow validation using Playwright with staging test-mode captcha bypass.
TruthShield/
├── backend/
│   ├── main.py                  # FastAPI app and API endpoints
│   ├── train_v3.py              # Model training script (v3)
│   ├── tests/
│   │   ├── test_api.py
│   │   └── test_deployed_smoke.py
│   ├── models/                  # model.pkl, vectorizer.pkl, metadata
│   └── Enhanced_Dataset_v3.csv  # Training data
├── frontend/
│   ├── src/
│   │   ├── components/
│   │   ├── pages/
│   │   ├── test/
│   │   └── lib/
│   └── .env.example
├── docs/
│   └── operations-runbook.md
├── docker-compose.yml
└── render.yaml
- Python 3.8+
- Node.js 14+
- Docker (optional, for containerized setup)
Create and activate a virtual environment:
python -m venv .venv
source .venv/bin/activate # On Windows, use `.venv\Scripts\activate`
Install Python dependencies:
pip install -r backend/requirements.txt
Train the ML model:
The repository already includes Enhanced_Dataset_v3.csv for training.
If you want larger/alternative data, you can also add Fake.csv and True.csv from Kaggle: https://www.kaggle.com/datasets/clmentbisaillon/fake-and-real-news-dataset
Then, run the training script:
python backend/train_v3.py
This will create model.pkl and vectorizer.pkl in the backend/models/ directory.
Run the backend server:
uvicorn backend.main:app --reload --port 8000
The API will be available at http://localhost:8000.
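Once the server is running, a quick way to confirm it is reachable is to query the health endpoint. The sketch below uses only the standard library; the helper names are hypothetical, and it assumes the response body is a JSON object without assuming any particular fields.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen


def health_url(base_url: str) -> str:
    """Build the health endpoint URL under the versioned namespace."""
    return urljoin(base_url.rstrip("/") + "/", "api/v1/health")


def fetch_health(base_url: str = "http://localhost:8000", timeout: float = 5.0) -> dict:
    """GET the health endpoint and decode its JSON body (response fields not assumed)."""
    with urlopen(health_url(base_url), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_health()` against a local instance raises `urllib.error.URLError` if the backend is not listening, which makes it usable as a simple post-start check.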
Navigate to the frontend directory and install dependencies:
cd frontend
npm install
cp .env.example .env # On Windows PowerShell: Copy-Item .env.example .env
Run the frontend development server:
npm run dev
The application will be available at http://localhost:5173 (or another port if 5173 is busy).
With Docker installed, you can run the entire application with a single command:
docker-compose up --build
The frontend will be available at http://localhost:3000 and the backend at http://localhost:8000.
The application is designed to be easily deployable on platforms like Vercel (for the frontend) and Render (for the backend).
- Connect your Git repository to Vercel.
- Set the framework to "Vite".
- Add environment variables:
  - `VITE_API_URL` = your deployed backend base URL
  - `VITE_REQUEST_TIMEOUT_MS=12000`
  - `VITE_CAPTCHA_BYPASS=false` (keep disabled in production)
  - `VITE_ANALYTICS_ENABLED=false` unless you have a telemetry endpoint
- Deploy!
- Connect your Git repository to Render.
- Create a new Web Service.
- Set the runtime to "Python 3".
- Set the build command to `pip install -r requirements.txt`.
- Set the start command to `uvicorn main:app --host 0.0.0.0 --port $PORT`.
- Configure required production environment variables:
  - `ENVIRONMENT=production`
  - `FORCE_HTTPS=true`
  - `TRUSTED_HOSTS=<your-render-domain>`
  - `CORS_ALLOW_ORIGINS=<your-frontend-domain>`
  - `CORS_DEV_ALLOW_ALL=false`
  - `GEMINI_API_KEY=<secret>`
  - `NO_STORE_MODE=true`
  - `RETENTION_DAYS_SUBMITTED_TEXT=0`
  - [email protected]
  - If captcha is enabled: `CAPTCHA_ENABLED=true`, `CAPTCHA_SECRET_KEY=<secret>`
- Deploy!
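Before deploying, it can help to check the production variables listed above for obvious gaps. The following is a rough pre-deploy sketch, not the backend's own startup check: the `config_problems` helper is hypothetical, and which variables the service strictly requires is an assumption.

```python
from typing import List, Mapping

# Names copied from the list above; which ones the backend enforces is an assumption.
REQUIRED = ("ENVIRONMENT", "TRUSTED_HOSTS", "CORS_ALLOW_ORIGINS", "GEMINI_API_KEY")


def config_problems(env: Mapping[str, str]) -> List[str]:
    """Return human-readable problems found in a production environment mapping."""
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]
    if env.get("CORS_DEV_ALLOW_ALL", "false").lower() == "true":
        problems.append("CORS_DEV_ALLOW_ALL must be false in production")
    if env.get("CAPTCHA_ENABLED", "false").lower() == "true" and not env.get("CAPTCHA_SECRET_KEY"):
        problems.append("CAPTCHA_SECRET_KEY is required when CAPTCHA_ENABLED=true")
    return problems
```

Running `config_problems(os.environ)` in a deploy script and failing on a non-empty result would catch a missing secret before the service starts.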
This project is licensed under a Proprietary All Rights Reserved License. See the LICENSE file for details.
LegendarySumit
- GitHub: @LegendarySumit
- Project: TruthShield