techmehedi/Hackharvard2025

HaloAudit

HaloAudit is an AI-powered data auditor that lives quietly at the top of your screen. Just drag a file to the halo, and it instantly scans for sensitive data, security risks, and quality issues, right on your device.

Built for developers, analysts, and compliance teams, HaloAudit bridges privacy and convenience by running intelligent audits locally or securely through the cloud. Whether it's CSVs, PDFs, or spreadsheets, HaloAudit identifies PII, secrets, and inconsistencies, then delivers a clear, actionable report in seconds.

HaloAudit: drop your files, illuminate your data.


🚀 Quick Start

What You Have

A complete, production-ready document audit system with:

  • ✅ Backend API - Deployed at https://auditor-edge.evanhaque1.workers.dev
  • ✅ Python Agent - Ready to run (dependencies installed)
  • ✅ Swift macOS App - HaloAudit rebranded and ready

Test the System (2 minutes)

# 1. Test backend is working
curl https://auditor-edge.evanhaque1.workers.dev/

# 2. Check job queue (export EDGE_API_TOKEN with your own token first)
curl https://auditor-edge.evanhaque1.workers.dev/jobs/stats \
  -H "Authorization: Bearer $EDGE_API_TOKEN"

# 3. Start the agent
cd agent
source venv/bin/activate
python -m src.main

# 4. Test the Swift app
cd ../swift-frontend
./build_and_run.sh

πŸ—οΈ Architecture

┌────────────────────────────────────────────────────────────┐
│                     macOS App (Swift)                      │
│  ┌────────────────────────────────────────────────────┐    │
│  │  AuditorUploadView - Drag & drop PDF/CSV           │    │
│  │  WebSocketManager - Real-time progress             │    │
│  │  AuditorAPIClient - HTTP requests                  │    │
│  └────────────────┬───────────────────────────────────┘    │
└───────────────────┼────────────────────────────────────────┘
                    │ HTTPS/WSS
                    ▼
┌────────────────────────────────────────────────────────────┐
│         Cloudflare Workers Edge (TypeScript)               │
│  ┌────────────────────────────────────────────────────┐    │
│  │  Hono Router + Rate Limiting + Auth                │    │
│  │  • POST /uploads/create    → Signed R2 URLs        │    │
│  │  • POST /runs/:id/enqueue  → D1 job queue          │    │
│  │  • GET  /ws/run/:id        → WebSocket (DO)        │    │
│  │  • POST /jobs/pull|ack     → Job queue API         │    │
│  │  • POST /vector/*          → Vectorize proxy       │    │
│  │  • POST /d1/query          → D1 safe proxy         │    │
│  └────────────────┬───────────────────────────────────┘    │
└───────────────────┼────────────────────────────────────────┘
                    │ Job Queue (D1-backed, free tier)
                    ▼
┌────────────────────────────────────────────────────────────┐
│              Python Agent (LangGraph)                      │
│  ┌────────────────────────────────────────────────────┐    │
│  │  EdgeJobClient - Poll /jobs/pull every 1s          │    │
│  │  LangGraph Pipeline (9 nodes):                     │    │
│  │    1. Ingest      → Download from R2               │    │
│  │    2. Extract     → Gemini multimodal (no OCR!)    │    │
│  │    3. Chunk       → Split text                     │    │
│  │    4. Embed       → Gemini embeddings              │    │
│  │    5. Index       → Vectorize via edge             │    │
│  │    6. Checks      → 3 deterministic checks         │    │
│  │    7. Analyze     → AI summary                     │    │
│  │    8. Report      → Generate Markdown              │    │
│  │    9. Persist     → Save to D1 via edge            │    │
│  └────────────────┬───────────────────────────────────┘    │
└───────────────────┼────────────────────────────────────────┘
                    │
     ┌──────────────┼──────────────┬─────────────┬──────────┐
     ▼              ▼              ▼             ▼          ▼
┌────────┐    ┌──────────┐   ┌─────────┐  ┌────────┐  ┌─────┐
│   R2   │    │    D1    │   │Vectorize│  │   DO   │  │ AI  │
│Storage │    │Database  │   │ Index   │  │RunRoom │  │ Gwy │
└────────┘    └──────────┘   └─────────┘  └────────┘  └─────┘
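
The agent side of the diagram (poll /jobs/pull, run the nodes in order, then acknowledge) can be sketched in plain Python. This is an illustration under stated assumptions, not the repository's code: `pull_jobs`, `ack_jobs`, the job dictionary shape, and the placeholder nodes are hypothetical, and the real agent orchestrates its nodes with LangGraph rather than a simple loop.

```python
import time
from typing import Callable, Iterable

# Hypothetical types: a job and the pipeline state are plain dictionaries here.
Job = dict
State = dict
Node = Callable[[State], State]

NODE_NAMES = ["ingest", "extract", "chunk", "embed", "index",
              "checks", "analyze", "report", "persist"]


def make_node(name: str) -> Node:
    """Build a placeholder node that only records its name in the trace."""
    def node(state: State) -> State:
        state["trace"].append(name)
        return state
    return node


def run_pipeline(job: Job, nodes: Iterable[Node]) -> State:
    """Pass one job through every node in order, threading the state along."""
    state: State = {"job": job, "trace": []}
    for node in nodes:
        state = node(state)
    return state


def poll_once(pull_jobs: Callable[[], list], ack_jobs: Callable[[list], None],
              nodes: list) -> int:
    """Pull a batch, process each job, and acknowledge only the successes."""
    done = []
    for job in pull_jobs():
        try:
            run_pipeline(job, nodes)
            done.append(job["id"])
        except Exception as exc:
            # Assumption: a job that is never acknowledged gets retried later.
            print(f"job {job['id']} failed: {exc}")
    if done:
        ack_jobs(done)
    return len(done)


def poll_forever(pull_jobs, ack_jobs, nodes, interval_s: float = 1.0) -> None:
    """Poll on a fixed interval (the diagram states every 1s)."""
    while True:
        poll_once(pull_jobs, ack_jobs, nodes)
        time.sleep(interval_s)
```

Passing `pull_jobs` and `ack_jobs` in as callables keeps the loop independent of the HTTP client, so it can be exercised with in-memory fakes before pointing it at the edge API.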

📱 Swift App Features

HaloAudit Branding

  • ✅ App Name: Changed from "boringNotch" to "HaloAudit"
  • ✅ Bundle ID: Updated to com.haloaudit.app
  • ✅ Display Name: "HaloAudit" throughout the system
  • ✅ GitHub Link: Points to https://github.com/Evandabest/Hackharvard2025
  • ✅ Logo Assets: Ready for replacement

Auditor Tab

  • Location: Third tab in the notch (Home | Shelf | Auditor)
  • Icon: Document with magnifying glass
  • Features:
    • Drag & drop PDF/CSV files
    • Real-time progress updates via WebSocket
    • Professional audit reports with markdown rendering
    • Black theme for report display
    • "Show Report" button redirects to Next.js website

Upload States

  • Idle: Dashed border drop zone with "Drag & drop PDF or CSV"
  • Uploading: Progress bar with percentage
  • Processing: Circular progress with phase updates
  • Completed: Green checkmark with "Show Report" button
  • Failed: Error message with retry option
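
The five upload states above form a small state machine. The sketch below models it in Python for illustration only (the app itself is written in Swift). The transition table is an assumption: the README lists the states but not the edges between them, and `UploadState`, `ALLOWED`, and `transition` are hypothetical names.

```python
from enum import Enum


class UploadState(Enum):
    IDLE = "idle"
    UPLOADING = "uploading"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed transitions; the README names the states but not the edges.
ALLOWED = {
    UploadState.IDLE: {UploadState.UPLOADING},
    UploadState.UPLOADING: {UploadState.PROCESSING, UploadState.FAILED},
    UploadState.PROCESSING: {UploadState.COMPLETED, UploadState.FAILED},
    UploadState.COMPLETED: {UploadState.IDLE},
    UploadState.FAILED: {UploadState.IDLE},  # "retry" returns to the drop zone
}


def transition(current: UploadState, target: UploadState) -> UploadState:
    """Return the new state, or raise if the move is not in ALLOWED."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Making illegal moves raise (for example Idle straight to Completed) keeps the UI from showing a "Show Report" button for a run that never uploaded.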

🔧 Backend (Cloudflare Workers)

Deployed Endpoints

URL: https://auditor-edge.evanhaque1.workers.dev

// Client-facing
POST   /uploads/create         // Create upload, get R2 URL
POST   /runs/:id/enqueue        // Queue for processing  
GET    /runs/:id/status         // Get status
GET    /runs/:id/report-url     // Get report URL
GET    /runs/:id/report-content // Serve report content
WS     /ws/run/:id              // Real-time updates

// Server-only (requires Bearer token)
POST   /jobs/enqueue            // Add job to queue
POST   /jobs/pull               // Pull jobs (agent)
POST   /jobs/ack                // Acknowledge jobs
GET    /jobs/stats              // Queue statistics
POST   /vector/upsert           // Index embeddings
POST   /vector/query            // Semantic search
POST   /d1/query                // Safe DB queries
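
As a rough illustration of how a client might address these routes, the helpers below build URLs from the route templates listed above. The helper names are hypothetical, and the request and response bodies are not documented here, so nothing is sent over the network.

```python
from urllib.parse import quote, urlsplit, urlunsplit

BASE_URL = "https://auditor-edge.evanhaque1.workers.dev"


def run_url(run_id: str, action: str) -> str:
    """Build a /runs/:id/<action> URL, e.g. action='status'."""
    return f"{BASE_URL}/runs/{quote(run_id, safe='')}/{action}"


def ws_url(run_id: str) -> str:
    """Build the WebSocket URL for /ws/run/:id by switching https to wss."""
    parts = urlsplit(BASE_URL)
    scheme = "wss" if parts.scheme == "https" else "ws"
    path = f"/ws/run/{quote(run_id, safe='')}"
    return urlunsplit((scheme, parts.netloc, path, "", ""))


def server_headers(token: str) -> dict:
    """Headers for the server-only routes, which require a Bearer token."""
    return {"Authorization": f"Bearer {token}"}
```

Percent-encoding the run ID guards against an ID containing `/` or `?` changing which route is hit.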

Database Schema

runs      - Upload and processing runs
findings  - Audit findings from analysis
events    - Log stream for debugging
jobs      - Job queue (replaces Cloudflare Queues)
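
Because D1 is SQLite-based, the idea of a table-backed job queue can be tried locally with Python's built-in sqlite3 module. The sketch below is only a model of the pull/ack pattern: the column names and status values are hypothetical (the README names the table but not its schema), and a queue with several concurrent consumers would need an atomic claim step that this version does not attempt.

```python
import sqlite3

# Hypothetical schema: the README names the jobs table but not its columns.
SCHEMA = """
CREATE TABLE jobs (
    id      INTEGER PRIMARY KEY AUTOINCREMENT,
    run_id  TEXT NOT NULL,
    status  TEXT NOT NULL DEFAULT 'pending'
);
"""


def enqueue(db: sqlite3.Connection, run_id: str) -> int:
    """Insert a pending job and return its id."""
    with db:  # commits on success, rolls back on error
        cur = db.execute("INSERT INTO jobs (run_id) VALUES (?)", (run_id,))
    return cur.lastrowid


def pull(db: sqlite3.Connection, limit: int = 1) -> list:
    """Lease up to `limit` pending jobs, oldest first."""
    with db:
        rows = db.execute(
            "SELECT id, run_id FROM jobs WHERE status = 'pending' "
            "ORDER BY id LIMIT ?", (limit,)).fetchall()
        db.executemany(
            "UPDATE jobs SET status = 'leased' WHERE id = ?",
            [(job_id,) for job_id, _ in rows])
    return rows


def ack(db: sqlite3.Connection, job_ids: list) -> None:
    """Mark leased jobs as done."""
    with db:
        db.executemany(
            "UPDATE jobs SET status = 'done' WHERE id = ? AND status = 'leased'",
            [(job_id,) for job_id in job_ids])


def stats(db: sqlite3.Connection) -> dict:
    """Count jobs per status, similar in spirit to GET /jobs/stats."""
    return dict(db.execute("SELECT status, COUNT(*) FROM jobs GROUP BY status"))
```

To try it, open `sqlite3.connect(":memory:")`, run `executescript(SCHEMA)`, then call `enqueue`, `pull`, and `ack` in that order and inspect `stats`.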

🐍 Python Agent

Pipeline (9 nodes)

  1. Ingest - Download from R2
  2. Extract - Gemini multimodal API (NO local OCR!)
  3. Chunk - Smart text splitting
  4. Embed - Gemini 768-dim vectors
  5. Index - Store in Vectorize
  6. Checks - 3 deterministic checks:
    • Duplicate invoices
    • Round numbers
    • Weekend postings
  7. Analyze - AI-powered summary
  8. Report - Markdown generation
  9. Persist - Save to D1
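
The three deterministic checks in step 6 could look roughly like the following. This is a sketch under assumptions, not the agent's implementation: the record fields (`invoice_id`, `amount`, `posted_on`) and the 1000-unit threshold for a "round" amount are hypothetical, since the README names the checks without defining them.

```python
from collections import Counter
from datetime import date
from decimal import Decimal

# Hypothetical record shape; the README does not specify field names:
# {"invoice_id": str, "amount": Decimal, "posted_on": date}


def duplicate_invoices(records: list) -> list:
    """Return invoice IDs that appear more than once, in sorted order."""
    counts = Counter(r["invoice_id"] for r in records)
    return sorted(i for i, n in counts.items() if n > 1)


def round_numbers(records: list, step: Decimal = Decimal("1000")) -> list:
    """Return records whose amount is a non-zero exact multiple of `step`."""
    return [r for r in records if r["amount"] != 0 and r["amount"] % step == 0]


def weekend_postings(records: list) -> list:
    """Return records posted on Saturday (weekday 5) or Sunday (weekday 6)."""
    return [r for r in records if r["posted_on"].weekday() >= 5]
```

Using `Decimal` rather than `float` keeps the multiple-of-`step` test exact for currency amounts.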

Start Agent

cd agent
source venv/bin/activate
python -m src.main

🌐 Next.js Website

Report Display

  • URL: http://localhost:3000/display?reportUrl=...
  • Features:
    • Black theme with professional styling
    • Markdown rendering with syntax highlighting
    • Authentication via EDGE_API_TOKEN
    • Responsive design for audit reports
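
Since the report URL travels inside a query parameter, it needs percent-encoding so that its own `?` and `&` characters are not read as part of the display page's query string. A minimal Python sketch, where the helper names and the example report URL are hypothetical:

```python
from urllib.parse import parse_qs, urlencode, urlsplit

DISPLAY_BASE = "http://localhost:3000/display"


def display_url(report_url: str) -> str:
    """Return the display page URL with report_url percent-encoded."""
    return f"{DISPLAY_BASE}?{urlencode({'reportUrl': report_url})}"


def report_url_from(display: str) -> str:
    """Recover the original report URL from a display page URL."""
    return parse_qs(urlsplit(display).query)["reportUrl"][0]
```

For example, `report_url_from(display_url(u))` returns `u` unchanged, even when `u` carries its own query string.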

Setup

cd landing-page
npm install
npm run dev

🚀 Complete Setup (From Zero)

1. Backend Setup

cd backend

# Create resources
wrangler d1 create auditor
wrangler r2 bucket create auditor
wrangler vectorize create auditor-index --dimensions=768 --metric=cosine

# Set secrets
wrangler secret put TURNSTILE_SECRET
wrangler secret put JWT_SECRET

# Deploy
npm run migrate:prod
npm run deploy
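
The Vectorize index above is created with --dimensions=768 and --metric=cosine. For readers unfamiliar with the metric, here is what cosine similarity computes, written in plain Python. It is purely illustrative: Vectorize performs this server-side, and `cosine_similarity` is not part of any API used by the project.

```python
import math

DIMENSIONS = 768  # matches --dimensions=768 in the wrangler command


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

The metric depends only on direction, not magnitude, so two embeddings that point the same way score close to 1 regardless of their length.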

2. Agent Setup

cd agent

# Install dependencies
poetry install

# Configure .env (see agent/env.example)
# Start agent
make dev

3. Swift App

cd swift-frontend

# Build and run
./build_and_run.sh

4. Next.js Website

cd landing-page

# Install dependencies
npm install

# Configure .env.local with EDGE_API_TOKEN
# Start development server
npm run dev

🧪 Testing

Quick Health Checks

# Backend health
curl https://auditor-edge.evanhaque1.workers.dev/

# Agent health
curl http://localhost:8080/healthz

# Job queue stats (export EDGE_API_TOKEN with your own token first)
curl https://auditor-edge.evanhaque1.workers.dev/jobs/stats \
  -H "Authorization: Bearer $EDGE_API_TOKEN"

End-to-End Test

  1. Start agent: cd agent && python -m src.main
  2. Open Swift app: cd swift-frontend && ./build_and_run.sh
  3. Open notch → Auditor tab → Drop a PDF
  4. Watch real-time progress!

💰 Cost Breakdown

Free Tier Only!

  • Workers: 100,000 requests/day on the free plan (the 10M requests/month allowance belongs to the paid plan)
  • D1: First 5GB + 5M reads/day free
  • R2: First 10GB free
  • Vectorize: Free tier available
  • Durable Objects: First 1M requests free

Estimated monthly cost: $0 for moderate usage (< 100K documents/month)


🎯 Key Features

macOS App

  • ✅ Drag & drop file upload
  • ✅ Real-time progress visualization
  • ✅ WebSocket status updates
  • ✅ Professional report display
  • ✅ HaloAudit branding throughout

Backend API

  • ✅ Signed R2 upload URLs
  • ✅ D1-backed job queue (free!)
  • ✅ Durable Objects for WebSocket
  • ✅ Vectorize proxy for embeddings
  • ✅ Rate limiting and authentication
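
For the authentication item, the server-only routes expect an `Authorization: Bearer <token>` header. The sketch below shows one common way to validate such a header, written in Python for illustration; the backend itself is TypeScript, and `extract_bearer` and `is_authorized` are hypothetical names, not the project's code.

```python
import hmac


def extract_bearer(header_value) -> str:
    """Return the token from 'Bearer <token>', or '' if missing or malformed."""
    if not header_value:
        return ""
    scheme, _, token = header_value.partition(" ")
    return token.strip() if scheme.lower() == "bearer" else ""


def is_authorized(header_value, expected_token: str) -> bool:
    """Compare tokens in constant time to avoid timing side channels."""
    presented = extract_bearer(header_value)
    return bool(presented) and hmac.compare_digest(
        presented.encode(), expected_token.encode())
```

`hmac.compare_digest` is used instead of `==` so the comparison time does not reveal how many leading characters of a guessed token were correct.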

AI Pipeline

  • ✅ Gemini multimodal text extraction (no OCR!)
  • ✅ Gemini embeddings (768-dim)
  • ✅ LangGraph orchestration
  • ✅ Deterministic audit checks
  • ✅ AI-powered analysis
  • ✅ Markdown report generation

🆘 Troubleshooting

Common Issues

  1. Agent not connecting: Check .env configuration
  2. Swift build fails: Ensure all files are in Xcode project
  3. WebSocket not working: Check Durable Object deployment
  4. Report not displaying: Verify Next.js environment variables

Debug Commands

# Check agent logs
cd agent && python -m src.main

# Check backend logs
wrangler tail

# Check Swift app logs
cd swift-frontend && ./build_and_run.sh

📚 Documentation

  • Backend: backend/README.md
  • Agent: agent/README.md
  • Swift App: swift-frontend/boringnotch/components/Auditor/README.md
  • Architecture: backend/ARCHITECTURE.md
  • Deployment: backend/DEPLOYMENT_CHECKLIST.md

🎉 What You Built

A complete, production-ready audit system with:

  • 🌐 Global edge API
  • 🤖 AI-powered processing
  • 📱 Beautiful macOS app
  • 💰 $0/month cost
  • 📚 Comprehensive docs
  • ✅ All tests passing

Total build time: ~2 hours (with AI assistance!)
Total cost: $0/month
Total awesomeness: 100% 🚀


Built with ❤️ using Cloudflare Workers, LangGraph, Gemini AI, and SwiftUI

About

AI Auditing Assistant [Won Best Use of Gemini API]

Languages

  • Swift 51.8%
  • TypeScript 23.8%
  • Python 7.9%
  • JavaScript 7.6%
  • CSS 6.3%
  • HTML 1.0%
  • Other 1.6%