Skip to content

janprasad/HealthDetection

Repository files navigation

VetAI

Hands-free, multimodal AI diagnostic assistant for veterinarians and pet owners

VetAI enables veterinarians to diagnose animals through voice and vision while keeping their hands on the patient — and gives pet owners plain-English explanations with cost estimates. Built for TreeHacks 2026.


The Problem

Veterinarians examine 20-30+ animals daily in high-pressure, hands-on environments. Current diagnostic tools require:

  • Stopping the examination to type queries
  • Navigating complex databases with dirty hands
  • Separate workflows for image analysis
  • No conversational back-and-forth with AI

Pet owners face a different problem: they get technical diagnoses they can't understand, don't know if it's urgent, and have no way to connect with others going through the same thing.

VetAI solves both sides.


Our Solution

VetAI is a voice-first, vision-enabled diagnostic assistant with two modes:

For Veterinarians (Professional Mode)

  • Voice Queries - Ask questions hands-free via Whisper STT, get spoken responses via OpenAI TTS
  • Image Analysis - Snap photos mid-exam for AI-powered visual diagnosis with Claude Sonnet
  • Agentic Reasoning - Claude searches veterinary databases and builds differential diagnoses with tool use
  • Evidence-Based Research - Perplexity Sonar retrieves peer-reviewed citations for every diagnosis
  • Clarifying Questions - When confidence is low (<85%), the AI generates targeted follow-up questions to refine the diagnosis

For Pet Owners (Consumer Mode)

  • Plain English Diagnoses - Technical terms converted to simple language ("atopic dermatitis" -> "allergies causing skin irritation")
  • Urgency Assessment - Clear guidance: emergency, high confidence, moderate, or low confidence
  • Cost Estimates - Estimated vet visit cost ranges (routine $80-250, specialist $200-800, emergency $500-2000)
  • Emergency Detection - Automatic flagging of conditions like bloat, seizures, poisoning
  • Community Support - Find other pet owners with similar conditions nearby
  • Activity Feed - See community engagement, trending conditions, and success stories

Demo Flow

Vet examining dog with skin rash:
  1. Press mic: "What causes red patches on dog abdomens?"
  2. Snap photo of the affected area
  3. AI responds (voice): "Based on the image, this appears to be atopic
     dermatitis with 65% confidence..."
  4. AI shows clarifying questions:
     - "Is the rash seasonal or year-round?"
     - "Are the paws and face also affected?"
     - "Did symptoms start before age 3?"
  5. Vet answers via voice: "Yes, it's seasonal and the paws are red too"
  6. AI refines: "With seasonal presentation and paw involvement,
     atopic dermatitis is confirmed. Recommending allergy testing..."

Pet owner at home:
  1. Takes photo of their dog's rash
  2. Gets: "Allergies Causing Skin Irritation" (not "Atopic Dermatitis")
  3. Urgency: "This is a likely diagnosis, but a vet should confirm."
  4. Cost estimate: $80-$250 (routine visit)
  5. Finds 3 other dog owners with allergies in their state

Features

Implemented

  • Voice Conversation: OpenAI Whisper (STT) + TTS for hands-free interaction
  • Vision Analysis: Claude Sonnet for multimodal image diagnosis
  • Agentic Backend: Tool-calling architecture with disease database search, treatment protocols, and differential diagnosis
  • Research Citations: Perplexity Sonar API retrieves peer-reviewed veterinary literature with inline citations
  • Clarifying Questions: When diagnosis confidence < 85%, generates 2-3 targeted yes/no questions to help narrow down the diagnosis
  • Consumer Mode: Plain English diagnoses with urgency levels, cost estimates, and emergency detection for pet owners
  • Community Support Groups: Find relevant support groups by condition and species with fuzzy matching
  • Pet Profile Matching: Connect pet owners with similar conditions in the same area (same species, overlapping conditions, same state)
  • Activity Feed: Community engagement metrics, trending conditions, and statistics
  • Multi-Turn Voice Sessions: Clarifying question answers carry context across follow-up voice interactions
  • Mobile App: Cross-platform React Native (Expo) with camera integration
  • Structured Outputs: Confidence scores, risk levels, recommendations, and cited sources

In Progress

  • Frontend UI for Community Features: Screens for profiles, matching, groups, and activity feed
  • Consumer Mode Toggle: UI switch between professional and consumer views
  • Fine-Tuned VLM: Custom vision model trained on veterinary imagery
  • Multi-Image Context: Accumulate photos across conversation for progressive diagnosis

Tech Stack

Layer Technology
Frontend React Native (Expo), TypeScript, expo-av, expo-image-picker
Backend FastAPI (Python 3.11+), Pydantic v2
Vision AI Anthropic Claude Sonnet (claude-sonnet-4-20250514)
Speech OpenAI Whisper (STT) + OpenAI TTS (tts-1, alloy voice)
Research Perplexity Sonar Pro API
Tunneling ngrok (for Expo Go device testing)

Architecture

                    +---------------------+
                    |    Mobile App       |
                    |  React Native/Expo  |
                    |  - Voice Input      |
                    |  - Camera Capture   |
                    |  - Results Display  |
                    +---------+-----------+
                              |
                    +---------v-----------+
                    |   FastAPI Backend    |
                    |                     |
                    |  POST /analyze      | <- Image -> diagnosis (professional or consumer)
                    |  POST /chat         | <- Agentic chat with tools
                    |  POST /voice/query  | <- Audio + optional image
                    |  GET  /voice/audio  | <- Cached TTS playback
                    |  GET  /health       | <- Status check
                    |                     |
                    |  Community & Feed   |
                    |  POST /api/community/profile
                    |  GET  /api/community/groups/{condition}/{species}
                    |  GET  /api/community/matches/{id}
                    |  GET  /api/activity/feed
                    |  GET  /api/activity/stats
                    |  GET  /api/activity/trending
                    +---------+-----------+
                              |
          +-----------+-------+-------+--------------+
          v           v               v              v
    +-----------+ +--------+  +------------+  +----------+
    |  Claude   | |Whisper |  | Perplexity |  |  OpenAI  |
    |  Sonnet   | |  STT   |  |   Sonar    |  |   TTS    |
    |(Vision+   | |        |  | (Research) |  | (Voice)  |
    | Agent)    | |        |  |            |  |          |
    +-----------+ +--------+  +------------+  +----------+

Data Flow: Image Analysis

Photo -> /analyze -> Claude Vision -> Structured diagnosis JSON
                                    -> Perplexity Sonar -> Research citations
                                    -> Claude (if conf < 85%) -> Clarifying questions
                                    -> Consumer mode? -> Simplified response + cost estimate
                                    -> Combined AnalysisResult response

Data Flow: Voice Query

Audio -> /voice/query -> Whisper STT -> Transcribed text
                       -> Claude Agent (with tool use) -> Text response
                       -> OpenAI TTS -> Audio response
                       -> Session tracking for multi-turn context

Setup

Prerequisites

  • Node.js 18+
  • Python 3.11+
  • Expo CLI (npm install -g expo-cli)
  • API keys: Anthropic, OpenAI, Perplexity
  • ngrok account (sign up)

Backend

cd backend

# Install dependencies
pip install -r requirements.txt

# Configure environment
cp .env.example .env
# Edit .env with your API keys:
#   ANTHROPIC_API_KEY=sk-ant-...
#   OPENAI_API_KEY=sk-proj-...
#   PERPLEXITY_API_KEY=pplx-...

# Start server
uvicorn main:app --reload --host 0.0.0.0

Frontend

cd HealthDetect

cp .env.example .env
# Edit .env with your API URL:
#   EXPO_PUBLIC_API_URL=https://your-subdomain.ngrok-free.dev
npm install

# Start Expo
npx expo start
# Or with tunnel for remote device testing:
npx expo start --tunnel

ngrok Setup (for physical device testing)

# In a separate terminal, expose backend:
ngrok http 8000

# Copy the https URL (e.g., https://abc123.ngrok-free.dev)
# Update EXPO_PUBLIC_API_URL in .env

Physical Device Notes

  • Phone and computer must be on the same Wi-Fi network
  • Backend must run with --host 0.0.0.0
  • Use your machine's LAN IP or ngrok URL (not localhost)

API Reference

GET /health

Returns server status and API key configuration.

{
  "status": "ok",
  "anthropic_key_set": true,
  "openai_key_set": true
}

POST /analyze

Analyze a pet image for health conditions. Supports professional and consumer modes.

Request (multipart form):

Field Type Required Description
image file Yes JPEG/PNG image of the animal
species string No Animal species (e.g., "dog", "cat")
symptoms string No Reported symptoms
user_type string No "professional" (default) or "consumer"

Professional mode (default):

curl -X POST http://localhost:8000/analyze \
  -F "[email protected]" \
  -F "species=dog" \
  -F "symptoms=skin rash and redness"

Consumer mode:

curl -X POST http://localhost:8000/analyze \
  -F "[email protected]" \
  -F "species=dog" \
  -F "symptoms=skin rash" \
  -F "user_type=consumer"

Professional Response (AnalysisResult):

{
  "id": "analysis-1739520000000",
  "timestamp": "2026-02-14T10:00:00",
  "species": "dog",
  "diagnosis": {
    "primary": {
      "condition": "Canine Atopic Dermatitis",
      "commonName": "Allergic skin disease",
      "confidence": 0.65,
      "riskLevel": "moderate"
    },
    "alternatives": [
      { "condition": "Contact Dermatitis", "confidence": 0.20 }
    ]
  },
  "recommendations": [
    "Perform intradermal allergy testing",
    "Consider a hypoallergenic diet trial"
  ],
  "research_summary": "Canine atopic dermatitis is a genetically predisposed inflammatory...",
  "clinical_citations": [
    "https://pmc.ncbi.nlm.nih.gov/articles/PMC9204668/"
  ],
  "clarifying_questions": [
    "Is this a seasonal or year-round problem?",
    "Are the feet and face primarily affected?"
  ]
}

Consumer Response (ConsumerAnalysisResult):

{
  "id": "analysis-1739520000000",
  "timestamp": "2026-02-14T10:00:00",
  "simple_diagnosis": "Allergies Causing Skin Irritation",
  "technical_diagnosis": "Canine Atopic Dermatitis",
  "urgency": "low_confidence",
  "recommended_action": "We're not certain. A vet examination is recommended.",
  "is_emergency": false,
  "estimated_vet_cost": { "min": 80, "max": 250, "category": "routine" },
  "confidence": 0.65,
  "recommendations": [
    "Perform intradermal allergy testing",
    "Consider a hypoallergenic diet trial"
  ]
}

Urgency levels: "emergency" | "high_confidence" | "moderate_confidence" | "low_confidence"

Cost categories: "emergency" ($500-2000) | "specialist" ($200-800) | "routine" ($80-250)

POST /chat

Multi-turn agentic chat with veterinary tool use.

Request:

{
  "session_id": "chat-789",
  "message": "What tests should I run for suspected pancreatitis?",
  "image_context": { "species": "dog" }
}

Response:

{
  "session_id": "chat-789",
  "message": "For suspected pancreatitis in dogs, I recommend...",
  "toolResults": [
    {
      "tool": "get_treatment_protocols",
      "input": { "condition": "pancreatitis", "species": "dog" },
      "output": { "protocols": ["..."] }
    }
  ]
}

Available agent tools:

  • search_disease_database - Look up diseases by symptoms or species
  • get_treatment_protocols - Retrieve standard treatment plans
  • get_differential_diagnoses - Ranked differential diagnosis list

POST /voice/query

Voice-based query with optional image attachment.

Request (multipart form):

Field Type Required Description
audio file Yes Audio recording (m4a, wav, mp3)
species string No Animal species (default: "unknown")
session_id string No Session ID for multi-turn context
image file No Optional image for visual analysis
curl -X POST http://localhost:8000/voice/query \
  -F "[email protected]" \
  -F "species=cat" \
  -F "[email protected]"

GET /voice/audio/{audio_id}

Retrieve cached TTS audio as MP3 stream.

POST /api/community/profile

Create an anonymous pet profile for matching.

Query params: pet_name, species, breed, conditions (repeatable), location_city, location_state

curl -X POST "http://localhost:8000/api/community/profile?pet_name=Buddy&species=dog&breed=Golden&conditions=allergies&conditions=arthritis&location_city=Seattle&location_state=WA"

GET /api/community/groups/{condition}/{species}

Find support groups for a condition. Supports exact match, fuzzy/substring match, and fallback.

curl http://localhost:8000/api/community/groups/allergies/dog

GET /api/community/matches/{pet_profile_id}

Find nearby pets with similar conditions (same species, overlapping conditions, same state).

curl http://localhost:8000/api/community/matches/pet-12345

GET /api/activity/feed?limit=20

Recent community activity feed, sorted by recency.

GET /api/activity/stats

Community statistics (total diagnoses, active groups, satisfaction rating, etc.).

GET /api/activity/trending

Top 5 trending health conditions this week, sorted by score.


Project Structure

treehacks2026/
├── backend/
│   ├── main.py                     # FastAPI app, CORS, route mounting
│   ├── requirements.txt            # Python dependencies
│   ├── .env.example                # Environment variable template
│   ├── test_full_suite.py          # Comprehensive test suite (31 tests)
│   ├── models/
│   │   ├── schemas.py              # Pydantic models (AnalysisResult, etc.)
│   │   ├── consumer_schemas.py     # Consumer-friendly response model
│   │   └── community.py            # PetProfile, SupportGroup models
│   ├── routes/
│   │   ├── analyze.py              # POST /analyze (professional + consumer)
│   │   ├── chat.py                 # POST /chat
│   │   ├── voice_routes.py         # POST /voice/query, GET /voice/audio
│   │   ├── community_routes.py     # Community profiles, groups, matching
│   │   └── activity_routes.py      # Activity feed, stats, trending
│   └── services/
│       ├── vlm.py                  # Claude Vision integration
│       ├── agent.py                # Agentic chat + clarifying questions
│       ├── research_service.py     # Perplexity Sonar research lookup
│       ├── voice_service.py        # Whisper STT + OpenAI TTS
│       ├── consumer_mode.py        # Diagnosis simplification + cost estimates
│       ├── community_service.py    # Pet profiles, groups, matching
│       └── activity_service.py     # Activity feed generation
│
└── HealthDetect/                   # React Native (Expo) mobile app
    ├── app/
    │   ├── (tabs)/
    │   │   ├── index.tsx           # Home screen with voice
    │   │   ├── history.tsx         # Analysis history
    │   │   ├── learn.tsx           # Veterinary articles
    │   │   └── profile.tsx         # Practice profile
    │   ├── camera.tsx              # Camera capture
    │   ├── photo-review.tsx        # Photo preview + species input
    │   ├── processing.tsx          # Analysis loading screen
    │   └── results.tsx             # Diagnosis results + research + questions
    ├── components/
    │   └── VoiceButton.tsx         # Animated voice input button
    ├── services/
    │   ├── analysis-service.ts     # Backend API client
    │   └── voice-service.ts        # Audio recording/playback
    ├── constants/
    │   ├── types.ts                # TypeScript type definitions
    │   └── theme.ts                # Design system (colors, typography)
    └── context/
        └── AnalysisContext.tsx      # App-wide state management

Testing

Run the full test suite (31 tests across 8 sections):

cd backend
uvicorn main:app --host 0.0.0.0 --port 8000 &
python test_full_suite.py
Section Tests What's Covered
1. Core Infrastructure 2 Health check, all 11 routes registered
2. Chat Endpoint 4 Basic query, tool use, multi-turn context, image context
3. Clarifying Questions 3 High/boundary/low confidence thresholds
4. Perplexity Research 1 Summary + peer-reviewed citations
5. Community Features 6 Groups (exact/fuzzy/fallback), profiles, matching filters
6. Activity Feed 5 Feed generation, limits, sorting, stats, trending
7. Schema Validation 3 Field presence, defaults, JSON serialization
8. Consumer Mode 7 Simplification, urgency, emergency detection, costs, schema

Key Design Decisions

Decision Rationale
Voice-first UX Vets have hands on the animal during exams
Dual-mode analysis Professional mode for vets, consumer mode for pet owners
Claude Sonnet for vision Best multimodal reasoning for complex medical imagery
Perplexity Sonar for citations Real-time access to peer-reviewed literature
Clarifying questions at <85% Reduces diagnostic errors without slowing high-confidence results
Emergency auto-detection Immediate flagging of life-threatening conditions (bloat, seizures, etc.)
Fuzzy condition matching Claude returns clinical names; fuzzy match maps to support group templates
In-memory storage Hackathon scope; swap for Redis/DB in production
FIFO audio cache (50 entries) Prevents unbounded memory growth from TTS responses
Structured JSON output Enables rich UI rendering with confidence bars, risk badges, etc.

Cost Optimization

  • Audio caching: 50-entry FIFO cache reduces repeated TTS calls
  • Image compression: Mobile photos compressed before upload
  • Rate limiting: 3-second cooldown between voice queries
  • Duration limits: 30-second max recording length
  • Model selection: Uses tts-1 (not tts-1-hd) for speed
  • Conditional API calls: Clarifying questions and research only when needed

Estimated costs per 100 queries:

Query Type Cost
Voice only (no image) ~$2-3
Voice + image ~$15-20
Image analysis only ~$10-15
Chat (text only) ~$1-2

Roadmap

Phase 1 (TreeHacks) - Complete

  • Voice conversation (Whisper STT + OpenAI TTS)
  • Image analysis with Claude Sonnet
  • Agentic chat with tool use
  • Perplexity Sonar research citations
  • Clarifying questions for low-confidence diagnoses
  • Consumer mode with plain English + cost estimates
  • Community support groups and pet matching
  • Activity feed and community stats
  • Multi-turn voice session tracking
  • Mobile app with camera integration
  • Comprehensive test suite (31 tests)

Phase 2 (Post-Hackathon)

  • Frontend UI for community features
  • Consumer/professional mode toggle in app
  • Fine-tuned VLM on veterinary dataset
  • Multi-image context across conversation
  • Real-time bounding box annotations
  • Veterinary report generation (PDF export)

Phase 3 (Production)

  • Database persistence (PostgreSQL)
  • HIPAA-compliant medical record storage
  • Offline mode with local models
  • Specialist consultation marketplace
  • Analytics dashboard for clinics

Team

Built for TreeHacks 2026


License

MIT License - see LICENSE file for details


Acknowledgments

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors