VetAI

Hands-free, multimodal AI diagnostic assistant for veterinarians and pet owners

VetAI enables veterinarians to diagnose animals through voice and vision while keeping their hands on the patient — and gives pet owners plain-English explanations with cost estimates. Built for TreeHacks 2026.

The Problem

Veterinarians examine 20-30+ animals daily in high-pressure, hands-on environments. Current diagnostic tools force vets to:

Stop the examination to type queries
Navigate complex databases with dirty hands
Switch to a separate workflow for image analysis
Go without conversational back-and-forth with AI

Pet owners face a different problem: they get technical diagnoses they can't understand, don't know if it's urgent, and have no way to connect with others going through the same thing.

VetAI solves both sides.

Our Solution

VetAI is a voice-first, vision-enabled diagnostic assistant with two modes:

For Veterinarians (Professional Mode)

Voice Queries - Ask questions hands-free via Whisper STT, get spoken responses via OpenAI TTS
Image Analysis - Snap photos mid-exam for AI-powered visual diagnosis with Claude Sonnet
Agentic Reasoning - Claude searches veterinary databases and builds differential diagnoses with tool use
Evidence-Based Research - Perplexity Sonar retrieves peer-reviewed citations for every diagnosis
Clarifying Questions - When confidence is low (<85%), the AI generates targeted follow-up questions to refine the diagnosis

For Pet Owners (Consumer Mode)

Plain English Diagnoses - Technical terms converted to simple language ("atopic dermatitis" -> "allergies causing skin irritation")
Urgency Assessment - Clear guidance: emergency, high confidence, moderate confidence, or low confidence
Cost Estimates - Estimated vet visit cost ranges (routine $80-250, specialist $200-800, emergency $500-2000)
Emergency Detection - Automatic flagging of conditions like bloat, seizures, poisoning
Community Support - Find other pet owners with similar conditions nearby
Activity Feed - See community engagement, trending conditions, and success stories

Demo Flow

Vet examining dog with skin rash:
  1. Press mic: "What causes red patches on dog abdomens?"
  2. Snap photo of the affected area
  3. AI responds (voice): "Based on the image, this appears to be atopic
     dermatitis with 65% confidence..."
  4. AI shows clarifying questions:
     - "Is the rash seasonal or year-round?"
     - "Are the paws and face also affected?"
     - "Did symptoms start before age 3?"
  5. Vet answers via voice: "Yes, it's seasonal and the paws are red too"
  6. AI refines: "With seasonal presentation and paw involvement,
     atopic dermatitis is confirmed. Recommending allergy testing..."

Pet owner at home:
  1. Takes photo of their dog's rash
  2. Gets: "Allergies Causing Skin Irritation" (not "Atopic Dermatitis")
  3. Urgency: "This is a likely diagnosis, but a vet should confirm."
  4. Cost estimate: $80-$250 (routine visit)
  5. Finds 3 other dog owners with allergies in their state

Features

Implemented

Voice Conversation: OpenAI Whisper (STT) + TTS for hands-free interaction
Vision Analysis: Claude Sonnet for multimodal image diagnosis
Agentic Backend: Tool-calling architecture with disease database search, treatment protocols, and differential diagnosis
Research Citations: Perplexity Sonar API retrieves peer-reviewed veterinary literature with inline citations
Clarifying Questions: When diagnosis confidence < 85%, generates 2-3 targeted yes/no questions to help narrow down the diagnosis
Consumer Mode: Plain English diagnoses with urgency levels, cost estimates, and emergency detection for pet owners
Community Support Groups: Find relevant support groups by condition and species with fuzzy matching
Pet Profile Matching: Connect pet owners with similar conditions in the same area (same species, overlapping conditions, same state)
Activity Feed: Community engagement metrics, trending conditions, and statistics
Multi-Turn Voice Sessions: Clarifying question answers carry context across follow-up voice interactions
Mobile App: Cross-platform React Native (Expo) with camera integration
Structured Outputs: Confidence scores, risk levels, recommendations, and cited sources

In Progress

Frontend UI for Community Features: Screens for profiles, matching, groups, and activity feed
Consumer Mode Toggle: UI switch between professional and consumer views
Fine-Tuned VLM: Custom vision model trained on veterinary imagery
Multi-Image Context: Accumulate photos across conversation for progressive diagnosis

Tech Stack

Layer	Technology
Frontend	React Native (Expo), TypeScript, expo-av, expo-image-picker
Backend	FastAPI (Python 3.11+), Pydantic v2
Vision AI	Anthropic Claude Sonnet (claude-sonnet-4-20250514)
Speech	OpenAI Whisper (STT) + OpenAI TTS (tts-1, alloy voice)
Research	Perplexity Sonar Pro API
Tunneling	ngrok (for Expo Go device testing)

Architecture

                    +---------------------+
                    |    Mobile App       |
                    |  React Native/Expo  |
                    |  - Voice Input      |
                    |  - Camera Capture   |
                    |  - Results Display  |
                    +---------+-----------+
                              |
                    +---------v-----------+
                    |   FastAPI Backend    |
                    |                     |
                    |  POST /analyze      | <- Image -> diagnosis (professional or consumer)
                    |  POST /chat         | <- Agentic chat with tools
                    |  POST /voice/query  | <- Audio + optional image
                    |  GET  /voice/audio  | <- Cached TTS playback
                    |  GET  /health       | <- Status check
                    |                     |
                    |  Community & Feed   |
                    |  POST /api/community/profile
                    |  GET  /api/community/groups/{condition}/{species}
                    |  GET  /api/community/matches/{id}
                    |  GET  /api/activity/feed
                    |  GET  /api/activity/stats
                    |  GET  /api/activity/trending
                    +---------+-----------+
                              |
          +-----------+-------+-------+--------------+
          v           v               v              v
    +-----------+ +--------+  +------------+  +----------+
    |  Claude   | |Whisper |  | Perplexity |  |  OpenAI  |
    |  Sonnet   | |  STT   |  |   Sonar    |  |   TTS    |
    |(Vision+   | |        |  | (Research) |  | (Voice)  |
    | Agent)    | |        |  |            |  |          |
    +-----------+ +--------+  +------------+  +----------+

Data Flow: Image Analysis

Photo -> /analyze -> Claude Vision -> Structured diagnosis JSON
                                    -> Perplexity Sonar -> Research citations
                                    -> Claude (if conf < 85%) -> Clarifying questions
                                    -> Consumer mode? -> Simplified response + cost estimate
                                    -> Combined AnalysisResult response
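
The conditional branches in this flow can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names and stage labels are hypothetical, and only the 85% threshold and the ordering of stages come from the flow above. The sketch assumes a strict "less than" comparison, so a confidence of exactly 0.85 does not trigger questions.

```python
CLARIFY_THRESHOLD = 0.85  # from the README: questions are generated when confidence < 85%


def needs_clarification(confidence: float, threshold: float = CLARIFY_THRESHOLD) -> bool:
    """Return True when the diagnosis confidence is strictly below the threshold."""
    return confidence < threshold


def plan_analysis_steps(confidence: float, user_type: str = "professional") -> list[str]:
    """List the (hypothetically named) stages that would run for one /analyze request."""
    steps = ["claude_vision", "perplexity_research"]
    if needs_clarification(confidence):
        steps.append("clarifying_questions")
    if user_type == "consumer":
        steps.append("consumer_simplification")
    steps.append("combine_result")
    return steps


print(plan_analysis_steps(0.65, "consumer"))
```

Keeping the threshold check in one small function makes the high, boundary, and low confidence cases easy to test in isolation.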

Data Flow: Voice Query

Audio -> /voice/query -> Whisper STT -> Transcribed text
                       -> Claude Agent (with tool use) -> Text response
                       -> OpenAI TTS -> Audio response
                       -> Session tracking for multi-turn context
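
One way the session tracking step could work is an in-memory history keyed by session_id. The sketch below is an assumption for illustration only: the SessionStore class, its method names, and the 10-turn cap are hypothetical and not taken from the backend.

```python
from collections import defaultdict

MAX_TURNS = 10  # assumption: cap the history so prompts stay small


class SessionStore:
    """In-memory conversation history keyed by session_id (hypothetical helper)."""

    def __init__(self, max_turns: int = MAX_TURNS) -> None:
        self._max_turns = max_turns
        self._history = defaultdict(list)

    def append(self, session_id: str, role: str, text: str) -> None:
        """Record one turn and drop the oldest turns beyond the cap."""
        turns = self._history[session_id]
        turns.append({"role": role, "content": text})
        del turns[: max(0, len(turns) - self._max_turns)]

    def context(self, session_id: str) -> list[dict[str, str]]:
        """Return a copy of the turns to send along with the next query."""
        return list(self._history.get(session_id, []))


store = SessionStore()
store.append("voice-1", "user", "What causes red patches on dog abdomens?")
store.append("voice-1", "assistant", "Is the rash seasonal or year-round?")
store.append("voice-1", "user", "Yes, it's seasonal and the paws are red too")
print(len(store.context("voice-1")))
```

Because the store lives in process memory, the history is lost on restart, which matches the hackathon-scope storage choice described elsewhere in this README.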

Setup

Prerequisites

Node.js 18+
Python 3.11+
Expo CLI (ships with the expo package and runs via npx expo; the global expo-cli package is deprecated)
API keys: Anthropic, OpenAI, Perplexity
ngrok account (for physical device testing)

Backend

cd backend

# Install dependencies
pip install -r requirements.txt

# Configure environment
cp .env.example .env
# Edit .env with your API keys:
#   ANTHROPIC_API_KEY=sk-ant-...
#   OPENAI_API_KEY=sk-proj-...
#   PERPLEXITY_API_KEY=pplx-...

# Start server
uvicorn main:app --reload --host 0.0.0.0

Frontend

cd HealthDetect

cp .env.example .env
# Edit .env with your API URL:
#   EXPO_PUBLIC_API_URL=https://your-subdomain.ngrok-free.dev
npm install

# Start Expo
npx expo start
# Or with tunnel for remote device testing:
npx expo start --tunnel

ngrok Setup (for physical device testing)

# In a separate terminal, expose backend:
ngrok http 8000

# Copy the https URL (e.g., https://abc123.ngrok-free.dev)
# Update EXPO_PUBLIC_API_URL in .env

Physical Device Notes

When using a LAN IP, the phone and computer must be on the same Wi-Fi network (not required when going through ngrok)
Backend must run with --host 0.0.0.0
Set EXPO_PUBLIC_API_URL to your machine's LAN IP or the ngrok URL (not localhost)

API Reference

`GET /health`

Returns server status and whether the Anthropic and OpenAI API keys are configured.

{
  "status": "ok",
  "anthropic_key_set": true,
  "openai_key_set": true
}
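
A payload of this shape could be derived from the environment variables named in the Backend setup. The sketch below is an assumption about how that might look: health_payload is a hypothetical helper, not the project's route handler.

```python
import os


def health_payload(env=None) -> dict:
    """Build a /health-style payload; `env` defaults to the process environment."""
    env = os.environ if env is None else env
    return {
        "status": "ok",
        # Report only whether each key is set, never the key itself.
        "anthropic_key_set": bool(env.get("ANTHROPIC_API_KEY")),
        "openai_key_set": bool(env.get("OPENAI_API_KEY")),
    }


print(health_payload({"ANTHROPIC_API_KEY": "sk-ant-example"}))
```

Passing the environment as a parameter keeps the check testable without touching real credentials.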

`POST /analyze`

Analyze a pet image for health conditions. Supports professional and consumer modes.

Request (multipart form):

Field	Type	Required	Description
`image`	file	Yes	JPEG/PNG image of the animal
`species`	string	No	Animal species (e.g., "dog", "cat")
`symptoms`	string	No	Reported symptoms
`user_type`	string	No	`"professional"` (default) or `"consumer"`

Professional mode (default):

curl -X POST http://localhost:8000/analyze \
  -F "image=@path/to/photo.jpg" \
  -F "species=dog" \
  -F "symptoms=skin rash and redness"

Consumer mode:

curl -X POST http://localhost:8000/analyze \
  -F "image=@path/to/photo.jpg" \
  -F "species=dog" \
  -F "symptoms=skin rash" \
  -F "user_type=consumer"

Professional Response (AnalysisResult):

{
  "id": "analysis-1739520000000",
  "timestamp": "2026-02-14T10:00:00",
  "species": "dog",
  "diagnosis": {
    "primary": {
      "condition": "Canine Atopic Dermatitis",
      "commonName": "Allergic skin disease",
      "confidence": 0.65,
      "riskLevel": "moderate"
    },
    "alternatives": [
      { "condition": "Contact Dermatitis", "confidence": 0.20 }
    ]
  },
  "recommendations": [
    "Perform intradermal allergy testing",
    "Consider a hypoallergenic diet trial"
  ],
  "research_summary": "Canine atopic dermatitis is a genetically predisposed inflammatory...",
  "clinical_citations": [
    "https://pmc.ncbi.nlm.nih.gov/articles/PMC9204668/"
  ],
  "clarifying_questions": [
    "Is this a seasonal or year-round problem?",
    "Are the feet and face primarily affected?"
  ]
}

Consumer Response (ConsumerAnalysisResult):

{
  "id": "analysis-1739520000000",
  "timestamp": "2026-02-14T10:00:00",
  "simple_diagnosis": "Allergies Causing Skin Irritation",
  "technical_diagnosis": "Canine Atopic Dermatitis",
  "urgency": "low_confidence",
  "recommended_action": "We're not certain. A vet examination is recommended.",
  "is_emergency": false,
  "estimated_vet_cost": { "min": 80, "max": 250, "category": "routine" },
  "confidence": 0.65,
  "recommendations": [
    "Perform intradermal allergy testing",
    "Consider a hypoallergenic diet trial"
  ]
}

Urgency levels: "emergency" | "high_confidence" | "moderate_confidence" | "low_confidence"

Cost categories: "emergency" ($500-2000) | "specialist" ($200-800) | "routine" ($80-250)
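
The urgency levels and cost categories above could be produced by logic along these lines. Treat this as a sketch under stated assumptions: the function names are hypothetical, the keyword list is derived from the emergency examples in this README (bloat, seizures, poisoning), and the 0.85 and 0.70 confidence cut-offs are guesses chosen so that the 0.65 sample maps to "low_confidence". Only the cost ranges are taken directly from the text.

```python
COST_RANGES = {  # USD, from the cost categories listed above
    "routine": (80, 250),
    "specialist": (200, 800),
    "emergency": (500, 2000),
}
EMERGENCY_KEYWORDS = ("bloat", "seizure", "poison")  # stems of the README's examples


def is_emergency(condition: str) -> bool:
    """Flag conditions whose name contains an emergency keyword."""
    lowered = condition.lower()
    return any(keyword in lowered for keyword in EMERGENCY_KEYWORDS)


def urgency_for(condition: str, confidence: float) -> str:
    """Map a diagnosis to one of the four urgency levels."""
    if is_emergency(condition):
        return "emergency"
    if confidence >= 0.85:  # assumed cut-off
        return "high_confidence"
    if confidence >= 0.70:  # assumed cut-off
        return "moderate_confidence"
    return "low_confidence"


def estimated_vet_cost(category: str) -> dict:
    """Return the cost range in the shape used by estimated_vet_cost."""
    low, high = COST_RANGES[category]
    return {"min": low, "max": high, "category": category}


print(urgency_for("Canine Atopic Dermatitis", 0.65), estimated_vet_cost("routine"))
```

Checking for emergencies before looking at confidence means a life-threatening condition is flagged even when the model is unsure.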

`POST /chat`

Multi-turn agentic chat with veterinary tool use.

Request:

{
  "session_id": "chat-789",
  "message": "What tests should I run for suspected pancreatitis?",
  "image_context": { "species": "dog" }
}

Response:

{
  "session_id": "chat-789",
  "message": "For suspected pancreatitis in dogs, I recommend...",
  "toolResults": [
    {
      "tool": "get_treatment_protocols",
      "input": { "condition": "pancreatitis", "species": "dog" },
      "output": { "protocols": ["..."] }
    }
  ]
}

Available agent tools:

search_disease_database - Look up diseases by symptoms or species
get_treatment_protocols - Retrieve standard treatment plans
get_differential_diagnoses - Ranked differential diagnosis list
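
A tool-calling backend typically routes each tool request from the model to a local function by name. The sketch below shows that dispatch pattern with the three tool names listed above. The function bodies are empty stubs, and the parameter names and the run_tool helper are assumptions; the returned dictionary mirrors the toolResults entries shown in the response example.

```python
from typing import Any, Callable


def search_disease_database(symptoms: str, species: str = "unknown") -> dict[str, Any]:
    """Stub: a real version would query the disease database."""
    return {"matches": []}


def get_treatment_protocols(condition: str, species: str = "unknown") -> dict[str, Any]:
    """Stub: a real version would return standard treatment plans."""
    return {"protocols": []}


def get_differential_diagnoses(symptoms: str, species: str = "unknown") -> dict[str, Any]:
    """Stub: a real version would return a ranked differential list."""
    return {"differentials": []}


TOOLS: dict[str, Callable[..., dict[str, Any]]] = {
    "search_disease_database": search_disease_database,
    "get_treatment_protocols": get_treatment_protocols,
    "get_differential_diagnoses": get_differential_diagnoses,
}


def run_tool(name: str, tool_input: dict[str, Any]) -> dict[str, Any]:
    """Dispatch one tool call and wrap it in a toolResults-style entry."""
    handler = TOOLS.get(name)
    if handler is None:
        output: dict[str, Any] = {"error": f"unknown tool: {name}"}
    else:
        output = handler(**tool_input)
    return {"tool": name, "input": tool_input, "output": output}


print(run_tool("get_treatment_protocols", {"condition": "pancreatitis", "species": "dog"}))
```

Returning an error entry for unknown tool names, rather than raising, lets the agent loop report the failure back to the model and continue.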

`POST /voice/query`

Voice-based query with optional image attachment.

Request (multipart form):

Field	Type	Required	Description
`audio`	file	Yes	Audio recording (m4a, wav, mp3)
`species`	string	No	Animal species (default: "unknown")
`session_id`	string	No	Session ID for multi-turn context
`image`	file	No	Optional image for visual analysis

curl -X POST http://localhost:8000/voice/query \
  -F "audio=@path/to/recording.m4a" \
  -F "species=cat" \
  -F "image=@path/to/photo.jpg"

`GET /voice/audio/{audio_id}`

Retrieve cached TTS audio as an MP3 stream.

`POST /api/community/profile`

Create an anonymous pet profile for matching.

Query params: pet_name, species, breed, conditions (repeatable), location_city, location_state

curl -X POST "http://localhost:8000/api/community/profile?pet_name=Buddy&species=dog&breed=Golden&conditions=allergies&conditions=arthritis&location_city=Seattle&location_state=WA"

`GET /api/community/groups/{condition}/{species}`

Find support groups for a condition. Supports exact match, fuzzy/substring match, and fallback.

curl http://localhost:8000/api/community/groups/allergies/dog
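
The three-tier lookup (exact, fuzzy/substring, fallback) might be implemented along these lines. This is a sketch, not the project's matcher: the group names, the fallback group, the 0.6 similarity cutoff, and the use of difflib are all assumptions, and species filtering is left out for brevity.

```python
from difflib import get_close_matches

GROUPS = {  # hypothetical support group templates keyed by condition
    "allergies": "Pet Allergy Support",
    "arthritis": "Joint Health Circle",
    "diabetes": "Diabetic Pets Network",
}
FALLBACK_GROUP = "General Pet Health"  # hypothetical catch-all group


def find_group(condition: str) -> tuple[str, str]:
    """Return (group name, match type) for a condition string."""
    key = condition.strip().lower()
    if key in GROUPS:
        return GROUPS[key], "exact"
    for name, group in GROUPS.items():
        # Substring match in either direction, e.g. "canine arthritis".
        if key and (name in key or key in name):
            return group, "substring"
    close = get_close_matches(key, list(GROUPS), n=1, cutoff=0.6)
    if close:
        return GROUPS[close[0]], "fuzzy"
    return FALLBACK_GROUP, "fallback"


for query in ("Allergies", "canine arthritis", "arthritus", "xyz"):
    print(query, "->", find_group(query))
```

Returning the match type alongside the group makes it easy for tests to assert which tier handled a given query.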

`GET /api/community/matches/{pet_profile_id}`

Find nearby pets with similar conditions (same species, overlapping conditions, same state).

curl http://localhost:8000/api/community/matches/pet-12345
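
The three matching criteria (same species, overlapping conditions, same state) translate into a simple filter. In the sketch below, the PetProfile fields are a simplified stand-in for the real model, and ranking matches by the number of shared conditions is an assumption that the README does not state.

```python
from dataclasses import dataclass, field


@dataclass
class PetProfile:
    """Simplified stand-in for the backend's profile model."""
    id: str
    species: str
    location_state: str
    conditions: set[str] = field(default_factory=set)


def find_matches(target: PetProfile, candidates: list[PetProfile]) -> list[PetProfile]:
    """Return profiles with the same species and state and at least one shared condition."""
    scored = []
    for other in candidates:
        if other.id == target.id:
            continue  # never match a profile with itself
        shared = target.conditions & other.conditions
        if (
            shared
            and other.species == target.species
            and other.location_state == target.location_state
        ):
            scored.append((len(shared), other))
    # Assumed ranking: most shared conditions first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [profile for _, profile in scored]


buddy = PetProfile("pet-1", "dog", "WA", {"allergies", "arthritis"})
others = [
    PetProfile("pet-2", "dog", "WA", {"allergies"}),
    PetProfile("pet-3", "cat", "WA", {"allergies"}),
    PetProfile("pet-4", "dog", "CA", {"allergies"}),
]
print([p.id for p in find_matches(buddy, others)])
```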

`GET /api/activity/feed?limit=20`

Recent community activity feed, sorted by recency.

`GET /api/activity/stats`

Community statistics (total diagnoses, active groups, satisfaction rating, etc.).

`GET /api/activity/trending`

Top 5 trending health conditions this week, sorted by score.
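
Both the activity feed (sorted by recency, capped by limit) and the trending list (top 5 by score) reduce to a sort followed by a slice. The sketch below assumes each item is a dictionary with a "timestamp" or "score" key and that timestamps are ISO-8601 strings in a single timezone, which sort correctly as text. These field names are assumptions, not the backend's actual schema.

```python
def recent_feed(events: list[dict], limit: int = 20) -> list[dict]:
    """Newest events first, capped at `limit` entries."""
    return sorted(events, key=lambda e: e["timestamp"], reverse=True)[:limit]


def top_trending(conditions: list[dict], top_n: int = 5) -> list[dict]:
    """Highest-scoring conditions first, capped at `top_n` entries."""
    return sorted(conditions, key=lambda c: c["score"], reverse=True)[:top_n]


events = [
    {"id": 1, "timestamp": "2026-02-14T09:00:00"},
    {"id": 2, "timestamp": "2026-02-14T10:00:00"},
    {"id": 3, "timestamp": "2026-02-13T10:00:00"},
]
print([e["id"] for e in recent_feed(events, limit=2)])
```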

Project Structure

treehacks2026/
├── backend/
│   ├── main.py                     # FastAPI app, CORS, route mounting
│   ├── requirements.txt            # Python dependencies
│   ├── .env.example                # Environment variable template
│   ├── test_full_suite.py          # Comprehensive test suite (31 tests)
│   ├── models/
│   │   ├── schemas.py              # Pydantic models (AnalysisResult, etc.)
│   │   ├── consumer_schemas.py     # Consumer-friendly response model
│   │   └── community.py            # PetProfile, SupportGroup models
│   ├── routes/
│   │   ├── analyze.py              # POST /analyze (professional + consumer)
│   │   ├── chat.py                 # POST /chat
│   │   ├── voice_routes.py         # POST /voice/query, GET /voice/audio
│   │   ├── community_routes.py     # Community profiles, groups, matching
│   │   └── activity_routes.py      # Activity feed, stats, trending
│   └── services/
│       ├── vlm.py                  # Claude Vision integration
│       ├── agent.py                # Agentic chat + clarifying questions
│       ├── research_service.py     # Perplexity Sonar research lookup
│       ├── voice_service.py        # Whisper STT + OpenAI TTS
│       ├── consumer_mode.py        # Diagnosis simplification + cost estimates
│       ├── community_service.py    # Pet profiles, groups, matching
│       └── activity_service.py     # Activity feed generation
│
└── HealthDetect/                   # React Native (Expo) mobile app
    ├── app/
    │   ├── (tabs)/
    │   │   ├── index.tsx           # Home screen with voice
    │   │   ├── history.tsx         # Analysis history
    │   │   ├── learn.tsx           # Veterinary articles
    │   │   └── profile.tsx         # Practice profile
    │   ├── camera.tsx              # Camera capture
    │   ├── photo-review.tsx        # Photo preview + species input
    │   ├── processing.tsx          # Analysis loading screen
    │   └── results.tsx             # Diagnosis results + research + questions
    ├── components/
    │   └── VoiceButton.tsx         # Animated voice input button
    ├── services/
    │   ├── analysis-service.ts     # Backend API client
    │   └── voice-service.ts        # Audio recording/playback
    ├── constants/
    │   ├── types.ts                # TypeScript type definitions
    │   └── theme.ts                # Design system (colors, typography)
    └── context/
        └── AnalysisContext.tsx      # App-wide state management

Testing

Run the full test suite (31 tests across 8 sections):

cd backend
uvicorn main:app --host 0.0.0.0 --port 8000 &
python test_full_suite.py

Section	Tests	What's Covered
1. Core Infrastructure	2	Health check, all 11 routes registered
2. Chat Endpoint	4	Basic query, tool use, multi-turn context, image context
3. Clarifying Questions	3	High/boundary/low confidence thresholds
4. Perplexity Research	1	Summary + peer-reviewed citations
5. Community Features	6	Groups (exact/fuzzy/fallback), profiles, matching filters
6. Activity Feed	5	Feed generation, limits, sorting, stats, trending
7. Schema Validation	3	Field presence, defaults, JSON serialization
8. Consumer Mode	7	Simplification, urgency, emergency detection, costs, schema

Key Design Decisions

Decision	Rationale
Voice-first UX	Vets have hands on the animal during exams
Dual-mode analysis	Professional mode for vets, consumer mode for pet owners
Claude Sonnet for vision	Best multimodal reasoning for complex medical imagery
Perplexity Sonar for citations	Real-time access to peer-reviewed literature
Clarifying questions at <85%	Reduces diagnostic errors without slowing high-confidence results
Emergency auto-detection	Immediate flagging of life-threatening conditions (bloat, seizures, etc.)
Fuzzy condition matching	Claude returns clinical names; fuzzy match maps to support group templates
In-memory storage	Hackathon scope; swap for Redis/DB in production
FIFO audio cache (50 entries)	Prevents unbounded memory growth from TTS responses
Structured JSON output	Enables rich UI rendering with confidence bars, risk badges, etc.
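
The FIFO audio cache decision can be illustrated with a small bounded cache. This is a minimal sketch assuming audio is stored as bytes keyed by an audio ID; the FifoAudioCache class is hypothetical and only the 50-entry limit and first-in-first-out eviction come from the table above.

```python
from collections import OrderedDict
from typing import Optional


class FifoAudioCache:
    """Bounded cache that evicts the oldest inserted entry first."""

    def __init__(self, max_entries: int = 50) -> None:
        self._max_entries = max_entries
        self._entries = OrderedDict()

    def put(self, audio_id: str, audio: bytes) -> None:
        """Store audio, evicting the oldest entry if the cache is full."""
        if audio_id not in self._entries and len(self._entries) >= self._max_entries:
            self._entries.popitem(last=False)  # drop the first-inserted item
        self._entries[audio_id] = audio

    def get(self, audio_id: str) -> Optional[bytes]:
        """Look up audio without reordering entries (FIFO, not LRU)."""
        return self._entries.get(audio_id)

    def __len__(self) -> int:
        return len(self._entries)


cache = FifoAudioCache(max_entries=2)
cache.put("a", b"first")
cache.put("b", b"second")
cache.put("c", b"third")  # evicts "a"
print(cache.get("a"), len(cache))
```

Unlike an LRU cache, reads do not refresh an entry's position, so memory use stays bounded with the simplest possible bookkeeping.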

Cost Optimization

Audio caching: 50-entry FIFO cache reduces repeated TTS calls
Image compression: Mobile photos compressed before upload
Rate limiting: 3-second cooldown between voice queries
Duration limits: 30-second max recording length
Model selection: Uses tts-1 (not tts-1-hd) for speed and lower per-character cost
Conditional API calls: Clarifying questions and research only when needed
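
The cooldown and duration limits can be enforced with a small guard object. The sketch below is an assumption about how such a check might look: VoiceQueryGuard and its messages are hypothetical, and the README does not say whether the limits are applied in the app or on the server. Only the 3-second and 30-second values come from the list above.

```python
import time

COOLDOWN_SECONDS = 3.0        # from the README: cooldown between voice queries
MAX_RECORDING_SECONDS = 30.0  # from the README: maximum recording length


class VoiceQueryGuard:
    """Reject voice queries that are too long or arrive too soon after the last one."""

    def __init__(self, clock=time.monotonic) -> None:
        self._clock = clock  # injectable so tests can control time
        self._last_accepted = {}

    def check(self, client_id: str, recording_seconds: float) -> tuple[bool, str]:
        """Return (allowed, reason) and record the time of accepted queries."""
        if recording_seconds > MAX_RECORDING_SECONDS:
            return False, "recording too long"
        now = self._clock()
        last = self._last_accepted.get(client_id)
        if last is not None and now - last < COOLDOWN_SECONDS:
            return False, "cooldown active"
        self._last_accepted[client_id] = now
        return True, "ok"


guard = VoiceQueryGuard()
print(guard.check("device-1", recording_seconds=12.0))
print(guard.check("device-1", recording_seconds=12.0))
```

Rejected queries do not reset the cooldown, so a client that retries too early is not penalized further.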

Estimated costs per 100 queries:

Query Type	Cost
Voice only (no image)	~$2-3
Voice + image	~$15-20
Image analysis only	~$10-15
Chat (text only)	~$1-2
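
These per-100-query figures can be turned into a rough estimate for a given usage mix. The sketch below uses the ranges from the table; the estimate_cost helper, the category keys, and the example mix are hypothetical.

```python
COST_PER_100 = {  # USD (low, high) per 100 queries, from the table above
    "voice_only": (2, 3),
    "voice_image": (15, 20),
    "image_only": (10, 15),
    "chat_text": (1, 2),
}


def estimate_cost(query_counts: dict[str, int]) -> tuple[float, float]:
    """Return the (low, high) estimated cost in USD for a mix of query types."""
    low = sum(COST_PER_100[kind][0] * count / 100 for kind, count in query_counts.items())
    high = sum(COST_PER_100[kind][1] * count / 100 for kind, count in query_counts.items())
    return low, high


# Example mix: 30 voice + image queries and 50 text chats.
print(estimate_cost({"voice_image": 30, "chat_text": 50}))  # (5.0, 7.0)
```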

Roadmap

Phase 1 (TreeHacks) - Complete

Phase 2 (Post-Hackathon)

Frontend UI for community features
Consumer/professional mode toggle in app
Fine-tuned VLM on veterinary dataset
Multi-image context across conversation
Real-time bounding box annotations
Veterinary report generation (PDF export)

Phase 3 (Production)

Database persistence (PostgreSQL)
Privacy-compliant medical record storage (veterinary records fall under state confidentiality rules rather than HIPAA)
Offline mode with local models
Specialist consultation marketplace
Analytics dashboard for clinics

Team

Built for TreeHacks 2026

License

MIT License - see LICENSE file for details

Acknowledgments

Anthropic Claude for multimodal reasoning
OpenAI Whisper for speech recognition
Perplexity Sonar for real-time research
Expo for mobile development framework
Veterinary professionals who inspired this project
