v1.1 — Edge-Case Testing + API Mocking

AI builds, tests, and verifies your Compose app

ComposeProof is an MCP server that gives AI assistants the full loop: render UI headlessly, mock backend APIs on-device, generate edge-case tests, and verify everything — no emulator needed.

Get Started npx composeproof

Works with any MCP client

Claude Code Cursor Gemini CLI Android Studio
composeproof
$ claude
You: test OrderScreen with every edge case
AI: I'll analyze the composable and generate edge cases.
generate_edge_cases OrderScreen.kt
render_batch 12 edge-case previews
mock_api GET /api/orders → empty, error, 500 items
take_screenshot 3 device states
PASS (15/15) — 12 headless + 3 on-device, 0 overflows, 0 crashes

The problem with AI + Android today

AI coding assistants are powerful at writing code, but blind when it comes to UI. The feedback loop is broken.

AI writes code but can't verify it works

Your AI assistant writes Compose UI code but never sees the result. Empty lists, overflowing text, broken error states — all shipped to QA because nobody tested the edge cases.

Backend state is untested visually

90% of apps depend on API responses. What does your screen look like when the API returns an error? An empty list? 500 items? Nobody mocks the backend to find out.

Edge-case testing is manual or nonexistent

Paparazzi and Roborazzi test the states you write. But who writes the empty string test? The RTL test? The 200-character name test? Nobody generates edge cases automatically.

Three layers. End-to-end coverage.

From headless rendering to API mocking to in-process inspection — AI gets the full feedback loop, not just screenshots.

Headless Rendering

Build-time / CI

AI analyzes composable types, generates edge-case @Preview functions, renders them headlessly, and verifies — no device needed.

# AI generates and renders edge cases
generate_edge_cases OrderScreen.kt
12 edge cases: empty, overflow, RTL...
render_batch 12 previews
✓ 12/12 rendered, 0 overflows
generate_edge_cases render verify render_batch diff
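To make the idea concrete, here is a minimal sketch of the kind of inputs edge-case generation targets for a String parameter. The function name and case labels are illustrative, not the tool's actual output:

```kotlin
// Illustrative edge-case inputs for a String field: empty, whitespace,
// overflow, RTL, and emoji — the categories named above.
fun stringEdgeCases(fieldName: String): Map<String, String> = mapOf(
    "$fieldName-empty" to "",
    "$fieldName-whitespace" to "   ",
    "$fieldName-overflow" to "x".repeat(200),              // 200-character name test
    "$fieldName-rtl" to "\u0645\u0631\u062D\u0628\u0627", // RTL (Arabic) text
    "$fieldName-emoji" to "\uD83D\uDE00\uD83C\uDF89",
)

fun main() {
    stringEdgeCases("customerName").forEach { (case, value) ->
        println("$case -> ${value.length} chars")
    }
}
```

Each generated case becomes one @Preview function, rendered and verified headlessly.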

API Mocking & Device

Run-time / Zero-install

Mock any API endpoint on a live device — zero app code changes. AI controls what the backend returns, then screenshots the result.

# AI mocks the backend and verifies
mock_api GET /orders → empty list
✓ Proxy active, stub loaded
take_screenshot
✓ Empty state renders correctly
mock_api GET /orders → 500 error
✓ Error state shows retry button
mock_api device_interact take_screenshot inspect_ui_tree build_and_deploy
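The real tool orchestrates WireMock plus an ADB-configured proxy; as a minimal sketch of the same idea, here is a local stub endpoint built on the JDK's built-in HttpServer, whose response the AI swaps between states (names and port are hypothetical):

```kotlin
import com.sun.net.httpserver.HttpServer
import java.net.InetSocketAddress

// Minimal stub server: one endpoint, one canned response. Swapping the body
// between "[]", an error payload, or 500 items swaps the app's visible state.
fun startStub(port: Int, body: String, status: Int = 200): HttpServer {
    val server = HttpServer.create(InetSocketAddress(port), 0)
    server.createContext("/api/orders") { exchange ->
        val bytes = body.toByteArray()
        exchange.responseHeaders.add("Content-Type", "application/json")
        exchange.sendResponseHeaders(status, bytes.size.toLong())
        exchange.responseBody.use { it.write(bytes) }
    }
    server.start()
    return server
}

fun main() {
    val server = startStub(8089, "[]")  // empty-list state
    val reply = java.net.URL("http://localhost:8089/api/orders").readText()
    println("GET /api/orders -> $reply")
    server.stop(0)
}
```

The device-side proxy makes the app's real HTTP calls land on a stub like this, with zero app code changes.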

Embedded Agent

In-process / SDK

An in-app debug SDK gives the AI direct access to app internals — permissions, lifecycle, navigation, DataStore, coroutines — from inside the process.

# AI inspects app internals
inspect_permissions
CAMERA — denied (rationale needed)
simulate_process_death
Killed process, relaunched...
✓ Activity restored after process death
inspect_permissions inspect_lifecycle inspect_navigation simulate_process_death
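The shape of the embedded agent can be sketched as a tiny in-process registry mapping tool names to inspectors that return structured data. Everything here is hypothetical — the real SDK hooks into Android runtime APIs such as PackageManager and the lifecycle framework:

```kotlin
// Illustrative in-process tool registry: each inspector runs inside the app
// and returns a structured snapshot the MCP layer can serialize.
class EmbeddedAgent {
    private val tools = mutableMapOf<String, () -> Map<String, Any?>>()

    fun register(name: String, inspector: () -> Map<String, Any?>) {
        tools[name] = inspector
    }

    fun call(name: String): Map<String, Any?> =
        tools[name]?.invoke() ?: mapOf("error" to "unknown tool: $name")
}

fun main() {
    val agent = EmbeddedAgent()
    // A real inspector would read PackageManager state; this one is canned data.
    agent.register("inspect_permissions") {
        mapOf("CAMERA" to "denied", "rationaleNeeded" to true)
    }
    println(agent.call("inspect_permissions"))
}
```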

See it in action

A real session testing StickerExplode — a Compose Multiplatform sticker canvas app.

Generated Health Report

composeproof-report.html
Project Health
72/100
StickerExplode
4 assertions · 0 failed
PASS Has @Preview functions (found 4)
PASS Compose dependencies present
PASS Kotlin Multiplatform configured
PASS Golden baseline coverage > 0%
Compose Intelligence
Kotlin Files: 24 · Lines of Code: 1,847 · @Preview: 4 · Sticker Types: 16
Context Graph
composeApp commonMain, androidMain, desktopMain, wasmJsMain
StickerCanvas.kt StickerData, StickerType, DragState
+ 22 more files...

One command generates a self-contained HTML report — assertions, screenshots, compose intelligence, and context graph. Share it, archive it, diff it.

Token Cost

token usage estimate
MCP Tool Call                          Input    Output
insights (project overview)             ~200      ~500
list_previews (discover 4 @Preview)     ~200      ~300
take_device_screenshot (before)         ~200      ~800
device_interact (swipe drag)            ~300      ~800
take_device_screenshot (after)          ~200      ~800
get_context (scope=structure)           ~200    ~2,000
generate_report                         ~300      ~500
Total                                 ~1,600    ~5,700
Grand total                                ~7,300 tokens

Full project analysis — discover, screenshot, interact, and report — for under 7,500 tokens. That's less than reading a single large file.

Session Flow

composeproof session
# 7 tool calls. Full project story.
insights
list_previews
take_device_screenshot
device_interact swipe drag
take_device_screenshot
get_context scope=structure
generate_report
✓ Report saved — 42 KB, score 72/100

This isn't a mock — this is a real session captured from Claude Code + ComposeProof.

Watch the demo videos

One prompt, zero human intervention. AI builds the app, discovers UI patterns, tests interactions, and reports results.

Recorded on a Pixel 9 Pro Fold with StickerExplode demo app.

Build, deploy & launch

AI runs preflight to detect the device, picks the right Gradle variant, builds, installs, and launches — all from a single natural language prompt.

AI discovers the interaction model

The FAB tap fails, so the AI reads source code, discovers the FAB opens a bottom sheet sticker tray, and switches strategy. Every action auto-screenshots.

Drag testing with coordinate recalculation

ADB swipe doesn't trigger Compose gestures. The AI switches to raw touch events, detects resolution mismatch, recalculates coordinates, and successfully drags stickers.
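The recalculation step is simple proportional rescaling from screenshot pixel space to the device's input coordinate space. A minimal sketch (hypothetical helper, not ComposeProof's code; the resolutions are illustrative):

```kotlin
// Rescale a tap/drag point from screenshot pixel space to device input space —
// the coordinate recalculation described above.
data class Point(val x: Int, val y: Int)

fun rescale(p: Point, shotW: Int, shotH: Int, deviceW: Int, deviceH: Int): Point =
    Point(
        x = p.x * deviceW / shotW,
        y = p.y * deviceH / shotH,
    )

fun main() {
    // Screenshot came back at 1080x2424, but the display reports 2076x2152.
    val onScreenshot = Point(540, 1212)
    val onDevice = rescale(onScreenshot, 1080, 2424, 2076, 2152)
    println(onDevice)  // Point(x=1038, y=1076)
}
```

Without this step, a tap aimed at the screenshot's center lands off-target on the device.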

Results & persistent screenshots

AI produces a structured test report (3/3 pass), notes z-ordering and animations. Every screenshot from the session is saved to disk for human review.

Full uncut session

6 minutes — from prompt to test report, no edits.

40 tools shipped. More coming.

Every tool is an MCP endpoint your AI can call. Edge-case testing, API mocking, headless rendering, device interaction, and more.

insights

Project overview — preview count, golden coverage, device status

Observability
render

Render any @Preview headlessly via Roborazzi → PNG

Compose
list_previews

Discover all @Preview functions — file, line, params

Compose
verify

Single-call PASS/FAIL: render + golden + accessibility

Compose
render_batch

Render/verify multiple previews with compact summary

Compose
diff

Golden management: verify, record, or update baselines

Compose
preflight

Check device + app state — connected, installed, screen

Observability
inspect_ui_tree

Dump live Compose/View hierarchy with a11y warnings

UI Inspection
device_interact

Tap, swipe, type, scroll — AI navigates the app

UI Control
get_recomposition_stats

Find recomposition hotspots via compiler metrics

Performance
take_device_screenshot

Capture device screen, auto-saved to disk

UI Inspection
build_and_deploy

Gradle build + install APK on device

Observability
get_build_status

Check build success and APK version match

Observability
get_network_logs

Capture OkHttp HTTP traffic from logcat

Observability
manage_proxy

Set/clear device HTTP proxy

Data
get_feature_flags

Read/write SharedPreferences

Data
inspect_permissions

Runtime permissions — granted, denied, rationale needed

Embedded Agent
inspect_process_lifecycle

Activity/Fragment lifecycle states for all components

Embedded Agent
inspect_navigation_graph

Navigation graph, back stack, deep link patterns

Embedded Agent
inspect_datastore

Jetpack DataStore preferences — all keys and values

Embedded Agent
inspect_coroutine_state

Active coroutines — state, dispatchers, job hierarchy

Embedded Agent
execute_deeplink

Fire a deep link URI and report which handler resolved it

Embedded Agent
simulate_process_death

Recreate Activity to test save/restore state handling

Embedded Agent
accessibility-checker

WCAG 2.1 audit — touch targets, contrast, TalkBack

Expert Prompts
compose-performance

Recomposition traps — unstable params, lambda allocations

Expert Prompts
kmp-architect

KMP architecture — shared code, expect/actual patterns

Expert Prompts
ui-reviewer

Visual quality — spacing, typography, Material 3 compliance

Expert Prompts
screenshot-test-writer

Generate test code for Paparazzi, Roborazzi, or goldens

Expert Prompts
spec-verifier

Full spec-driven verification — parse spec, map, verify, report

Expert Prompts
generate_edge_cases

Analyze composable types and suggest edge-case @Preview tests

Testing
mock_api

Start/stop mock API server — intercept real API calls on device

Testing
semantic_ui_query

Query Compose UI tree by semantic role, text, or test tag

Embedded Agent
profile_lazy_list

Profile LazyColumn/LazyRow scroll performance

Embedded Agent
inspect_compose_state

Read remembered/derived state values from live composables

Embedded Agent
track_recompositions

Count recompositions per composable per frame

Embedded Agent
analyze_stability

Report stability classification of composable params

Embedded Agent
inspect_shared_preferences

Read/write SharedPreferences from inside the app

Embedded Agent
inspect_viewmodel_state

Snapshot ViewModel state fields and StateFlow values

Embedded Agent
inspect_current_screen

Get current screen route and visible composables

Embedded Agent
inspect_network_logs

In-process HTTP traffic capture via OkHttp interceptor

Embedded Agent
detect_memory_leaks planned

LeakCanary heap analysis with reference chains

Performance
profile_startup planned

Cold/warm/hot start breakdown with bottlenecks

Performance

Up and running in 3 steps

Zero-install architecture. No changes to your project's build files.

Run the setup wizard

One command installs the binary, lets you pick your AI agents (Claude Code, Gemini CLI, Cursor), configures MCP, and optionally installs the Compose UI skill.

npx composeproof

AI builds and renders

Your AI assistant writes code, renders previews headlessly, generates edge-case tests, and iterates on UI autonomously.

You: "build the order screen"
AI:  writes code → renders 12 edge cases → PASS ✓

Verify end-to-end

Mock backend APIs on a real device, screenshot every state, and confirm the full loop works. No Postman, no manual QA.

AI: mock_api GET /orders → empty, error, 500 items
    take_screenshot × 3 states
    ✓ all states render correctly

AI Skill

Teach AI the render-review-fix loop

ComposeProof ships a skill — a behavioral instruction that teaches AI agents how to do Compose UI work, rather than just handing them tools.

Without the skill

Default AI behavior

✍️ AI writes Compose code
➡️ Moves on — never checks visually
🐛 Wrong modifier order, broken layout
😰 Render fails → panics, random fixes

With the skill loaded

AI knows the rules

✍️ WRITE the composable
👀 RENDER immediately to check
🔍 REVIEW — alignment, spacing, tap targets
🔧 FIX based on screenshot feedback
🔄 RE-RENDER to verify the fix
🛟 If stuck → asks you with structured choices

Render after every change

AI checks its own work visually, not just at the end

Read screenshots critically

Checklist: layout, spacing, contrast, states, overflow

Mock and verify APIs

Swap backend responses on-device, screenshot every state

Recover from errors

Decision tree for failures instead of flailing

The full loop, not just screenshots

AI writes the code, renders it headlessly, generates edge-case tests, mocks backend APIs on a real device, and screenshots every state. One prompt, full verification.

Works with Claude Code, Gemini CLI, Cursor, and any MCP client. Same 40 tools, same skill, every agent.

$ claude "test OrderScreen with every edge case"

 generate_edge_cases OrderScreen.kt
 render_batch 12 previews
 mock_api GET /orders → empty, error, 500 items
 take_screenshot × 3 device states
PASS 15/15 — 0 overflows, 0 crashes

Roadmap

40 tools shipped across 5 waves. Edge-case testing & API mocking shipped. Multiplatform next.

1

Headless Rendering & Verification

Ready
render · list_previews · verify · render_batch · diff · insights
2

Device Inspection & Interaction

Ready
preflight · inspect_ui_tree · device_interact · take_device_screenshot · build_and_deploy · get_build_status · get_recomposition_stats · get_network_logs · manage_proxy · get_feature_flags
3

Embedded Agent — Runtime Inspection

Ready
inspect_permissions · inspect_process_lifecycle · inspect_navigation_graph · inspect_datastore · inspect_coroutine_state · execute_deeplink · simulate_process_death
4

AI Skills & Session Learning

Ready
npx composeproof installer · compose-ui-workflow skill · MCP instructions · expert prompts (6) · license gating
5

Edge-Case Testing & API Mocking

Ready
generate_edge_cases · mock_api · composable type analysis · WireMock integration · ADB proxy orchestration · stale proxy safety
6

Multiplatform & CI

Planned
CMP rendering · Gradle plugin · GitHub Action · HTML reports · SSE transport

Get started

One command. Works on any Compose project, no build file changes.

terminal
$ npx composeproof
Free
$0
  • Headless rendering + screenshots
  • Golden diffing + visual regression
  • Device interaction + inspection
  • HTML reports + expert prompts
  • ~30 MCP tools, all open source
Pro
Coming soon
Free for early supporters
  • Everything in Free
  • Compose Intelligence (5 tools)
  • Embedded Agent (11 tools)
  • Companion app
  • Session screenshots in reports