Gallery image captions:
- Drone to Phone to PC setup
- Dispatcher-facing application with OpenStreetMap pathfinding and straight, flyable line segments as waypoints in the drone GPS software
- Drone classifies patient
- Drone sees non-patient
- Patient photo upload (stored with AWS Rekognition)
- Settings page for the infrastructure we built to control the drone app UI by spoofing Android accessibility and forwarding data packets via sockets
Inspiration
This project started as a real problem I faced in the last few weeks. Several friends at Duke got sick with pneumonia, and while they were recovering, something as simple as picking up medication became a major burden. Many were tired and weak, and often couldn’t move around easily, so they had to depend on others for basic delivery logistics. Every pickup meant delays, coordination risk, and stress for both the patients and the people trying to help them.
That felt wrong—especially since timely medication can directly affect recovery. Skyheart was built to reduce that friction: an autonomous system that can travel to a destination, verify the correct person, and complete a safe handoff with minimal manual control.
What it does
Skyheart is an end-to-end autonomous drone medication delivery system.
A dispatcher opens the app, enters a patient’s address, and uploads a reference photo. The drone is then routed to that exact location using turn-by-turn road directions (not direct-flight shortcuts), which is safer in dense urban environments.
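Turning a turn-by-turn road route into flyable waypoints means splitting each road segment into short, straight legs. A minimal sketch of that segmentation, assuming the route arrives as a list of `(lat, lon)` points; the function names and the 50 m maximum leg length are illustrative, not the project's actual values:

```python
import math

def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * 6371000 * math.asin(math.sqrt(h))

def densify(route, max_leg_m=50.0):
    """Split each straight road segment into waypoints no more than
    max_leg_m apart, by linear interpolation between its endpoints."""
    waypoints = [route[0]]
    for start, end in zip(route, route[1:]):
        n = max(1, math.ceil(haversine_m(start, end) / max_leg_m))
        for i in range(1, n + 1):
            t = i / n
            waypoints.append((start[0] + t * (end[0] - start[0]),
                              start[1] + t * (end[1] - start[1])))
    return waypoints
```

Short legs like these give the drone's GPS software simple straight-line targets while still tracing the road network rather than a direct-flight shortcut.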
When the drone arrives, it enters identification mode: onboard vision segments and detects people, then verifies identity by comparing the detected face against the reference photo using AWS Rekognition. The drone only descends and releases the medication after positive identification.
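The descend-only-after-positive-identification gate can be sketched as a small decision helper over a Rekognition CompareFaces response. The response shape (`FaceMatches` entries with a `Similarity` score) follows the AWS Rekognition API; the threshold value and function name are our illustrative choices, not necessarily the project's:

```python
# The actual comparison would come from a boto3 call along the lines of:
#   response = boto3.client("rekognition").compare_faces(
#       SourceImage={"Bytes": reference_jpeg},
#       TargetImage={"Bytes": drone_frame_jpeg},
#       SimilarityThreshold=80,
#   )

def positive_identification(response: dict, min_similarity: float = 95.0) -> bool:
    """Return True only if at least one detected face match clears the
    similarity bar -- the condition for allowing descent and release."""
    return any(
        match.get("Similarity", 0.0) >= min_similarity
        for match in response.get("FaceMatches", [])
    )
```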
The control app is built in React Native with an Uber-style interface: dispatchers can select destinations via autocomplete, view route and live progress, and monitor mission state. Since the drone manufacturer’s app is closed-source and offers no public SDK, Skyheart injects touch gestures via Android Accessibility API to issue equivalent joystick swipes and button presses.
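Since gesture commands are forwarded to the phone over sockets, the backend side might frame each swipe like this. The JSON field names and length-prefix framing are assumptions for illustration, not the project's actual wire format; the phone-side accessibility service would translate the decoded command into a `dispatchGesture()` call:

```python
import json
import struct

def encode_swipe(x1, y1, x2, y2, duration_ms=200):
    """Encode a joystick-style swipe as a length-prefixed JSON packet.
    Field names are illustrative, not the project's actual protocol."""
    payload = json.dumps({
        "type": "swipe",
        "from": [x1, y1],
        "to": [x2, y2],
        "duration_ms": duration_ms,
    }).encode()
    # 4-byte big-endian length prefix so the receiver can frame the stream.
    return struct.pack(">I", len(payload)) + payload
```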
The phone streams the drone camera over USB (its Wi-Fi is tied up with the drone connection), while all CV and navigation logic runs on a Python backend. A live dashboard shows segmentation overlays, detection boxes, GPS telemetry, and mission state in real time.
How we built it
We implemented this as a modular system:
- A React Native app with a Kotlin native module for operator control and phone-to-backend communication.
- Python backend (FastAPI/Uvicorn) for all mission logic and computer vision.
- USB transport via ADB reverse port forwarding, with the phone connected to the drone over Wi-Fi and to the backend over USB.
- A 7-state mission pipeline: INPUT → NAVIGATION → IDENTIFICATION → APPROACH → DELIVERY → DONE / HOVER.
- A modular CV stack with configurable backends and fallbacks for person detection, segmentation, and face matching.
- A browser live dashboard for monitoring with real-time overlays and telemetry.
- A proxy routing layer for geocoding, reverse geocoding, routing, and map tiles to support mapping features in the phone environment.
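The 7-state mission pipeline above can be sketched as a small state machine. The state names come from the write-up; the specific transition edges and guards are simplified assumptions:

```python
from enum import Enum, auto

class Mission(Enum):
    INPUT = auto()
    NAVIGATION = auto()
    IDENTIFICATION = auto()
    APPROACH = auto()
    DELIVERY = auto()
    DONE = auto()
    HOVER = auto()

# Allowed transitions. HOVER acts as the fallback when identification or
# approach cannot proceed; the real guard conditions are simplified here.
TRANSITIONS = {
    Mission.INPUT: {Mission.NAVIGATION},
    Mission.NAVIGATION: {Mission.IDENTIFICATION},
    Mission.IDENTIFICATION: {Mission.APPROACH, Mission.HOVER},
    Mission.APPROACH: {Mission.DELIVERY, Mission.HOVER},
    Mission.DELIVERY: {Mission.DONE},
    Mission.HOVER: {Mission.IDENTIFICATION},
    Mission.DONE: set(),
}

def advance(state, target):
    """Move to `target` only if the transition table allows it."""
    if target not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition {state.name} -> {target.name}")
    return target
```

Keeping transitions in an explicit table makes illegal jumps (say, straight from NAVIGATION to DELIVERY) fail loudly instead of silently skipping identification.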
Challenges we ran into
- No SDK / closed-source drone app required building a control interface through Android Accessibility gesture injection instead of official APIs.
- Inference speed and latency from face/person detection running too frequently at high resolution.
- No phone internet access while connected to the drone via Wi-Fi, forcing a USB-based architecture and additional proxy constraints.
- End-to-end latency from drone → phone → backend created about a 1-second reaction delay in practice.
- Inconsistent GPS on the inexpensive drone, which made precision navigation harder after a few waypoints.
- Real-world data variability (signal drops, unstable frame quality, variable subject scale, inconsistent lighting).
- Hardware/software instability, including ADB failures that required phone resets and driver reinstallation.
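One common mitigation for the inference-latency challenge above is to run the heavy detector only on every Nth frame and reuse the last result in between. A minimal sketch; the class name and interval are illustrative, not the project's actual scheduler:

```python
class DetectionThrottle:
    """Run the expensive detector only every `interval` frames and reuse
    the last result in between. `detector` is any callable frame -> detections."""

    def __init__(self, detector, interval=5):
        self.detector = detector
        self.interval = interval
        self.count = 0
        self.last = []

    def __call__(self, frame):
        if self.count % self.interval == 0:
            self.last = self.detector(frame)
        self.count += 1
        return self.last
```

The trade-off is staleness: between detector runs the boxes lag the video, which is why the interval has to stay small relative to how fast subjects move in frame.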
Accomplishments that we’re proud of
- Built a working autonomous delivery loop: enter an address, route safely, identify the correct person, and complete delivery.
- Replaced an SDK dependency with a robust Accessibility-based control path and action recorder for consistent cross-device tap mapping.
- Designed a modular AI pipeline in which person detection, face matching, segmentation, and obstacle logic can each be swapped without rewiring the system.
- Delivered real-time situational awareness through a live dashboard with overlays, telemetry, and mission state in one place.
What we learned
- In robotics, reliability is often about control flow, timing, and graceful fallback paths—not just model accuracy.
- End-to-end latency tuning (throttling, scheduling, and frame strategy) is as important as per-frame precision.
- Deployments benefit from strong defaults plus optional upgrades for different hardware profiles (GPU laptops vs CPU-only machines).
- Well-defined typed protocol contracts are essential to coordinate frontend, backend, and control layers as complexity grows.
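The typed-protocol lesson can be illustrated with a small round-trippable message contract; the `Telemetry` name and its fields are hypothetical, not Skyheart's actual schema:

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class Telemetry:
    """One telemetry update shared by frontend, backend, and control layers.
    Field names are illustrative, not the project's actual schema."""
    lat: float
    lon: float
    altitude_m: float
    mission_state: str

    def to_json(self) -> str:
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, raw: str) -> "Telemetry":
        return cls(**json.loads(raw))
```

With a single definition like this, a renamed or retyped field breaks serialization in one obvious place instead of drifting silently between layers.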
What’s next for Skyheart
- Upgrade to a production-grade drone with a real SDK and stronger onboard control reliability.
- Add edge/on-device inference to reduce cloud dependency and latency.
- Improve temporal tracking (object IDs, smoothing, and memory) to reduce flicker and jitter.
- Add adaptive scheduling for frame rate, resolution, and inference budget based on mission state and network/battery conditions.
- Expand dashboard tooling with session replay and stage-level precision/latency metrics for faster iteration and tuning.
Built With
- adb
- fastapi
- openstreetmap
- python
- react-native
- rekognition
- uvicorn
- yolo
