Inspiration

Security patrols are dangerous, repetitive, and expensive. Warehouses lose billions of dollars to theft annually. Military personnel risk their lives on reconnaissance missions. Farmers can't monitor vast fields 24/7.

We asked: What if you could just talk to a robot and have it patrol for you?

No complex interfaces. No joysticks. No training. Just speak naturally: "Go forward. Turn left. Start patrol."

VanguardAI makes advanced robotics accessible to anyone with a voice.


What it does

VanguardAI transforms the Unitree Go2 quadruped robot into a voice-controlled security system.

The flow is simple:

  1. 🎤 You speak → "Go forward and patrol the area"
  2. 🧠 AI understands → Converts speech to structured commands
  3. 🤖 Robot acts → Executes movement in real-time

Supported commands:

  • Movement: forward, backward, left, right
  • Actions: patrol, stop, look around
  • Natural variations work: "walk ahead", "move up", "go straight"
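All of those variations collapse onto a small set of canonical actions. The project maps phrases with an LLM (see "How we built it"), but the target vocabulary can be illustrated with a deterministic lookup; the phrase lists below are examples for the sketch, not the project's actual tables:

```python
# Canonical action vocabulary with example spoken variations.
# NOTE: illustrative only; the real project maps free-form speech with an LLM.
SYNONYMS = {
    "move_forward": {"forward", "go forward", "walk ahead", "move up", "go straight"},
    "move_backward": {"backward", "go back", "reverse"},
    "turn_left": {"left", "turn left"},
    "turn_right": {"right", "turn right"},
    "patrol": {"patrol", "start patrol"},
    "stop": {"stop", "halt"},
}

def normalize(utterance: str) -> str:
    """Map a spoken phrase to a canonical action; unknown input stops the robot."""
    text = utterance.lower().strip().rstrip(".!?")
    return next((action for action, phrases in SYNONYMS.items() if text in phrases), "stop")
```

Defaulting unknown input to "stop" is the safe choice for a robot that walks around on its own.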

How we built it

Voice → Smallest.ai STT → Together AI LLM → Cyberwave SDK → Robot

Speech-to-Text (Smallest.ai Pulse)

  • WebSocket streaming for 64ms latency
  • Records 4-second audio chunks from microphone
  • Returns transcribed text in real-time
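To stream, the raw microphone audio has to be cut into frames small enough to send continuously over the socket. A minimal framing sketch, assuming 16 kHz 16-bit mono PCM and 100 ms frames (the 4-second window is from this write-up; the sample format and frame size are illustration assumptions, not Smallest.ai requirements):

```python
# Frame raw PCM audio for WebSocket streaming.
SAMPLE_RATE = 16_000      # samples per second (assumed)
BYTES_PER_SAMPLE = 2      # 16-bit PCM (assumed)
CHUNK_SECONDS = 4         # recording window from the write-up
FRAME_MS = 100            # per-message frame length (assumed)

def frames(pcm: bytes, frame_ms: int = FRAME_MS):
    """Yield fixed-size PCM frames, each sent as one WebSocket message."""
    frame_bytes = SAMPLE_RATE * BYTES_PER_SAMPLE * frame_ms // 1000
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]
```

Each 4-second chunk yields 40 frames of 3,200 bytes under these assumptions, so transcription can begin long before the chunk ends.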

Command Parsing (Together AI)

  • Llama-3.3-70B-Instruct-Turbo model
  • Structured prompt converts natural language to JSON
  • Example: "go forward" → {"action": "move_forward", "value": 1.0}
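A minimal sketch of that step using Together's Python client (the model name matches the one above; the system prompt wording here is our own illustration, not the project's actual prompt):

```python
import json

# Illustrative system prompt; the real prompt likely differs.
SYSTEM_PROMPT = (
    "You convert voice commands for a quadruped robot into JSON. "
    'Respond with only {"action": <one of move_forward, move_backward, '
    'turn_left, turn_right, patrol, stop>, "value": <float>}.'
)

def build_messages(transcript: str) -> list[dict]:
    """Wrap a transcript in a chat-completion message list."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": transcript},
    ]

def parse_command(transcript: str) -> dict:
    """Send the transcript to the LLM and decode its JSON reply."""
    from together import Together  # pip install together
    client = Together()  # reads TOGETHER_API_KEY from the environment
    resp = client.chat.completions.create(
        model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
        messages=build_messages(transcript),
    )
    return json.loads(resp.choices[0].message.content)
```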

Robot Control (Cyberwave SDK)

  • Digital twin of Unitree Go2
  • Motion bindings: Forward, Backward, Turn Left, Turn Right, Idle
  • Commands execute on physical robot via edge connection
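The parsed JSON then dispatches to those motion bindings. A sketch assuming a connected `robot` handle and the `robot.motion.asset.animation(...)` call described under Challenges below; unknown actions fall back to Idle:

```python
# Map canonical actions to the Cyberwave motion bindings named above.
ACTION_TO_ANIMATION = {
    "move_forward": "Forward",
    "move_backward": "Backward",
    "turn_left": "Turn Left",
    "turn_right": "Turn Right",
    "stop": "Idle",
}

def execute(robot, command: dict) -> str:
    """Trigger the animation bound to the parsed action; Idle if unrecognized."""
    animation = ACTION_TO_ANIMATION.get(command.get("action"), "Idle")
    robot.motion.asset.animation(animation)
    return animation
```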

Stack:

  • Python 3.12
  • WebSockets for real-time audio streaming
  • Async/await for non-blocking I/O

Challenges we ran into

  1. SDK API Discovery - Cyberwave SDK methods weren't what we expected. robot.move(x=1) didn't work. Had to dig through docs to find robot.motion.asset.animation("Forward").

  2. Audio Latency - Initial implementation had 2+ second delay. Switched from REST API to WebSocket streaming to get real-time response.

  3. LLM Output Parsing - Sometimes the LLM returned extra text with JSON. Added robust parsing with fallback to {"action": "stop"}.
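One way to make that parsing robust is to grab the first JSON object in the reply and fall back to a safe stop on anything malformed; a sketch of the approach (not the project's exact code):

```python
import json
import re

FALLBACK = {"action": "stop"}

def extract_command(llm_output: str) -> dict:
    """Pull the first JSON object out of an LLM reply; stop on failure.

    The non-greedy regex is enough here because the command objects are
    flat (no nested braces).
    """
    match = re.search(r"\{.*?\}", llm_output, re.DOTALL)
    if not match:
        return FALLBACK
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return FALLBACK
```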

  4. Hackathon Time Pressure - 6.5 hours to go from zero to working demo. Prioritized core voiceโ†’robot loop over nice-to-haves like vision.


Accomplishments that we're proud of

✅ End-to-end voice control working - Speak and the robot moves. No lag.

✅ Clean modular architecture - Each component (voice, brain, robot) is independent and testable.

✅ Natural language understanding - Say "walk forward", "go ahead", or "move up" - all work.

✅ Built in one day - From empty folder to working demo in under 7 hours.


What we learned

  • Smallest.ai Pulse is incredibly fast - 64ms TTFT makes voice control feel instant
  • Cyberwave's digital twin approach abstracts away robot complexity
  • LLMs as command parsers are powerful - no need for rigid grammar rules
  • WebSockets >> REST for real-time applications

What's next for VanguardAI

🔮 Vision Module - Use Go2's RGB camera + VLM to detect threats (intruders, anomalies)

🔮 Alert System - Email/SMS notifications when threats detected

🔮 Autonomous Patrol - Waypoint-based navigation without voice commands

🔮 Multi-Robot Fleet - Coordinate multiple Go2 robots for large areas

🔮 Edge Inference - Run models on robot's onboard compute for offline operation


Built With

smallest-ai
together-ai
cyberwave
python
websockets
unitree-go2



**GitHub:** https://github.com/IneshReddy249/VanguardAI

