Don't share your pool with your Claws.
Give each AI employee its own identity and digital workspace.
You run the fleet like a boss.
vmClaw captures your VM screen, sends it to an AI vision model, and executes the actions it decides on — clicks, typing, keyboard shortcuts, scrolling — in a continuous loop until the task is done.
Imagine running a company staffed by AI employees. Each one has a unique identity and works inside its own VM, giving every agent a clean, isolated workspace that never touches your host system. From your host machine, you act as the boss—assigning tasks, supervising their work, and interacting with each employee separately—while keeping your personal identity and environment fully isolated from the identities of your AI workforce.
- Multi-model — GPT-5.4, Claude Opus 4.6, GPT-4o, DeepSeek, Grok, and 15+ more models.
- Local — Runs on your Windows machine. Screenshots are sent directly to your chosen AI API and nowhere else; no third-party relay or cloud service sits in between.
- Universal — Supports Hyper-V, VMware, VirtualBox, and QEMU VMs.
- AI Memory — Stores past task executions in a local vector database and recalls similar successes as few-shot examples, so it improves with every run. All memory stays on your machine — nothing is shared or uploaded.
- Simple — One command to start. No complex setup.
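The memory recall step can be sketched with a toy similarity search. Here a bag-of-words cosine ranking stands in for the real embedding model and local vector database (which vmClaw does not document); `embed`, `recall_examples`, and the sample memory entries are all hypothetical illustrations:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real setup would use a
    # sentence-embedding model plus a local vector database.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recall_examples(task: str, memory: list[dict], k: int = 2) -> list[dict]:
    # Rank past successful runs by similarity to the new task and
    # return the top-k as few-shot examples for the prompt.
    q = embed(task)
    ranked = sorted(memory, key=lambda m: cosine(q, embed(m["task"])), reverse=True)
    return ranked[:k]

memory = [
    {"task": "open notepad and type hello", "actions": ["click start", "type notepad"]},
    {"task": "install firefox from browser", "actions": ["open edge", "download"]},
]
best = recall_examples("open notepad and write a note", memory, k=1)
```

The recall is purely local, matching the claim that nothing is shared or uploaded.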
What if one AI agent isn't enough? Fleet mode lets you command VMs across every machine on your network from a single GUI.
```
┌──────────────────────────────────────────────────┐
│ Your Machine (Hub)                               │
│ vmClaw GUI ───────────────────────────────────── │
│ ┌────────────┐  ┌────────────┐  ┌────────────┐   │
│ │ VM: Alice  │  │  VM: Bob   │  │ VM: Carol  │   │
│ │  (local)   │  │  (local)   │  │  (local)   │   │
│ └────────────┘  └────────────┘  └────────────┘   │
└────────┬─────────────────────────────────────────┘
         │ WebSocket
┌────────┴────────┐
│   Lab Server    │      ┌────────────────┐
│    10.0.0.9     │──────│ More machines  │
│ ┌────────────┐  │      │      ...       │
│ │ VM: Dev-01 │  │      └────────────────┘
│ │ VM: Dev-02 │  │
│ │ VM: Dev-03 │  │
│ └────────────┘  │
└─────────────────┘
```
- One click, any VM — Browse VMs across all machines in a sidebar tree. Click one, assign a task, watch it execute in real-time.
- Live streaming — Screenshots, logs, and actions from remote nodes stream back over WebSocket instantly. It feels like the VM is running locally.
- Zero-config discovery — Just add a peer's IP to `config.toml`. No VPN, no cloud, no port-forwarding gymnastics. It works on your LAN out of the box.
- Scale your AI workforce — Run 3 VMs on your desktop, 5 on a lab server, 10 in a rack. Assign tasks to any of them from one place. Each AI employee works independently inside its own VM.
- Proxy chains — Node A discovers Node B's peers automatically, so A -> B -> C routing works without configuring every node.
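The proxy-chain idea can be sketched as a breadth-first merge of advertised peer lists: each node shares its peers, so A learns a hop chain to C through B. The `discover_routes` helper and topology dict below are hypothetical illustrations, not vmClaw's actual API:

```python
def discover_routes(node: str, peers: dict[str, list[str]]) -> dict[str, list[str]]:
    # BFS over advertised peer lists; each reachable node is recorded
    # with the chain of hops used to reach it.
    routes: dict[str, list[str]] = {}
    frontier = [(p, [p]) for p in peers.get(node, [])]
    while frontier:
        current, path = frontier.pop(0)
        if current in routes or current == node:
            continue
        routes[current] = path
        for nxt in peers.get(current, []):
            if nxt not in routes and nxt != node:
                frontier.append((nxt, path + [nxt]))
    return routes

# Hypothetical topology: A is configured with B only; B knows C.
topology = {"A": ["B"], "B": ["C"], "C": []}
routes = discover_routes("A", topology)
```

Because the chain is recorded per destination, A can route a task to C via B without C ever appearing in A's own config.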
```toml
# config.toml — that's all you need
[fleet]
enabled = true
node_name = "my-pc"
listen_port = 8077

[[fleet.peers]]
name = "lab-server"
url = "http://192.168.1.50:8077"
```

Fleet turns vmClaw from a single-machine tool into a distributed AI operations center. Think Ansible, but instead of running shell commands, your agents see the screen and use it like a human would.
```powershell
# Install
.\.venv\Scripts\pip.exe install -e .

# Run as Administrator (required to inject input into VM windows)
.\.venv\Scripts\python.exe -m vmclaw run

# Or launch the GUI
.\.venv\Scripts\python.exe -m vmclaw gui
```

That's it. vmClaw will walk you through selecting a provider, model, and VM window interactively.
```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│ Capture VM  │────>│  AI Vision  │────>│   Execute   │
│ Screenshot  │     │    Model    │     │   Action    │
└─────────────┘     └─────────────┘     └─────────────┘
       ^                                       │
       └───────────────────────────────────────┘
                  repeat until done
```
- Capture — Takes a screenshot of the selected VM window
- Think — Sends the screenshot + task description to an AI vision model
- Act — Executes the AI's decision (click, type, key press, scroll)
- Repeat — Loops until the AI reports the task is done (or hits the action limit)
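The four steps above can be sketched as a loop over pluggable callables. `Action`, `run_agent`, and the scripted fake model are hypothetical stand-ins for vmClaw's internals, which capture a real VM window and call a real vision API:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Action:
    kind: str          # "click" | "type" | "key" | "scroll" | "done"
    payload: str = ""

def run_agent(capture: Callable[[], bytes],
              think: Callable[[bytes], Action],
              act: Callable[[Action], None],
              max_actions: int = 50) -> bool:
    # Capture -> Think -> Act, repeating until the model reports
    # "done" or the safety limit is reached.
    for _ in range(max_actions):
        screenshot = capture()
        action = think(screenshot)
        if action.kind == "done":
            return True
        act(action)
    return False

# Scripted fake model to exercise the loop deterministically.
script = iter([Action("click", "Start"), Action("type", "notepad"), Action("done")])
performed: list[Action] = []
finished = run_agent(capture=lambda: b"", think=lambda s: next(script),
                     act=performed.append)
```

Returning `False` when the action limit is hit is what makes `max_actions` a hard safety stop rather than a suggestion.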
| Provider | Models | Auth |
|---|---|---|
| GitHub Copilot (free) | Claude Opus 4.6, Claude Sonnet 4.6, GPT-5.4, GPT-5-mini, GPT-4o, GPT-4.1, o3, o4-mini, DeepSeek-R1, Grok-3, and more | gh auth login (browser) |
| OpenAI (API key) | GPT-4o, GPT-4.1, o3, o4-mini, and any OpenAI model | OPENAI_API_KEY env var |
```shell
python -m vmclaw run        # Start the AI agent loop (CLI)
python -m vmclaw gui        # Launch the graphical interface
python -m vmclaw list       # List detected VM windows
python -m vmclaw list-all   # List all windows (for debugging)
python -m vmclaw capture    # Capture a VM screenshot
```

- Windows 10/11 with Python 3.10+
- A running VM (Hyper-V, VMware, VirtualBox, or QEMU)
- GitHub CLI (`gh`) for GitHub Copilot auth, or an OpenAI API key
vmClaw works out of the box with interactive prompts. For automation, create a config.toml:
```toml
[api]
provider = "github"       # or "openai"
model = "claude-opus-4.6"

[agent]
max_actions = 50          # Safety limit
action_delay = 1.0        # Seconds between actions
screenshot_width = 1024
```

- Run VS Code as Administrator
MIT