Vision Machine · Edge intelligence for autonomous drones

Foundation models
small enough
to fly.

We compress frontier vision-language and vision-language-action models to run inside small drones — without a link, without a pilot. Two reference platforms, one autonomy stack.

[01]Thesis

The next decade of autonomy
is decided on the airframe, not in the cloud.

Frontier vision-language models can already reason about a battlefield in paragraphs and infer intent from a single frame, yet none of them has ever flown a drone. They live in datacenters, lean on high-bandwidth links, and respond on cloud timescales. In contested airspace, none of that survives contact.

Vision Machine is the team bringing that class of model down to the airframe. We compress, distill, and re-architect frontier vision-language and vision-language-action models until they perceive, reason, and act inside the drone itself — without ever phoning home. That is the gap we exist to close.

That capability rewrites two doctrines at once: how counter-UAS engagements are won, and how forward reconnaissance is conducted. We demonstrate both with our own reference platforms.

On the airframe

Models run inside the drone, on embedded compute. No cloud, no datacenter, no round-trip.

Link-independent

Mission proceeds when jammed, spoofed, or out of range. The link is a convenience, not a requirement.

One model family · two platforms

VM-Eye, VM-Act, and VM-Voice power both the VM-01 Interceptor and the VM-02 Sentinel.

[02]Reference Platforms

Two airframes. Two doctrines.
One answers the threat every force is already facing. One opens a category they haven't seen.

PLATFORM · 01

VM-01 · Autonomous Counter-UAS

Interceptor

The doctrine no force can ignore.

Detect, classify, and engage hostile drones inside the operator’s reaction window.

VM-01 is purpose-built for the counter-UAS engagement that now defines modern ground warfare. A compact on-board perception stack maintains visual track across occlusion and spoofing, a vision-language-action policy executes terminal guidance, and operator-authored rules of engagement are enforced cryptographically before the airframe leaves the rail.
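The pre-launch ROE check described above can be sketched as follows. This is a minimal illustration, not the production interface: the field names and the `RoePackage`-style dict are assumptions, and a fielded airframe would use an asymmetric signature scheme (so it holds only a public key) rather than the stdlib HMAC used here.

```python
import hashlib
import hmac
import json

# Illustrative sketch: verify an operator-signed ROE package before arming.
# A fielded system would use asymmetric signatures (e.g. Ed25519); HMAC
# keeps this example stdlib-only. Key and fields are placeholders.
OPERATOR_KEY = b"demo-operator-signing-key"

def sign_roe(roe: dict, key: bytes = OPERATOR_KEY) -> dict:
    """Operator-side: canonicalize the ROE and attach a signature."""
    payload = json.dumps(roe, sort_keys=True).encode()
    sig = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"roe": roe, "sig": sig}

def verify_and_arm(package: dict, key: bytes = OPERATOR_KEY) -> bool:
    """Airframe-side: refuse to leave the rail unless the signature checks."""
    payload = json.dumps(package["roe"], sort_keys=True).encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, package["sig"])

roe = {
    "targets": ["small-uas", "loitering-munition"],
    "engagement_envelope_m": 800,
    "abort_on_link_loss": False,
}
package = sign_roe(roe)
assert verify_and_arm(package)            # intact ROE: airframe may arm
package["roe"]["engagement_envelope_m"] = 5000
assert not verify_and_arm(package)        # tampered ROE: launch refused
```

Canonicalizing with `sort_keys=True` matters: both sides must serialize the ROE identically or a legitimate package would fail verification.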

Operator Loop

"Threat, bearing two-four-zero."

Launched. Closed. Reported — before the operator finishes the cue.

Deployed by
Forward air-defense · base defense · convoy escort
Targets
Small UAS · loitering munitions
ROE
Operator-signed, on-board, auditable
Link
Optional · launch-and-forget

PLATFORM · 02

VM-02 · Voice-native Autonomous Reconnaissance

Sentinel

A new category. Recon you talk to.

Tasked by voice. Flies its own search pattern. Reports back in plain language.

VM-02 is the first reconnaissance drone built for the radio, not the screen. The operator briefs an objective the way they would brief a wingman. The drone executes, sees, reasons, and returns a written and spoken situation report — entity counts, behavior, pattern-of-life — without the operator ever having to watch raw video.

Operator Loop

"Sweep east of grid 432-87. Watch for technicals."

Two minutes later — a written situation report. No video to scrub.

Deployed by
Forward operators · small-unit ISR · QRF
Tasking
Voice over tactical radio when link is up · plain-text brief when not
Output
Structured SITREP, not raw video
Link
Optional · mission completes regardless

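A structured SITREP of the kind VM-02 returns might look like the sketch below. The schema, field names, and example values are illustrative assumptions, not VM-02's actual output format.

```python
from dataclasses import asdict, dataclass, field

# Illustrative SITREP schema -- all field names are assumptions,
# not the platform's real report format.

@dataclass
class Entity:
    kind: str          # e.g. "technical", "dismount"
    count: int
    behavior: str      # free-text behavior summary from the on-board VLM

@dataclass
class Sitrep:
    tasking: str                   # the operator's original voice brief
    grid: str                      # area actually searched
    entities: list[Entity] = field(default_factory=list)
    assessment: str = ""           # plain-language pattern-of-life summary

report = Sitrep(
    tasking="Sweep east of grid 432-87. Watch for technicals.",
    grid="432-87 E",
    entities=[Entity(kind="technical", count=2, behavior="static, manned")],
    assessment="Two technicals at treeline; no movement toward route.",
)
payload = asdict(report)  # nested dicts, ready to serialize for the radio
```

The point of the structure is the last line of L86's promise: the operator receives countable, parseable fields rather than raw video to scrub.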
[03]How It Works

Three primitives.
Running on one SoC.

P-01

Vision-Language Model

Perceive

The drone does not just classify pixels — it captions, grounds, and reasons about what it sees in natural language. Running on the airframe, at the frame rate of the sensor.

VM-Eye

P-02

Intent & Mission Memory

Reason

Operator intent — spoken, typed, or radioed — is interpreted against on-board perception, rules of engagement, and mission memory. The drone knows what it was asked to do, and why.

On-airframe context

P-03

Vision-Language-Action

Act

A small action head closes the loop from perception and intent directly to flight envelopes and gimbal commands. Language in. Motion out. Hard real-time guarantees underneath.

VM-Act
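Taken together, the three primitives form a perceive-reason-act loop. The sketch below only names the stages; every function, field, and return value is assumed for illustration and none of it is a Vision Machine API.

```python
# Illustrative perceive -> reason -> act loop over the three primitives.
# All names and data shapes here are assumptions for the sketch.

def perceive(frame):
    """P-01 (VM-Eye): caption and ground what the sensor sees."""
    return {"caption": "one small UAS, bearing 240", "tracks": [("uas", 240)]}

def reason(observation, intent, memory):
    """P-02: interpret operator intent against perception and mission memory."""
    memory.append(observation["caption"])
    if intent["task"] == "intercept" and observation["tracks"]:
        return {"action": "close", "bearing": observation["tracks"][0][1]}
    return {"action": "hold"}

def act(decision):
    """P-03 (VM-Act): turn the decision into flight and gimbal commands."""
    return f"cmd: {decision['action']} @ {decision.get('bearing', '-')}"

memory: list[str] = []
intent = {"task": "intercept"}
frame = object()  # stand-in for a sensor frame
command = act(reason(perceive(frame), intent, memory))  # "cmd: close @ 240"
```

In the real stack each stage is a model or policy rather than a function, and P-03 sits above a hard real-time control layer; the sketch only shows how the stages compose.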

[04]The Model Family

Underneath both airframes,
one model family, trained in-house under export control.

The airframes are how we prove the model family in the field. The model family is how we keep proving it across the next airframes, the next sensors, the next doctrines.

VM-Eye

Perception

VM-Act

Action

VM-Voice

Voice

Technical Brief — under NDA

[05]Doctrine

Autonomy between operator and objective.
Never between the operator and the decision to use force.

Rules of engagement, target sets, and engagement envelopes are authored before launch by a human, in writing, cryptographically signed, and enforced by the airframe. Audit trails and post-mission accountability are built into the platform, not bolted on.

Human-authored ROE

Loaded pre-flight · cryptographically signed

Auditable autonomy

Every decision logged, timestamped, replayable

Operator-in-command

Abort, override, recall — link permitting
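A logged, timestamped, replayable decision trail of the kind described above can be sketched with a hash chain, where each record commits to the one before it so any post-hoc edit breaks replay. The record fields and class are illustrative assumptions, not the platform's log format.

```python
import hashlib
import json
import time

# Illustrative hash-chained audit log: tampering with any record
# invalidates every hash after it on replay. Fields are assumptions.

class AuditLog:
    def __init__(self):
        self.records = []
        self._prev = "0" * 64  # genesis hash

    def log(self, decision: str) -> None:
        record = {"ts": time.time(), "decision": decision, "prev": self._prev}
        record["hash"] = hashlib.sha256(
            json.dumps(record, sort_keys=True).encode()
        ).hexdigest()
        self.records.append(record)
        self._prev = record["hash"]

    def replay_ok(self) -> bool:
        prev = "0" * 64
        for r in self.records:
            body = {k: r[k] for k in ("ts", "decision", "prev")}
            digest = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()
            ).hexdigest()
            if r["prev"] != prev or r["hash"] != digest:
                return False
            prev = r["hash"]
        return True

log = AuditLog()
log.log("track acquired: small UAS, bearing 240")
log.log("ROE check passed: target in authorized set")
assert log.replay_ok()
log.records[0]["decision"] = "edited after the fact"
assert not log.replay_ok()  # tamper detected on replay
```

The chaining is what makes the trail auditable rather than merely logged: a reviewer can replay the chain offline and detect any alteration.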

[06]Engage

Request an operational brief.
Available to vetted defense and government customers.

Briefings include a live flight demonstration of VM-01 in a controlled environment, an architecture review of the on-device VLM/VLA stack, and a discussion of integration paths into your existing C4ISR posture.

Contact

ahmet@visionmachine.dev

Secure Inquiry · VM-INTAKE-01

By submitting you consent to vetting. We reply within 5 business days to verified addresses only.