Local-first AI runtime for iPhone and Mac

Intelligence that runs near you.

QWON combines PREXUS mobile runtime research and QWON Terminal's AI teaming model into one execution layer for local inference, context compression, and cloud escalation.


[Diagram: QWON Orchestrator, with nodes Input, Local, and Cloud. Readouts shown: intent analysis 12 ms, context compression -68%, privacy routing local.]

Beyond chat UI

Not another cloud prompt box. An execution layer for intelligence.

QWON's core is not the chat surface. It is a decision plane that understands, classifies, and compresses input locally, then routes only the necessary context to OpenAI, Anthropic, Gemini, or LAN models when the task calls for it.

Users should not have to choose the model, prompt shape, or context budget every time. QWON weighs latency, cost, privacy, and capability before selecting the right execution path.
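One way to picture this weighing is the minimal sketch below. It is not QWON's routing logic: the `Route` and `Task` fields, the weights, and the numbers in the usage note are all hypothetical. It treats privacy as a hard filter, capability as a threshold, and latency and cost as a combined penalty.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """A hypothetical execution path; every field is an illustrative assumption."""
    name: str
    latency_ms: float   # expected round-trip latency
    cost_usd: float     # expected cost per request
    capability: float   # rough 0..1 estimate of model strength
    stays_local: bool   # True if no data leaves user-controlled hardware


@dataclass(frozen=True)
class Task:
    difficulty: float   # rough 0..1 estimate of how hard the task is
    sensitive: bool     # True if the input must not leave the device


def choose_route(task, routes, latency_weight=0.001, cost_weight=10.0):
    """Pick a route: privacy first, then capability, then latency and cost."""
    # Privacy is a hard constraint: sensitive tasks only see local routes.
    allowed = [r for r in routes if r.stays_local or not task.sensitive]
    if not allowed:
        raise ValueError("no route satisfies the privacy constraint")
    # Capability is a threshold: prefer routes strong enough for the task.
    capable = [r for r in allowed if r.capability >= task.difficulty]
    if not capable:
        # Nothing is strong enough, so fall back to the strongest allowed route.
        return max(allowed, key=lambda r: r.capability)
    # Among capable routes, minimize a weighted latency and cost penalty.
    return min(
        capable,
        key=lambda r: latency_weight * r.latency_ms + cost_weight * r.cost_usd,
    )
```

With a local route (40 ms, free, capability 0.4) and a cloud route (900 ms, $0.02, capability 0.9), an easy task stays local, a hard non-sensitive task escalates to cloud, and a hard sensitive task still stays local because the privacy filter runs first.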

PREXUS runtime

A local-first cognitive runtime designed for iPhone.

Local Orchestrator

A compact local model acts as a control model rather than just a responder: it determines intent, task class, policy, and memory recall close to the device.
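A control model is easiest to work with when it emits a structured decision instead of prose. The sketch below assumes the local model returns JSON. The field names, the two policy values, and the safe default are hypothetical, not a published PREXUS schema. It shows validation that falls back to a local-only decision whenever the output is malformed.

```python
import json
from dataclasses import dataclass

# Hypothetical policy vocabulary; the source does not define one.
ALLOWED_POLICIES = {"local_only", "allow_cloud"}


@dataclass(frozen=True)
class ControlDecision:
    intent: str
    task_class: str
    policy: str
    recall_memory: bool


# If the control model's output cannot be trusted, stay on the device.
SAFE_DEFAULT = ControlDecision("unknown", "general", "local_only", False)


def parse_decision(raw: str) -> ControlDecision:
    """Parse the control model's JSON output, falling back to SAFE_DEFAULT."""
    try:
        data = json.loads(raw)
        decision = ControlDecision(
            intent=str(data["intent"]),
            task_class=str(data["task_class"]),
            policy=str(data["policy"]),
            recall_memory=bool(data["recall_memory"]),
        )
    except (json.JSONDecodeError, KeyError, TypeError):
        # Not JSON, not an object, or missing a required field.
        return SAFE_DEFAULT
    if decision.policy not in ALLOWED_POLICIES:
        # Unknown policies are never interpreted as permission to escalate.
        return SAFE_DEFAULT
    return decision
```

Defaulting to `local_only` on any parse failure keeps a misbehaving control model from silently widening where data can go.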

Context Compression

Before anything leaves the device, QWON structures conversation history and input into the smallest useful context for speed, cost, and privacy.
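As a rough illustration of shrinking history to a budget, the sketch below keeps the most recent turns that fit and replaces older ones with a short marker. It is a simple truncation stand-in, not QWON's compression method. The whitespace-based token estimate and the function names are assumptions, and a real system would likely summarize rather than drop turns.

```python
def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: count whitespace-separated words."""
    return len(text.split())


def compress_history(turns, user_input, budget):
    """Keep the newest turns that fit in `budget` alongside the new input.

    The omission marker itself is not counted against the budget.
    """
    remaining = budget - estimate_tokens(user_input)
    kept = []
    # Walk backwards so the most recent turns are kept first.
    for turn in reversed(turns):
        cost = estimate_tokens(turn)
        if cost > remaining:
            break
        kept.append(turn)
        remaining -= cost
    kept.reverse()  # restore chronological order

    dropped = len(turns) - len(kept)
    context = []
    if dropped:
        context.append(f"[{dropped} earlier turns omitted]")
    context.extend(kept)
    context.append(user_input)
    return context
```

For example, with turns `["a b c", "d e", "f g h i"]`, input `"j k"`, and a budget of 7, only the last turn fits, so the result is the marker, `"f g h i"`, and `"j k"`.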

Multimodal Sensors

Camera, microphone, OCR, and sensor inputs are preprocessed near the device, expanding AI from text response into situated understanding.

Cloud Escalation

Deep reasoning and long-context analysis escalate only when needed, using user-configured external models and filtered context.
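The "filtered context" step could look something like the sketch below, which redacts email addresses and long digit runs before any text is handed to an external model. The two patterns, the placeholder strings, and the `allow_cloud` flag are illustrative assumptions. Real filtering would need a much broader policy than two regular expressions.

```python
import re
from typing import Optional

# Illustrative patterns only; they do not cover every email or number format.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
LONG_NUMBER = re.compile(r"\b\d{7,}\b")  # account-like or phone-like digit runs


def filter_context(text: str) -> str:
    """Replace emails and long digit runs with neutral placeholders."""
    text = EMAIL.sub("[email]", text)
    return LONG_NUMBER.sub("[number]", text)


def prepare_escalation(text: str, allow_cloud: bool) -> Optional[str]:
    """Return filtered text for a cloud model, or None if policy forbids it."""
    if not allow_cloud:
        # Without explicit user permission nothing is prepared for sending.
        return None
    return filter_context(text)
```

Returning `None` when escalation is not allowed makes the policy check hard to skip: the caller has no payload to send unless the user has opted in.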

Executor: building runtime adapter
Reviewer: waiting on diff evidence
Tester: swift test complete
Writer: handoff summary ready

10:23 Session A started
10:25 Worktree allocated
10:28 Review handoff queued

QWON Terminal

A Mac native terminal for AI teams.

QWON Terminal coordinates Codex CLI, Claude Code, and other agents through multiple sessions, isolated Git worktrees, shared context, and an event timeline built for parallel AI work.

Session Grid for 2-16 agents
Worktree isolation to prevent conflicts
User-controlled handoff between sessions
Timeline monitoring for errors and completion
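Worktree isolation can be sketched with Git's own `git worktree add` command, which checks out a separate working directory on its own branch for each session. The helper below only builds the command lines. The `.worktrees/` directory layout and the `agent/<name>` branch naming are assumptions, since the page does not say how QWON Terminal allocates worktrees.

```python
from pathlib import PurePosixPath


def worktree_commands(repo_root, session_names, base_ref="HEAD"):
    """Build one `git worktree add` command per agent session.

    Returns argument lists suitable for subprocess.run; nothing is executed.
    """
    # The page describes a session grid of 2-16 agents.
    if not 2 <= len(session_names) <= 16:
        raise ValueError("expected between 2 and 16 sessions")
    if len(set(session_names)) != len(session_names):
        raise ValueError("session names must be unique")

    commands = []
    for name in session_names:
        # Hypothetical layout: <repo>/.worktrees/<session> on branch agent/<session>.
        path = PurePosixPath(repo_root) / ".worktrees" / name
        commands.append(
            ["git", "worktree", "add", "-b", f"agent/{name}", str(path), base_ref]
        )
    return commands
```

Because each session edits files in its own directory and commits to its own branch, two agents cannot overwrite each other's uncommitted changes. Merging their branches remains an explicit, user-controlled step.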

One network, many nodes

Treat iPhone, Mac, and cloud as one AI team.

Mobile Node: Job input, notification, lightweight inference
Worker Node: Agent execution, local inference, long-running tasks
Compute Node: Parallel processing and heavy local workloads
Cloud Node: Escalated reasoning with explicit user policy
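To make the node roles concrete, here is a small sketch of how a job might be matched to a node. It is purely illustrative: the job kinds, the preference orders, and the `allow_cloud` flag are assumptions that mirror the roles listed above. The roadmap lists a node selection algorithm under Phase 3, and this sketch is not that algorithm.

```python
# Hypothetical preference order per job kind, derived from the node roles above.
PREFERENCES = {
    "lightweight_inference": ["mobile", "worker", "compute", "cloud"],
    "agent_execution": ["worker", "compute", "cloud"],
    "heavy_parallel": ["compute", "worker", "cloud"],
    "escalated_reasoning": ["cloud", "compute"],
}


def select_node(job_kind, online_nodes, allow_cloud=False):
    """Return the first preferred node that is online, or None if none fits.

    The cloud node is skipped unless the user policy explicitly allows it.
    """
    for node in PREFERENCES[job_kind]:
        if node == "cloud" and not allow_cloud:
            continue
        if node in online_nodes:
            return node
    return None
```

With only a phone and a Mac worker online, a `heavy_parallel` job falls back to the worker. An `escalated_reasoning` job returns `None` rather than reaching the cloud without permission.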

Roadmap

Think nearby first. Reach farther only when needed.

Phase 1

Core Runtime

Local LLM, OpenAI and Anthropic adapters, context compression, and routing UI.

Phase 2

Multimodal Expansion

Vision, OCR, voice input, local memory, and device-side preprocessing.

Phase 3

AI Runtime Platform

QWON Terminal, LAN models, MCP, Shortcuts, and a node selection algorithm.

Technical Preview

Follow QWON as it takes shape.

The technical preview is being prepared for people who want to follow PREXUS alpha work, QWON Terminal, and local inference evaluation.