Local-first AI runtime for iPhone and Mac
QWON combines PREXUS mobile runtime research and QWON Terminal's AI teaming model into one execution layer for local inference, context compression, and cloud escalation.
Beyond chat UI
QWON's core is not the chat surface. It is a decision plane that understands, classifies, and compresses input locally, then routes only the necessary context to OpenAI, Anthropic, Gemini, or LAN models when the task calls for it.
Users should not have to choose the model, prompt shape, or context budget every time. QWON weighs latency, cost, privacy, and capability before selecting the right execution path.
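The trade-off described above can be pictured as a small scoring function. The following is only an illustrative sketch, not QWON's actual router: the `Route` and `Task` types, the weights, and the example numbers are all assumptions made for this example. It filters out routes that violate privacy or capability constraints, then picks the cheapest remaining route by a weighted sum of latency and cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """A hypothetical execution path (names and fields are assumptions)."""
    name: str
    latency_ms: float     # expected round-trip latency
    cost_per_call: float  # estimated cost in USD
    on_device: bool       # True if no data leaves the device
    capability: int       # rough capability tier, higher is stronger


@dataclass(frozen=True)
class Task:
    required_capability: int  # minimum capability tier the task needs
    private: bool             # True if the input must stay on the device


def choose_route(task, routes, latency_weight=0.001, cost_weight=1.0):
    """Pick the lowest-scoring route that satisfies the hard constraints."""
    eligible = [
        r for r in routes
        if r.capability >= task.required_capability
        and (r.on_device or not task.private)
    ]
    if not eligible:
        raise ValueError("no route satisfies the task constraints")
    # Lower score is better: weighted latency plus weighted cost.
    return min(
        eligible,
        key=lambda r: latency_weight * r.latency_ms + cost_weight * r.cost_per_call,
    )


# Example routes with made-up numbers.
ROUTES = [
    Route("local", latency_ms=150, cost_per_call=0.0, on_device=True, capability=1),
    Route("lan", latency_ms=400, cost_per_call=0.0, on_device=False, capability=2),
    Route("cloud", latency_ms=1200, cost_per_call=0.02, on_device=False, capability=3),
]
```

With these example values, a simple non-private task stays on the local route, a task that needs the highest capability tier escalates to the cloud route, and a private task that no on-device route can handle raises an error instead of leaking data.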
PREXUS runtime
A compact local model acts as a control model, not merely a responder, deciding intent, task class, policy, and memory recall close to the device.
Before anything leaves the device, QWON structures conversation history and input into the smallest useful context for speed, cost, and privacy.
Camera, microphone, OCR, and sensor inputs are preprocessed near the device, expanding AI from text response into situated understanding.
Deep reasoning and long-context analysis escalate only when needed, using user-configured external models and filtered context.
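As a rough illustration of "smallest useful context" and "filtered context", the sketch below trims conversation history to a word budget and masks email addresses before anything would be sent to an external model. It is a simplified stand-in, not the PREXUS implementation: plain truncation substitutes for real compression, word counts substitute for token counts, and the function names and the `[email]` placeholder are assumptions chosen for this example.

```python
import re

# Simple email pattern used only for this illustration.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact(text):
    """Mask email addresses so they never leave the device."""
    return EMAIL.sub("[email]", text)


def compress_history(turns, budget_words):
    """Keep the newest turns that fit within budget_words, in original order."""
    kept = []
    used = 0
    for turn in reversed(turns):
        n = len(turn.split())
        if used + n > budget_words:
            break  # older turns are dropped once the budget is exhausted
        kept.append(turn)
        used += n
    kept.reverse()
    return kept


def build_escalation_context(turns, user_input, budget_words=50):
    """Return the filtered context that would accompany an escalated request."""
    history = compress_history(turns, budget_words)
    return [redact(t) for t in history] + [redact(user_input)]
```

A production system would more likely summarize older turns rather than drop them, and would filter more than email addresses; the point here is only the order of operations: compress first, filter second, escalate last.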
QWON Terminal
QWON Terminal coordinates Codex CLI, Claude Code, and other agents through multiple sessions, isolated Git worktrees, shared context, and an event timeline built for parallel AI work.
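To make the isolated-worktree idea concrete, here is a minimal sketch of how one worktree per agent session could be set up with the standard `git worktree add` command. The `.worktrees` directory, the `session/<id>` branch naming, and the function names are assumptions for illustration, not QWON Terminal's actual layout. The code only builds the command lists and does not run them.

```python
from pathlib import PurePosixPath


def worktree_command(repo_root, session_id, base="HEAD"):
    """Build a `git worktree add` command for one agent session.

    Each session gets its own directory and branch, so parallel agents
    do not overwrite each other's working files.
    """
    path = PurePosixPath(repo_root) / ".worktrees" / session_id
    branch = f"session/{session_id}"
    return ["git", "-C", repo_root, "worktree", "add", "-b", branch, str(path), base]


def plan_sessions(repo_root, session_ids):
    """Map each session id to the command that would create its worktree."""
    return {sid: worktree_command(repo_root, sid) for sid in session_ids}
```

Each command list can be passed to `subprocess.run(cmd, check=True)` to actually create the worktree; keeping command construction separate from execution makes the plan easy to inspect or log on a timeline first.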
One network, many nodes
Roadmap
Local LLM, OpenAI and Anthropic adapters, context compression, and routing UI.
Vision, OCR, voice input, local memory, and device-side preprocessing.
QWON Terminal, LAN models, MCP, Shortcuts, and a node selection algorithm.
Technical Preview
The technical preview is being prepared for people who want to follow PREXUS alpha work, QWON Terminal, and local inference evaluation.