Local-first AI runtime for iPhone and Mac

Intelligence that runs near you.

QWON brings together PREXUS mobile runtime research and QWON Terminal's AI teaming model into one execution layer for local inference, context compression, and cloud escalation.

QWON Orchestrator (diagram): Input, Local, Cloud
Intent analysis: 12 ms
Context compression: -68%
Privacy routing: local

Beyond chat UI

Not another cloud prompt box. An execution layer for intelligence.

QWON's core is not the chat surface. It is a decision plane that understands, classifies, and compresses input locally, then routes only the necessary context to OpenAI, Anthropic, Gemini, or LAN models when the task calls for it.

Users should not have to choose the model, prompt shape, or context budget every time. QWON weighs latency, cost, privacy, and capability before selecting the right execution path.
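
One way to picture that trade-off is a small scoring function. The sketch below is illustrative only and is not QWON's actual routing logic: the `Path` fields, the weights, and the sample numbers are all assumptions made up for the example. Privacy and capability act as hard filters, then latency and cost are weighed to pick among the remaining paths.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    """A hypothetical execution path; every field is an illustrative assumption."""
    name: str
    latency_ms: float    # expected round-trip time
    cost_usd: float      # expected cost per request
    leaves_device: bool  # True if context is sent off the device
    capability: int      # rough 1-5 rating of reasoning strength


def choose_path(paths, required_capability, private, w_latency=0.001, w_cost=1.0):
    """Return the lowest-scoring path that satisfies privacy and capability, or None."""
    eligible = [
        p for p in paths
        if p.capability >= required_capability
        and not (private and p.leaves_device)
    ]
    if not eligible:
        return None
    # Lower is better: a weighted sum of latency and cost.
    return min(eligible, key=lambda p: w_latency * p.latency_ms + w_cost * p.cost_usd)


# Sample values are invented for the example, not measurements.
PATHS = [
    Path("local", latency_ms=40, cost_usd=0.0, leaves_device=False, capability=2),
    Path("lan", latency_ms=120, cost_usd=0.0, leaves_device=True, capability=3),
    Path("cloud", latency_ms=900, cost_usd=0.01, leaves_device=True, capability=5),
]
```

With these sample values, a private request that needs only modest capability stays on the local path, while a request that needs the strongest model and is allowed to leave the device goes to the cloud path. A private request that no local path can serve returns `None`, leaving the decision to the user.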

PREXUS runtime

A local-first cognitive runtime designed for iPhone.

01

Local Orchestrator

A compact local model acts as a control model, not merely a responder, deciding intent, task class, policy, and memory recall close to the device.

02

Context Compression

Before anything leaves the device, QWON structures conversation history and input into the smallest useful context for speed, cost, and privacy.
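
As a rough illustration of fitting history into a budget, the sketch below keeps the new input plus the most recent turns that fit. It is a deliberately simple stand-in: the function names are hypothetical, token counts are approximated by whitespace-separated words, and a real compressor would more likely summarize or restructure older turns than drop them.

```python
def approx_tokens(text):
    """Crude token estimate (assumption): one token per whitespace-separated word."""
    return len(text.split())


def compress_history(turns, new_input, budget):
    """Keep the new input plus the most recent turns that fit within `budget` tokens."""
    remaining = budget - approx_tokens(new_input)
    kept = []
    # Walk backwards so the newest turns are considered first.
    for turn in reversed(turns):
        cost = approx_tokens(turn)
        if cost > remaining:
            break
        kept.append(turn)
        remaining -= cost
    kept.reverse()  # restore chronological order
    return kept + [new_input]
```

The new input is always included, even if it alone exceeds the budget; in that case no history is kept.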

03

Multimodal Sensors

Camera, microphone, OCR, and sensor inputs are preprocessed near the device, expanding AI from text response into situated understanding.

04

Cloud Escalation

Deep reasoning and long-context analysis escalate only when needed, using user-configured external models and filtered context.
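
What "filtered context" could mean in practice is sketched below: obvious identifiers are masked before text is handed to an external model. The patterns and the `filter_context` name are assumptions for illustration, cover only email addresses and phone-like numbers, and are far from a complete privacy filter.

```python
import re

# Simplified patterns (assumptions); real filtering would need broader coverage.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def filter_context(text):
    """Mask email addresses and phone-like numbers before escalation."""
    text = EMAIL.sub("[email]", text)
    return PHONE.sub("[phone]", text)
```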

QWON Terminal

A Mac-native terminal for AI teams.

QWON Terminal coordinates Codex CLI, Claude Code, and other agents through multiple sessions, isolated Git worktrees, shared context, and an event timeline built for parallel AI work.

  • Session Grid for 2-16 agents
  • Worktree isolation to prevent conflicts
  • User-controlled handoff between sessions
  • Timeline monitoring for errors and completion
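
Worktree isolation can be pictured as one `git worktree` per agent session, each on its own branch. The sketch below only builds the commands and does not run them. The `.worktrees` directory, the `agent/` branch prefix, and the `main` base branch are assumptions chosen for the example, not QWON Terminal's actual layout; the 2-16 check mirrors the Session Grid range above.

```python
from pathlib import PurePosixPath


def worktree_commands(repo_root, sessions, base="main"):
    """Build one `git worktree add` command per agent session (commands are not executed)."""
    if not 2 <= len(sessions) <= 16:
        raise ValueError("expected between 2 and 16 sessions")
    cmds = []
    for name in sessions:
        # Hypothetical layout: <repo>/.worktrees/<session>, on branch agent/<session>.
        path = PurePosixPath(repo_root) / ".worktrees" / name
        cmds.append([
            "git", "-C", repo_root, "worktree", "add",
            "-b", f"agent/{name}", str(path), base,
        ])
    return cmds
```

Because each session gets its own directory and branch, two agents editing the same file do not overwrite each other's working copy; conflicts surface later, at merge time, where a person can review them.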

One network, many nodes

Treat iPhone, Mac, and cloud as one AI team.

Mobile Node: Job input, notification, lightweight inference
Worker Node: Agent execution, local inference, long-running tasks
Compute Node: Parallel processing and heavy local workloads
Cloud Node: Escalated reasoning with explicit user policy
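
The node roles above can be read as a lookup from duty to node. The sketch below is a minimal illustration under assumed names (`NODE_DUTIES`, `pick_node`, and the duty identifiers are hypothetical); it is not the node selection algorithm planned in the roadmap. The cloud node is returned only when the caller states that user policy allows it.

```python
# Duties per node type, transcribed from the list above; identifiers are hypothetical.
NODE_DUTIES = {
    "mobile": {"job_input", "notification", "lightweight_inference"},
    "worker": {"agent_execution", "local_inference", "long_running_task"},
    "compute": {"parallel_processing", "heavy_local_workload"},
    "cloud": {"escalated_reasoning"},
}


def pick_node(duty, cloud_allowed=False):
    """Return the node type that handles `duty`, or None if unknown or not permitted."""
    for node, duties in NODE_DUTIES.items():
        if duty in duties:
            # Cloud work requires explicit user policy.
            if node == "cloud" and not cloud_allowed:
                return None
            return node
    return None
```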

Roadmap

Think nearby first. Reach farther only when needed.

Phase 1

Core Runtime

Local LLM, OpenAI and Anthropic adapters, context compression, and routing UI.

Phase 2

Multimodal Expansion

Vision, OCR, voice input, local memory, and device-side preprocessing.

Phase 3

AI Runtime Platform

QWON Terminal, LAN models, MCP, Shortcuts, and a node selection algorithm.

Technical Preview

Follow QWON as it takes shape.

The technical preview is being prepared for people who want to follow PREXUS alpha work, QWON Terminal, and local inference evaluation.