Overview
Install
Coding Agent
pi-ai
Agent Core
Packages
Extend
Deploy
Glossary + FAQ
GitHub @badlogic
badlogic/pi-mono

Pi Agent

A minimal TypeScript toolkit for building AI agents and managing LLM deployments. Seven composable packages. MIT licensed. Built by Mario Zechner.

The Pi philosophy: No sub-agents baked in. No plan mode. No permission popups. No background bash. Instead, minimalist primitives and a robust extension system. Users build features they need; the core stays small, honest, and hackable.
Start here if you're new to Pi

What is Pi Agent, actually?

Pi is not trying to be Claude Code, Cursor, or a full AI IDE. It's a TypeScript toolkit from one developer (Mario Zechner) with a radical idea: ship a tiny, understandable core, and let users add the pieces they actually need. The full distribution is only seven small npm packages — you can read and understand each one in an afternoon.

There are two things you can do with Pi. First, use it as a coding agent in your terminal (pi) — four built-in tools (read, write, edit, bash), unlimited extensibility through TypeScript extensions and markdown skills. Second, use its libraries (pi-ai, pi-agent-core) to build your own agent — a Slack bot, a browser chat, a custom workflow, whatever. pi-web-ui even gives you ready-made components for putting a chat UI in a static Cloudflare Pages site.

Who it's for: developers who'd rather compose 50 lines than configure 500. If “I want an agent that does exactly X, nothing more, nothing less” sounds good, Pi is probably right for you.

In plain terms
Monorepo = one git repo that holds multiple related packages. pi-mono holds all seven Pi packages together — they share a build system and depend on each other cleanly.

Composable = each package does one thing well and plays nicely with the others. You don't need all seven; grab only what you need.

Subscription auth (the Pi superpower) = use the Claude/ChatGPT/Copilot subscription you already pay for instead of a separate per-token API bill. Pi logs you in through OAuth and piggybacks on your existing plan.

Minimalism as a constraint — Pi deliberately omits features (plan mode, sub-agents, permission popups) that other tools treat as table stakes. The bet: most of those are poorly generalized, and a user who needs them can build them as an extension that fits their situation exactly.
TypeScript
MIT License
7 packages
20+ LLM providers
OAuth + API keys
Subscription auth

Explore the Guide

Install & Quickstart
Install globally, authenticate via OAuth subscription or API key, and start pi.
npmoauth
Coding Agent CLI
Slash commands, @-mentions, sessions, branching, keyboard shortcuts.
piTUI
pi-ai Library
Unified LLM API for 20+ providers with TypeBox tool schemas.
librarytypebox
Agent Core
Stateful agent runtime with tool execution and event streaming.
runtimeevents
All Packages
pi-tui, pi-mom, pi-web-ui, pi-pods — the supporting cast.
monorepotui
Extensions & Skills
TypeScript extensions, skills, prompt templates, themes, Pi Packages.
extensionsskills
Self-Hosted Deployment
Deploy pi-coding-agent as a Cloudflare Pages site using Wrangler.
cloudflarewrangler

Quick Links

Monorepo
badlogic/pi-mono
@mariozechner/pi-coding-agent
npm package
@mariozechner/pi-ai
Unified LLM library
Mario Zechner
mariozechner.at
Getting Started

Install & Quickstart

Pi installs as a global npm package. You can authenticate via OAuth using an existing Claude/ChatGPT/Copilot subscription, or provide an API key directly.

Install

# Install the coding agent globally npm install -g @mariozechner/pi-coding-agent # Launch pi

Auth Option A — Subscription OAuth

If you already pay for Claude Pro/Max, ChatGPT Plus/Pro, GitHub Copilot, Google Gemini CLI, or Google Antigravity, you can use your subscription without an API key.

pi # Inside the TUI: /login # Select provider — opens browser for OAuth flow

Auth Option B — API Key

export ANTHROPIC_API_KEY=sk-ant-... # or export OPENAI_API_KEY=sk-... # or any supported provider pi
Why subscription auth matters: Subscription OAuth lets you use Pi without paying twice. If you already have Claude Pro, you get Claude Sonnet/Opus access through Pi at no extra cost—same for ChatGPT Plus and Copilot.
In plain terms
npm install -g = install globally, so the pi command is available from any folder. The -g flag (versus local install) means Pi lives in a shared location, not in each project's node_modules/.

OAuth subscription flow: Pi opens a browser page at your provider (e.g. claude.ai), you log in with your normal account, provider redirects back to Pi with a token, Pi saves that token locally. Now every request Pi makes goes through your subscription instead of billing a separate API account.

One-shot mode (pi "question") = don't open the interactive TUI; run once, print the answer, exit. Great for piping into other commands or scripts.

AGENTS.md / CLAUDE.md = a markdown file in your project or home dir with instructions Pi reads before every request. Think of it as your project's ambient context — coding style, don't-dos, domain glossary. Pi supports both names for compatibility with Claude Code.

Project-level overrides: settings in .pi/settings.json at your project root take precedence over the global ~/.pi/agent/settings.json. Different projects can have different defaults (different model, stricter thinking level, etc.) without editing globals.
Common mistakes
  • Skipping -g and wondering why pi isn't on PATH. Local npm installs live inside the current folder's node_modules/.bin/ — you'd have to invoke via npx pi or a local script. For the global CLI experience, use -g.
  • Mixing subscription OAuth and direct API keys for the same provider. Pi will happily use either, but if both are configured it can pick the wrong one. Remove one when you settle on the other.
  • Assuming subscription OAuth covers everything. Subscriptions have usage caps (Claude Pro has message limits; Copilot has monthly quotas). When you hit them, Pi's provider returns errors until reset. API keys don't have this — they just bill more.
  • Putting API keys in settings.json. That file is meant to be committable (coding conventions, project model choice). Keep secrets in your shell's env vars or a .env file Pi loads separately.
  • Forgetting that .pi/settings.json doesn't auto-create. The directory-level override only exists if you make it. Start with the global, then override per-project as needed.
  • Editing SYSTEM.md expecting it to stack with the default. SYSTEM.md replaces the system prompt entirely. If you only want to add instructions, use AGENTS.md instead.

First Prompt

# Interactive mode pi # One-shot mode pi "List all .ts files in src/" # Print mode for scripting pi -p "Summarize this codebase" # Pipe input cat README.md | pi -p "Summarize this text" # Specific model pi --model openai/gpt-4o "Help me refactor" # High thinking effort pi --thinking high "Solve this complex problem" # Restrict tools pi --tools read,grep,find,ls -p "Review the code"

Configuration Locations

PathPurpose
~/.pi/agent/settings.jsonGlobal settings (thinking level, theme, transport)
.pi/settings.jsonProject-level overrides
~/.pi/agent/SYSTEM.mdGlobal system prompt replacement
.pi/SYSTEM.mdProject system prompt replacement
~/AGENTS.md or ./AGENTS.mdContext file (or CLAUDE.md)
~/.pi/agent/prompts/Global prompt templates (invoked via /name)
~/.pi/agent/skills/Global skills (Agent Skills standard)
~/.pi/agent/models.jsonCustom model definitions
~/.pi/agent/themes/Theme files (hot-reload)
pi-coding-agent

Coding Agent CLI

A minimal terminal coding harness with four built-in tools: read, write, edit, bash. Everything else is opt-in via extensions.

Why only four tools?

Because those four are enough, and everything else distracts.

Claude Code has ~20 built-in tools. Cursor has dozens. Pi has four. The rationale: every extra built-in is a decision someone made about your workflow, baked in where you can't remove it. Four tools cover what a coding agent fundamentally needs — read code, write code, edit code, run things — and leave room for you to add the specific capabilities your work wants as TypeScript extensions. No popups, no sub-agents, no hidden background bash. What you see in the session is what happened.

In plain terms
JSONL = JSON Lines. One JSON object per line, one line per event. Human-readable, appendable, easy to diff and grep. Pi stores every session turn as a JSON line — you can tail -f a running session or search across every conversation with grep.

Branching = fork a conversation at any message. You get two independent futures stemming from the same past — useful when you want to try “what if I asked it differently” without losing the original thread.

Compact = summarize older turns to free up context space, preserving the essence. Different from /new, which wipes history entirely.

Thinking level = how much private reasoning the model does before replying. Shift+Tab cycles off / low / medium / high. Higher levels pay in latency and cost but dramatically improve output on complex problems.

Bang injection: !cmd runs a shell command and sends the output to the model along with your next message. !!cmd runs but silently — you see it, model doesn't. Fast way to give the model live context (git status, file list, error output) without typing it yourself.

@-mentions = type @ and fuzzy-match any file in your project. Picks the file; Pi automatically adds its contents to the message as context. Saves copy-pasting.

Built-in Tools

read

Read files from the filesystem with line numbers. Supports ranges.

write

Create or overwrite files with specified content.

edit

Exact string replacement edits—diff-based, reviewable changes.

bash

Execute shell commands, capture stdout/stderr, and return output.

Slash Commands

CommandPurpose
/loginOAuth authentication flow (select provider)
/logoutClear stored credentials
/modelSwitch AI models interactively
/settingsConfigure thinking, theme, delivery mode
/treeBrowse session history and branches
/forkCreate a new session branching from current point
/newStart a fresh session
/compactSummarize older messages to recover context
/copyCopy assistant's last message to clipboard
/export [file]Save session as HTML
/skill:nameInvoke a loaded skill by name
/templatenameExpand a prompt template

Editor Features

@-mentions

Type @ to fuzzy-search and reference project files inline.

Multi-line

Shift+Enter for newline (Ctrl+Enter on Windows Terminal).

Image Paste

Ctrl+V to paste images, or drag-and-drop from the desktop.

Bash Injection

!cmd runs and sends output; !!cmd runs silently.

Keyboard Shortcuts

ShortcutAction
Ctrl+LOpen model selector
Ctrl+P / Shift+Ctrl+PCycle through scoped models
Shift+TabAdjust thinking level (off / low / medium / high)
Escape (x2)Open session tree navigator
Ctrl+OCollapse / expand tool output
Ctrl+TCollapse / expand thinking blocks

Session Management

Sessions persist as JSONL files with branching support. Every message can be the root of a new fork—experiment freely without losing prior work.

JSONL storageAppend-only, human-readable session history
Branching/fork creates a new branch at any point
Navigation/tree browses the session graph
Compaction/compact summarizes old turns to recover context window
Export/export session.html renders a portable view
Walk-through

Branching a tricky refactor without losing work

  1. step 1 You're debugging a tangled state-management bug. You've had a 40-turn conversation with Pi that's finally narrowed down to two possible fixes: inline the state into the component, or lift it into a context provider.
  2. step 2 You're not sure which approach is right. Type /fork. Pi creates a new session branching at your current point — same history up to here, new future from here.
  3. step 3 In the fork, you say: “go with the context provider approach.” Pi writes the changes. You see how it plays out, run the test suite.
  4. step 4 Results are meh. Without losing the work, you type /tree to see the session graph — you can see the original branch and your experimental fork as two siblings.
  5. step 5 You hop back to the original branch, fork it again, try the inline approach instead. That works better.
  6. step 6 You keep the inline-approach branch, abandon the other. Nothing was destroyed — both branches are still on disk as JSONL files in case you ever want to compare.
  7. step 7 When you're done, /export renders the session to a portable HTML file you can attach to the PR as a record of how you arrived at the fix.
Common mistakes
  • Forgetting to /compact on long sessions. Context windows fill up. You'll see Pi refuse to accept more input or silently drop older context. Compact when you notice responses feeling “thin” — before you hit the wall.
  • Using /new when you meant /fork. /new starts a completely fresh session with no history. /fork keeps history up to the fork point. If you wipe a session by mistake, the JSONL file is still on disk — you can /tree your way back.
  • Running !cmd when you meant !!cmd. The single-bang sends command output into the model context, eating tokens. For diagnostic peeks you just want to see, use !!.
  • Assuming restricted tools via --tools are enforced mid-session. --tools read,grep applies for that invocation only. If you start a new session, the default tools come back.
  • Editing files outside the session and losing track. Pi's edit tool does exact-match string replacement — if the file changed between read and edit, the edit fails. Either pause Pi while editing manually or re-read the file first.
  • Thinking level stuck on high. The Shift+Tab cycle is session-scoped. If responses are slow and expensive, check your thinking level before blaming the model.

Execution Modes

Interactive

Default TUI mode—multi-line editor, streaming, session branching.

Print (-p)

Non-interactive output for scripting and pipelines.

JSON

Structured output for programmatic consumption.

RPC

Integration with parent processes over stdio RPC.

SDK

Embed Pi directly into your own TypeScript applications.

@mariozechner/pi-ai

Unified LLM Library

TypeScript library providing unified access to 20+ LLM providers with automatic model discovery, token tracking, cost monitoring, and context persistence.

Design constraint: pi-ai only includes models that support tool calling. This makes it ideal for agentic workflows but means you won't find raw text-completion-only models here.
Why this library exists

Every LLM provider has a slightly different API — and it's exhausting.

Anthropic uses messages with a content array. OpenAI uses the “chat completions” shape. Gemini wants yet another structure. Each supports tool-calling, but the JSON schema for how a tool is defined and how a tool call comes back is different for every one of them. Reasoning tokens, streaming events, cost fields — all different. Writing an agent from scratch against four providers means writing four agents.

pi-ai normalizes all of it into one TypeScript API. You write code once against the complete() / stream() interface. Behind the scenes, pi-ai translates to whichever provider you pointed at. Switch from Claude to GPT to Qwen with a one-line change — the rest of your agent doesn't notice.

In plain terms
TypeBox = a TypeScript library for describing JSON schemas with static types. Type.Object({ location: Type.String() }) gives you both a validator at runtime and a compile-time type. pi-ai uses TypeBox to define tool parameters — your tool definitions are type-safe end-to-end.

Unified reasoning (Extended Thinking): different providers expose reasoning differently — Anthropic has “extended thinking,” OpenAI has “o1/o3” reasoning tokens, Gemini has its own flavor. pi-ai boils them down to one knob: reasoning: 'off' | 'low' | 'medium' | 'high'. Library handles provider-specific translation.

Streaming deltas = the model sends its response in chunks as it generates. text_delta is a new piece of the visible reply; toolcall_delta is a piece of a tool's JSON arguments as they're being built; thinking_delta is a piece of the model's private reasoning. All arrive live; done fires at the end.

openai-completions API dialect = the OpenAI chat-completions shape has become the de-facto standard. Ollama, LM Studio, vLLM, Together, Groq — they all speak it. So when you point pi-ai at http://localhost:11434/v1 (Ollama), it just works.

Proxy URL = route requests through your own backend instead of calling the provider directly. Essential for browser apps where shipping the API key to the user's browser would leak it.

Supported Providers

OpenAI
Anthropic
Google Gemini
Google Vertex AI
Azure OpenAI
Amazon Bedrock
Mistral
Groq
Cerebras
xAI
OpenRouter
Vercel AI Gateway
MiniMax
Kimi / Moonshot
GitHub Copilot
Ollama / LM Studio / vLLM

Basic Usage

import { getModel, complete, stream } from '@mariozechner/pi-ai'; // Type-safe model discovery with auto-complete const model = getModel('openai', 'gpt-4o-mini'); const context = { systemPrompt: 'You are a helpful assistant.', messages: [ { role: 'user', content: 'Hello!' } ] }; const result = await complete(model, context); console.log(result.content, result.cost);

Streaming

for await (const event of stream(model, context)) { switch (event.type) { case 'text_delta': process.stdout.write(event.delta); break; case 'toolcall_delta': // Partial tool arguments as JSON stream break; case 'thinking_delta': // Reasoning content break; case 'done': console.log('Stop reason:', event.stopReason); break; } }

Tool Calling with TypeBox

import { Type, StringEnum } from '@mariozechner/pi-ai'; const tools = [{ name: 'get_weather', description: 'Get current weather', parameters: Type.Object({ location: Type.String(), units: StringEnum(['celsius', 'fahrenheit']) }) }]; const result = await complete(model, { ...context, tools });

Extended Thinking

// Unified reasoning across Claude, GPT-5, Gemini 2.5 await completeSimple(model, context, { reasoning: 'high' // 'off' | 'low' | 'medium' | 'high' });

OAuth Login

import { loginGitHubCopilot, getOAuthApiKey } from '@mariozechner/pi-ai/oauth'; const credentials = await loginGitHubCopilot({ onVerificationUri: (uri) => console.log('Visit:', uri) }); const { apiKey } = await getOAuthApiKey('github-copilot', auth);

Streaming Events

EventDescription
text_deltaStreamed response text chunk
toolcall_deltaPartial tool arguments (JSON streaming)
thinking_deltaModel reasoning content
doneCompletion with stop reason and token usage
errorFailure with any partial content preserved

Custom Endpoints

const ollamaModel = { id: 'llama-3.1-8b', api: 'openai-completions', baseUrl: 'http://localhost:11434/v1', reasoning: false, input: ['text'], cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }, contextWindow: 128000, maxTokens: 32000 };
Cross-provider handoffs: Switch models mid-conversation while preserving thinking blocks and tool results. Thinking automatically converts to <thinking> tagged text for providers that don't support native reasoning.
Walk-through

Switching from Claude to GPT mid-conversation without breaking anything

  1. step 1 You're 15 turns into a design session with Claude Opus. It's been carefully reasoning; thinking blocks are embedded in the history. You hit Claude's reasoning rate limit.
  2. step 2 You swap the model: model = getModel('openai', 'gpt-4o'). Nothing else in your code changes.
  3. step 3 On the next stream() call, pi-ai inspects the context. Claude-style thinking blocks don't map to GPT-4o's format — so pi-ai automatically converts them into <thinking>...</thinking> tagged plain text that GPT-4o can read as context.
  4. step 4 Claude's tool-call messages get reshaped to the OpenAI tool_calls format. Tool result messages get re-keyed.
  5. step 5 GPT-4o gets a coherent conversation history and responds. The streaming events that come back are the same text_delta / done shape as Claude — your consumer doesn't know a switch happened.
  6. step 6 A few turns later, you swap to a local Ollama model via Custom Endpoint for privacy. pi-ai converts the OpenAI-shaped history to openai-completions shape (mostly identical) — Ollama picks up the thread.
  7. step 7 Total code changes across three provider hops: three strings. pi-ai absorbed everything else.
Common mistakes
  • Using a model that doesn't support tool calling and wondering why tools never fire. pi-ai's registry filters to tool-capable models, but Custom Endpoint models you register yourself bypass that filter. Verify.
  • Shipping API keys in a browser bundle. pi-ai works in the browser, but calls to provider APIs from the browser either CORS-fail or expose your key. Use the proxyUrl option with a Worker or backend that holds the key server-side.
  • Forgetting that reasoning: 'high' costs real money. High-reasoning Claude/GPT runs can easily eat $0.10+ per turn. Track result.cost and set budgets.
  • Assuming streaming events arrive in a specific order across providers. Most providers interleave thinking_delta / text_delta, but timing varies. Write consumer code that handles any order.
  • Defining tools without TypeBox. You can technically pass plain JSON Schema — but you lose the compile-time types pi-ai generates from TypeBox. Use TypeBox; future-you will appreciate the IntelliSense.
  • Registering a Custom Endpoint with wrong contextWindow. pi-ai uses that number to decide when to warn about overflow. Too high = silent truncation. Check the model's real context window and enter it accurately.
@mariozechner/pi-agent-core

Agent Core Runtime

Stateful agent with tool execution and event streaming, built on pi-ai. This is the engine powering the coding agent and the foundation for your own custom agents.

Why a separate runtime?

Agents need more than “call an LLM in a loop.”

pi-ai handles one LLM request: send context, get response, stream deltas. But an agent needs to: remember the conversation across turns, decide when tool calls are done, run tools safely (and possibly in parallel), handle interruptions, post-process results, surface progress events to a UI. That's all state management and orchestration — not LLM stuff.

pi-agent-core is exactly that layer: a small state machine that owns the conversation, fires a clean set of events as things happen, and lets you hook in before/after tool calls. Build a coding CLI, a Slack bot, a browser chat, or your own wild experimental UI — they all share this same engine. The coding agent in Pi is, itself, just a consumer of pi-agent-core.

In plain terms
Stateful = remembers things between method calls. agent.prompt("hi"), then later agent.prompt("again") — the agent knows they're in the same conversation without you re-feeding history.

Event-driven = instead of making you poll or wait, the agent tells you when things happen: message started, tool firing, tool done, turn complete. You subscribe to the events you care about. Perfect for streaming UIs.

Turn = one full cycle of (think → run tools → wait → think again → run more tools → reply). A single user prompt can trigger multiple turns if the agent needs to iterate.

transformContext() = a hook that runs just before pi-agent-core sends history to the LLM. Use it to filter sensitive data, summarize old turns, redact internal-only info, whatever you need. The agent's real state stays intact; only what the LLM sees gets transformed.

Parallel vs sequential tool execution: when the model requests 5 tools at once, parallel fires them all simultaneously (fast, assumes independence). Sequential runs them one after the other (safer when later tools might depend on earlier results).

beforeToolCall / afterToolCall: hooks to intercept tool execution — validate parameters, show a confirmation prompt, log calls, modify results. The “permission system” Pi doesn't include by default is a short extension using these hooks.

Quick Example

import { Agent } from '@mariozechner/pi-agent-core'; import { getModel } from '@mariozechner/pi-ai'; const agent = new Agent({ initialState: { systemPrompt: "You are a helpful assistant.", model: getModel("anthropic", "claude-sonnet-4-20250514"), }, }); agent.subscribe((event) => { if (event.type === "message_update") { process.stdout.write(event.assistantMessageEvent.delta); } }); await agent.prompt("Hello!");

Message System

pi-agent-core distinguishes between two message formats:

AgentMessagesInternal format: user, assistant, toolResult, plus custom types
transformContext()Pre-LLM transformation hook (filter, summarize, redact)
convertToLlm()Bridge: AgentMessages → Messages that models understand
LLM MessagesThe subset of messages sent to the model

Event System

EventDescription
agent_startAgent run has begun
agent_endFinal event—no more loop events will fire
turn_startA new LLM-call-plus-tool-execution cycle begins
turn_endThe current turn completes
message_startNew assistant message begins streaming
message_updateContent delta for an in-progress message
message_endMessage fully assembled
tool_execution_startA tool call is about to execute
tool_execution_updateProgress update from the tool
tool_execution_endTool call completed with result

Tool Execution

Tools are defined with AgentTool using TypeBox schemas for parameters. Two execution strategies are available:

Parallel (default)

Preflight sequentially, then execute all pending tool calls concurrently. Faster for independent operations.

Sequential

Execute one-by-one in order. Safer when tool calls have implicit dependencies.

Tool contract: Tools must throw errors on failure rather than returning error messages. The beforeToolCall hook can block execution; afterToolCall post-processes results.
Walk-through

Events firing during one agent.prompt("fix the failing test")

  1. t+0ms You call agent.prompt("fix the failing test"). agent_start fires. State flips to isStreaming: true.
  2. t+30ms turn_start fires — we're entering the first think-and-tool cycle.
  3. t+40ms message_start — the model begins streaming its reply.
  4. t+200ms Stream of message_update events carries visible text deltas. Your UI streams them into the screen. Some deltas are thinking (private reasoning) — render them differently or hide them.
  5. t+800ms Model decides it needs tools: run bash: npm test, read a source file. tool_execution_start fires twice (parallel mode).
  6. t+810ms If you've registered beforeToolCall, it runs. You could pop a confirmation dialog here. For this example, it passes through.
  7. t+2100ms Tests finish. tool_execution_end fires with exit code and output. afterToolCall post-processes (e.g., truncates huge stdout to something sane).
  8. t+2200ms message_end (from the first message). turn_end. Then turn_start again — the model now has tool results and starts a second think cycle.
  9. t+4000ms Second turn produces an edit tool call that patches the failing assertion. tool_execution_* fires. New message summarizing the fix streams in.
  10. t+5000ms Model stops; no more tool calls. Final message_end, turn_end, and agent_end. isStreaming flips back to false.
Common mistakes
  • Returning error strings from tools instead of throwing. The Agent can't distinguish “tool succeeded and returned text that looks like an error” from “tool failed.” Throw. The Agent's retry + error-handling logic kicks in properly.
  • Mutating agent.state.messages directly mid-stream. The runtime is actively reading/writing that array. Use the provided APIs; treat state as read-mostly from outside.
  • Assuming parallel tool execution is safe. If two tools both write to the same file, races happen. Tag tools with dependencies or switch to sequential for this group.
  • Subscribing to every event for a simple UI. You usually just need message_update (streaming text) and tool_execution_* (progress indicators). The others are for advanced cases.
  • Using agentLoop directly without understanding you're bypassing the state machine. The low-level API is powerful but places the burden of correctness on you. Start with Agent; drop down only when you've hit its opinionated walls.
  • Forgetting that transformContext runs every LLM call. A slow or expensive transformation multiplies cost. Cache where possible.

State Management

// Access agent state agent.state.systemPrompt agent.state.model agent.state.thinkingLevel agent.state.tools agent.state.messages agent.state.isStreaming agent.state.streamingMessage agent.state.pendingToolCalls agent.state.errorMessage // Assigning arrays copies the top-level array before storing agent.state.tools = newTools;

Advanced Features

Steering / Follow-up

Interrupt running operations or queue work to run after completion.

Custom Messages

Extend message types via TypeScript declaration merging—add your own discriminated variants.

Proxy Support

Browser apps can proxy through backend servers to avoid exposing keys client-side.

Low-Level API

Direct control via agentLoop for advanced use cases where the Agent class is too opinionated.

Monorepo

All Packages

Seven packages that compose together. Use the full stack, or pick pieces for your own project.

Which package do I actually need?

I want to...UseWhy
Use Pi as a coding agent in my terminalpi-coding-agentThe full CLI. Install globally with npm i -g.
Call an LLM from my own TypeScript codepi-aiUnified API across 20+ providers. No agent loop — just model calls.
Build a custom agent (Slack bot, Discord bot, custom CLI)pi-agent-core + pi-aiCore gives you the stateful loop; ai gives you the LLM calls.
Make a terminal UI for my own toolpi-tuiFlicker-free rendering, components, image support. No Pi-agent coupling.
Build an AI chat UI in a browserpi-web-ui + pi-agent-coreWeb components + IndexedDB storage. Deploy as a static site.
Have a bot read my Slack and respondpi-momPre-built Slack agent. Sandboxed, self-managing.
Run open-source models on rented GPUspi-podsAutomates vLLM setup across providers. OpenAI-compatible endpoint out.
In plain terms
Flicker-free rendering = when a terminal UI redraws the screen, a naive approach wipes everything and re-paints — users see a brief flash. Pi-tui uses a three-strategy diff to only rewrite what changed, and CSI 2026 (a terminal escape sequence) to tell the terminal “don't show intermediate frames.” Result: redraws look instant and smooth.

Bracketed paste = a terminal mode where pasted text arrives wrapped in special markers. Without it, pasting 50 lines into a prompt breaks because each newline submits early. With it, Pi-tui's editor receives the whole paste as one event.

IME support (Input Method Editor) = the composition system used for CJK (Chinese, Japanese, Korean) input and other multi-keystroke scripts. Pi-tui keeps the cursor visually in the right place while you're typing a multi-character glyph.

Kitty / iTerm2 graphics protocols = ways to render actual images inside a terminal (not ASCII art). Pi-tui emits the right escape sequences so supported terminals can show photos, diagrams, screenshots inline with your chat.

mini-lit = a stripped-down build of the Lit web components library. Pi-web-ui uses it to ship standards-based web components (custom HTML tags) with minimal bundle size.

IndexedDB = a browser-native local database. Pi-web-ui puts your sessions, preferences, and credentials there — no server needed. Survives page reloads; private to your browser.

vLLM = a high-throughput inference engine for open-source models. Pi-pods installs it on your rented GPU, exposes an OpenAI-compatible endpoint, so anything that speaks OpenAI (including pi-ai) can hit your private model.

Package Map

MONOREPO STRUCTURE pi-mono/ ├── packages/ │ ├── ai/ # @mariozechner/pi-ai │ ├── agent/ # @mariozechner/pi-agent-core │ ├── coding-agent/ # @mariozechner/pi-coding-agent │ ├── tui/ # @mariozechner/pi-tui │ ├── web-ui/ # @mariozechner/pi-web-ui │ ├── mom/ # @mariozechner/pi-mom (Slack bot) │ └── pods/ # pi-pods (GPU deployment) ├── AGENTS.md # Development rules ├── test.sh └── pi-test.sh # Run from source

pi-tui

Minimal terminal UI framework with flicker-free rendering. Three-strategy rendering only updates what changed, and CSI 2026 synchronized output prevents visual tearing.
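The synchronized-output trick can be sketched in a few lines of TypeScript. The two escape sequences are the real CSI 2026 codes; the helper function is illustrative, not pi-tui's actual API:

```typescript
// CSI 2026 "synchronized output": the terminal buffers everything
// between BSU (begin) and ESU (end) and paints it as one frame.
const BSU = "\x1b[?2026h"; // begin synchronized update
const ESU = "\x1b[?2026l"; // end synchronized update

// Illustrative helper (not pi-tui's real API): wrap a batch of
// writes so the terminal never shows a half-drawn frame.
function synchronized(frame: string[]): string {
  return BSU + frame.join("") + ESU;
}

const out = synchronized(["\x1b[2J", "\x1b[H", "Hello, Pi!"]);
// Emitting `out` via process.stdout.write paints atomically on
// terminals that support CSI 2026 (Kitty, iTerm2, WezTerm, ...).
```

On terminals without CSI 2026 support the sequences are ignored, so the wrapper degrades gracefully.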

Components

Text, Input, Editor, Markdown, Loader, SelectList, SettingsList, Image, Box, Container

Editor

Multi-line editing, syntax-highlighted code blocks, vertical scrolling

Bracketed Paste

Handles large pastes exceeding 10 lines without input glitches

IME Support

CJK input methods with proper cursor positioning

Inline Images

Kitty or iTerm2 graphics protocols for rendering images

Autocomplete

File paths and slash commands with fuzzy matching
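The core idea of fuzzy matching can be sketched as a simple subsequence check. This is illustrative only; pi-tui's real matcher likely adds scoring and ranking:

```typescript
// Minimal subsequence fuzzy matcher (illustration, not pi-tui's
// actual algorithm): the query matches if its characters appear
// in the candidate in order, with gaps allowed.
function fuzzyMatch(query: string, candidate: string): boolean {
  let qi = 0;
  const q = query.toLowerCase();
  const c = candidate.toLowerCase();
  for (let ci = 0; ci < c.length && qi < q.length; ci++) {
    if (c[ci] === q[qi]) qi++;
  }
  return qi === q.length;
}

const files = ["src/glossary.md", "package.json", "skills/pdf/SKILL.md"];
const hits = files.filter((f) => fuzzyMatch("skm", f));
// hits: ["skills/pdf/SKILL.md"]
```

This is why typing a handful of characters like "skm" narrows a long file list to the path you meant.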

pi-mom — Slack Agent

An autonomous Slack bot powered by an LLM. Responds to @mentions, DMs, and can execute bash, read/write files, and manage its own development environment.

  • Self-managing: installs dependencies and builds custom CLI tools ("skills") on demand
  • Sandboxed: Docker containers by default, host mode available
  • Persistent: conversation history, files, and tools all in one controlled directory
  • Memory: global and per-channel memory files for cross-session context
npm install @mariozechner/pi-mom
# Then create a Slack app with Socket Mode enabled
# Set env vars: SLACK_BOT_TOKEN, SLACK_APP_TOKEN, etc.

pi-web-ui

Reusable web components for building AI chat interfaces. Built on mini-lit web components and Tailwind CSS v4.

ChatPanel

Primary high-level interface: messages + artifacts viewer

AgentInterface

Lower-level chat for custom layouts with attachments and thinking

Artifacts

JavaScript REPL, HTML, SVG, Markdown rendering

File Handling

PDFs, Word, spreadsheets, presentations, images

IndexedDB Storage

Sessions, credentials, preferences persist in the browser

Local Providers

Ollama, LM Studio, vLLM—plus automatic CORS proxy

pi-pods — GPU Deployment

A Node.js CLI for running large language models on remote GPU pods. Automates vLLM setup, configures tool calling for agentic models, and exposes OpenAI-compatible endpoints.

DataCrunch
RunPod
Vast.ai
Prime Intellect
AWS EC2
Any Ubuntu + NVIDIA
export HF_TOKEN=token
export PI_API_KEY=key

# Set up a pod with shared NFS storage
pi pods setup dc1 "ssh root@1.2.3.4" --mount "nfs_command"

# Start a model with smart GPU allocation
pi start Qwen/Qwen2.5-Coder-32B-Instruct --name qwen

# Interactive chat against deployed model
pi agent qwen -i
Multi-model per pod: pi-pods automatically assigns models to different GPUs on the same machine. Configure memory fraction (30–90%), context window (4k–128k), and GPU count per model.
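The allocation arithmetic behind memory fractions is straightforward. This sketch assumes an 80 GB GPU and is purely illustrative, not pi-pods' real allocator:

```typescript
// Illustrative arithmetic only, not pi-pods' real allocator.
// A memory fraction says how much of one GPU a model may claim;
// fractions sharing a GPU must not exceed 1.0 combined.
const GPU_GB = 80;

function allocate(fractions: number[]): number[] {
  const total = fractions.reduce((a, b) => a + b, 0);
  if (total > 1.0) throw new Error("over-committed GPU");
  return fractions.map((f) => f * GPU_GB);
}

const gb = allocate([0.6, 0.3]); // two models sharing one GPU
// gb: [48, 24], i.e. 48 GB and 24 GB, leaving 8 GB of headroom
```

The headroom matters in practice: KV-cache growth at large context windows eats into whatever the fraction reserves.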
Common mistakes
  • Installing every package when you only need pi-ai. Each package has its own footprint. npm i @mariozechner/pi-ai alone is ~1 MB. The full stack is 10× that.
  • Using pi-tui for rendering in a way that bypasses its redraw engine. If you console.log() inside a pi-tui app, the screen tears. All terminal output must go through pi-tui's write API.
  • Letting pi-mom run in host mode by default. Host mode means Slack messages can trigger bash on your machine. Keep it in Docker sandbox mode unless you have a specific, scoped reason to go host.
  • Running pi-pods without HF_TOKEN set. Most models on HuggingFace need an auth token for download. pi-pods will fail the download — but on slow connections the failure may surface late and quietly. Verify HF_TOKEN is exported before starting.
  • Shipping pi-web-ui without a Worker proxy for API keys. pi-web-ui supports provider keys for convenience in dev, but in production you must never ship keys to the browser. Proxy through a Worker (see Deploy tab).
  • Assuming pi-tui works on Windows cmd.exe. It expects a modern terminal with ANSI support. Windows Terminal, WSL, macOS Terminal, iTerm2, Kitty, Alacritty all work. cmd.exe/PowerShell conhost does not.
Customization

Extensions & Skills

The Pi philosophy is minimalism at the core, extensibility everywhere else. Four layers let you shape the agent to your workflow.

Four layers, one question

Which extension mechanism should I use?

Pi gives you four places to hook in, from simplest to most powerful: prompt templates (markdown you invoke with /name), skills (markdown + supporting files following an open standard), TypeScript extensions (full programmatic access), and Pi Packages (bundles of any of the above). The right choice depends on whether you're capturing a workflow, adding a tool, or shipping to others.

Extension layer picker

I want to...                                  | Use                    | Trade-off
Save a prompt I type often                    | Prompt template        | Fastest to write. Only pastes text into the session. No logic.
Teach Pi a repeatable workflow (with scripts) | Skill                  | Portable (Agent Skills standard). Can include files the agent reads. Still declarative.
Add a new tool, slash command, or UI element  | Extension (TypeScript) | Full power. Runs code. Requires build step.
Distribute my work to others                  | Pi Package             | Wraps any of the above. Installable via npm or git.
In plain terms
Agent Skills standard = an open spec (originally from Anthropic) defining how a skill folder is laid out: one SKILL.md with YAML frontmatter (name, description, tags, when-to-use) plus any supporting files in the same folder. Pi reads the standard; so do other agent tools — skills port between them.

YAML frontmatter = the metadata block at the top of a markdown file, delimited by --- lines. Machine-parseable, human-readable. Pi uses it to decide when a skill applies without needing to read the whole body.

Hot reload (themes) = Pi watches theme files for changes and re-applies them live — no restart. Edit colors in your editor, see them immediately in the terminal.

Pi Package = an npm-installable bundle that can include any combination of prompts, skills, extensions, themes. pi install npm:@foo/bar pulls it in. Good for distributing a team's shared customizations or publishing open-source.

“things Pi deliberately doesn't bake in” — the Pi philosophy. Plan mode, sub-agents, permission systems, git checkpointing — all exist as community extensions because they depend heavily on what you actually want. Better to have 10 flavors of plan-mode extension than one forced into the core.

Four Extension Layers

Prompt Templates: Markdown files in ~/.pi/agent/prompts/ — invoke with /name
Skills: SKILL.md files following the Agent Skills standard
Extensions: TypeScript modules registering tools, commands, shortcuts, UI
Themes: Theme files in ~/.pi/agent/themes/ with hot reload

Prompt Templates

# Create ~/.pi/agent/prompts/commit.md

---
name: commit
description: Review changes and write a commit message
---

Review the current git diff. Write a conventional commit message
following the project's style. Then propose running git commit
with that message.

# Invoke inside pi:
/commit

Skills

Skills follow the Agent Skills standard: a SKILL.md file with YAML frontmatter describing the skill's purpose, plus supporting files (scripts, templates, references) in the same directory.

~/.pi/agent/skills/pdf-extract/
├── SKILL.md    # Metadata + instructions
├── extract.py  # Supporting script
└── README.md   # Documentation for the skill

# Invoke in pi:
/skill:pdf-extract

Extensions (TypeScript)

Extensions are TypeScript modules that can register:

  • Custom tools (beyond read/write/edit/bash)
  • Additional slash commands
  • Keyboard shortcuts
  • UI components (leveraging pi-tui)
  • Transport implementations
  • Editor integrations
Community examples include: git checkpointing, SSH remote execution, MCP integration, custom code editors, plan mode, sub-agent spawners, and permission systems—all the things Pi deliberately doesn't bake in.
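As a rough sketch of what a custom tool looks like — the CustomTool interface and execute signature below are hypothetical illustrations, not Pi's real extension types; consult pi-coding-agent's extension docs for the actual registration API:

```typescript
// Hypothetical shape only: Pi's real extension API differs.
// The point is the anatomy: a name and description the model
// sees, plus an execute function Pi runs on the tool call.
interface CustomTool {
  name: string;
  description: string;
  execute(args: Record<string, unknown>): string;
}

const wordCount: CustomTool = {
  name: "word_count",
  description: "Count words in a string passed by the agent",
  execute(args) {
    const text = String(args.text ?? "");
    return String(text.trim().split(/\s+/).filter(Boolean).length);
  },
};

// wordCount.execute({ text: "four tools, infinite hacks" }) returns "4"
```

An extension would hand an object like this to Pi's registration hook; the model then sees word_count alongside read/write/edit/bash.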

Pi Packages

Bundle prompts, skills, extensions, and themes into distributable Pi Packages. Ship via npm or git.

# Install from npm
pi install npm:@foo/pi-tools

# Install from git
pi install git:github.com/user/repo

# List installed packages
pi list

# Update all
pi update

# Enable/disable resources interactively
pi config

Custom Models

Add providers that aren't in pi-ai's built-in registry via ~/.pi/agent/models.json, as long as they speak the OpenAI or Anthropic API dialect.

{
  "models": [
    {
      "id": "my-local-model",
      "provider": "custom",
      "api": "openai-completions",
      "baseUrl": "http://localhost:8080/v1",
      "contextWindow": 128000,
      "maxTokens": 32000
    }
  ]
}

Context Files

Pi loads AGENTS.md (or CLAUDE.md for compatibility) from your home directory or project root. This is where project-specific instructions live—coding conventions, architectural notes, don't-do lists.

Walk-through

Turning a recurring workflow into a portable skill

  1. Every Friday you ask Pi to generate a weekly changelog from git commits. The prompt is long and you keep re-typing variations of it. Time to capture it.
  2. Start with a prompt template: create ~/.pi/agent/prompts/changelog.md with the prompt text. Invoke with /changelog. Done in 30 seconds.
  3. A month later you want it smarter — the changelog should categorize commits (feat/fix/chore) and pull PR titles. That needs logic, not just text. Upgrade to a skill.
  4. Create ~/.pi/agent/skills/changelog/. Inside: SKILL.md with YAML frontmatter (name: changelog, when-to-use: weekly summary of git activity) plus categorize.py (a script the skill tells Pi to run).
  5. In the SKILL.md body, describe the workflow step-by-step: “run git log --since=7days; pipe through categorize.py; pull PR titles for any merge commits; format as markdown.”
  6. Invoke with /skill:changelog. Pi follows the playbook, using your script as a helper.
  7. A teammate wants the same workflow. Bundle the skill (plus your git-checkpoint prompt) into a Pi Package: @yourname/team-workflow, publish to npm. Teammate runs pi install npm:@yourname/team-workflow — gets the whole thing.
Common mistakes
  • Using an extension when a skill would do. Skills are portable, declarative, and scan-auditable. Reach for TypeScript extensions only when you genuinely need code — custom tools, UI integration, or runtime logic.
  • Putting logic in YAML frontmatter. Frontmatter is metadata only (name, tags, triggers). Keep the playbook in the markdown body.
  • Writing skills that assume Pi-specific behavior. The Agent Skills standard is cross-tool. If you rely on Pi-only things (specific slash commands, internal state), your skill won't port. Keep skills portable where you can.
  • Forgetting that prompts/skills go in ~/.pi/agent/ not ~/.pi/. Easy to typo. Check with ls ~/.pi/agent/skills/.
  • Publishing a Pi Package without a README. npm installs strangers' code. A README explaining what the package registers (tools, commands) is table stakes for trust.
  • Assuming themes hot-reload the layout — they only hot-reload colors. Structural changes still need a Pi restart.
Distribution

Cloudflare Pages Deploy

Deploy this documentation site (or your own pi-web-ui application) to Cloudflare Pages using the Wrangler CLI.

This Site's Deployment

The guide you're reading is a single-file HTML SPA deployed to Cloudflare Pages. Here's the exact flow:

# 1. Install Wrangler (once)
npm install -g wrangler

# 2. Authenticate (once)
wrangler login

# 3. Create the Pages project (once per project)
wrangler pages project create pi-agent-guide \
  --production-branch=main

# 4. Deploy the directory containing index.html
wrangler pages deploy pi-agent \
  --project-name=pi-agent-guide \
  --commit-dirty=true

# 5. Verify
curl -sI "https://pi-agent-guide.pages.dev/"

Deploy pi-web-ui Applications

pi-web-ui ships reusable components for building a full chat UI in the browser. Because it uses IndexedDB for session/credential storage, no server is required—deploy as a static site on Cloudflare Pages.

# Create a new pi-web-ui project
mkdir my-pi-app && cd my-pi-app
npm init -y
npm install @mariozechner/pi-web-ui @mariozechner/pi-agent-core

# Build (assuming Vite or similar)
npm run build

# Deploy dist/ to Cloudflare Pages
wrangler pages deploy dist \
  --project-name=my-pi-app \
  --commit-dirty=true

Local Dev with Wrangler

# Preview this site locally
cd pi-agent
wrangler pages dev .
# Opens http://localhost:8788

Auto-Deploy via Git

For production workflows, connect your GitHub repository to Cloudflare Pages via the dashboard:

  • Build command: npm run build (or empty for static)
  • Build output directory: dist (or . for static)
  • Every push to main triggers a production deploy
  • PRs get preview deployments on unique URLs

Environment Secrets

# Pages deployments never ship API keys
# Use Cloudflare Workers as a proxy for secure key handling
wrangler secret put ANTHROPIC_API_KEY
wrangler secret put OPENAI_API_KEY

# In your Worker, proxy to the LLM provider
# The browser app points at the Worker, not directly at OpenAI
Security model: Never expose API keys in client-side pi-web-ui bundles. Route requests through a Worker that injects the key server-side. pi-ai supports a proxyUrl option precisely for this pattern.
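A minimal sketch of that pattern, assuming an Anthropic upstream and a secret set via wrangler secret put. The fetch-handler object shape is Cloudflare's standard Worker module API; everything else is illustrative and omits the auth and CORS checks a real deployment needs:

```typescript
// Assumed upstream for this sketch; swap for your provider.
const UPSTREAM = "https://api.anthropic.com";

// Pure helper: re-target the browser's request at the provider
// and inject the server-held key. Testable without a Worker.
function toUpstream(req: Request, apiKey: string): Request {
  const url = new URL(req.url);
  const headers = new Headers(req.headers);
  headers.set("x-api-key", apiKey); // key never reaches the browser
  return new Request(UPSTREAM + url.pathname, {
    method: req.method,
    headers,
    body: req.body,
  });
}

// Worker entry point: the browser calls this; only the Worker
// knows the key (env.ANTHROPIC_API_KEY from `wrangler secret put`).
// In a real Worker module you would `export default worker`.
const worker = {
  async fetch(req: Request, env: { ANTHROPIC_API_KEY: string }) {
    return fetch(toUpstream(req, env.ANTHROPIC_API_KEY));
  },
};
```

Point pi-ai's proxyUrl at the deployed Worker URL and the browser bundle never contains a key.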
In plain terms
Static site = just HTML/CSS/JS files served from a CDN. No server running code on each request. Fast, cheap, resilient. Cloudflare Pages is built for this.

Cloudflare Worker = a small server-side function that runs at Cloudflare's edge. When your browser app needs to call an LLM, it calls the Worker; the Worker adds the API key and forwards the request. Key never touches the browser.

CORS (Cross-Origin Resource Sharing) = the browser rule that blocks pages on one domain from calling APIs on another unless the API explicitly allows it. Provider APIs typically reject browser calls directly — another reason for the Worker proxy.

Preview deployment = when you push a PR branch, Cloudflare Pages builds it and gives you a unique URL. You can share it for review without affecting production.

Cloudflare Access = authentication gate in front of a Pages site. Good for internal tools — visitors log in via Google/GitHub/email before the site loads.

Wrangler = Cloudflare's CLI. Handles login, project creation, deploys, secrets, local dev. Everything you can do in the dashboard you can do in Wrangler.

Threat Models — What Could Actually Go Wrong

Pi's deployment story is “static + optional Worker proxy.” Here's what goes wrong when that's misconfigured and what fixes each scenario.

I shipped ANTHROPIC_API_KEY in the browser bundle.
A user opens DevTools → Network tab → sees the key in a request header. Within hours, keys show up in public scrapers and someone runs up your bill.
Use a Worker proxy with wrangler secret put. The browser calls the Worker; only the Worker knows the key. pi-ai's proxyUrl option exists for exactly this.
Someone discovers my Worker URL and uses it as a free OpenAI proxy.
Your Worker forwards any request with any prompt. A bored person on Reddit posts the URL. Your quota burns in a weekend.
Gate the Worker with Cloudflare Access (login required), or add a simple session token your app generates, or restrict origin via CORS in the Worker. Don't ship an unauthenticated proxy.
A user's browser gets compromised and reads IndexedDB.
pi-web-ui stores credentials (subscription tokens, API keys if the user entered them) in IndexedDB. A browser-level attack could harvest them.
IndexedDB is scoped to the origin, so only same-origin scripts can read it. Don't inject third-party JS carelessly. Encourage users to rely on the Worker proxy instead of entering keys into the browser.
My Worker logs user prompts and I didn't realize.
Cloudflare Worker logs can capture request bodies. If you ever log the full request for debugging, private prompts end up in your log store.
Strip request bodies from Worker logs by default. If you need to log, hash or sanitize first. Document logging behavior in your privacy policy.
I accidentally wrangler pages deploy'd the wrong directory.
Your node_modules, .env, or a local dev folder goes live. Secrets in filenames get indexed; huge bundles tank performance.
Always specify the exact directory: wrangler pages deploy dist, not wrangler pages deploy .. Add a .gitignore/.wranglerignore for safety. Double-check the upload summary before confirming.
A deploy breaks production and I need to roll back fast.
You pushed bad code; users hit a white screen.
Cloudflare Pages keeps a deployment history. Dashboard → deployments → “rollback to this version.” Every preview URL stays live, so roll back via URL while you fix the issue.
Common mistakes
  • Deploying without --commit-dirty=true while iterating. Wrangler refuses deploys from dirty git trees by default. During active dev, pass the flag; for release, commit first.
  • Putting API keys in wrangler.toml. That file is meant to be committable. Use wrangler secret put — secrets live encrypted in Cloudflare, not in your repo.
  • Forgetting that Pages deploys are immutable. You can't “edit” a deployed version — you push a new one. Rollback = route to an earlier version.
  • Using a static site when you actually need a Worker. If your app calls LLMs directly from JavaScript, you need a Worker proxy — a purely static deploy leaks keys.
  • Not setting up preview URLs for PRs. Cloudflare Pages auto-creates them when you connect a GitHub repo. Don't build + test only locally; real preview URLs catch CDN-level issues (headers, routes) you'd miss otherwise.

Monitoring

Analytics

Cloudflare Pages Analytics shows visits, bandwidth, and geographic distribution for free.

Web Vitals

Built-in Core Web Vitals tracking—LCP, CLS, FID—for performance monitoring.

Access Rules

Add authentication via Cloudflare Access to gate internal tools.

Custom Domains

Point your own domain via the Pages dashboard; SSL is automatic.

Reference

Glossary + FAQ

Every piece of jargon in Pi Agent, in plain English. Jump to a letter or scroll.

A
Agent Skills standard
Open spec (originally Anthropic) for how a skill folder is structured: a SKILL.md with YAML frontmatter plus supporting files. Portable across agent tools that implement it.
AGENTS.md / CLAUDE.md
A markdown file in your home or project dir holding ambient instructions (coding style, project conventions, don't-dos). Pi loads it automatically. Supports both names for interop.
agentLoop
pi-agent-core's low-level API. Gives you full control over the agent loop without the Agent class's opinions. Use when Agent is too rigid for your case.
B
Bang injection
!cmd runs a shell command and feeds output into the model's context. !!cmd runs but hides output from the model. Fast way to add live data to a prompt.
Bracketed paste
Terminal mode where pasted text arrives wrapped in markers. Lets pi-tui's editor receive multi-line pastes as one event instead of losing lines to newline submissions.
C
Compact (/compact)
Replace older conversation turns with a summary to free context space. Preserves meaning, drops verbosity. Different from /new which wipes history entirely.
CORS Cross-Origin Resource Sharing
Browser rule: a page on one domain can't call APIs on another unless the API explicitly allows it. Most LLM provider APIs don't allow browser calls — use a Worker proxy.
CSI 2026
A terminal control sequence that groups screen updates atomically. pi-tui uses it so you never see half-rendered intermediate frames during a redraw.
E
Extension
A TypeScript module that registers tools, slash commands, shortcuts, or UI into Pi. The most powerful extension layer — anything the coding agent can do, an extension can add.
Event (pi-agent-core)
A message fired by the agent runtime as things happen: agent_start, turn_start, message_update, tool_execution_end, etc. Subscribe to build streaming UIs.
F
Fork (/fork)
Create a new session branch at the current message. Same history up to here, independent future from here. Useful for “what if I asked differently?” experiments.
H
Hot reload
Changes to theme files apply live without restarting Pi. Edit colors in your editor, see them immediately.
I
IME Input Method Editor
System for composing multi-keystroke characters (CJK scripts, accented letters). pi-tui tracks IME state so the cursor stays in the right spot mid-composition.
IndexedDB
Browser-native local database. pi-web-ui stores sessions, credentials, and preferences here — no server needed. Scoped to the page's origin; private to that browser profile.
J
JSONL
JSON Lines — one JSON object per line. Appendable, greppable, diff-friendly. Pi's session format.
K
Kitty / iTerm2 graphics protocols
Escape-sequence protocols for rendering real images inside a terminal. pi-tui emits them so supported terminals show photos, diagrams, or screenshots inline.
M
mini-lit
A minimal fork of the Lit web-components library. pi-web-ui uses it for small-bundle, standards-based components.
Monorepo
One git repo holding multiple related packages. pi-mono contains all seven Pi packages together.
O
OAuth subscription
Log in to a provider (Claude, ChatGPT, Copilot) via browser; provider returns a token that lets Pi use your existing subscription instead of a separate API bill.
openai-completions API
The request/response shape OpenAI's chat-completions endpoint uses. De-facto standard — Ollama, LM Studio, vLLM, Groq, Together all speak it. pi-ai's api: 'openai-completions' tells pi-ai to use this shape for a custom endpoint.
P
Pi Package
An npm- or git-installable bundle that can include prompt templates, skills, extensions, or themes. How you distribute your customizations to others.
pi-agent-core
The stateful agent runtime: session state, tool execution, event streaming, hooks. Built on pi-ai.
pi-ai
Unified LLM library. Same API across 20+ providers, automatic cross-provider handoffs, streaming events, TypeBox tool schemas.
pi-mom
A Slack bot package. Autonomous LLM-powered bot with bash access, self-managing skills, Docker sandboxing.
pi-pods
CLI for running open-source LLMs on rented GPUs (DataCrunch, RunPod, Vast.ai, etc). Sets up vLLM, exposes an OpenAI-compatible endpoint.
pi-tui
Terminal UI framework. Flicker-free redraws, components, images, IME support. Used by pi-coding-agent; usable standalone.
pi-web-ui
Web components for building chat UIs in the browser. Ships with ChatPanel, Artifacts, file handling, IndexedDB storage.
Preview deployment
A unique-URL deployment for a PR or branch on Cloudflare Pages. Lets you share “test this” links without touching production.
Prompt template
A markdown file in ~/.pi/agent/prompts/ you invoke with /name. Pure text expansion — no code, no logic, just a saved prompt.
proxyUrl
pi-ai option that routes requests through a server you control instead of calling providers directly. Essential for browser apps — keeps API keys out of the bundle.
R
Reasoning level
Unified thinking control across providers: 'off' | 'low' | 'medium' | 'high'. Higher = slower, pricier, smarter on complex problems.
RPC mode
One of Pi's execution modes. Pi runs as a subprocess of a parent program, communicating over stdio JSON-RPC. How to embed Pi in your own tool.
S
SKILL.md
The markdown entry point of a skill folder. YAML frontmatter on top (name, description, triggers), playbook body below.
Skill
A folder under ~/.pi/agent/skills/<name>/ containing a SKILL.md plus any supporting files. Invoked via /skill:name.
Streaming deltas
Chunks of a model's response as it generates. text_delta = visible reply piece; thinking_delta = private reasoning piece; toolcall_delta = partial tool arguments.
SYSTEM.md
Markdown file that replaces Pi's default system prompt entirely. Advanced use — most people want AGENTS.md (which layers on top) instead.
T
transformContext
pi-agent-core hook that runs before each LLM call. Lets you filter/summarize/redact what the model sees without losing the agent's real state.
Tree (/tree)
Session navigator. Shows the branching history of your conversation as a graph; jump to any node.
Turn
One full cycle of (LLM call + tool execution) inside a prompt. A single user message can span multiple turns if the agent iterates.
TypeBox
TypeScript library for describing JSON schemas with static types. pi-ai uses it to define tool parameters with both runtime validation and compile-time types.
V
vLLM
High-throughput open-source inference engine for LLMs. pi-pods installs it on rented GPUs to serve models behind an OpenAI-compatible API.
W
Worker (Cloudflare Worker)
A small server-side function running at Cloudflare's edge. Used to hold API keys and proxy requests from browser apps.
Wrangler
Cloudflare's CLI. Deploys Pages sites, manages Workers, handles secrets, runs local dev.
Y
YAML frontmatter
The metadata block at the top of a markdown file, delimited by --- lines. Machine-parseable. Pi uses it to read a skill's name, description, and tags without processing the whole body.

Frequently Asked Questions

How is Pi different from Claude Code / Cursor / Continue?

Claude Code and Cursor are polished, opinionated products with plan mode, sub-agents, permission systems, and many built-in tools. Pi is deliberately minimal — four tools, no baked-in workflows — and instead ships a full extension system so you build the exact shape you want. If you love defaults, Pi will feel bare. If you want a toolkit to build your agent, Pi is the best-documented starting point I've seen.

Do I really save money with subscription OAuth?

Usually yes, if your usage fits your subscription's limits. Claude Pro gives ~40 messages per 5-hour window; ChatGPT Plus is more generous; Copilot has monthly limits. If you'd hit those caps anyway, using OAuth is free vs. paying per-token via API. If you're a heavy user who'd exceed the subscription caps, API keys (or OpenRouter) might be cheaper overall.

Can I run Pi offline with a local model?

Yes. Run Ollama or LM Studio locally, configure a Custom Endpoint in pi-ai pointing at http://localhost:11434/v1 (Ollama) or http://localhost:1234/v1 (LM Studio). Pi will treat it like any other OpenAI-completions provider. Nothing leaves your machine.
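Concretely, a hypothetical Ollama entry in ~/.pi/agent/models.json might look like this (the model id and limits are placeholders; match them to the model you actually pulled):

```json
{
  "models": [
    {
      "id": "qwen2.5-coder:14b",
      "provider": "custom",
      "api": "openai-completions",
      "baseUrl": "http://localhost:11434/v1",
      "contextWindow": 32000,
      "maxTokens": 8000
    }
  ]
}
```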

Which package do I install if I just want to call an LLM from my app?

Just @mariozechner/pi-ai. It has no dependencies on the coding agent or the runtime. You'll get model discovery, streaming, tool calling, cross-provider handoffs — nothing else.

What if I want plan mode, sub-agents, or a permission system?

Write it as an extension or install a community one. Pi's philosophy is that these features vary too much between users to bake in — but the extension API exposes everything you'd need (tool hooks, state access, UI slots). Check the pi-mono issues/discussions for community extensions.

How do I back up my Pi setup?

Copy ~/.pi/. Contains settings, prompts, skills, sessions, themes, model definitions. On a new machine: npm i -g @mariozechner/pi-coding-agent, drop the folder in place, run pi. Re-run /login for OAuth providers since tokens may be machine-bound.

Is it safe to share a pi-web-ui deployment publicly?

Only with a Worker proxy and authentication. A purely static pi-web-ui deployment that lets anonymous users enter their own API keys is fine (they pay for their usage). A deployment calling your API keys without auth will be abused — see the Threat Models on the Deploy tab.

Why TypeBox instead of Zod?

TypeBox outputs real JSON Schema natively — the format LLM providers expect for tool definitions. Zod needs a converter. TypeBox also has zero runtime dependencies. Different tradeoffs; Pi picked TypeBox for the LLM use case.

Can Pi's coding agent run in a GitHub Action or CI?

Yes — use pi -p "prompt" (print mode) for one-shot non-interactive runs. Pipe input, capture output, feed into your pipeline. Authenticate with an API key via env var (OAuth subscriptions don't work in CI since they need a browser).

What happens if I break a session JSONL file?

Since sessions are append-only JSONL, corruption usually means a bad line at the end. Open the file, remove the partial line, save. Pi will load everything up to the last valid entry. Full corruption is rare because Pi writes line-by-line.
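That manual fix can be sketched as a small script: keep every line up to the first one that fails to parse. This is generic JSONL handling, not a Pi API:

```typescript
// Keep valid leading lines of a JSONL string; drop everything
// from the first unparsable line onward (a torn trailing write).
function truncateAtCorruption(jsonl: string): string {
  const good: string[] = [];
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    try {
      JSON.parse(line);
      good.push(line);
    } catch {
      break; // partial trailing write: discard it and what follows
    }
  }
  return good.join("\n") + "\n";
}

const session = '{"role":"user","text":"hi"}\n{"role":"assist'; // torn write
const repaired = truncateAtCorruption(session);
// repaired keeps only the first, valid line
```

Back up the original file before rewriting it, in case the "corruption" is actually a line you want to hand-repair.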
