microagent serve

Last updated: 2026-06-27

microagent serve mcp: stdio MCP transport for agent clients.

microagent serve mcp is the MCP client integration entry point. A client launches it as a foreground stdio subprocess; it is not a normal interactive CLI command and is not advertised in top-level help. When started directly from a terminal, the command exits with setup guidance instead of waiting for MCP frames on stdin.

Serve local GGUF models with microagent model serve.

The MCP server automatically uses AX output mode. It exposes structured tools for workspace lifecycle, inspection, results, stats, logs, events, snapshots, images, networks, volumes, model store/serving, copy/artifact access, host diagnostics, capability discovery, and cost estimation.

The MCP server stops at VM operations: it does not plan, call an LLM, interpret audit meaning, broker credentials, or make policy decisions.

Serve a model:

microagent model serve TheBloke/Llama-2-7B-GGUF/llama-2-7b.Q4_K_M.gguf

Probe the MCP transport without an MCP client:

echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-06-18","capabilities":{},"clientInfo":{"name":"probe","version":"0"}}}' | microagent serve mcp
{"jsonrpc":"2.0","id":1,"result":{"capabilities":{"tools":{}},"protocolVersion":"2025-06-18","serverInfo":{"name":"microagent","version":"0.8.3"}}}

In normal use you never run serve mcp yourself; your MCP client launches it.
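The same probe can be scripted. The following is a minimal Python sketch, not part of microagent: it builds the initialize frame shown above, pipes it to the server, and checks the reply. The helper names (build_initialize_request, parse_initialize_response, probe) are hypothetical, and the sketch assumes the server answers with one JSON object per line and exits once stdin closes.

```python
import json
import subprocess


def build_initialize_request(request_id: int = 1) -> str:
    """Return one newline-terminated JSON-RPC initialize frame."""
    frame = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-06-18",
            "capabilities": {},
            "clientInfo": {"name": "probe", "version": "0"},
        },
    }
    return json.dumps(frame, separators=(",", ":")) + "\n"


def parse_initialize_response(line: str, request_id: int = 1) -> dict:
    """Validate one response line and return its serverInfo object."""
    message = json.loads(line)
    if message.get("jsonrpc") != "2.0" or message.get("id") != request_id:
        raise ValueError("not a matching JSON-RPC 2.0 response")
    if "error" in message:
        raise RuntimeError(f"initialize failed: {message['error']}")
    return message["result"]["serverInfo"]


def probe(command=("microagent", "serve", "mcp"), timeout: float = 20.0) -> dict:
    """Run the server once with a piped stdin and return serverInfo.

    Assumption: a piped (non-terminal) stdin is enough for the server to
    read frames, as in the shell pipeline above.
    """
    completed = subprocess.run(
        list(command),
        input=build_initialize_request(),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    lines = completed.stdout.splitlines()
    if not lines:
        raise RuntimeError(f"no response; stderr was: {completed.stderr!r}")
    return parse_initialize_response(lines[0])
```

Calling probe() on a host where microagent is installed should return the server name and version; the parsing helpers can be exercised without the binary.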

| Command | Purpose |
| --- | --- |
| mcp | MCP client-launched stdio integration entry point |
| model | Serve a local HuggingFace GGUF model |

Install microagent on the same host where your coding tool will launch the MCP server, then verify the host backend there:

microagent doctor

For every stdio MCP client, add microagent as a local stdio MCP server:

command: microagent
args: ["serve", "mcp"]

That snippet belongs in your MCP client’s server configuration. If the client is a GUI app or a remote editor session that does not inherit your shell PATH, use the absolute path from command -v microagent as the command value. Do not configure microagent serve mcp as an HTTP/SSE server or background daemon; the MCP client must start it as a foreground stdio process.
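When a client does not inherit your shell PATH, the absolute path can be resolved once and written into the configuration. The sketch below is illustrative only: stdio_server_entry and claude_style_config are hypothetical helper names, and shutil.which stands in for command -v.

```python
import json
import shutil


def stdio_server_entry(binary: str = "microagent") -> dict:
    """Build the stdio server entry, preferring an absolute command path.

    Falls back to the bare name when the binary is not on this process's PATH.
    """
    resolved = shutil.which(binary)
    return {"command": resolved or binary, "args": ["serve", "mcp"]}


def claude_style_config(binary: str = "microagent") -> str:
    """Render the entry inside a Claude-style mcpServers document."""
    document = {"mcpServers": {"microagent": stdio_server_entry(binary)}}
    return json.dumps(document, indent=2)
```

Printing claude_style_config() from the shell where microagent is installed gives a block that can be pasted into a client that reads the mcpServers shape.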

For long-running operations such as image pulls, rootfs builds, and VM lifecycle calls, raise the client’s MCP tool timeout when the client supports one. The microagent tools use ~/.microagent/ by default; most tools also accept a state_dir argument when a caller needs an explicit state root.

The examples below intentionally show the client configuration instead of a microagent installer command. MCP clients store settings in different files, support different timeout fields, and may run locally, remotely, or inside an editor profile. The reliable installation contract is the stdio command above.

For Codex, add the server from the CLI:

codex mcp add microagent -- microagent serve mcp

Or edit ~/.codex/config.toml or a trusted project .codex/config.toml:

[mcp_servers.microagent]
command = "microagent"
args = ["serve", "mcp"]
startup_timeout_sec = 20
tool_timeout_sec = 600

For Claude Code, add a user-scoped server from the CLI:

claude mcp add --transport stdio --scope user microagent -- microagent serve mcp

For a project-shared Claude Code configuration, put this in .mcp.json at the project root:

{
  "mcpServers": {
    "microagent": {
      "command": "microagent",
      "args": ["serve", "mcp"],
      "timeout": 600000
    }
  }
}

For a project-shared server, use --scope project instead of --scope user.

For a VS Code workspace configuration, create .vscode/mcp.json:

{
  "servers": {
    "microagent": {
      "type": "stdio",
      "command": "microagent",
      "args": ["serve", "mcp"]
    }
  }
}

You can also add the user-profile server from a shell where microagent is on PATH:

code --add-mcp '{"name":"microagent","command":"microagent","args":["serve","mcp"]}'

If VS Code is connected to a remote machine and you want microagent to run there, define the server in the remote workspace or remote user MCP configuration.

For Copilot CLI, add microagent to ~/.copilot/mcp-config.json:

{
  "mcpServers": {
    "microagent": {
      "type": "local",
      "command": "microagent",
      "args": ["serve", "mcp"],
      "env": {},
      "tools": ["*"]
    }
  }
}

If your Copilot CLI session does not inherit the same PATH as your shell, use the absolute path from command -v microagent as the command value.

For any other MCP client, use the client's local stdio server form. If it asks for a command and arguments separately, enter microagent as the command and serve mcp as the two arguments. If it uses Claude-style JSON, the minimum shape is:

{
  "mcpServers": {
    "microagent": {
      "command": "microagent",
      "args": ["serve", "mcp"]
    }
  }
}

| Tool | Purpose |
| --- | --- |
| microagent.describe | Return the machine-readable capability manifest |
| microagent.ping | Validate the MCP transport |
| workspace.create | Create or dry-run a workspace, including snapshot forks with from_snapshot |
| workspace.start | Start a prepared workspace, including snapshot restore with from_snapshot |
| workspace.exec | Run a structured command in a running workspace |
| workspace.halt | Halt a workspace and preserve disk state |
| workspace.stop | Stop a workspace runtime |
| workspace.kill | Force stop a workspace runtime |
| workspace.quarantine | Sever host-side network and mediation |
| workspace.pause | Pause a running workspace when supported |
| workspace.resume | Resume a paused workspace when supported |
| workspace.delete | Delete a workspace, with optional preview and force |
| workspace.list | List saved workspaces |
| workspace.inspect | Inspect workspace state with summary or full output |
| workspace.result | Read the structured workspace result |
| workspace.stats | Sample workspace resource usage |
| workspace.logs | Read workspace serial logs with summary or full output |
| workspace.events | Read lifecycle events with summary or full output |
| workspace.clone | Clone a stopped workspace |
| workspace.apply | Apply supported changes from a workspace spec file |
| workspace.commit | Commit a stopped workspace rootfs to an OCI image |
| workspace.estimate_cost | Estimate workspace resources before action |
| artifacts.list | List declared workspace artifacts |
| artifacts.get | Retrieve a declared workspace artifact |
| snapshot.create | Create a backend snapshot when supported |
| snapshot.list | List workspace snapshots |
| snapshot.delete | Delete a workspace snapshot, with optional preview |
| network.inspect | Inspect a workspace’s network |
| volume.create | Create a named managed ext4 volume |
| volume.list | List named managed volumes |
| volume.inspect | Inspect a named managed volume |
| volume.delete | Delete a named managed volume, with optional preview and force |
| images.pull | Pull a reusable image rootfs |
| images.list | List reusable local image records |
| images.push | Push a locally committed OCI image |
| images.tag | Tag a local image record |
| images.delete | Delete a local image record, with optional preview |
| images.prune | Prune stale local image records, with optional preview |
| models.pull | Pull a GGUF model from HuggingFace into the local store |
| models.list | List locally stored models |
| models.remove | Remove a model from the local store |
| models.prune | Prune local model records whose blobs are missing |
| models.serve | Start or reuse a local host model server for a stored or pulled model |
| models.stop | Stop local host model server instances for a model |
| models.runners | List running local model servers |
| models.policy.validate | Validate a structured model mediation policy file |
| models.policy.evaluate | Dry-run a policy file against structured request metadata |
| profiles.list | List resource profiles |
| host.inspect | Report host capabilities |
| doctor.check | Run host diagnostics |
| contract.get | Return the runtime fields integrations rely on |
| kernel.verify | Verify a kernel artifact |
| kernel.install | Install a kernel artifact after preview confirmation |
| rootfs.build | Build a rootfs after preview confirmation |
| cp | Copy files into or out of stopped workspace disks |

The models.* tools mirror the model subcommands: the same local store and host runner management, exposed over MCP.

connect, streaming logs/events/stats, supervise, perf, init, and secret check remain CLI-only. They are interactive, streaming, benchmarking, project scaffolding, or secret-boundary workflows that need more specific MCP interaction and permission semantics than a bounded request/response tool.

MCP tool responses are structured for agent clients. Mutation tools return a consistent envelope with result, optional structured error, timing_ms, and principal_context fields.
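A client can treat the envelope uniformly across mutation tools. The sketch below assumes the four named fields arrive as top-level keys of a decoded JSON object; the Envelope, ToolError, parse_envelope, and unwrap names are hypothetical, and the shape of the structured error is left opaque because it is not specified here.

```python
from dataclasses import dataclass
from typing import Any, Optional


class ToolError(Exception):
    """Raised when an envelope carries a structured error."""

    def __init__(self, error: Any):
        super().__init__(f"tool returned an error: {error!r}")
        self.error = error


@dataclass
class Envelope:
    result: Any
    error: Optional[Any]
    timing_ms: Any            # type left open; only the field name is documented
    principal_context: Any


def parse_envelope(payload: dict) -> Envelope:
    """Pick the documented fields out of a decoded response object."""
    return Envelope(
        result=payload.get("result"),
        error=payload.get("error"),
        timing_ms=payload.get("timing_ms"),
        principal_context=payload.get("principal_context"),
    )


def unwrap(payload: dict) -> Any:
    """Return the result, or raise ToolError if an error is present."""
    envelope = parse_envelope(payload)
    if envelope.error is not None:
        raise ToolError(envelope.error)
    return envelope.result
```

Keeping the error object intact on the exception lets callers log or branch on it without this helper guessing at its fields.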

workspace.inspect, workspace.logs, and workspace.events default to compact summary output so repeated agent state checks do not require full event history or full serial logs. Pass format: "full" when the complete underlying AX payload is required. workspace.logs accepts tail_lines for bounded log polling. workspace.events accepts limit and after_index, and returns next_after_index; pass that value as the next after_index to poll for new events without a long-running events --follow call.
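The cursor-style polling described above might look like the following sketch. Only limit, after_index, and next_after_index come from the tool description; call_tool is a hypothetical callable standing in for your MCP client's tool invocation, and the "workspace" argument key and the "events" response key are assumptions made for illustration.

```python
import time
from typing import Any, Callable, Dict, Iterator, Optional

# Hypothetical signature: call_tool(tool_name, arguments) -> decoded result dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def poll_events(
    call_tool: ToolCaller,
    workspace: str,
    *,
    limit: int = 50,
    after_index: Optional[int] = None,
    max_polls: int = 5,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Any]:
    """Yield events across several bounded workspace.events calls."""
    cursor = after_index
    for attempt in range(max_polls):
        arguments: Dict[str, Any] = {"workspace": workspace, "limit": limit}  # key name assumed
        if cursor is not None:
            arguments["after_index"] = cursor
        page = call_tool("workspace.events", arguments)
        for event in page.get("events", []):  # response key assumed
            yield event
        # Feed the returned cursor into the next request.
        cursor = page.get("next_after_index", cursor)
        if attempt + 1 < max_polls:
            sleep(interval_s)
```

Because every call is bounded by limit and max_polls, the loop stays a sequence of short request/response calls rather than a long-running follow.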

workspace.delete, volume.delete, snapshot.delete, images.delete, and images.prune accept preview: true to return the actions that would be taken without changing host state. Mutating tools accept an optional idempotency_key; tools that are not inherently idempotent replay the first successful MCP envelope for a client-supplied key.
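One way to combine the two behaviours is to preview first, then apply with a single idempotency key that is reused on retry. This is a sketch under assumptions: call_tool is a hypothetical callable for your MCP client, ConnectionError stands in for whatever transport failure the client raises, and the approve callback decides whether the previewed actions are acceptable.

```python
import uuid
from typing import Any, Callable, Dict, Optional

# Tools documented as accepting preview: true.
PREVIEWABLE = {
    "workspace.delete",
    "volume.delete",
    "snapshot.delete",
    "images.delete",
    "images.prune",
}


def preview_then_apply(
    call_tool: Callable[[str, Dict[str, Any]], Any],
    tool: str,
    arguments: Dict[str, Any],
    approve: Callable[[Any], bool],
    retries: int = 2,
) -> Optional[Any]:
    """Preview a destructive call, then apply it with one idempotency key."""
    if tool not in PREVIEWABLE:
        raise ValueError(f"{tool} does not document a preview mode")
    preview = call_tool(tool, {**arguments, "preview": True})
    if not approve(preview):
        return None
    # One key for all attempts, so a retry replays rather than repeats.
    request = {**arguments, "idempotency_key": str(uuid.uuid4())}
    last_error: Optional[Exception] = None
    for _ in range(retries + 1):
        try:
            return call_tool(tool, request)
        except ConnectionError as exc:  # transport failure type is an assumption
            last_error = exc
    assert last_error is not None
    raise last_error
```

Generating the key once, outside the retry loop, is the point of the design: a fresh key per attempt would defeat the replay behaviour.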

Snapshot restore and fork use the same workspace tools as the CLI. Pass from_snapshot: "<tag>" to workspace.start to restore a workspace in place, or from_snapshot: "<workspace>:<tag>" to workspace.create to fork a new workspace from an existing snapshot. The dedicated snapshot.* tools create, list, and delete snapshot records.
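As an illustration, the two from_snapshot forms can be expressed as MCP tools/call request frames. The tools/call method and its name/arguments parameters follow the MCP specification; the "name" argument used to identify the workspace is a hypothetical key, so check the tool's input schema before relying on it. A real session must also complete the initialize handshake before sending these frames.

```python
import json
from typing import Any, Dict


def tool_call_frame(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Return one newline-terminated JSON-RPC tools/call frame."""
    frame = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(frame) + "\n"


def restore_in_place(request_id: int, workspace: str, tag: str) -> str:
    """workspace.start with from_snapshot: "<tag>" restores the workspace in place."""
    return tool_call_frame(
        request_id,
        "workspace.start",
        {"name": workspace, "from_snapshot": tag},  # "name" key is assumed
    )


def fork_from_snapshot(request_id: int, new_workspace: str, source: str, tag: str) -> str:
    """workspace.create with from_snapshot: "<workspace>:<tag>" forks a new workspace."""
    return tool_call_frame(
        request_id,
        "workspace.create",
        {"name": new_workspace, "from_snapshot": f"{source}:{tag}"},  # "name" key is assumed
    )
```

The only difference between the two calls is the tool and whether the snapshot reference is qualified with the source workspace.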

kernel.install and rootfs.build use a stricter preview-confirm contract. Call the tool with preview: true first, inspect the returned actions, then call the same tool with confirm_token set to the returned confirmation_token. Calls without the matching token fail before changing host state.
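The preview-confirm contract can be wrapped in a small helper. In this sketch call_tool is a hypothetical callable for your MCP client, and confirmation_token is assumed to be a top-level key of the preview response as that callable returns it; adjust the lookup if your client nests it under result.

```python
from typing import Any, Callable, Dict, Optional


def confirm_after_preview(
    call_tool: Callable[[str, Dict[str, Any]], Dict[str, Any]],
    tool: str,
    arguments: Dict[str, Any],
    approve: Callable[[Dict[str, Any]], bool],
) -> Optional[Dict[str, Any]]:
    """Preview, let the caller inspect the actions, then confirm with the token."""
    preview = call_tool(tool, {**arguments, "preview": True})
    token = preview.get("confirmation_token")  # location in the response is assumed
    if token is None:
        raise KeyError("preview response carried no confirmation_token")
    if not approve(preview):
        return None  # nothing has changed on the host
    # Same tool, same arguments, now carrying the returned token.
    return call_tool(tool, {**arguments, "confirm_token": token})
```

The approve callback is where a human prompt or a policy check would sit, since the server itself does not make policy decisions.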

workspace.exec returns the structured exec result directly under result: status, optional exit_code, base64-encoded stdout and stderr, truncation flags, timestamps, protocol version, and optional service error. A nonzero command exit is not a tool error; it is represented by status: exited and a nonzero exit_code. Successful workspace.exec responses also include retry_count, retry_wall_clock_ms, and matching metadata fields. When the bounded retry budget is exhausted, the JSON-RPC error data includes retry_count, retry_wall_clock_ms, and retry_exhausted so clients can distinguish retry exhaustion from ordinary task failure. These retry semantics come from the shared workspace exec layer and match CLI AX exec behavior.
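A client-side decoder for that result might look like the sketch below. It uses only the status, exit_code, stdout, and stderr fields named above; the ExecOutcome type and decode_exec_result helper are hypothetical, and output is decoded as UTF-8 with replacement, which is an assumption about what the command printed.

```python
import base64
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ExecOutcome:
    status: str
    exit_code: Optional[int]
    stdout: str
    stderr: str

    @property
    def succeeded(self) -> bool:
        """True only for a command that exited with code 0."""
        return self.status == "exited" and self.exit_code == 0


def decode_exec_result(result: Dict[str, Any]) -> ExecOutcome:
    """Decode the base64 streams of a workspace.exec result object."""

    def text(field: str) -> str:
        raw = base64.b64decode(result.get(field) or "")
        return raw.decode("utf-8", errors="replace")

    return ExecOutcome(
        status=result["status"],
        exit_code=result.get("exit_code"),
        stdout=text("stdout"),
        stderr=text("stderr"),
    )
```

A nonzero exit then shows up as succeeded being False on a normally returned outcome, which keeps command failure separate from tool or transport errors.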

serve mcp takes no flags. serve model takes the same flags as model serve:

| Flag | Description |
| --- | --- |
| --dedicated | Start a dedicated runner for this caller instead of reusing a shared one |
| --runner <backend> | Runner backend: llamacpp (default), vllm, or custom |
| --runner-gpu <mode> | Runner GPU intent: off, on, or auto |
| --runner-model <id> | Backend model id for runners such as vLLM |
| --runner-served-model <name> | OpenAI-compatible served model name for runners such as vLLM |
| --runner-command <template> | Custom host model runner command template |
| --runner-name <name> | Name to record for a custom host model runner |
| --runner-health-path <path> | HTTP health path for a custom host model runner |
| --runner-arg <arg> | Extra host model runner argument. Repeat for multiple argv entries |
| --runner-env KEY=VALUE | Extra host model runner environment override. Repeat for multiple variables |
| --token <t> | HuggingFace bearer token used if the model must be auto-pulled |
| --state-dir <dir> | State directory (default ~/.microagent/) |

See global flags for --json/--text/--output/--mode.

model serve exits 0 when the runner is started or reused; nonzero when no host model runner binary is found, runner configuration is invalid, or the model cannot be pulled. serve mcp runs until its client closes stdin, then exits 0; started from a terminal, it exits nonzero with setup guidance. In AX mode a failure is written as a structured error envelope.