scala-mcp

LLM harness

modules/example-harness · Stdio + HTTP · JVM / Native

A minimal LLM ↔ MCP agent. Reads a Claude-style .mcp.json, opens every server it lists, hands those servers' tools to an OpenAI-compatible chat endpoint, and runs an interactive REPL with streaming responses, slash commands, and tool-calling.

This is the most complete client example in the repo — it exercises every moving part: both transports, the McpClient API, server-initiated sampling and elicitation callbacks, and notifications.

What it does

Configuration

The harness reads a Claude-style .mcp.json:

{
  "mcpServers": {
    "dice": {
      "command": "sbt",
      "args": ["exampleDiceJVM/run"]
    },
    "pomodoro": {
      "type": "http",
      "url": "http://localhost:25000/mcp"
    }
  }
}

The transport is chosen per entry: an entry with a command key uses stdio; any other entry (optionally marked with type: "http") uses streamable HTTP. HTTP entries can also carry request headers as "headers": { ... }.
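
The rule above can be sketched in a few lines of plain Scala. This is only an illustration: ServerEntry, Transport, and TransportChoice are hypothetical names, not the harness's actual McpServerSpec model, and the sketch assumes that a command key takes precedence when both command and url are present.

```scala
// Hypothetical model of one .mcp.json entry; the harness's real types may differ.
final case class ServerEntry(
    command: Option[String] = None,
    args: List[String] = Nil,
    url: Option[String] = None,
    headers: Map[String, String] = Map.empty
)

sealed trait Transport
object Transport {
  final case class Stdio(command: String, args: List[String]) extends Transport
  final case class Http(url: String, headers: Map[String, String]) extends Transport
}

object TransportChoice {
  // A `command` key means stdio; otherwise the entry is treated as streamable HTTP.
  // Assumption: `command` wins if an entry happens to carry both keys.
  def choose(name: String, e: ServerEntry): Either[String, Transport] =
    (e.command, e.url) match {
      case (Some(cmd), _)    => Right(Transport.Stdio(cmd, e.args))
      case (None, Some(url)) => Right(Transport.Http(url, e.headers))
      case (None, None)      => Left(s"server '$name' has neither command nor url")
    }
}
```

With the example config, the dice entry would resolve to Transport.Stdio and the pomodoro entry to Transport.Http.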

Build and run (JVM)

sbt 'exampleHarnessJVM/run --config .mcp.json --base-url https://api.openai.com/v1 --api-key sk-… --model gpt-4o-mini'

Any OpenAI-compatible endpoint works — set --base-url to your provider's URL (Anthropic via OpenRouter, DeepSeek, GLM, a local Ollama, etc.) and --model to a model id that endpoint understands.

Build and run (Scala Native)

# Requires clang/llvm and s2n-tls — `nix-shell` provides both.
sbt exampleHarnessNative/nativeLink
./modules/example-harness/native/target/scala-3.3.4/example-harness-out \
  --config .mcp.json --base-url https://api.openai.com/v1 \
  --api-key sk-… --model gpt-4o-mini

The native binary is single-file and starts in milliseconds, which makes it convenient as a long-running terminal companion.

What it demonstrates

The harness source under modules/example-harness/shared/src/main/scala/net/andimiller/mcp/examples/harness/ is split into focused files worth reading in order:

| File | Shows |
| --- | --- |
| Main.scala | Wiring: load .mcp.json, build the LLM client, build the shared ClientHandler, open every server, collect tools and prompts, hand off to Repl.run. |
| McpClients.scala | One function per McpServerSpec that returns a Resource[F, McpClient[F]] — both StdioMcpClient.builder and StreamableHttpMcpClient.builder. |
| ClientHandlers.scala | The capability advertisement (sampling, elicitation with form) and dispatch by method name. |
| SamplingHandler.scala | sampling/createMessage → forward to OpenAiClient.chat, shape the response back into the MCP wire format. |
| ElicitationHandler.scala | elicitation/create → walk the schema's properties, prompt the terminal field-by-field, type-coerce, return an accept / cancel response. |
| ToolBridge.scala | Aggregate every server's tools into a single OpenAI-shaped tool list with namespaced names; route tool calls back to the right McpClient; synthesise list_resources / read_resource tools for servers that advertise resources. |
| PromptBridge.scala | Surface MCP prompts as /prompt … slash commands; convert PromptMessage content into OpenAI ChatMessages. |
| Notifications.scala | One background fiber per server that drains client.notifications and prints them dim. |
| Repl.scala | The chat loop: streaming output with separate "content" and "thinking" lanes, tool-call hops bounded by MaxToolHops, slash-command dispatch. |
| OpenAiClient.scala / OpenAiTypes.scala | A tiny OpenAI-compatible chat client (single POST or streaming SSE) plus the wire types it needs. |
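
As a rough illustration of the type-coercion step described for ElicitationHandler.scala, the sketch below turns one line of terminal input into a typed value based on a JSON Schema primitive type name. FieldValue and ElicitationCoercion are hypothetical names, and the accepted boolean spellings are an assumption; the harness's own handler may behave differently.

```scala
// Hypothetical typed result of coercing one answer; not the harness's actual type.
sealed trait FieldValue
object FieldValue {
  final case class Text(value: String) extends FieldValue
  final case class Whole(value: Long) extends FieldValue
  final case class Decimal(value: Double) extends FieldValue
  final case class Flag(value: Boolean) extends FieldValue
}

object ElicitationCoercion {
  // Coerce raw terminal input according to a schema type name.
  // Left carries a message that could be shown before re-prompting.
  def coerce(schemaType: String, raw: String): Either[String, FieldValue] = {
    val in = raw.trim
    schemaType match {
      case "string" =>
        Right(FieldValue.Text(raw))
      case "integer" =>
        in.toLongOption.map(FieldValue.Whole(_)).toRight(s"not an integer: '$in'")
      case "number" =>
        in.toDoubleOption.map(FieldValue.Decimal(_)).toRight(s"not a number: '$in'")
      case "boolean" =>
        // Assumed spellings; adjust to taste.
        in.toLowerCase match {
          case "true" | "yes" | "y" => Right(FieldValue.Flag(true))
          case "false" | "no" | "n" => Right(FieldValue.Flag(false))
          case _                    => Left(s"not a boolean: '$in'")
        }
      case other =>
        Left(s"unsupported schema type: $other")
    }
  }
}
```

Returning an Either keeps the prompt loop simple: a Left can be printed and the same field asked again, while a Right is collected into the response content.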

Together they're a worked answer to "what does it take to plug an LLM into an arbitrary set of MCP servers." Most of the protocol-level plumbing is upstream in McpClient and ClientHandler — the harness is mostly bridging code between MCP shapes and OpenAI shapes.
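
To make the bridging idea concrete, here is a minimal sketch of namespacing tool names across servers and splitting them again to route a call back. It is not the code in ToolBridge.scala: McpTool, OpenAiFunction, ToolNamespace, and the "__" separator are all assumptions made for illustration, and the tool names used in the usage note are invented.

```scala
// Simplified, hypothetical shapes; the real MCP and OpenAI wire types carry more fields.
final case class McpTool(name: String, description: String, inputSchemaJson: String)
final case class OpenAiFunction(name: String, description: String, parametersJson: String)

object ToolNamespace {
  // Assumed separator; the harness may use a different naming scheme.
  val Sep = "__"

  // Prefix a tool name with the server it came from.
  def qualify(server: String, tool: String): String = s"$server$Sep$tool"

  // Split on the first separator to recover (server, tool).
  // Assumes server names themselves never contain the separator.
  def split(qualified: String): Option[(String, String)] = {
    val i = qualified.indexOf(Sep)
    if (i <= 0 || i + Sep.length >= qualified.length) None
    else Some((qualified.take(i), qualified.drop(i + Sep.length)))
  }

  // Flatten every server's tools into one list with namespaced names,
  // ordered by server name so the result is deterministic.
  def aggregate(byServer: Map[String, List[McpTool]]): List[OpenAiFunction] =
    byServer.toList.sortBy(_._1).flatMap { case (server, tools) =>
      tools.map(t => OpenAiFunction(qualify(server, t.name), t.description, t.inputSchemaJson))
    }
}
```

When the model asks for a call to a name such as dice__roll, split would yield ("dice", "roll"), which is enough to look up the right client and invoke the original tool name on it.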