Skip to content

How it works

mcp-compressor sits between an MCP client and one or more backend MCP servers.

flowchart LR
  Client[MCP client or SDK user]
  Compressor[mcp-compressor]
  BackendA[Backend MCP server A]
  BackendB[Backend MCP server B]

  Client --> Compressor
  Compressor --> BackendA
  Compressor --> BackendB

The compressor connects to backend servers, reads their tool metadata, and exposes a smaller frontend tool surface.

Normal compressed tools

Instead of exposing every backend tool directly, the frontend exposes wrappers such as:

  • <server>_get_tool_schema
  • <server>_invoke_tool
  • <server>_list_tools at max compression

For a single server named atlassian, the client might see:

atlassian_get_tool_schema
atlassian_invoke_tool
atlassian_list_tools

A model can first inspect the compact listing, then call get_tool_schema only for the tool it wants, then call invoke_tool.

Compression levels

Level Tool listing behavior Typical use
low More descriptive compressed listings Smaller servers or exploratory use
medium Balanced descriptions and parameter names Default choice
high Very compact listings focused on names/args Large toolsets
max Minimal frontend surface plus list_tools Very large/multi-server setups

Transports

Backends can be:

  • local stdio MCP server commands,
  • remote streamable HTTP MCP URLs.

Frontends can be:

  • stdio MCP server mode,
  • streamable HTTP MCP server mode,
  • local proxy server for generated clients and SDK use.

Native SDKs

Rust, Python, and TypeScript SDKs can start a compressed proxy in-process without spawning the mcp-compressor stdio CLI. The SDKs still start the backend MCP servers or connect to remote backend URLs as configured.