Document vMCP scalability limits and operational configuration #737

@yrobla

Summary

Several vMCP scalability limits and operational configuration details are either undocumented or only present in source code. This issue tracks what needs to be added to the docs, primarily in docs/toolhive/guides-vmcp/scaling-and-performance.mdx.

Gaps to address

1. Session cache capacity (1000 sessions/pod)

  • What: Each vMCP pod holds a node-local LRU cache of 1000 live sessions. When full, least-recently-used sessions are evicted and their backend connections are closed abruptly.
  • Where in code: pkg/vmcp/server/sessionmanager/factory.go (defaultCacheCapacity = 1000)
  • What to document: The limit, what happens on eviction, when operators might hit it, and guidance on sizing pod count accordingly.
  • Note: This is not currently exposed as a CRD field — worth considering whether it should be.
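
For the docs, the eviction behavior described above could be illustrated with something like the following. This is a minimal sketch of a generic LRU cache with an eviction hook, not the vMCP implementation: the names sessionCache, newSessionCache, and onEvict are hypothetical, and the session value is a plain string standing in for real session state.

```go
package main

import (
	"container/list"
	"fmt"
)

// entry pairs a session ID with its state so an evicted list element
// can be mapped back to its key in the index.
type entry struct {
	id      string
	session string // stand-in for real session state
}

// sessionCache is a hypothetical fixed-capacity LRU cache used only
// to illustrate eviction; it is not the vMCP type.
type sessionCache struct {
	capacity int
	order    *list.List               // front = most recently used
	index    map[string]*list.Element // session ID -> list element
	onEvict  func(id string)          // e.g. close backend connections
}

func newSessionCache(capacity int, onEvict func(string)) *sessionCache {
	return &sessionCache{
		capacity: capacity,
		order:    list.New(),
		index:    make(map[string]*list.Element),
		onEvict:  onEvict,
	}
}

// Get returns the session and marks it as most recently used.
func (c *sessionCache) Get(id string) (string, bool) {
	el, ok := c.index[id]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).session, true
}

// Put inserts or refreshes a session and evicts the least recently
// used one if the cache grows past its capacity.
func (c *sessionCache) Put(id, session string) {
	if el, ok := c.index[id]; ok {
		el.Value.(*entry).session = session
		c.order.MoveToFront(el)
		return
	}
	c.index[id] = c.order.PushFront(&entry{id: id, session: session})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		evicted := oldest.Value.(*entry).id
		delete(c.index, evicted)
		if c.onEvict != nil {
			c.onEvict(evicted)
		}
	}
}

func main() {
	cache := newSessionCache(2, func(id string) {
		fmt.Println("evicted:", id)
	})
	cache.Put("a", "session-a")
	cache.Put("b", "session-b")
	cache.Get("a")              // "a" becomes most recently used
	cache.Put("c", "session-c") // over capacity: "b" is evicted
}
```

With a capacity of 2 the third insert evicts "b" because "a" was touched more recently. The same pattern at a capacity of 1000 is what operators would hit once a pod serves more than 1000 concurrently live sessions.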

2. Session TTL default (30 minutes)

  • What: Sessions inactive for 30 minutes are automatically evicted from Redis and the local cache. The field exists in OperationalConfig in the CRD but the default is not surfaced in user-facing docs.
  • Where in code: pkg/vmcp/server/server.go (defaultSessionTTL = 30 * time.Minute)
  • What to document: Default value, what happens when it expires (client must reconnect), and guidance on tuning for long-lived vs. short-lived workloads.
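
To make the expiry rule concrete in the docs, a sketch along these lines may help. It assumes a session expires once its idle time reaches the TTL; whether the real check is inclusive at exactly 30 minutes is an assumption, and assumedSessionTTL and expiredAfterIdle are hypothetical names that only mirror the default quoted above.

```go
package main

import (
	"fmt"
	"time"
)

// assumedSessionTTL mirrors the 30-minute default cited in this issue.
const assumedSessionTTL = 30 * time.Minute

// expiredAfterIdle reports whether a session that has been idle for
// the given duration has reached the TTL. The inclusive comparison is
// an assumption made for illustration.
func expiredAfterIdle(idle, ttl time.Duration) bool {
	return idle >= ttl
}

func main() {
	now := time.Date(2025, 1, 1, 12, 0, 0, 0, time.UTC)
	lastActive := now.Add(-31 * time.Minute)

	idle := now.Sub(lastActive)
	fmt.Println("idle:", idle)
	fmt.Println("expired:", expiredAfterIdle(idle, assumedSessionTTL))
}
```

A client whose session has expired this way would need to reconnect, which is the behavior the docs should spell out.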

3. File descriptor limits

  • What: Each session holds approximately one HTTP connection per backend. At 1000 sessions × 10 backends, a pod can hold ~10K file descriptors. This is not documented or accounted for in any container resource guidance.
  • What to document: Expected FD consumption formula, recommended ulimit / Kubernetes container limits, and what happens if FDs are exhausted.
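
The FD consumption formula could be presented roughly as follows. This is a sketch that assumes one descriptor per session per backend, as stated above, plus a fixed overhead term for listeners, logs, and similar; the overhead term and the name estimateFDs are assumptions, not measured values.

```go
package main

import "fmt"

// estimateFDs approximates file descriptor usage for one pod as
// sessions * backends per session, plus a fixed overhead. The overhead
// parameter is a hypothetical allowance, not a measured figure.
func estimateFDs(sessions, backendsPerSession, overhead int) int {
	return sessions*backendsPerSession + overhead
}

func main() {
	// The worked example from this issue: 1000 sessions x 10 backends.
	fmt.Println(estimateFDs(1000, 10, 0))
}
```

Comparing the result against the container's open-file limit gives operators a quick check on whether a full session cache can exhaust descriptors.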

4. Cross-pod session rebuild cost

  • What: When a request lands on a pod that doesn't have the session cached (cache miss or no sticky routing), RestoreSession reconnects to all backends and rebuilds the routing table. This costs ~100–500ms.
  • What to document: This cost should be called out in the sticky session / Redis section so operators understand the latency tradeoff of not using session affinity, not just the correctness concern.
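
One way to express the latency tradeoff is as an average per-request overhead: the fraction of requests that miss the local cache multiplied by the rebuild cost. The sketch below assumes the miss rate is known or measured; the function name expectedOverheadMs and the 50% miss rate in the example are illustrative assumptions, while the 100–500 ms range comes from this issue.

```go
package main

import "fmt"

// expectedOverheadMs returns the average extra latency per request in
// milliseconds, given the fraction of requests that miss the local
// session cache and the cost of one rebuild. Hypothetical helper.
func expectedOverheadMs(missRate, rebuildMs float64) float64 {
	return missRate * rebuildMs
}

func main() {
	// Illustrative assumption: half of all requests land on a pod
	// that does not have the session cached.
	const missRate = 0.5

	low := expectedOverheadMs(missRate, 100)  // lower bound of rebuild cost
	high := expectedOverheadMs(missRate, 500) // upper bound of rebuild cost
	fmt.Printf("average added latency: %.0f-%.0f ms per request\n", low, high)
}
```

With session affinity the miss rate approaches zero outside of pod restarts and evictions, so the average overhead does too.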

Related files

  • docs/toolhive/guides-vmcp/scaling-and-performance.mdx — primary target
  • docs/toolhive/reference/crd-spec.md — may need updates for TTL default and cache capacity
