Three Webflow clients on retainer asked me the same question last month, almost word for word: "Should we have our own MCP server, or are you running one across all your accounts?" The answer I had been improvising for six months finally needed a proper write-up. After running both architectures in production since Anthropic published the MCP spec in late 2024, I have a clear preference for solo and small studio practices. This piece walks through the trade-offs, the security model that changed my mind, and the exact split I now recommend to other Webflow Partners running fewer than ten active retainers.
What Does a Single Multi-Client MCP Server Actually Look Like?
A single multi-client MCP server is one process that exposes tools and resources to Claude or another LLM client, with per-request tenant isolation that scopes data access to the client whose session is active. The server runs on my own infrastructure. Each connected workspace authenticates with a credential that maps to exactly one tenant, and the server enforces that boundary on every tool call.
In practice my server runs on a Cloudflare Worker behind a Workers KV-backed token store, exposes about twenty tools (Webflow Designer reads, Webflow CMS writes, Plausible Analytics queries, GitHub repo lookups for the few clients with custom code repos), and routes by a tenant header that the client cannot spoof because the token itself encodes the tenant. The aggregate cost is under nine dollars a month according to Cloudflare's April 2026 invoice.
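The routing idea can be sketched roughly as follows. This is an illustration under assumptions, not the production server: the `lookup` function stands in for the Workers KV token store, the bearer-token header format is assumed, and names such as `resolveTenant` and `x-tenant` are hypothetical.

```typescript
type TenantId = string;

// Hypothetical lookup from credential to tenant; in the setup described
// above this would be backed by the Workers KV token store.
type TenantLookup = (token: string) => TenantId | undefined;

interface ToolRequest {
  headers: Record<string, string | undefined>; // header names lower-cased
  tool: string;
}

// Derive the tenant from the credential alone. Any tenant header the
// client sends (for example "x-tenant") is never read, so it cannot be
// used to spoof another tenant.
function resolveTenant(req: ToolRequest, lookup: TenantLookup): TenantId | null {
  const auth = req.headers["authorization"] ?? "";
  const token = /^Bearer (\S+)$/.exec(auth)?.[1];
  if (!token) return null; // missing or malformed credential
  return lookup(token) ?? null; // unknown token resolves to no tenant
}
```

A request that carries client A's token and a header claiming to be client B still resolves to client A, because the header never enters the decision.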
The alternative is one MCP server per client, deployed independently, with no shared code path between them. That model has appeal on paper because it removes the multi-tenancy boundary entirely. The cost is operational: with twelve clients, that means twelve servers to update, twelve secrets to rotate, and twelve monitoring dashboards to read.
Why Did I Start With Per-Client Servers and Switch?
I started with per-client servers because the security argument felt obvious. If client A's data cannot physically reach the process that handles client B, then the boundary is provable rather than asserted. After six months I had eight servers in production, and the maintenance overhead had become the largest single time cost in my practice, larger than client work itself.
The switch happened after I shipped a tool to one client server that had a small bug, and the same bug existed in five other client servers because I had copied the file. Fixing six servers individually took an hour of pure operations work. The next morning I started building the multi-tenant version. Anthropic's own MCP server reference architecture from December 2025 documents the multi-tenant pattern explicitly, which gave me confidence that the design was supported rather than improvised.
What About Security and Tenant Isolation in a Shared Server?
Tenant isolation in a shared MCP server is enforced through three layers that compose: the token-to-tenant mapping at the edge, the per-tool authorization check inside the server, and the per-resource read filter that the underlying APIs enforce when the server makes outbound calls. If any one layer fails, the other two still hold the boundary.
For Webflow specifically, the third layer is the strongest. The Webflow Data API token I pass through to the Webflow MCP tools is per-client, so even if my server's logic accidentally let a tool call through with the wrong tenant context, the Webflow API would reject the request because the token does not match the requested site. The OWASP Multi-Tenant Application Security Cheat Sheet from 2024 describes the same defense-in-depth pattern. I built my server to match that pattern before I trusted it with billing-sensitive data.
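A minimal sketch of how the three layers might compose, assuming in-memory maps in place of KV and of the real Webflow API. Every name here (`Tenant`, `callTool`, `upstreamAccepts`, the tool and site IDs) is hypothetical, and the upstream check only imitates an API that rejects a token used against a site it does not belong to.

```typescript
interface Tenant {
  id: string;
  allowedTools: Set<string>;
  webflowToken: string; // per-client upstream credential
}

type CallResult =
  | { ok: true; tenantId: string; siteId: string }
  | { ok: false; layer: 1 | 2 | 3; reason: string };

// Stand-in for the upstream API: a token is only valid for its own site.
function upstreamAccepts(tokenToSite: Map<string, string>, token: string, siteId: string): boolean {
  return tokenToSite.get(token) === siteId;
}

function callTool(
  tenantsByMcpToken: Map<string, Tenant>,
  tokenToSite: Map<string, string>,
  mcpToken: string,
  tool: string,
  siteId: string
): CallResult {
  // Layer 1: token-to-tenant mapping at the edge.
  const tenant = tenantsByMcpToken.get(mcpToken);
  if (!tenant) return { ok: false, layer: 1, reason: "unknown token" };

  // Layer 2: per-tool authorization inside the server.
  if (!tenant.allowedTools.has(tool)) {
    return { ok: false, layer: 2, reason: "tool not enabled for tenant" };
  }

  // Layer 3: the outbound call only ever carries this tenant's own
  // upstream token, so the upstream API refuses a mismatched site.
  if (!upstreamAccepts(tokenToSite, tenant.webflowToken, siteId)) {
    return { ok: false, layer: 3, reason: "upstream rejected token for site" };
  }
  return { ok: true, tenantId: tenant.id, siteId };
}
```

In this sketch, a call that passes the first two layers but names another client's site still fails at layer 3, which is the property the paragraph above relies on.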
How Does Cost Compare Between One Server and Many?
One shared server is roughly an order of magnitude cheaper at the scale most solo Webflow practices operate at. My single server handles around 800 tool calls per day across twelve clients, which fits inside Cloudflare Workers' free tier most months and costs about nine dollars when it does not. Twelve separate servers, each on its own minimum-tier deployment, cost between sixty and one hundred and forty dollars a month depending on the platform.
The cost picture flips at a different scale. A studio running fifty active client retainers with heavy MCP usage may find that the per-tenant resource isolation of dedicated servers is worth the premium. The breakeven I saw in my own practice was around twenty active retainers, which is more clients than most solo Partners ever take on. For my volume the shared model wins by a wide margin. I covered the broader cost picture in my actual monthly AI tooling cost piece from earlier this month.
What Tools Belong on the Shared Server and What Stays Separate?
Most read-only tools belong on the shared server because the multi-tenant boundary is cleanly enforceable. Site metadata reads, CMS item lists, analytics queries, deployment status checks, and content audits all map naturally onto a tenant-scoped query pattern. Write tools also fit when the write target is well-scoped, like CMS item creates against a specific collection.
The tools that stay on per-client deployments are the ones where the blast radius of a tenant boundary failure would be unacceptable. Site publish operations, DNS record updates, billing portal actions, and anything that touches a payment processor get their own dedicated process per client. The mental model is the same one banks use for cross-account transfers: low-risk operations share infrastructure, high-risk operations get isolation. The Webflow MCP Server team's own documentation from March 2026 implies a similar split for production deployments.
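One way to make that split explicit is a placement registry, sketched below. The tool names and the `mcp.example.com` URLs are placeholders rather than real Webflow MCP tool identifiers, and refusing unregistered tools is a design assumption added here so that a new tool cannot land on the shared server by accident.

```typescript
type Placement = "shared" | "dedicated";

// Hypothetical tool names; each tool is placed deliberately.
const TOOL_PLACEMENT = new Map<string, Placement>([
  ["site_metadata_read", "shared"],
  ["cms_item_list", "shared"],
  ["analytics_query", "shared"],
  ["cms_item_create", "shared"], // write, but scoped to one collection
  ["site_publish", "dedicated"],
  ["dns_record_update", "dedicated"],
  ["billing_portal_action", "dedicated"],
]);

function placementFor(tool: string): Placement {
  const placement = TOOL_PLACEMENT.get(tool);
  if (placement === undefined) {
    // Fail closed: an unclassified tool is not routed anywhere.
    throw new Error(`Unregistered tool: ${tool}`);
  }
  return placement;
}

// Placeholder endpoints: one shared server, one dedicated process per client.
function endpointFor(tool: string, tenantId: string): string {
  return placementFor(tool) === "shared"
    ? "https://mcp.example.com/shared"
    : `https://mcp.example.com/dedicated/${tenantId}`;
}
```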
How Do You Authenticate the Right Client at the Right Time?
Authentication uses a per-client API token that the client's Claude Desktop or Claude Code connection sends with every request. The token maps to exactly one tenant in my server's KV store, and the mapping is set at provisioning time and never mutated. Switching tenants in the same Claude session is impossible because the token is configured at the connection level, not the per-message level. The tenant identity is therefore stable across the entire session.
The provisioning flow generates a fresh token, stores the hashed version in KV with the tenant ID, returns the plaintext token once, and never logs it. The client adds the token to their Claude Code or Claude Desktop config. From that point forward every request from that workspace carries the right tenant context with no manual switching. The pattern is the same one Stripe uses for restricted API keys, and it has been quietly battle-tested for over a decade. I covered the related operational discipline in my per-client AI memory stack piece.
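The provisioning steps can be sketched as below. This is an assumption-laden illustration: a `Map` stands in for Workers KV, SHA-256 is assumed as the hash (the article does not name one), and the code uses `node:crypto`, which on a Worker requires the Node.js compatibility flag (Web Crypto's `crypto.subtle.digest` is the native alternative). Function names are hypothetical.

```typescript
import { createHash, randomBytes } from "node:crypto";

// token hash -> tenant ID; a stand-in for the KV namespace.
type TokenStore = Map<string, string>;

function hashToken(token: string): string {
  return createHash("sha256").update(token).digest("hex");
}

// Generates a fresh token, stores only its hash, and returns the
// plaintext exactly once. Nothing here logs the token.
function provisionToken(store: TokenStore, tenantId: string): string {
  const token = randomBytes(32).toString("hex"); // 256 bits of randomness
  const key = hashToken(token);
  if (store.has(key)) throw new Error("token collision, retry provisioning");
  store.set(key, tenantId);
  return token;
}

// Request-time lookup: hash the presented token and read the tenant.
function tenantForToken(store: TokenStore, token: string): string | undefined {
  return store.get(hashToken(token));
}
```

Because only the hash is stored, reading the store does not reveal a usable credential, and a lost token is replaced by provisioning a new one rather than recovered.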
What Does a Typical Client Onboarding Look Like Now?
Onboarding a new client to my shared MCP server takes about twelve minutes start to finish. I generate the tenant ID and token, add the Webflow Data API token to the client's secrets entry, send the client their personal MCP config snippet, and walk them through pasting it into Claude Desktop. The first tool call usually happens five minutes after they paste the config.
The corresponding flow for a per-client server is closer to ninety minutes. Spin up a fresh deployment, configure the platform's secrets, deploy the server code, configure DNS or use a generated subdomain, test the deployment, send the config to the client. At roughly 78 minutes saved per client (ninety minus twelve), twelve clients add up to a little over fifteen hours, which is roughly two days of billable client work I get back every onboarding cohort. I covered the broader onboarding rhythm in my three-hour contractor onboarding piece.
What Is the Failure Mode I Lost Sleep Over?
The failure mode I lost sleep over for the first month was a logic bug that returned client A's data to client B because of a missing tenant filter in a single tool's database query. I wrote a property-based test suite that runs on every deploy and asserts that no tool call ever returns data tagged with a tenant ID different from the request's tenant. The suite has caught two bugs since I added it in February 2026.
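The shape of such a check can be sketched as follows. This is a hand-rolled randomized test, not the suite described above: a real suite would more likely use a property-testing library such as fast-check, and the in-memory `Row` list stands in for whatever store a tool queries. All names are hypothetical.

```typescript
interface Row {
  tenantId: string;
  value: string;
}

// A tool under test: given the caller's tenant and the data, return rows.
type Tool = (tenantId: string, rows: readonly Row[]) => Row[];

// Small seeded LCG so a failing run can be reproduced.
// Suitable for generating test data only, never for secrets.
function lcg(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296; // value in [0, 1)
  };
}

// Property: no tool call returns a row tagged with a tenant other than
// the caller. Returns the first leaked row, or null if none was found.
function findIsolationViolation(
  tool: Tool,
  tenants: readonly string[],
  runs: number,
  seed = 1
): Row | null {
  if (tenants.length === 0) throw new Error("need at least one tenant");
  const rand = lcg(seed);
  const pick = () => tenants[Math.floor(rand() * tenants.length)];
  for (let i = 0; i < runs; i++) {
    const count = 1 + Math.floor(rand() * 20);
    const rows: Row[] = Array.from({ length: count }, (_, n) => ({
      tenantId: pick(),
      value: `row-${i}-${n}`,
    }));
    const caller = pick();
    const leaked = tool(caller, rows).find((r) => r.tenantId !== caller);
    if (leaked) return leaked;
  }
  return null;
}
```

A tool that forgets its tenant filter returns mixed rows on almost every generated dataset, so the check fails quickly, while a correctly filtered tool passes every run.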
The other failure mode worth designing against is a token leak. If a client's MCP token leaks publicly, that single token grants access to that single tenant's data through the Webflow APIs my server proxies. The mitigation is fast revocation, which my admin endpoint supports. I rehearse the revocation flow once a quarter to keep the muscle memory current. Most studios I talk to skip this rehearsal and discover the gap during a real incident. The fifteen minutes a quarter is worth it.
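A revocation step might look like the sketch below, assuming the same hash-to-tenant store shape used at provisioning, with a `Map` in place of KV. The admin endpoint's real interface is not described in this piece, so `revokeTenantTokens` is a hypothetical name for the core operation only.

```typescript
// token hash -> tenant ID; a stand-in for the KV namespace.
type TokenStore = Map<string, string>;

// Removes every token entry that maps to the given tenant and reports
// how many were removed, so a rehearsal can confirm something happened.
function revokeTenantTokens(store: TokenStore, tenantId: string): number {
  let removed = 0;
  for (const [tokenHash, owner] of store) {
    if (owner === tenantId) {
      store.delete(tokenHash); // deleting during Map iteration is safe
      removed++;
    }
  }
  return removed;
}
```

A quarterly rehearsal can then be as simple as revoking a throwaway tenant's token, confirming that the next tool call is rejected, and provisioning a replacement.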
How Do You Decide Which Architecture Fits Your Practice?
The decision rule I now give to other Webflow Partners is to count active retainers and weigh how much of the practice's time is spent on MCP-mediated client work. If you have fewer than twenty active clients and MCP usage is at least a daily activity, the shared multi-tenant server is almost always the right call. The maintenance saving compounds across every tool you ship from that point forward.
If you have more than fifty active clients or your MCP work touches financial systems where blast radius matters more than maintenance cost, the per-client server pattern earns its overhead. The middle range of twenty to fifty clients is where reasonable practitioners disagree. The honest answer there is to start with the shared model, instrument it well, and split the high-risk tools off to dedicated deployments only when a specific incident or requirement justifies the move. I covered the related discipline in my MCP resource templates piece.
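The rule of thumb can be written down as a small function. The thresholds come from the text above; the field names are hypothetical, and the fallback for practices under twenty clients without daily MCP use is an assumption, since the rule does not cover that case explicitly.

```typescript
type Recommendation = "shared" | "per-client" | "shared-then-split-high-risk";

interface Practice {
  activeClients: number;
  dailyMcpUse: boolean;
  touchesFinancialSystems: boolean;
}

function recommendArchitecture(p: Practice): Recommendation {
  // Blast radius outweighs maintenance cost.
  if (p.activeClients > 50 || p.touchesFinancialSystems) return "per-client";
  // Small practice with MCP in daily use: shared server.
  if (p.activeClients < 20 && p.dailyMcpUse) return "shared";
  // Middle range (and, by assumption, anything else): start shared,
  // instrument it, and split high-risk tools only when justified.
  return "shared-then-split-high-risk";
}
```

For example, a practice with twelve active clients, daily MCP use, and no financial systems comes out as "shared".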
How Should You Try This in Your Own Practice This Week?
Pick two clients whose MCP needs are similar and stand up a single Cloudflare Worker that exposes one read-only tool, like a Webflow CMS list-collections tool. Configure the server with two tokens, one per tenant. Test that each token only sees its own client's data. The whole exercise takes about three hours including the writing of a basic property test.
From there the shared server grows tool by tool as you migrate functionality off any per-client servers you already run. The migration is not urgent unless your operational time on per-client maintenance has become painful. When it has, the shared model pays back the migration cost within two onboarding cohorts. If you want a sample server skeleton to start from, my Claude Creative Connectors pipeline piece covers the production deployment shape I use.
If you are weighing the architectures for your own Webflow practice and want to talk through the specific trade-offs in your situation, drop me a line and tell me how many clients you support and which APIs your MCP tools touch most. Let's chat.