Building a Production-Grade MCP Server Inside a Multi-Tenant Enterprise Platform
Standards usually arrive long after teams have already stitched together their own integrations. MCP felt different. When the specification started to mature, we could see a realistic path to letting AI agents talk directly to enterprise systems without building a custom wrapper for each assistant, each model, and each use case.
At Graylark, that mattered immediately. Our labour-relations platform handles country-specific agreements, proposals, works councils, legal advisory context, and change workflows for multinational organizations. The data is sensitive, the permissions are granular, and tenant boundaries are strict. Any agent interface had to follow those exact same rules, not bypass them in the name of convenience.
This is the story of how we introduced MCP into that environment as a production capability: what worked, what broke, and where we deliberately avoided over-engineering. If you are considering MCP in an enterprise application, the key idea is simple: treat it as part of your core platform architecture, not a sidecar integration.
Why MCP Was a Practical Starting Point
Before MCP, connecting an assistant to internal platform data usually meant bespoke API work. Even when APIs were well designed, every external consumer still had to learn domain-specific models, filtering conventions, and auth behavior. That is manageable for one integration. It is expensive and fragile at scale.
MCP shifted that pattern. Agents can discover tools, resources, and prompt templates dynamically instead of relying on pre-baked coupling. For Graylark, that means one integration surface can support internal copilots, customer-side assistants, and automation agents without rewriting the same contract in five places.
We did not adopt MCP because we think it is the final answer forever. We adopted it because, right now, it reduces duplicated integration effort while improving governance. MCP gives us a strong common interface for current agents. Our platform provides the guardrails.
We also assume the protocol landscape will keep moving. That is why we treated MCP as an execution surface, not as our core product contract. The long-term contract is our capability layer: tools, resources, prompts, policy controls, and audit semantics that can be projected into whichever protocol becomes dominant next.
Production Constraints Changed the Design
Proof-of-concept MCP servers can be deceptively simple. Production MCP inside a multi-tenant enterprise platform is not. We had to preserve four non-negotiables from day one:
- Authentication had to support both user-driven and machine-driven access.
- Tenant and country boundaries had to hold across asynchronous execution.
- Privilege checks had to stay consistent with the existing web platform.
- Every tool invocation had to be auditable.
Those constraints shaped nearly every technical decision. We intentionally kept the MCP module isolated and toggleable, then wired it through the same security and data controls the rest of the platform already trusts. In practice, that gave us a clean deployment path: MCP can be enabled where needed, and disabled without destabilizing core product behavior.
Authentication for Two Real Caller Types
In enterprise environments, "the client" is never one thing. We saw two concrete patterns.
First, a human user connects an assistant and expects the assistant to see exactly what they can see. That path should inherit the existing user context directly.
Second, a background automation or external system needs machine credentials for unattended execution. That path cannot rely on an interactive user session.
We implemented both, with clear separation. User-linked sessions inherit the user's platform boundaries. Machine tokens are revocable, scoped, and tied back to accountable identities and tenant context. The practical outcome is consistent behavior: agents do not gain "special" data access just because they use MCP.
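The separation between the two paths can be sketched roughly as follows. This is an illustrative Python sketch, not Graylark's actual code; all names (`Principal`, `resolve_caller`, the toy token stores) are hypothetical, and a real implementation would back them with the platform's session and token services.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Principal:
    subject: str    # accountable identity (user or service account)
    tenant_id: str  # tenant boundary the session is locked to
    machine: bool   # True for unattended automation callers

# Toy stores standing in for the platform's real session and token services.
USER_SESSIONS = {"sess-42": Principal("alice@example.com", "tenant-eu", False)}
MACHINE_TOKENS = {"mt-7": Principal("svc-sync", "tenant-eu", True)}
REVOKED_TOKENS = {"mt-old"}

def resolve_caller(session_id: Optional[str],
                   machine_token: Optional[str]) -> Principal:
    """Resolve an MCP connection to exactly one accountable principal."""
    if session_id is not None:
        # User-linked path: inherit the user's existing platform boundaries.
        principal = USER_SESSIONS.get(session_id)
        if principal is None:
            raise PermissionError("unknown or expired user session")
        return principal
    if machine_token is not None:
        # Machine path: revocable, scoped, tied to a service identity.
        if machine_token in REVOKED_TOKENS:
            raise PermissionError("token revoked")
        principal = MACHINE_TOKENS.get(machine_token)
        if principal is None:
            raise PermissionError("unknown machine token")
        return principal
    raise PermissionError("no credentials presented")
```

Either way the caller resolves to a single principal with an explicit tenant scope, which is what makes "no special access via MCP" enforceable downstream.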
Preserving Tenant and Country Boundaries Under Async Load
Multi-tenancy is straightforward when everything runs in one request thread. MCP over long-lived connections introduces asynchronous execution, and that is where many systems leak context by accident.
We treated context propagation as a first-class requirement, not an implementation detail. Tenant scope and security state are captured at session entry, propagated through async execution, and validated again before tool handlers touch domain services. We also keep an explicit session context fallback so edge cases do not silently degrade into cross-scope behavior.
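In Python, one minimal way to express this capture-propagate-revalidate pattern is with `contextvars`, which asyncio tasks copy automatically. This is a sketch under that assumption, not the platform's implementation; the real security state is richer than a tenant string.

```python
import asyncio
import contextvars

_tenant: contextvars.ContextVar[str] = contextvars.ContextVar("tenant")

def enter_session(tenant_id: str) -> None:
    # Capture tenant scope at session entry.
    _tenant.set(tenant_id)

def require_tenant() -> str:
    # Validate again before a handler touches domain services;
    # fail loudly instead of silently degrading into cross-scope behavior.
    try:
        return _tenant.get()
    except LookupError:
        raise RuntimeError("no tenant scope bound to this execution context")

async def tool_handler() -> str:
    await asyncio.sleep(0)   # cross an await point
    return require_tenant()  # scope must survive the async hop

async def main() -> list[str]:
    async def session(tenant_id: str) -> str:
        enter_session(tenant_id)
        return await tool_handler()
    # Two concurrent sessions must never observe each other's scope.
    return await asyncio.gather(session("tenant-eu"), session("tenant-us"))
```

Because `asyncio.gather` wraps each coroutine in a task with its own context copy, concurrent sessions stay isolated even though they share an event loop.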
The hard part is not getting an MCP demo to work. The hard part is proving it still works correctly on the busiest day of the quarter.
That focus paid off in testing. We found and fixed intermittent failures that only appeared under contention, long before they could become production incidents.
Privilege Enforcement and Audit Without Special Cases
We were strict on one design principle: if an action requires privilege in the UI, it should require the same privilege through MCP. No parallel authorization model, no shortcuts.
Tool handlers perform capability checks at entry, then return structured outcomes agents can handle cleanly. This gives users predictable behavior and makes integration troubleshooting much faster.
Auditability follows the same pattern. Rather than asking each tool author to remember logging, invocation audit is applied consistently at registration time. Every MCP tool call carries actor context, timestamps, and invocation metadata into the same governance trail used by the rest of the platform.
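The "applied at registration, not remembered by authors" idea can be sketched as a registration decorator that wraps every handler with the capability check and audit record. All names here (`register_tool`, `AUDIT_TRAIL`, the privilege strings) are hypothetical stand-ins for the platform's real services.

```python
import datetime
from typing import Callable

AUDIT_TRAIL: list[dict] = []
TOOLS: dict[str, Callable] = {}

def register_tool(name: str, required_privilege: str):
    """Wrap every tool at registration so checks and audit cannot be skipped."""
    def decorator(handler: Callable) -> Callable:
        def wrapped(actor: dict, **kwargs):
            if required_privilege not in actor["privileges"]:
                # Same privilege rule as the UI, returned as a structured
                # outcome the agent can handle cleanly.
                return {"ok": False, "error": "privilege_required",
                        "privilege": required_privilege}
            AUDIT_TRAIL.append({
                "tool": name,
                "actor": actor["subject"],
                "tenant": actor["tenant_id"],
                "at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
                "args": kwargs,
            })
            return {"ok": True, "result": handler(actor, **kwargs)}
        TOOLS[name] = wrapped
        return wrapped
    return decorator

@register_tool("list_proposals", required_privilege="proposals.read")
def list_proposals(actor: dict, country: str):
    return f"proposals for {country} in {actor['tenant_id']}"
```

Because the wrapper runs at registration time, a tool author cannot publish a handler that skips either the check or the audit entry.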
The value is practical: compliance teams can answer "who did what and when" across both human UI activity and agent activity from one source of truth.
Validating Inputs at the Trust Boundary
MCP requests arrive as external input. That means validation has to be explicit and boring in the best possible way. We use lightweight per-tool validation for required fields, IDs, enums, and bounded limits so handlers start from known-safe data.
We avoided heavyweight abstractions here. Keeping validation local to each tool made behavior clearer in code review and easier to evolve as tools changed. The goal was reliability and readability, not framework cleverness.
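A per-tool validator in this style might look like the following sketch. The tool name, fields, and limits are illustrative, but the shape is the point: required fields, enums, and bounds checked up front, then a known-safe dict handed to the handler.

```python
ALLOWED_STATUSES = {"draft", "active", "archived"}
MAX_LIMIT = 100

def validate_search_agreements(payload: dict) -> dict:
    """Check required fields, enums, and bounds at the trust boundary."""
    errors = []
    tenant_id = payload.get("tenant_id")
    if not isinstance(tenant_id, str) or not tenant_id:
        errors.append("tenant_id is required")
    status = payload.get("status", "active")
    if status not in ALLOWED_STATUSES:
        errors.append(f"status must be one of {sorted(ALLOWED_STATUSES)}")
    limit = payload.get("limit", 20)
    if not isinstance(limit, int) or not (1 <= limit <= MAX_LIMIT):
        errors.append(f"limit must be an integer in [1, {MAX_LIMIT}]")
    if errors:
        raise ValueError("; ".join(errors))
    # Handlers only ever see this normalized, known-safe shape.
    return {"tenant_id": tenant_id, "status": status, "limit": limit}
```

Locality is the design choice: a reviewer sees the full input contract next to the handler it protects, with no framework indirection in between.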
Designing for Protocol Evolution Without Betting on One Standard
MCP is maturing quickly, and it will continue to evolve. We assumed from the start that transports, schemas, and integration expectations would move over time.
To stay flexible, we use provider-style registration for tools, resources, and prompts. Each domain capability can be added independently, discovered at startup, and wired through shared controls. Transport concerns are kept separate from business logic, so protocol plumbing can change without forcing a rewrite of domain handlers.
In practice, we abstracted where it matters most: at the capability boundary. Tool definitions, access checks, audit behavior, and context-scoping rules are implemented once and then exposed through MCP today. If a stronger emergent protocol takes over tomorrow, we can add a new adapter layer and reuse the same underlying capabilities and controls.
That balance was deliberate. We avoided speculative framework complexity, but we also avoided hard-wiring business behavior to a single protocol format.
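The capability-boundary layering can be illustrated with a minimal registry plus one protocol adapter. Everything here (`Capability`, `McpAdapter`) is a hypothetical sketch of the pattern, not Graylark's code or the MCP SDK's API.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Capability:
    name: str
    description: str
    handler: Callable[..., object]

@dataclass
class CapabilityRegistry:
    """Capabilities are defined once, independent of any protocol."""
    _caps: dict[str, Capability] = field(default_factory=dict)

    def add(self, cap: Capability) -> None:
        self._caps[cap.name] = cap

    def get(self, name: str) -> Capability:
        return self._caps[name]

    def all(self) -> list[Capability]:
        return list(self._caps.values())

class McpAdapter:
    """Projects the capability layer into MCP-style listings and calls.
    A future protocol would get its own adapter over the same registry."""
    def __init__(self, registry: CapabilityRegistry):
        self.registry = registry

    def list_tools(self) -> list[dict]:
        return [{"name": c.name, "description": c.description}
                for c in self.registry.all()]

    def call_tool(self, name: str, **kwargs):
        return self.registry.get(name).handler(**kwargs)
```

Swapping protocols then means writing a new adapter class, while the registered capabilities, checks, and audit behavior stay untouched.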
Prompts Matter as Much as Tools
Tools provide capability; prompt assets provide execution quality. In labour relations, the sequence of analysis matters. A useful agent should gather relevant proposal context, linked agreements, country constraints, and process state before summarizing risk.
We encoded those patterns as reusable prompt templates. That gave us more consistent outputs across assistants and reduced the number of "technically correct but operationally unhelpful" responses. It also created a practical feedback loop: when analysts refined their approach, we could improve prompt assets once and lift every connected workflow.
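A prompt asset of this kind is essentially a parameterized template that fixes the order of analysis. The template text and helper below are invented for illustration; the real assets carry more domain detail.

```python
# Hypothetical prompt asset: the numbered steps encode the required
# sequence (context first, constraints next, risk summary last).
RISK_REVIEW_PROMPT = """\
You are reviewing a labour-relations proposal.
1. Gather the proposal context for proposal {proposal_id}.
2. List linked agreements and their countries.
3. Note country-specific constraints for {country}.
4. Check current process state before concluding.
5. Only then summarize the risk, citing the items above.
"""

def render_prompt(template: str, **params: str) -> str:
    """Fill a registered template; one refinement here lifts every workflow."""
    return template.format(**params)
```

Because the template is registered once and shared, an analyst's improved sequence propagates to every connected assistant on the next render.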
Operational Lessons From Going Live
Three lessons stood out during rollout.
- Connection health needs active management. Long-lived sessions fail in subtle ways unless keep-alive and cleanup are explicit.
- Observability should include agent behavior patterns, not just errors. Unexpected tool usage can reveal process risks early.
- Security consistency builds trust faster than feature count. Clients care less about how many tools exist and more about whether boundaries hold.
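The first lesson, explicit keep-alive and cleanup, reduces to a reaper loop over session liveness timestamps. This is a simplified sketch with hypothetical names; production code would also cancel in-flight work and emit the close into observability.

```python
import asyncio
import time

class Session:
    def __init__(self, session_id: str):
        self.session_id = session_id
        self.last_seen = time.monotonic()
        self.closed = False

    def touch(self) -> None:
        # Called on every keep-alive or request from the client.
        self.last_seen = time.monotonic()

async def reap_stale(sessions: dict[str, "Session"], idle_timeout: float,
                     interval: float, rounds: int) -> None:
    """Close sessions whose clients stopped answering keep-alives."""
    for _ in range(rounds):
        await asyncio.sleep(interval)
        now = time.monotonic()
        for sid, sess in list(sessions.items()):
            if now - sess.last_seen > idle_timeout:
                sess.closed = True  # explicit cleanup, not GC luck
                del sessions[sid]
```

Making the reaper explicit is what surfaces the "subtle" failures: a half-dead connection becomes a logged close instead of a slow resource leak.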
We also found that platform teams and governance teams ask different but equally important questions. Platform teams ask, "Does this scale safely?" Governance teams ask, "Can we prove what happened?" A production MCP implementation has to satisfy both.
Where We Are Taking It Next
Near term, we are focused on finer-grained workload controls, deeper execution telemetry, and richer tool composition for multi-step agent flows. We are also improving policy controls so organizations can tune machine access behavior by tenant and environment with less operational overhead.
The broader direction is clear for us: protocol-based agent integration is becoming part of standard enterprise AI architecture. MCP is a strong option today, but our platform is designed to move if the market converges elsewhere. Sensitive operational systems need that combination of capability, control, and adaptability.
Closing View
Adding MCP to Graylark LRM was not about adding another endpoint. It was about extending the platform contract to AI agents without weakening any of the trust boundaries that enterprise customers rely on.
That is the pattern we believe in: reusable capability surfaces, strict security alignment, and architecture that can evolve without forcing a full rebuild every time protocol standards shift.
For broader platform context, visit Graylark Technologies and see the related write-up on Graylark LRM.