Integrating context from tools and data sources into large language models (LLMs) typically means writing a separate integration for each source. To streamline this, Anthropic introduced the Model Context Protocol (MCP), which standardizes how applications provide context to these models. By building an MCP server, developers can expose their APIs as tools that MCP-compatible applications can call. Google Kubernetes Engine (GKE) offers a managed environment for deploying these servers.
Understanding MCP Transports
The Model Context Protocol operates on a client-server architecture. Initially, it only supported local server execution through the stdio transport. However, it has evolved to include remote access via Streamable HTTP. This allows the server to function as an independent process capable of managing multiple client connections through HTTP POST and GET requests.
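To make the Streamable HTTP exchange concrete, the sketch below builds (without sending) the first POST a client would make. MCP messages are JSON-RPC 2.0, and a session starts with an initialize request. The endpoint URL, the /mcp path, the client name, and the protocol version string are assumptions chosen for illustration, not values taken from this guide.

```python
import json
import urllib.request

# Hypothetical endpoint; the /mcp path is an assumption, not a requirement.
MCP_URL = "http://localhost:8080/mcp"


def build_initialize_request(url: str = MCP_URL) -> urllib.request.Request:
    """Build (but do not send) the initial POST of an MCP session."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            # Assumed spec revision; use the one your client and server share.
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.1.0"},
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # The server may reply with plain JSON or with an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_initialize_request()
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

In practice an MCP client library handles this handshake; the sketch only shows what travels over the wire.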
Advantages of GKE for MCP Servers
Deploying an MCP server on GKE offers several key benefits:
- Scalability: GKE Autopilot can efficiently handle variable traffic loads, allowing for horizontal scaling during peak demands.
- Centralized Access: Teams can connect to a single MCP server, reducing redundancy and ensuring that updates are immediately available to all users.
- Enhanced Security: The Kubernetes Gateway API combined with SSL certificates encrypts traffic between clients and the server's public endpoint. Encryption alone does not authenticate callers, so access control still has to be configured separately.
Prerequisites for Setup
Before beginning the setup process, ensure the following tools are installed:
- Python 3.10 or higher
- uv (for package and project management)
- Google Cloud SDK (gcloud)
- kubectl command-line tool
Step-by-Step Installation Guide
To set up the environment, follow these steps (the commands assume PROJECT_ID and REGION are set as shell variables):
- Create a directory for the project:
mkdir mcp-on-gke && cd mcp-on-gke
- Configure Google Cloud credentials:
gcloud auth login && gcloud config set project $PROJECT_ID
- Initiate the GKE Autopilot cluster creation:
gcloud container clusters create-auto mcp-cluster --region $REGION --release-channel rapid --async
- Create project files using uv:
uv init
- Prepare the necessary files: server.py, test_server.py, and a Dockerfile.
Creating a Math MCP Server
FastMCP, a Python framework for building MCP servers, is enough to build a simple math server that exposes addition and subtraction as tools.
To add FastMCP as a dependency, run the following (asyncio ships with the Python standard library and does not need to be added separately):
uv add fastmcp
Then, implement the server logic in server.py.
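A minimal sketch of what server.py could contain is shown below. It is one possible layout, not the reference implementation: the server display name, the default port 8080, and the PORT environment variable are assumptions, and the run(transport="streamable-http", host=..., port=...) call assumes a FastMCP 2.x release that supports that transport.

```python
import os


def add(a: int, b: int) -> int:
    """Return the sum of two integers."""
    return a + b


def subtract(a: int, b: int) -> int:
    """Return a minus b."""
    return a - b


def build_server():
    """Create a FastMCP server and register the functions above as tools."""
    # Imported here so the math functions stay importable (and unit-testable)
    # in environments where fastmcp is not installed.
    from fastmcp import FastMCP

    mcp = FastMCP("Math MCP Server")  # display name is an assumption
    mcp.tool()(add)       # same effect as decorating add with @mcp.tool()
    mcp.tool()(subtract)
    return mcp


def main() -> None:
    """Serve the tools over Streamable HTTP on all interfaces."""
    # Assumed convention: read the port from PORT, defaulting to 8080.
    port = int(os.getenv("PORT", "8080"))
    build_server().run(transport="streamable-http", host="0.0.0.0", port=port)
```

Calling main() under an `if __name__ == "__main__":` guard at the end of the file lets `uv run server.py` start the server. Keeping add and subtract as plain functions means they can be checked directly, without a running server.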
Testing the MCP Server Locally
Before deploying, test the MCP server locally. Create a test_server.py script to verify functionality, then start the server:
uv run server.py
With the server running, execute the test script in a new terminal:
uv run test_server.py
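The test script could look like the sketch below, which uses the FastMCP client to list the tools and call each one once. The local URL, the /mcp path, and the tool names add and subtract are assumptions about how the server was written; adjust them to match your server.py. The result is printed as-is because its exact shape differs between FastMCP versions.

```python
import asyncio

# Assumed local address and path of the running server.
SERVER_URL = "http://localhost:8080/mcp"


async def check_math_server(url: str = SERVER_URL) -> None:
    """Connect to the server, list its tools, and call each math tool once."""
    # Imported here so the module can be loaded without fastmcp installed.
    from fastmcp import Client

    async with Client(url) as client:
        tools = await client.list_tools()
        print("tools:", [tool.name for tool in tools])

        result = await client.call_tool("add", {"a": 1, "b": 2})
        print("add(1, 2) ->", result)

        result = await client.call_tool("subtract", {"a": 10, "b": 3})
        print("subtract(10, 3) ->", result)


def main() -> None:
    """Run the check against the default local server."""
    asyncio.run(check_math_server())
```

Calling main() under an `if __name__ == "__main__":` guard at the end of the file makes the script directly runnable with uv run.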
Building the Container Image
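While the cluster is provisioning, prepare the Dockerfile and build the container image with Cloud Build. The command assumes an Artifact Registry Docker repository named mcp-repo already exists in $REGION: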
gcloud builds submit --tag $REGION-docker.pkg.dev/$PROJECT_ID/mcp-repo/math-mcp-server:latest
Once the image is built and the cluster has finished provisioning, fetch the cluster credentials so kubectl can connect to it:
gcloud container clusters get-credentials mcp-cluster --region $REGION
Deploying with Gateway API and SSL
Deploy the server workloads using the Kubernetes Gateway API for secure exposure. Create a deployment.yaml file to define the deployment and service configurations, then apply it:
kubectl apply -f deployment.yaml
Check the status of the pods:
kubectl get pods
To confirm the service responds before exposing it publicly, forward a local port to it:
kubectl port-forward svc/mcp-service 8080:80
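The contents of deployment.yaml are not shown above. As a rough sketch, the script below prints a Deployment and a Service as JSON, which is also valid YAML and so can be saved as deployment.yaml. The service name mcp-service, service port 80, and the image path come from the commands in this guide; the app label, the resource name mcp-server, the replica count, and container port 8080 are assumptions.

```python
import json

# Replace REGION and PROJECT_ID with your own values before applying.
IMAGE = "REGION-docker.pkg.dev/PROJECT_ID/mcp-repo/math-mcp-server:latest"
LABELS = {"app": "mcp-server"}  # hypothetical label tying Service to pods

deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "mcp-server"},  # hypothetical name
    "spec": {
        "replicas": 1,  # assumed; raise for more capacity
        "selector": {"matchLabels": LABELS},
        "template": {
            "metadata": {"labels": LABELS},
            "spec": {
                "containers": [
                    {
                        "name": "mcp-server",
                        "image": IMAGE,
                        # Assumes the server listens on 8080 in the container.
                        "ports": [{"containerPort": 8080}],
                    }
                ]
            },
        },
    },
}

service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "mcp-service"},
    "spec": {
        "type": "ClusterIP",
        "selector": LABELS,
        # Service port 80 forwards to the assumed container port 8080.
        "ports": [{"port": 80, "targetPort": 8080}],
    },
}


def render_manifests() -> str:
    """Return both objects wrapped in a single v1 List, serialized as JSON."""
    bundle = {"apiVersion": "v1", "kind": "List", "items": [deployment, service]}
    return json.dumps(bundle, indent=2)


if __name__ == "__main__":
    print(render_manifests())
```

Redirecting the script's output into deployment.yaml produces a file that kubectl apply -f can read. Writing the same two objects by hand in YAML works equally well.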
Securing the Connection
To secure the connection, reserve a static IP for the load balancer and create a Google-managed SSL certificate:
gcloud compute addresses create mcp-server-ip --global
Point your domain's DNS A record to this IP and create the certificate:
gcloud compute ssl-certificates create mcp-cert --domains mcp.yourdomain.com --global
Finally, deploy the gateway configuration:
kubectl apply -f gateway.yaml
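The contents of gateway.yaml are likewise not shown. The sketch below prints one plausible version as JSON, which is also valid YAML. It reuses the names mcp-server-ip, mcp-cert, mcp.yourdomain.com, and mcp-service from the commands in this guide; the resource names mcp-gateway and mcp-route, the gateway.networking.k8s.io/v1 API version, and the gke-l7-global-external-managed gateway class are assumptions that should be checked against the GKE Gateway documentation for your cluster version.

```python
import json

gateway = {
    "apiVersion": "gateway.networking.k8s.io/v1",  # assumed API version
    "kind": "Gateway",
    "metadata": {"name": "mcp-gateway"},  # hypothetical name
    "spec": {
        # Assumed class: GKE's global external Application Load Balancer.
        "gatewayClassName": "gke-l7-global-external-managed",
        "listeners": [
            {
                "name": "https",
                "protocol": "HTTPS",
                "port": 443,
                "tls": {
                    "mode": "Terminate",
                    # Attach the certificate created with gcloud.
                    "options": {"networking.gke.io/pre-shared-certs": "mcp-cert"},
                },
            }
        ],
        # Bind the reserved static IP by name.
        "addresses": [{"type": "NamedAddress", "value": "mcp-server-ip"}],
    },
}

route = {
    "apiVersion": "gateway.networking.k8s.io/v1",  # assumed API version
    "kind": "HTTPRoute",
    "metadata": {"name": "mcp-route"},  # hypothetical name
    "spec": {
        "parentRefs": [{"name": "mcp-gateway"}],
        "hostnames": ["mcp.yourdomain.com"],
        # Send all traffic for the hostname to the MCP service on port 80.
        "rules": [{"backendRefs": [{"name": "mcp-service", "port": 80}]}],
    },
}


def render_gateway_manifests() -> str:
    """Return the Gateway and HTTPRoute wrapped in a v1 List, as JSON."""
    bundle = {"apiVersion": "v1", "kind": "List", "items": [gateway, route]}
    return json.dumps(bundle, indent=2)


if __name__ == "__main__":
    print(render_gateway_manifests())
```

The HTTPRoute is what connects the public hostname to mcp-service; without it the Gateway would accept HTTPS connections but have no backend to forward them to.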
Cleanup
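After testing, delete the Kubernetes resources and release the static IP with the command below. It does not remove the mcp-cert certificate or the mcp-cluster cluster, which must be deleted separately if they are no longer needed: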
kubectl delete -f deployment.yaml && kubectl delete -f gateway.yaml && gcloud compute addresses delete mcp-server-ip --global
Deploying MCP servers on GKE gives agents and AI workflows a shared, scalable endpoint for their tools, served over HTTPS.