Deploying a Remote MCP Server on GKE in 30 Minutes

Feeding context from many different tools and data sources into large language models (LLMs) usually means writing a custom integration for each one. To streamline this, Anthropic introduced the Model Context Protocol (MCP), which standardizes how applications provide context to these models. By building an MCP server, developers can expose their APIs as tools that any MCP-compatible client can call. Google Kubernetes Engine (GKE) offers a managed, scalable environment for running these servers.

Understanding MCP Transports

The Model Context Protocol uses a client-server architecture. Initially, it only supported local server execution through the stdio transport. It has since added remote access via Streamable HTTP. With this transport the server runs as an independent process that can serve multiple clients: each client sends its messages with HTTP POST requests and can use GET requests to open a stream for messages from the server.
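To make the transport concrete, the sketch below builds (but does not send) the first request a client would POST to a Streamable HTTP endpoint. The URL, the client name, and the helper name are illustrative assumptions. The JSON-RPC initialize message and the Accept header follow the 2025-03-26 revision of the MCP specification.

```python
import json
import urllib.request

# Hypothetical endpoint; the /mcp path is an assumption, not from the article.
MCP_URL = "http://localhost:8080/mcp"


def build_initialize_request(url: str = MCP_URL) -> urllib.request.Request:
    """Build the JSON-RPC 'initialize' POST that opens an MCP session."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            # Illustrative client identity.
            "clientInfo": {"name": "example-client", "version": "0.1.0"},
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # The server may answer with plain JSON or with an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


request = build_initialize_request()
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

In practice an MCP client library handles this handshake, so the sketch is only meant to show what travels over the wire.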

Advantages of GKE for MCP Servers

Deploying an MCP server on GKE offers several key benefits:

  • Scalability: GKE Autopilot can efficiently handle variable traffic loads, allowing for horizontal scaling during peak demands.
  • Centralized Access: Teams can connect to a single MCP server, reducing redundancy and ensuring that updates are immediately available to all users.
  • Enhanced Security: Exposing the server through the Kubernetes Gateway API with an SSL certificate encrypts traffic between clients and the server. Encryption alone does not control who may call the server, so authentication still has to be configured separately.

Prerequisites for Setup

Before beginning the setup process, ensure the following tools are installed:

  • Python 3.10 or higher
  • uv (for package and project management)
  • Google Cloud SDK (gcloud)
  • kubectl command-line tool

Step-by-Step Installation Guide

To set up the environment, follow these steps:

  1. Create a directory for the project:
       mkdir mcp-on-gke && cd mcp-on-gke
  2. Configure Google Cloud credentials:
       gcloud auth login && gcloud config set project $PROJECT_ID
  3. Initiate the GKE Autopilot cluster creation:
       gcloud container clusters create-auto mcp-cluster --region $REGION --release-channel rapid --async
  4. Create project files using uv:
       uv init
  5. Prepare the necessary files: server.py, test_server.py, and a Dockerfile.

Creating a Math MCP Server

The example server exposes two tools, addition and subtraction. It is built with FastMCP, a Python framework for building MCP servers.

To add FastMCP as a dependency, run:

uv add fastmcp

asyncio is part of the Python standard library, so it does not need to be added as a dependency.

Then, implement the server logic in server.py.
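The article does not reproduce server.py, so the following is a minimal sketch of what it could contain. It assumes a FastMCP 2.x release in which run() accepts transport="streamable-http" together with host and port. The server name, the PORT environment variable, and the build_server and main helpers are illustrative choices.

```python
import os


def add(a: int, b: int) -> int:
    """Add two integers and return the sum."""
    return a + b


def subtract(a: int, b: int) -> int:
    """Subtract b from a and return the difference."""
    return a - b


def build_server():
    """Create a FastMCP server and register the math functions as tools."""
    # Imported lazily so the arithmetic above can be unit-tested
    # without the fastmcp dependency installed.
    from fastmcp import FastMCP

    mcp = FastMCP("Math MCP Server")  # illustrative server name
    # Registering this way leaves add and subtract as plain functions.
    mcp.tool()(add)
    mcp.tool()(subtract)
    return mcp


def main() -> None:
    """Serve over Streamable HTTP on the port given by PORT (default 8080)."""
    port = int(os.environ.get("PORT", "8080"))
    # Bind to 0.0.0.0 so the server is reachable from outside a container.
    build_server().run(transport="streamable-http", host="0.0.0.0", port=port)
```

Ending the file with the usual __main__ guard that calls main() lets uv run server.py start the server. Keeping add and subtract as plain functions means they can be tested without a running server.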

Testing the MCP Server Locally

Before deploying, test the MCP server locally. Start the server:

uv run server.py

With the server running, execute the test script, test_server.py, in a new terminal:

uv run test_server.py
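The test script is not shown in the article either. A sketch of one possible version follows, assuming the server exposes tools named add and subtract, that it listens on localhost:8080 under the /mcp path, and that the installed FastMCP release infers the Streamable HTTP transport from an http URL. The helper names are hypothetical.

```python
import asyncio


def endpoint_url(host: str = "localhost", port: int = 8080, path: str = "/mcp") -> str:
    """Build the URL of the MCP endpoint (the /mcp path is an assumption)."""
    return f"http://{host}:{port}{path}"


async def check_server(url: str) -> None:
    """Connect to the server, list its tools, and call each math tool once."""
    # Imported lazily so this module loads even without fastmcp installed.
    from fastmcp import Client

    async with Client(url) as client:
        tools = await client.list_tools()
        print("tools:", [tool.name for tool in tools])

        # The result type differs between FastMCP releases, so print it as-is.
        print("add:", await client.call_tool("add", {"a": 1, "b": 2}))
        print("subtract:", await client.call_tool("subtract", {"a": 10, "b": 3}))


def main() -> None:
    """Run the checks against the local server."""
    asyncio.run(check_server(endpoint_url()))
```

As with the server, calling main() under a __main__ guard at the end of the file makes the script runnable with uv run. Once the server is exposed through the gateway, the same check can be pointed at the public URL.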

Building the Container Image

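While the cluster is provisioning, prepare the Dockerfile and build the container image with Cloud Build. The command below pushes to an Artifact Registry Docker repository named mcp-repo in $REGION, so that repository must already exist: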

gcloud builds submit --tag $REGION-docker.pkg.dev/$PROJECT_ID/mcp-repo/math-mcp-server:latest

Once the image is built and the cluster has finished provisioning, fetch the cluster credentials so that kubectl can connect to it:

gcloud container clusters get-credentials mcp-cluster --region $REGION

Deploying with Gateway API and SSL

Deploy the server workloads using the Kubernetes Gateway API for secure exposure. Create a deployment.yaml file to define the deployment and service configurations, then apply it:

kubectl apply -f deployment.yaml
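The contents of deployment.yaml are not given in the article. As a sketch, the script below generates an equivalent Deployment and Service as JSON, which kubectl accepts in place of YAML. The resource names, labels, replica count, and container port 8080 are assumptions. Only the Service name mcp-service and its port 80 come from the port-forward command in this guide, and the image path mirrors the build step with placeholder values.

```python
import json

# Placeholder image path; substitute your region and project ID.
IMAGE = "REGION-docker.pkg.dev/PROJECT_ID/mcp-repo/math-mcp-server:latest"
APP_LABELS = {"app": "mcp-server"}  # illustrative label


def deployment(image: str = IMAGE, replicas: int = 1) -> dict:
    """Deployment running the MCP server container (port 8080 assumed)."""
    container = {
        "name": "mcp-server",
        "image": image,
        "ports": [{"containerPort": 8080}],
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "mcp-server", "labels": APP_LABELS},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": APP_LABELS},
            "template": {
                "metadata": {"labels": APP_LABELS},
                "spec": {"containers": [container]},
            },
        },
    }


def service() -> dict:
    """ClusterIP Service mapping port 80 to the container's port 8080."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": "mcp-service"},
        "spec": {
            "type": "ClusterIP",
            "selector": APP_LABELS,
            "ports": [{"port": 80, "targetPort": 8080}],
        },
    }


def manifest_list() -> dict:
    """Wrap both resources in a v1 List so one apply creates them together."""
    return {"apiVersion": "v1", "kind": "List", "items": [deployment(), service()]}


if __name__ == "__main__":
    print(json.dumps(manifest_list(), indent=2))
```

If the script were saved as, say, make_manifests.py, its output could be piped straight into kubectl apply -f - instead of maintaining the file by hand.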

Check the status of the pods:

kubectl get pods

To confirm that the service responds before exposing it publicly, forward a local port to it:

kubectl port-forward svc/mcp-service 8080:80

Securing the Connection

To secure the connection, reserve a static IP for the load balancer and create a Google-managed SSL certificate:

gcloud compute addresses create mcp-server-ip --global

Point your domain's DNS A record to this IP and create the certificate:

gcloud compute ssl-certificates create mcp-cert --domains mcp.yourdomain.com --global
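A Google-managed certificate is generally issued only once the domain actually resolves to the load balancer's IP address, so it can help to confirm the DNS record before moving on. The helper below is a small sketch using only the standard library. The function names are hypothetical, and the local resolver may lag behind a recent DNS change.

```python
import socket
from typing import Callable, List


def resolve_ipv4(hostname: str) -> List[str]:
    """Return the IPv4 addresses that hostname currently resolves to."""
    _, _, addresses = socket.gethostbyname_ex(hostname)
    return addresses


def dns_points_to(
    hostname: str,
    expected_ip: str,
    resolver: Callable[[str], List[str]] = resolve_ipv4,
) -> bool:
    """Return True if hostname resolves to expected_ip, False otherwise."""
    try:
        return expected_ip in resolver(hostname)
    except OSError:
        # Resolution failures (for example, no record yet) count as "not ready".
        return False
```

For example, dns_points_to("mcp.yourdomain.com", "203.0.113.10") would return True once the A record has propagated, with 203.0.113.10 standing in for the reserved address. The resolver parameter exists so the check can be tested without network access.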

Finally, deploy the gateway configuration:

kubectl apply -f gateway.yaml
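The article does not include gateway.yaml. The sketch below generates a Gateway and an HTTPRoute as JSON, which kubectl accepts in place of YAML. It reuses the names from the commands in this guide (mcp-server-ip, mcp-cert, mcp.yourdomain.com, mcp-service). The resource names mcp-gateway and mcp-route are assumptions, as is the choice of the gke-l7-global-external-managed class. Older clusters may need the v1beta1 Gateway API version instead of v1.

```python
import json

HOSTNAME = "mcp.yourdomain.com"  # domain used for the certificate


def gateway() -> dict:
    """HTTPS Gateway using the reserved IP and the pre-created certificate."""
    listener = {
        "name": "https",
        "protocol": "HTTPS",
        "port": 443,
        "tls": {
            "mode": "Terminate",
            # GKE option that attaches an existing Compute Engine certificate.
            "options": {"networking.gke.io/pre-shared-certs": "mcp-cert"},
        },
    }
    return {
        "apiVersion": "gateway.networking.k8s.io/v1",
        "kind": "Gateway",
        "metadata": {"name": "mcp-gateway"},
        "spec": {
            "gatewayClassName": "gke-l7-global-external-managed",
            "listeners": [listener],
            # Refers to the static address reserved with gcloud.
            "addresses": [{"type": "NamedAddress", "value": "mcp-server-ip"}],
        },
    }


def http_route() -> dict:
    """Route all requests for HOSTNAME to the MCP Service on port 80."""
    return {
        "apiVersion": "gateway.networking.k8s.io/v1",
        "kind": "HTTPRoute",
        "metadata": {"name": "mcp-route"},
        "spec": {
            "parentRefs": [{"name": "mcp-gateway"}],
            "hostnames": [HOSTNAME],
            "rules": [{"backendRefs": [{"name": "mcp-service", "port": 80}]}],
        },
    }


def manifest_list() -> dict:
    """Wrap both resources in a v1 List so one apply creates them together."""
    return {"apiVersion": "v1", "kind": "List", "items": [gateway(), http_route()]}


if __name__ == "__main__":
    print(json.dumps(manifest_list(), indent=2))
```

Terminating TLS at the gateway keeps certificate handling out of the server container, which continues to serve plain HTTP inside the cluster.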

Cleanup

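After testing, remove the Kubernetes resources and release the reserved IP address:

kubectl delete -f deployment.yaml && kubectl delete -f gateway.yaml && gcloud compute addresses delete mcp-server-ip --global

This command leaves the mcp-cluster cluster and the mcp-cert certificate in place. Delete them separately if they are no longer needed.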

Deploying MCP servers on Kubernetes gives agents and AI workflows a shared, scalable endpoint for the tools they depend on.

This editorial summary reflects Google and other public reporting on Deploying a Remote MCP Server on GKE in 30 Minutes.

Reviewed by WTGuru editorial team.