Secure Access to Private EKS Clusters Without Bastion Hosts Using SSM

Apr 22, 2026 · 4 min read

Locking down your Kubernetes control plane is a basic requirement for any production environment. Exposing the EKS API server to the public internet is just asking for automated scanners to ruin your weekend. However, securing the endpoint creates an operational headache: how do you actually run kubectl when the API is sealed inside a private subnet?

The traditional answer was a bastion host. But managing SSH keys, rotating credentials, and maintaining yet another publicly exposed EC2 instance is tedious. We all know that a "temporary" bastion host spun up on a Friday afternoon will inevitably become a load-bearing production pillar by Monday.

Instead, we can use AWS Systems Manager (SSM) Session Manager. By leveraging the SSM agent already running on your EKS worker nodes, we can securely tunnel our local traffic directly to the private API endpoint without opening inbound ports or managing SSH keys.

The Mechanics of the SSM Tunnel

The flow is straightforward:

  1. Your local machine initiates an SSM port forwarding session targeting a specific EKS worker node.
  2. The SSM session is instructed to forward traffic to a remote host (the private EKS API endpoint URL) on port 443.
  3. You update your kubeconfig to point to localhost on your chosen forwarded port.

Because the worker node is already in the VPC and authorized to talk to the EKS control plane, it acts as a highly secure, identity-aware proxy. Access is governed entirely by IAM, meaning you can audit every connection via CloudTrail.
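Concretely, step 2 is driven by the `AWS-StartPortForwardingSessionToRemoteHost` SSM document, which expects a small JSON parameter payload. Here is a minimal sketch of how that payload is assembled; the endpoint and port values are illustrative, not from a real cluster:

```shell
#!/bin/bash
# Illustrative values -- substitute your cluster's real private endpoint
EKS_ENDPOINT="1234567890ABCDEF.yl4.us-east-1.eks.amazonaws.com"
LOCAL_PORT="8443"

# Each parameter of the SSM document is a JSON array of strings
PARAMS="{\"host\":[\"${EKS_ENDPOINT}\"],\"portNumber\":[\"443\"],\"localPortNumber\":[\"${LOCAL_PORT}\"]}"
echo "${PARAMS}"
```

This is the exact string later handed to `aws ssm start-session --parameters`, so getting the array-of-strings shape right here saves a confusing validation error later.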

Prerequisite: IAM Configuration

For this to work, your EKS worker nodes must have the SSM agent installed (the official EKS-optimized AMIs include it by default) and the correct IAM permissions attached.

Here is a Terraform snippet demonstrating how to attach the necessary SSM policy to your existing EKS node IAM role.

```hcl
# Assumes you already have an aws_iam_role defined for your worker nodes
# named 'eks_node_role'
resource "aws_iam_role_policy_attachment" "ssm_managed_instance_core" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
  role       = aws_iam_role.eks_node_role.name
}

# Optional but recommended: restrict who can start sessions in IAM
resource "aws_iam_policy" "ssm_user_access" {
  name        = "EKS-SSM-Tunnel-Access"
  description = "Allows users to port forward to EKS nodes"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = "ssm:StartSession"
        Resource = [
          "arn:aws:ec2:*:*:instance/*",
          "arn:aws:ssm:*:*:document/AWS-StartPortForwardingSessionToRemoteHost"
        ]
        # In a real environment, restrict the instance resource via tags
        Condition = {
          StringEquals = {
            "ssm:resourceTag/eks:cluster-name" = "my-production-cluster"
          }
        }
      }
    ]
  })
}
```

This ensures your nodes can communicate with the SSM service and restricts which IAM users can actually initiate the tunnel.

Establishing the Tunnel

Once the nodes are registered in SSM, you need a script to extract a valid instance ID, locate the cluster API endpoint, and start the tunnel.

Here is a Bash script you can execute locally to handle the heavy lifting. It requires the AWS CLI and the Session Manager plugin to be installed on your workstation.

```bash
#!/bin/bash
set -euo pipefail

CLUSTER_NAME="my-production-cluster"
REGION="us-east-1"
LOCAL_PORT="8443"

# Fetch the private endpoint of the EKS cluster
echo "Fetching EKS endpoint for ${CLUSTER_NAME}..."
EKS_ENDPOINT=$(aws eks describe-cluster \
  --name "${CLUSTER_NAME}" \
  --region "${REGION}" \
  --query "cluster.endpoint" \
  --output text | sed 's/https:\/\///')

# Find an active worker node instance ID using tags
echo "Finding an active worker node..."
INSTANCE_ID=$(aws ec2 describe-instances \
  --region "${REGION}" \
  --filters "Name=tag:eks:cluster-name,Values=${CLUSTER_NAME}" \
            "Name=instance-state-name,Values=running" \
  --query "Reservations[0].Instances[0].InstanceId" \
  --output text)

if [ "$INSTANCE_ID" == "None" ]; then
  echo "Error: No running worker nodes found."
  exit 1
fi

echo "Establishing SSM tunnel through ${INSTANCE_ID} to ${EKS_ENDPOINT}..."
echo "Leave this terminal open. Access EKS via https://localhost:${LOCAL_PORT}"

# Start the port forwarding session
aws ssm start-session \
  --region "${REGION}" \
  --target "${INSTANCE_ID}" \
  --document-name AWS-StartPortForwardingSessionToRemoteHost \
  --parameters "{\"host\":[\"${EKS_ENDPOINT}\"],\"portNumber\":[\"443\"],\"localPortNumber\":[\"${LOCAL_PORT}\"]}"
```

Run this script, and it will bind localhost:8443 to the private API endpoint.
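If the session fails to start, first confirm the two client-side dependencies are actually on your PATH. A small, self-contained check (the `check` helper is just for illustration):

```shell
#!/bin/bash
# Report availability of each required client-side tool
check() { command -v "$1" >/dev/null 2>&1 && echo "$1: found" || echo "$1: MISSING"; }

RESULT="$(check aws; check session-manager-plugin)"
echo "$RESULT"
```

If `session-manager-plugin` comes back missing, the AWS CLI will abort `start-session` with a "SessionManagerPlugin is not found" error, so this is worth ruling out before debugging IAM.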

Updating Kubeconfig

The final step is modifying your local Kubernetes configuration. You cannot simply run aws eks update-kubeconfig and call it a day, because that will configure the private AWS endpoint, which your machine still cannot route to directly.

You need to manually alter the server field for your cluster to point to the local port.
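If you prefer scripting the change over hand-editing YAML, a `sed` one-liner can rewrite the server field in place (`kubectl config set-cluster <name> --server=...` achieves the same thing if you know the cluster entry's name). A minimal sketch against a throwaway kubeconfig with a made-up endpoint:

```shell
#!/bin/bash
set -euo pipefail

KUBECONFIG_FILE="$(mktemp)"

# A trimmed, illustrative cluster entry (the endpoint is fictional)
cat > "${KUBECONFIG_FILE}" <<'EOF'
apiVersion: v1
clusters:
- cluster:
    server: https://1234567890ABCDEF.yl4.us-east-1.eks.amazonaws.com
  name: arn:aws:eks:us-east-1:123456789012:cluster/my-production-cluster
EOF

# Rewrite the server field to point at the local end of the tunnel
sed -i.bak 's|server: https://.*|server: https://localhost:8443|' "${KUBECONFIG_FILE}"

grep 'server:' "${KUBECONFIG_FILE}"
```

The `-i.bak` form keeps a backup of the original file and works on both GNU and BSD sed.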

When you port-forward the EKS API server to your local machine, connecting to https://localhost:8443 introduces a new problem. The API server presents a TLS certificate minted for its internal AWS endpoint (e.g., 1234567890ABCDEF.yl4.us-east-1.eks.amazonaws.com), not localhost.

The quick, dirty fix is to add insecure-skip-tls-verify: true to your kubeconfig. But nothing screams "I definitely passed my SOC2 audit" quite like explicitly disabling TLS validation in production. It is the infrastructure equivalent of putting black tape over a check engine light.

Instead of turning off validation, we can instruct kubectl to connect via our local port but validate the TLS certificate against the actual EKS endpoint hostname. We do this by utilizing the tls-server-name parameter.

```yaml
apiVersion: v1
clusters:
- cluster:
    server: https://localhost:8443
    # Validate the certificate against the real AWS endpoint
    tls-server-name: 1234567890ABCDEF.yl4.us-east-1.eks.amazonaws.com
  name: arn:aws:eks:us-east-1:123456789012:cluster/my-production-cluster
# ... contexts and users remain unchanged
```

Once saved, kubectl get pods will route securely through the SSM tunnel, across the worker node, and hit the control plane.

Wrap-Up

Relying on SSM port forwarding eliminates the need for VPNs, bastion hosts, and complex routing rules just to run operational commands against an isolated EKS cluster. By utilizing the existing IAM-integrated agent on your worker nodes, you shrink your external attack surface while maintaining strict audit trails for developer access.
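The audit trail comes essentially for free: every `StartSession` call is recorded in CloudTrail as a management event. As a hedged sketch (the region and output shaping below are assumptions about your setup), recent tunnel activity can be pulled with the AWS CLI:

```shell
aws cloudtrail lookup-events \
  --region us-east-1 \
  --lookup-attributes AttributeKey=EventName,AttributeValue=StartSession \
  --max-results 20 \
  --query "Events[].{Time:EventTime,User:Username}" \
  --output table
```

This makes it straightforward to answer "who opened a tunnel to the cluster last week" without any extra logging infrastructure.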
