Description
Is there an existing issue for this?
- I have searched the existing issues
Current Behavior
When https://github.com/coder/coder-logstream-kube is in use, we see extraneous changes to the agent connectivity state. These include marking the agent connected before the agent process actually connects to coderd, and marking the agent disconnected even though the agent never disconnects from coderd. You will often see logs like the one under Relevant Log Output in the coderd log that do not correspond to the agent itself disconnecting.
The root problem is that coder-logstream-kube impersonates the agent to do its work with coderd. It learns the agent token from the Kubernetes pod spec, then connects to the same RPC endpoint as the real agent. Coderd is designed with the assumption that only the real agent will connect to this RPC endpoint, so it uses the connectivity state of the RPC to set connection and disconnection timestamps for the agent. These timestamps are used to compute the connectivity state of the agent.
Consequences
At least one customer has observed VS Code attempting to connect to the agent before it actually starts.
At least two customers have observed disconnections from both VS Code and JetBrains due to the agent being marked disconnected. This second issue can be mitigated by not triggering disconnections at the SSH layer due to momentary agent disconnects reported by coderd.
The dashboard and metrics briefly, incorrectly, indicate the agent is disconnected.
Relevant Log Output
coderd.agentrpc.yamux.stdlib: [ERR] yamux: Failed to read header: failed to get reader: failed to read frame header: EOF owner=spike workspace_name=dogfood agent_name=main request_id=a64d2f7f-631c-4daf-9692-53e95c4bbd0e
Expected Behavior
coder-logstream-kube connections should not affect the connectivity status of the agent.
Steps to Reproduce
- set up Coder with Kubernetes as the workspace infrastructure (including a template)
- install coder-logstream-kube in the Kubernetes cluster as usual
- start a workspace
Environment
IaaS: Kubernetes
coder-logstream-kube: since v0.0.10
coder: since v2.7 at least, possibly older
Additional Context
No response