Add agent registration timings #116

dannykopping · 2024-10-18T19:38:45Z

As part of #44, we should show when the agent registration occurred and its start/end time in this timeline. This gives users the ability to understand the delta between workspace provisioning completion and the agent startup, which is really an approximate measure of how quickly the compute instance spins up and executes the agent process and how long it takes to call back to the control-plane.

In https://codercom.slack.com/archives/CJURPL8DN/p1729196074736939, Spike and I discussed what would the definition for "agent registration" be. We settled upon: The time between an agent process starting and the successful completion of the call to /api/v2/workspaceagents/me/rpc.
See here: https://github.com/coder/coder/blob/7da231bc9286060e7ad499fa38cb829c4b055a7d/agent/agent.go#L750-L759

The workspace_agents.first_connected_at column seems to satisfy this definition fairly well, and is set here:
https://github.com/coder/coder/blob/5366f2576f690a3f7d3ac1a4efb8dd49cc2e9bd1/coderd/workspaceagentsrpc.go#L330-L338

workspace_agent.created_at is set here which is invoked once a provisioner job has been completed:
https://github.com/coder/coder/blob/fac77f956ec28ad04c0c90750e381e69465d2ae1/coderd/provisionerdserver/provisionerdserver.go#L1782-L1784

Based on this, I think we can simply return the value of created_at (start) to first_connected_at (end) as a reasonable approximation of agent registration time.

Nit: do we want to persist with this "registration time" terminology or is "connection time" more clear?

The text was updated successfully, but these errors were encountered:

BrunoQuaresma · 2024-10-23T16:09:40Z

Ok, so from what I understand, all the data we need is already there. We just need to return this data in the timing endpoint. Is that correct? For example:

return sdk.AgentTiming{
    StartedAt: a. CreatedAt,
    EndedAt: a.FirstConnectedAt,
    Agent: a.Name
}

johnstcn · 2024-10-23T16:15:51Z

@BrunoQuaresma that sounds about right to me based on the issue description!
I would double-check that the time returned roughly matches with the time it takes to download and start the agent.

dannykopping · 2024-10-23T16:52:39Z

Hhmm maybe we need the agent download time as well, that would be very useful

BrunoQuaresma · 2024-10-23T17:29:55Z

@dannykopping so we want to display two timings for each agent, connection, and download times?

dannykopping · 2024-10-23T17:44:00Z

@dannykopping so we want to display two timings for each agent, connection, and download times?

I think so, WDYT?

BrunoQuaresma · 2024-10-25T12:53:41Z

I'm trying to figure out what is the best way to display the agent times based on this issue's discussion. I recorded a video explaining my doubts and thoughts a bit more.

Screen.Recording.2024-10-25.at.09.40.09.mp4

chrifro · 2024-10-25T13:59:59Z

I'm for option two, then we are consistent with the patterns.
The agent times are part of dev. So dev should have one bar including two smaller boxes. When you click on them, you can see the timings for connection and download times. For the colors, if there is no existing state like success or failed (as we have for start) we can use grey bars without a legend.

Hope that makes sense. Let me know if you prefer a mockup but I think we're aligned @BrunoQuaresma :)

dannykopping · 2024-10-25T14:36:56Z

OK, I think we need to take a step back here.

In the design you've got "start" and "dev"; I think I see why you're doing this (because one workspace can have many agents). Even so, the "start" section would have to be under "dev" since each agent has its own startup scripts.

Each agent will have (in order) timings for downloading, connecting, and executing startup scripts. Each should be individual lanes of the timing chart like we have already for "init", "plan", "graph", etc.

I think we should assume that 99% of the time workspaces will have a single agent, so let's keep the design simple by excluding the agent name in that case.

If there are multiple agents, we should repeat the "workspace boot" section and in that we could include the agent name.

On a related note, the main panel label "Provisioning time" is too specific, since it does not include the agent timings. I think we should rename this to "Build timeline" as I suggested here.

I also think we should be more clear and rename "workspace boot" to "agent initialization". The whole timeline is effectively the timings for workspace boot, so this will confuse folks.

BrunoQuaresma · 2024-10-29T11:52:17Z

@dannykopping it makes sense. It looks like we don't have the download agent timing for now, so I will open a PR to include the connection timing and group it together with the start stage in the same section for each agent. It would look like the following:

The section title would be the agent name, in this case, "dev"
Connection would be added as a stage of the agent as the Start

Wdyt?

dannykopping · 2024-10-29T16:47:46Z

Perfect 👍

I think we should name the stages "run startup scripts" and "connect", to remain consist with the other stages (present tense verbs). The help text for the registration step could be "Time taken to establish an RPC connection with the control plane".

coder-labeler bot added the project/observability label Oct 18, 2024

BrunoQuaresma self-assigned this Oct 23, 2024

BrunoQuaresma mentioned this issue Oct 29, 2024

feat(site): add agent connection timings coder/coder#15276

Merged

BrunoQuaresma closed this as completed in coder/coder#15276 Nov 1, 2024

BrunoQuaresma closed this as completed in coder/coder@e232aee Nov 1, 2024

BrunoQuaresma mentioned this issue Nov 1, 2024

Add agent download timings #184

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add agent registration timings #116

Add agent registration timings #116

dannykopping commented Oct 18, 2024

BrunoQuaresma commented Oct 23, 2024

johnstcn commented Oct 23, 2024

dannykopping commented Oct 23, 2024

BrunoQuaresma commented Oct 23, 2024

dannykopping commented Oct 23, 2024

BrunoQuaresma commented Oct 25, 2024

chrifro commented Oct 25, 2024

dannykopping commented Oct 25, 2024

BrunoQuaresma commented Oct 29, 2024

dannykopping commented Oct 29, 2024

Add agent registration timings #116

Add agent registration timings #116

Comments

dannykopping commented Oct 18, 2024

BrunoQuaresma commented Oct 23, 2024

johnstcn commented Oct 23, 2024

dannykopping commented Oct 23, 2024

BrunoQuaresma commented Oct 23, 2024

dannykopping commented Oct 23, 2024

BrunoQuaresma commented Oct 25, 2024

chrifro commented Oct 25, 2024

dannykopping commented Oct 25, 2024

BrunoQuaresma commented Oct 29, 2024

dannykopping commented Oct 29, 2024