-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[extension/httpforwarder] Shutdown should wait server exit #37735
[extension/httpforwarder] Shutdown should wait server exit #37735
Conversation
err := h.server.Close() | ||
h.shutdownWG.Wait() | ||
return err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use of Shutdown
vs Close
is orthogonal to the issue being fixed in this PR: deterministic release of the port used by the component. In both cases there is still the need to wait the call to Serve(listener)
to return.
I tend to prefer Close
since, in certain cases Shutdown
can take much longer given that it needs to wait existing connections to finish gracefully. That said the component owners will have more context if they want to change the current behavior in that regard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may miss something:
// Shutdown gracefully shuts down the server without interrupting any
// active connections. Shutdown works by first closing all open
// listeners, then closing all idle connections, and then waiting
// indefinitely for connections to return to idle and then shut down.
// If the provided context expires before the shutdown is complete,
// Shutdown returns the context's error, otherwise it returns any
// error returned from closing the [Server]'s underlying Listener(s).
A.k.a because it returns the error from the listener that function that you wait for it must be done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Take into account the possible interleaves of the calls to Serve
and Shutdown
/Close
. The ownership of listener
is transferred to the http.Server
when the call to Serve(listener)
happens. However, since that happens in a goroutine it is possible that Shutdown
is called before the http.Server
took the ownership of listener. In this case the listener is not closed until the call to Serve
happens. Serve
adds a deferred call to listener.Close
, even if Shutdown
was already called, which ensures that the port is released by the time the method returns.
While this may be rather improbable in practice, it is better for the component to behave deterministically with the shutdown ensuring no left over goroutine and/or port.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bogdandrutu PTAL at reply above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM as a solution tofor the flaky test. The Close vs Shutdown discussion can be solved in a separate PR if needed
* main: (55 commits) [chore] Update core dependencies (open-telemetry#38124) Add kafka topics observer implementation (open-telemetry#38060) [exporter/splunk_hec] Mute errors from draining the response body (open-telemetry#38118) [chore] [exporter/splunk_hec] Remove dead code (open-telemetry#38113) Add support for JUnit test results (open-telemetry#37941) [chore] amend changelog for prometheus receiver change (open-telemetry#38109) [chore] Fix dead links in issue-triaging.md (open-telemetry#38105) [chore] fix deprecation (open-telemetry#38107) [exporter/coralogix] Add new batch options to Coralogix exporter (open-telemetry#38082) [chore][exporter/datadog] fix integration test (open-telemetry#38091) [chore] Update otel to unblock contrib test in core repo (open-telemetry#38100) [chore] Bump go-version match to 1.23 (open-telemetry#38099) [exporter/elasticsearch] Add _metric_names_hash to avoid metric rejections (open-telemetry#37511) elasticsearchexporter: refactor encoding; drop metrics support from raw/none/bodymap mapping modes (open-telemetry#37928) [exporter/stefexporter] Fix incorrectly implemented STEF exporter zstd compression option (open-telemetry#38089) [exporter/clickhouse] Add client info for identifying exporter in `system.query_log` (open-telemetry#37146) [chore] Prepare release 0.120.1 (open-telemetry#38055) [extension/httpforwarder] Shutdown should wait server exit (open-telemetry#37735) receiver/prometheusremotewrite: Add two fields timestamp and value. (open-telemetry#37895) [reciver/sqlqueryreceiver] Add support for SapASE (sybase) (open-telemetry#37773) ...
Description
Ensures that the
Shutdown
waits for the server to release the listen port and finish the corresponding go routine.Although a bug on the component it doesn't seem to deserve a changelog since it was detected via tests and there is no issue around the shutdown of the component.
Link to tracking issue
Fixes #37716
Testing
Repro the issue locally and validated no repro with the changes.
Documentation
N/A