Some flaky Aspire.Hosting.Tests.SlimTestProgramTests tests #5637

Closed
radical opened this issue Sep 10, 2024 · 9 comments · Fixed by #5668
Labels
area-engineering-systems infrastructure helix infra engineering repo stuff disabled-tests

Comments

@radical
Member

radical commented Sep 10, 2024

Build Information

Build: https://dev.azure.com/dnceng/internal/_build/results?buildId=2534744
Build error leg or test failing: Aspire.Hosting.Tests
Pull request: main

Error Message

Fill the error message using step by step known issues guidance.

{
  "ErrorMessage": "at Aspire.Hosting.Tests.SlimTestProgramFixture.WaitReadyStateAsync",
  "ErrorPattern": "",
  "BuildRetry": false,
  "ExcludeConsoleLog": false
}

Known issue validation

Build: 🔎 https://dev.azure.com/dnceng/internal/_build/results?buildId=2534744
Error message validated: [at Aspire.Hosting.Tests.SlimTestProgramFixture.WaitReadyStateAsync]
Result validation: ✅ Known issue matched with the provided build.
Validation performed at: 9/10/2024 2:19:58 AM UTC

Report

| Build | Definition | Test | Pull Request |
| --- | --- | --- | --- |
| 803024 | dotnet/aspire | Aspire.Hosting.Tests.SlimTestProgramTests.TestProjectStartsAndStopsCleanly | #5505 |
| 2534744 | dotnet-aspire | Aspire.Hosting.Tests.SlimTestProgramTests.TestProjectStartsAndStopsCleanly | |

Summary

| 24-Hour Hit Count | 7-Day Hit Count | 1-Month Count |
| --- | --- | --- |
| 2 | 2 | 2 |

Failing tests:

  • Aspire.Hosting.Tests.SlimTestProgramTests.TestProjectStartsAndStopsCleanly
  • Aspire.Hosting.Tests.SlimTestProgramTests.TestPortOnEndpointAnnotationAndAllocatedEndpointAnnotationMatch
  • Aspire.Hosting.Tests.SlimTestProgramTests.TestPortOnEndpointAnnotationAndAllocatedEndpointAnnotationMatchForReplicatedServices

Note: failing only on Windows.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-engineering-systems infrastructure helix infra engineering repo stuff label Sep 10, 2024
@radical radical added the blocking-clean-ci Blocking a green CI label Sep 10, 2024
@radical
Member Author

radical commented Sep 10, 2024

Stack trace:

System.Net.Http.HttpRequestException : No connection could be made because the target machine actively refused it. (localhost:5156)
---- System.Net.Sockets.SocketException : No connection could be made because the target machine actively refused it.

Stack trace
at System.Net.Http.HttpConnectionPool.ConnectToTcpHostAsync(String host, Int32 port, HttpRequestMessage initialRequest, Boolean async, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.ConnectAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.CreateHttp11ConnectionAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.AddHttp11ConnectionAsync(QueueItem queueItem)
at System.Threading.Tasks.TaskCompletionSourceWithCancellation`1.WaitWithCancellationAsync(CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
at Microsoft.Extensions.Http.Logging.LoggingHttpMessageHandler.<SendCoreAsync>g__Core|5_0(HttpRequestMessage request, Boolean useAsync, CancellationToken cancellationToken)
at Microsoft.Extensions.Http.Logging.LoggingScopeHttpMessageHandler.<SendCoreAsync>g__Core|5_0(HttpRequestMessage request, Boolean useAsync, CancellationToken cancellationToken)
at System.Net.Http.HttpClient.GetStringAsyncCore(HttpRequestMessage request, CancellationToken cancellationToken)
[at Aspire.Hosting.Tests.SlimTestProgramFixture.WaitReadyStateAsync(CancellationToken cancellationToken) in /_/tests/Aspire.Hosting.Tests/TestProgramFixture.cs:line 67](https://dev.azure.com/dnceng/internal/_git/ecc7dc1d-c809-460e-8d0f-6712313d5180?path=%2F_%2Ftests%2FAspire.Hosting.Tests%2FTestProgramFixture.cs&version=GBmain&_a=contents&line=67&lineEnd=68&lineStartColumn=1&lineEndColumn=1&lineStyle=plain)
[at Aspire.Hosting.Tests.TestProgramFixture.InitializeAsync() in /_/tests/Aspire.Hosting.Tests/TestProgramFixture.cs:line 35](https://dev.azure.com/dnceng/internal/_git/ecc7dc1d-c809-460e-8d0f-6712313d5180?path=%2F_%2Ftests%2FAspire.Hosting.Tests%2FTestProgramFixture.cs&version=GBmain&_a=contents&line=35&lineEnd=36&lineStartColumn=1&lineEndColumn=1&lineStyle=plain)
----- Inner Stack Trace -----
at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
at System.Net.Sockets.Socket.<ConnectAsync>g__WaitForConnectWithCancellation|285_0(AwaitableSocketAsyncEventArgs saea, ValueTask connectTask, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.ConnectToTcpHostAsync(String host, Int32 port, HttpRequestMessage initialRequest, Boolean async, CancellationToken cancellationToken)

@radical
Member Author

radical commented Sep 10, 2024

Failing tests:

  • Aspire.Hosting.Tests.SlimTestProgramTests.TestProjectStartsAndStopsCleanly
  • Aspire.Hosting.Tests.SlimTestProgramTests.TestPortOnEndpointAnnotationAndAllocatedEndpointAnnotationMatch
  • Aspire.Hosting.Tests.SlimTestProgramTests.TestPortOnEndpointAnnotationAndAllocatedEndpointAnnotationMatchForReplicatedServices

@radical
Member Author

radical commented Sep 10, 2024

cc @mitchdenny

@radical
Member Author

radical commented Sep 10, 2024

These tests are failing with different stack traces. Here is another failure:

System.Net.Http.HttpRequestException : An error occurred while sending the request.
---- System.IO.IOException : Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host..
-------- System.Net.Sockets.SocketException : An existing connection was forcibly closed by the remote host.

   at System.Net.Http.HttpConnection.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at Microsoft.Extensions.Http.Logging.LoggingHttpMessageHandler.<SendCoreAsync>g__Core|5_0(HttpRequestMessage request, Boolean useAsync, CancellationToken cancellationToken)
   at Microsoft.Extensions.Http.Logging.LoggingScopeHttpMessageHandler.<SendCoreAsync>g__Core|5_0(HttpRequestMessage request, Boolean useAsync, CancellationToken cancellationToken)
   at System.Net.Http.HttpClient.GetStringAsyncCore(HttpRequestMessage request, CancellationToken cancellationToken)
   at Aspire.Hosting.Tests.SlimTestProgramFixture.WaitReadyStateAsync(CancellationToken cancellationToken) in /_/tests/Aspire.Hosting.Tests/TestProgramFixture.cs:line 67
   at Aspire.Hosting.Tests.TestProgramFixture.InitializeAsync() in /_/tests/Aspire.Hosting.Tests/TestProgramFixture.cs:line 35
----- Inner Stack Trace -----
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource<System.Int32>.GetResult(Int16 token)
   at System.Net.Http.HttpConnection.InitialFillAsync(Boolean async)
   at System.Net.Http.HttpConnection.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
----- Inner Stack Trace -----

@davidfowl
Member

Do we have server logs?

radical added a commit that referenced this issue Sep 10, 2024
…ts (#5640)

* [tests] Disable flaky `Aspire.Hosting.Tests.SlimTestProgramTests` tests

Issue: #5637

* disable tests only on windows
@radical
Member Author

radical commented Sep 10, 2024

> Do we have server logs?

From the build artifacts:

info: Aspire.Hosting.Dcp.start-apiserver.api-server[0]
      Starting API server...
info: Aspire.Hosting.Dcp.start-apiserver.api-server[0]
      API server started        {"Address": "::1", "Port": 52361}
info: Aspire.Hosting.Dcp.start-apiserver.dcp-host[0]
      Starting DCP controller host
info: Aspire.Hosting.Dcp.start-apiserver.dcp-host[0]
      Started all services      {"count": 1}
$ENDPOINTS: {"servicea":{"Endpoints":[{"Name":"http","Uri":"http://localhost:5156"}]},"serviceb":{"Endpoints":[{"Name":"http","Uri":"http://localhost:5254"}]},"servicec":{"Endpoints":[{"Name":"http","Uri":"http://localhost:5271"}]},"workera":{"Endpoints":[]}}
info: Aspire.Hosting.DistributedApplication[0]
      Distributed application started. Press Ctrl+C to shut down.
info: System.Net.Http.HttpClient.Default.LogicalHandler[100]
      Start processing HTTP request GET http://localhost:5156/
info: System.Net.Http.HttpClient.Default.ClientHandler[100]
      Sending HTTP request GET http://localhost:5156/
[xUnit.net 00:00:16.35]     Aspire.Hosting.Tests.SlimTestProgramTests.TestPortOnEndpointAnnotationAndAllocatedEndpointAnnotationMatchForReplicatedServices [FAIL]
[xUnit.net 00:00:16.35]       System.Net.Http.HttpRequestException : No connection could be made because the target machine actively refused it. (localhost:5156)

We need better logs for this one.
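
When digging through logs like the one above, it can help to pull the service URLs out of the `$ENDPOINTS:` line programmatically. The following is a minimal sketch that assumes the line has exactly the shape shown in this log; `EndpointsLine` and `Parse` are hypothetical names for illustration and are not part of the Aspire test code.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper (not from the Aspire repo) for reading a
// "$ENDPOINTS: {...}" log line of the shape shown above.
public static class EndpointsLine
{
    private const string Prefix = "$ENDPOINTS: ";

    // Returns a map of service name -> endpoint URIs,
    // or null when the line does not start with the expected prefix.
    public static Dictionary<string, List<string>>? Parse(string line)
    {
        if (!line.StartsWith(Prefix, StringComparison.Ordinal))
        {
            return null;
        }

        var result = new Dictionary<string, List<string>>();
        using JsonDocument doc = JsonDocument.Parse(line.Substring(Prefix.Length));

        foreach (JsonProperty service in doc.RootElement.EnumerateObject())
        {
            var uris = new List<string>();
            foreach (JsonElement endpoint in service.Value.GetProperty("Endpoints").EnumerateArray())
            {
                uris.Add(endpoint.GetProperty("Uri").GetString() ?? string.Empty);
            }
            result[service.Name] = uris;
        }

        return result;
    }
}
```

Note that the line must be rejoined first if the console wrapped it, since the JSON has to be parsed as one string.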

@davidfowl
Member

Let's up the log level and start a PR with resource logs so we can investigate.

radical added a commit to radical/aspire that referenced this issue Sep 10, 2024
The HttpClient used in the tests doesn't have resilience, so if
`TestProgram` takes longer to start up, then HTTP requests to the
services can fail.

Instead, explicitly wait for `Application started` messages on the
services before sending a request.

Fixes dotnet#5637
@radical
Member Author

radical commented Sep 10, 2024

Based on the logs, I think the issue here is that the HttpClient being used does not have resilience, so we can end up sending the HTTP request before the services are ready.

Trying a fix in #5668.
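
For illustration, one general way to tolerate slow service startup is to retry the readiness request at the client until the service accepts connections. This is a minimal sketch of that technique only, not the change made in #5668 (the commit message in this thread describes waiting for `Application started` log messages instead). `ReadinessProbe` and `RetryUntilReadyAsync` are hypothetical names, and the timeout and delay values in the usage comment are arbitrary assumptions.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of the Aspire test suite.
public static class ReadinessProbe
{
    // Runs `action` repeatedly while it throws HttpRequestException
    // (for example "connection refused" while the service is still starting),
    // waiting `delay` between attempts. Gives up once `timeout` elapses,
    // at which point an OperationCanceledException (or the last
    // HttpRequestException) propagates to the caller.
    public static async Task<T> RetryUntilReadyAsync<T>(
        Func<CancellationToken, Task<T>> action,
        TimeSpan timeout,
        TimeSpan delay,
        CancellationToken cancellationToken = default)
    {
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
        cts.CancelAfter(timeout);

        while (true)
        {
            try
            {
                return await action(cts.Token);
            }
            catch (HttpRequestException) when (!cts.IsCancellationRequested)
            {
                // The service is not accepting connections yet: wait, then retry.
                // Task.Delay throws if the timeout expires during the wait.
                await Task.Delay(delay, cts.Token);
            }
        }
    }
}

// Usage (inside an async method, with an existing HttpClient `client`):
//
//   string body = await ReadinessProbe.RetryUntilReadyAsync(
//       ct => client.GetStringAsync("http://localhost:5156/", ct),
//       timeout: TimeSpan.FromSeconds(30),
//       delay: TimeSpan.FromMilliseconds(500));
```

Client-side retries hide slow startup but can also mask a service that never comes up until the timeout fires, which is one reason waiting on an explicit readiness signal may be preferable in tests.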

@davidfowl
Member

I wonder if WaitFor can make our tests more reliable.

@radical radical removed the blocking-clean-ci Blocking a green CI label Sep 10, 2024
radical added a commit to radical/aspire that referenced this issue Sep 11, 2024
…ts (dotnet#5640)

* [tests] Disable flaky `Aspire.Hosting.Tests.SlimTestProgramTests` tests

Issue: dotnet#5637

* disable tests only on windows
@github-actions github-actions bot locked and limited conversation to collaborators Oct 12, 2024
2 participants