Flaky test (and sensitive to execution order): KafkaContextPropagationTest #46047

Open
holly-cummins opened this issue Feb 3, 2025 · 2 comments
Labels
area/housekeeping Issue type for generalized tasks not related to bugs or enhancements area/kafka env/windows Impacts Windows machines

Comments

@holly-cummins
Contributor

holly-cummins commented Feb 3, 2025

Description

I've noticed some flakiness in KafkaContextPropagationTest. I can make test failures more or less likely by setting an explicit execution order, so I think there's some sensitivity to execution order in the tests.
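To make "sensitive to execution order" concrete, here is a toy sketch in plain Java. It is not the Quarkus test and does not claim to show the real cause. The class, the two checks, and the shared `leftover` flag are all hypothetical names invented for illustration. The only thing borrowed from the report is the pair of status codes, 500 expected and 204 observed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class OrderSensitivityDemo {
    // Hypothetical state shared by both checks; nothing resets it between them.
    static boolean leftover = false;

    // Hypothetical check that leaves state behind and reports 204.
    static int leavesStateBehind() {
        leftover = true;
        return 204;
    }

    // Hypothetical check that expects a clean slate: 500 when clean, 204 otherwise.
    static int expectsCleanState() {
        return leftover ? 204 : 500;
    }

    // Runs the checks in the given order from a fresh start and collects their statuses.
    static List<Integer> run(List<Supplier<Integer>> order) {
        leftover = false;
        List<Integer> statuses = new ArrayList<>();
        for (Supplier<Integer> check : order) {
            statuses.add(check.get());
        }
        return statuses;
    }

    public static void main(String[] args) {
        Supplier<Integer> clean = OrderSensitivityDemo::expectsCleanState;
        Supplier<Integer> dirty = OrderSensitivityDemo::leavesStateBehind;
        System.out.println(run(List.of(clean, dirty))); // [500, 204]
        System.out.println(run(List.of(dirty, clean))); // [204, 204]
    }
}
```

In this toy model the second ordering produces the same "Expected 500 but was 204" shape as the CI log. In JUnit 5 an explicit order is normally pinned with `@TestMethodOrder(MethodOrderer.OrderAnnotation.class)` on the class and `@Order(n)` on each method; whether the linked branches use exactly that mechanism is an assumption here.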

The failure moves between the Java 17 and Java 21 jobs depending on the order I set, and it also moves between test methods. I cannot reproduce it locally on my Mac, and it never turns up on Windows.

Note that because we rerun flaky tests in CI, this failure is unlikely to mark a build red (unless the codebase is affected by #46048).

The failure is (for example):

2025-02-03T19:28:31.5498131Z [WARNING] Flakes: 
2025-02-03T19:28:31.5499122Z [WARNING] io.quarkus.it.kafka.KafkaContextPropagationTest.testAbsenceOfContextPropagation
2025-02-03T19:28:31.5500579Z [ERROR]   Run 1: KafkaContextPropagationTest.testAbsenceOfContextPropagation:81 1 expectation failed.
2025-02-03T19:28:31.5502997Z Expected status code <500> but was <204>.

An order which reproduces failures (maybe 1 in 8 times?): https://github.com/holly-cummins/quarkus/blob/3c71670e320eaf2fff5a69857946bce9e35f1d02/integration-tests/reactive-messaging-context-propagation/src/test/java/io/quarkus/it/kafka/KafkaContextPropagationTest.java

An order which makes success very likely: https://github.com/holly-cummins/quarkus/blob/b5b544e40061d433b58e9ba93e51ce6e6b7a85f3/integration-tests/reactive-messaging-context-propagation/src/test/java/io/quarkus/it/kafka/KafkaContextPropagationTest.java (in this one, I only put explicit orders on the tests which seemed to need them).

You might find this branch useful for reproducing. I've removed most jobs from the CI so it only runs the affected test: https://github.com/holly-cummins/quarkus/tree/refs/heads/messaging-reactive-order-reproducer

Implementation ideas

No response

@holly-cummins holly-cummins added the area/housekeeping Issue type for generalized tasks not related to bugs or enhancements label Feb 3, 2025

quarkus-bot bot commented Feb 3, 2025

/cc @alesj (kafka), @cescoffier (kafka), @ozangunalp (kafka)

@quarkus-bot quarkus-bot bot added the env/windows Impacts Windows machines label Feb 3, 2025
@ozangunalp
Contributor

@holly-cummins thanks a lot for looking into this!

I have a branch in which I think I've fixed it. I'll run more tests on that branch against what you describe here.

It is too early to say where the race condition actually is. It may be in the "clear" of the context in context propagation.
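As a purely illustrative sketch of that kind of race, and not the actual Quarkus or context-propagation code, the toy below uses latches to force both interleavings of a "clear" and a check. The shared `context` slot, the `check` method, and the mapping of "context absent" to 500 and "context still visible" to 204 are assumptions made for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

public class ClearRaceDemo {
    // Hypothetical shared slot standing in for a propagated context.
    static final AtomicReference<String> context = new AtomicReference<>();

    // Returns the status a "no context expected" check would observe.
    // clearFirst decides which side wins the race, so each run is deterministic.
    static int check(boolean clearFirst) {
        context.set("left-over");
        CountDownLatch cleared = new CountDownLatch(1);
        CountDownLatch checked = new CountDownLatch(1);

        Thread cleaner = new Thread(() -> {
            try {
                if (!clearFirst) {
                    checked.await(); // late clear: wait until the check has already read
                }
                context.set(null);
                cleared.countDown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        cleaner.start();

        try {
            if (clearFirst) {
                cleared.await(); // early clear: the context is gone before the check reads
            }
            int status = context.get() == null ? 500 : 204;
            checked.countDown();
            cleaner.join();
            return status;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(check(true));  // 500: clear happened before the check
        System.out.println(check(false)); // 204: check ran before the clear
    }
}
```

In a real run nothing forces the ordering, which would be consistent with a failure that shows up only some of the time and shifts with the test order.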

Btw, we've reverted the PR that introduced these tests on main.
