Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[flaky test][exporter/prometheusremotewrite] TestExportWithWALEnabled is flaky on Windows #37715

Closed
pjanotti opened this issue Feb 5, 2025 · 6 comments · Fixed by #37719
Closed

Comments

@pjanotti
Copy link
Contributor

pjanotti commented Feb 5, 2025

Component(s)

exporter/prometheusremotewrite

Describe the issue you're reporting

This shouldn't be falling on non-Windows since in general they allow deleting a folder even if there are still open handles to files in the directory. Anyway, the test TestExportWithWALEnabled, introduced via #37630, failed on main.

https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/13153338018/job/36704858478#step:8:167

=== FAIL: . TestExportWithWALEnabled (1.59s)
    testing.go:1231: TempDir RemoveAll cleanup: remove C:\Users\RUNNER~1\AppData\Local\Temp\TestExportWithWALEnabled2886325455\001\prom_remotewrite\00000000000000000001: The process cannot access the file because it is being used by another process.
Copy link
Contributor

github-actions bot commented Feb 5, 2025

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@ArthurSens
Copy link
Member

Damn, this Windows thing really doesn't like me 😭. Thanks for raising it

@ArthurSens
Copy link
Member

I noticed that other tests (they all touch the WAL) are being skipped due to the same failure. I wonder if this is really something unfixable and we should skip or if there's anything we can do in all those tests.

@dashpole
Copy link
Contributor

dashpole commented Feb 5, 2025

We should skip for now, and open a tracking issue

@pjanotti
Copy link
Contributor Author

pjanotti commented Feb 5, 2025

@ArthurSens it is fixable: it is some race with the WAL settings in which some go routine still using files in the test directory. It is common that on shutdown something is missed and left running. Let me try to scan over it to see what I can find.

@songy23 songy23 closed this as completed in 86430c5 Feb 5, 2025
pjanotti added a commit to pjanotti/opentelemetry-service-contrib that referenced this issue Feb 5, 2025
songy23 pushed a commit that referenced this issue Feb 6, 2025
…ne exits (#37733)

#### Description

When WAL is used the shutdown doesn't wait until the go routine launched
to handle WAL terminates. This makes the shutdown non-deterministic and
cause the test failure issues linked below.

Although a bug on the component it doesn't seem to deserve a changelog
since it was detected via tests and there is no open issue around the
shutdown of the component.

#### Link to tracking issue

Fixes #9124, #37715

#### Testing

Ran the component tests locally multiple times on Windows (more likely
to surface concurrency issues).

#### Documentation

N/A
chengchuanpeng pushed a commit to chengchuanpeng/opentelemetry-collector-contrib that referenced this issue Feb 8, 2025
chengchuanpeng pushed a commit to chengchuanpeng/opentelemetry-collector-contrib that referenced this issue Feb 8, 2025
…ne exits (open-telemetry#37733)

#### Description

When WAL is used the shutdown doesn't wait until the go routine launched
to handle WAL terminates. This makes the shutdown non-deterministic and
cause the test failure issues linked below.

Although a bug on the component it doesn't seem to deserve a changelog
since it was detected via tests and there is no open issue around the
shutdown of the component.

#### Link to tracking issue

Fixes open-telemetry#9124, open-telemetry#37715

#### Testing

Ran the component tests locally multiple times on Windows (more likely
to surface concurrency issues).

#### Documentation

N/A
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants