[receiver/filelog] Fix issue where flushed tokens could be truncated #37596

Draft: wants to merge 2 commits into base: main

djaglowski (Member) commented Jan 30, 2025

Fixes #35042 (and #32100 again)

The issue affected unterminated logs of particular lengths: specifically, those longer than our internal scanner.DefaultBufferSize (16kB) and shorter than max_log_size.

The failure mode was described in #32100 but was apparently only fixed in some circumstances. I believe this is a more robust fix. I'll articulate the exact failure mode again here:

  1. During a poll cycle, reader.ReadToEnd is called. Within this, a scanner is created which starts with a default buffer size. The buffer is filled, but no terminator is found. Therefore the scanner resizes the buffer to accommodate more data, hoping to find a terminator. Eventually, the buffer is large enough to contain all content until EOF, but still no terminator was found. At this time, the flush timer has not expired, so reader.ReadToEnd returns without emitting anything.
  2. During the next poll cycle, reader.ReadToEnd creates a new scanner, also with default buffer size. The first time it looks for a terminator, it of course doesn't find one, but at this time the flush timer has expired. Therefore, instead of resizing the buffer and continuing to look for a terminator, it just emits what it has.
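
To make the two steps above concrete, here is a minimal sketch using only the standard library's bufio.Scanner. It is not the receiver's code: `pollTokenLens`, the 16-byte buffer (standing in for the 16kB default), the 1024-byte limit (standing in for max_log_size), and the boolean `flushExpired` (standing in for the flush timer) are all assumptions made for illustration.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"strings"
)

// pollTokenLens is a hypothetical helper that simulates one poll cycle over
// contentLen bytes of unterminated data. It returns the length of each token
// the scanner emits. flushExpired stands in for the real flush timer.
func pollTokenLens(contentLen, bufSize int, flushExpired bool) []int {
	sc := bufio.NewScanner(strings.NewReader(strings.Repeat("a", contentLen)))
	// Start with a small buffer; 1024 stands in for max_log_size.
	sc.Buffer(make([]byte, 0, bufSize), 1024)
	sc.Split(func(data []byte, atEOF bool) (int, []byte, error) {
		if i := bytes.IndexByte(data, '\n'); i >= 0 {
			return i + 1, data[:i], nil // terminated line
		}
		if flushExpired && len(data) > 0 {
			// Timer expired: emit whatever is buffered right now,
			// even if the buffer has not yet grown to hold everything.
			return len(data), data, nil
		}
		return 0, nil, nil // no terminator: ask the scanner for more data
	})
	var lens []int
	for sc.Scan() {
		lens = append(lens, len(sc.Bytes()))
	}
	return lens
}

func main() {
	// Cycle 1: timer not expired, buffer grows to EOF, nothing is emitted.
	fmt.Println(pollTokenLens(40, 16, false))
	// Cycle 2: timer expired, the first token is cut at the initial buffer size.
	fmt.Println(pollTokenLens(40, 16, true))
}
```

In the second call the 40-byte unterminated entry comes out in several pieces, the first of which is only as long as the initial buffer, which is the truncation described above.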

What should happen instead is that the scanner continues to resize the buffer to find as much of the unterminated token as possible before emitting it. Therefore, this fix introduces a simple layer into the split func stack which allows us to reason about unterminated tokens more carefully. It captures the length of unterminated tokens and ensures that when we recreate a scanner, we start with a buffer size large enough to read the same content as last time, plus one additional byte. The extra byte allows us to check whether new content has been added, in which case we resume resizing. If no new content is found, the flusher emits the entire unterminated token as a single token.
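
The idea can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code in this PR: `lenState`, `bufSize`, and `poll` are hypothetical names, the 16-byte default and 1024-byte limit are scaled-down stand-ins for the real sizes, and the flush timer is reduced to a boolean that is cleared whenever new data is seen.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"strings"
)

const defaultBufSize = 16 // stand-in for scanner.DefaultBufferSize

// lenState is a hypothetical piece of state kept between poll cycles.
type lenState struct {
	unterminated int // length of the unterminated token seen so far
}

// bufSize returns the initial buffer size for the next scanner: enough for
// the content seen last time plus one byte to detect newly appended data.
func (s *lenState) bufSize() int {
	if s.unterminated+1 > defaultBufSize {
		return s.unterminated + 1
	}
	return defaultBufSize
}

// poll simulates one poll cycle over contentLen bytes of unterminated data
// and returns the length of each emitted token.
func poll(contentLen int, s *lenState, flushExpired bool) []int {
	flush := flushExpired
	sc := bufio.NewScanner(strings.NewReader(strings.Repeat("a", contentLen)))
	sc.Buffer(make([]byte, 0, s.bufSize()), 1024) // 1024 stands in for max_log_size
	sc.Split(func(data []byte, atEOF bool) (int, []byte, error) {
		if i := bytes.IndexByte(data, '\n'); i >= 0 {
			s.unterminated = 0
			return i + 1, data[:i], nil
		}
		if len(data) > s.unterminated {
			// More data than we knew about: remember it, postpone the
			// flush, and let the scanner keep resizing.
			s.unterminated = len(data)
			flush = false
			return 0, nil, nil
		}
		if flush && len(data) > 0 {
			s.unterminated = 0
			return len(data), data, nil // emit the whole unterminated token
		}
		return 0, nil, nil
	})
	var lens []int
	for sc.Scan() {
		lens = append(lens, len(sc.Bytes()))
	}
	return lens
}

func main() {
	s := &lenState{}
	fmt.Println(poll(40, s, false), s.unterminated) // cycle 1: nothing emitted, length remembered
	fmt.Println(poll(40, s, true))                  // cycle 2: buffer starts at 41 bytes, one whole token
}
```

If the file had grown between cycles, the extra byte would make the first read longer than the remembered length, so this sketch would postpone the flush and keep growing the buffer rather than emit a partial token.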

djaglowski (Member, Author) commented:

Depends on #37596

Successfully merging this pull request may close these issues.

Long lines are unexpectedly split into multiple OTEL Log records