Spatial aggregation for async instruments with filtering views #7264

Open
wants to merge 5 commits into
base: main

@fandreuz fandreuz commented Apr 9, 2025

Fixes #4901

I propose the following changes:

  • AsynchronousMetricStorage now declares void record(Attributes, long) and void record(Attributes, double) instead of a single void record(Measurement), to enable using AggregatorHandle within the class.
  • AsynchronousMetricStorage now contains two long fields (startEpochNanos, epochNanos) which are injected before running the callbacks, as an alternative to passing a Measurement object.
  • Some tests had to change to match the new interface of AsynchronousMetricStorage.
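
To make the second bullet concrete, here is a minimal sketch of injecting the epoch information once and then recording plain values. It is not the SDK code: `EpochInjectionSketch` and `Point` are hypothetical names, and `String` stands in for `Attributes`. Only `record(...)`, `setEpochInformation(...)`, `startEpochNanos` and `epochNanos` come from this pull request.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, not the SDK class: String replaces Attributes, and Point is invented
// here only to make the effect of the injected timestamps visible.
final class EpochInjectionSketch {
  static final class Point {
    final String attributes;
    final Number value;
    final long startEpochNanos;
    final long epochNanos;

    Point(String attributes, Number value, long startEpochNanos, long epochNanos) {
      this.attributes = attributes;
      this.value = value;
      this.startEpochNanos = startEpochNanos;
      this.epochNanos = epochNanos;
    }
  }

  private long startEpochNanos;
  private long epochNanos;
  private final List<Point> points = new ArrayList<>();

  // Called once before the callbacks run, so record(...) needs no Measurement argument.
  void setEpochInformation(long startEpochNanos, long epochNanos) {
    this.startEpochNanos = startEpochNanos;
    this.epochNanos = epochNanos;
  }

  // Each recorded value picks up the timestamps that were injected beforehand.
  void record(String attributes, long value) {
    points.add(new Point(attributes, value, startEpochNanos, epochNanos));
  }

  void record(String attributes, double value) {
    points.add(new Point(attributes, value, startEpochNanos, epochNanos));
  }

  List<Point> points() {
    return points;
  }
}
```

The trade-off in this shape is that `record(...)` relies on `setEpochInformation(...)` having been called first, so the two calls are coupled by ordering rather than by a single argument.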

@fandreuz fandreuz requested a review from a team as a code owner April 9, 2025 22:51

linux-foundation-easycla bot commented Apr 9, 2025

CLA Not Signed

fandreuz (Author) commented Apr 9, 2025

I'd like to discuss the test here. I would expect a cumulative sum instrument to aggregate all occurrences with empty attributes. Thus, I would expect to see 6 here instead of 3.

To preserve this behavior, I clear the values for all the attributes at every collection, which is why AsynchronousMetricStorageTest#collect_reusableData_reusedObjectsAreReturnedOnSecondCall is failing. I'd like to clarify this aspect before fixing it.

jack-berg (Member) commented Apr 14, 2025

> I'd like to discuss the test here. I would expect a cumulative sum instrument to aggregate all occurrences with empty attributes. Thus, I would expect to see 6 here instead of 3.

Yes, I believe you're correct. The test / logic appear to be incorrect currently.

Oh I forgot about a fundamental detail of asynchronous instruments: conceptually, when you make an observation with a cumulative instrument, we say that you are observing a cumulative value. Thus, if we have a reader with cumulative temporality which reads twice and invokes the callbacks twice, and the callback records the same value 3 each time, the reported value should indeed be 3 and not 6.

Consider it from the perspective of this instrumentation which records information about classes loaded. If in sequential reads the callback records that 100 classes have been loaded, the output of a cumulative reader should always be 100, not N*100 (where N is the number of times the callback has been invoked). Instrumentation is not responsible for managing state and recording only the delta.
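
A minimal sketch of that distinction, assuming a view filter has already reduced every attribute set to the empty string and that the callback makes two observations (1 and 2) per invocation. `CumulativeObservationSketch` and `Observation` are hypothetical names, not SDK types. Observations within one collection are summed with each other (spatial aggregation), but nothing is carried over between collections because each observed value is already cumulative.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an asynchronous cumulative sum; not the SDK's storage class.
final class CumulativeObservationSketch {
  // One observation made by a callback: the attributes left after the view filter, plus a value.
  static final class Observation {
    final String filteredAttributes;
    final long value;

    Observation(String filteredAttributes, long value) {
      this.filteredAttributes = filteredAttributes;
      this.value = value;
    }
  }

  // Runs one collection: observations that share filtered attributes are summed together.
  // A fresh map is used on purpose, so no state from an earlier collection is added in.
  static Map<String, Long> collect(List<Observation> observations) {
    Map<String, Long> points = new HashMap<>();
    for (Observation o : observations) {
      points.merge(o.filteredAttributes, o.value, Long::sum);
    }
    return points;
  }
}
```

Calling `collect` twice with the same two observations yields 3 for the empty attributes both times, rather than 3 and then 6.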

jack-berg (Member) left a comment

Thanks for working on this. This is pretty complex code when you consider the combinations of cumulative + delta, reusable + immutable data mode, and our desire to maintain zero memory allocations with reusable data mode after the application reaches steady state.

So reviews need to be slow / careful, but I am definitely interested in getting this issue resolved!

  for (AsynchronousMetricStorage<?, ?> storage : storages) {
    if (storage.getRegisteredReader().equals(activeReader)) {
-     storage.record(measurement);
+     storage.setEpochInformation(startEpochNanos, epochNanos);
+     storage.record(attributes, value);

Can we combine these into a single operation?

return MetricStorage.CARDINALITY_OVERFLOW;
}

if (aggregatorHandles.containsKey(

I don't think this log message makes sense anymore. Previously we knew that the instrument itself had recorded multiple values for the same attributes. Now we have no way of telling whether the instrumentation recorded the same attributes, or whether the attributes only became the same after applying the view attribute filter.

We could continue this logging by maintaining state that tracks the unique original unfiltered attributes recorded, but that seems too expensive.
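
For illustration only, a sketch of what that extra state might look like, assuming attributes are modeled as a plain `Map<String, String>`. `DuplicateTrackingSketch` and its methods are hypothetical and not part of the SDK. The point is that a second collection of every unfiltered attribute set has to be kept and hashed on each recording, purely to support a log message.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: remembers every unfiltered attribute set seen in the current collection.
final class DuplicateTrackingSketch {
  private final Set<Map<String, String>> seenUnfiltered = new HashSet<>();

  // Returns true only when the instrumentation itself recorded the same unfiltered attributes
  // twice, as opposed to two different sets that collapse to one after the view filter.
  boolean isDuplicate(Map<String, String> unfilteredAttributes) {
    return !seenUnfiltered.add(unfilteredAttributes);
  }

  // Must be called at the start of every collection, otherwise repeated collections would
  // report every attribute set as a duplicate.
  void reset() {
    seenUnfiltered.clear();
  }
}
```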

} else {
lastPoints = new HashMap<>();
points = new HashMap<>();

It looks like you've replaced the reused points hash map with a local variable Map<Attributes, T> currentPoints = new HashMap<>(); in the collect method.

We've got to reuse resources to keep achieving our goal of not allocating any additional memory once the SDK reaches steady state. Details available here: https://opentelemetry.io/blog/2024/java-metric-systems-compared/#opentelemetry-java-metrics
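
A rough sketch of the reuse pattern being asked for, with hypothetical names (`ReusablePointsSketch`, `collect`) and `String`/`Long` standing in for the SDK's attribute and point types. It only shows holding the map in a field and clearing it, instead of creating a new `HashMap` per collection. The map's entries and the boxed `Long` values still allocate here, so a truly allocation-free version would need to pool the point objects as well.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: keeps one map for the lifetime of the storage and reuses it.
final class ReusablePointsSketch {
  private final Map<String, Long> points = new HashMap<>();

  // Refills the same map on every collection rather than allocating a new one.
  // The returned map is overwritten by the next call, so callers must not hold on to it.
  Map<String, Long> collect(Map<String, Long> observed) {
    points.clear();
    points.putAll(observed);
    return points;
  }
}
```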

- assertThat(passedMeasurement.longValue()).isEqualTo(5);
- assertThat(passedMeasurement.startEpochNanos()).isEqualTo(0);
- assertThat(passedMeasurement.epochNanos()).isEqualTo(10);
+ verify(mockAsyncStorage1).record(Attributes.empty(), 5);

Let's retain the asserts to verify the epoch information was set. This applies to other tests as well.

Suggested change
- verify(mockAsyncStorage1).record(Attributes.empty(), 5);
+ verify(mockAsyncStorage1).setEpochInformation(0, 10);
+ verify(mockAsyncStorage1).record(Attributes.empty(), 5);

Co-authored-by: jack-berg <[email protected]>
Successfully merging this pull request may close these issues.

Async instruments don't do spatial reaggregation when attributes are dropped
2 participants