Spatial aggregation for async instruments with filtering views #7264

fandreuz · 2025-04-09T22:51:48Z

I propose the following changes:

AsynchronousMetricStorage now declares void record(Attributes, long) and void record(Attributes, double) instead of a single void record(Measurement), to enable using AggregatorHandle within the class.
AsynchronousMetricStorage now contains two long fields (startEpochNanos, epochNanos) which are injected before running the callbacks, as an alternative to passing a Measurement object.
Some tests had to change because the interface of AsynchronousMetricStorage changed.

linux-foundation-easycla · 2025-04-09T22:51:55Z

❌ - login: @fandreuz / name: Francesco Andreuzzi . The commit (ab8bd4d, 0c63775, e942dc2, ed783f9, 6816f91) is not authorized under a signed CLA. Please click here to be authorized. For further assistance with EasyCLA, please submit a support request ticket.

fandreuz · 2025-04-09T23:06:21Z

I'd like to discuss the test here. I would expect a cumulative sum instrument to aggregate all occurrences with empty attributes. Thus, I would expect to see 6 here instead of 3.

To preserve this behavior, I clear the values for all the attributes at every collection, which is why AsynchronousMetricStorageTest#collect_reusableData_reusedObjectsAreReturnedOnSecondCall is failing. I'd like to clarify this aspect before fixing it.

jack-berg · 2025-04-14T23:10:57Z

> I'd like to discuss the test here. I would expect a cumulative sum instrument to aggregate all occurrences with empty attributes. Thus, I would expect to see 6 here instead of 3.

~~Yes, I believe you're correct. The test / logic appear to be incorrect currently.~~

Oh I forgot about a fundamental detail of asynchronous instruments: conceptually, when you make an observation with a cumulative instrument, we say that you are observing a cumulative value. Thus, if we have a reader with cumulative temporality which reads twice and invokes the callbacks twice, and the callback records the same value 3 each time, the reported value should indeed be 3 and not 6.

Consider it from the perspective of this instrumentation which records information about classes loaded. If in sequential reads the callback records that 100 classes have been loaded, the output of a cumulative reader should always be 100, not N*100 (where N is the number of times the callback has been invoked). Instrumentation is not responsible for managing state and recording only the delta.
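
To make the semantics above concrete, here is a minimal sketch of a toy storage. It is not the SDK's AsynchronousMetricStorage: the class name, the method names, and the String stand-in for Attributes are all assumptions for illustration. Each observation is treated as an already-cumulative value, so it replaces the previous one instead of being added to it:

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of an asynchronous cumulative sum (hypothetical, not SDK code). */
class CumulativeObservationSketch {
  // Last observed value per attribute set; attributes are simplified to a String key.
  private final Map<String, Long> observed = new HashMap<>();

  /** Called by the callback: the value is already cumulative, so it overwrites. */
  void record(String attributes, long value) {
    observed.put(attributes, value);
  }

  /** A cumulative reader reports the last observed value unchanged. */
  long collect(String attributes) {
    return observed.getOrDefault(attributes, 0L);
  }

  public static void main(String[] args) {
    CumulativeObservationSketch storage = new CumulativeObservationSketch();
    for (int read = 0; read < 2; read++) {
      storage.record("", 3); // the callback observes 3 on every read
      System.out.println(storage.collect(""));
    }
  }
}
```

The sketch only covers behavior across collections. It leaves out what happens when several observations in the same collection end up with the same filtered attributes, which is the case this PR is about.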

jack-berg

Thanks for working on this. This is pretty complex code when you consider the combinations of cumulative + delta, reusable + immutable data mode, and our desire to maintain zero memory allocations with reusable data mode after the application reaches steady state.

So reviews need to be slow / careful, but I am definitely interested in getting this issue resolved!

jack-berg · 2025-04-14T22:22:06Z

...rics/src/main/java/io/opentelemetry/sdk/metrics/internal/state/SdkObservableMeasurement.java

    for (AsynchronousMetricStorage<?, ?> storage : storages) {
      if (storage.getRegisteredReader().equals(activeReader)) {
-        storage.record(measurement);
+        storage.setEpochInformation(startEpochNanos, epochNanos);
+        storage.record(attributes, value);


Can we combine these into a single operation?

jack-berg · 2025-04-14T22:42:05Z

...ics/src/main/java/io/opentelemetry/sdk/metrics/internal/state/AsynchronousMetricStorage.java

+      return MetricStorage.CARDINALITY_OVERFLOW;
+    }
+
+    if (aggregatorHandles.containsKey(


I don't think this log message makes sense anymore: Previously we knew that the instrument itself had recorded multiple values for the same attributes. Now we have no way of telling whether it was the instrumentation that recorded the same attributes, or whether the attributes became the same after applying the view attribute filter.

We could continue this logging by maintaining state that tracks the unique original unfiltered attributes recorded, but that seems too expensive.
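
A rough sketch of why the two cases are indistinguishable once the filter has run. Everything here is hypothetical (the class, the Map-based attributes, the key-allowlist filter) and it sums colliding values only to illustrate the idea, not to mirror the SDK's aggregation code:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Toy storage that applies an attribute filter before aggregating (hypothetical). */
class FilteringStorageSketch {
  private final Set<String> allowedKeys;
  private final Map<Map<String, String>, Long> points = new HashMap<>();

  FilteringStorageSketch(Set<String> allowedKeys) {
    this.allowedKeys = allowedKeys;
  }

  /** Drops attribute keys the view does not allow, then merges values that collide. */
  void record(Map<String, String> attributes, long value) {
    Map<String, String> filtered = new TreeMap<>(attributes);
    filtered.keySet().retainAll(allowedKeys);
    // Only the filtered attributes are visible from here on; the originals are gone.
    points.merge(filtered, value, Long::sum);
  }

  Map<Map<String, String>, Long> points() {
    return points;
  }
}
```

With an allowlist of `host`, recording `{host=a, cpu=0}` and `{host=a, cpu=1}` leaves the storage in the same state as recording `{host=a}` twice, so a "duplicate attributes" warning at this point could not tell the two apart without also keeping the unfiltered attributes.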

jack-berg · 2025-04-14T22:56:06Z

...ics/src/main/java/io/opentelemetry/sdk/metrics/internal/state/AsynchronousMetricStorage.java

    } else {
-      lastPoints = new HashMap<>();
-      points = new HashMap<>();


It looks like you've replaced the reused points hash map with a local variable Map<Attributes, T> currentPoints = new HashMap<>(); in the collect method.

We've got to reuse resources to continue to achieve our goal of not allocating any additional memory once the SDK reaches steady state. Details available here: https://opentelemetry.io/blog/2024/java-metric-systems-compared/#opentelemetry-java-metrics
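
As a sketch of the reuse pattern (hypothetical class and names, with String keys standing in for Attributes), the two maps can be allocated once as fields and swapped on each collection, rather than creating a new HashMap inside collect:

```java
import java.util.HashMap;
import java.util.Map;

/** Toy illustration of swapping two long-lived maps instead of allocating per collection. */
class ReusablePointsSketch {
  // Both maps are allocated once and reused for the lifetime of the storage.
  private Map<String, Long> points = new HashMap<>();
  private Map<String, Long> lastPoints = new HashMap<>();

  void record(String attributes, long value) {
    points.put(attributes, value);
  }

  /**
   * Returns the points recorded since the previous call. The returned map is
   * cleared and reused two collections later, so callers must not hold on to it.
   */
  Map<String, Long> collect() {
    Map<String, Long> result = points;
    points = lastPoints; // reuse the older map for the next round of recordings
    lastPoints = result;
    points.clear();
    return result;
  }
}
```

This only shows the map swap. A plain HashMap still allocates an entry node per insert, so reaching zero allocations at steady state would need more than this sketch does.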


jack-berg · 2025-04-14T23:07:35Z

.../src/test/java/io/opentelemetry/sdk/metrics/internal/state/SdkObservableMeasurementTest.java

-      assertThat(passedMeasurement.longValue()).isEqualTo(5);
-      assertThat(passedMeasurement.startEpochNanos()).isEqualTo(0);
-      assertThat(passedMeasurement.epochNanos()).isEqualTo(10);
+      verify(mockAsyncStorage1).record(Attributes.empty(), 5);


Let's retain the asserts to verify the epoch information was set. Applies to other tests as well.

Suggested change

-      verify(mockAsyncStorage1).record(Attributes.empty(), 5);
+      verify(mockAsyncStorage1).setEpochInformation(0, 10);
+      verify(mockAsyncStorage1).record(Attributes.empty(), 5);

Co-authored-by: jack-berg <[email protected]>

fandreuz added 4 commits April 9, 2025 00:55

wip

6816f91

fix tests

ed783f9

keep ref value locally

0c63775

comments

ab8bd4d

fandreuz requested a review from a team as a code owner April 9, 2025 22:51

jack-berg reviewed Apr 14, 2025

review comment

e942dc2
