Track queued merges in ElasticsearchMergeScheduler and InternalEngine #121794

Closed
wants to merge 1 commit into from

@fcofdez (Contributor) commented Feb 5, 2025

This commit adds tracking for merges that are queued for future execution.

Relates ES-10570

@fcofdez added the >non-issue, :Distributed Indexing/Engine, and Team:Distributed Indexing labels Feb 5, 2025
@elasticsearchmachine added the serverless-linked and v9.1.0 labels Feb 5, 2025
@elasticsearchmachine (Collaborator)

Pinging @elastic/es-distributed-indexing (Team:Distributed Indexing)

@fcofdez requested review from tlrx and arteam February 5, 2025 16:57
@arteam (Contributor) left a comment

LGTM

@@ -41,6 +41,9 @@ public class MergeTracking {
    private final Set<OnGoingMerge> onGoingMerges = ConcurrentCollections.newConcurrentSet();
    private final Set<OnGoingMerge> readOnlyOnGoingMerges = Collections.unmodifiableSet(onGoingMerges);

    private final Set<OnGoingMerge> queuedMerges = ConcurrentCollections.newConcurrentSet();
    private final Set<OnGoingMerge> readOnlyQueuedMerges = Collections.unmodifiableSet(queuedMerges);
Contributor

That's an interesting idea! I guess we can piggyback on the fact that UnmodifiableCollection is just a shell around the underlying set, so changes to queuedMerges are guaranteed to be visible via the readOnlyQueuedMerges reference.
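
To illustrate the point about the unmodifiable wrapper, here is a minimal, self-contained sketch (not code from this PR). It assumes `ConcurrentHashMap.newKeySet()` as a stand-in for `ConcurrentCollections.newConcurrentSet()`, uses plain strings instead of `OnGoingMerge`, and the class and method names are hypothetical. `Collections.unmodifiableSet` returns a read-only view, so additions to the backing set show up through the wrapper while writes through the wrapper are rejected.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; names are illustrative only.
public class UnmodifiableViewDemo {
    // Assumption: a JDK concurrent set standing in for ConcurrentCollections.newConcurrentSet().
    private final Set<String> queued = ConcurrentHashMap.newKeySet();
    // Read-only view that delegates every read to the backing set above.
    private final Set<String> readOnlyQueued = Collections.unmodifiableSet(queued);

    public Set<String> queuedView() {
        return readOnlyQueued;
    }

    public void enqueue(String mergeId) {
        queued.add(mergeId);
    }

    public void dequeue(String mergeId) {
        queued.remove(mergeId);
    }

    public static void main(String[] args) {
        UnmodifiableViewDemo demo = new UnmodifiableViewDemo();
        Set<String> view = demo.queuedView(); // obtained before any merge is queued

        demo.enqueue("merge-1");
        System.out.println(view.contains("merge-1")); // true: the view reads through

        try {
            view.add("merge-2"); // writes through the view are rejected
        } catch (UnsupportedOperationException e) {
            System.out.println("view is read-only");
        }
    }
}
```

Because the view is created once and only delegates, callers can hold on to it without the tracker allocating a copy on every call.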

        return readOnlyQueuedMerges;
    }

    public void markMergeQueued(OnGoingMerge merge) {
Contributor

I do see MergeTracking being used in stateless but also in stateful (InternalEngine). So I wonder why we do not call this function also for the stateful merges? An oversight? Or is it because merges cannot be queued in stateful and they just immediately execute?

If we do not mark queued merges in core ES, at least we should document it in the javadoc of markMergeQueued, hasQueuedOrOnGoingMerges() and hasQueuedMerges(), so at least people are not misled to use the functions in stateful without understanding that queued merges are not there.

Contributor Author

> Or is it because merges cannot be queued in stateful and they just immediately execute?

In stateful, merges get executed immediately, each in its own new thread:

    @Override
    protected MergeThread getMergeThread(MergeSource mergeSource, MergePolicy.OneMerge merge) throws IOException {
        MergeThread thread = super.getMergeThread(mergeSource, merge);
        thread.setName(
            EsExecutors.threadName(indexSettings, "[" + shardId.getIndexName() + "][" + shardId.id() + "]: " + thread.getName())
        );
        return thread;
    }

> If we do not mark queued merges in core ES, at least we should document it in the javadoc of markMergeQueued, hasQueuedOrOnGoingMerges() and hasQueuedMerges(), so at least people are not misled to use the functions in stateful without understanding that queued merges are not there.

Sure, I'll add a javadoc and mention that it's implementation-dependent 👍

Contributor

Oh OK thanks for explaining! Yea javadoc to maybe explain what you just said would be nice so that people know in stateful there are no queued merges.
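
For illustration only, the javadoc discussed in this thread might look something like the sketch below. This is not the code from the PR: the class name `MergeTrackingSketch`, the string merge ids, and the `markMergeStarted` / `markMergeFinished` methods are hypothetical, and `ConcurrentHashMap.newKeySet()` stands in for `ConcurrentCollections.newConcurrentSet()`. Only `markMergeQueued`, `hasQueuedMerges` and `hasQueuedOrOnGoingMerges` are names taken from the discussion.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified tracker; merges are identified by plain strings here.
public class MergeTrackingSketch {
    private final Set<String> onGoingMerges = ConcurrentHashMap.newKeySet();
    private final Set<String> queuedMerges = ConcurrentHashMap.newKeySet();

    /**
     * Records a merge that has been scheduled but has not started yet.
     * Whether this is ever called is implementation-dependent: a scheduler that
     * starts every merge immediately on its own thread never queues merges,
     * so it will never report any queued merge.
     */
    public void markMergeQueued(String mergeId) {
        queuedMerges.add(mergeId);
    }

    /** Hypothetical: moves a merge from queued (if it was queued) to on-going. */
    public void markMergeStarted(String mergeId) {
        // Add to on-going first so the merge is never absent from both sets.
        onGoingMerges.add(mergeId);
        queuedMerges.remove(mergeId);
    }

    /** Hypothetical: removes a merge once it has completed. */
    public void markMergeFinished(String mergeId) {
        onGoingMerges.remove(mergeId);
    }

    /**
     * @return true if at least one merge is queued. Always false for
     *         schedulers that never call {@link #markMergeQueued(String)}.
     */
    public boolean hasQueuedMerges() {
        return queuedMerges.isEmpty() == false;
    }

    /**
     * @return true if any merge is queued or running. For schedulers that
     *         never queue merges this only reflects running merges.
     */
    public boolean hasQueuedOrOnGoingMerges() {
        return hasQueuedMerges() || onGoingMerges.isEmpty() == false;
    }
}
```

The javadoc puts the caveat on the methods themselves, so a caller in stateful code sees that queued merges may never be reported.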


@fcofdez (Contributor Author) commented Feb 6, 2025

I'm closing this in favour of a simpler solution.

@fcofdez closed this Feb 6, 2025
Labels: :Distributed Indexing/Engine, >non-issue, serverless-linked, Team:Distributed Indexing, v9.1.0
4 participants