[RFC] Introduce Pre-copy Merged Segment into Segment Replication #17528

Open
guojialiang92 opened this issue Mar 6, 2025 · 1 comment
Labels: enhancement, Other, untriaged

guojialiang92 commented Mar 6, 2025

Is your feature request related to a problem? Please describe

Segment replication copies the segments held by the primary shard to the replica shards, which spares each replica the cost of building segments itself. One cost of this approach compared with document replication is an added visibility delay between the primary and the replica: the replica must wait for segment replication to complete before the docs in a segment become searchable.
Currently, the segment replication process carries two kinds of segments: those generated by merges and those built by refresh for incremental indexing. Assuming a merged segment is 1GB and the transmission bandwidth is 50MB/s, a segment replication round that includes this segment takes at least 20s. With larger merged segments, or with more shards sharing the transmission bandwidth, the round takes even longer.
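The 20s figure is a simple lower bound, and the arithmetic can be sketched as follows. This is only an illustration: the function name is hypothetical, 1GB is treated as 1000MB, and bandwidth is assumed to be split evenly across the shards that are transferring at the same time.

```python
def min_transfer_seconds(segment_mb, bandwidth_mb_per_s, concurrent_shards=1):
    """Lower bound on copy time, assuming bandwidth is shared evenly (an assumption)."""
    if bandwidth_mb_per_s <= 0 or concurrent_shards < 1:
        raise ValueError("bandwidth must be positive and concurrent_shards >= 1")
    per_shard_bandwidth = bandwidth_mb_per_s / concurrent_shards
    return segment_mb / per_shard_bandwidth


# The example from the text: a 1GB merged segment at 50MB/s.
single = min_transfer_seconds(1000, 50)
# The same segment when 4 shards share the link.
shared = min_transfer_seconds(1000, 50, concurrent_shards=4)
# A larger, 5GB merged segment.
larger = min_transfer_seconds(5000, 50)
print(single, shared, larger)
```

Every refresh-built segment that travels in the same round waits behind this transfer, which is the delay the proposal aims to remove.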

Describe the solution you'd like

Introduction

This RFC introduces an optimization to segment replication that uses Lucene's IndexWriter.IndexReaderWarmer to pre-copy merged segments to the replica. It can effectively reduce the delay before documents become visible for search on the replica.

Background

Lucene supports extending IndexWriter.IndexReaderWarmer. After a merge has generated its segment files, IndexWriter.IndexReaderWarmer is called. The merged segment cannot be searched until IndexWriter.IndexReaderWarmer completes. If segment replication is enabled, the merged segment only takes part in the segment replication process after IndexWriter.IndexReaderWarmer completes.

Proposed Solution

To make the proposal easier to follow, let's first explain the current segment replication process. segment(_3.si) is produced by merging segment(_1.si) and segment(_2.si), and segment(_4.si) is produced by refresh. During segment replication, both are replicated to the replica together. If segment(_3.si) is large, replication takes a long time, and the docs contained in segment(_4.si) stay invisible on the replica for that whole time.

Image

After introducing the Pre-copy Merged Segment, the primary will pre-copy segment(_3.si) to the replica.
There are two cases to consider.
In the first case, pre-copy completes before segment replication. After refresh, segment(_3.si) and segment(_4.si) are copied to the replica through segment replication. Because the replica already holds the files of segment(_3.si), these files are reused during segment replication without any network transmission.

Image

In the second case, pre-copy completes after segment replication. After refresh, segment(_3.si) is still not visible on the primary, so only segment(_4.si) is copied to the replica through segment replication.

Image

In this case, the segment replication process does not include segment(_3.si), which reduces the time overhead of segment replication.
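Both cases come down to which files a replication round still has to send over the network. The sketch below illustrates that with plain dictionaries. It is a simplified model, not OpenSearch code: the helper name, the `.cfs` file names, and the sizes are all hypothetical, and comparing sizes stands in for whatever file metadata the real diff compares.

```python
MB = 1_000_000


def files_to_fetch(checkpoint_files, replica_files):
    """Names in the checkpoint that the replica lacks or holds with a different size."""
    return sorted(
        name
        for name, size in checkpoint_files.items()
        if replica_files.get(name) != size
    )


# Hypothetical file sets: name -> size in bytes.
old = {"_1.si": 1_000, "_1.cfs": 500 * MB, "_2.si": 1_000, "_2.cfs": 500 * MB}
merged = {"_3.si": 1_000, "_3.cfs": 1_000 * MB}
refreshed = {"_4.si": 1_000, "_4.cfs": 5 * MB}

# Current behavior: the replica has never seen segment _3.
baseline = files_to_fetch({**merged, **refreshed}, old)

# Case 1: pre-copy finished first, so the replica already holds segment _3.
case_1 = files_to_fetch({**merged, **refreshed}, {**old, **merged})

# Case 2: segment _3 is not yet visible on the primary, so it is not in the checkpoint.
case_2 = files_to_fetch({**old, **refreshed}, old)

print(baseline, case_1, case_2)
```

In this model the baseline round must transfer the files of both segment _3 and segment _4, while both pre-copy cases transfer only the small files of segment _4.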

Implementation Approaches

Extend IndexReaderWarmer#warm. After the merge generates its files, proceed as follows:

  1. The primary dispatches the merged segment information to all replica nodes.
  2. Each replica pulls the required files from the primary.
  3. After a replica receives all files, or an exception occurs, it returns a response to the primary.
  4. After the primary receives the responses of all replicas, or times out, it completes the pre-copy merged segment process.
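The four steps above can be sketched as a small simulation. This is an illustration under assumptions, not the proposed implementation: `Replica`, `pull`, and `pre_copy_merged_segment` are hypothetical names, threads stand in for network calls, and the timeout value is arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


class Replica:
    """Hypothetical stand-in for a replica shard; not an OpenSearch class."""

    def __init__(self, delay_s=0.0, fail=False):
        self.delay_s = delay_s
        self.fail = fail
        self.files = set()

    def pull(self, merged_files):
        # Step 2: the replica pulls the required files from the primary.
        time.sleep(self.delay_s)
        if self.fail:
            raise IOError("simulated copy failure")
        self.files.update(merged_files)


def pre_copy_merged_segment(merged_files, replicas, timeout_s):
    """Return {replica name: True if it confirmed all files before the timeout}."""
    pool = ThreadPoolExecutor(max_workers=max(1, len(replicas)))
    # Step 1: dispatch the merged segment information to every replica.
    futures = {
        pool.submit(replica.pull, merged_files): name
        for name, replica in replicas.items()
    }
    # Step 4: wait until every replica has responded or the timeout expires.
    done, not_done = wait(futures, timeout=timeout_s)
    # Step 3: a finished future is the replica's response (success or exception).
    acked = {futures[f]: f.exception() is None for f in done}
    acked.update({futures[f]: False for f in not_done})
    pool.shutdown(wait=False)
    return acked


replicas = {
    "r1": Replica(),
    "r2": Replica(fail=True),
    "r3": Replica(delay_s=1.0),
}
acked = pre_copy_merged_segment(["_3.si", "_3.cfs"], replicas, timeout_s=0.2)
# Replicas that failed or timed out receive the segment through regular replication.
needs_fallback = sorted(name for name, ok in acked.items() if not ok)
print(acked, needs_fallback)
```

The primary never blocks longer than the timeout, and a slow or failing replica only loses the optimization for that one merged segment.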

Image

Failover

In the worst case, the pre-copy merged segment process hits an exception or times out and falls back to the current behavior: the merged segment is copied to the replica through segment replication.

Related component

Other

Describe alternatives you've considered

No response

Additional context

No response

shwetathareja (Member) commented

@guojialiang92: If I understand your proposal correctly, you are suggesting that the files of the merged segment _3.si be copied to the replicas first, without yet updating the checkpoint to track _3.si.

Also, separately, the transfer of _4.si can be prioritized over the _3.si pre-copy so that the latest segments become available for search sooner.
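One way to picture that prioritization is a transfer queue with two priority classes. The sketch below is purely illustrative: `TransferQueue`, the priority constants, and the `.cfs` file names are hypothetical, and nothing here reflects an existing OpenSearch component.

```python
import heapq
from itertools import count

# Lower number = served first (hypothetical priority classes).
REPLICATION, PRE_COPY = 0, 1


class TransferQueue:
    """Serves regular segment replication files before pre-copy files."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker that keeps FIFO order within a class

    def submit(self, priority, file_name):
        heapq.heappush(self._heap, (priority, next(self._seq), file_name))

    def drain(self):
        order = []
        while self._heap:
            _, _, file_name = heapq.heappop(self._heap)
            order.append(file_name)
        return order


queue = TransferQueue()
queue.submit(PRE_COPY, "_3.cfs")
queue.submit(REPLICATION, "_4.cfs")
queue.submit(PRE_COPY, "_3.si")
queue.submit(REPLICATION, "_4.si")
order = queue.drain()
print(order)
```

Here the files of _4 are sent first even though the _3 pre-copy was queued earlier.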

Adding @ashking94 to the conversation as he is looking into similar improvements
