Is your feature request related to a problem? Please describe
Segment Replication copies the segments held by the primary shard to the replica shard, so the replica does not have to build segments itself. Compared with document replication, one cost of this approach is an added visibility delay between the primary and the replica: the replica must wait for segment replication to complete before the docs in a segment become searchable.
Currently, the segment replication process handles two types of segments: those generated by merges and those built by refresh for incremental indexing. Assuming a merged segment is 1GB and the transmission bandwidth is 50MB/s, a segment replication round that includes this segment will take at least 20s. With larger merged segments, or with more shards sharing the transmission bandwidth, the segment replication process will last even longer.
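The 20s figure above is just size divided by bandwidth. A minimal sketch of that lower bound follows, assuming decimal units (1GB = 10^9 bytes) and ignoring protocol overhead; the class and method names are hypothetical and not part of OpenSearch. Dividing the bandwidth by the number of concurrent transfers is a simplifying assumption of perfectly even sharing.

```java
import java.time.Duration;

final class TransferEstimate {
    /**
     * Lower bound on the time needed to copy a segment: size divided by
     * bandwidth, rounded up to the next millisecond. Real transfers are
     * slower because of protocol overhead, checksumming, and disk I/O.
     */
    static Duration minTransferTime(long segmentBytes, long bytesPerSecond) {
        if (segmentBytes < 0 || bytesPerSecond <= 0) {
            throw new IllegalArgumentException("size must be >= 0 and bandwidth > 0");
        }
        long scaled = Math.multiplyExact(segmentBytes, 1000L);
        long millis = (scaled + bytesPerSecond - 1) / bytesPerSecond; // ceiling division
        return Duration.ofMillis(millis);
    }

    public static void main(String[] args) {
        long oneGb = 1_000_000_000L;       // merged segment size from the example
        long fiftyMbPerSec = 50_000_000L;  // bandwidth from the example

        // A single shard using the whole link.
        System.out.println(minTransferTime(oneGb, fiftyMbPerSec).getSeconds() + "s");

        // Assumption: four shards share the link evenly.
        System.out.println(minTransferTime(oneGb, fiftyMbPerSec / 4).getSeconds() + "s");
    }
}
```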
Describe the solution you'd like
Introduction
This RFC proposes an optimization to segment replication that uses Lucene's IndexWriter.IndexReaderWarmer to pre-copy the merged segment to the replica. This can effectively reduce the delay before documents become visible for search on the replica.
Background
Lucene lets applications supply their own IndexWriter.IndexReaderWarmer. The warmer is called after a merge has generated its segment files, and the merged segment cannot be searched until the warmer completes. If segment replication is enabled, the merged segment therefore only takes part in the segment replication process after IndexWriter.IndexReaderWarmer has completed.
Proposed Solution
To make the proposal easier to follow, let's first look at the current segment replication process. segment(_3.si) is generated by merging segment(_1.si) and segment(_2.si), and segment(_4.si) is generated by a refresh. During segment replication, both are copied to the replica together. If segment(_3.si) is large, replication takes a long time, and the docs contained in segment(_4.si) stay invisible on the replica for that whole time.
After introducing the Pre-copy Merged Segment, the primary will pre-copy segment(_3.si) to the replica.
There are two cases to consider:
In the first case, the pre-copy finishes before segment replication. After refresh, segment(_3.si) and segment(_4.si) are both copied to the replica through segment replication. Because the replica already holds the files of segment(_3.si), these files are reused during segment replication without any network transmission.
In the second case, the pre-copy finishes after segment replication. After refresh, segment(_3.si) is still not visible on the primary, so only segment(_4.si) is copied to the replica through segment replication.
In this second case, the segment replication process does not include segment(_3.si) at all, which reduces the time overhead of segment replication.
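The two cases above can be illustrated with a small sketch of the file diff the replica would compute. This is not the actual OpenSearch implementation: the class, the method, the file names, and the string checksums are all hypothetical, and the only assumption is that the replica fetches exactly those files in the primary's visible set that it does not already hold with a matching checksum.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class ReplicationDiff {
    /**
     * Returns the files the replica still has to pull over the network:
     * every file visible on the primary that is missing locally or whose
     * checksum differs. Keys are file names, values are checksums.
     */
    static Set<String> filesToFetch(Map<String, String> primaryVisible,
                                    Map<String, String> replicaLocal) {
        Set<String> toFetch = new TreeSet<>();
        for (Map.Entry<String, String> file : primaryVisible.entrySet()) {
            String localChecksum = replicaLocal.get(file.getKey());
            if (!file.getValue().equals(localChecksum)) {
                toFetch.add(file.getKey());
            }
        }
        return toFetch;
    }

    public static void main(String[] args) {
        // Case 1: the pre-copy finished first, so the replica already holds _3.
        Map<String, String> primaryCase1 = Map.of("_3.si", "c3", "_3.cfs", "d3", "_4.si", "c4", "_4.cfs", "d4");
        Map<String, String> replicaCase1 = Map.of("_1.si", "c1", "_2.si", "c2", "_3.si", "c3", "_3.cfs", "d3");
        System.out.println(filesToFetch(primaryCase1, replicaCase1)); // only the _4 files

        // Case 2: _3 is not yet visible on the primary, so it is not in the diff at all.
        Map<String, String> primaryCase2 = Map.of("_1.si", "c1", "_2.si", "c2", "_4.si", "c4", "_4.cfs", "d4");
        Map<String, String> replicaCase2 = Map.of("_1.si", "c1", "_2.si", "c2");
        System.out.println(filesToFetch(primaryCase2, replicaCase2)); // again only the _4 files
    }
}
```

In both cases the network transfer during segment replication is limited to the small refreshed segment, which is the effect the proposal is after.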
Implementation Approaches
Extend IndexReaderWarmer#warm. After the merge generates the files, proceed as follows:
1. The primary dispatches the merged segment information to all replica nodes.
2. Each replica pulls the required files from the primary.
3. After a replica has received all files, or an exception occurs, it returns a response to the primary.
4. After the primary has received responses from all replicas, or the wait times out, it completes the pre-copy merged segment process.
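The primary-side wait described above (all replicas respond, or a timeout fires) could be sketched roughly as follows. Everything here is hypothetical: `ReplicaClient` stands in for whatever transport action the real implementation would use, and the method only reports whether the pre-copy succeeded everywhere. A `false` result means the merged segment is simply left to regular segment replication.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class PreCopyCoordinator {
    /** Hypothetical stand-in for the request that asks one replica to pull the merged segment files. */
    interface ReplicaClient {
        CompletableFuture<Void> preCopy(String replicaId, List<String> mergedSegmentFiles);
    }

    /**
     * Dispatches the merged segment to every replica and blocks until all of
     * them have responded or the timeout elapses. Returns true only if every
     * replica reported success within the timeout.
     */
    static boolean preCopyToReplicas(List<String> replicaIds, List<String> mergedSegmentFiles,
                                     ReplicaClient client, long timeout, TimeUnit unit) {
        CompletableFuture<?>[] pending = replicaIds.stream()
            .map(id -> client.preCopy(id, mergedSegmentFiles))
            .toArray(CompletableFuture[]::new);
        try {
            CompletableFuture.allOf(pending).get(timeout, unit);
            return true;
        } catch (TimeoutException | ExecutionException e) {
            // A slow or failed replica must not block the merge indefinitely.
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ReplicaClient alwaysOk = (id, files) -> CompletableFuture.completedFuture(null);
        boolean done = preCopyToReplicas(List.of("replica-1", "replica-2"),
            List.of("_3.si", "_3.cfs"), alwaysOk, 1, TimeUnit.SECONDS);
        System.out.println("pre-copy completed on all replicas: " + done);
    }
}
```

Because the warmer blocks the merged segment from becoming searchable on the primary, bounding the wait with a timeout keeps one slow replica from delaying visibility on the primary.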
Failover
In the worst case, the pre-copy merged segment process encounters an exception or a timeout and falls back to the current behavior: the merged segment is copied to the replica through regular segment replication.
Related component
Other
Describe alternatives you've considered
No response
Additional context
No response
@guojialiang92: If I understood your proposal, you are suggesting to copy the merged segment _3.si files to the replicas first, but not update the checkpoint to track _3.si yet.
Also, separately, the transfer of _4.si can be prioritized over the _3.si pre-copy so that the latest segments are available for search sooner.
Adding @ashking94 to the conversation as he is looking into similar improvements