Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

microbench-ci: benchmark suite changes [dnm] #143129

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

herkolategan
Copy link
Collaborator

@herkolategan herkolategan commented Mar 19, 2025

wip, dnm

Epic: None

yuzefovich and others added 5 commits March 19, 2025 12:40
Given that we only have two scan variants (Scan and ReverseScan) and
will only support two scan formats for now (KEY_VALUES and
BATCH_RESPONSE), it seems cleaner and less verbose to use `if` blocks
rather than `switch` statements. Thus, this commit inlines some of the
helper methods and gets rid of `switch`es. I think it'll be easier to
follow the logic for BATCH_RESPONSE scan format added in the following
commit.

The only somewhat meaningful change is the fact that we now check the
scan format first as opposed to checking whether we have a Scan or
a ReverseScan (this will make logic for BATCH_RESPONSE cleaner)

Release note: None
This commit refactors a couple of helpers that we use when merging Scan
and ReverseScan responses with the buffered values. This is done in
preparation for supporting BATCH_RESPONSE scan format. In particular:
- it introduces a helper struct that is used on the first iteration to
calculate the size of the responses
- it "disassembles" the responses into its contents (i.e. `Rows` field).
This also allows us to remove some code duplication between Scans and
ReverseScans.

It also exposes the need to adjust NumKeys and NumBytes fields of the
response. Addressing that is left as a TODO.

Release note: None
This commit extends the txnWriteBuffer helper structs to support
BATCH_RESPONSE scan format of Scans and ReverseScans. In this format
`BatchResponses` field contains `[][]byte` slices where a single
`[]byte` slice can have multiple KVs in the "engine" format. (Each
`[]byte` is growing in size exponentially, until a hard limit, for large
scans.) The contribution of this commit is two-fold:
- the size helper struct is responsible for calculating the precise
capacity that each `[]byte` is needed in the merged response
- during the main iteration, when accepting the KV from the buffer we
need to write it in the same "engine" format as the KV server did.

For simplicity, this commit keeps the hierarchy of slices in
`BatchResponses` unchanged, i.e. we inject all buffered KVs into the
"current position" of `BatchResponses[i]` from the server response. This
means that in the extreme case when the server response is empty, all
buffered KVs will be included in a single slice. This could be
suboptimial from memory GC perspective, but adding a smarter heuristic
is left as a TODO.

Release note: None
This commit utilizes the respSizeHelper struct to calculate the final
values for `NumKeys` and `NumBytes` fields of merged Scan and ReverseScan
responses when they are produced by the txnWriteBuffer. The existing
test has been extended to verify these values which required some minor
adjustments to correctly simulate already committed expected values
(they have the timestamp set).

Release note: None
@herkolategan herkolategan added the do-not-merge bors won't merge a PR with this label. label Mar 19, 2025
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
do-not-merge bors won't merge a PR with this label.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants