Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Verify keys on read #495

Merged
merged 12 commits into from
Oct 26, 2024
Merged

Verify keys on read #495

merged 12 commits into from
Oct 26, 2024

Conversation

rhuffy
Copy link
Contributor

@rhuffy rhuffy commented May 9, 2024

Similar to the motivation for our verify_keys_on_write feature, it appears to be possible for a node to respond to a ReadVerb even if it does not think it owns the data. This PR adds an invalidReads metric behind a similar feature flag so we can start to see if this is actually occurring.

This would cause logical corruption if a client were to make a read request on a coordinator with an out of data view of the ring and received a null after committing a write, and could potentially result in resurrected data if an invalid read is written as part of a read repair.

This PR also adds SafeLogging arguments to the existing VerifyKeysOnWrite errors.

Comment on lines 51 to 55
Tracing.trace("Enqueuing response to {}", message.from);
MessagingService.instance().sendReply(reply, id, message.from);

OwnershipVerificationUtils.verifyRead(keyspace, command.key);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note that this verification is intentionally after enqueuing the reply in order to avoid introducing additional latency on the read path. We should consider moving it earlier and throwing an exception after understanding in what circumstances this violation is logged, if any.

{
if (cacheWasRecentlyRefreshed())
{
throwInvalidMutationException(mutation, keyspace, cachedNaturalEndpoints, pendingEndpoints);
handler.onViolation(keyspace, key, cachedNaturalEndpoints, pendingEndpoints);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an invalid read won't throw an exception here so it will constantly trigger refreshCache on every read, which could potentially be problematic. similarly it will still call handler.onValid(keyspace); at the botton

i think we should define a InvalidReadException and handle it here and below without a refresh

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch! To avoid depending on the presence of an exception for correct control flow, I just added some returns in the correct place.

{
private static final Logger logger = LoggerFactory.getLogger(OwnershipVerificationUtils.class);

public static final OwnershipVerificationHandler INSTANCE = new MutationVerificationHandler();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why wouldn't this be of type MutationVerificationHandler and similarly for ReadVerificationHandler

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

discussed offline, I'm just following the List l = new ArrayList(); pattern where you always use the most generic interface where possible.

@rhuffy rhuffy merged commit d27feca into palantir-cassandra-2.2.18 Oct 26, 2024
6 checks passed
d-guo pushed a commit that referenced this pull request Oct 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants