fix: make HashUtil::HashBytes more platform independent #814
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
https://github.com/polyominal/bustub/blob/9988c8503cc3e5107cc750c651ebeaef2efe8ce6/src/include/common/util/hash_util.h#L32-L39
The hash function uses raw
bytes[i]
inchar
. This creates a subtle bug. C++ doesn't define whetherchar
is signed or unsigned.For byte values over 127, this matters:
This bit me during HyperLogLog project:
https://github.com/polyominal/bustub/blob/9988c8503cc3e5107cc750c651ebeaef2efe8ce6/test/primer/hyperloglog_test.cpp#L58-L108
My code failed on my machine (unsigned
char
) but passed elsewhere (signedchar
).Proposed fix: Cast to
int8_t
for consistent behavior.This should enable more machines to work. Any downsides to consider?