dsmr: implement chunk rate limiter #1922
Conversation
x/dsmr/p2p.go (outdated)
if err := c.storage.CheckRateLimit(chunk); err != nil {
	return &common.AppError{
		Code:    p2p.ErrUnexpected.Code,
		Message: err.Error(),
	}
}
Can we add this above c.storage.GetChunkBytes(...)?
It's preferred to perform the lightest checks first, so that if a request is going to fail, we minimize the resources spent failing it. This is not a critical difference, since a malicious peer can inspect our code and craft a request that will fail at a specific check, but it's good practice.
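To make the ordering concrete, here is a small self-contained sketch of the "lightest checks first" idea; the function names and signatures below are hypothetical stand-ins, not the PR's actual API:

```go
package main

import (
	"errors"
	"fmt"
)

// checkRateLimit is a cheap, in-memory accounting check (hypothetical).
func checkRateLimit(pendingWeight, chunkWeight, limit uint64) error {
	if pendingWeight+chunkWeight > limit {
		return errors.New("producer rate limit exceeded")
	}
	return nil
}

// loadChunkBytes stands in for the comparatively expensive storage read (hypothetical).
func loadChunkBytes() ([]byte, error) {
	return []byte("chunk bytes"), nil
}

// handleRequest performs the cheap check first so that a request destined to
// fail consumes as few resources as possible.
func handleRequest(pendingWeight, chunkWeight, limit uint64) error {
	if err := checkRateLimit(pendingWeight, chunkWeight, limit); err != nil {
		return err // rejected before any storage access
	}
	if _, err := loadChunkBytes(); err != nil {
		return err
	}
	return nil
}

func main() {
	fmt.Println(handleRequest(300, 100, 300)) // producer rate limit exceeded
	fmt.Println(handleRequest(100, 100, 300)) // <nil>
}
```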
(Unrelated to this PR) I think the above check of whether the bytes are present is unnecessary.
Since we ignore the result of the query, the only case where it changes our handling is if we hit an unexpected database error (i.e. not database.ErrNotFound). I think we're just wasting a potential database read here.
x/dsmr/storage_test.go (outdated)
// find the chunk that corresponds to this expiry in the chunks slice.
chunkIndex := -1
for i, chunk := range chunks {
	if chunk.Expiry == chunkExpiry {
		// found.
		chunkIndex = i
		break
	}
}
Can we use the slices.Index or slices.IndexFunc helper here? It performs the same search for an equivalent value.
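For illustration, a minimal runnable sketch of the slices.IndexFunc replacement; the chunk type here is a simplified stand-in carrying only the Expiry field used by the search:

```go
package main

import (
	"fmt"
	"slices"
)

// chunk is a simplified stand-in for the test's chunk type.
type chunk struct {
	Expiry int64
}

func main() {
	chunks := []chunk{{Expiry: 0}, {Expiry: 50}, {Expiry: 100}}
	chunkExpiry := int64(50)

	// slices.IndexFunc returns the index of the first element satisfying the
	// predicate, or -1 if none does -- the same contract as the manual loop.
	chunkIndex := slices.IndexFunc(chunks, func(c chunk) bool {
		return c.Expiry == chunkExpiry
	})
	fmt.Println(chunkIndex) // 1
}
```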
x/dsmr/storage_test.go (outdated)
	require.NoError(storage.AddLocalChunkWithCert(chunk, nil))
}

if testCase.minTime != 0 {
Adding this check creates the possibility of a test case where a field is accidentally ignored, i.e. minTime = 0 but a non-zero number of accepted chunks, which would then be silently skipped.
Could we remove this possibility by simply removing the check and relying on the fact that SetMin(0, nil) is a no-op that should be safe to call?
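As a rough illustration of the suggested shape (the limiter type and SetMin behaviour below are a toy stand-in, not the package's real API), the guard can be dropped when the zero-value call is a safe no-op:

```go
package main

import "fmt"

// limiter is a toy stand-in for the chunk storage used in the test.
type limiter struct{ min int64 }

// SetMin treats a zero min with no accepted chunks as a harmless no-op, so
// callers do not need to guard against the zero value.
func (l *limiter) SetMin(min int64, accepted []int64) {
	if min == 0 && len(accepted) == 0 {
		return
	}
	l.min = min
}

func main() {
	var l limiter
	// No `if testCase.minTime != 0` guard needed: the zero-value call is safe.
	l.SetMin(0, nil)
	fmt.Println(l.min) // 0
	l.SetMin(10, []int64{50})
	fmt.Println(l.min) // 10
}
```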
{
	name:                 "success - accepted block clear previous limit",
	expiryTimes:          []int64{0, 50, 100},
	newChunkExpiry:       150,
	weightLimit:          chunkSize * 3,
	expectedErr:          nil,
	minTime:              10,
	acceptedChunksExpiry: []int64{50},
},
Should accepting a block clear it from the limit within a window?
I think it should still count against the rate limit because it was consumed within that window.
It's just a matter of framing, right? Just think of this as the size of unaccepted chunks per producer.
Can you think of any clear advantage to maintaining a separate list of accepted-but-not-yet-expired chunks?
Yup, I think it's preferable that accepting a chunk does not allow the same validator to consume more resources within that window of time, but this can be changed without a network upgrade, so I think this is fine for now.
If we do this over a time window (irrespective of accepted/pending), then we have a clear upper bound on the maximum amount of chunk storage consumed by a single validator. If we clear the chunks when they are accepted, then we don't have a well-defined limit, because once the chunks are accepted the validator that produced them can consume more resources.
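For example, using the constants from the test case above: with weightLimit = chunkSize * 3, clearing accepted chunks from the accounting would let a producer get 3 chunks accepted mid-window and then submit 3 more before the window expires, so storage consumed in that window could reach 6 * chunkSize rather than the intended 3 * chunkSize bound. Counting accepted-but-unexpired chunks keeps the per-window bound at 3 * chunkSize.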
x/dsmr/storage.go (outdated)
@@ -228,20 +238,24 @@ func (s *ChunkStorage[T]) VerifyRemoteChunk(c Chunk[T]) (*warp.BitSetSignature,
	return nil, nil
}

// putVerifiedChunk assumes that the given chunk is gurenteed not to surpass the producer's rate limit.
// The rate limit is being checked via a call to CheckRateLimit from BuildChunk as well as by VerifyRemoteChunk.
Could we make sure this comment is up to date? This is called to self-regulate within BuildChunk, and it is called in the p2p handler ChunkSignatureRequestVerifier to rate limit peers, but it is not called in VerifyRemoteChunk at the moment.
Also, with the current code, it's not safe to call this within VerifyRemoteChunk, because we call VerifyRemoteChunk within Accept. This means that if a peer consumed more bandwidth, this could fail within Accept (which would be fatal).
updated.
What?
This PR adds chunk rate limiting.

How?
The rules interface now contains one additional method. The storage has an additional function named CheckRateLimit that checks whether adding a given chunk would breach the rate limit threshold.
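A minimal, self-contained sketch of the behaviour being described (the type and function names below are hypothetical, not the PR's actual API): the storage tracks the accumulated weight of each producer's pending chunks and rejects a chunk whose addition would exceed the per-producer limit supplied by the rules.

```go
package main

import (
	"errors"
	"fmt"
)

var errRateLimitSurpassed = errors.New("producer chunk rate limit surpassed")

// chunkRateLimiter is a toy model of the accounting described above: it tracks
// the accumulated weight of each producer's pending chunks against a limit.
type chunkRateLimiter struct {
	weightLimit   uint64            // per-producer limit supplied by the rules
	pendingWeight map[string]uint64 // producer -> accumulated pending chunk weight
}

// CheckRateLimit reports whether adding a chunk of the given weight for this
// producer would breach the rate limit threshold.
func (r *chunkRateLimiter) CheckRateLimit(producer string, chunkWeight uint64) error {
	if r.pendingWeight[producer]+chunkWeight > r.weightLimit {
		return errRateLimitSurpassed
	}
	return nil
}

// addChunk records a chunk that passed the check.
func (r *chunkRateLimiter) addChunk(producer string, chunkWeight uint64) {
	r.pendingWeight[producer] += chunkWeight
}

func main() {
	const chunkSize = 100
	r := &chunkRateLimiter{
		weightLimit:   3 * chunkSize,
		pendingWeight: map[string]uint64{},
	}
	for i := 0; i < 3; i++ {
		fmt.Println(r.CheckRateLimit("validator-1", chunkSize)) // <nil> three times
		r.addChunk("validator-1", chunkSize)
	}
	fmt.Println(r.CheckRateLimit("validator-1", chunkSize)) // rate limit surpassed
}
```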
Where?
The new CheckRateLimit is called from two places: BuildChunk, where it self-regulates locally built chunks, and ChunkSignatureRequestVerifier, where a call was added to ensure that inbound requests would not result in adding a chunk that would exceed the limit.

Note
This is a simplified version of #1916, which was time-window aware.