Replace dgraph-io/badger cache storage with etcd-io/bbolt #42571
This pull request does not have a backport label.
To fixup this pull request, you need to add the backport labels for the needed
This pull request doesn't have a |
if err != nil {
	c.log.Debugf("Key '%s' not found in key-value store", k)
Sorry, but could you log the error instead of saying the key was not found?
sure, done
please check it out
/test
@elastic/beats-tech-leads / @VihasMakwana could you review this PR?
Looks like the persistent cache has a few uses related to cloudfoundry including the
Changelog looks good. I also have no experience working with cloudfoundry, so I can't be much help there, but we must have tested this in the past.
I see @jsoriano in the Git history so maybe he can give us some leads.
I cannot help with testing as I haven't used CF in years, but I can try to help with the background of this cache. You can find here a summary of the analysis that led to using badger for this use case: #19511 (comment). The summary of the summary is that we needed it to perform well in clusters with several thousand applications, and we needed it to clean up unused entries. Badger fit better than other alternatives, as it performed well under pressure and had built-in TTL support. The kind of expiration added in this PR may not work so well for this use case, because it won't remove entries that stop being accessed, such as the ones for applications that stop producing events, which is the most common case in which we want entries to be removed here. Another thing to take into account is that |
Btw, maybe we need to add some release notes, to warn users of |
Do you mean I need to add an entry to |
Yes, maybe this is enough, as this seems to appear in https://www.elastic.co/guide/en/beats/libbeat/current/release-notes-8.17.2.html
@jsoriano I've added |
Co-authored-by: Jaime Soriano Pastor <[email protected]>
PR open to try to address the root issue upstream: hypermodeinc/badger#2169
Thanks for that! The repo seems fairly active; if we can avoid the rewrite, that would be great.
We definitely need to test this on CloudFoundry before releasing it to anybody. Looking through the log of closed SDHs, it is definitely still used, but I don't know by how many users. That CI doesn't effectively test this is concerning; we are maintaining this by hoping that nothing that breaks it ever changes. If we can fix this upstream and avoid a potential long tail of support pain here, then that could be the best path. Efforts are probably better focused on figuring out how to maintain CloudFoundry support properly first.
@cmacknz Should I close this PR?
It definitely sounds like we are not in a position to make big changes to this yet. I'm not officially a codeowner for cloudfoundry, so if you feel like the risk is too great, go ahead and close it. Just because we wrote code doesn't mean we have to keep it :)
@cmacknz I really wouldn't like to break anything, so I guess it would be safer not to merge this PR. That being said, I feel that having two different stores (bbolt and badger) in the codebase for the cache isn't ideal. Since using bbolt here might be dangerous (the lack of TTL functionality might leave stale data in the storage), we might want to consider switching from bbolt to badger in other places.
Fix merged upstream, but I don't know when a version containing it will be released.
Closing this PR, as we are going to update badger once a new version that doesn't depend on opencensus is released (the change was made in this PR).
Proposed commit message
Replace the dgraph-io/badger persistent storage for the key-value cache with etcd-io/bbolt. Originally this was meant only to get rid of the go.opencensus.io dependency, which is introduced by badger (please see the parent issue for more details). After it became evident that this won't remove the go.opencensus.io dependency, it was decided that the work should still be done, since etcd-io/bbolt is already used elsewhere in the project and it isn't good to have multiple storage backends for the cache (again, please see the comments on the parent issue for more details).
Implementation should be fairly straightforward, but I would like to clarify one thing: since bolt doesn't support value expiration, the expiration time (and TTL) are stored as metadata of the value. Upon retrieval the value is checked for expiration; if it has expired, nil is returned and the value is deleted from the bolt DB.