httpcache vs. Conditional Request vs. X-Varied-Authorization header #437
Comments
I'm not sure if this is something that we can directly fix in [...] Given the API we expose in [...] As a starting point with current features, maybe try something like this:
clientCreator, err := githubapp.NewDefaultCachingClientCreator(
    config.Github,
    githubapp.WithClientTimeout(60*time.Second),
    githubapp.WithClientMiddleware(
        githubapp.ClientLogging(zerolog.InfoLevel, githubapp.LogRateLimitInformation(&githubapp.RateLimitLoggingOption{
            Limit:     true,
            Remaining: true,
            Used:      true,
            Reset:     true,
            Resource:  true,
        })),
        func(next http.RoundTripper) http.RoundTripper {
            return &httpcache.Transport{
                Transport:           next,
                Cache:               httpTransportCache,
                MarkCachedResponses: true,
            }
        },
    ),
)
Note that this no longer uses [...] That said, the [...]
Thanks for the hint. I dug into it a bit and also stumbled upon https://github.com/bored-engineer/github-conditional-http-transport. I read through it, and you are right: if its claims are true, the "home-grown middleware" might not be the full solution. It really looks like an unsolvable problem when you rely on installation tokens and want to avoid any unstable implementation. About this issue: I'm not sure it is worth keeping open. Feel free to close it. I appreciate your response.
@bluekeyes I found another small idea: maybe the solution is not the ETag but the Last-Modified header alone. See bored-engineer/github-conditional-http-transport#1
Yeah, using [...]
type PreferLastModified struct {
    next http.RoundTripper
}

func (plm *PreferLastModified) RoundTrip(req *http.Request) (*http.Response, error) {
    res, err := plm.next.RoundTrip(req)
    if res != nil {
        // If the response includes a Last-Modified header, remove any ETag
        // header to force the cache to use time-based conditional requests
        if lastModified := res.Header.Get("last-modified"); lastModified != "" {
            res.Header.Del("etag")
        }
    }
    return res, err
}

// Later, when initializing the ClientCreator
clientCreator, err := githubapp.NewDefaultCachingClientCreator(
    config.Github,
    githubapp.WithClientTimeout(60*time.Second),
    githubapp.WithClientMiddleware(
        githubapp.ClientLogging(zerolog.InfoLevel, githubapp.LogRateLimitInformation(&githubapp.RateLimitLoggingOption{
            Limit:     true,
            Remaining: true,
            Used:      true,
            Reset:     true,
            Resource:  true,
        })),
        func(next http.RoundTripper) http.RoundTripper {
            return &httpcache.Transport{
                Transport:           &PreferLastModified{next},
                Cache:               httpTransportCache,
                MarkCachedResponses: true,
            }
        },
    ),
)
Like my last suggestion, this puts the cache outside of the authentication to avoid seeing the [...] If this all works, it might be something we could support natively in go-githubapp. I think it would need a new option function ([...]
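One way to sanity-check the PreferLastModified wrapper without calling GitHub is to run it over a stubbed transport. The sketch below copies the type from the comment above so that it compiles on its own; `stubTransport`, `validators`, and `exampleUpstream` are hypothetical helpers introduced only for this illustration, and the header values are made up.

```go
package main

import (
	"fmt"
	"net/http"
)

// PreferLastModified is copied from the comment above so that this file
// compiles on its own.
type PreferLastModified struct {
	next http.RoundTripper
}

func (plm *PreferLastModified) RoundTrip(req *http.Request) (*http.Response, error) {
	res, err := plm.next.RoundTrip(req)
	if res != nil {
		// Drop the ETag only when a Last-Modified validator is available.
		if lastModified := res.Header.Get("last-modified"); lastModified != "" {
			res.Header.Del("etag")
		}
	}
	return res, err
}

// stubTransport is a hypothetical stand-in for the network: it answers every
// request with a fixed set of response headers.
type stubTransport struct{ header http.Header }

func (s stubTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	return &http.Response{
		StatusCode: http.StatusOK,
		Header:     s.header.Clone(),
		Body:       http.NoBody,
		Request:    req,
	}, nil
}

// validators sends one request through PreferLastModified and returns the
// ETag and Last-Modified values that a cache sitting above it would see.
func validators(upstream http.Header) (etag, lastModified string, err error) {
	req, err := http.NewRequest(http.MethodGet, "https://api.github.com/", nil)
	if err != nil {
		return "", "", err
	}
	res, err := (&PreferLastModified{next: stubTransport{header: upstream}}).RoundTrip(req)
	if err != nil {
		return "", "", err
	}
	return res.Header.Get("ETag"), res.Header.Get("Last-Modified"), nil
}

// exampleUpstream builds made-up upstream headers, with or without Last-Modified.
func exampleUpstream(withLastModified bool) http.Header {
	h := http.Header{}
	h.Set("ETag", `"abc"`)
	if withLastModified {
		h.Set("Last-Modified", "Mon, 01 Jan 2024 00:00:00 GMT")
	}
	return h
}

func main() {
	etag, lm, _ := validators(exampleUpstream(true))
	fmt.Printf("with Last-Modified:    etag=%q last-modified=%q\n", etag, lm)
	etag, lm, _ = validators(exampleUpstream(false))
	fmt.Printf("without Last-Modified: etag=%q last-modified=%q\n", etag, lm)
}
```

When the upstream response carries both validators, only Last-Modified survives; when it carries only an ETag, the ETag is left in place, so the cache can still revalidate such responses.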
Context
I run go-githubapp with github.com/gregjones/httpcache (github.com/gregjones/httpcache/diskcache in particular). I run it in a "GitHub App context", meaning: the user installs a GitHub App and, based on this, I get permission to make requests:
followed by ...
followed by ...
This works great, as expected.
Caching is enabled, and the cache is written to and read from. However, I also activated logging via
to get details about the usage of the cache:
I discovered that the cache is not working as I expected and as described in Use conditional requests if appropriate.
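For readers unfamiliar with the mechanism, here is a minimal sketch of an ETag-based conditional request. It runs entirely in memory against a stand-in handler, not against GitHub; `etagHandler`, `fetch`, the path, and the ETag value are assumptions made up for the illustration. Per GitHub's documentation, a 304 answer to a correctly authorized conditional request should not count against the primary rate limit, which is why the cache behavior matters here.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// etagHandler is a stand-in for an API endpoint (not GitHub itself). It serves
// a fixed body with a fixed ETag and answers 304 Not Modified when the client
// already holds that version.
func etagHandler(w http.ResponseWriter, r *http.Request) {
	const etag = `"v1"`
	w.Header().Set("ETag", etag)
	if r.Header.Get("If-None-Match") == etag {
		w.WriteHeader(http.StatusNotModified)
		return
	}
	fmt.Fprint(w, `{"name":"example"}`)
}

// fetch performs one in-memory request against the handler, optionally
// conditional on a previously seen ETag, and returns the status code and the
// ETag sent back.
func fetch(knownETag string) (int, string) {
	req := httptest.NewRequest(http.MethodGet, "/repos/example/example", nil)
	if knownETag != "" {
		req.Header.Set("If-None-Match", knownETag)
	}
	rec := httptest.NewRecorder()
	etagHandler(rec, req)
	res := rec.Result()
	defer res.Body.Close()
	return res.StatusCode, res.Header.Get("ETag")
}

func main() {
	status, etag := fetch("")
	fmt.Println(status, etag) // 200 "v1"

	status, _ = fetch(etag)
	fmt.Println(status) // 304
}
```

The first call has no validator and receives the full body; the second call presents the stored ETag and receives 304 with no body, which is the outcome the cache should be producing after a token rotation.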
Caching deep dive
The caching entry of a request contains the response headers like (stripped version)
httpcache respects (and compares) all headers listed in Vary in https://github.com/gregjones/httpcache/blob/901d90724c7919163f472a9812253fb26761123d/httpcache.go#L160. This includes X-Varied-Authorization. X-Varied-Authorization contains the token, which has a limited lifetime.
Once the token lifetime is reached, httpcache makes an uncached (not conditional) request because X-Varied-Authorization differs, even though the content itself has not changed at all. When you crawl bigger repositories, you can hit the rate limit pretty quickly.
As far as I understand, this behavior is in line with RFC 7234.
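The matching rule described above can be sketched in a few lines. This is a simplified re-statement for illustration, not httpcache's actual source: for every header named in Vary, the value on the new request has to equal the value recorded under X-Varied-<Header> when the response was cached. The helper `exampleHeaders` and the token strings are made up.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// varyMatches reports whether a new request may reuse a cached response:
// each header listed in the cached Vary header must carry the same value on
// the request as the one stored under "X-Varied-<Header>".
func varyMatches(cached http.Header, req http.Header) bool {
	for _, line := range cached.Values("Vary") {
		for _, name := range strings.Split(line, ",") {
			name = http.CanonicalHeaderKey(strings.TrimSpace(name))
			if name == "" {
				continue
			}
			if req.Get(name) != cached.Get("X-Varied-"+name) {
				return false
			}
		}
	}
	return true
}

// exampleHeaders returns the headers of a cached entry plus two sets of
// request headers: one with the original token and one with a rotated token.
func exampleHeaders() (cached, sameToken, rotatedToken http.Header) {
	cached = http.Header{}
	cached.Set("Vary", "Accept, Authorization")
	cached.Set("X-Varied-Accept", "application/vnd.github+json")
	cached.Set("X-Varied-Authorization", "token old-installation-token")

	sameToken = http.Header{}
	sameToken.Set("Accept", "application/vnd.github+json")
	sameToken.Set("Authorization", "token old-installation-token")

	rotatedToken = sameToken.Clone()
	rotatedToken.Set("Authorization", "token new-installation-token")
	return cached, sameToken, rotatedToken
}

func main() {
	cached, sameToken, rotatedToken := exampleHeaders()
	fmt.Println(varyMatches(cached, sameToken))    // true
	fmt.Println(varyMatches(cached, rotatedToken)) // false
}
```

The second result is the situation from this issue: nothing about the resource changed, but the rotated token alone makes the cached entry unusable, so the next request goes out without validators.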
Question on X-Varied-Authorization
In this use case (crawling repositories with a GitHub App), I was wondering:
If the X-Varied-Authorization header were not compared in https://github.com/gregjones/httpcache/blob/901d90724c7919163f472a9812253fb26761123d/httpcache.go#L121, the cache hit rate would increase by a lot.
Another benefit: you would not cache an auth secret (here, I am not 100% sure whether this is a real issue, as the token is no longer valid).
Implementation
The next thought would be "How would this be implemented?"
The simplest solution would be to hack this in by forking httpcache and adding an if-condition to varyMatches or a related function.
Another option: can we somehow hook into the process somewhere? I would like to avoid copying the diskcache part, as it receives only a []byte, which would require a lot of parsing.
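One possible hook that needs neither a fork nor a custom cache backend is a RoundTripper placed between the cache and the network that removes Authorization from the response's Vary header, so the cache never records or compares it. The sketch below is an assumption-laden illustration: `dropVaryAuthorization`, `roundTripFunc`, and `filteredVary` are hypothetical names, not part of go-githubapp or httpcache. It deliberately departs from the caching RFC, so it is only reasonable if every token you use sees identical content, and it does not settle whether the server's validators stay valid across tokens.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// dropVaryAuthorization is a hypothetical RoundTripper. It removes
// "Authorization" from the Vary header of every response before a cache
// layered above it gets to see the response.
type dropVaryAuthorization struct {
	next http.RoundTripper
}

func (d *dropVaryAuthorization) RoundTrip(req *http.Request) (*http.Response, error) {
	res, err := d.next.RoundTrip(req)
	if res == nil {
		return res, err
	}
	var kept []string
	for _, line := range res.Header.Values("Vary") {
		for _, name := range strings.Split(line, ",") {
			name = strings.TrimSpace(name)
			if name != "" && !strings.EqualFold(name, "Authorization") {
				kept = append(kept, name)
			}
		}
	}
	res.Header.Del("Vary")
	if len(kept) > 0 {
		res.Header.Set("Vary", strings.Join(kept, ", "))
	}
	return res, err
}

// roundTripFunc adapts a plain function to http.RoundTripper for the demo.
type roundTripFunc func(*http.Request) (*http.Response, error)

func (f roundTripFunc) RoundTrip(req *http.Request) (*http.Response, error) { return f(req) }

// filteredVary pushes one stubbed response with the given Vary value through
// the wrapper and returns the Vary header that a cache above it would see.
func filteredVary(vary string) string {
	stub := roundTripFunc(func(req *http.Request) (*http.Response, error) {
		h := http.Header{}
		h.Set("Vary", vary)
		return &http.Response{StatusCode: http.StatusOK, Header: h, Body: http.NoBody, Request: req}, nil
	})
	req, err := http.NewRequest(http.MethodGet, "https://api.github.com/", nil)
	if err != nil {
		return ""
	}
	res, err := (&dropVaryAuthorization{next: stub}).RoundTrip(req)
	if err != nil {
		return ""
	}
	return res.Header.Get("Vary")
}

func main() {
	fmt.Println(filteredVary("Accept, Authorization, Cookie, X-GitHub-OTP"))
	// prints "Accept, Cookie, X-GitHub-OTP"
}
```

To take effect, such a wrapper would have to be the Transport of the httpcache.Transport, so that it runs on the response before the cache stores it.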
Small runnable example
If requested, I can provide a minimal code example to reproduce it.
Disclaimer
I do understand that this is not exactly a go-githubapp issue but rather a caching issue. However, it is closely related to the use case of go-githubapp.
I would appreciate your thoughts on this.