Hello all 👋,
TL;DR
Let's discuss some approaches for actively invalidating cached tiles when sources are changed. We hope to submit a pull request for review once there is some consensus on the approach.
The Problem 🔢
We use and love the martin tile server. One issue we face is invalidating tiles when sources are updated with user changes. Our specific use case involves a user editing geometry and saving it to a PostgreSQL database. After a user change, we call a redraw on the map layer, but martin continues to serve the cached tiles without the change. Other visitors requesting the cached tiles don't see the changes either.
Architecture 🏗️
User's browser -> React Frontend -> Backend Authentication -> Martin TS -> PostgreSQL/PostGIS
Users submit a geometry change that is written to PostgreSQL.
Developer Use Approaches 👀
First we'll go into how a frontend might request a tile or tile set to be re-fetched, then how it might be implemented in martin.
Versioning
Append a version "number" to the URL request. When the frontend makes a change, we could increment the version number or timestamp to tell martin to re-fetch a tile or tile set. It's a well-understood technique for cache busting, albeit from the before times. However, martin would need to know about tile versions to do any smart re-fetching; otherwise a version change would just be analogous to a boolean flag that removes the tile from the cache and re-fetches it from the source. It also complicates URL parsing in martin.
It might look something like:
https://www-to-martin-endpoint/mvt?source/z/x/y/version
OR
https://www-to-martin-endpoint/mvt?source-version/z/x/y
etc.
It's worth noting this approach would be handled well by other caches like NGINX or CDNs.
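To make the "martin would need to know about versions" point concrete, here is a minimal sketch of the bookkeeping involved, assuming the `source/z/x/y/version` path shape floated above. `TileKey`, `parse_versioned_path` and `VersionTracker` are hypothetical names for illustration and are not part of martin.

```rust
use std::collections::HashMap;

/// Hypothetical tile address used as a cache key (not martin's own type).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct TileKey {
    pub source: String,
    pub z: u8,
    pub x: u32,
    pub y: u32,
}

/// Split `source/z/x/y/version` into a key and a version number.
/// Returns `None` if a segment is missing, extra, or not numeric.
pub fn parse_versioned_path(path: &str) -> Option<(TileKey, u64)> {
    let mut parts = path.trim_matches('/').split('/');
    let source = parts.next()?.to_string();
    let z: u8 = parts.next()?.parse().ok()?;
    let x: u32 = parts.next()?.parse().ok()?;
    let y: u32 = parts.next()?.parse().ok()?;
    let version: u64 = parts.next()?.parse().ok()?;
    if source.is_empty() || parts.next().is_some() {
        return None;
    }
    Some((TileKey { source, z, x, y }, version))
}

/// Remembers the newest version seen per tile. A request carrying a newer
/// version than the recorded one means "drop the cached tile and re-fetch".
#[derive(Default)]
pub struct VersionTracker {
    seen: HashMap<TileKey, u64>,
}

impl VersionTracker {
    pub fn needs_refetch(&mut self, key: TileKey, version: u64) -> bool {
        // Unknown tiles count as stale so the first request records a version.
        let stale = self.seen.get(&key).map_or(true, |&known| version > known);
        if stale {
            self.seen.insert(key, version);
        }
        stale
    }
}
```

Note that the tracker only ever answers yes or no, which is the "boolean flag" behaviour described above: the version carries no information about what actually changed.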
Fingerprinting
Calculate a hash of the tile on the frontend and pass it as part of the URL request. The main issue with this approach is that the frontend doesn't know the fingerprint of the tile or source beforehand. The other issue is that vector tiles don't really know about the feature geometry changes we add on top, at least in our implementation, so calculating an accurate fingerprint of a resulting tile where a geometry crosses more than one tile would be problematic. So this seems like the wrong mechanism for this feature, but we're happy to explore something like this if anyone has a good idea.
It might look like versioning in practice:
https://www-to-martin-endpoint/mvt?source-437b930db84b8079c2dd804a71936b5f/z/x/y/
where the fingerprint is the hash hex string attached to the source.
It's worth noting this approach would be handled well by other caches like NGINX or CDNs.
Query String
Append a version or flag to the query parameters of the GET request:
?version=2
OR
?invalidate=true
etc. This is easy to parse in martin, and the value could also be a timestamp.
It is not handled well by some other caching schemes, such as some CDNs.
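As a rough illustration of the "easy to parse" claim, the sketch below turns the two query shapes above into a small enum. `CacheHint` and `parse_cache_hint` are hypothetical names, not martin APIs, and the parser deliberately skips percent-decoding to stay short.

```rust
/// What the query string asks the cache to do (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum CacheHint {
    /// No hint present: serve from the cache as usual.
    UseCache,
    /// `?invalidate=true`: drop the cached tile and re-fetch it.
    Invalidate,
    /// `?version=N`, where N may also be a timestamp.
    Version(u64),
}

/// Parse a raw query string such as `?version=2` or `invalidate=true`.
/// Unknown parameters and malformed values are ignored.
pub fn parse_cache_hint(query: &str) -> CacheHint {
    let mut hint = CacheHint::UseCache;
    for pair in query.trim_start_matches('?').split('&') {
        let (key, value) = pair.split_once('=').unwrap_or((pair, ""));
        match (key, value) {
            // An explicit invalidate flag wins over any version parameter.
            ("invalidate", "true") => return CacheHint::Invalidate,
            ("version", v) => {
                if let Ok(n) = v.parse::<u64>() {
                    hint = CacheHint::Version(n);
                }
            }
            _ => {}
        }
    }
    hint
}
```

Letting `invalidate=true` take precedence over `version` is a design choice made for this sketch only; either ordering would work as long as it is documented.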
Use Cache-Control Header
A request Cache-Control header value of no-cache or no-store could be used. In this case we would be abusing the meaning of the terms, but both are valid HTTP Cache-Control header values. On the frontend, developers would modify transformRequest in Mapbox (or the equivalent in other libraries) to add the Cache-Control value to the request. I'm looking forward to thoughts on this option; one approach to setting the value would be a store value that is set from the result of a successful PostgreSQL update.
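On the server side, detecting this could be as small as the check below. It is a sketch only: `wants_fresh_tile` is a hypothetical helper, and it assumes the raw header value has already been extracted from the request.

```rust
/// Returns true when a request `Cache-Control` value carries a `no-cache`
/// or `no-store` directive. Directive names are compared case-insensitively.
pub fn wants_fresh_tile(cache_control: &str) -> bool {
    cache_control
        .split(',')
        .map(|directive| directive.trim().to_ascii_lowercase())
        .any(|directive| directive == "no-cache" || directive == "no-store")
}
```

One thing to weigh with this option: browsers may also send no-cache on a forced reload, so any visitor could trigger a re-fetch, not just the client that made the edit.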
Use ETag Header
Entity tags in the header have some neat properties, like passing a case-sensitive prefix of W/ to indicate weak validation, but this would have similar issues to versioning and fingerprinting above: martin and the frontend would need to sync on versions. It would be supplied to martin from the frontend using the transform request discussed in Cache-Control Header.
Custom Header
The X- prefix for custom headers is deprecated, so while avoiding any conflicting header names we could choose a terse header such as martin-refetch: true | false | next or similar. Developers would implement the custom header much as described in Cache-Control above.
A New Endpoint
🔬 Potentially we could expand this to a wider discussion about some form of webhook interface that would allow sources to flag when changes happen to layers or features, which would remove tiles from the cache. This would be a powerful addition that could do a lot more than just invalidate tiles, though invalidation is what this discussion is about specifically.
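One simple shape such an interface could take is a per-source change serial: the webhook records the latest serial, and a cached tile stamped with an older serial is treated as stale. The sketch below assumes exactly that; `SourceChanged` and `SourceSerials` are hypothetical names, and the whole source is invalidated rather than only the tiles a geometry touches.

```rust
use std::collections::HashMap;

/// Hypothetical webhook payload: which source changed, plus a serial
/// (or timestamp) that increases with every change.
pub struct SourceChanged {
    pub source: String,
    pub serial: u64,
}

/// Tracks the newest change serial reported for each source.
#[derive(Default)]
pub struct SourceSerials {
    latest: HashMap<String, u64>,
}

impl SourceSerials {
    /// Record a notification. Returns true if it advanced the serial;
    /// duplicate or out-of-order notifications are ignored.
    pub fn notify(&mut self, event: SourceChanged) -> bool {
        let entry = self.latest.entry(event.source).or_insert(0);
        if event.serial > *entry {
            *entry = event.serial;
            true
        } else {
            false
        }
    }

    /// A tile cached at `cached_serial` is stale once its source has moved on.
    pub fn is_stale(&self, source: &str, cached_serial: u64) -> bool {
        self.latest
            .get(source)
            .map_or(false, |&serial| serial > cached_serial)
    }
}
```

With this design nothing is evicted eagerly; tiles are simply re-fetched the next time they are requested after the source's serial has moved past the one they were cached with.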
Implementation Approaches 🛠️
One of the issues we face is invalidating not just one cached tile but all tiles that lack the source changes. I think the strategy should be to remove from the cache all "connected" tiles within zoom level bounds defined in the configuration. For example, if a tile is requested to be re-fetched, martin would remove the tile from the cache if it exists, then fetch and return the new tile, while also removing the tile one zoom level above and the tiles one zoom level below (if they exist). The fetch only retrieves the requested tile, but the cache then forces a re-fetch on any new request for the other affected tiles.
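The "connected" tiles in that example can be computed directly from the tile coordinates. The sketch below assumes the usual XYZ scheme, where a tile's parent is (z-1, x/2, y/2) and its four children sit at z+1. `Tile`, `connected_tiles` and `invalidate_connected` are hypothetical names, and a plain HashMap stands in for martin's tile cache.

```rust
use std::collections::HashMap;

/// Hypothetical tile coordinate in the XYZ scheme.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Tile {
    pub z: u8,
    pub x: u32,
    pub y: u32,
}

/// The tile itself, its parent one zoom level up, and its four children
/// one zoom level down, limited to the range [min_zoom, max_zoom].
pub fn connected_tiles(t: Tile, min_zoom: u8, max_zoom: u8) -> Vec<Tile> {
    let mut out = vec![t];
    if t.z > min_zoom {
        out.push(Tile { z: t.z - 1, x: t.x / 2, y: t.y / 2 });
    }
    if t.z < max_zoom {
        for dy in 0..2 {
            for dx in 0..2 {
                out.push(Tile { z: t.z + 1, x: t.x * 2 + dx, y: t.y * 2 + dy });
            }
        }
    }
    out
}

/// Remove every connected tile that is present in the cache and
/// return how many entries were evicted.
pub fn invalidate_connected(
    cache: &mut HashMap<Tile, Vec<u8>>,
    t: Tile,
    min_zoom: u8,
    max_zoom: u8,
) -> usize {
    connected_tiles(t, min_zoom, max_zoom)
        .into_iter()
        .filter(|tile| cache.remove(tile).is_some())
        .count()
}
```

This only reaches one level up and one level down, as in the example above. A geometry that crosses tile boundaries would also affect neighbouring tiles at the same zoom, which this sketch does not cover.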
As for custom headers, we can add an actix_web allow_header for the custom name martin-refetch, for example. In the DynTileSource we can add a new invalidation expr to pass to the get_or_insert_cached_value macro, along with a new expr for removing other tiles around the TileCoord.
Webhook
Alternatively, we could add a new get_webhook factory and routes, with a simple definition of a source name and a timestamp/serial of a change. The implementation in the tile cache would need some thought: would it invalidate the source everywhere or calculate the geometry change? Any thoughts on this idea or implementation are greatly appreciated.
Thank you for reading my wall of text!