Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement caching for get_blocks #27

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

lukasbesch
Copy link

@lukasbesch lukasbesch commented Dec 2, 2021

On some posts the response time is quite long.

This pull request implements a caching for the parsing using transients (or site transients for multisite).
The cache key contains a md5 hash of the content and the post meta (only on posts).
By default the expiration is set to 0, so that the cached data will be used until either post content or post meta changes, or until you clear transients manually.

In my case this decreases the response time from ≈30s on the first call to ≈1.5s on subsequent calls.
I will have to check which specific block might cause the performance issue as not all posts take that much time.

Did anyone have a similar experience?
I did not dig deeper into it, my guess is that the pQuery parser is kind of slow, or it trips on particular blocks.


This pull request additionally introduces two new filters:

  • rest_api_blocks_output
  • rest_api_handle_block

I use them to add additional data to some third-party blocks which don't provide all necessary data as attributes or to apply other transformations.

@spacedmonkey
Copy link
Owner

Thanks for the PR!

I like the idea, but how does cache invalidation happen here?

@spacedmonkey spacedmonkey self-requested a review December 10, 2021 12:03
@lukasbesch
Copy link
Author

The cache key is using a md5 hash of the post content plus a md5 hash of the serialized post meta data.
So if the post content or meta changes, the cached data is not used anymore, but it is still in the database.

The expiration is currently set to 0, so that the transient never expires. Probably we should use something like one year, so that the cache is not filled with content that might never reappear. Expired transients are cleared on database upgrades.

One more thing to note: the cache key does not include the post id, so if two posts have exactly the same content (and meta), they use the same cache key. But I guess that is rather unlikely.

@spacedmonkey
Copy link
Owner

What about blocks that stored in widgets or templates parts that do not have post ids?

@lukasbesch
Copy link
Author

Good point!
If no post_id is supplied (or it is 0), only the content string is hashed. I added some tests for that, but:

In general it is problematic if blocks such as core/latest-posts are cached – the attributes do not change but the rendered property does change. Another example would be a third party block that fetches data from an external API.

As an alternative, we could cache not on content level but on block level (after the default parsing into blocks), with a whitelist of cacheable blocks.
That would result in a lot more cache keys though, and might not be significantly faster (will have to measure that).

Also I'd rather like to find the root of the issue that causes the long response time rather than just trying to "cache it away" :)

What do you think?

@spacedmonkey
Copy link
Owner

I had a try at caching in my own PR: #29

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants