Guide plugin authors to write source plugins that correctly interact with the cache #31214
Replies: 15 comments
-
Oh interesting! Yeah, not setting a parent when creating a node in onCreateNode is a definite no-no. I'm struggling to think of when that'd ever actually make sense... if we can't think of any cases, we should just hard error there (or maybe warn initially so we don't break people's sites).
-
Another thing we could check for is whether all (or most, say 80%+) of a plugin's nodes are deleted at startup due to being stale. That'd be easy to calculate and, again, should be extremely uncommon, so we could issue a strong warning that the plugin probably has a bug.
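A rough sketch of what that check could look like. This is not Gatsby's implementation: the function name, the shape of the per-plugin counts, and the default threshold are all assumptions made for illustration.

```javascript
// Hypothetical helper (not part of Gatsby): given per-plugin node counts
// collected at startup, return the plugins whose nodes were mostly deleted
// as stale. The { total, stale } shape is an assumption for this sketch.
function findSuspiciousPlugins(nodeCountsByPlugin, threshold = 0.8) {
  const suspicious = [];
  for (const [plugin, counts] of Object.entries(nodeCountsByPlugin)) {
    const { total, stale } = counts;
    // Skip plugins that own no nodes to avoid dividing by zero.
    if (total > 0 && stale / total >= threshold) {
      suspicious.push(plugin);
    }
  }
  return suspicious;
}
```

A caller could then log one strong warning per returned plugin name.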
-
The screenshot transformer does that, in a very similar way to the linked plugin (because both use
-
Ah yeah, OK, so it's a weakness in the helper. Oh, and we also shouldn't then delete nodes that are linked to from other nodes 🤔 That's the real problem. File nodes don't need a parent. They just need to not be deleted. If I'm remembering the stale node algorithm correctly, we only check for parents.
-
This is a pretty major bug. The stale collection stuff is our GC step, but since we're not collecting all references, we're deleting too much. I'm not sure, though, how to cheaply find ___NODE references, since recursing through every node would be very expensive.
-
Not ideal, but we do recurse through each node when we produce the example value for schema inference. But relying on that could easily introduce regressions when someone changes seemingly unrelated code. I honestly am not sure we should do that (shield ___NODE linked nodes from being garbage collected), especially with
I guess maybe instead of changing
gatsby/packages/gatsby/src/redux/actions.js, lines 760 to 773 in e651e9c
That, plus a warning/error containing instructions to use that action, seems like it would solve this without adding the overhead of recursing through nodes to find ___NODE fields.
-
Hmm, thought of a cheap way to do the fix. Just track which nodes are created in onCreateNode from which node. This would be an implicit parent/child relationship even if the data wasn't modeled that way. We could wrap the createNode action to save this. Then in the GC step, we could do a deep search on the remaining parent nodes for links.
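To make the idea concrete, here is a minimal sketch of wrapping a createNode function to record implicit parents. It is not the code that shipped in Gatsby; `trackImplicitParents`, `runOnCreateNode`, and the `implicitChildren` map are hypothetical names, and the sketch only handles synchronous callbacks.

```javascript
// Hypothetical sketch (not Gatsby internals): remember which nodes were
// created while an onCreateNode-style callback ran for a given node.
function trackImplicitParents(createNode) {
  const implicitChildren = new Map(); // parent id -> Set of child ids
  let currentParentId = null;

  // Drop-in replacement for createNode that records the implicit parent.
  function wrappedCreateNode(node) {
    if (currentParentId !== null) {
      if (!implicitChildren.has(currentParentId)) {
        implicitChildren.set(currentParentId, new Set());
      }
      implicitChildren.get(currentParentId).add(node.id);
    }
    return createNode(node);
  }

  // Runs a synchronous callback with parentNode as the implicit parent.
  function runOnCreateNode(parentNode, callback) {
    const previousParentId = currentParentId;
    currentParentId = parentNode.id;
    try {
      return callback({ node: parentNode, createNode: wrappedCreateNode });
    } finally {
      currentParentId = previousParentId;
    }
  }

  return { wrappedCreateNode, runOnCreateNode, implicitChildren };
}
```

A GC step could then treat every id in `implicitChildren.get(parentId)` as reachable for as long as the parent node itself survives. Asynchronous callbacks would need a different way of carrying the current parent.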
-
So I just tried to reproduce this with
Any ideas?
-
Turns out the Instagram API returns the tags for an image in a random order every time, so the content digest is different every time! That's why the nodes were getting marked as dirty, and why this couldn't be reproduced.
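One way a source plugin could guard against this is to normalize order-insensitive fields before hashing. The sketch below is an illustration only: the `tags` field name and the `stableImageDigest` helper are assumptions, and it uses Node's built-in crypto module rather than Gatsby's createContentDigest helper.

```javascript
const crypto = require("node:crypto");

// Hypothetical helper: computes a digest that does not change when the API
// returns the same tags in a different order. `tags` is an assumed field name.
function stableImageDigest(image) {
  // Copy the tags before sorting so the caller's object is left untouched.
  const normalized = { ...image, tags: [...(image.tags || [])].sort() };
  return crypto
    .createHash("md5")
    .update(JSON.stringify(normalized))
    .digest("hex");
}
```

This only stabilizes the one field it sorts; any other field the API returns in a varying order (or varying key order) would need the same treatment.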
-
This was published in
-
Going to close this since, rather than guiding plugin authors, we've been able to fix it in code. Thanks @KyleAMathews, @pieh and @sidharthachatterjee!
-
Reopening this, because this was fixed in a single instance (
-
Hiya! This issue has gone quiet. Spooky quiet. 👻 We get a lot of issues, so we currently close issues after 30 days of inactivity. It’s been at least 20 days since the last update here. If we missed this issue or if you want to keep it open, please reply here. You can also add the label "not stale" to keep this issue open! Thanks for being a part of the Gatsby community! 💪💜
-
Hi there, I have an issue and I think it might fit here. I wrote a plugin called gatsby-source-gh-readme, which you can find on npm (gatsby-source-gh-readme); the repo is on GitHub (gatsby-source-gh-readme). The plugin makes a GraphQL call to the GitHub GraphQL API and should pull in the file contents of the
The plugin works: for each README.md file, Gatsby creates a node of type markdown, and those nodes are then processed by the remark plugins just like the blog post markdown files. However, frequently (almost every time), I need to clear the cache before running
I prepare the data for
// Helper function that processes a repository node to match Gatsby's node structure
const processRepo = repo => {
  const nodeId = createNodeId(`repo-readme-${repo.id}`);
  const readme = repo.readme;
  const nodeData = Object.assign({}, repo, {
    id: `repo-readme-${nodeId}`,
    // No parent node is set here
    parent: null,
    children: [],
    internal: {
      mediaType: "text/markdown",
      type: `GithubReadme`,
      content: readme,
      // The digest is computed from the whole repo object
      contentDigest: createContentDigest(repo)
    }
  });
  return nodeData;
};
you can see that
Apart from the fact that there is probably a better way to write this plugin, it does seem to be an example of what you're talking about in this issue. Apart from flagging this as a potential example of the problem discussed in this issue, I have some general questions about this plugin that are not related:
-
Here is an example of a theme that creates tag nodes that disappear from the cache for no obvious reason: reflexjs/reflexjs#39.
-
This is related to #11747 as an isolated, reproducible case of caching issues. cc/ @pieh @DSchau
Description
Numerous plugins in the ecosystem aren't interlinking a node and its parent in onCreateNode, which causes nodes in the cache to suddenly go missing. I stumbled across this issue while working on a proof of concept with gatsby-source-instagram-all, but I'm sure this affects plenty of other plugins.
It's not currently intuitive how to properly set up data sourcing in ways that interact well with the cache.
These issues are unfortunate because they create distrust in the cache (rightfully so). Logging warnings and helping to guide plugin authors towards the right path will help alleviate caching woes.
Steps to reproduce
Use a plugin like the reproduction and stop, then start, the server.
Expected result
Restarting the server and running the build should work.
Actual result
Restarting the server and running the build require clearing the cache.
Steps to fix
Currently, the fix involves running rm -rf .cache 🙀.
Solving the fundamental issue here will require two steps (I think):
• We should issue a warning in onCreateNode if there's no node passed as a parent, and link to a doc that elaborates on the issue
• We should have a doc for plugin authors that describes the Gatsby caching system and how nodes need to be interlinked to be properly cached and avoid errors
I'm anticipating that the doc should be in the style of a guide for plugin authors that might be sourcing nodes since the existing docs (that I can find) are more low-level but I'll defer to @marcysutton here.
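For such a guide, the core pattern might look something like the sketch below: a node created inside onCreateNode names the node being processed as its parent and registers the link. The node types "SourceItem" and "DerivedItem" and the `title` field are made up for illustration; createNode, createParentChildLink, createNodeId, and createContentDigest are the helpers Gatsby passes to this API. In a real gatsby-node.js the function would be exported as exports.onCreateNode.

```javascript
// Sketch of an onCreateNode handler that links the node it creates to the
// node it was derived from. "SourceItem" and "DerivedItem" are hypothetical.
function onCreateNode({ node, actions, createNodeId, createContentDigest }) {
  const { createNode, createParentChildLink } = actions;

  // Only react to the (hypothetical) node type this plugin cares about.
  if (node.internal.type !== "SourceItem") return;

  const derivedNode = {
    id: createNodeId(`derived-${node.id}`),
    parent: node.id, // ties the new node to the node it came from
    children: [],
    title: node.title,
    internal: {
      type: "DerivedItem",
      contentDigest: createContentDigest({ title: node.title }),
    },
  };

  createNode(derivedNode);
  // Register the child on the parent as well, so the link goes both ways.
  createParentChildLink({ parent: node, child: derivedNode });
}
```

With the parent set, the derived node is tied to the node it was created from, which is the interlinking this issue says is needed for nodes to be cached properly.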
Related