Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: avoid OOM and infinite loops in moduleInfo.load() #3769

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

finnag
Copy link
Contributor

@finnag finnag commented Sep 15, 2023

what

moduleInfo.load() now loads modules and dependencies without recursing, and only loads each dependency once.

why

moduleInfo.load() used to load modules and depedencies recursively, and due to some unfortunate circumstances it would usually create a very deep call stack where each load() called another load() to load the next one.

The terraform parsing uses enough stack space that this can become a problem, so for a slightly big project this sometimes caused atlantis to die from a stack overflow:

runtime: goroutine stack exceeds 1000000000-byte limit runtime: sp=0xc021478380 stack=[0xc021478000, 0xc041478000] fatal error: stack overflow
[...]
github.com/runatlantis/atlantis/server/events.moduleInfo.load(...)
	...atlantis/server/events/modules.go:108 +0x46b fp=0xc021478750 sp=0xc021478570 pc=0xfeaa6b
github.com/runatlantis/atlantis/server/events.moduleInfo.load(...)
	...atlantis/server/events/modules.go:108 +0x46b fp=0xc021478930 sp=0xc021478750 pc=0xfeaa6b
github.com/runatlantis/atlantis/server/events.moduleInfo.load(...)
	...atlantis/server/events/modules.go:108 +0x46b fp=0xc021478b10 sp=0xc021478930 pc=0xfeaa6b

... and so on, several hundred times.

tests

  • I have tested my changes by make test-all

references

@finnag finnag requested a review from a team as a code owner September 15, 2023 10:49
@github-actions github-actions bot added the go Pull requests that update Go code label Sep 15, 2023
@finnag finnag force-pushed the remove-recursion-module-load branch 3 times, most recently from 2db96fe to af24c99 Compare September 18, 2023 16:22
@finnag finnag changed the title fix: remove recursion from moduleInfo.load() fix: avoid OOM and infinite loops in moduleInfo.load() Sep 27, 2023
@jamengual jamengual added bug Something isn't working waiting-on-review Waiting for a review from a maintainer labels Sep 27, 2023
@GenPage
Copy link
Member

GenPage commented Oct 6, 2023

@finnag Have we updated the tests in modules_test.go or do they cover this new change in functionality?

@jamengual jamengual added waiting-on-response Waiting for a response from the user and removed waiting-on-review Waiting for a review from a maintainer labels Oct 10, 2023
@glasser
Copy link
Contributor

glasser commented Nov 7, 2023

We just ran into this as well.

@jamengual
Copy link
Contributor

@finnag do you have time to reply to the comment? Thanks.

Copy link

This issue is stale because it has been open for 1 month with no activity. Remove stale label or comment or this will be closed in 1 month.

@github-actions github-actions bot added the Stale label Dec 17, 2023
@glasser
Copy link
Contributor

glasser commented Dec 17, 2023

Would love to see this get merged. I could take over if help is needed and @finnag is unavailable.

@github-actions github-actions bot removed the Stale label Dec 18, 2023
@jamengual
Copy link
Contributor

jamengual commented Dec 18, 2023 via email

Copy link

This issue is stale because it has been open for 1 month with no activity. Remove stale label or comment or this will be closed in 1 month.

@glasser
Copy link
Contributor

glasser commented Jan 20, 2024

Haven't had a chance to take over yet

vincentgna added a commit to vincentgna/atlantis that referenced this pull request Feb 20, 2024
@finnag finnag requested a review from a team as a code owner February 26, 2024 17:58
@finnag finnag requested review from lukemassa and nitrocode and removed request for a team February 26, 2024 17:58
@finnag finnag force-pushed the remove-recursion-module-load branch from 74cc57d to 69b5c6a Compare April 25, 2024 14:07
moduleInfo.load() used to load modules and depdencies recursively,
and due to some unfortunate circumstances it would usually create a
very deep call stack where each load() called another load() to load
the next one.

The terraform parsing uses enough stack space that this can become a
problem, so for a slightly big project this sometimes caused atlantis
to die from a stack overflow:

runtime: goroutine stack exceeds 1000000000-byte limit
runtime: sp=0xc021478380 stack=[0xc021478000, 0xc041478000]
fatal error: stack overflow
[...]
github.com/runatlantis/atlantis/server/events.moduleInfo.load(...)
	...atlantis/server/events/modules.go:108 +0x46b fp=0xc021478750 sp=0xc021478570 pc=0xfeaa6b
github.com/runatlantis/atlantis/server/events.moduleInfo.load(...)
	...atlantis/server/events/modules.go:108 +0x46b fp=0xc021478930 sp=0xc021478750 pc=0xfeaa6b
github.com/runatlantis/atlantis/server/events.moduleInfo.load(...)
	...atlantis/server/events/modules.go:108 +0x46b fp=0xc021478b10 sp=0xc021478930 pc=0xfeaa6b

... and so on, several hundred times.
@finnag finnag force-pushed the remove-recursion-module-load branch from 69b5c6a to a5814f9 Compare October 8, 2024 15:48
@finnag
Copy link
Contributor Author

finnag commented Oct 8, 2024

Had some time to look at atlantis again and rebased all my PRs to v0.30.0. We've run with this one for a year+. The diff with whitespace removed is pretty minimal - what kind of tests are required for this?

@X-Guardian
Copy link
Contributor

Hi @finnag. Take a look at the current server\events\modules_test.go file and add some tests that cover your code change, i.e. test cases with multiple duplicate dependencies and verify that they are only loaded once.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working go Pull requests that update Go code never-stale waiting-on-response Waiting for a response from the user
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants