Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow skipping parsing of locals #1554

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

amnk
Copy link
Contributor

@amnk amnk commented Feb 18, 2021

Allow skipping parsing of locals

Since locals can have side effects, we might run into cases when we don't need to pasre them. One example is checkVersionConstraints(), which is executed early on and relies on parsing HCL.

Fixes #1427

Since `locals` can have side effects, we might run into cases when
we don't need to pasre them. One example is `checkVersionConstraints()`,
which is executed early on and relies on parsing Terraform.

Closes: gruntwork-io#1427
@amnk amnk changed the title Amnk/issues 1427 Allow skipping parsing of locals Feb 18, 2021
@@ -460,6 +460,7 @@ func checkVersionConstraints(terragruntOptions *options.TerragruntOptions) error
terragruntOptions,
nil,
[]config.PartialDecodeSectionType{config.TerragruntVersionConstraints},
false,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sanity checks:

  1. Does this mean you can no longer use locals in version constraints? I could see this being problematic if someone is trying to manage version constraints across many places with a single local variable somewhere...
  2. What other parts of terragrunt.hcl do we parse as part of checking version constraints? Could those other parts also be using run_cmd? Should we instead update this code to explicitly only parse version constraints and no other parts of the file?
  3. Whatever we decide, we should update the docs accordingly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amnk Bump ^

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brikis98 So it actually works like that already. At least kinda like that. In our code we have a decodeList option, to pass what exactly should be parsed, and when we check versions - we specify that. But locals and include are always parsed. This is probably because as you said - someone can use locals in version constraints. My code added ability to switch off locals parsing.

I initially thought that Terragrunt wants to follow Terraform, which does not support interpolation syntax in versions. Imho this is the most stable approach, and this will help us avoid creating lots of special cases in the code.

@yorinasub17
Copy link
Contributor

Had a thought (NOTE: haven't had a chance to review this yet, but wanted to share some thoughts on the approach):

Would this be resolved if we memoize the run_cmd function call? That is, we cache the results in memory, and return the cached results when it is called with the same parameters? That seems like it would avoid changing the availability of locals, while solving the problem of the external side effect being called multiple times.

@amnk
Copy link
Contributor Author

amnk commented Feb 25, 2021

Had a thought (NOTE: haven't had a chance to review this yet, but wanted to share some thoughts on the approach):

Would this be resolved if we memoize the run_cmd function call? That is, we cache the results in memory, and return the cached results when it is called with the same parameters? That seems like it would avoid changing the availability of locals, while solving the problem of the external side effect being called multiple times.

@yorinasub17 I was thinking about that, but in the end decided that ordering might be an issue here (e.g. someone might expect them to be executed later after applying some state). But still, it is a good idea, and we can implement it and document a warning.

@brikis98
Copy link
Member

Had a thought (NOTE: haven't had a chance to review this yet, but wanted to share some thoughts on the approach):

Would this be resolved if we memoize the run_cmd function call? That is, we cache the results in memory, and return the cached results when it is called with the same parameters? That seems like it would avoid changing the availability of locals, while solving the problem of the external side effect being called multiple times.

Doesn't that assume that run_cmd would never have side effects? E.g., If you had run_cmd("date"), you'd expect different results each time that appeared in your terragrunt.hcl.

@yorinasub17
Copy link
Contributor

in the end decided that ordering might be an issue here (e.g. someone might expect them to be executed later after applying some state)

We don't make any promises on when run_cmd runs (as it is a function that is executed on parsing of the config), so I don't think this is a concern. For the use case where you need precise ordering, you should be using before_hook and after_hook, which has more precise timing on when the command is called.

Doesn't that assume that run_cmd would never have side effects? E.g., If you had run_cmd("date"), you'd expect different results each time that appeared in your terragrunt.hcl.

I think we can mitigate that by memoizing on the source file diagnostics (where the function call appears).

That said, I can't really come up with any use case where this is the case and the args are exactly the same. Furthermore, it seems like you can easily workaround it by wrapping it in a bash script that accepts a nonce-arg for cache busting, unlike the current challenge where you can't workaround avoiding two calls to run_cmd.

@brikis98
Copy link
Member

brikis98 commented Mar 1, 2021

Doesn't that assume that run_cmd would never have side effects? E.g., If you had run_cmd("date"), you'd expect different results each time that appeared in your terragrunt.hcl.

I think we can mitigate that by memoizing on the source file diagnostics (where the function call appears).

Ah, that's a great idea. Yea, let's go with this approach then 👍 @amnk Could you implement this?

That said, I can't really come up with any use case where this is the case and the args are exactly the same. Furthermore, it seems like you can easily workaround it by wrapping it in a bash script that accepts a nonce-arg for cache busting, unlike the current challenge where you can't workaround avoiding two calls to run_cmd.

Just because we can't think of it, doesn't mean there's no use case for it, and while I agree that it's better to fix this dual-execution issue, I'd rather do so without introducing a new issue. Sounds like the memoization approach based on source file diagnostics accomplishes both, so I think we're good to go here.

@yorinasub17
Copy link
Contributor

yorinasub17 commented Mar 1, 2021

Ah, that's a great idea. Yea, let's go with this approach then 👍 @amnk Could you implement this?

Ugh I just took a quick scan, and I don't think we have this information to implement it in the function. I thought there might be a way to hook in the parser in the implementation function, but it doesn't look like you have anyway to get the diagnostics...

It looks like the error reporting works to wrap the error from the function implementation. That is, the function caller has the info, but it won't pass it through to the implementation function so there is no way to look it up.

So we may not be able to implement this...

Just because we can't think of it, doesn't mean there's no use case for it, and while I agree that it's better to fix this dual-execution issue, I'd rather do so without introducing a new issue. Sounds like the memoization approach based on source file diagnostics accomplishes both, so I think we're good to go here.

Given above, we might have to go with memoizing on filename (source file), command, and args if we take the memoization approach.

I want to make one more case for why I think introducing this specific new issue is better than the new issue introduced by alternative locals parsing, where you can't use locals in certain sections of the code where you could before).

The new issue we introduce (where it won't call the same command twice in a single call, in a single file) can be worked around in the following different ways:

  • You can split each call you want to make into separate hcl files that get read in with read_terragrunt_config. This should be doable if we memoize on the filename property, which is available when constructing the eval context, and AFAIK, this filename should be updated correctly by the read_terragrunt_config function.

  • You can add a nonce arg to each run_cmd call. This requires updating the run cmd call to use a wrapper script that ignores the nonce arg, but it does workaround the issue by cache busting within the file. In fact, I personally think this is better practice since it makes explicit that the command is intentionally called multiple times.

Contrast to the new issue we introduce by going with this current implementation, where we skip parsing locals for certain use cases. The new issue of avoiding locals parsing is that we won't be able to use locals for version constraints. The workaround for this is to use json files directly with file and jsondecode, or call read_terragrunt_config directly to DRY. This means that if you have nested or complex data files, you can potentially have a fairly long and complex interpolation in the version constraint, since you can't break it down into multiple steps with locals like you can now.

Personally, I think the workaround in the former issue actually has more advantages of making your code clearer, and is probably preferred in terms of how code should be written in terragrunt. I also think the memoization is likely going to easier to maintain and understand from an implementation perspective going forward.

@brikis98
Copy link
Member

brikis98 commented Mar 1, 2021

Fine by me 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

run_cmd in locals block executed twice
3 participants