try() function errors on nonexistent resource #24402

Open
danieladams456 opened this issue Mar 18, 2020 · 18 comments
Labels
config, custom-conditions (Feedback on variable validation, preconditions, postconditions, checks, and test assertions), enhancement

Comments

@danieladams456

danieladams456 commented Mar 18, 2020

Terraform Version

Terraform v0.12.23
+ provider.aws v2.53.0

Terraform Configuration Files

provider "aws" {}

output "nonexistent_file" {
  value = try(file("nonexistent"), "no file")
}

output "nonexistent_role" {
  value = try(aws_iam_role.nonexistent.arn, "no role")
}

Debug Output

https://gist.github.com/danieladams456/3037dd17100be21f816806450aba6ef8

Expected Behavior

The try function should catch the error and return "no role" to the output.

Actual Behavior

Error: Reference to undeclared resource

  on main.tf line 8, in output "nonexistent_role":
   8:   value = try(aws_iam_role.nonexistent.arn, "no role")

A managed resource "aws_iam_role" "nonexistent" has not been declared in the
root module.

Steps to Reproduce

  1. terraform init
  2. terraform apply

Additional Context

My use case is a multi-module Terraform project that is generated from a standard template. One resource in one of the modules is sometimes not needed. I would like to gracefully detect whether that resource is present and return null in the output if it is not. Other modules that consume that output would implement null handling, but I don't want to have to edit my main.tf to stop passing that variable around if the resource isn't there. This also lets me just delete the HCL file containing that single resource, rather than setting a flag variable and making both the resource and the output conditional on it.
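For comparison, the flag-variable alternative mentioned above can be sketched roughly as follows. This is a minimal illustration, not part of the original report: the variable name `create_role`, the resource name `optional`, and the role settings are all hypothetical. It relies on the fact that indexing an empty `count` list is an error raised during expression evaluation, which `try` can catch.

```hcl
# Hypothetical flag; not part of the original configuration.
variable "create_role" {
  type    = bool
  default = true
}

resource "aws_iam_role" "optional" {
  # The block is always declared; count decides whether an instance exists.
  count = var.create_role ? 1 : 0

  name = "example-optional-role" # placeholder name
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}

output "optional_role_arn" {
  # When count is 0 the index [0] is out of range, so try() falls back to null.
  value = try(aws_iam_role.optional[0].arn, null)
}
```

The trade-off is the one described above: the resource block has to stay in the configuration and be switched off with a flag, rather than being deleted outright.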

References

none I could find

@pkolyvas added the custom-conditions label (Feedback on variable validation, preconditions, postconditions, checks, and test assertions) Apr 9, 2020
@DavidGamba

It would be great if you could extend the data resources to have a flag for when they fail but without failing the entire plan so you could handle that as a conditional.

@sami12rom

any news on this?

@soniadas0210

any update on this requirement ?

@scalp42
Contributor

scalp42 commented Apr 1, 2023

Another perfect example for this is the chicken/egg issue with Route53 private zones and associations to VPCs, where you have different states (say at VPC level and a global level like Route53).

@elduds

elduds commented May 18, 2023

My god I can't believe it took me so long googling to find this.

What is try() for except conditionally catching and handling errors!?

Use case is conditionally adding a workspace identifier tag for resources provisioned from a TFC workspace, typically added to AWS provider default_tags {}. Feels like a pretty obvious requirement:

  • If a tfe_workspace datasource is passed to the module, resolve the value of the technical:terraform:workspace_url tag to be the workspace URL from that data source.
  • Otherwise, such as a local execution, ignore the tag.
locals {
  tfe_workspace_html_uri = try(
    data.tfe_workspace.current.html_url, null)

  tf_context_tags = {
    "technical:terraform:workspace_url": local.tfe_workspace_html_uri
  }
}

provider "aws" {
  region = var.region
  default_tags {
    tags = merge(var.resourcetags, local.tf_context_tags)
  }  
}

@algo7

algo7 commented Jun 8, 2023

> Another perfect example for this is the chicken/egg issue with Route53 private zones and associations to VPCs, where you have different states (say at VPC level and a global level like Route53).

I am currently facing exactly the same condition as you mentioned

@andrewmackett

Has anybody found a workaround for this issue?

@rpgd60

rpgd60 commented Nov 20, 2023

Like others, I need it for data sources - fail gracefully if the underlying query returns no values.

@Pasqual24

Pasqual24 commented Jan 5, 2024

Same issue on Azure. How do you handle a case where an Azure resource is deleted and Terraform doesn't "see" it during the Terraform plan?

The generated plan seems good: Terraform will deploy a child resource... but there's no validation to check whether the parent resource really exists or not. The plan looks good, but apply fails because the parent resource has been deleted outside of Terraform.

I'd love to use a data source query to explicitly check if the parent resource exists, since Terraform can't handle it, but then an error is generated and the whole deployment stops :-/

@Nyque

Nyque commented Mar 26, 2024

The workaround so far (not a very good one though) is to use AWS CLI to get the necessary info.

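data "external" "example" {
  # Call the AWS CLI in a shell script. The external data source expects the
  # script to print a JSON object whose values are all strings, for example
  # {"exists": "true"} (the "exists" key is an assumed name).
  program = ["bash", "${path.module}/example.sh"]
}

output "nonexistent_role" {
  # result is a map of strings, not a bool, so compare against "true".
  value = data.external.example.result.exists == "true" ? aws_iam_role.nonexistent.arn : null
}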

@omry-arpaly

Adding my support for looking into this issue - try() should be able to handle "Error: Reference to undeclared resource" errors: it's one of its major use cases. The fact that the documentation provides examples that just happen to involve accessing potentially missing parts of an existing structure (e.g., an array member) only increases the confusion, and it never highlights that limitation (=bug).

@MrTrustworthy

MrTrustworthy commented May 22, 2024

➕ 1 on this feature request

Not sure if this would work for all of the use cases mentioned above, but there's a proposal for a (hopefully) simple and backwards compatible approach that might work for all(?) use cases:

  1. Add an attribute to data blocks that behaves like nullable = optional(bool, false)
  2. If set to false (the default), it behaves as it currently does
  3. If set to true, a failure of the data block to look up the corresponding resource will lead to the data being set to null, and not cause a failure & abort.
  4. All attempts to read from a nullable = true data block can either use simple null-checks or try(data.x.myattribute), depending on what they want/need to do, and it will just work as expected.

It's similar to how nullable works within modules/variables, so it's just an extension of this concept to data blocks and wouldn't be a completely new & unexpected mechanic.

It also doesn't rely on try() retroactively being able to catch hard errors during the data evaluation, so at least this sounds easier to implement.

@apparentlymart
Contributor

apparentlymart commented May 29, 2024

Hi all,

The try function is specifically for catching dynamic errors, by which I mean errors that occur based on invalid types or values rather than on references to undeclared objects. This is similar to how in many general-purpose languages the exception handling mechanism cannot "catch" statically-invalid code such as a reference to a variable that wasn't declared, or a syntax error.

While I can see that code generation does blur the line between "static" and "dynamic", in most cases there's no reason to dynamically check whether a resource is declared because that decision cannot be made dynamically based on runtime data. The try function design prioritizes still returning an error for situations where something cannot possibly ever be valid for the current configuration, because that reduces the chance of someone making a mistake where try would always fail but then not notice that. Changing that would give worse feedback in the common case in support of a relatively-rare situation.

It's also not really feasible to catch static errors with a function, because a function call is itself a dynamic operation. A statically invalid expression causes a validation error long before Terraform even begins expression evaluation. If there is something to be solved here then we'll need to solve it in a different way.
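To make the static/dynamic distinction concrete, here is a minimal sketch. The variable `settings_json` and its default value are assumptions introduced for illustration; the commented-out output is the case from the original report.

```hcl
# Hypothetical input: a JSON document whose shape is only known once it is decoded.
variable "settings_json" {
  type    = string
  default = "{\"name\": \"example\"}"
}

locals {
  settings = jsondecode(var.settings_json)

  # Dynamic error: local.settings is declared, but the decoded value has no
  # "region" attribute. That is only discovered during evaluation, so try()
  # catches it and returns the fallback.
  region = try(local.settings.region, "us-east-1")
}

output "region" {
  value = local.region
}

# Static error: no aws_iam_role.nonexistent block is declared in the module,
# so the reference is rejected before try() is ever called.
# output "nonexistent_role" {
#   value = try(aws_iam_role.nonexistent.arn, "no role")
# }
```

With the default input, the first case should evaluate to the fallback value, while uncommenting the second reproduces the "Reference to undeclared resource" error from the report.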


If you are using code generation to decide whether or not a particular resource block is generated, a possible solution is to also make the code generator produce a local value whose definition varies depending on whether the resource block was generated.

For example, if the code generator decides to generate the resource then it could also generate a local value that refers to it:

resource "aws_iam_role" "example" {
  # ...
}

locals {
  example_iam_role_arn = aws_iam_role.example.arn
}

...but if the code generator decides not to generate the resource then it would still generate a declaration for the same local value name but set it to null instead:

locals {
  example_iam_role_arn = null
}

Then other code in the module can refer to local.example_iam_role_arn regardless of the code generation decision, as long as it's able to deal with the value possibly being null:

output "example_role_arn" {
  value = local.example_iam_role_arn
}

If you're already doing code generation anyway then I would expect the code generator to use techniques like this to deal with its differences as a code generation concern, rather than using a weird mix of code generation and dynamic decision making together.


The other examples given in subsequent comments don't seem to be about code-generation, but I don't really understand what they are about. If you shared a non-code-generation-related use-case in the comments above I'd appreciate if you could share a fuller example of what you are doing so that I can understand more.

I suspect that this issue has come to conflate multiple different use-cases just because they led you all to the same error message, and so I'd like to understand those use-cases so that the team can think about potential solutions.

@MrTrustworthy

MrTrustworthy commented Aug 19, 2024

> If you shared a non-code-generation-related use-case in the comments above I'd appreciate if you could share a fuller example of what you are doing so that I can understand more.

I can give one example:

  • Let's say I have 2 resources. For example, a VM and a Service Account. To create the VM, I need to configure it to use the service account - in short, I need a reference to the Service Account.
  • Now, I need to create this VM + Service Account pair for each of my applications (or users, or ...). So, I have a list of applications in my locals, and I do a for_each over the list/set of applications to generate those 2 resources for each.
  • But I don't actually manage the Service Accounts in Terraform - or, at least not in this TF workspace/repo. Let's assume this is generated/managed somewhere centrally for the whole organisation. So, to get the reference of the Service account, I can't use a resource, I have to use a data block to import it. I can easily do that based on the application name, so that's easy as well. So far, so good.
  • Now the issue: There might not always be a Service Account for each application. My app bananas might simply not have a Service Account, for some organisational reasons. In cases like those, I want to have a slightly different behaviour - for simplicity, let's say I have one central, shared "catch-all" Service Account that I use for all VMs that don't have a dedicated one.

That example is simplified, but I hope it gives you a situation that's not code-generation related. Ultimately, it's about being able to use a data block (generally in dynamic situations that combine it with for_each) when you can't guarantee that it (it = at least one element of the list) is always present.

Again, the issue here is not that the entire "block of text" like data.myresource.myinstance is not defined at all, but that the lookup to that entity at data.myresource.myinstance["bananas"] returns a "non existent" status. I'd like my TF code to be able to deal with this situation, instead of forcing it to be a complete abort. Ultimately, I don't even need the try function to catch it - a data block that's nullable would work already.
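One partial way to express the fallback today, sketched under a strong assumption: the set of Service Accounts that actually exist is already available as a map (for example from a plural data source or another workspace's outputs) rather than discovered through a singular data source that fails. All names and addresses below are placeholders.

```hcl
locals {
  applications = toset(["apples", "bananas"])

  # Assumption: accounts known to exist, keyed by application name.
  dedicated_service_accounts = tomap({
    apples = "sa-apples@example.invalid"
  })

  # The central "catch-all" account used when no dedicated one exists.
  shared_service_account = "sa-shared@example.invalid"

  # lookup() returns the default when the key is absent, so "bananas"
  # falls back to the shared account instead of failing the plan.
  service_account_by_app = {
    for app in local.applications :
    app => lookup(local.dedicated_service_accounts, app, local.shared_service_account)
  }
}

output "service_account_by_app" {
  value = local.service_account_by_app
}
```

This does not cover the case described above where existence can only be learned from a data source lookup that errors out; it only shows that a missing key in an already-known collection can be handled without aborting.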

@sworisbreathing

sworisbreathing commented Oct 25, 2024

@apparentlymart my situation is pretty similar to what @MrTrustworthy described, especially with respect to combining the data block with for_each and also to "thing a central team must manage".

To add to that, there are other situations where the tech you're interfacing with has "virtual" resources. For example, in AWS LakeFormation you can assign permissions to various principals such as IAM roles, IAM users, etc, but also there's this magic principal called IAM_ALLOWED_PRINCIPALS which doesn't really exist. With the current terraform behavior we have to engineer special logic to handle this edge case when managing LF permissions at scale.

There are other cases where third-party tech insists on managing some of the resource lifecycle on its own, meaning there will be inevitable conflicts if you try to import the resources into terraform. You might need to still manage other aspects of it though, so you might do something like:

resource "myresource" "foo" {
  # ...
}

data "myresource" "bar" {
  for_each   = # ...
  must_exist = false # or nullable = true
}

resource "myresource" "foo_bar_association" {
  for_each = data.myresource.bar
  foo_id   = myresource.foo.id
  bar_id   = each.value.id
}

I think having a nullable (default to false) or must_exist (default to true) attribute on data sources would be a reasonable feature addition. In either case, Terraform would default to the current behavior (which is to throw an error when the lookup fails), but with the option turned on, it would just add an appropriate warning message instead. In the above example, as soon as Terraform picks up the fact that data.myresource.bar["x"] has ceased to exist, the plan would correctly attempt to destroy myresource.foo_bar_association["x"] rather than bombing out with an error.

@apparentlymart
Contributor

apparentlymart commented Oct 25, 2024

Hi @MrTrustworthy and @sworisbreathing,

I appreciate you taking the time to answer my question.

Unfortunately, I don't work on the Terraform team at HashiCorp anymore, so I can't personally do anything to act on your responses, but I did want to note that both of you seem to be discussing the use-case of #16380 rather than the use-case that this issue was about, and so maybe the discussion over there will give you some ideas about different ways to solve your problems.


The following is just a personal response and not a statement on behalf of the Terraform team, but for what it's worth...

The main challenge with using the existence of something to decide whether to declare something else is that it's an inherent contradiction. You can see this for yourself using the following configuration that uses an existing data source that is already capable of returning an empty result:

data "aws_vpcs" "maybe" {
  tags = {
    Name = "exists"
  }
}

resource "aws_vpc" "exists" {
  count = length(data.aws_vpcs.maybe.ids) == 0 ? 1 : 0

  cidr_block = "10.111.0.0/16"
  tags = {
    Name = "exists"
  }
}

If you plan and apply this when no VPC exists, the first plan/apply round will indeed detect that data.aws_vpcs.maybe.ids is empty and so propose to create aws_vpc.exists.

But then if you run another plan/apply round it will then find out that the VPC exists, and so data.aws_vpcs.maybe.ids won't be empty anymore, and so Terraform will propose to destroy aws_vpc.exists.

And then if you run again, it'll propose to create again, and so on. This configuration can never converge because it contradicts itself... it says "this aws_vpc should exist if it doesn't exist", which is an impossible state to reach.

Therefore if there is to be a solution for your use-cases, it's gotta be something other than a data source that returns an empty result, or a decision made based on the failure of a data source. However, I can't say if the Terraform team is open to discussing anything like that since the existing issue for this suggestion was already closed. 🤷‍♂️

@DavidGamba

IMHO if I introduce a nullable data block into my codebase and the result never converges, it is a me problem, not a Terraform problem. Every language lets you shoot yourself in the foot. There are many existing ways to do it in Terraform.

All the mitigations already exist to prevent you from deleting the resource by mistake if your configuration is not converging because of a mistake you made, starting with just don't apply that change 😉
On the other hand, the benefits of the kinds of things people want to do with try or a nullable data block would far outweigh the dangers.

As for the data block, I feel like introducing another keyword would be unnecessary, all providers already support the data blocks and allowing me to decide what to do when there are 0 results seems fairly simple to implement.

@rishabhToshniwal

Hi,

I am facing a similar issue while checking whether an access entry exists in EKS; if it does not, I want to create it. The problem for me is that if the entry doesn't exist, I get an error.

provider "aws" {
  # Configuration options

}

data "aws_caller_identity" "current" {}

data "aws_partition" "current" {}

data "aws_iam_session_context" "current" {

  arn = try(data.aws_caller_identity.current.arn, "")
}


data "aws_eks_access_entry" "creator" {
  cluster_name  = "eksaccess-dev-eks-2321"
  principal_arn = "arn:aws:iam::xxxxxxxxx:role/aCognito-authrole-tags"
}


locals {
  eks_access_entry_arn = try(data.aws_eks_access_entry.creator.access_entry_arn, "")
}

output "eks_access_entry_outputs" {
  value = local.eks_access_entry_arn
}


The error I am getting is:

 Error: reading EKS Access Entry (eksaccess-dev-eks:arn:aws:iam:xxxxxxx:role/aCognito-authrole-tags): couldn't find resource
│
│   with data.aws_eks_access_entry.creator,
│   on main.tf line 21, in data "aws_eks_access_entry" "creator":
│   21: data "aws_eks_access_entry" "creator" {
