run s3 jobs via step functions #136
base: main
Conversation
@@ -30,27 +30,35 @@ resource "aws_iam_policy" "run_task" {

data "aws_iam_policy_document" "run_task" {
This file looks exactly like the one for the scheduler role:
https://github.com/navapbc/platform-test/blob/main/infra/modules/service/scheduler_role.tf
But it didn't seem right to share them, given that they act on different ARNs and are assumed by different services.
for_each = var.file_upload_jobs

name     = "${var.service_name}-${each.key}"
role_arn = aws_iam_role.workflow_orchestrator.arn
This is the same role that the cron scheduler uses 😎
I left some comments about simplifying a few things, but otherwise this looks amazing.
input_template = replace(replace(jsonencode({
  containerOverrides = [
    {
      name    = local.container_name,
      command = each.value.task_command
    }
  ]
}), "\\u003c", "<"), "\\u003e", ">")
}
}
I think we can simplify this:
input_template = replace(replace(jsonencode({
  task_command = each.value.task_command
}), "\\u003c", "<"), "\\u003e", ">")
Then below, where we set the container overrides, we can do:
"Command.$" : "$.task_command"
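For reference, here's a sketch of how that flattened field would be consumed on the state machine side, assuming the definition already passes container overrides to the ECS RunTask integration as in this PR (the surrounding Overrides block is an assumption about the existing definition):

```hcl
# Sketch: inside the state machine definition's RunTask parameters,
# JSONPath selects the flattened task_command field off the input.
"Overrides" : {
  "ContainerOverrides" : [
    {
      "Name" : local.container_name,
      "Command.$" : "$.task_command"
    }
  ]
}
```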
#
# The syntax for parsing the input data comes from JSONPath.
"Name" : local.container_name,
"Command.$" : "$.containerOverrides[0].command[*]"
See my previous comment but I think we can simplify the input transformer and this line.
resource "aws_cloudwatch_log_group" "file_upload_jobs" {
  for_each = var.file_upload_jobs

  name_prefix = "/aws/vendedlogs/states/${var.service_name}-${each.key}"
I just realized that if someone names one of their file_upload_jobs the same as one of their scheduled_jobs, the log groups would conflict. Could we update this to:
- name_prefix = "/aws/vendedlogs/states/${var.service_name}-${each.key}"
+ name_prefix = "/aws/vendedlogs/states/file-upload-jobs/${var.service_name}-${each.key}"
and also make a similar change in scheduled_jobs.tf to add /scheduled-jobs/ into the path.
dynamic "statement" {
  for_each = aws_sfn_state_machine.file_upload_jobs

  content {
    actions = [
      "states:StartExecution",
    ]
    resources = [statement.value.arn]
I didn't realize this previously when we worked on the scheduled jobs, but wouldn't this be much simpler as a non-dynamic statement with an array for the resources list:
- dynamic "statement" {
-   for_each = aws_sfn_state_machine.file_upload_jobs
-   content {
-     actions = [
-       "states:StartExecution",
-     ]
-     resources = [statement.value.arn]
+ statement {
+   actions = [
+     "states:StartExecution",
+   ]
+   resources = [for job in aws_sfn_state_machine.file_upload_jobs : job.arn]
Can we try this? If it works, can we make the same change to scheduler_role.tf?
dynamic "statement" {
  for_each = aws_sfn_state_machine.file_upload_jobs

  content {
    actions = [
      "states:DescribeExecution",
      "states:StopExecution",
    ]
    resources = ["${statement.value.arn}:*"]
Similar to my previous comment, would this be simpler as:
- dynamic "statement" {
-   for_each = aws_sfn_state_machine.file_upload_jobs
-   content {
-     actions = [
-       "states:DescribeExecution",
-       "states:StopExecution",
-     ]
-     resources = ["${statement.value.arn}:*"]
+ statement {
+   actions = [
+     "states:DescribeExecution",
+     "states:StopExecution",
+   ]
+   resources = [for job in aws_sfn_state_machine.file_upload_jobs : "${job.arn}:*"]
Ticket
Resolves navapbc/template-infra#744
Changes
This PR changes the EventBridge file upload job from triggering an ECS task directly to triggering a Step Functions state machine, which then triggers the ECS task. We are doing this because Step Functions adds a basic observability layer (it surfaces failures and successes) to the triggering of ECS tasks.
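In rough terms, the new wiring can be sketched as follows. Resource and variable names (file_upload_jobs, workflow_orchestrator) follow this PR; the event rule reference is an illustrative assumption, not the exact code in the diff:

```hcl
# Sketch: the EventBridge target now points at the Step Functions state
# machine instead of the ECS task. Step Functions then calls ecs:RunTask,
# giving us an execution history (successes/failures) for each trigger.
resource "aws_cloudwatch_event_target" "file_upload_jobs" {
  for_each = var.file_upload_jobs

  rule     = aws_cloudwatch_event_rule.file_upload_jobs[each.key].name
  arn      = aws_sfn_state_machine.file_upload_jobs[each.key].arn
  role_arn = aws_iam_role.workflow_orchestrator.arn
}
```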
Context for reviewers
This PR looks a lot like the scheduled jobs PR. It even shares a role with the scheduled jobs (infra/modules/service/workflow_orchestrator_role.tf).
Testing
(The GIF runs through the whole process end to end and is about 70 seconds long.)
Screen.Recording.2024-09-13.at.2.mp4
Preview environment