feat: opt-out ssm parameters for github app #4335

Merged
merged 30 commits into from
Feb 24, 2025
499c7a1
feat(ssm): condition ssm parameters creation to a new variable to avo…
AppliNH Jan 6, 2025
ccfb39a
docs(examples): add a new example to use manual ssm parameters option
AppliNH Jan 6, 2025
70c0a2a
Merge branch 'main' into main
npalm Jan 9, 2025
4c18eab
Merge remote-tracking branch 'upstream/main'
AppliNH Feb 13, 2025
65a0755
docs: auto update terraform docs
github-actions[bot] Feb 13, 2025
fd65672
mv(example external-managed-ssm): change example name and refactor doc
AppliNH Feb 13, 2025
b613805
other(github_app): include ssm parameters inside existing github_app …
AppliNH Feb 13, 2025
e313ecf
Merge branch 'main' of github.com:AppliNH/terraform-aws-github-runner
AppliNH Feb 13, 2025
f5b6a5a
docs: auto update terraform docs
github-actions[bot] Feb 13, 2025
c653599
fix(modules ssm): fix outputs
AppliNH Feb 13, 2025
a73d557
Merge branch 'main' of github.com:AppliNH/terraform-aws-github-runner
AppliNH Feb 13, 2025
201a973
docs: auto update terraform docs
github-actions[bot] Feb 13, 2025
c2d2cd1
fix(ssm): fix module.ssm refs
AppliNH Feb 13, 2025
556bca1
fix(ssm ouputs): use simplier condition as coalesce cant be used here
AppliNH Feb 13, 2025
2ced074
Merge branch 'main' of github.com:AppliNH/terraform-aws-github-runner
AppliNH Feb 13, 2025
91723f0
other(vars github_app): add validation block and update description
AppliNH Feb 18, 2025
9b3b65e
docs: auto update terraform docs
github-actions[bot] Feb 18, 2025
64bda92
other(vars github_app): add note regarding precedence
AppliNH Feb 18, 2025
af32f7e
docs: auto update terraform docs
github-actions[bot] Feb 18, 2025
3b799b2
doc: add note regarde github app secrets in SSM
AppliNH Feb 18, 2025
e96a560
chore: adjust examples and add script to easy test example
npalm Feb 19, 2025
328eb74
docs: auto update terraform docs
github-actions[bot] Feb 19, 2025
12bffc8
add script to setup ssm
npalm Feb 19, 2025
e946698
Merge branch 'main' into main
npalm Feb 20, 2025
89673da
doc: clean docs
AppliNH Feb 20, 2025
4422e93
docs: add example to index
AppliNH Feb 20, 2025
8006a81
other(ssm-script docs): escape double quotes and refactor doc
AppliNH Feb 20, 2025
a1ef867
docs: auto update terraform docs
github-actions[bot] Feb 20, 2025
0c4a797
chore(ci): add new wexample to workflow
npalm Feb 20, 2025
ad26106
chore: small fixes
npalm Feb 21, 2025
1 change: 1 addition & 0 deletions .github/workflows/terraform.yml
@@ -139,6 +139,7 @@ jobs:
"ephemeral",
"termination-watcher",
"multi-runner",
"external-managed-ssm-secrets"
]
defaults:
run:
2 changes: 1 addition & 1 deletion README.md
@@ -141,7 +141,7 @@ Join our discord community via [this invite link](https://discord.gg/bxgXW8jJGh)
| <a name="input_eventbridge"></a> [eventbridge](#input\_eventbridge) | Enable the use of EventBridge by the module. By enabling this feature events will be put on the EventBridge by the webhook instead of directly dispatching to queues for scaling.<br/><br/> `enable`: Enable the EventBridge feature.<br/> `accept_events`: List that can be used to only allow specific events to be put on the EventBridge. By default all events are allowed; an empty list will be interpreted as all events. | <pre>object({<br/> enable = optional(bool, true)<br/> accept_events = optional(list(string), null)<br/> })</pre> | `{}` | no |
| <a name="input_ghes_ssl_verify"></a> [ghes\_ssl\_verify](#input\_ghes\_ssl\_verify) | GitHub Enterprise SSL verification. Set to 'false' when custom certificate (chains) is used for GitHub Enterprise Server (insecure). | `bool` | `true` | no |
| <a name="input_ghes_url"></a> [ghes\_url](#input\_ghes\_url) | GitHub Enterprise Server URL. Example: https://github.internal.co - DO NOT SET IF USING PUBLIC GITHUB. However if you are using Github Enterprise Cloud with data-residency (ghe.com), set the endpoint here. Example - https://companyname.ghe.com | `string` | `null` | no |
| <a name="input_github_app"></a> [github\_app](#input\_github\_app) | GitHub app parameters, see your github app. Ensure the key is the base64-encoded `.pem` file (the output of `base64 app.private-key.pem`, not the content of `private-key.pem`). | <pre>object({<br/> key_base64 = string<br/> id = string<br/> webhook_secret = string<br/> })</pre> | n/a | yes |
| <a name="input_github_app"></a> [github\_app](#input\_github\_app) | GitHub app parameters, see your github app. <br/> You can optionally create the SSM parameters yourself and provide the ARN and name here, through the `*_ssm` attributes.<br/> If you choose to provide the configuration values directly here, <br/> please ensure the key is the base64-encoded `.pem` file (the output of `base64 app.private-key.pem`, not the content of `private-key.pem`).<br/> Note: the provided SSM parameter ARN and name take precedence over the actual value (i.e. `key_base64_ssm` takes precedence over `key_base64`, etc.). | <pre>object({<br/> key_base64 = optional(string)<br/> key_base64_ssm = optional(object({<br/> arn = string<br/> name = string<br/> }))<br/> id = optional(string)<br/> id_ssm = optional(object({<br/> arn = string<br/> name = string<br/> }))<br/> webhook_secret = optional(string)<br/> webhook_secret_ssm = optional(object({<br/> arn = string<br/> name = string<br/> }))<br/> })</pre> | n/a | yes |
| <a name="input_idle_config"></a> [idle\_config](#input\_idle\_config) | List of time periods, defined as a cron expression, to keep a minimum amount of runners active instead of scaling down to 0. By defining this list you can ensure that in time periods that match the cron expression within 5 seconds a runner is kept idle. | <pre>list(object({<br/> cron = string<br/> timeZone = string<br/> idleCount = number<br/> evictionStrategy = optional(string, "oldest_first")<br/> }))</pre> | `[]` | no |
| <a name="input_instance_allocation_strategy"></a> [instance\_allocation\_strategy](#input\_instance\_allocation\_strategy) | The allocation strategy for spot instances. AWS recommends using `price-capacity-optimized` however the AWS default is `lowest-price`. | `string` | `"lowest-price"` | no |
| <a name="input_instance_max_spot_price"></a> [instance\_max\_spot\_price](#input\_instance\_max\_spot\_price) | Max price for spot instances per hour. This variable will be passed to the create fleet as max spot price for the fleet. | `string` | `null` | no |
50 changes: 37 additions & 13 deletions docs/configuration.md
@@ -10,7 +10,7 @@ To be able to support a number of use-cases, the module has quite a lot of confi
- Linux vs Windows. You can configure the OS types linux and win. Linux will be used by default.
- Re-use vs Ephemeral. By default runners are re-used, until detected idle. Once idle they will be removed from the pool. To improve security we are introducing ephemeral runners. Those runners are only used for one job. Ephemeral runners only work in combination with the workflow job event. For ephemeral runners the lambda requests a JIT (just in time) configuration via the GitHub API to register the runner. [JIT configuration](https://docs.github.com/en/actions/security-guides/security-hardening-for-github-actions#using-just-in-time-runners) is limited to ephemeral runners (and currently not supported by GHES). For non-ephemeral runners, a registration token is always requested. In both cases the configuration is made available to the instance via the same SSM parameter. To disable JIT configuration for ephemeral runners set `enable_jit_config` to `false`. We also suggest using a pre-build AMI to improve the start time of jobs for ephemeral runners.
- Job retry (**Beta**). By default the scale-up lambda discards the message once it is handled. In the ephemeral use case this means an instance is created and the resulting runner asks GitHub for a job, with no guarantee that it runs the job it was scaled up for. As a result, a small system hiccup can leave a job waiting for a runner. Enabling a pool (org runners) is one option to avoid this problem. Another option is to enable the job retry function, which retries the job after a delay for a configured number of times.
- GitHub Cloud vs GitHub Enterprise Server (GHES). The runners support GitHub Cloud (Public GitHub - github.com), GitHub Data Residency instances (ghe.com), and GitHub Enterprise Server. For GHES, we rely on our community for support and testing. We have no capability to test GHES ourselves.
- GitHub Cloud vs GitHub Enterprise Server (GHES). The runners support GitHub Cloud (Public GitHub - github.com), GitHub Data Residency instances (ghe.com), and GitHub Enterprise Server. For GHES, we rely on our community for support and testing. We have no capability to test GHES ourselves.
- Spot vs on-demand. The runners use either the EC2 spot or on-demand life cycle. Runners will be created via the AWS [CreateFleet API](https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_CreateFleet.html). The module (scale up lambda) will request via the CreateFleet API to create instances in one of the subnets and of the specified instance types.
- ARM64 support via Graviton/Graviton2 instance-types. When using the default example or top-level module, specifying `instance_types` that match a Graviton/Graviton 2 (ARM64) architecture (e.g. a1, t4g or any 6th-gen `g` or `gd` type), you must also specify `runner_architecture = "arm64"` and the sub-modules will be automatically configured to provision with ARM64 AMIs and leverage GitHub's ARM64 action runner. See below for more details.
- Default labels for the runners (os, architecture and `self-hosted`) can be disabled by setting `runner_disable_default_labels` = true. If set, the runner will only have the extra labels provided in `runner_extra_labels`. If you use your own start script, this configuration parameter needs to be parsed via SSM.
@@ -24,17 +24,44 @@ The module uses the AWS System Manager Parameter Store to store configuration fo
| `ssm_paths.root/var.prefix?/app/` | App secrets used by Lambda's |
| `ssm_paths.root/var.prefix?/runners/config/<name>` | Configuration parameters used by runner start script |
| `ssm_paths.root/var.prefix?/runners/tokens/<ec2-instance-id>` | Either JIT configuration (ephemeral runners) or registration tokens (non ephemeral runners) generated by the control plane (scale-up lambda), and consumed by the start script on the runner to activate / register the runner. |
| `ssm_paths.root/var.prefix?/webhook/runner-matcher-config` | Runner matcher config used by webhook to decide the target for the webhook event. |
| `ssm_paths.root/var.prefix?/webhook/runner-matcher-config` | Runner matcher config used by webhook to decide the target for the webhook event. |

Available configuration parameters:

| Parameter name | Description |
|-------------------------------------|---------------------------------------------------------------------------------------------------|
| `agent_mode` | Indicates if the agent is running in ephemeral mode or not. |
| `disable_default_labels` | Indicates if the default labels for the runners (os, architecture and `self-hosted`) are disabled |
| `enable_cloudwatch` | Configuration for the cloudwatch agent to stream logging. |
| `run_as` | The user used for running the GitHub action runner agent. |
| `token_path` | The path where tokens are stored. |
| Parameter name | Description |
| ------------------------ | ------------------------------------------------------------------------------------------------- |
| `agent_mode` | Indicates if the agent is running in ephemeral mode or not. |
| `disable_default_labels` | Indicates if the default labels for the runners (os, architecture and `self-hosted`) are disabled |
| `enable_cloudwatch` | Configuration for the cloudwatch agent to stream logging. |
| `run_as` | The user used for running the GitHub action runner agent. |
| `token_path` | The path where tokens are stored. |

### Note regarding GitHub App secrets provisioning in SSM

SSM parameters for GitHub App secrets (`webhook_secret`, `key_base64`, `id`) can also be manually created at the SSM path of your choice.

If you opt for this approach, fill in the `*_ssm` attributes of the `github_app` variable as follows:

```
github_app = {
  key_base64_ssm = {
    name = "/your/path/to/ssm/parameter/key-base-64"
    arn  = "arn:aws:ssm:::parameter/your/path/to/ssm/parameter/key-base-64"
  }
  id_ssm = {
    name = "/your/path/to/ssm/parameter/id"
    arn  = "arn:aws:ssm:::parameter/your/path/to/ssm/parameter/id"
  }
  webhook_secret_ssm = {
    name = "/your/path/to/ssm/parameter/webhook-secret"
    arn  = "arn:aws:ssm:::parameter/your/path/to/ssm/parameter/webhook-secret"
  }
}
```

Manually creating the SSM parameters that hold the configuration of your GitHub App avoids leaking critical plain text values in your terraform state and version control system. This is a recommended security practice for handling sensitive credentials.
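
The ARNs in the snippet above leave the region and account ID blank as placeholders; a real SSM parameter ARN carries both. The fragment below is a minimal sketch of one way to assemble the `github_app` object from parameters that already exist, without reading their values into the Terraform state. The variable `aws_region`, the locals and the path prefix are hypothetical names for illustration, and the `aws` partition is assumed.

```hcl
# Assumption: the region is supplied as a variable; adjust to your setup.
variable "aws_region" {
  type    = string
  default = "eu-west-1"
}

# Account ID of the credentials Terraform runs with.
data "aws_caller_identity" "current" {}

locals {
  # Hypothetical path prefix, matching the placeholder used above.
  ssm_prefix = "/your/path/to/ssm/parameter"

  # Parameter names start with "/", so the name is appended directly
  # after "parameter". The "aws" partition is an assumption.
  ssm_arn_prefix = "arn:aws:ssm:${var.aws_region}:${data.aws_caller_identity.current.account_id}:parameter"

  # Only names and ARNs are referenced here, never the secret values.
  github_app = {
    key_base64_ssm = {
      name = "${local.ssm_prefix}/key-base-64"
      arn  = "${local.ssm_arn_prefix}${local.ssm_prefix}/key-base-64"
    }
    id_ssm = {
      name = "${local.ssm_prefix}/id"
      arn  = "${local.ssm_arn_prefix}${local.ssm_prefix}/id"
    }
    webhook_secret_ssm = {
      name = "${local.ssm_prefix}/webhook-secret"
      arn  = "${local.ssm_arn_prefix}${local.ssm_prefix}/webhook-secret"
    }
  }
}
```

`local.github_app` can then be passed to the module's `github_app` input.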

For more details, see the [external-managed-ssm-secrets example](../examples/external-managed-ssm-secrets/README.md).
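
The precedence of the `*_ssm` attributes over the plain values can be pictured as one conditional per secret. The fragment below is only a sketch of that idea for `key_base64`; the resource name, the parameter path and the local name are hypothetical and are not taken from the module's source.

```hcl
variable "github_app" {
  type = object({
    key_base64 = optional(string)
    key_base64_ssm = optional(object({
      arn  = string
      name = string
    }))
  })
}

# Created only when no externally managed parameter is provided.
resource "aws_ssm_parameter" "key_base64" {
  count = var.github_app.key_base64_ssm != null ? 0 : 1

  name  = "/hypothetical/app/key-base64" # illustrative path
  type  = "SecureString"
  value = var.github_app.key_base64
}

locals {
  # The externally managed parameter wins whenever it is set;
  # otherwise fall back to the parameter created above.
  key_base64_ssm = var.github_app.key_base64_ssm != null ? var.github_app.key_base64_ssm : {
    arn  = aws_ssm_parameter.key_base64[0].arn
    name = aws_ssm_parameter.key_base64[0].name
  }
}
```

With this shape, a plain `key_base64` value is ignored whenever `key_base64_ssm` is set, which is the behavior the variable description calls precedence.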

## Encryption

@@ -124,7 +151,6 @@ You can configure runners to be ephemeral, in which case runners will be used on

The example for [ephemeral runners](examples/ephemeral.md) is based on the [default example](examples/default.md). Have a look at the diff to see the major configuration differences.


## Job retry (**Beta**)

You can enable the job retry function to retry a job after a delay for a configured number of times. The function is disabled by default. To enable the function set `job_retry.enable` to `true`. The function will check the job status after a delay, and when the job is still queued, it will create a new runner. The new runner is created in the same way as the others via the scale-up function. Hence the same configuration applies.
@@ -133,7 +159,6 @@ For checking the job status a API call is made to GitHub. Which can exhaust the

The option `job_retry.delay_in_seconds` is the delay before the job status is checked. The delay is increased by the factor `job_retry.delay_backoff` for each attempt. The upper bound for a delay is 900 seconds, which is the max message delay on SQS. The maximum number of attempts is configured via `job_retry.max_attempts`. The delay should be set to a higher value than the time it takes to start a runner.
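
As a rough illustration of how these options interact, the fragment below sets a starting delay and a backoff factor. The values are illustrative and the local name `job_retry` is hypothetical; only the attribute names come from the text above. The schedule in the comments assumes the factor is applied multiplicatively per attempt, which the description suggests but does not state outright.

```hcl
locals {
  # Illustrative values only; pass this object to the module's `job_retry` input.
  job_retry = {
    enable           = true
    delay_in_seconds = 180 # should exceed the time your runners need to start
    delay_backoff    = 2   # assumed to multiply the delay on each attempt
    max_attempts     = 3
  }

  # Under the multiplicative assumption the checks would wait roughly
  # 180 s, 360 s and 720 s. A further attempt would exceed 900 s,
  # the maximum message delay on SQS mentioned above.
}
```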


## Prebuilt Images

This module also allows you to run agents from a prebuilt AMI to gain faster startup times. The module provides several examples to build your own custom AMI. To remove old images, an [AMI housekeeper module](modules/public/ami-housekeeper.md) can be used. See the [AMI examples](ami-examples/index.md) for more details.
@@ -231,7 +256,7 @@ The watcher is listening for spot termination warnings and create a log message
### Termination handler

!!! warning
This feature will only work once the CloudTrail is enabled.
This feature will only work once the CloudTrail is enabled.

The termination handler listens for spot terminations by capturing the `BidEvictedEvent` via CloudTrail. The handler logs each termination and optionally creates a metric for it. The intent is to enhance the logic to inform the user about the termination via the GitHub Job or Workflow run. The feature is disabled by default. It is enabled once the watcher is enabled, and can be disabled explicitly by setting `instance_termination_watcher.features.enable_spot_termination_handler = false`.

@@ -332,5 +357,4 @@ resource "aws_iam_role_policy" "event_rule_firehose_role" {
}
```


NOTE: By default, a runner AMI update requires a re-apply of this terraform config (the runner AMI ID is looked up by a terraform data source). To avoid this, you can use `ami_id_ssm_parameter_name` to have the scale-up lambda dynamically look up the runner AMI ID from an SSM parameter at instance launch time. This SSM parameter is managed outside of this module (e.g. by a runner AMI build workflow).
1 change: 1 addition & 0 deletions docs/examples/external-managed-ssm-secrets.md
@@ -0,0 +1 @@
--8<-- "examples/external-managed-ssm-secrets/README.md"
1 change: 1 addition & 0 deletions docs/examples/index.md
@@ -9,3 +9,4 @@ Examples are located in the [examples](https://github.com/github-aws-runners/ter
- _[Prebuilt Images](prebuilt.md)_: Example usages of deploying runners with a custom prebuilt image.
- _[Windows](windows.md)_: Example usage of creating a runner using Windows as the OS.
- _[Termination watcher](termination-watcher.md)_: Example usages of termination watcher.
- _[Externally managed SSM secrets](external-managed-ssm-secrets.md)_: Example usage of externally managed SSM secrets for the GitHub App credentials.
2 changes: 1 addition & 1 deletion examples/default/README.md
@@ -6,7 +6,7 @@ This module shows how to create GitHub action runners. Lambda release will be do

Steps for the full setup, such as creating a GitHub app can be found in the root module's [README](https://github.com/github-aws-runners/terraform-aws-github-runner). First download the Lambda releases from GitHub. Alternatively you can build the lambdas locally with Node or Docker, there is a simple build script in `<root>/.ci/build.sh`. In the `main.tf` you can simply remove the location of the lambda zip files, the default location will work in this case.

> The default example assumes local built lambda's available. Ensure you have built the lambda's. Alternativly you can downlowd the lambda's. The version needs to be set to a GitHub release version, see https://github.com/github-aws-runners/terraform-aws-github-runner/releases
> The default example assumes local built lambda's available. Ensure you have built the lambda's. Alternatively you can download the lambda's. The version needs to be set to a GitHub release version, see https://github.com/github-aws-runners/terraform-aws-github-runner/releases

```bash
cd ../lambdas-download