
Materialize on AWS Cloud Platform

Terraform module for deploying Materialize on AWS Cloud Platform with all required infrastructure components.

The module has been tested with:

  • PostgreSQL 15
  • Materialize Helm Operator Terraform Module v0.1.8

Warning

This module is intended for demonstration and evaluation purposes, and to serve as a template when building your own production deployment of Materialize.

This module should not be relied upon directly for production deployments: future releases will contain breaking changes. Instead, to use it as a starting point for your own production deployment, either:

  • Fork this repo and pin to a specific version, or
  • Use the code as a reference when developing your own deployment.

Providers Configuration

The module requires the following providers to be configured:

provider "aws" {
  region = "us-east-1"
  # Other AWS provider configuration as needed
}

# Required for EKS authentication
provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    args        = ["eks", "get-token", "--cluster-name", module.eks.cluster_name]
    command     = "aws"
  }
}

# Required for Materialize Operator installation
provider "helm" {
  kubernetes {
    host                   = module.eks.cluster_endpoint
    cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

    exec {
      api_version = "client.authentication.k8s.io/v1beta1"
      args        = ["eks", "get-token", "--cluster-name", module.eks.cluster_name]
      command     = "aws"
    }
  }
}

Note: The Kubernetes and Helm providers are configured to use the AWS CLI for authentication with the EKS cluster. This requires that you have the AWS CLI installed and configured with access to the AWS account where the EKS cluster is deployed.

You can also set the AWS_PROFILE environment variable to the name of the profile you want to use for authentication with the EKS cluster:

export AWS_PROFILE=your-profile-name
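
With the providers configured, the module itself can be instantiated. Below is a minimal sketch, assuming you pin to a tagged release and supply the three required inputs (namespace, environment, and database_password); all other values shown are examples:

module "materialize" {
  # Pin to a specific release tag instead of tracking the default branch
  source = "github.com/MaterializeInc/terraform-aws-materialize?ref=v0.3.0"

  namespace         = "mycompany"   # namespace for all resources, e.g. your organization or project name
  environment       = "dev"
  database_password = var.database_password   # supply via tfvars or an environment variable
}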

Disk Support for Materialize

This module can configure disk support for Materialize using NVMe instance storage, OpenEBS, and lgalloc.

When disk support is enabled, use instance types from the r6gd or r7gd families, or other instance types with NVMe instance storage.

Enabling Disk Support

To enable disk support with default settings:

enable_disk_support = true

This will:

  1. Install OpenEBS via Helm
  2. Configure NVMe instance store volumes using the bootstrap script
  3. Create appropriate storage classes for Materialize
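
Disk support pairs with an NVMe-capable node group. As a sketch, the instance type shown below is the module default; any r6gd/r7gd or other NVMe-backed type works:

enable_disk_support       = true
node_group_instance_types = ["r7gd.2xlarge"]   # instance family with local NVMe instance storage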

Advanced Configuration

If you need more control over the disk setup:

enable_disk_support = true

disk_support_config = {
  openebs_version = "4.2.0"
  storage_class_name = "custom-storage-class"
  storage_class_parameters = {
    volgroup = "custom-volume-group"
  }
}

Requirements

Name Version
terraform >= 1.0
aws ~> 5.0
helm ~> 2.0
kubernetes ~> 2.0
random ~> 3.0

Providers

Name Version
aws 5.88.0

Modules

Name Source Version
aws_lbc ./modules/aws-lbc n/a
certificates ./modules/certificates n/a
database ./modules/database n/a
eks ./modules/eks n/a
networking ./modules/networking n/a
nlb ./modules/nlb n/a
operator github.com/MaterializeInc/terraform-helm-materialize v0.1.9
storage ./modules/storage n/a

Resources

Name Type
aws_cloudwatch_log_group.materialize resource
aws_iam_access_key.materialize_user resource
aws_iam_role.materialize_s3 resource
aws_iam_role_policy.materialize_s3 resource
aws_iam_user.materialize resource
aws_iam_user_policy.materialize_s3 resource
aws_caller_identity.current data source
aws_region.current data source

Inputs

Name Description Type Default Required
availability_zones List of availability zones list(string)
[
"us-east-1a",
"us-east-1b",
"us-east-1c"
]
no
bucket_force_destroy Enable force destroy for the S3 bucket bool true no
bucket_lifecycle_rules List of lifecycle rules for the S3 bucket
list(object({
id = string
enabled = bool
prefix = string
transition_days = number
transition_storage_class = string
noncurrent_version_expiration_days = number
}))
[
{
"enabled": true,
"id": "cleanup",
"noncurrent_version_expiration_days": 90,
"prefix": "",
"transition_days": 90,
"transition_storage_class": "STANDARD_IA"
}
]
no
cert_manager_chart_version Version of the cert-manager helm chart to install. string "v1.17.1" no
cert_manager_install_timeout Timeout for installing the cert-manager helm chart, in seconds. number 300 no
cert_manager_namespace The name of the namespace in which cert-manager is or will be installed. string "cert-manager" no
cluster_enabled_log_types List of desired control plane logging to enable list(string)
[
"api",
"audit",
"authenticator",
"controllerManager",
"scheduler"
]
no
cluster_version Kubernetes version for the EKS cluster string "1.32" no
create_vpc Controls if VPC should be created (it affects almost all resources) bool true no
database_name Name of the database to create string "materialize" no
database_password Password for the database (should be provided via tfvars or environment variable) string n/a yes
database_username Username for the database string "materialize" no
db_allocated_storage Allocated storage for the RDS instance (in GB) number 20 no
db_instance_class Instance class for the RDS instance string "db.t3.large" no
db_max_allocated_storage Maximum storage for autoscaling (in GB) number 100 no
db_multi_az Enable multi-AZ deployment for RDS bool false no
disk_support_config Advanced configuration for disk support (only used when enable_disk_support = true)
object({
install_openebs = optional(bool, true)
run_disk_setup_script = optional(bool, true)
create_storage_class = optional(bool, true)
openebs_version = optional(string, "4.2.0")
openebs_namespace = optional(string, "openebs")
storage_class_name = optional(string, "openebs-lvm-instance-store-ext4")
storage_class_provisioner = optional(string, "local.csi.openebs.io")
storage_class_parameters = optional(object({
storage = optional(string, "lvm")
fsType = optional(string, "ext4")
volgroup = optional(string, "instance-store-vg")
}), {})
})
{} no
enable_bucket_encryption Enable server-side encryption for the S3 bucket bool true no
enable_bucket_versioning Enable versioning for the S3 bucket bool true no
enable_cluster_creator_admin_permissions To add the current caller identity as an administrator bool true no
enable_disk_support Enable disk support for Materialize using OpenEBS and NVMe instance storage. When enabled, this configures OpenEBS, runs the disk setup script for NVMe devices, and creates appropriate storage classes. bool true no
enable_monitoring Enable CloudWatch monitoring bool true no
environment Environment name (e.g., prod, staging, dev) string n/a yes
helm_chart Chart name from repository or local path to chart. For local charts, set the path to the chart directory. string "materialize-operator" no
helm_values Additional Helm values to merge with defaults any {} no
install_aws_load_balancer_controller Whether to install the AWS Load Balancer Controller bool true no
install_cert_manager Whether to install cert-manager. bool false no
install_materialize_operator Whether to install the Materialize operator bool true no
install_metrics_server Whether to install the metrics-server for the Materialize Console bool true no
kubernetes_namespace The Kubernetes namespace for the Materialize resources string "materialize-environment" no
log_group_name_prefix Prefix for the CloudWatch log group name (will be combined with environment name) string "materialize" no
materialize_instances Configuration for Materialize instances. Due to limitations in Terraform, materialize_instances cannot be defined on the first terraform apply.
list(object({
name = string
namespace = optional(string)
database_name = string
environmentd_version = optional(string, "v0.130.4")
cpu_request = optional(string, "1")
memory_request = optional(string, "1Gi")
memory_limit = optional(string, "1Gi")
create_database = optional(bool, true)
create_nlb = optional(bool, true)
internal_nlb = optional(bool, true)
enable_cross_zone_load_balancing = optional(bool, true)
in_place_rollout = optional(bool, false)
request_rollout = optional(string)
force_rollout = optional(string)
balancer_memory_request = optional(string, "256Mi")
balancer_memory_limit = optional(string, "256Mi")
balancer_cpu_request = optional(string, "100m")
}))
[] no
metrics_retention_days Number of days to retain CloudWatch metrics number 7 no
namespace Namespace for all resources, usually the organization or project name string n/a yes
network_id The ID of the VPC in which resources will be deployed. Only used if create_vpc is false. string "" no
network_private_subnet_ids A list of private subnet IDs in the VPC. Only used if create_vpc is false. list(string) [] no
network_public_subnet_ids A list of public subnet IDs in the VPC. Only used if create_vpc is false. list(string) [] no
node_group_ami_type AMI type for the node group string "AL2023_ARM_64_STANDARD" no
node_group_capacity_type Capacity type for worker nodes (ON_DEMAND or SPOT) string "ON_DEMAND" no
node_group_desired_size Desired number of worker nodes number 2 no
node_group_instance_types Instance types for worker nodes.

Recommended Configuration for Running Materialize with disk:
- Tested instance types: r6gd, r7gd families (ARM-based Graviton instances)
- Enable disk setup when using r7gd
- Note: Ensure instance store volumes are available and attached to the nodes for optimal performance with disk-based workloads.
list(string)
[
"r7gd.2xlarge"
]
no
node_group_max_size Maximum number of worker nodes number 4 no
node_group_min_size Minimum number of worker nodes number 1 no
operator_namespace Namespace for the Materialize operator string "materialize" no
operator_version Version of the Materialize operator to install string null no
orchestratord_version Version of the Materialize orchestrator to install string null no
postgres_version Version of PostgreSQL to use string "15" no
private_subnet_cidrs CIDR blocks for private subnets list(string)
[
"10.0.1.0/24",
"10.0.2.0/24",
"10.0.3.0/24"
]
no
public_subnet_cidrs CIDR blocks for public subnets list(string)
[
"10.0.101.0/24",
"10.0.102.0/24",
"10.0.103.0/24"
]
no
service_account_name Name of the service account string "12345678-1234-1234-1234-123456789012" no
single_nat_gateway Use a single NAT Gateway for all private subnets bool false no
tags Default tags to apply to all resources map(string)
{
"Environment": "dev",
"Project": "materialize",
"Terraform": "true"
}
no
use_local_chart Whether to use a local chart instead of one from a repository bool false no
use_self_signed_cluster_issuer Whether to install and use a self-signed ClusterIssuer for TLS. Due to limitations in Terraform, this may not be enabled before the cert-manager CRDs are installed. bool false no
vpc_cidr CIDR block for VPC string "10.0.0.0/16" no

Outputs

Name Description
cluster_certificate_authority_data Base64 encoded certificate data required to communicate with the cluster
cluster_oidc_issuer_url The URL on the EKS cluster for the OpenID Connect identity provider
database_endpoint RDS instance endpoint
eks_cluster_endpoint EKS cluster endpoint
eks_cluster_name EKS cluster name
materialize_s3_role_arn The ARN of the IAM role for Materialize
metadata_backend_url PostgreSQL connection URL in the format required by Materialize
nlb_details Details of the Materialize instance NLBs.
oidc_provider_arn The ARN of the OIDC Provider
operator_details Details of the installed Materialize operator
persist_backend_url S3 connection URL in the format required by Materialize using IRSA
private_subnet_ids List of private subnet IDs
public_subnet_ids List of public subnet IDs
s3_bucket_name Name of the S3 bucket
vpc_id VPC ID

Post-Deployment Setup

After successfully deploying the infrastructure with this module, you'll need to:

  1. (Optional) Configure storage classes
  2. Install the Materialize Operator
  3. Deploy your first Materialize environment

See our Operator Installation Guide for instructions.
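
For step 3, Materialize environments managed by this module are declared through the materialize_instances input on a subsequent terraform apply (it cannot be set on the first apply). A minimal sketch, with the instance and database names as examples:

materialize_instances = [
  {
    name           = "analytics"       # example instance name
    namespace      = "materialize-environment"
    database_name  = "analytics_db"    # example database name
    cpu_request    = "2"
    memory_request = "4Gi"
    memory_limit   = "4Gi"
  }
]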

Connecting to Materialize instances

By default, Network Load Balancers are created for each Materialize instance, with three listeners:

  1. Port 6875 for SQL connections to the database.
  2. Port 6876 for HTTP(S) connections to the database.
  3. Port 8080 for HTTP(S) connections to the web console.

The DNS name and ARN for the NLBs will be in the terraform output as nlb_details.
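 
For example, once the NLB DNS name is known from that output, a SQL connection can be opened with psql (the user and DNS name below are placeholders):

terraform output nlb_details
psql "postgres://<user>@<nlb-dns-name>:6875/materialize"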

TLS support

For demonstration purposes, optional TLS support is provided using cert-manager and a self-signed ClusterIssuer.

More advanced TLS support, such as user-provided CAs or per-Materialize Issuers, is out of scope for this Terraform module. Please refer to the cert-manager documentation for detailed guidance on more advanced usage.

To enable installation of cert-manager and configuration of the self-signed ClusterIssuer:
  1. Set install_cert_manager to true.
  2. Run terraform apply.
  3. Set use_self_signed_cluster_issuer to true.
  4. Run terraform apply.

Due to limitations in Terraform, Kubernetes resources that use CRDs which do not yet exist cannot be planned. cert-manager therefore has to be installed in the first terraform apply, before the ClusterIssuer and Certificate resources are created in the second terraform apply.
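
Expressed as module inputs, the two-step rollout looks like this (shown as two sequential tfvars states):

# First terraform apply: install cert-manager only
install_cert_manager           = true
use_self_signed_cluster_issuer = false

# Second terraform apply, once the cert-manager CRDs exist
use_self_signed_cluster_issuer = true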

Upgrade Notes

v0.3.0

We now install the AWS Load Balancer Controller and create Network Load Balancers for each Materialize instance.

If you manage Materialize instances with this module, additional action may be required when upgrading to this version.

If you want to disable NLB support:
  • Set install_aws_load_balancer_controller to false.
  • Set materialize_instances[*].create_nlb to false.
If you want to enable NLB support:
  • Leave install_aws_load_balancer_controller set to its default of true.
  • Set materialize_instances[*].create_nlb to false.
  • Run terraform apply.
  • Set materialize_instances[*].create_nlb to true.
  • Run terraform apply.

Due to limitations in Terraform, Kubernetes resources that use CRDs which do not yet exist cannot be planned. The AWS Load Balancer Controller therefore has to be installed in the first terraform apply, before the TargetGroupBinding resources are created in the second terraform apply.
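
The corresponding input changes for the two-step NLB enablement (shown as two sequential tfvars states; instance and database names are examples):

# First terraform apply: controller installed, NLBs not yet created
install_aws_load_balancer_controller = true
materialize_instances = [
  { name = "analytics", database_name = "analytics_db", create_nlb = false }
]

# Second terraform apply: create the NLBs and TargetGroupBindings
materialize_instances = [
  { name = "analytics", database_name = "analytics_db", create_nlb = true }
]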