Feature: allow the use of private networks #1567

Open
xavierleune wants to merge 7 commits into master from feature/private-network

Conversation

xavierleune (Contributor)

Hi there,

First, thank you for this awesome project! 🙌
This feature allows you to deploy k3s using only private IPs on your servers. It requires a VPN and NAT configuration, which are out of scope for this deployment.
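
A minimal sketch of the intended usage (illustrative only: disable_ipv4 and disable_ipv6 are the flags this PR introduces, but their exact shape and placement may differ):

    module "kube-hetzner" {
      # ... the usual configuration (token, SSH keys, nodepools, etc.) ...

      # Illustrative: run the nodes without public addresses. A VPN/NAT host
      # attached to the private network is a prerequisite (out of scope here).
      disable_ipv4 = true
      disable_ipv6 = true
    }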

This is my very first time working with Terraform, so I'm open to any feedback on this pull request as it can probably be improved.

Have a nice day!

Cf. #282, #1255

@xavierleune xavierleune force-pushed the feature/private-network branch from 15e7571 to e187c93 Compare November 22, 2024 10:18
cedric-lovit (Contributor) left a comment

I'm super excited for this feature, so thank you very much for putting effort into this!
You mentioned this was the first time you used Terraform... You did super well!

Welcome to the Terraform world! :D

nicolaracco

I attempted to switch to this implementation in an existing cookbook without making any other change to the kube.tf file. I’ve posted the current configuration here. However, I encountered several instances of the following error:

Error: Attempt to get attribute from null value
│ 
│   on .terraform/modules/kube-hetzner/modules/host/out.tf line 10, in output "private_ipv4_address":
│   10:   value = one(hcloud_server.server.network).ip
│     ├────────────────
│     │ hcloud_server.server.network is empty set of object
│ 
│ This value is null, so it does not have any attributes.
╵

Maybe it's expecting to always find the network referenced as an input? If I have some time later today, I'll try to debug the issue and add more info.
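
For what it's worth, a null-safe variant of that output would look something like this (just a sketch, not tested):

    # modules/host/out.tf (sketch): return null instead of failing when the
    # server has no attached network.
    output "private_ipv4_address" {
      value = try(one(hcloud_server.server.network).ip, null)
    }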

xavierleune (Contributor, Author)

@nicolaracco I confirm that the network should always be defined. It must be created before your cluster, because you have to be connected to this network through a non-k8s server.
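
For reference, a rough sketch of that prerequisite (IP ranges are examples, and the existing_network_id input name should be double-checked against the module's variables):

    # The private network must exist before the cluster, and your VPN/NAT
    # host must be attached to it (that part is out of scope here).
    resource "hcloud_network" "private" {
      name     = "k3s-private"
      ip_range = "10.0.0.0/8"
    }

    resource "hcloud_network_subnet" "k3s" {
      network_id   = hcloud_network.private.id
      type         = "cloud"
      network_zone = "eu-central"
      ip_range     = "10.0.0.0/16"
    }

    module "kube-hetzner" {
      # ... the rest of your configuration ...
      existing_network_id = hcloud_network.private.id # assumed input name
    }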

xavierleune (Contributor, Author)

@mysticaltech do you have an opinion on this PR? 🙏

mysticaltech (Collaborator)

@xavierleune Thanks for this, will review it tonight

4erdenko

> @xavierleune Thanks for this, will review it tonight

Hey, any updates?
Disabling public addresses on nodes is an important security feature.

mkajander

Tested this today (along with NAT + WireGuard) and was able to successfully deploy a cluster with public IPs disabled for the nodes.

mysticaltech previously approved these changes Jan 20, 2025

mysticaltech (Collaborator) left a comment

Looking good! Will merge and try it before releasing.

mysticaltech (Collaborator)

@xavierleune Here's my test deploy, the one I usually use for everything. Please make sure it deploys; this is important.

locals {
  # You have the choice of setting your Hetzner API token here or defining the
  # TF_VAR_hcloud_token env variable within your shell, such as: export TF_VAR_hcloud_token=xxxxxxxxxxx
  # If you choose to define it in the shell, this can be left as is.

  # Your Hetzner token can be found in your Project > Security > API Token (Read & Write is required).
  hcloud_token = "xxxxxx"
}

module "kube-hetzner" {
  providers = {
    hcloud = hcloud
  }
  hcloud_token = var.hcloud_token != "" ? var.hcloud_token : local.hcloud_token

  source = "../kube-hetzner"

  cluster_name = "test12"

  initial_k3s_channel = "v1.30"

  ssh_public_key  = file("/home/karim/.ssh/id_ed25519.pub")
  ssh_private_key = file("/home/karim/.ssh/id_ed25519")

  network_region = "eu-central"

  control_plane_nodepools = [
    {
      name        = "control-plane",
      server_type = "cx22",
      location    = "fsn1",
      labels      = [],
      taints      = [],
      count       = 1
    }
  ]

  agent_nodepools = [
    {
      name        = "agent-small",
      server_type = "cx22",
      location    = "fsn1",
      labels      = [],
      taints      = [],
      count       = 1
    }
  ]

  autoscaler_nodepools = [
    {
      name        = "autoscaled-small"
      server_type = "cx22",
      location    = "fsn1",
      min_nodes   = 1
      max_nodes   = 2
    }
  ]

  load_balancer_type     = "lb11"
  load_balancer_location = "fsn1"

}

provider "hcloud" {
  token = var.hcloud_token != "" ? var.hcloud_token : local.hcloud_token
}

terraform {
  required_version = ">= 1.5.0"
  required_providers {
    hcloud = {
      source  = "hetznercloud/hcloud"
      version = ">= 1.43.0"
    }
  }
}

output "kubeconfig" {
  value     = module.kube-hetzner.kubeconfig
  sensitive = true
}

variable "hcloud_token" {
  sensitive = true
  default   = ""
}

Currently, I'm getting the following error:

[Screenshot from 2025-01-20: Terraform error output]

mysticaltech (Collaborator) left a comment

@xavierleune The ball is in your court. Please, let's make sure this is fully backward compatible. See above. Thanks!

xavierleune (Contributor, Author)

@mysticaltech thanks for your feedback, I'll have a look!

Mikopet commented Jan 22, 2025

This is something I am also really looking forward to.

I have never run a k8s cluster with publicly available nodes, and this made me concerned. In fact, I was thinking about creating my own module because of that. But having this here would be way better :-)

Big thanks for the implementation!

xavierleune (Contributor, Author)

@mysticaltech I think that should do the trick; your test file now works as expected.

mysticaltech (Collaborator)

Thanks @xavierleune, will try ASAP

devops-fd

Super excited to get this live. When is this expected to merge?

devops-fd

Hello @xavierleune, I have tried to run your branch and got some issues with the post-creation script, as shown in the console:

module.kube-hetzner.null_resource.first_control_plane: Provisioning with 'remote-exec'...
module.kube-hetzner.null_resource.first_control_plane (remote-exec): Connecting to remote host via SSH...
module.kube-hetzner.null_resource.first_control_plane (remote-exec):   Host: 10.127.128.101
module.kube-hetzner.null_resource.first_control_plane (remote-exec):   User: root
module.kube-hetzner.null_resource.first_control_plane (remote-exec):   Password: false
module.kube-hetzner.null_resource.first_control_plane (remote-exec):   Private key: true
module.kube-hetzner.null_resource.first_control_plane (remote-exec):   Certificate: false
module.kube-hetzner.null_resource.first_control_plane (remote-exec):   SSH Agent: false
module.kube-hetzner.null_resource.first_control_plane (remote-exec):   Checking Host Key: false
module.kube-hetzner.null_resource.first_control_plane (remote-exec):   Target Platform: unix
module.kube-hetzner.null_resource.first_control_plane (remote-exec): Connected!
module.kube-hetzner.null_resource.first_control_plane (remote-exec): + /etc/cloud/rename_interface.sh
module.kube-hetzner.null_resource.first_control_plane: Still creating... [10s elapsed]
module.kube-hetzner.null_resource.first_control_plane (remote-exec): loop
module.kube-hetzner.null_resource.first_control_plane (remote-exec): Error: Device 'eth0' not found.
module.kube-hetzner.null_resource.first_control_plane (remote-exec): Error: unknown connection ''.
module.kube-hetzner.null_resource.first_control_plane (remote-exec): got failure exit code, repeating
module.kube-hetzner.null_resource.first_control_plane (remote-exec): loop

Since disabling the public interface results in a missing eth0, changes must be made to the script at locals.tf line 844 to take this case into consideration.

lefterisALEX

I think this block should handle the case where eth0 does not exist:

    # In case of a private-only network, eth0 may not exist
    if ip link show eth0 &>/dev/null; then
        eth0_connection=$(nmcli -g GENERAL.CONNECTION device show eth0)
        nmcli connection modify "$eth0_connection" \
          con-name eth0 \
          connection.interface-name eth0
    fi

gePower left a comment

Thanks for this code.

mysticaltech (Collaborator)

Thanks @xavierleune, on it, will do one final review and we can merge this.

mysticaltech (Collaborator) left a comment

General Review

Overall, this PR adds valuable functionality for running Kube-Hetzner clusters with private IPs or even purely private connectivity. The approach is sensible: it updates modules to handle disable_ipv4 and disable_ipv6 flags, modifies how the code retrieves the node address, and clarifies the user guide for NAT setups.

Below is a deeper review highlighting potential concerns:

  1. Connectivity Checks & Fallbacks

    • Various sections (e.g., modules/host/main.tf) now use coalesce(ipv4, ipv6, network ip). If both IPv4 and IPv6 are disabled, the module depends on a NAT or VPN for connectivity. You might want more explicit error handling for the scenario where even the private IP is unreachable.
  2. Resiliency of SSH Wait-Loop

    • The code that loops until the server is ready is set with a 600-second (or indefinite) wait. That might be okay, but in a worst-case scenario, it can stall a pipeline. Consider adding a strict timeout or additional checks to fail more gracefully.
  3. Backwards Compatibility

    • Changes appear backward compatible overall—existing setups that keep IPv4 enabled will continue to operate as before. Just ensure that users with older configurations do not break if they never define the new boolean flags.
  4. Validation & Documentation

    • For purely private setups, you now rely on user-managed NAT or VPN. That’s correct, but do consider adding more disclaimers about the need for a stable route, especially if both IPv4 and IPv6 are disabled.
  5. Maintainability

    • The repeated coalesce logic is absolutely correct, but if the code evolves in the future to support multiple interfaces or custom hostnames, a small helper or a local variable to unify the address selection might help; see the sketch right after this list.
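
For instance, something along these lines (an illustrative sketch, not code from this PR; the hcloud_server attribute names are from the provider, the ssh_address local is hypothetical):

    # Sketch: centralize the address selection instead of repeating coalesce().
    # Terraform's coalesce() skips null and empty-string values, so this falls
    # back from public IPv4 to IPv6 to the private network IP, and errors out
    # loudly if no address is available at all.
    locals {
      ssh_address = coalesce(
        hcloud_server.server.ipv4_address,
        hcloud_server.server.ipv6_address,
        try(one(hcloud_server.server.network).ip, null),
      )
    }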

Given these changes, I have some specific line-by-line suggestions below.

Summary

You’re on the right track. I recommend a few improvements to handle the purely private case more robustly and to keep the code a bit more self-explanatory. Great job overall!

mysticaltech (Collaborator)

@xavierleune Almost there brother, thanks in advance 🚀

xavierleune (Contributor, Author)

Hello there, sorry I've been away from this pull request for too long. I'll have a look at your feedback in the coming days!

@xavierleune xavierleune force-pushed the feature/private-network branch from 51dbf30 to c2bb343 Compare February 26, 2025 15:36
xavierleune (Contributor, Author)

This one might be the one 🤞

amedeopalopoli

> Thanks @xavierleune, on it, will do one final review and we can merge this.

Hi all,

In Hetzner private networks, I think we should add the command below in the init:
ip route add default via ${network-gateway}

In the same way as we did here, I think we should add a condition when we bring NetworkManager up.

cloudinit_runcmd_common = <<EOT
....
# Make sure the network is up
- [systemctl, restart, NetworkManager]
- [systemctl, status, NetworkManager]
- [ip, route, add, default, via, '172.31.1.1', dev, 'eth0']
....
EOT

In this case, I think we need to add the default network gateway if the public IP address is not enabled.

mysticaltech (Collaborator)

> [quoting @amedeopalopoli's comment above]

Thanks @xavierleune! What do you think about @amedeopalopoli's suggestion above?

xavierleune (Contributor, Author)

@amedeopalopoli thanks for the suggestion; however, I'm not sure we should do that. Without a public IP, we should instead add a route for 0.0.0.0/0 in Hetzner pointing to the private IP of your gateway; I followed this tutorial, as I pointed out in the readme. I think that trying to use the Hetzner gateway without a public IP would fail.
The setup I propose in the readme has been working for me for 3 months, and I recreated a control plane yesterday that has internet access. So I'm not sure how to deal with your suggestion ^^

lefterisALEX commented Feb 27, 2025

An example of how I have it working in my case:
Subnet: 192.168.100.0/24
NAT_GW IP: 192.168.100.2 (the IP of the instance I use as NAT gateway)

Then, on the instance (running Ubuntu) that has only a private interface, I do this:

    cat <<'EOF' >> /etc/systemd/network/10-enp7s0.network
    [Match]
    Name=enp7s0
    
    [Network]
    DHCP=yes
    Gateway=192.168.100.1
    EOF
    echo "DNS=9.9.9.9" | tee -a /etc/systemd/resolved.conf
    systemctl reload systemd-networkd
    systemctl restart systemd-resolved

Not 100% sure if I missed something, but I am happy to help if needed.

xavierleune (Contributor, Author)

@lefterisALEX thanks for the feedback; that looks a lot like the approach I was talking about. BTW, I think you might get rid of those steps by adding the route in the Hetzner routes for your subnet; it should then be automatic, like this but with my gateway ^^

[Screenshot from 2025-02-27: Hetzner network routes configuration]

aru-amedeo

> trying to use the Hetzner gateway without a public IP would fail

I'm sorry @xavierleune, but if we have nodes with only private IPs, they cannot reach the internet (0.0.0.0/0) unless you add
- [ip, route, add, default, via, '$gateway-network', dev, 'private-interface']

Am I wrong?

xavierleune (Contributor, Author)

> [quoting @aru-amedeo's comment above]

You can, if you add your gateway in the routes for 0.0.0.0/0, like in my previous screenshot; that will leverage NAT masquerade.
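
In Terraform terms, that route looks something like this (a sketch: the network reference and the gateway IP are examples; the gateway must be the private IP of your NAT host):

    # Route all egress (0.0.0.0/0) through the NAT host's private IP, so the
    # NAT masquerade there handles internet access for private-only nodes.
    resource "hcloud_network_route" "nat_egress" {
      network_id  = hcloud_network.private.id
      destination = "0.0.0.0/0"
      gateway     = "10.0.0.2" # example: private IP of the NAT gateway host
    }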

lefterisALEX

I think you need both, @xavierleune. If you look at the documentation you shared, in Step 3 (Configuring NAT) and Step 4 (Achieving a persistent configuration), it says that configuration is needed on the client side as well.
For example, for Ubuntu:

[Screenshots from 2025-02-27: client-side configuration steps from the tutorial]

xavierleune (Contributor, Author)

@lefterisALEX OK, I can see that, but from my understanding 10.0.0.1 is already declared because it's the subnet gateway for Hetzner. I think cloud-init already handles that part.

Here is an example for ip route show for a control plane created yesterday:

default via 10.0.0.1 dev eth1 proto dhcp src 10.127.128.101 metric 100
10.0.0.0/8 via 10.0.0.1 dev eth1 proto dhcp src 10.127.128.101 metric 100
10.0.0.1 dev eth1 proto dhcp scope link src 10.127.128.101 metric 100
10.42.0.0/24 dev cni0 proto kernel scope link src 10.42.0.1
10.42.2.0/24 via 10.42.2.0 dev flannel.1 onlink
10.42.3.0/24 via 10.42.3.0 dev flannel.1 onlink
10.42.4.0/24 via 10.42.4.0 dev flannel.1 onlink
10.42.5.0/24 via 10.42.5.0 dev flannel.1 onlink
10.42.6.0/24 via 10.42.6.0 dev flannel.1 onlink
10.42.7.0/24 via 10.42.7.0 dev flannel.1 onlink
10.42.8.0/24 via 10.42.8.0 dev flannel.1 onlink
10.42.9.0/24 via 10.42.9.0 dev flannel.1 onlink
169.254.169.254 via 10.0.0.1 dev eth1 proto dhcp src 10.127.128.101 metric 100

xavierleune (Contributor, Author)

@aru-amedeo I'm curious, did you try this branch? Because as far as my tests go, it's working just fine right now: cloud-init pulls the network config from the Hetzner API, so the default route uses the subnet gateway. It's just like for servers with a public interface, where without a route on 0.0.0.0/0 the main gateway from Hetzner (172.31.1.1) is the default route, without any need to declare it. The current Terraform does not configure the public gateway for servers with public interfaces; neither does it need to for servers with only a private network. Am I missing something here?

amedeopalopoli

> [quoting @xavierleune's comment above]

Hi @xavierleune, as far as I understood, when you create a private network, you should add #1567 (comment). This lets you use the network route when the destination address is part of the internal network, but not when it isn't (e.g. a public IP like 8.8.8.8). This is why I said this PR is not enough to achieve what you aim to do: it's missing the cloud-init piece to configure the default gateway so that public IPs are routable from nodes with no public interface attached.
