
optionally disable clusterIP for Service Type=LoadBalancer #3623


Closed
xiaonancc77 wants to merge 1 commit

Conversation


@xiaonancc77 xiaonancc77 commented Oct 18, 2022

  • One-line PR description:
    This is useful for VIP-based implementations of Service Type=LoadBalancer where the ClusterIP is not needed. The largest motivation for this feature is that the number of load balancers is limited by the number of available ClusterIPs (a hypothetical API sketch follows below).
  • Other comments:
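
The KEP text itself is not quoted in this thread, so the following is only a hedged sketch of what such an opt-out could look like, modeled on the existing allocateLoadBalancerNodePorts field; the AllocateClusterIP name is purely illustrative and not necessarily what the PR proposes.

```go
// Hypothetical sketch only: the field name and shape are illustrative and are
// not taken from the KEP text, which is not quoted in this thread. It mirrors
// the existing allocateLoadBalancerNodePorts opt-out on ServiceSpec.
package v1

type ServiceSpec struct {
	// ... existing ServiceSpec fields elided ...

	// AllocateClusterIP (hypothetical) would control whether a ClusterIP is
	// allocated for a Service of Type=LoadBalancer. It would default to true
	// to preserve current behavior and could only be set to false for
	// LoadBalancer Services backed by a VIP implementation.
	// +optional
	AllocateClusterIP *bool `json:"allocateClusterIP,omitempty"`
}
```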

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Oct 18, 2022
@k8s-ci-robot
Contributor

Welcome @xiaonancc77!

It looks like this is your first PR to kubernetes/enhancements 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/enhancements has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot
Contributor

Hi @xiaonancc77. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory labels Oct 18, 2022
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: xiaonancc77
Once this PR has been reviewed and has the lgtm label, please assign dcbw for approval by writing /assign @dcbw in a comment. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested a review from dcbw October 18, 2022 07:11
@k8s-ci-robot k8s-ci-robot added sig/network Categorizes an issue or PR as relevant to SIG Network. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 18, 2022
* the number of load balancers is now limited to the number of available ClusterIPs.
* ClusterIPs are allocated for an LB even though they are not used.

Clusters that have integrations for Service Type=LoadBalancer but don't require a ClusterIP should have the option to disable ClusterIP allocation.
Member


How will the other Pods in the cluster be able to reach this Service?
I think this will break service discovery; for example, DNS resolves the Service name and returns the ClusterIP as an A record.
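
To make that concern concrete, here is a minimal sketch of the in-cluster discovery path it refers to, assuming the conventional cluster.local zone and a made-up Service my-svc in namespace my-ns; run from inside a Pod, the A record that comes back is the Service's ClusterIP.

```go
// Minimal illustration; only meaningful when run inside a Pod that uses the
// cluster DNS, and the Service/namespace names are made up.
package main

import (
	"fmt"
	"net"
)

func main() {
	// The cluster DNS answers this name with the Service's ClusterIP as an
	// A record, which is the discovery path that would break if no
	// ClusterIP were allocated.
	addrs, err := net.LookupHost("my-svc.my-ns.svc.cluster.local")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println("resolved to:", addrs)
}
```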

Author


  1. In some business scenarios, certain Services do not need to be reached by Pods in the cluster and are only accessed by requests from outside the cluster. In those cases a large number of unneeded ClusterIPs are allocated.

  2. In some production environments, access from within the cluster is also expected to go through the cloud provider load balancer (the Service external IP) instead of ipvs/iptables, because the cloud provider load balancer offers session persistence, health checks, traffic monitoring, and more.

  3. DNS: if access from within the cluster is needed, the DNS server can be configured with the k8s_external plugin so that it returns the external IP for the Service name (see the sketch after this list).
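
As a hedged illustration of point 3, assuming CoreDNS with the k8s_external plugin serving a made-up zone example.org (the Corefile fragment in the comments below should be checked against the CoreDNS docs), an in-cluster client could then resolve the Service by its external name instead of its ClusterIP:

```go
// Hedged illustration of the k8s_external idea; the zone example.org and the
// Service/namespace names are made up.
//
// Assumed CoreDNS Corefile fragment (illustrative only):
//
//	.:53 {
//	    kubernetes cluster.local
//	    k8s_external example.org
//	}
package main

import (
	"fmt"
	"net"
)

func main() {
	// With k8s_external configured, <service>.<namespace>.<zone> resolves to
	// the Service's external (load balancer) IPs rather than its ClusterIP.
	addrs, err := net.LookupHost("my-svc.my-ns.example.org")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println("resolved to:", addrs)
}
```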

Member


those are a lot of exceptions to justify changing a default behavior 😅

Author

@xiaonancc77 xiaonancc77 Oct 20, 2022


However, there is a real need for LB Services that are never accessed by Pods in the cluster, and in that case allocating a ClusterIP is really unnecessary.

@aojea
Member

aojea commented Oct 19, 2022

the number of load balancers is limited by the number of available ClusterIPs.

This is the part I find hardest to understand. ClusterIPs come from the Service CIDR, which is usually a private network and can be as large as a /12; load balancer IPs, however, have to be public, and those are much more expensive and scarce.

@xiaonancc77 xiaonancc77 reopened this Oct 20, 2022
@xiaonancc77
Author

the number of load balancers is limited by the number of available ClusterIPs.

This is the part I find hardest to understand. ClusterIPs come from the Service CIDR, which is usually a private network and can be as large as a /12; load balancer IPs, however, have to be public, and those are much more expensive and scarce.

First of all, this is true in the private cloud scenario.

However, in public cloud scenarios such as Tencent Cloud and Alibaba Cloud, the network segments of all k8s clusters' container networks within a subnet cannot overlap; in other words, the available network segments determine the number and size of clusters that can be purchased.
This constraint causes difficulties in our production environment: if we allocate a large service IP range to each cluster but do not actually need ClusterIPs in those clusters, the addresses are wasted, while a smaller service IP range is not enough for us to create Services.

In addition, public cloud vendors divide load balancer VIPs into public IPs and internal IPs. The internal IP of the load balancer is associated with the external IP of the k8s Service, which can also satisfy access from outside the cluster. At the same time, the internal IP, like the container IPs, belongs to the private network and is not as scarce.

Whether on a private or a public cloud, there really are Services bound to a load balancer that never need to be reached by Pods in the cluster.
In our production environment this problem has reached a scale that limits both the size and the number of our clusters.

@aojea
Member

aojea commented Oct 20, 2022

if we allocate a large service IP range to each cluster but do not actually need ClusterIPs in those clusters, the addresses are wasted

but the service IP range can overlap between clusters, and there are current limitations on the number of Services supported at scale; with a /20 you'll have the possibility to use 4093 ClusterIPs ... just curious, why don't you use the same Service CIDR for all the clusters?
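
For reference, one way to arrive at roughly that figure, assuming the allocator reserves the network and broadcast addresses and that one ClusterIP always goes to the default kubernetes Service:

```go
// Back-of-the-envelope arithmetic only; exact reservations depend on the
// allocator implementation.
package main

import "fmt"

func main() {
	prefixLen := 20
	total := 1 << (32 - prefixLen) // 4096 addresses in a /20
	reserved := 2                  // network and broadcast addresses (assumed)
	defaultSvc := 1                // ClusterIP of the default kubernetes Service
	fmt.Println(total - reserved - defaultSvc) // prints 4093
}
```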

@xiaonancc77
Author

xiaonancc77 commented Oct 20, 2022

With many public cloud providers, multiple clusters in the same subnet are not allowed to use overlapping network segments, including service IP ranges.

Even if all the clusters could use the same service IP range: if it is set too large, it crowds out container network IP resources; if it is set too small, it cannot be expanded when it runs out (many public cloud providers currently do not allow the service IP range to be modified after the cluster is created).

And when a ClusterIP is not required, there is really no need for mandatory allocation; this could be exposed as an option.


@xiaonancc77
Author

In the case where a ClusterIP is not required, there is really no need for mandatory allocation; this could be exposed as an option.

@xiaonancc77
Author

@aojea @caseydavenport @dcbw

@aojea
Member

aojea commented Oct 25, 2022

/assign @danwinship @thockin @khenidak

@thockin
Member

thockin commented Oct 31, 2022

I appreciate the KEP, but I don't think we want to do this. We did the equivalent for nodePort under duress, but nodePorts are ACTUALLY a limited resource. As @aojea says, you CAN reuse the same service CIDR on many clusters (in most implementations) and those CAN be very large. This new exception would have to propagate pretty far and wide - every discovery mechanism, including but not limited to DNS, would need to be aware that this previously required field could now be empty.

The risk of this change is quite large. I'll leave it open for a bit to collect feedback, but I'm -1 on this proposal.
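
For reference, the nodePort precedent mentioned above is, as far as I understand it, the allocateLoadBalancerNodePorts field on ServiceSpec; a minimal sketch of how that existing opt-out is used today (the Service name and port are made up):

```go
// Sketch of the existing opt-out referred to above; only the Service name
// and port are invented, the field itself is part of core/v1.
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	allocate := false
	svc := corev1.Service{
		ObjectMeta: metav1.ObjectMeta{Name: "example-lb"},
		Spec: corev1.ServiceSpec{
			Type:  corev1.ServiceTypeLoadBalancer,
			Ports: []corev1.ServicePort{{Port: 80}},
			// Existing opt-out: skip node port allocation for this LB.
			AllocateLoadBalancerNodePorts: &allocate,
		},
	}
	fmt.Println(svc.Name, *svc.Spec.AllocateLoadBalancerNodePorts)
}
```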

@robscott
Member

robscott commented Nov 3, 2022

@xiaonancc77 With the upcoming work on Gateway API for L4 load balancing, I don't think we'll have any expectation of also provisioning a ClusterIP; maybe it would be more straightforward to invest in that going forward instead of a new KEP on Service?

@khenidak
Contributor

khenidak commented Nov 3, 2022

The motivation

* the number of load balancers is now limited to the number of available ClusterIPs.
* ClusterIPs are allocated for an LB even though they are not used.

is centered around preserving ClusterIPs. This is especially painful to deal with if the cluster was set up with a small ClusterIP CIDR. I think we have a KEP to allow multiples of these CIDRs, which should also solve the problem this KEP is trying to solve, right?

@xiaonancc77
Author

xiaonancc77 commented Nov 4, 2022

@robscott Do you have a link to the relevant documentation, please? Thanks~~

@xiaonancc77
Author

@khenidak Thank you for your reply. You are right that being able to configure multiple service CIDRs would solve some of our pain points, but not all.

In a private network, the container CIDR, service CIDR, and LoadBalancer CIDR cannot overlap.
In the current model, a LoadBalancer Service consumes a service IP and a LoadBalancer IP at the same time.
Our business requires a large number of LoadBalancer Services in a private network, which quickly consumes its IP resources (half of those IPs are unnecessary).
This therefore limits the number and scale of clusters we can create in the same private network.

@shaneutt
Member

/cc @shaneutt

@MikeZappa87

"the number of loadbalancers is limited by the number of available clusterIP." <---- What is your service cluster cidr? It would be good to understand why you are running out of clusterIP's

@shaneutt
Member

shaneutt commented Dec 22, 2022

@robscott Do you have a link to the relevant documentation, please?

The project is https://gateway-api.sigs.k8s.io/ and the repository is https://github.com/kubernetes-sigs/gateway-api.

One of the long term goals of this project is to enable the Gateway resource as an alternative to Service type=LoadBalancer, kubernetes-sigs/gateway-api#223 is a potentially side-relevant issue, but I don't think we currently have a completely formal statement of this intent in our issues. This goal would appear to align with your goals, but we would need to start formalizing that intent further and pulling together the interested parties.

We would love to have you join one of the upcoming Gateway API community meetings and talk to us more about your use case and needs (note that they won't be starting back up until January 9th due to the holidays). Please feel free to put something on our agenda for an upcoming meeting, or if you prefer async we have a discussions board and we're in #sig-network-gateway-api on Kubernetes Slack.

"the number of loadbalancers is limited by the number of available clusterIP." <---- What is your service cluster cidr? It would be good to understand why you are running out of clusterIP's

I'm also very curious about the impetus for this change. Can you please help us to better understand:

  1. how you got into this resource depletion problem; what kind of numbers are we talking about in terms of LoadBalancers?
  2. are new clusters being deployed with this problem, or is this specific to older long-running clusters? If so, why?
  3. what is the friction with re-deploying clusters onto larger IP pools?

@thockin
Member

thockin commented Dec 22, 2022

In almost all implementations, the service clusterIP range is virtual - it never hits the wire. That said, you don't want it to overlap with real IPs that you use elsewhere in your network. It can be any range that you can afford to consume - link-local or RFC-1918 or CGNAT or class E or even public IPs you know you will never use. IPv6 should be even easier. You can also use the same range in every cluster.

Adding more API to economize on these has a real cost for maintenance and testing, and the cost is forever. I don't think we want to do this proposal this way - Gateway is our way out of the "stacked" model of Services.

/close
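
To make the "any range you can afford to consume, as long as it doesn't overlap real IPs" advice concrete, a small sketch of an overlap check; the CIDR values are examples, not recommendations:

```go
// Overlap check between a candidate (virtual) service CIDR and ranges that
// are actually routed in the network; the CIDR values are illustrative.
package main

import (
	"fmt"
	"net"
)

// cidrsOverlap reports whether two CIDRs share any addresses: that is the
// case exactly when one network contains the other's base address.
func cidrsOverlap(a, b *net.IPNet) bool {
	return a.Contains(b.IP) || b.Contains(a.IP)
}

func main() {
	_, serviceCIDR, _ := net.ParseCIDR("100.64.0.0/16") // CGNAT space, never on the wire
	_, podCIDR, _ := net.ParseCIDR("10.128.0.0/14")     // a real, routed range
	fmt.Println("overlaps pod range:", cidrsOverlap(serviceCIDR, podCIDR))
}
```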

@k8s-ci-robot
Contributor

@thockin: Closed this PR.

In response to this:

In almost all implementations, the service clusterIP range is virtual - it never hits the wire. That said, you don't want it to overlap with real IPs that you use elsewhere in your network. It can be any range that you can afford to consume - link-local or RFC-1918 or CGNAT or class E or even public IPs you know you will never use. IPv6 should be even easier. You can also use the same range in every cluster.

Adding more API to economize on these has a real cost for maintenance and testing, and the cost is forever. I don't think we want to do this proposal this way - Gateway is our way out of the "stacked" model of Services.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. sig/network Categorizes an issue or PR as relevant to SIG Network. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
9 participants