| 1 | +# Improving configurability of Kafka listeners |
| 2 | + |
| 3 | +## Current situation |
| 4 | + |
| 5 | +Kafka listeners are configured in the `Kafka` custom resource in `.spec.kafka.listeners`. |
| 6 | +Apart from the non-configurable internal replication listener, users can configure only 3 listeners: |
| 7 | +* `plain` listener available within the Kubernetes cluster only. |
| 8 | +The only configuration options are authentication and network policy rules. |
| 9 | +* `tls` listener available within the Kubernetes cluster only. |
| 10 | +The only configuration options are authentication, network policy rules and custom certificates. |
| 11 | +* `external` listener available for use outside of the Kubernetes cluster. |
| 12 | +This listener has 4 different types (`nodeport`, `loadbalancer`, `ingress` and `route`). |
| 13 | +Some of these types have extensive configuration options including overriding advertised hosts etc. |
| 14 | + |
| 15 | +The listeners have fixed ports: |
| 16 | +* 9092 for the `plain` listener |
| 17 | +* 9093 for the `tls` listener |
| 18 | +* 9094 for the `external` listener |
| 19 | + |
| 20 | +For example, the configuration in the `Kafka` custom resource can look like this: |
| 21 | + |
| 22 | +```yaml |
| 23 | +    listeners: |
| 24 | +      plain: |
| 25 | +        authentication: |
| 26 | +          type: scram-sha-512 |
| 27 | +        networkPolicyPeers: |
| 28 | +          - podSelector: |
| 29 | +              matchLabels: |
| 30 | +                app: kafka-plaintext-consumer |
| 31 | +          - podSelector: |
| 32 | +              matchLabels: |
| 33 | +                app: kafka-plaintext-producer |
| 34 | +      tls: |
| 35 | +        authentication: |
| 36 | +          type: tls |
| 37 | +        networkPolicyPeers: |
| 38 | +          - podSelector: |
| 39 | +              matchLabels: |
| 40 | +                app: kafka-consumer |
| 41 | +          - podSelector: |
| 42 | +              matchLabels: |
| 43 | +                app: kafka-producer |
| 44 | +      external: |
| 45 | +        type: route |
| 46 | +        authentication: |
| 47 | +          type: tls |
| 48 | +``` |
| 49 | + |
| 50 | +In its simplest form, when no configuration options are used and the listeners are just enabled, it might look like this: |
| 51 | + |
| 52 | +```yaml |
| 53 | +    listeners: |
| 54 | +      plain: {} |
| 55 | +      tls: {} |
| 56 | +``` |
| 57 | + |
| 58 | +In the operator code, each listener currently has its own code path. |
| 59 | +This works, but leaves room for improvement: the listeners have many similarities, and processing them all with the same code would be simpler. |
| 60 | +The current implementation makes that hard because each listener has its own class in the `api` module, with its own configuration options and settings. |
| 61 | + |
| 62 | +## Motivation for change |
| 63 | + |
| 64 | +The current implementation has many limitations: |
| 65 | +* Only 3 listeners can be configured |
| 66 | +* 2 out of the 3 listeners have a fixed role in terms of encryption, so if you want to use only encrypted listeners, for example, you effectively have only 2 listeners available. |
| 67 | +* The `plain` and `tls` listeners do not offer additional configuration options such as overriding advertised hosts. |
| 68 | +* Only one listener can be used for access from outside of the Kubernetes cluster. |
| 69 | +* Since each Kafka listener can have only a single authentication mechanism, the possibilities for offering different authentication to different clients are limited. |
| 70 | +* Port numbers are fixed and are not configurable. |
| 71 | + |
| 72 | +Some of the use cases raised by our users in the past are listed here: |
| 73 | + |
| 74 | +### Multiple external listeners |
| 75 | + |
| 76 | +In some situations, having multiple external listeners could be very useful. |
| 77 | +Apart from the obvious cases, such as multiple authentication mechanisms or listeners with and without encryption, multiple listeners can also be useful for handling access from different networks outside the Kubernetes cluster. |
| 78 | + |
| 79 | +For example, in AWS you can have two types of load balancers: public and internal. |
| 80 | +Public load balancers are exposed to the Internet and can be used by applications running anywhere online. |
| 81 | +Internal load balancers, on the other hand, are exposed only within the VPC (Virtual Private Cloud) and are accessible only to applications running outside of Kubernetes but inside the user's private network. |
| 82 | + |
| 83 | +### Configurability of internal listeners |
| 84 | + |
| 85 | +Some users heavily customize their Kubernetes network. |
| 86 | +One example is joining the Kubernetes network with a network outside of Kubernetes. |
| 87 | +In such cases, a common problem is whether to use the cluster service DNS suffix (`.cluster.local` by default), which is needed in some cases and not in others. |
| 88 | +Being able to override the advertised hostnames on internal listeners, similarly to how this can be done on the external listener, or being able to have more than one internal listener, could help in these situations. |
| 89 | + |
| 90 | +## Proposed changes |
| 91 | + |
| 92 | +This proposal aims to make the configuration of the listeners more flexible. |
| 93 | +It should use an array (list) instead of an object with predefined listeners. |
| 94 | +It should also make more configuration options available to all listeners. |
| 95 | + |
| 96 | +_Note: The replication listener should still be kept non-configurable and hidden from users._ |
| 97 | + |
| 98 | +The new configuration can look like this: |
| 99 | + |
| 100 | +```yaml |
| 101 | +    listeners: |
| 102 | +      - name: encrypted1 |
| 103 | +        port: 9092 |
| 104 | +        type: service |
| 105 | +        tls: true |
| 106 | +        authentication: |
| 107 | +          type: tls |
| 108 | +        overrides: |
| 109 | +          brokers: |
| 110 | +            - broker: 0 |
| 111 | +              advertisedHost: my-cluster-brokers-kafka-0.myns.cluster.local |
| 112 | +            - broker: 1 |
| 113 | +              advertisedHost: my-cluster-brokers-kafka-1.myns.cluster.local |
| 114 | +            - broker: 2 |
| 115 | +              advertisedHost: my-cluster-brokers-kafka-2.myns.cluster.local |
| 116 | +      - name: encrypted2 |
| 117 | +        port: 9093 |
| 118 | +        type: service |
| 119 | +        tls: true |
| 120 | +        authentication: |
| 121 | +          type: scram-sha-512 |
| 122 | +      - name: routes |
| 123 | +        port: 9094 |
| 124 | +        type: route |
| 125 | +        tls: true |
| 126 | +        authentication: |
| 127 | +          type: tls |
| 128 | +      - name: nodeports |
| 129 | +        port: 9095 |
| 130 | +        type: nodeport |
| 131 | +        tls: true |
| 132 | +        authentication: |
| 133 | +          type: tls |
| 134 | +``` |
| 135 | + |
| 136 | +The listeners will be configured as an array. |
| 137 | +Each listener will have its own unique name and will specify a type and a unique port as required values. |
| 138 | +The port can be set to anything apart from `9091` (internal replication listener) and `9404` (Prometheus). |
| 139 | +A new type, `service`, will be introduced for internal listeners designed for applications running inside the same Kubernetes cluster. |
| 140 | +Additionally, all the types of the existing external listener will be supported as well. |
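
The rules above (required name and type, unique names, unique ports, and the two reserved ports) could be checked roughly as in the following Java sketch. The `ListenerValidator` and `Listener` classes are hypothetical stand-ins invented for illustration; they are not existing operator or `api` module classes, and the error messages are arbitrary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not part of the operator): checks the rules described in the text.
class ListenerValidator {
    // Ports the proposal excludes: 9091 (internal replication listener) and 9404 (Prometheus).
    static final Set<Integer> RESERVED_PORTS = new HashSet<>(Arrays.asList(9091, 9404));

    /** Minimal stand-in for one entry of the proposed `listeners` array. */
    static final class Listener {
        final String name;
        final int port;
        final String type;

        Listener(String name, int port, String type) {
            this.name = name;
            this.port = port;
            this.type = type;
        }
    }

    /** Returns human-readable errors; an empty list means the array passed all checks. */
    static List<String> validate(List<Listener> listeners) {
        List<String> errors = new ArrayList<>();
        Set<String> names = new HashSet<>();
        Set<Integer> ports = new HashSet<>();

        for (Listener l : listeners) {
            // Name is required and must be unique.
            if (l.name == null || l.name.isEmpty()) {
                errors.add("listener name is required");
            } else if (!names.add(l.name)) {
                errors.add("duplicate listener name: " + l.name);
            }
            // Type is required.
            if (l.type == null || l.type.isEmpty()) {
                errors.add("listener type is required for listener: " + l.name);
            }
            // Port must not be reserved and must be unique.
            if (RESERVED_PORTS.contains(l.port)) {
                errors.add("port " + l.port + " is reserved (listener: " + l.name + ")");
            } else if (!ports.add(l.port)) {
                errors.add("duplicate listener port: " + l.port);
            }
        }
        return errors;
    }
}
```

Collecting all errors instead of failing on the first one is just one possible design choice; it lets a user fix every problem in the custom resource in a single pass.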
| 141 | + |
| 142 | +Together with this change, I suggest changing the default value of the listener `tls` flag from `true` to `false`. |
| 143 | +The current behavior, where TLS is enabled by default when the `tls` field is not set and has to be disabled explicitly with `tls: false`, seems unintuitive and confusing. |
| 144 | + |
| 145 | +### Backwards compatibility |
| 146 | + |
| 147 | +The old format can be easily converted into the new format without any information loss. |
| 148 | +The example YAML below corresponds to the first example from the _Current situation_ section: |
| 149 | + |
| 150 | +```yaml |
| 151 | +    listeners: |
| 152 | +      - name: plain |
| 153 | +        port: 9092 |
| 154 | +        type: service |
| 155 | +        tls: false |
| 156 | +        authentication: |
| 157 | +          type: scram-sha-512 |
| 158 | +        networkPolicyPeers: |
| 159 | +          - podSelector: |
| 160 | +              matchLabels: |
| 161 | +                app: kafka-plaintext-consumer |
| 162 | +          - podSelector: |
| 163 | +              matchLabels: |
| 164 | +                app: kafka-plaintext-producer |
| 165 | +      - name: tls |
| 166 | +        port: 9093 |
| 167 | +        type: service |
| 168 | +        tls: true |
| 169 | +        authentication: |
| 170 | +          type: tls |
| 171 | +        networkPolicyPeers: |
| 172 | +          - podSelector: |
| 173 | +              matchLabels: |
| 174 | +                app: kafka-consumer |
| 175 | +          - podSelector: |
| 176 | +              matchLabels: |
| 177 | +                app: kafka-producer |
| 178 | +      - name: external |
| 179 | +        port: 9094 |
| 180 | +        type: route |
| 181 | +        tls: true |
| 182 | +        authentication: |
| 183 | +          type: tls |
| 184 | +``` |
| 185 | + |
| 186 | +The second example could be converted like this: |
| 187 | + |
| 188 | +```yaml |
| 189 | +    listeners: |
| 190 | +      - name: plain |
| 191 | +        port: 9092 |
| 192 | +        type: service |
|  | +        tls: false |
| 193 | +      - name: tls |
| 194 | +        port: 9093 |
| 195 | +        type: service |
|  | +        tls: true |
| 196 | +``` |
| 197 | + |
| 198 | +Being able to easily convert the old structure into the new one makes it possible to have a single code path in the operator while still supporting the old format. |
| 199 | + |
| 200 | +### Code changes |
| 201 | + |
| 202 | +Since both the new and the old structure will use the same `listeners` property, the CRD generator will need to be enhanced to support the OpenAPI `oneOf` for different types. |
| 203 | +(The CRD generator already supports `oneOf` for allowing only one of multiple fields to be defined, but not for a single field with multiple different types.) |
| 204 | + |
| 205 | +```yaml |
| 206 | +listeners: |
| 207 | +  oneOf: |
| 208 | +    - type: object |
| 209 | +      properties: |
| 210 | +        # ... |
| 211 | +    - type: array |
| 212 | +      items: |
| 213 | +        # ... |
| 214 | +``` |
| 215 | + |
| 216 | +In the `api` module, the setter and getter for the `listeners` field will either use the generic `JsonNode` type and decide inside the `api` module whether to decode it into the new or the old format when the getter or setter is called, or they will use a custom `JsonSerializer` to (de)serialize the values. |
| 217 | +The `api` module will keep both the old and the new format so that it can also serialize back to the original CRD. |
| 218 | +When the old format is used, it will be converted in the `fromCrd` method of the `KafkaCluster` class in the `cluster-operator` module; from there on, only the new format will be used to configure the pods, services, etc. |
| 219 | +Since all listeners will be part of the same array, it should be possible to simplify the code configuring the listeners and just pass all of them through a single loop. |
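
The conversion from the old object into the new array could look roughly like the Java sketch below. It uses the fixed ports and default roles listed in the _Current situation_ section. The `ListenerConverter`, `OldListener` and `NewListener` classes are hypothetical stand-ins invented for illustration, not the real `api` module types, and authentication and network policy settings are left out for brevity since they would simply be copied over.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: illustrative stand-ins, not the real `api` module classes.
class ListenerConverter {
    /** Old-style listener entry; `type` and `tls` are only meaningful for `external`. */
    static final class OldListener {
        final String type;  // e.g. "route"; null for the plain and tls listeners
        final Boolean tls;  // null means the field was not set

        OldListener(String type, Boolean tls) {
            this.type = type;
            this.tls = tls;
        }
    }

    /** One entry of the proposed `listeners` array. */
    static final class NewListener {
        final String name;
        final int port;
        final String type;
        final boolean tls;

        NewListener(String name, int port, String type, boolean tls) {
            this.name = name;
            this.port = port;
            this.type = type;
            this.tls = tls;
        }
    }

    /** Converts the old object (any of the three entries may be null, meaning disabled). */
    static List<NewListener> convert(OldListener plain, OldListener tls, OldListener external) {
        List<NewListener> result = new ArrayList<>();
        if (plain != null) {
            result.add(new NewListener("plain", 9092, "service", false));
        }
        if (tls != null) {
            result.add(new NewListener("tls", 9093, "service", true));
        }
        if (external != null) {
            // In the old format, TLS is enabled unless `tls: false` is set explicitly.
            boolean externalTls = external.tls == null || external.tls;
            result.add(new NewListener("external", 9094, external.type, externalTls));
        }
        return result;
    }
}
```

Writing the `tls` value explicitly during conversion keeps the result independent of whatever default the new format ends up using.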
| 220 | + |
| 221 | +### Multiple external listeners |
| 222 | + |
| 223 | +Having multiple external listeners requires multiple sets of external services. |
| 224 | +For backwards compatibility, when the name of the external listener is `external`, the current names of the services, routes and ingresses will be used. |
| 225 | +For any other external listener, a separate set of services will be created and named after the listener. |
| 226 | + |
| 227 | +_Note: separate services are needed to create different external listeners with different configurations._ |
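
A minimal Java sketch of this naming rule follows. The proposal does not spell out the exact name patterns, so both patterns below are assumptions chosen for illustration, and `ExternalServiceNames` is a hypothetical helper rather than an existing operator class.

```java
// Hypothetical helper: both name patterns are assumptions made for illustration only.
class ExternalServiceNames {
    /** Name of the per-broker service for the given listener and broker ID. */
    static String brokerServiceName(String cluster, String listener, int broker) {
        if ("external".equals(listener)) {
            // Backwards compatibility: keep the name used today (pattern assumed here).
            return cluster + "-kafka-" + broker;
        }
        // Any other listener gets its own set of services named after the listener.
        return cluster + "-kafka-" + listener + "-" + broker;
    }
}
```

With these assumed patterns, a listener called `nodeports` in a cluster called `my-cluster` would get services such as `my-cluster-kafka-nodeports-0`, while the `external` listener would keep `my-cluster-kafka-0`.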
| 228 | + |
| 229 | +## Rejected alternatives |
| 230 | + |
| 231 | +### Extending the current API |
| 232 | + |
| 233 | +Some of the problems described in this proposal could also be handled just by extending the existing object. |
| 234 | +But that would give less flexibility than the proposal described above and would keep the code harder to maintain. |
| 235 | + |
| 236 | +### Using a different field for new format |
| 237 | + |
| 238 | +The new listener configuration could use a different field, for example `endpoints`. |
| 239 | +That would simplify the CRD generator and the `api` module, which would not need to handle multiple types under the same property. |
| 240 | +But it seems less elegant, and the ability to handle multiple different types might be useful in other places as well. |