Commit 942c0c0

danehans authored and kfswain committed
Separates EnvoyExtensionPolicy from Ext Proc (kubernetes-sigs#200)
* Separates EnvoyExtensionPolicy from Ext Proc
  Signed-off-by: Daneyon Hansen <[email protected]>
* Removes duplicative EnvoyExtensionPolicy from manifests
  Signed-off-by: Daneyon Hansen <[email protected]>

---------

Signed-off-by: Daneyon Hansen <[email protected]>
1 parent 94c0ee1 commit 942c0c0

3 files changed, +35 -40 lines changed

pkg/README.md

+4 -8
@@ -38,7 +38,6 @@
 ```
 Additionally, if you would like to enable the admin interface, you can uncomment the admin lines and run this again.

-
 1. **Deploy Gateway**

 ```bash
@@ -52,17 +51,14 @@
 1. **Deploy Ext-Proc**

 ```bash
-kubectl apply -f ./manifests/gateway/ext_proc.yaml
-kubectl apply -f ./manifests/gateway/patch_policy.yaml
+kubectl apply -f ./manifests/ext_proc.yaml
 ```
-> **_NOTE:_** This is also per InferencePool, and will need to be configured to support the new pool should you wish to experiment further.
-
-1. **OPTIONALLY**: Apply Traffic Policy

-For high-traffic benchmarking you can apply this manifest to avoid any defaults that can cause timeouts/errors.
+1. **Deploy Envoy Gateway Custom Policies**

 ```bash
-kubectl apply -f ./manifests/gateway/traffic_policy.yaml
+kubectl apply -f ./manifests/extension_policy.yaml
+kubectl apply -f ./manifests/patch_policy.yaml
 ```

 1. **Try it out**
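
The updated README steps apply the Ext-Proc manifest first and the two Envoy Gateway policies afterwards. A minimal sketch of that order as a script follows. The function name and the `KUBECTL` and `MANIFEST_DIR` variables are assumptions made for illustration and are not part of this commit; only the three manifest file names come from the diff above.

```shell
#!/usr/bin/env bash
# Sketch only: wraps the three `kubectl apply` calls from the updated README.
set -euo pipefail

# Assumption: KUBECTL can be overridden (e.g. KUBECTL="echo kubectl") for a dry run.
KUBECTL="${KUBECTL:-kubectl}"
# Assumption: the script runs from the directory that contains ./manifests.
MANIFEST_DIR="${MANIFEST_DIR:-./manifests}"

# Hypothetical helper: deploy Ext-Proc, then the Envoy Gateway custom policies.
deploy_ext_proc_and_policies() {
  # Step "Deploy Ext-Proc"
  $KUBECTL apply -f "${MANIFEST_DIR}/ext_proc.yaml"
  # Step "Deploy Envoy Gateway Custom Policies"
  $KUBECTL apply -f "${MANIFEST_DIR}/extension_policy.yaml"
  $KUBECTL apply -f "${MANIFEST_DIR}/patch_policy.yaml"
}
```

Calling `deploy_ext_proc_and_policies` with `KUBECTL="echo kubectl"` prints the commands instead of running them, which is a cheap way to check paths before touching a cluster.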

pkg/manifests/extension_policy.yaml

+31
@@ -0,0 +1,31 @@
+apiVersion: gateway.envoyproxy.io/v1alpha1
+kind: EnvoyExtensionPolicy
+metadata:
+  name: ext-proc-policy
+  namespace: default
+spec:
+  extProc:
+    - backendRefs:
+      - group: ""
+        kind: Service
+        name: inference-gateway-ext-proc
+        port: 9002
+      processingMode:
+        request:
+          body: Buffered
+        response:
+      # The timeouts are likely not needed here. We can experiment with removing/tuning them slowly.
+      # The connection limits are more important and will cause the opaque: ext_proc_gRPC_error_14 error in Envoy GW if not configured correctly.
+      messageTimeout: 1000s
+      backendSettings:
+        circuitBreaker:
+          maxConnections: 40000
+          maxPendingRequests: 40000
+          maxParallelRequests: 40000
+        timeout:
+          tcp:
+            connectTimeout: 24h
+  targetRef:
+    group: gateway.networking.k8s.io
+    kind: HTTPRoute
+    name: llm-route
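
Once the policy lives in its own manifest, it can be inspected on its own after it is applied. The sketch below assumes the Envoy Gateway CRDs are installed so that `kubectl get envoyextensionpolicy` resolves. The helper name and the `KUBECTL` override are hypothetical; the default name and namespace are taken from the manifest above.

```shell
#!/usr/bin/env bash
# Sketch only: fetch the EnvoyExtensionPolicy as stored by the API server.
set -euo pipefail

# Assumption: KUBECTL can be overridden (e.g. KUBECTL="echo kubectl") for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: arguments default to the values in extension_policy.yaml.
show_ext_proc_policy() {
  local name="${1:-ext-proc-policy}"
  local namespace="${2:-default}"
  $KUBECTL get envoyextensionpolicy "$name" --namespace "$namespace" -o yaml
}
```

The `status` section of the returned object is where one would look to see whether the policy attached to the `llm-route` HTTPRoute.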

pkg/manifests/gateway/ext_proc.yaml

-32
@@ -103,35 +103,3 @@ spec:
     port: 9002
     targetPort: 9002
   type: ClusterIP
----
-apiVersion: gateway.envoyproxy.io/v1alpha1
-kind: EnvoyExtensionPolicy
-metadata:
-  name: ext-proc-policy
-  namespace: default
-spec:
-  extProc:
-    - backendRefs:
-      - group: ""
-        kind: Service
-        name: inference-gateway-ext-proc
-        port: 9002
-      processingMode:
-        request:
-          body: Buffered
-        response:
-      # The timeouts are likely not needed here. We can experiment with removing/tuning them slowly.
-      # The connection limits are more important and will cause the opaque: ext_proc_gRPC_error_14 error in Envoy GW if not configured correctly.
-      messageTimeout: 1000s
-      backendSettings:
-        circuitBreaker:
-          maxConnections: 40000
-          maxPendingRequests: 40000
-          maxParallelRequests: 40000
-        timeout:
-          tcp:
-            connectTimeout: 24h
-  targetRef:
-    group: gateway.networking.k8s.io
-    kind: HTTPRoute
-    name: llm-route
