Separates EnvoyExtensionPolicy from Ext Proc (kubernetes-sigs#200)

danehans · kfswain · commit 942c0c09a963 · 2025-01-27T15:55:00.000Z
* Separates EnvoyExtensionPolicy from Ext Proc

Signed-off-by: Daneyon Hansen <daneyon.hansen@solo.io>

* Removes duplicative EnvoyExtensionPolicy from manifests

Signed-off-by: Daneyon Hansen <daneyon.hansen@solo.io>

---------

Signed-off-by: Daneyon Hansen <daneyon.hansen@solo.io>
diff --git a/pkg/README.md b/pkg/README.md
@@ -38,7 +38,6 @@
    ```
    Additionally, if you would like to enable the admin interface, you can uncomment the admin lines and run this again.
 
-
 1. **Deploy Gateway**
 
    ```bash
@@ -52,17 +51,14 @@
 1. **Deploy Ext-Proc**
 
    ```bash
-   kubectl apply -f ./manifests/gateway/ext_proc.yaml
-   kubectl apply -f ./manifests/gateway/patch_policy.yaml
+   kubectl apply -f ./manifests/ext_proc.yaml
    ```
-   > **_NOTE:_** This is also per InferencePool, and will need to be configured to support the new pool should you wish to experiment further.
-
-1. **OPTIONALLY**: Apply Traffic Policy
 
-   For high-traffic benchmarking you can apply this manifest to avoid any defaults that can cause timeouts/errors.
+1. **Deploy Envoy Gateway Custom Policies**
 
    ```bash
-   kubectl apply -f ./manifests/gateway/traffic_policy.yaml
+   kubectl apply -f ./manifests/extension_policy.yaml
+   kubectl apply -f ./manifests/patch_policy.yaml
    ```
 
 1. **Try it out**
diff --git a/pkg/manifests/extension_policy.yaml b/pkg/manifests/extension_policy.yaml
@@ -0,0 +1,31 @@
+apiVersion: gateway.envoyproxy.io/v1alpha1
+kind: EnvoyExtensionPolicy
+metadata:
+  name: ext-proc-policy
+  namespace: default
+spec:
+  extProc:
+    - backendRefs:
+      - group: ""
+        kind: Service
+        name: inference-gateway-ext-proc
+        port: 9002
+      processingMode:
+        request:
+          body: Buffered
+        response:
+      # The timeouts are likely not needed here. We can experiment with removing/tuning them slowly.
+      # The connection limits are more important and will cause the opaque: ext_proc_gRPC_error_14 error in Envoy GW if not configured correctly. 
+      messageTimeout: 1000s
+      backendSettings:
+        circuitBreaker:
+          maxConnections: 40000
+          maxPendingRequests: 40000
+          maxParallelRequests: 40000
+        timeout:
+          tcp:
+            connectTimeout: 24h
+  targetRef:
+    group: gateway.networking.k8s.io
+    kind: HTTPRoute
+    name: llm-route
diff --git a/pkg/manifests/gateway/ext_proc.yaml b/pkg/manifests/gateway/ext_proc.yaml
@@ -103,35 +103,3 @@ spec:
       port: 9002
       targetPort: 9002
   type: ClusterIP
----
-apiVersion: gateway.envoyproxy.io/v1alpha1
-kind: EnvoyExtensionPolicy
-metadata:
-  name: ext-proc-policy
-  namespace: default
-spec:
-  extProc:
-    - backendRefs:
-      - group: ""
-        kind: Service
-        name: inference-gateway-ext-proc
-        port: 9002
-      processingMode:
-        request:
-          body: Buffered
-        response:
-      # The timeouts are likely not needed here. We can experiment with removing/tuning them slowly.
-      # The connection limits are more important and will cause the opaque: ext_proc_gRPC_error_14 error in Envoy GW if not configured correctly. 
-      messageTimeout: 1000s
-      backendSettings:
-        circuitBreaker:
-          maxConnections: 40000
-          maxPendingRequests: 40000
-          maxParallelRequests: 40000 
-        timeout:
-          tcp:
-            connectTimeout: 24h
-  targetRef:
-    group: gateway.networking.k8s.io
-    kind: HTTPRoute
-    name: llm-route