Proposal: Extend protocol to allow clients to send metadata for backend proxy calls #26

Open
shalom-aviv opened this issue Dec 30, 2024 · 4 comments

Comments

@shalom-aviv

Problem:

We use Centrifugo to transmit data to our backend server via RPC calls and channel publications. However, we have a need to transmit telemetry for specific calls, which the current implementation does not support.

Proposal:

One solution we see is to extend the Centrifugo protocol to allow the transmission of metadata (key-value pairs) alongside data in RPC and Publish calls. This would enable specifying which keys should be processed and passed into headers during proxying of calls.

Solution Options:

  • Allow users to pass any string key-value pairs into metadata. However, this may lead to misuse.

  • Take a more localized approach, considering that telemetry might be needed not only by us but also by other Centrifugo users. Centrifugo could implement a metadata transmission mechanism, but only allow clients to send data that it supports, as specified in all client SDKs.

@FZambia
Member

FZambia commented Jan 7, 2025

Hello @shalom-aviv

You mean that it would be nice for the client-side PublishRequest and RPCRequest to have:

map<string, string> meta;

Right?

Generally makes sense, though we already have bytes data for SubscribeRequest and ConnectRequest, which serves exactly this role: an additional custom payload for proxied calls. So it makes me think we can be more consistent and add bytes meta (or bytes metadata) instead of map<string, string> meta to PublishRequest and RPCRequest. Ideally I'd like it to be called meta/metadata instead of data for ConnectRequest and SubscribeRequest too, and it may be renamed in the long term (we can't use bytes data for PublishRequest and RPCRequest because that name is already used in those).

Will it work for you?

@shalom-aviv
Author

Yes, meta works well as a solution. However, I believe that using a map<string, string> is a more universal option. Ideally, it would be even better to support a map<string, typed_value> where the values have explicit types, which could add more flexibility and safety.

Regarding your suggestion to rename the field to meta or metadata for ConnectRequest and SubscribeRequest, I think it’s a great idea. It would make the API more consistent and intuitive.

Additionally, I think it’s worth considering how this metadata mechanism would work for web and mobile clients. Providing a unified way to handle metadata across platforms would greatly simplify integration and ensure transparency. For example, ensuring that the metadata passed by clients is forwarded as-is to the backend could help maintain a clear and consistent flow of information.

For our project, it is necessary for telemetry data sent by the client through the Centrifuge SDK (in RPC or Publish calls) to be automatically added to HTTP headers during Centrifugo’s proxy calls to our backend. This would allow the built-in telemetry mechanism to automatically link client requests, Centrifugo, and the backend, creating a complete picture.

The client sends telemetry identifiers to Centrifugo and Grafana, then Centrifugo includes them in the proxy call headers, and the backend retrieves these headers into its environment, ensuring end-to-end tracing.

sequenceDiagram
    participant User
    participant Client as Centrifuge Client
    participant Server as Centrifuge Server
    participant Backend as Backend REST API

    User ->> Client: Sends PublishRequest or RPCRequest with meta field
    Client ->> Server: Transmits request with meta field
    Server ->> Backend: Proxy request<br>Centrifuge Server checks meta from client<br>Adds meta as HTTP headers

    Backend -->> Server: Responds with data
    Server -->> Client: Returns data
    Client -->> User: Displays response
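The "check meta, then add as HTTP headers" step in the diagram could be sketched in Go roughly as follows. This is a minimal illustration of the proposal, not Centrifugo's actual code: the `allowedMetaHeaders` table, the `applyMeta` function, the backend URL, and the choice of the W3C `traceparent`/`tracestate` keys are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"net/http"
)

// allowedMetaHeaders is a hypothetical allowlist that maps client meta keys
// to the HTTP header names used on the proxy request.
var allowedMetaHeaders = map[string]string{
	"traceparent": "Traceparent",
	"tracestate":  "Tracestate",
}

// applyMeta copies allowlisted meta entries into the outgoing proxy headers
// and drops every other key.
func applyMeta(h http.Header, meta map[string]string) {
	for key, value := range meta {
		if name, ok := allowedMetaHeaders[key]; ok {
			h.Set(name, value)
		}
	}
}

func main() {
	// The URL is a placeholder for the backend proxy endpoint.
	req, err := http.NewRequest(http.MethodPost, "http://backend.example/centrifugo/rpc", nil)
	if err != nil {
		panic(err)
	}
	applyMeta(req.Header, map[string]string{
		"traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
		"cookie":      "session=abc", // not allowlisted, so it is not forwarded
	})
	fmt.Println("Traceparent:", req.Header.Get("Traceparent"))
	fmt.Println("Cookie forwarded:", req.Header.Get("Cookie") != "")
}
```

Keeping the mapping in server configuration rather than in client code corresponds to the "only allow clients to send data that it supports" option from the original proposal.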

@FZambia
Member

FZambia commented Jan 12, 2025

> However, I believe that using a map<string, string> is a more universal option. Ideally, it would be even better to support a map<string, typed_value> where the values have explicit types, which could add more flexibility and safety.

Making metadata bytes is a compromise here: it makes it possible to use any structure at the application level and encode it to bytes, while for the Protobuf protocol we need to use primitive types.
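To illustrate the bytes compromise, here is a small Go sketch in which the application defines its own typed structure and serializes it before placing it in a bytes field. `TelemetryMeta` and its fields are hypothetical, and JSON is only one possible encoding; nothing here is part of the Centrifugo protocol.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TelemetryMeta is a hypothetical application-level structure. Because the
// protocol would only see opaque bytes, the application is free to use typed
// values such as booleans.
type TelemetryMeta struct {
	TraceID string `json:"trace_id"`
	SpanID  string `json:"span_id"`
	Sampled bool   `json:"sampled"`
}

// encodeMeta turns the structure into the bytes a client would send.
func encodeMeta(m TelemetryMeta) ([]byte, error) {
	return json.Marshal(m)
}

// decodeMeta restores the structure on the receiving side.
func decodeMeta(raw []byte) (TelemetryMeta, error) {
	var m TelemetryMeta
	err := json.Unmarshal(raw, &m)
	return m, err
}

func main() {
	raw, err := encodeMeta(TelemetryMeta{TraceID: "4bf92f35", SpanID: "00f067aa", Sampled: true})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(raw))

	decoded, err := decodeMeta(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", decoded)
}
```

The trade-off is that both ends must agree on the encoding themselves, whereas a map<string, string> would be readable by Centrifugo without application knowledge.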

> For our project, it is necessary for telemetry data sent by the client through the Centrifuge SDK (in RPC or Publish calls) to be automatically added to HTTP headers during Centrifugo’s proxy calls to our backend. This would allow the built-in telemetry mechanism to automatically link client requests, Centrifugo, and the backend, creating a complete picture.

Do you need this metadata to be transformed into headers? In the current approach it's delivered to the backend as part of the proxy request body JSON (or as part of the Protobuf message in the gRPC proxy case). So it will be much more straightforward for Centrifugo to continue passing it as a metadata field in the proxy payload.

BTW, could you provide more details about exact keys/values to be sent? Just curious because I potentially may find more ideas for the protocol in it.

@shalom-aviv
Author

Hello,

I’d like to clarify my position and provide additional context regarding the proposed solution.

Context and Needs:

In our project, we encountered a need to transmit telemetry data tied to user actions (e.g., RPC or Publish calls). The idea is to ensure that this telemetry information is passed through every step, including Centrifugo proxy requests.

The reason we want to use HTTP headers is that our telemetry system automatically extracts the necessary information from headers. This eliminates the need to write additional code for parsing or processing data on the backend. This approach reduces the burden on developers and integrates seamlessly with the existing telemetry infrastructure.

I understand your suggestion to use bytes for metadata. This provides flexibility, allowing any structures to be serialized. However, modern services and tools increasingly rely on headers for such cases. This enables out-of-the-box integration and minimizes manual work, which might seem like “magic” but significantly simplifies developers’ lives.

Proposal Review:

After our discussion, I’ve critically reconsidered my initial proposal and believe it may carry certain risks:

  • Misuse of the mechanism: Developers could pass arbitrary data into headers, potentially leading to unforeseen consequences or conflicts in the backend environment.
  • Security and predictability issues: If client code blindly sets data under certain keys, it could disrupt the operation of other systems or create maintenance challenges.
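The two risks above suggest that, if arbitrary keys were accepted, some guardrails would be needed before anything is turned into a header. The following Go sketch shows one possible set of checks; the limits, the reserved-header list, and the `validateMeta` function are assumptions for illustration and are not taken from Centrifugo.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical limits; real values would depend on the deployment.
const (
	maxMetaEntries  = 16
	maxMetaValueLen = 256
)

// reservedHeaders lists header names that client meta must never override.
var reservedHeaders = map[string]bool{
	"authorization":  true,
	"cookie":         true,
	"host":           true,
	"content-length": true,
}

// validateMeta rejects meta that could disturb the backend environment:
// too many entries, empty or reserved keys, line breaks that would allow
// header injection, and oversized values.
func validateMeta(meta map[string]string) error {
	if len(meta) > maxMetaEntries {
		return fmt.Errorf("too many meta entries: %d", len(meta))
	}
	for key, value := range meta {
		if key == "" {
			return fmt.Errorf("empty meta key")
		}
		if reservedHeaders[strings.ToLower(key)] {
			return fmt.Errorf("meta key %q is reserved", key)
		}
		if strings.ContainsAny(key+value, "\r\n") {
			return fmt.Errorf("meta entry %q contains a line break", key)
		}
		if len(value) > maxMetaValueLen {
			return fmt.Errorf("meta value for %q is too long", key)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateMeta(map[string]string{"traceparent": "00-abc-def-01"}))
	fmt.Println(validateMeta(map[string]string{"Cookie": "session=abc"}))
	fmt.Println(validateMeta(map[string]string{"x-note": "a\r\nX-Injected: 1"}))
}
```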

Current Position:

The solution you proposed (using bytes for metadata) is technically valid, but it doesn’t fully meet our needs as it would require additional development on our side. However, it’s possible that we may eventually adopt it.
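As a rough idea of what that additional development could look like, the Go sketch below is a backend middleware that reads a `meta` object from the proxy request body and copies selected keys into the request headers before the header-based telemetry code runs. It assumes the proxy body carries `meta` as a JSON object of strings, which is not something the thread confirms; `liftedKeys`, `liftMeta`, and `metaToHeaders` are hypothetical names.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// liftedKeys are the meta keys copied into headers (an assumption).
var liftedKeys = []string{"traceparent", "tracestate"}

// liftMeta extracts the hypothetical "meta" object from a proxy request body
// and sets the selected keys as headers. Bodies that are not valid JSON are
// ignored.
func liftMeta(body []byte, h http.Header) {
	var payload struct {
		Meta map[string]string `json:"meta"`
	}
	if err := json.Unmarshal(body, &payload); err != nil {
		return
	}
	for _, key := range liftedKeys {
		if value := payload.Meta[key]; value != "" {
			h.Set(key, value)
		}
	}
}

// metaToHeaders wraps a handler so that header-based telemetry sees the
// values that arrived in the request body.
func metaToHeaders(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		body, err := io.ReadAll(r.Body)
		if err != nil {
			http.Error(w, "cannot read body", http.StatusBadRequest)
			return
		}
		// Restore the body so the wrapped handler can read it again.
		r.Body = io.NopCloser(bytes.NewReader(body))
		liftMeta(body, r.Header)
		next.ServeHTTP(w, r)
	})
}

func main() {
	handler := metaToHeaders(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, r.Header.Get("Traceparent"))
	}))
	body := `{"method":"getProfile","meta":{"traceparent":"00-abc-def-01"}}`
	req := httptest.NewRequest(http.MethodPost, "/centrifugo/rpc", bytes.NewReader([]byte(body)))
	rec := httptest.NewRecorder()
	handler.ServeHTTP(rec, req)
	fmt.Println(rec.Body.String())
}
```

This keeps the mapping under the backend's control, at the cost of buffering each proxy request body once.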

That said, I’m not insisting on my original proposal. My main goal was to initiate a discussion and explore potential improvements. If you have any further alternative ideas to address this case, I’d be happy to hear them. Otherwise, I believe this discussion has already helped me rethink the approach and view the problem more critically.

Thank you again for your time and feedback!
