-
Notifications
You must be signed in to change notification settings - Fork 385
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MSC3079: Low Bandwidth CS API #3079
base: old_master
Are you sure you want to change the base?
Conversation
which low bandwidth features are supported on this server. This object looks like: | ||
``` | ||
"m.low_bandwidth": { | ||
"dtls": 8008, // advertise that this server runs a UDP listener for DTLS here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"dtls": 8008, // advertise that this server runs a UDP listener for DTLS here. | |
"dtls_port": 8008, // advertise that this server runs a UDP listener for DTLS here. |
Also, couldn't it be possible here to specify a host+port for the endpoint seperately?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm debating if that information should live in .well-known
as well... Not sure.
"cbor_enum_version": 1, // which table is used for integer keys. This proposal is version 1. Omission indicates no support. | ||
"coap_enum_version": 1, // which table is used for coap path enums. This proposal is version 1. Omission indicates no support. | ||
"observe": 1, // Set to 1 if /sync supports OBSERVE. Omission indicates no support for OBSERVE. | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add enum shorthands for these fields as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's tricky, because if the client doesn't support the enums then they can't read the fields to determine the versions of things they may support e.g CoAP path enums. So probably not, though that would be a good reason to make these keys shorter!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the enum is backwards compatible and these values are defined in the first version, it should work I think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The client could cache the response of this for longer time, if low-bandwidth is enabled and it isn't getting errors back from the server; so trying to save every single byte on the version response doesn't make much sense to soru
The stickiness does alleviate the message overhead, but the message would need to have a different number then because it's not safe-to-forward any more; its least significant bits should probably be 0b11 (see RFC7252 Section 5.4; also setting the Critical bit as a message with this option number couldn't be successfully processed by a server that doesn't know the option). The reviewers might ask you to pick a number a bit whole lot higher than 256, as anything below probably 560 is a bit more precious because applications with known used options can make a bit more efficient use of these options. (Exemplary rationale). As for alternatives (which I'd not rule out yet, and also because this is a topic that comes up often enough that it shoud find a place in the CoAP FAQ), I've kicked off a thread at the CoRE list. Just to be sure I'm not missing an obvious option, would it be an option to get a CWT / JWT rather than a bearer token? (That could possibly be fed right into the DTLS or EDHOC setup; it'd break the gateway deployment of CoMatrix but that could be fixed elsewhere, and that'd even be advantageous because it'd curb MITM'ing even when the attacker can get a valid certificate as home server). |
|
||
## Unstable prefix | ||
|
||
The `/versions` response should use `org.matrix.msc3079.low_bandwidth` instead of `m.low_bandwidth` whilst the proposal is in review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about also moving that dict into unstable_features
?
|
||
## Security considerations | ||
|
||
DTLS connections should have the same security guarantees as TLS connections. Weak ciphers should not be used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is extremely vague, some examples or links to articles which give a better opinion on this would be appreciated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recently came across the official IANA TLS Parameters register, which provides a section of cipher suites, which are then annotated with "DTLS Readiness" and "Recommendation", maybe referencing that, and/or providing one or two "must support" ciphers, could work.
Personally, for those "must support" ciphers, i'll recommend;
TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
But i'll also mention TLS_ECDHE_ECDSA_WITH_AES_128_CCM
for it's ease of implementation and performance on embedded devices. Note however that it comes with a decreased security guarantee due to the use of CCM, which has been regarded as weak.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recommended cipher suites change over time which is why I'm not naming any specific ones. I don't mind indirecting to say "if it's on this list then you should probably support it" but I'm wary about mandating any one cipher suite in this document as that is really not the place of this MSC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe not mandating, but giving implementation suggestions to start off with for interoperability, such as those specific cipher suites.
Though, maybe this is going too much into detail, I think there's not a lot of DTLS implementations out there to really have this matter all that much, if one could potentially know all of them at once at this point.
|
||
### HTTP -> CoAP | ||
|
||
CoAP is defined in [RFC 7252: Constrained Application Protocol](https://tools.ietf.org/html/rfc7252) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CoAP defines an NSTART
, a max-amount-of-packets-in-flight parameter, as 1.
This would effectively limit lb to a packet rate of 1 / RTT
, which may make lb not very useful for environments which have a wide long link (think geostationary satellite links, or even moon-to-earth)
Would there be any consideration for a different NSTART? or even a dynamic one? Should this spec allow clients and servers to experiment with this?
As an example, taking a block-wise transfer, with a SZX (size) of 1024
bytes, and an RTT of 300ms, would get me a theoretical transmission rate of 3.5KBps.
Related issue: ruma/lb#6
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My proposal would be that the client dictates this variable dynamically in the session, with an option at 258
with a uint
value, which, when received, changes the NSTART
at the server level. This option is Elective (as it is an even number), and thus ignorable by the server.
This would allow the client to detect and adjust the rate at its own pace, knowing its own dynamic network conditions (which are likely to be more variable towards the client than the server, realistically).
A value of 0
could indicate "just send any packet you have in the buffer", this would be useful for already-ordered-and-reliable connections such as websocket, TCP, or others, where the underlying transport already has this covered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some extra context on this; A IETF Draft from 2013 provides some context and additional information around this problem: draft-bormann-core-congestion-control-02 section 5.2, which mentions the then-draft for block-wise transfer, but also RFC 5348, this document contains some very sophisticated methods of keeping congestion under control, so it may not be practical to spec.
Though, i'll experiment on my own, and try to see if there's a simpler formula able to be applied here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So congestion control for CoAP is a problem we're investigating as there are a few problems with CoAP for Matrix as-is:
- NSTART as you mention (which would actually prevent Matrix clients working correctly if they were long-polling /sync) as then you would have 2 in-flight conns when you POST/PUT something. This isn't the case for OBSERVE though.
- Blockwise transfer for large request/response bodies. The fact that you don't have any equivalent concept of a TCP receive window means it takes a long time to send lots of data as you have to ACK each packet before the next packet is sent.
These problems aren't entirely unique to low bandwidth Matrix, so there are papers which try to address this such as:
- CoAP Simple Congestion Control/Advanced (CoCoA)
- pCoCoA: A precise congestion control algorithm for CoAP
- Many others which don't have catchy names.
A lot of this research is ongoing and recent, so at present this MSC is silent on which congestion control algorithm to use. When a good algorithm (for Matrix use cases) is proven, and there is sufficient library support, then we can probably mandate one.
the option ID 256 as `Authorization` in accordance with the [CoAP Option Registry](https://tools.ietf.org/html/rfc7252#section-12.2) | ||
which mandates that options IDs 0-255 are reserved for IETF review. This Option should have the same Critical, | ||
UnSafe, NoCacheKey, Format and Length values as the `Uri-Query` Option. The `Authorization` value SHOULD omit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is incorrect and not compliant with the RFC, which define that only odd numbers can be Critical
, as Uri-Query
is;
An Option is identified by an option number, which also provides some additional semantics information, e.g., odd numbers indicate a critical option, while even numbers indicate an elective option.
Note that this is not just a convention, it is a feature of the protocol: Whether an option is elective or critical is entirely determined by whether its option number is even or odd.
https://datatracker.ietf.org/doc/html/rfc7252#section-5.4.6
Thus, this should probably be 257
instead.
value for clients to detect a dead connection. This can be shortened by modifying DTLS to send `CloseNotify` alerts when | ||
the UDP socket receives unrecognised data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has a few security implications, problems, and nuances, as I note over at ShadowJonathan/DusTLS#7.
There are two issues here, both linked to UDP address spoofing (which is very easy to do);
First, an agent could send garbage data, modifying their own source address, to effectively get an amplification attack out of the server, if it continually sends CloseNotify
's to the address it believes is responsible for the garbage data. This can be somewhat mitigated by having a timeout on retransmission of the CloseNotify
, but this could make matrix servers (even if so slightly) a vector for DDoS amplification.
Second, this would make MITM resetting connections extremely easy, as a server needs to have a "security context" (active encryption on the connection) to be able to send alerts (of which CloseNotify
is one of) securely, this would effectively allow an unsecured CloseNotify
packet to control a connection, allowing DoS, combined with the ease of UDP address spoofing, this would be a major vector. This could maybe be mitigated by smart application-level handling of closed connections, and a timeout on how quickly plaintext CloseNotifys can control a connection. The connection itself could be transiently not interrupted by immediately sending a TLS Resume to the server, though then that would become a requirement for any DTLS implementation.
While both could be mitigated, I want to urge that these effectively turn the underlying DTLS layer into an attack vector, so these should really be also noted in the Security Considerations section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You make some good points. I don't really see an alternative here though. When the server gets restarted, the client doesn't know this. The client sends encrypted packets and then gets no response. Is that because of network issues or is that because the server has no idea what you're talking about anymore? When does the client decide to send a TLS Resume? Waiting for the timeout takes a very long time, which degrades the user experience significantly.
In addition, OBSERVE subscriptions are lost when the connection is interrupted in this way (as the subscriptions are all in-memory) so it becomes difficult to know if a resubscription is required or not. I don't have any good solutions here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is okay to opt-in to a "have DTLS implementations close connections upon receiving a CloseNotify
from the server", but the tradeoffs should be weighed, so i'm basically only asking for this security angle to be repeated in the security considerations sector, with some warnings to implementations to not allow someone to spam CloseNotify
s without any realistic interval in between, then i believe this has a proper tradeoff with the problem.
Is this abandoned? I remember it being nearly ready a few years back, yet everything related seems to be stale. |
@witchent According to blog post:
Since then Synapse is not much faster, and Dendrite is still in development, so there's still a lot of more important work. I think there are some obstacles for implementing and deploying this spec. For example: E2E messages are encrypted JSON that are encoded as Base64, which sounds not very efficient. |
Realistically I suspect for fields which represent "bytes" (e.g any base64 unpadded fields) they would be represented literally in CBOR. |
@kegsay That's still full-size JSON being encrypted. Anyway I think splitting this MCS (like matrix-org/matrix-spec#1736) would be beneficial as it gives time to evaluate better options. |
Any status update on this work? We are still very interested in a low bitrate solution for use over HF radio scenarios, typically between proxies in remote locations and the nearest city with Internet. |
@rafael2k Currently there's no funding for such work, so it was put on hold. You can use proxy for meanwhile. Oh, also amateur radio should be legally unencrypted anyway, while Matrix is used with E2EE 99% of time. |
Hi @Saiv46. We do not use amateur radio bands (I never mentioned it). Where should I read about the proxy setup? We are looking for funding for this development. |
Hello, @rafael2k! The implementation was linked above: https://github.com/matrix-org/lb |
Thanks @Saiv46 ! |
Rendered
Implementation: https://github.com/matrix-org/lb