Skip to content
This repository has been archived by the owner on Nov 23, 2021. It is now read-only.

DispatchData vs Data for API calls #37

Open
carlbrown opened this issue Aug 26, 2017 · 3 comments
Open

DispatchData vs Data for API calls #37

carlbrown opened this issue Aug 26, 2017 · 3 comments

Comments

@carlbrown
Copy link
Contributor

carlbrown commented Aug 26, 2017

I'm pulling comment from @helje5 from #28 over into this issue, because that issue is getting cluttered.

I don't remember what we decided on here, IMO we should use DispatchData to pass buffers around as this supports iovecs (and AFAIK Data doesn't, right?).

A DispatchData doesn't require copying but supports custom deallocators, that is a key part of its design. (I'm not denying that there might be a better alternative, but DispatchData is much better than Data for this specific purpose (which presumably is the reason why it still exists ;-) )).

For what it's worth, I started with DispatchData in the APIs, and I wanted to keep it that way. The problem is that, at every turn, I found myself having to convert from Data/NSData to DispatchData and back. For example:

  1. Part of this is because BlueSocket (that I really want to replace with something much simpler) explicitly reads/writes all socket data in NSData format.
  2. Part of this is because of control messages. When you want to send "OK\r\n", that string easily converts to a Data, but getting it into a DispatchData is more of a pain in the ass, and when you have several things (like (String($0.count, radix: 16) + "\r\n").data(using: .utf8)) that you are sticking into the same packet, you end up doing that conversion over and over.
  3. Part of it is because of response types. We found that the vast majority of the time, the response body that we wanted to send was either encoded as a String (Because the programmer typed it in to be turned into a response), or encoded as a Data (Because it's the contents of a file pulled off disk or it's the result of a JSONEncoder call). In both of those cases, it was easier not to have to do a bunch of Data<->DispatchData conversions before handing off to http.

So you tell me - What do you think the right answer is here?
[Just in case there's any confusion as to my tone here - I'm not being defensive - I honestly want to know. I'm not thrilled with the compromise I made, and I'd love for there to be an obvious or a consensus choice.]

@helje5
Copy link
Contributor

helje5 commented Aug 26, 2017

I'm pulling comment from @helje5 from #28 over into this issue, because that issue is getting cluttered.

Thanks a lot for that, I should have done this in the first place.

For what it's worth, I started with DispatchData in the APIs, and I wanted to keep it that way. The problem is that, at every turn, I found myself having to convert from Data/NSData to DispatchData and back.

Yes, it is unfortunate that Data simply doesn't offer the facilities of DispatchData (NSData does/did). But my understanding is that this is explicitly wanted/desired by design. Otherwise I would say, lets just use Data and propose additions to it to make it work for Server (it is not really server, but scalable memory handling in general).

If the goal of this effort is not to produce a very fast or scalable server (and I think we can put a checkmark on this) but one which is fits in best and is very easy to use, I'd say lets just stick to Data and live with the copying. It hurts, but maybe that really is the right choice here :-)

  1. Part of this is because BlueSocket (that I really want to replace with something much simpler) explicitly reads/writes all socket data in NSData format.

That should be a non-reason. I think very few people care what BlueSocket does. Just drop it and use something else. It is very easy to build a simple HTTP/1.x server using just DispatchIO and the HTTP parser which is already in. Maybe Johannes can just contribute the Apple one?

The harder part may be the TLS support, I'm not sure what the status of that is wrt async APIs. Do we have something here? (I was suggested a DispatchIO compatible TLS API, do we have sth like that already?).

  1. Part of this is because of control messages. When you want to send "OK\r\n", that string easily converts to a Data

Oh no, please don't do this! Just keep them as static Data or DispatchData objects. A String is a pretty hefty object, don't just create it for below-ASCII stuff like that :-)

  1. Part of it is because of response types.

That goes along my comment wrt the goal of the project. Yes, if scalability is of no concern, just return Strings and Datas.

Summary of my opinion: If this thing is going to be something like Python httplib, something which is easy to approach, fits in well with Foundation as it is now, but isn't actually used in servers w/ actual traffic - let design it for easy of use, not scalability. Or in short: just Data in public API ;-)

P.S.: It is trivial to just add a String.data(using:) -> DispatchData? if you really want that ;-) That should be more of a part of Dispatch/Foundation though (specifically because that could easily add non-copying behaviour to it).

@carlbrown
Copy link
Contributor Author

That should be a non-reason. I think very few people care what BlueSocket does. Just drop it and use something else.

That's my plan - I just haven't had time for that, yet.

Part of this is because of control messages. When you want to send "OK\r\n", that string easily converts to a Data
Oh no, please don't do this! Just keep them as static Data or DispatchData objects. A String is a pretty hefty object, don't just create it for below-ASCII stuff like that :-)

I know. But pretty much every tutorial I've ever seen for any Swift webserver starts with "Hello, world\r\n".data(using: .utf8) or something similar, so I think it's inevitable that people will do that frequently.

P.S.: It is trivial to just add a String.data(using:) -> DispatchData? if you really want that ;-) That should be more of a part of Dispatch/Foundation though (specifically because that could easily add non-copying behaviour to it).

What I don't know is: what's the overhead of that? I need to measure it and see, I just haven't had time. If it turns out that the conversion doesn't slow anything down noticeably, then we can just exclusively use DispatchData internally, and give people methods that they can call with Data and we can convert on the fly. (And I'm worried about using the "non-copying" behavior to make a DispatchData out of a Data that some user randomly handed me for fear that it's going to go out of scope and be deallocated by ARC and the server is going to crash by trying to use a zombie).

@helje5
Copy link
Contributor

helje5 commented Aug 26, 2017

P.S.: It is trivial to just add a String.data(using:) -> DispatchData? if you really want that ;-)

What I don't know is: what's the overhead of that? I need to measure it.

You don't need to measure it. W/o Foundation support the overhead is significant (full malloc/memcpy). W/ Foundation support it would probably be zero (ARC).
I'm no expert in the Swift Foundation implementation, but the way I 'imagine' it Data can be backed by a malloced buffer or by NSData, which is a class cluster (and funny enough this probably includes DispatchData as one concrete implementation at least on Darwin ...).

If you implement it in plain Swift, you would need to malloc and copy the data as neither String nor Data give you APIs to the underlying storage.

In case you expect a lot of string to data conversions, Data should be a lot better for many setups because it likely does have direct Foundation support (I think there was another PR by @seabaylea where we discussed this). Consider String.data(using:). If the encoding matches the underlying String storage, no copying needs to happen. The Data object returned can/could just retain the string and be a view into that String buffer.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants