Commit
perf(rpc): io_uring integration & redesign (#477)
* RPC server: io_uring upgrade

  Separates the server into two parts: the context and the work pool. The context contains everything generally needed to run the server; the work pool contains a statically polymorphic implementation for a pool to dispatch the actual work to. In doing this, we also separate certain things out into a few different files. The RPC server context API has been modified slightly to reflect this, and the work pool is directly exposed, for now.

* Don't use file-as-struct
* Run style script, respect line length limit
* Improve accept failure handling & update TODOs
  * Handle potential failure/cancellation of `accept_multishot` by re-queueing it, based on the `IORING_CQE_F_MORE` flag.
  * Revise/simplify the queueing logic for the `accept_multishot` SQE.
  * Resolve the EINTR TODO panics, returning a catch-all error value indicating it as a bad but non-critical error.
  * Update the `a: ?noreturn` `if (a) |*b|` TODO, adding that it's solved in 0.14; it should be resolved after we update to 0.14.
  * Unify EAGAIN panic message.
* Add TODO to remove hacky-ish workaround
* Use `self: Type` convention
* A few minor fixups and improvements
* Simplify test, make server socket nonblocking

  On macOS, with the basic WorkPool, this means we now need to manually set the accepted socket's flags to the right values, i.e., blocking, as opposed to the server socket's nonblocking mode. It also means we have to handle EAGAIN a bit differently in the io_uring backend, but that's a fine tradeoff.

* server.zig -> server/lib.zig
* Segregate out the basic backend

  And re-organize some methods based on that change

* Re-organize server module
* Update LOGGER_SCOPE
* Simplify & improve io_uring backend error handling
* De-scope `accept_flags`
* Simplify `can_use` for Linux cross-compilation
* (io_uring) Rework error handling, add timeout

  Do not exit for *any* errors that are specific to the related connection; simply free them and continue to the next CQE.

  Specifically in the case of `error.SubmissionQueueFull`, instead of immediately failing, we first try to flush the submission queue and then try again to submit. If it fails a second time, that means that despite flushing the submission queue it somehow still failed, so we panic, since this indicates something is *very* wrong.

  This also eliminates the `pending_cqes_buf`, since there is actually no situation in which `consumeOurCqe` returns an error and we resume work afterwards: either we process all the received CQEs, or we hard exit. This was already essentially the case before; now it's more obvious.

  For the main submit, we now wait for at least 1 connection, but we also add a timeout SQE to make it terminate if we don't receive a connection or completion of another task for 1 second; this alleviates the busy loop that was running before.

* (io_uring) Remove multishot_accept_submitted

  Also slightly refactor error sets. Now, instead of checking to see if we need to set a flag to re-queue the multishot accept, we just pass in the server context on init and queue it, which now makes sense since the context and work pool are separate.

* (io_uring) Simplify new entry creation

  Also add fix for rebase

* Misc fixups
* Re-organize alias/import
* General restructure
  * Move more specific functions to the only files in which they're used.
  * Move the `serve*` functions outside of `Context`, making them free functions which just accept the context and work pool.
  * Remove `acceptAndServeConnection`; originally this was required to be able to nicely structure the unit test, and used to be more integrated, however it no longer makes sense as a concept.
  * Inline `handleRequest` into the basic backend.
  * Make the `acceptHandled` function, moved into the basic backend, guarantee the specified `sync` behavior, and inline `have_accept4`.
  * Appropriately re-export the relevant parts of the server API.
  * Added top level doc comments.
* Re-organize loggers & scopes
* Refactor `build_options` imports
* Add `no_network_tests` build option

  And disable the rpc server test when it is enabled

* Update circleci with `-Dno-network-tests`
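
The flush-then-retry handling of `error.SubmissionQueueFull` described in the message can be illustrated with a small simulation. This is only a sketch of the control flow, not the commit's Zig code: `FakeRing`, `queue_with_retry`, and the `RuntimeError` standing in for the panic are all hypothetical names introduced here, and no real io_uring call is made.

```python
class SubmissionQueueFull(Exception):
    """Raised when the simulated submission queue has no free slot."""


class FakeRing:
    """Hypothetical stand-in for an io_uring instance with a bounded SQ."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pending = []    # SQEs queued but not yet submitted
        self.submitted = []  # SQEs already handed to the "kernel"

    def get_sqe(self, op):
        if len(self.pending) >= self.capacity:
            raise SubmissionQueueFull()
        self.pending.append(op)

    def submit(self):
        """Flush all pending SQEs and return how many were submitted."""
        count = len(self.pending)
        self.submitted.extend(self.pending)
        self.pending.clear()
        return count


def queue_with_retry(ring, op):
    """Queue `op`; if the queue is full, flush once and retry exactly once."""
    try:
        ring.get_sqe(op)
        return
    except SubmissionQueueFull:
        ring.submit()  # flushing is expected to free every slot
    try:
        ring.get_sqe(op)
    except SubmissionQueueFull:
        # Still full right after a flush: treated as unrecoverable
        # (stand-in for the panic mentioned in the commit message).
        raise RuntimeError("submission queue full even after flushing")


ring = FakeRing(capacity=2)
for op in ["accept", "recv", "send"]:
    queue_with_retry(ring, op)
print(ring.submitted, ring.pending)  # ['accept', 'recv'] ['send']
```

The third operation finds the queue full, triggers one flush, and then fits; a second failure would mean the flush did not free any slot, which is why it is escalated rather than retried again.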
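
The re-queueing of `accept_multishot` based on `IORING_CQE_F_MORE`, as described in the commit message, can be sketched in the same simulated style. The flag value matches the Linux UAPI header; everything else (`handle_accept_cqe`, the callbacks, the sample CQE values) is a hypothetical illustration, and the real backend's handling of negative results is more detailed than shown here.

```python
IORING_CQE_F_MORE = 1 << 1  # same value as in <linux/io_uring.h>


def handle_accept_cqe(res, flags, requeue_accept, on_connection):
    """Process one simulated completion of a multishot accept."""
    if res >= 0:
        on_connection(res)  # a non-negative res is the accepted socket's fd
    # Without F_MORE the kernel posts no further completions for this SQE,
    # whether this one succeeded or failed, so the accept is queued again.
    if not (flags & IORING_CQE_F_MORE):
        requeue_accept()


accepted, requeues = [], []
cqes = [(7, IORING_CQE_F_MORE), (8, 0), (-4, 0)]  # (res, flags) pairs
for res, flags in cqes:
    handle_accept_cqe(res, flags, lambda: requeues.append(1), accepted.append)
print(accepted, len(requeues))  # [7, 8] 2
```

Keying the decision on the flag rather than on the result code means a successful final completion and a failed one are handled by the same path.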