pespin has uploaded this change for review. ( https://gerrit.osmocom.org/c/libosmocore/+/41671?usp=email )
Change subject: io_uring: RECVMSG_SENDMSG: Reset fd to blocking after osmo_fd_register() workaround
......................................................................
io_uring: RECVMSG_SENDMSG: Reset fd to blocking after osmo_fd_register() workaround
When the user requests IOFD_FLAG_NOTIFY_CONNECTED through osmo_iofd_notify_connected(), a write_cb(0, NULL) callback is issued to notify the user that the socket is connected.
In mode RECVMSG_SENDMSG, at least for SCTP sockets, we cannot use the write(0) trick to get that notification, according to the comment in iofd_uring_register() (see also OS#5751). As a workaround, we use a temporary osmo_fd to poll() for write status. To do so, we call osmo_fd_register(), which internally sets the fd as non-blocking (O_NONBLOCK). However, this becomes undesired behavior later on, once the connect has happened and io_uring is used to recvmsg/sendmsg, since it could cause the kernel to answer sqes with -EAGAIN cqes instead of keeping (blocking) them internally until the transaction can be resolved. According to public discussions/tickets, this seems to have been a "desired" or "expected" behavior in older kernels, but it seems to have changed over time (see e.g. GH#364 below).
In summary, the conclusion seems to be: avoid mixing O_NONBLOCK with io_uring as much as possible, as you may get totally unexpected or changing behavior.
Hence, avoid keeping O_NONBLOCK for those sockets when moving back from osmo_fd to io_uring.
For related discussion on the expected behavior of io_uring with O_NONBLOCK, see:
https://www.spinics.net/lists/io-uring/msg04058.html
https://github.com/axboe/liburing/issues/364
Change-Id: Idba623730230bc049b827e51b058cd64d23b730f
---
M src/core/osmo_io_uring.c
1 file changed, 21 insertions(+), 1 deletion(-)
git pull ssh://gerrit.osmocom.org:29418/libosmocore refs/changes/71/41671/1
diff --git a/src/core/osmo_io_uring.c b/src/core/osmo_io_uring.c
index cac6272..278b686 100644
--- a/src/core/osmo_io_uring.c
+++ b/src/core/osmo_io_uring.c
@@ -534,6 +534,23 @@
 static void iofd_uring_write_enable(struct osmo_io_fd *iofd);
 static void iofd_uring_read_enable(struct osmo_io_fd *iofd);

+/* make an FD blocking:
+ * osmo_fd_register(ofd) did set fd flag O_NONBLOCK previously. We don't
+ * want to keep the fd as O_NONBLOCK once we start using io_uring,
+ * otherwise we'd end up getting cqes with -EAGAIN; better let the kernel
+ * wait internally for the sqe to complete. */
+static int iofd_reset_fd_blocking(int fd)
+{
+	int flags;
+
+	/* make FD blocking */
+	flags = fcntl(fd, F_GETFL);
+	if (flags < 0)
+		return flags;
+	flags &= ~O_NONBLOCK;
+	flags = fcntl(fd, F_SETFL, flags);
+	return flags;
+}

 /* called via osmocom poll/select main handling once outbound non-blocking connect() completes */
 static int iofd_uring_connected_cb(struct osmo_fd *ofd, unsigned int what)
@@ -547,6 +564,7 @@

 	/* Unregister from poll/select handling. */
 	osmo_fd_unregister(ofd);
+	iofd_reset_fd_blocking(ofd->fd);
 	IOFD_FLAG_UNSET(iofd, IOFD_FLAG_NOTIFY_CONNECTED);

 	/* Notify the application about this via a zero-length write completion call-back. */
@@ -591,7 +609,8 @@
 	 * Use a temporary osmo_fd which we can use to notify us once the connection is established
 	 * or failed (indicated by FD becoming writable). This is needed as (at least for SCTP sockets)
 	 * one cannot submit a zero-length writev/sendmsg in order to get notification when the socket
-	 * is writable.*/
+	 * is writable.
+	 * NOTE: osmo_fd_setup() sets iofd->fd as O_NONBLOCK. */
 	if (IOFD_FLAG_ISSET(iofd, IOFD_FLAG_NOTIFY_CONNECTED)) {
 		osmo_fd_setup(&iofd->u.uring.connect_ofd, iofd->fd, OSMO_FD_WRITE,
 			      iofd_uring_connected_cb, iofd, 0);
@@ -664,6 +683,7 @@

 	if (IOFD_FLAG_ISSET(iofd, IOFD_FLAG_NOTIFY_CONNECTED)) {
 		osmo_fd_unregister(&iofd->u.uring.connect_ofd);
+		iofd_reset_fd_blocking(iofd->u.uring.connect_ofd.fd);
 		IOFD_FLAG_UNSET(iofd, IOFD_FLAG_NOTIFY_CONNECTED);
 	}
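
For anyone who wants to check the flag handling outside of libosmocore, the O_NONBLOCK round trip described in the commit message can be reproduced with plain fcntl() calls. The following is only a minimal sketch: fd_is_nonblocking() and fd_set_nonblocking() are hypothetical helper names that do not exist in libosmocore, and no io_uring or osmo_fd code is involved.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not part of libosmocore): returns 1 if O_NONBLOCK is
 * set on fd, 0 if it is clear, or a negative value if fcntl() fails. */
static int fd_is_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags < 0)
		return flags;
	return (flags & O_NONBLOCK) ? 1 : 0;
}

/* Hypothetical helper (not part of libosmocore): set (enable != 0) or clear
 * (enable == 0) O_NONBLOCK while leaving all other file status flags as they
 * are. Returns 0 on success, or a negative value if fcntl() fails. */
static int fd_set_nonblocking(int fd, int enable)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags < 0)
		return flags;
	if (enable)
		flags |= O_NONBLOCK;
	else
		flags &= ~O_NONBLOCK;
	return fcntl(fd, F_SETFL, flags);
}
```

Calling fd_set_nonblocking(fd, 1) stands in for the side effect that the commit message attributes to osmo_fd_register(), and fd_set_nonblocking(fd, 0) corresponds to what the patch does after osmo_fd_unregister(). Querying fd_is_nonblocking() before and after each step shows whether the flag was actually restored.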