laforge has uploaded this change for review. ( https://gerrit.osmocom.org/c/libosmo-abis/+/36079?usp=email )
Change subject: Fix critical bug in default TCP keepalive user timeout ......................................................................
Fix critical bug in default TCP keepalive user timeout
It turns out that our calculation of the TCP_USER_TIMEOUT value was flawed in several ways:
* there should have been parenthesis around the + operator (line->keepalive_probe_interval + line->keepalive_idle_timeout) as the keepalive_idle_timeout is in seconds, not milli-seconds.
* in the default case, all those values are configured to -1 (E1INP_USE_DEFAULT). This means we're using 1000 * -1 * -1 + -1 = 999 i.e. just below aq second which clearly is not enough for a lossy satellite or wifi back-haul.
This fixes a regression introduced in Ia7659c209aea0d26eb37d31e771adc91b17ae668 i.e. present since libosmo-abis 1.4.0
Change-Id: Idca24d3e676a45d860d9eec60dc2097d8d87f3bf Related: OS#5785, SYS#6801 Fixes: Ia7659c209aea0d26eb37d31e771adc91b17ae668 --- M src/input/ipaccess.c 1 file changed, 41 insertions(+), 16 deletions(-)
git pull ssh://gerrit.osmocom.org:29418/libosmo-abis refs/changes/79/36079/1
diff --git a/src/input/ipaccess.c b/src/input/ipaccess.c index 75d9693..7979b7b 100644 --- a/src/input/ipaccess.c +++ b/src/input/ipaccess.c @@ -588,7 +588,7 @@ static void update_fd_settings(struct e1inp_line *line, int fd) { int ret; - int val; + int val, timeout_val, interval_val, retry_count_val;
if (line->keepalive_num_probes) { /* Enable TCP keepalive to find out if the connection is gone */ @@ -600,31 +600,29 @@ LOGPIL(line, DLINP, LOGL_NOTICE, "TCP Keepalive is enabled\n");
/* The following options are not portable! */ - val = line->keepalive_idle_timeout > 0 ? - line->keepalive_idle_timeout : - DEFAULT_TCP_KEEPALIVE_IDLE_TIMEOUT; - ret = setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &val, sizeof(val)); + timeout_val = line->keepalive_idle_timeout > 0 ? + line->keepalive_idle_timeout : + DEFAULT_TCP_KEEPALIVE_IDLE_TIMEOUT; + ret = setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &timeout_val, sizeof(timeout_val)); if (ret < 0) { LOGPIL(line, DLINP, LOGL_ERROR, "Failed to set TCP keepalive idle time: %s\n", strerror(errno)); } - val = line->keepalive_probe_interval > -1 ? - line->keepalive_probe_interval : - DEFAULT_TCP_KEEPALIVE_INTERVAL; - ret = setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &val, sizeof(val)); + interval_val = line->keepalive_probe_interval > -1 ? + line->keepalive_probe_interval : + DEFAULT_TCP_KEEPALIVE_INTERVAL; + ret = setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_val, sizeof(interval_val)); if (ret < 0) { LOGPIL(line, DLINP, LOGL_ERROR, "Failed to set TCP keepalive interval: %s\n", strerror(errno)); } - val = line->keepalive_num_probes > 0 ? - line->keepalive_num_probes : - DEFAULT_TCP_KEEPALIVE_RETRY_COUNT; - ret = setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &val, sizeof(val)); + retry_count_val = line->keepalive_num_probes > 0 ? + line->keepalive_num_probes : + DEFAULT_TCP_KEEPALIVE_RETRY_COUNT; + ret = setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &retry_count_val, sizeof(retry_count_val)); if (ret < 0) LOGPIL(line, DLINP, LOGL_ERROR, "Failed to set TCP keepalive count: %s\n", strerror(errno)); - val = 1000 * line->keepalive_num_probes * - line->keepalive_probe_interval + - line->keepalive_idle_timeout; + val = 1000 * retry_count_val * (interval_val + timeout_val); ret = setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT, &val, sizeof(val)); if (ret < 0) LOGPIL(line, DLINP, LOGL_ERROR, "Failed to set TCP user timeout: %s\n", strerror(errno));