lchan->s and integer overflow

List overview All Threads
Download

newer

older

[PATCH] Refactored tests to not...

[PATCH] Modified openbsc tests to...

Holger Hans Peter Freyther

12 Mar 2013 12 Mar '13

9:08 a.m.

Dear Jolly,

from a quick look it appears to be that a very long running connection could overflow lchan->s to a negative number. Could you either make this code robust or explain why it is not needed? E.g. if we assume that this happens once per multiframe the counter will overflow within 1.3 hours?

holger

Show replies by date

jolly

12 Mar 12 Mar

10:13 a.m.

Holger Hans Peter Freyther wrote:

...

Dear Jolly,

from a quick look it appears to be that a very long running connection could overflow lchan->s to a negative number. Could you either make this code robust or explain why it is not needed? E.g. if we assume that this happens once per multiframe the counter will overflow within 1.3 hours?

holger

hi holger,

lchan->s must never raise above btsb->radio_link_timeout. look at the commit:

+ /* count up radio link counter S */ + lchan->s += 2; + if (lchan->s > btsb->radio_link_timeout) + lchan->s = btsb->radio_link_timeout;

if there would be no limit, a loss of link might also take hours until detected.

have i overseen something?

regards,

andreas

Holger Hans Peter Freyther

1:24 p.m.

On Tue, Mar 12, 2013 at 10:13:01AM +0100, jolly wrote:

dear jolly,

...

/* count up radio link counter S */

lchan->s += 2;

if (lchan->s > btsb->radio_link_timeout)

lchan->s = btsb->radio_link_timeout;

that is what I was missing when browsing the code in gtik. I think there is a small glitch the other way around.

a.) We get criteria value == 0 through OML (for whatever reason)

+ btsb->radio_link_timeout = val[1]

b.) lchan_activate sets lchan->s == 0

+ lchan->s = btsb->radio_link_timeout;

c.) lchan->s -= 1

+ /* count down radio link counter S */ + lchan->s--;

d.) if (lchan->s == 0)

Will not be true as the lchan->s is -1. Is it intended that setting a '0' will disable the link failure checking?

kind regards holger

Holger Hans Peter Freyther

13 Mar 13 Mar

8:23 a.m.

On Tue, Mar 12, 2013 at 01:24:44PM +0100, Holger Hans Peter Freyther wrote:

Good Morning,

...

that is what I was missing when browsing the code in gtik. I think there is a small glitch the other way around.

a.) We get criteria value == 0 through OML (for whatever reason)
          btsb->radio_link_timeout = val[1]
b.) lchan_activate sets lchan->s == 0
  lchan->s = btsb->radio_link_timeout;
c.) lchan->s -= 1
                  /* count down radio link counter S */
                  lchan->s--;
d.) if (lchan->s == 0)

Will not be true as the lchan->s is -1. Is it intended that setting a '0' will disable the link failure checking?

the easiest way to avoid the problem of the number to leave the range of 0 to radio_link_timeout is to use the OSMO_MAX and OSMO_MIN (now that these macros are fixed).

Underflow: lchan->s = OSMO_MAX(lchan->s - 1, 0);

Overflow: lchan->s = OSMO_MIN(lchan->s + 2, btsb->radio_link_timeout);

E.g. the following code (I only compiled it):

diff --git a/src/osmo-bts-sysmo/l1_if.c b/src/osmo-bts-sysmo/l1_if.c index df660c5..16375ef 100644 --- a/src/osmo-bts-sysmo/l1_if.c +++ b/src/osmo-bts-sysmo/l1_if.c @@ -684,7 +684,7 @@ static int handle_ph_data_ind(struct femtol1_hdl *fl1, GsmL1_PhDataInd_t *d /* process radio link timeout coniter S */ if (data_ind->msgUnitParam.u8Size == 0) { /* count down radio link counter S */ - lchan->s--; + lchan->s = OSMO_MAX(lchan->s - 1, 0); DEBUGP(DMEAS, "counting down radio link counter S=%d\n", lchan->s); if (lchan->s == 0) @@ -694,9 +694,7 @@ static int handle_ph_data_ind(struct femtol1_hdl *fl1, GsmL1_PhDataInd_t *d } if (lchan->s < btsb->radio_link_timeout) { /* count up radio link counter S */ - lchan->s += 2; - if (lchan->s > btsb->radio_link_timeout) - lchan->s = btsb->radio_link_timeout; + lchan->s = OSMO_MIN(lchan->s + 2, btsb->radio_link_timeout); DEBUGP(DMEAS, "counting up radio link counter S=%d\n", lchan->s); }

Peter Stuge

5:02 p.m.

Holger Hans Peter Freyther wrote:

...

the easiest way to avoid the problem of the number to leave the range of 0 to radio_link_timeout is to use the OSMO_MAX and OSMO_MIN (now that these macros are fixed).

Underflow: lchan->s = OSMO_MAX(lchan->s - 1, 0);

It avoids the problem, but why is it safe?

The mere possibility that the counter can become negative suggests an imbalance somewhere in the code - but I don't know if that's the case, which is why I ask.

//Peter

Holger Hans Peter Freyther

5:49 p.m.

On Wed, Mar 13, 2013 at 05:02:02PM +0100, Peter Stuge wrote:

...

It avoids the problem, but why is it safe?

The mere possibility that the counter can become negative suggests an imbalance somewhere in the code - but I don't know if that's the case, which is why I ask.

Exactly, the point is that the current code is broken. And in general using OSMO_MIN/OSMO_MAX is better than hand rolling code. And if a range is from a to b it is better to clamp it on both sides that just one.

For this specific issue. I wouldn't mind making the value '0' illegal, GSM 12.21 doesn't specify the legal range and as '0' and '1' will behave the same with the fixed code it does make sense to forbid '0'.

Anyway, the current code is broken so Andreas please fix it ASAP.

holger

Alexander Huemer

12 Mar 12 Mar

10:25 a.m.

On Tue, Mar 12, 2013 at 09:08:21AM +0100, Holger Hans Peter Freyther wrote:

...

from a quick look it appears to be that a very long running connection could overflow lchan->s to a negative number. Could you either make this code robust or explain why it is not needed? E.g. if we assume that this happens once per multiframe the counter will overflow within 1.3 hours?

clang lately got a nice feature for integer over/under-flow detection[1] during runtime, but of course it makes the resulting code extremely slow. I am not sure if it is usable in this case.

Kind regards, -Alexander Huemer

[1] http://blog.regehr.org/archives/905

Holger Hans Peter Freyther

1:42 p.m.

On Tue, Mar 12, 2013 at 10:25:09AM +0100, Alexander Huemer wrote:

...

clang lately got a nice feature for integer over/under-flow detection[1] during runtime, but of course it makes the resulting code extremely slow. I am not sure if it is usable in this case.

Ah nice. I started to use clang3.3 over the weekend for -fsanitize=address, didn't know about the integer instrumentation.

thank you for the pointer holger

4661

Age (days ago)

4661

Last active (days ago)

openbsc@lists.osmocom.org

7 comments

4 participants

tags (0)

participants (4)

Alexander Huemer
Holger Hans Peter Freyther
jolly
Peter Stuge