[PATCH 8/9] Fixed (nanoBTS) delay problems, if RTP stream jitters too much

historical

On 30.01.2014 16:43, Peter Stuge wrote:
> Jacob Erlbeck wrote:
>>> +/* add sec,usec to tv */
>>> +static void tv_add(struct timeval *tv, int sec, int usec)
>>> +{
>>> +
>>> +	while (usec < 0) {
>>> +		usec += USEC_1S;
>>> +		sec--;
>>> +	}
>>> +	tv->tv_sec += sec;
>>> +	tv->tv_usec += usec;
>>> +	while (tv->tv_usec >= USEC_1S) {
>>> +		tv->tv_sec++;
>>> +		tv->tv_usec -= USEC_1S;
>>> +	}
>>> +}
>>
>> I'm not sure whether it is a good idea to use while loops in this case
>> since CPU usage is O(N) of the usec value.
> 
> Remember that N>1 only with very small or very large usec values,
> which I guess will be a corner case since it wasn't handled before?

I think the question here is, do I want have a generic function that
could be used in other places, too? If yes, I strongly advice to take
into account that bad things could happen here (e.g. due to bugs
delivering broken values for usec), eventually causing
very-hard-to-diagnose latency problems.

If you first check for out-of-range usecs, hint it with something like
UNLIKELY() and do the div/mod in that case only. That's probably
slightly faster than the while approach if most of the usecs are within
0..999999 and has still an upper bound.

Or are we looking at the few applications already part of the patch,
which would not have any advantage of allowing out of range usecs.

> 
> Is there a machine where div/mod is actually more efficient than a
> small number of iterations around a loop? :)

That depends on 'small'. But sheer performance (throughput) is not
always the most important, sometimes it's predictability, latency, and
maintainability. And throughput at the cost of the others should be
considered very carefully!

>> Wouldn't it be more convenient to have a function tv_add_us(tv, usec)
>> instead that does the div/mod stuff to a temporary timeval and then
>> just calls timeradd()?

Let's assume I had an implementation like (no it's not C)

void tv_add_us(tv, usec) {
    struct timeval foo;
    foo.tv_sec = usec / 1000000;
    foo.tv_usec = usec % 1000000;

    timeradd(tv, &foo, tv);
}

Looking at the applications of the function:

tv_add(&rs->transmit.last_tv, 0, USEC_20MS);
tv_add(&rs->transmit.last_tv, sample_diff_excess / SAMPLES_1S,
		(sample_diff_excess % SAMPLES_1S) * USEC_SAMPLE);

Which I then would write as

tv_add_us(&rs->transmit.last_tv, USEC_20MS);
tv_add_us(&rs->transmit.last_tv, sample_diff_excess * USEC_SAMPLE);

And thanks to inlining the div/mod would vanish completely in the first
case and would be without extra cost in the second. So no performance
penalty but much better readability.

> Is this a hot path? Then I think it makes sense to optimize harder
> for performance rather than for readability. But compilers are good
> at that too..

I'd rather go for readability by default until profiling tells me
otherwise. Some well-intentioned hand optimizations lead sometimes to
worse performance if the prevent the compiler from doing better
optimizations. So on some platforms a branch taken can be more expensive
than a div/mod.

And did you understand the second application of tv_add() at first
sight? It is correct but that is not obvious IMO.

Jacob