I have a reproducible segfault in librtlsdr on my system, apparently
caused by a race condition.
---
$ uname -r
4.14.67-1.pvops.qubes.x86_64
rtl-sdr and libusb are both current git master.
---
1. run rtl_sdr (which uses librtlsdr) under gdb
$ gdb --args rtl_sdr -f 43M -n 1000 /dev/null
2. place a breakpoint where the cancellation condition is hit,
currently line 1915 of src/librtlsdr.c
1914 if (RTLSDR_CANCELING == dev->async_status) {
1915     next_status = RTLSDR_INACTIVE;
1916
(gdb) break ./src/librtlsdr.c:1915
3. run and continue; segfault after second breakpoint
(gdb) run
(gdb) cont
(gdb) cont
Thread 1 "rtl_sdr" received signal SIGSEGV, Segmentation fault.
0x00007ffff7d76a15 in add_to_flying_list (transfer=0x426590) at ../../libusb/io.c:1396
1396 if (!timerisset(cur_tv) || (cur_tv->tv_sec > timeout->tv_sec) ||
---
The crash appears to happen because the transfers are deallocated
while they are still in flight and are then referenced later by libusb.
I think the pause at the breakpoint gives the transfer time to
complete before it is canceled. The cancel then fails because the
transfer has already completed, and the current code takes that failure
to mean the transfer is not in flight, when in fact its completion has
not been handled yet and it will be resubmitted.
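As a rough illustration of that sequence, here is a minimal model in C.
It is not code from librtlsdr or libusb: every name in it (model_xfer,
model_cancel, teardown_assuming_idle, model_callback_resubmit) is
hypothetical, and "freed" is only a flag, so nothing is really
deallocated. It assumes the behaviour described above, namely that a
cancel request fails once the transfer has completed, even though the
completion callback has not run yet.

```c
#include <stdbool.h>

/* Hypothetical model of one transfer; not libusb or librtlsdr code. */
enum xfer_state {
    XFER_IN_FLIGHT,       /* submitted, hardware still working on it */
    XFER_DONE_UNHANDLED,  /* finished, but its callback has not run yet */
    XFER_IDLE             /* callback ran and did not resubmit */
};

struct model_xfer {
    enum xfer_state state;
    bool freed;           /* stands in for the buffer being deallocated */
};

/* Stand-in for a cancel request: it can only succeed while in flight. */
static bool model_cancel(struct model_xfer *x)
{
    return x->state == XFER_IN_FLIGHT;
}

/* The flawed inference: "cancel failed, so nothing is pending, free it". */
static void teardown_assuming_idle(struct model_xfer *x)
{
    if (!model_cancel(x))
        x->freed = true;
}

/* Stand-in for the completion callback resubmitting the transfer.
 * Returns true if it used a transfer that had already been freed. */
static bool model_callback_resubmit(struct model_xfer *x)
{
    bool use_after_free = x->freed;
    x->state = XFER_IN_FLIGHT;
    return use_after_free;
}
```

Starting a model_xfer in XFER_DONE_UNHANDLED, calling
teardown_assuming_idle() and then model_callback_resubmit() reports a
use after free, which is the shape of the crash in the backtrace.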
The documentation for libusb_transfer::status at
http://libusb.sourceforge.net/api-1.0/structlibusb__transfer.html#a64b2e70e…
states that it is only correct to read the field from within the
callback handler, and the documentation for libusb_cancel_transfer at
http://libusb.sourceforge.net/api-1.0/group__asyncio.html#ga685eb7731f9a059…
states that the cancellation is only complete when the callback
handler is called with status LIBUSB_TRANSFER_CANCELLED.
Although I imagine there are simpler solutions, I think the correct
solution would be to move cancellation of the transfers into the
callback handler entirely, to eliminate race conditions like this and
respect the libusb documentation. I would enjoy crafting a patch to
make such a change, if that would be helpful.
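One possible shape for that change, sketched as a small model rather
than as a patch: the callback alone decides whether a transfer is
resubmitted or retired, and buffers are released only after every
transfer has been retired by its callback. The names here (model_dev,
model_callback, model_safe_to_free) and the in_flight counter are
assumptions made for illustration, not existing librtlsdr fields. In
real code the callback would be the one registered on each
libusb_transfer, and it would check the transfer status there.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of callback-driven shutdown; names are illustrative. */
struct model_dev {
    bool canceling;  /* set by the thread that wants to stop streaming */
    int  in_flight;  /* transfers not yet retired by their callback */
    int  resubmits;  /* how many times a transfer was handed back */
};

/* Runs once per completed or cancelled transfer, like a libusb callback. */
static void model_callback(struct model_dev *dev)
{
    assert(dev->in_flight > 0);
    if (dev->canceling) {
        dev->in_flight--;  /* retire the transfer: do not resubmit */
        return;
    }
    dev->resubmits++;      /* normal streaming: resubmit, still in flight */
}

/* Buffers may be released only once every callback has retired its transfer. */
static bool model_safe_to_free(const struct model_dev *dev)
{
    return dev->canceling && dev->in_flight == 0;
}
```

With two transfers in flight, setting canceling and then running the
callback twice is what makes model_safe_to_free() return true; a failed
cancel request plays no part in the decision.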
Karl Semich