FCD/Alsa bug (Re: Bug hunting)

Christian Gagneraud chris at techworks.ie
Wed Aug 8 15:02:21 UTC 2012


Cross posting to discuss-gnuradio.

The bug in question is that if you instantiate an ALSA source on a busy 
device (opened by another app), the program crashes.

On 08/08/12 00:23, Dimitri Stolnikov wrote:
> Hi Christian,
[...]
>
> The other problem (segfault on throw in ctor) still has to be addressed.

Yes, I started to investigate, and it seems to me that this is not a 
gr-osmosdr bug but a gnuradio one, caused by gr-fcd.

This simple test program has the same problem, yet it only uses gr-fcd.

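#include <cstdlib>      // std::exit
#include <iostream>
#include <stdexcept>    // std::runtime_error
#include <fcd_source_c.h>

int main(int argc, char **argv)
{
     fcd_source_c_sptr fsrc;
     try {
         // "hw:2" is already opened by another application
         fsrc = fcd_make_source_c("hw:2"); // KO, from gr-fcd
     }
     catch (const std::runtime_error &e) {
         std::cerr << "Error!\n";
     }
     // Static objects are destroyed during exit (see __cxa_finalize below)
     std::exit(0);
}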

g++ test.cc -o test -I/usr/local/include/gnuradio -lgnuradio-fcd

Here is the log:
audio_alsa_source[hw:2]: Device or resource busy
Error!
*** glibc detected *** /home/cgagneraud/sdr/gr-osmosdr/test: free(): 
invalid pointer: 0x08052e3c ***
[...]

And here is a cleaned up backtrace:
operator delete
gruel::msg_accepter::~msg_accepter
checked_delete<gr_hier_block2>
boost::detail::sp_counted_impl_p<gr_hier_block2>::dispose
[...]
const, boost::shared_ptr<gr_basic_block> > > >::~map
__cxa_finalize
__do_global_dtors_aux
[...]
main

The problem is related to 
gnuradio-core/src/lib/runtime/gr_sptr_magic.{h,cc} and the static 
std::map in there.

The gr_hier_block2 ctor inserts "this" in this map, but then in the 
fcd_source ctor, the audio_alsa_source ctor throws an exception, so "this" 
(gr_hier_block2/fcd_source) is not a valid pointer anymore.
When the program exits, the map gets cleaned up and free is called on 
this pointer.
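
A minimal sketch of the mechanism described above. It does not use the real GNU Radio classes: stash(), hier_block_model and source_model are hypothetical stand-ins, and the stash only stores addresses (it does not own the objects), so the sketch shows the stale entry without reproducing the crash itself.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the static map in gr_sptr_magic: it records
// the address of every object whose base-class constructor has run.
std::map<const void*, std::string>& stash()
{
    static std::map<const void*, std::string> s;
    return s;
}

// Plays the role of gr_hier_block2: registers "this" during construction.
struct hier_block_model {
    explicit hier_block_model(const std::string& name) { stash()[this] = name; }
    // Deliberately no cleanup, to mirror the behaviour described above.
};

// Plays the role of fcd_source: construction fails after registration.
struct source_model : hier_block_model {
    explicit source_model(bool device_busy) : hier_block_model("source")
    {
        if (device_busy)
            throw std::runtime_error("Device or resource busy");
    }
};

// Returns the number of entries left behind by one failed construction.
std::size_t stale_entries_after_failure()
{
    stash().clear();
    assert(stash().empty());      // start from a clean stash
    try {
        source_model src(true);   // throws; "src" never finishes construction
    } catch (const std::runtime_error&) {
        // The object is gone, but the stash still holds its address.
    }
    return stash().size();
}
```

Calling stale_entries_after_failure() reports one leftover entry. In this model the entry is only an address; if the map instead owns the object, as the backtrace suggests, destroying the map at exit would free that address a second time.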

It's not possible to clean up the map in fcd_source, because the dtor is 
not called when an exception occurs in the ctor (which, btw, leads to some 
memory leaks in alsa_source: namely d_hw_params and d_sw_params).
It's a bad idea to call fetch_initial_sptr(this) before throwing in the 
ctor, because it seems the object gets deleted.
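
The leak mentioned above follows from a general C++ rule: when a constructor throws, that class's destructor does not run, but members that were already constructed are destroyed. A small sketch of the difference, assuming C++11 for std::unique_ptr; params, leaky_source and safe_source are hypothetical stand-ins, not the real ALSA or GNU Radio types.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Counts live "parameter" objects so that a leak becomes observable.
static int g_live_params = 0;

// Hypothetical stand-in for a parameter structure such as d_hw_params.
struct params {
    params()  { ++g_live_params; }
    ~params() { assert(g_live_params > 0); --g_live_params; }
};

// Raw-pointer version: cleanup lives in the destructor, which never runs
// when the constructor throws, so the allocation is leaked.
struct leaky_source {
    params* d_hw_params;
    leaky_source() : d_hw_params(new params)
    {
        throw std::runtime_error("Device or resource busy");
    }
    ~leaky_source() { delete d_hw_params; }
};

// RAII version: the already-constructed member is destroyed during
// stack unwinding, even though ~safe_source() itself never runs.
struct safe_source {
    std::unique_ptr<params> d_hw_params;
    safe_source() : d_hw_params(new params)
    {
        throw std::runtime_error("Device or resource busy");
    }
};

// Returns how many params objects survive a failed construction.
template <typename Source>
int live_params_after_failed_ctor()
{
    g_live_params = 0;
    try {
        Source s;   // always throws in this sketch
    } catch (const std::runtime_error&) {
    }
    return g_live_params;
}
```

Here live_params_after_failed_ctor<leaky_source>() leaves one object alive, while the safe_source variant leaves none.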

Maybe one solution could be to add a member function to mark the 
hier_block2 as a zombie; the gr_hier_block2 dtor could then clean up the 
map depending on that information.
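
A rough sketch of how such a zombie marker might look. It relies on the fact that the destructor of a fully constructed base class does run when a derived constructor throws. All names (hier_block_model, source_model, mark_zombie, stash) are hypothetical, and the stash holds plain addresses rather than the owning pointers of the real map, so the ownership questions of a real fix are not addressed.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <stdexcept>

// Plays the role of gr_hier_block2 with the proposed zombie flag.
class hier_block_model {
public:
    hier_block_model() : d_zombie(false) { stash().insert(this); }

    // Runs even if a derived constructor throws, because the base
    // subobject was fully constructed by then.
    virtual ~hier_block_model()
    {
        if (d_zombie)
            stash().erase(this);
    }

    // Called by a derived constructor just before it gives up.
    void mark_zombie()
    {
        assert(stash().count(this) == 1);   // must still be registered
        d_zombie = true;
    }

    // Hypothetical stand-in for the static map in gr_sptr_magic.
    static std::set<const hier_block_model*>& stash()
    {
        static std::set<const hier_block_model*> s;
        return s;
    }

private:
    bool d_zombie;
};

// Plays the role of fcd_source: marks itself as a zombie, then throws.
class source_model : public hier_block_model {
public:
    explicit source_model(bool device_busy)
    {
        if (device_busy) {
            mark_zombie();
            throw std::runtime_error("Device or resource busy");
        }
    }
};

// Returns the number of entries left after one failed construction.
std::size_t entries_after_failed_construction()
{
    hier_block_model::stash().clear();
    try {
        source_model src(true);
    } catch (const std::runtime_error&) {
    }
    return hier_block_model::stash().size();
}
```

With the flag set, the failed construction leaves no entry behind. The success path, where the entry would be handed over to the caller, is not modelled here.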

A workaround could be to catch the exception in the fcd_source ctor 
and throw exceptions whenever the fcd C API returns 
FCD_MODE_NONE. Well, this is kind of a dirty trick that has a lot of 
drawbacks.

Maybe some gnuradio gurus could shed some light on how to address this 
issue?

Chris


>
>
> Best regards,
> Dimitri
>
> On Tue, 07 Aug 2012 20:53:25 +0200, Christian Gagneraud
> <chris at techworks.ie> wrote:
>
>> On 07/08/12 15:49, Peter Stuge wrote:
>>> Christian Gagneraud wrote:
>>>> Please find attached the full log.
>>>
>>> Thanks!
>>>
>>>
>>>> I can provide you with a simple python program, but it will depend
>>>> on gnuradio, gr-osmosdr and you will need a compatible device
>>>> plugged in as well.
>>>
>>> No problem, but I think I found the problem.
>>>
>>>
>>>> libusb: 0.783771 debug [libusb_handle_events_timeout_completed]
>>>> doing our own event handling
>>>> libusb: 0.783793 debug [handle_events] poll() 3 fds with timeout in
>>>> 1000ms
>>>> libusb: 0.783811 debug [libusb_release_interface] interface 0
>>>> libusb: 0.785853 debug [handle_events] poll() returned 1
>>>
>>> The above lines show that the interface is released while event
>>> handling is still running. The destructor was called while the IO
>>> thread is running, and the destructor does cancel all IO, but it only
>>> waited 200 ms for the IO to complete, before it went on to close
>>> everything. Please try the attached patch for gr-osmosdr.
>>>
>>> This may or may not be the appropriate fix. This should perhaps be
>>> fixed inside rtl-sdr, or even libusb (we're already considering it)
>>> instead.
>>
>> It doesn't fix the problem alone. The script simply never exits.
>> I've just sent a patch for rtl-sdr that fixes the issue.
>> I think it's still possible that a call to join() could block the app
>> forever, so maybe a longer timeout would be safer here. The same
>> applies to the osmosdr wrapper code as well.
>>
>> Chris
>>
>>>
>>>
>>> //Peter
>>>
>>


-- 
Christian Gagneraud,
Embedded systems engineer.
Techworks Marine
1 Harbour road
Dun Laoghaire
Co. Dublin
Ireland
Tel: + 353 (0) 1 236 5990
Web: http://www.techworks.ie/



