Cross-posting to discuss-gnuradio.
The bug in question is that if you instantiate an alsa source on a busy device (one already opened by another app), the program crashes.
On 08/08/12 00:23, Dimitri Stolnikov wrote:
Hi Christian,
[...]
The other problem (segfault on throw in ctor) still has to be addressed.
Yes, I started to investigate, and it seems to me that this is not a gr-osmosdr bug but a gnuradio one, triggered through gr-fcd.
This simple test program has the same problem, yet it only uses gr-fcd.
    #include <cstdlib>    // exit
    #include <iostream>
    #include <stdexcept>  // std::runtime_error
    #include <fcd_source_c.h>

    int main(int argc, char **argv)
    {
        fcd_source_c_sptr fsrc;
        try {
            fsrc = fcd_make_source_c("hw:2"); // KO, from gr-fcd
        } catch (std::runtime_error &e) {
            std::cerr << "Error!\n";
        }
        exit(0);
    }

g++ test.cc -o test -I/usr/local/include/gnuradio -lgnuradio-fcd
Here is the log:

    audio_alsa_source[hw:2]: Device or resource busy
    Error!
    *** glibc detected *** /home/cgagneraud/sdr/gr-osmosdr/test: free(): invalid pointer: 0x08052e3c ***
    [...]

And here is a cleaned-up backtrace:

    operator delete
    gruel::msg_accepter::~msg_accepter
    checked_delete<gr_hier_block2>
    boost::detail::sp_counted_impl_p<gr_hier_block2>::dispose
    [...] const, boost::shared_ptr<gr_basic_block> > > >::~map
    __cxa_finalize
    __do_global_dtors_aux
    [...]
    main

The problem is related to gnuradio-core/src/lib/runtime/gr_sptr_magic.{h,cc} and the static std::map in there.
The gr_hier_block2 ctor inserts "this" into this map, but then, in the fcd_source ctor, the audio_alsa_source ctor throws an exception, so "this" (gr_hier_block2/fcd_source) is no longer a valid pointer. When the program exits, the map gets cleaned up and free is called on this pointer.
It's not possible to clean up the map in fcd_source, because the dtor is not called when an exception occurs in the ctor (which, btw, leads to some memory leaks in alsa_source: namely d_hw_params and d_sw_params). Calling fetch_initial_sptr(this) before throwing in the ctor is a bad idea too, because it seems the object then gets deleted.
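To illustrate the mechanism outside of gnuradio, here is a minimal sketch. All names in it (Base, Derived, g_registry and the counters) are hypothetical stand-ins, not gnuradio code: Base plays the role of gr_hier_block2, Derived the role of fcd_source, and g_registry the role of the static map, simplified to a set that only records addresses and owns nothing. It shows that when the derived ctor throws, the derived dtor is skipped, the already-constructed base is destroyed and the memory released, and the registry keeps an entry for an address that is no longer valid.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <stdexcept>

// Hypothetical stand-in for the static map in gr_sptr_magic.
// Simplification: it only records addresses and owns nothing.
static std::set<const void*> g_registry;

static int g_base_dtor_calls = 0;
static int g_derived_dtor_calls = 0;

struct Base {                                 // role of gr_hier_block2
    Base() { g_registry.insert(this); }       // registers "this" early
    virtual ~Base() { ++g_base_dtor_calls; }  // does not unregister
};

struct Derived : Base {                       // role of fcd_source
    explicit Derived(bool fail) {
        if (fail)
            throw std::runtime_error("Device or resource busy");
    }
    ~Derived() { ++g_derived_dtor_calls; }
};

// Attempts a construction that fails and returns how many entries are
// left behind in the registry afterwards.
std::size_t stale_entries_after_failed_ctor() {
    try {
        // If the ctor throws, the Base subobject is destroyed and the
        // storage obtained by new is released automatically.
        Base* p = new Derived(true);
        delete p;  // never reached in this sketch
    } catch (const std::runtime_error&) {
        // ~Derived did not run; ~Base did.
    }
    return g_registry.size();
}
```

In this simplified form the stale entry is harmless because the set never dereferences or frees it. In the case reported above, the backtrace suggests the map entry does release the object at exit, which is where the invalid free would come from.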
Maybe one solution could be to add a member function that marks the hier_block2 as a zombie; the gr_hier_block2 dtor could then clean up the map depending on that information.
A workaround could be to catch the exception in the fcd_source ctor, and throw exceptions whenever the fcd C API returns FCD_MODE_NONE. Well, this is kind of a dirty trick that has a lot of drawbacks.
Maybe some gnuradio gurus could shed some light on how to address this issue?
Chris
Best regards, Dimitri
On Tue, 07 Aug 2012 20:53:25 +0200, Christian Gagneraud chris@techworks.ie wrote:
On 07/08/12 15:49, Peter Stuge wrote:
Christian Gagneraud wrote:
Please find attached the full log.
Thanks!
I can provide you with a simple python program, but it will depend on gnuradio and gr-osmosdr, and you will need a compatible device plugged in as well.
No problem, but I think I found the problem.
    libusb: 0.783771 debug [libusb_handle_events_timeout_completed] doing our own event handling
    libusb: 0.783793 debug [handle_events] poll() 3 fds with timeout in 1000ms
    libusb: 0.783811 debug [libusb_release_interface] interface 0
    libusb: 0.785853 debug [handle_events] poll() returned 1

The above lines show that the interface is released while event handling is still running. The destructor is called while the IO thread is still running. It does cancel all IO, but it only waits 200 ms for the IO to complete before going on to close everything. Please try the attached patch for gr-osmosdr.
This may or may not be the appropriate fix. This should perhaps be fixed inside rtl-sdr, or even libusb (we're already considering it) instead.
It doesn't fix the problem on its own. The script simply never exits. I've just sent a patch for rtl-sdr that fixes the issue. I think it's still possible that a call to join() could block the app forever, so maybe a longer timeout would be safer here. The same applies to the osmosdr wrapper code.
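As an illustration of the "wait with a timeout rather than join() unconditionally" idea, here is a minimal sketch using only the C++11 standard library. It is not the gr-osmosdr or rtl-sdr code, and stop_worker and demo_bounded_stop are hypothetical names. The worker signals completion through a promise, and the caller waits on the matching future for a bounded time, so it can tell whether the worker really finished before tearing anything down.

```cpp
#include <atomic>
#include <chrono>
#include <future>
#include <thread>

// Hypothetical helper: ask the worker to stop, then wait at most
// `timeout` for it to report completion. Returns true if it finished.
bool stop_worker(std::atomic<bool>& stop_flag,
                 std::future<void>& done,
                 std::chrono::milliseconds timeout) {
    stop_flag.store(true);
    return done.wait_for(timeout) == std::future_status::ready;
}

// Starts a worker that polls the stop flag, then stops it with a
// bounded wait. Returns true if the worker finished within the timeout.
bool demo_bounded_stop() {
    std::atomic<bool> stop(false);
    std::promise<void> finished;
    std::future<void> done = finished.get_future();

    std::thread worker([&] {
        while (!stop.load())
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        finished.set_value();  // signal completion to the waiter
    });

    bool ok = stop_worker(stop, done, std::chrono::seconds(2));
    worker.join();  // safe here: this worker always observes the flag
    return ok;
}
```

If the wait times out, the caller knows the worker is still running and can decide what to do (wait longer, log, or skip the close) instead of releasing resources underneath it.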
Chris
//Peter