This patch was (of course) still buggy, but now I can actually add a working
general solution and a bit more insight.
At first I was really annoyed by this deallocation spaghetti distracting me yet
another time from breaking through to inter-MSC HO; I was already questioning
all the nice and logical FSM references I had designed in osmo-msc, even
contemplated just running off and letting someone else solve it... But now I am
quite glad that I took a closer look, because with this patch I can even remove
some events and states (maybe drop some FSM instances entirely, which were only
introduced to receive a parent_term event), while actually becoming safer than
before and having to do almost no thinking to achieve that.
The new fsm_dealloc_test.c and an improvement to fsm.c are pushed to
branch neels/fsm_dealloc_test:
http://git.osmocom.org/libosmocore/log/?h=neels/fsm_dealloc_test
The first of the three patches shows that the current situation does not work
out at all.
The second patch applies fsm.c "fixes" and shows all situations magically
working well.
The third patch simplifies fsm_dealloc_test.c, because it no longer needs
ST_DESTROYING once the new safeguards are in place. Nice.
So far, FSM instances "freed" during an osmo_fsm_inst_term() cascade are
talloc_steal()ed to the fi of the first/outermost osmo_fsm_inst_term() call as
talloc parent, so that all of them get freed in one go at the end.
Instead, "freed" instances could be reparented to a future thread volatile
select context once it shows up. For now I'm very glad that I can easily fix
my osmo-msc without having to depend on the select volatile ctx.
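To illustrate the "collect during the cascade, free once at the end" idea, here
is a minimal sketch in plain C. It is not the fsm.c code and does not use
talloc: struct toy_inst, toy_alloc() and toy_term() are made-up names, and a
simple linked list stands in for reparenting via talloc_steal(). The point is
only that instances terminated anywhere in the cascade stay valid memory until
the outermost call returns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an FSM instance, not the libosmocore struct. */
struct toy_inst {
	struct toy_inst *parent;
	struct toy_inst *child;      /* one child only, to keep the sketch short */
	struct toy_inst *next_freed; /* link in the deferred-free list */
	int terminating;
};

/* Instances "freed" during a cascade are parked here (assumption: a plain
 * list models what talloc_steal() to the outermost fi achieves). */
static struct toy_inst *freed_head = NULL;
static int term_depth = 0;
static int total_freed = 0;

struct toy_inst *toy_alloc(struct toy_inst *parent)
{
	struct toy_inst *fi = calloc(1, sizeof(*fi));
	if (!fi)
		return NULL;
	fi->parent = parent;
	if (parent)
		parent->child = fi;
	return fi;
}

void toy_term(struct toy_inst *fi)
{
	/* Never re-enter termination for the same instance. */
	if (!fi || fi->terminating)
		return;
	fi->terminating = 1;
	term_depth++;

	/* Cascade both ways: the child goes down, and in this toy model the
	 * parent reacts to the child's termination by terminating as well. */
	toy_term(fi->child);
	toy_term(fi->parent);

	/* Park instead of freeing: other frames of the cascade may still
	 * hold pointers to this instance. */
	assert(fi->next_freed == NULL);
	fi->next_freed = freed_head;
	freed_head = fi;

	/* Only the outermost call actually releases memory. */
	term_depth--;
	if (term_depth == 0) {
		while (freed_head) {
			struct toy_inst *next = freed_head->next_freed;
			free(freed_head);
			freed_head = next;
			total_freed++;
		}
	}
}
```

Terminating the child of a parent/child pair walks child -> parent -> child
again; the second visit to the child is rejected by the terminating check, and
both instances are released only after the outermost toy_term() has unwound.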
- add flag osmo_fsm_inst.deallocating. If true, fsm.c denies all action
triggers on the FSM instance (no osmo_fsm_inst_term(), no
osmo_fsm_inst_state_chg(), no osmo_fsm_inst_dispatch()).
Actually it is only necessary to avoid re-entering osmo_fsm_inst_term() for the
same FSM instance.
- Dispatching events is fine: if FSM implementations require thwarting events
when terminating, the event handlers can simply test for the new
fi->proc.terminating flag; also, some FSM implementations may actually rely
on receiving events while already terminating, e.g. to dereference other
deallocating objects and not attempt to clean those twice.
- Changing state during osmo_fsm_inst_term() is also fine, by the same
reasoning.
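The points above could look roughly like this in a toy model. Again this is a
sketch under assumptions, not the fsm.c implementation: struct ev_fi,
ev_dispatch(), ev_term() and the two events are invented for illustration, and
the plain "terminating" member merely plays the role of the
fi->proc.terminating flag. Only re-entering termination is refused; events
still get through, and each handler decides for itself whether to act.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature FSM instance with counters for observation. */
struct ev_fi {
	bool terminating;
	int handled;   /* events that were acted upon */
	int ignored;   /* events thwarted because we are terminating */
	int term_runs; /* how often the termination body actually ran */
};

enum ev_event {
	EV_CHILD_GONE, /* still wanted while terminating */
	EV_START_WORK, /* pointless while terminating */
};

void ev_dispatch(struct ev_fi *fi, enum ev_event ev)
{
	assert(fi != NULL);
	switch (ev) {
	case EV_CHILD_GONE:
		/* Useful even during teardown, e.g. to drop a pointer to an
		 * object that is already being deallocated elsewhere. */
		fi->handled++;
		break;
	case EV_START_WORK:
		/* The handler itself tests the flag and backs off. */
		if (fi->terminating) {
			fi->ignored++;
			break;
		}
		fi->handled++;
		break;
	}
}

/* Returns true if termination ran, false if it was a rejected re-entry. */
bool ev_term(struct ev_fi *fi)
{
	assert(fi != NULL);
	if (fi->terminating)
		return false;
	fi->terminating = true;
	fi->term_runs++;

	/* Cleanup code may dispatch events back to the same instance... */
	ev_dispatch(fi, EV_CHILD_GONE);
	ev_dispatch(fi, EV_START_WORK);
	/* ...or even try to terminate it again, which is a no-op. */
	ev_term(fi);
	return true;
}
```

With this shape, no separate "destroying" state is needed just to block
events: the one flag prevents double termination, and everything else is left
to the event handlers.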
~N