The good part is that we actually get the CMCE messages that tell us when and where the voice frames that belong together will appear, i.e. we can identify the start and end of individual push-to-talk 'segments'. This could be a nice basis for extracting them in a useful format.
Ah, nice!
I was wondering how that worked... So far I was just using the number in the DL_USAGE field, which actually incremented at each segment here, IIRC :p
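
Roughly what I mean, as a quick sketch only: the (usage, payload) tuples and the split_segments() helper are made up for illustration and are not what the decoder actually outputs, and it assumes the usage number really does change at every segment boundary, which I have only seen on the captures here:

```python
from itertools import groupby


def split_segments(frames):
    """Group consecutive voice frames that share the same usage number.

    `frames` is a hypothetical iterable of (usage, payload) tuples in
    the order they were received. A change in `usage` is treated as the
    start of a new push-to-talk segment (an assumption, see above).
    """
    return [
        (usage, [payload for _, payload in group])
        for usage, group in groupby(frames, key=lambda frame: frame[0])
    ]


# Made-up example data: three segments with usage numbers 4, 5 and 6.
frames = [(4, b"a"), (4, b"b"), (5, b"c"), (5, b"d"), (5, b"e"), (6, b"f")]
segments = split_segments(frames)
```

Using the CMCE messages instead would give proper start/end markers rather than relying on the number happening to change.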
Cheers,
Sylvain