There has been additional academic work on "data-over-voice"
over the last few years, mainly out of Chinese and Iranian
universities. Some of the papers claim reliable data rates of
2000 bps or higher (not including error correction), which is
good enough for low data rate vocoders (e.g. Speex).
I imagine that in a real-world scenario something even more
narrowband than Speex would be required, like the vocoders
intended for HF radio/military use.
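
To make the rate budget concrete, here is a rough sketch of the
arithmetic. The 2000 bps figure is the one quoted above; the rate-1/2
error correction code and the list of candidate vocoder rates are
assumptions for illustration only, not numbers from any of the papers.

```python
# Rough rate budget: how much vocoder bit rate is left once error
# correction is layered on top of a data-over-voice modem.
# All names and numbers here are illustrative assumptions.

def usable_rate(channel_bps, code_rate):
    """Payload rate left after FEC with the given code rate (0 < r <= 1)."""
    if not 0 < code_rate <= 1:
        raise ValueError("code_rate must be in (0, 1]")
    return channel_bps * code_rate

def vocoders_that_fit(channel_bps, code_rate, candidates):
    """Return the names of candidate vocoder rates within the budget."""
    budget = usable_rate(channel_bps, code_rate)
    return [name for name, bps in candidates.items() if bps <= budget]

# Hypothetical candidate rates in bit/s, keyed by a label.
CANDIDATES = {"2400": 2400, "1200": 1200, "800": 800, "600": 600}

if __name__ == "__main__":
    # 2000 bps raw modem rate, assumed rate-1/2 FEC -> 1000 bps payload.
    print(vocoders_that_fit(2000, 0.5, CANDIDATES))
```

With the assumed rate-1/2 code, only the candidates at or below
1000 bit/s remain, which is why a very narrowband vocoder looks
attractive once error correction is taken into account.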
So far, the modems have been implemented at the audio interface
of the handset, not on the raw OTA compressed voice frames. I
believe a project such as this will make it possible to improve
these modems by giving direct control over the content of
the traffic frames. At the very least, I think it will help with
synchronization between the end points and make handling of
VAD/DTX/CNG easier.
The problem is made very difficult because of transcoding
done within the network. And, depending on the call, there may
be different transcode operations between end points (or maybe
none at all?). Also, the voice frame parameters are not all
protected equally from channel errors. My hope was to actually
start characterizing network transcode behaviors, since I have
not yet found any real data on this. It is possible to do this
without access to the traffic frames, but it makes for more
complex test cases.
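
As a sketch of what such a characterization could look like with
access to the traffic frames: send known frames, capture what arrives
at the far end, and tally how often each bit position differs. The
function names, the representation of frames as integers, and the
assumption that sent and received frames are already aligned
one-to-one are all hypothetical. In practice the alignment is part of
the problem.

```python
# Per-bit-position comparison of sent vs. received voice frames.
# Frames are modelled as Python ints holding bits_per_frame bits,
# with bit index 0 being the most significant bit. This layout is an
# assumption for illustration, not a real codec frame format.

def bit_flip_rates(sent_frames, received_frames, bits_per_frame):
    """Return, for each bit position, the fraction of frames where it differs."""
    counts = [0] * bits_per_frame
    total = 0
    for sent, received in zip(sent_frames, received_frames):
        diff = sent ^ received  # set bits mark positions that changed
        for i in range(bits_per_frame):
            if (diff >> (bits_per_frame - 1 - i)) & 1:
                counts[i] += 1
        total += 1
    if total == 0:
        raise ValueError("no frames to compare")
    return [c / total for c in counts]

def is_bit_exact(sent_frames, received_frames, bits_per_frame):
    """True if every compared frame arrived unchanged."""
    rates = bit_flip_rates(sent_frames, received_frames, bits_per_frame)
    return all(r == 0 for r in rates)

if __name__ == "__main__":
    sent = [0b1010, 0b1111]
    received = [0b1010, 0b0111]
    print(bit_flip_rates(sent, received, 4))
```

A bit-exact result would suggest no transcoding on that call. If the
path does transcode, most bits will probably change, so a comparison
at the level of decoded codec parameters would likely be more
informative than this raw bit tally.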
Well, targeting very low rates such as 800 bit/s would probably
also improve the ability to work across more transcoding steps.