1) The MNCC interface carries both voice and signalling.  For signalling
   you would like something that resembles TCP/SCTP/DCCP, but for the
   voice you would only like UDP semantics.  Choosing either a reliable
   protocol for voice frames or an unreliable protocol for signalling
   is calling for lots of trouble and will not happen.  So it would have
   to be multiple sockets.

Currently I am establishing RTP sockets for voice (from the jolly/testing branch), while a TCP/IP socket is used for signaling. So yes, the MNCC socket is used only for signaling. (I also thought about using it as a "backup" socket in case RTP fails, but then the reliable vs. unreliable issue pops up.)
 
2) I don't think the current protocol is endian/alignment safe.  By
   running it over a unix domain socket we basically enforce that both
   programs on the MNCC side will run on the same architecture and not
   cause any problems.  If you run it over a network, making that
   assumption is false.

The endianness can be carried in the payload of each packet to make it safer, or it can be exchanged once at socket setup, for instance in the MNCC_SOCKET_HELLO message.
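
To make the socket-setup option a bit more concrete, here is a rough sketch in Python. It assumes a hypothetical 32-bit marker field sent at the start of the hello message; the marker value and the function names are made up for illustration and are not the real MNCC_SOCKET_HELLO layout. It also only covers byte order, not struct alignment or padding.

```python
import struct
import sys

# Hypothetical marker value, chosen so that its bytes differ in every
# position; it is NOT part of the real MNCC protocol.
BYTE_ORDER_MARK = 0x01020304


def build_hello(byte_order=sys.byteorder):
    """Pack the marker in the sender's byte order ('little' or 'big')."""
    fmt = "<I" if byte_order == "little" else ">I"
    return struct.pack(fmt, BYTE_ORDER_MARK)


def detect_peer_byte_order(hello):
    """Return 'little' or 'big' depending on how the marker arrived."""
    if len(hello) < 4:
        raise ValueError("hello message too short")
    if struct.unpack("<I", hello[:4])[0] == BYTE_ORDER_MARK:
        return "little"
    if struct.unpack(">I", hello[:4])[0] == BYTE_ORDER_MARK:
        return "big"
    raise ValueError("unrecognised byte-order marker")
```

The receiver would call detect_peer_byte_order() once on the first message and then unpack all later fields accordingly. A simpler alternative would be to always send every field in network byte order, so that no negotiation is needed at all.
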
Currently I am carrying the size of each signaling message that is sent, in order to distinguish between gsm_mncc and gsm_mncc_rtp (for now). I did it more as an experiment than anything else, but I thought it was a waste to send 840 bytes of data when gsm_mncc_rtp messages are only 24 bytes long!
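
The size prefix could look roughly like the sketch below. This is Python for brevity, not the actual code: the 4-byte length header in network byte order and the function names frame()/unframe() are my assumptions for illustration. Since TCP is a byte stream, the receiver has to cope with partial messages, so unframe() returns whatever bytes are left over.

```python
import struct

# Assumed header: 4-byte unsigned length in network byte order.
HEADER = struct.Struct("!I")


def frame(payload):
    """Prefix a signaling message with its length."""
    return HEADER.pack(len(payload)) + payload


def unframe(buffer):
    """Split a receive buffer into complete messages.

    Returns (messages, remaining), where remaining holds the bytes of
    a not yet complete message that must be kept for the next read.
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # payload not fully received yet
        messages.append(buffer[offset + HEADER.size:end])
        offset = end
    return messages, buffer[offset:]
```

With this, a 24-byte gsm_mncc_rtp message costs 28 bytes on the wire instead of 840, and the receiver can tell the two message types apart by the length it reads before the payload.
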
 
Will you be working on implementing this?

I might need a certain amount of guidance due to my lack of experience.

Luca