Hi folks,
just now while resolving some unrelated merge conflict, I notice this bit of code we have in osmo-msc master. (It has recently been tweaked, but the code in question has been like this for a long time.)
https://git.osmocom.org/osmo-msc/tree/src/libmsc/gsm_04_11.c#n1057
static struct gsm_trans *gsm411_alloc_mt_trans(struct gsm_network *net, struct vlr_subscr *vsub) { struct ran_conn *conn; struct gsm_trans *trans; int tid;
LOGP(DLSMS, LOGL_INFO, "Going to send a MT SMS\n");
/* Generate a new transaction ID */ tid = trans_assign_trans_id(net, vsub, GSM48_PDISC_SMS, 0); if (tid == -1) { LOGP(DLSMS, LOGL_ERROR, "No available transaction IDs\n"); return NULL; }
/* Attempt to find an existing connection */ conn = connection_for_subscr(vsub);
/* Allocate a new transaction */ trans = gsm411_trans_init(net, vsub, conn, tid); if (!trans) return NULL;
if (conn) { <===== if no active communication, this is NULL /* Generate unique RP Message Reference */ trans->sms.sm_rp_mr = conn->next_rp_ref++; <===== here }
/* Use SAPI 3 (see GSM 04.11, section 2.3) */ trans->dlci = UM_SAPI_SMS;
return trans; }
We often create an MT SMS transaction when no connection to the subscriber is currently available, which is the case if we create the SMS transaction and only page later.
(To prove that, the call flow is gsm411_send_sms(vsub) trans = gsm411_alloc_mt_trans(vsub) gsm411_rp_sendmsg(trans) gsm411_smr_send() srdownstatelist[0] = gsm411_rl_data_req() gsm411_mn_send() gsm411_mm_send() gsm411_mmsms_est_req() trans->paging_request = subscr_request_conn(...) )
But we only assign an sm_rp_mr reference number when the conn is already present. That means when we send an SMS, the sm_rp_mr reference is unset when we still need to page first??
Does anyone know this code well? I'm almost certain we should always set an RP reference? (i.e. drop the 'if (conn)' condition)
What errors could arise from this? It seems that we would always send an RP reference of zero for all SMS that require paging. Could that be a reason Rhizomatica sometimes saw SMS delivered to the wrong recipient?
So, does anyone already know that I'm on the wrong track here -- otherwise I'll create an issue for this, probably also needing tests.
~N
Hi Neels,
Does anyone know this code well? I'm almost certain we should always set an RP reference? (i.e. drop the 'if (conn)' condition)
I am the author of this code. Thanks a lot for looking into this! Actually, it was assumed that if there is no active RAN connection, we can just start counting from 0x00, as there are no other SMS related transactions, and transaction itself is allocated using talloc_zero(). Until now it was looking good, but...
I just realized that as soon as we establish RAN connection with subscriber, we already have a transaction with sm_rp_mr == 0x00, but conn->next_rp_ref also remains == 0x00, it isn't increased!
This means that we can face a SM-RP-MR conflict (or collision) if another MT SMS would arrive to the MSC (from SMSC over GSUP) when this transaction is still active, i.e. the first SMS is still being sent, because conn->next_rp_ref++ would return 0x00 too :/
I will write a TTCN test case, and fix this issue. Thanks!
Could that be a reason Rhizomatica sometimes saw SMS delivered to the wrong recipient?
I don't think they are using "SMS over GSUP", so most likely no. The 'trans->sms.sm_rp_mr' is only used by "SMS over GSUP" code.
With best regards, Vadim Yanitskiy.
Hi again,
I will write a TTCN test case, and fix this issue. Thanks!
Please see:
(TTCN-3 TC) https://gerrit.osmocom.org/#/c/osmo-ttcn3-hacks/+/12627/ (OsmoMSC) https://gerrit.osmocom.org/#/c/osmo-msc/+/12628/
With best regards, Vadim Yanitskiy.