Attention is currently required from: lynxis lazus.
pespin has posted comments on this change by lynxis lazus. ( https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309?usp=email )
Change subject: Add TCAP based loadsharing/routing ......................................................................
Patch Set 8: Code-Review-1
(50 comments)
Patchset:
PS8: I'd prefer having src/ss7_as_loadshare_tcap.{c,h} renamed to have a "tcap_" prefix at the start, eg. "tcap_loadshare.c", to have all tcap files together.
Commit Message:
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/5466b547_988322... : PS8, Line 13: This TCAP session tracking is similar to IP connection tracking. I guess you mean TCP connection here?
File examples/sccp_demo_user.c:
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/4669b36e_046adb... : PS8, Line 30: void *talloc_asn1_ctx; why is this added?
File src/ipa.c:
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/288b4d18_844341... : PS8, Line 51: #include "ss7_as_loadshare_tcap.h" #ifdef WITH_TCAP_LOADSHARING
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/7f375404_c2cf69... : PS8, Line 315: /* update TCAP Transaction Tracking (Rx) */ #ifdef WITH_TCAP_LOADSHARING
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/7a2acee7_1fa70c... : PS8, Line 316: ss7_asp_tcap_rx_sccp(as, asp, opc, dpc, msg); This function naming is a bit misleading since it's not even known whether there's TCAP in the msg...
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/c23f9387_9807ad... : PS8, Line 360: case IPAC_PROTO_OSMO: #ifdef WITH_TCAP_LOADSHARING
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/364ade8c_25d562... : PS8, Line 368: if (hh_ext->proto == IPAC_PROTO_EXT_TCAP_ROUTING) { (probably a switch case here will turn out cleaner together with the ifdef.)
File src/m3ua.c:
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/fc0b3dde_fe4349... : PS8, Line 50: #include "ss7_as_loadshare_tcap.h" #ifdef WITH_TCAP_LOADSHARINGActually, I don't see this header being used in this file. Please have a look...
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/5ef17596_178ebd... : PS8, Line 343: if (!xua) This looks unrelated to this commit? Seems you want to add another commit describing this.
File src/ss7_as.h:
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/4e180de7_8a7159... : PS8, Line 6: #include <osmocom/core/hashtable.h> #ifdef WITH_TCAP_LOADSHARING
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/2aec4c38_576e38... : PS8, Line 97: /* optimisation: true if tid_ranges contains PCs (not only wildcards) */ #ifdef WITH_TCAP_LOADSHARING
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/a58f0372_d4b764... : PS8, Line 145: /* Should we do load-sharing based on tcap ids? */ #ifdef WITH_TCAP_LOADSHARING
File src/ss7_as.c:
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/b16d090f_7d4f90... : PS8, Line 40: #include "ss7_as_loadshare_tcap.h" #ifdef WITH_TCAP_LOADSHARING
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/0bd39052_96eaf5... : PS8, Line 139: hash_init(as->tcap.tid_ranges); #ifdef WITH_TCAP_LOADSHARING
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/3ee7de6e_d1a716... : PS8, Line 227: tcap_asp_down(asp); I'm not sure this really belongs here. This function (osmo_ss7_as_del_asp) is to unassign a ASP from an AS, not related to an ASP going down. It's more an administrative configuration than a state change. So either change the function name most probably, or move it somewhere else.
And ofc #ifdef WITH_TCAP_LOADSHARING.
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/fa007d58_5f2947... : PS8, Line 246: tcap_disable(as); #ifdef WITH_TCAP_LOADSHARING
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/6fe9a695_f0204a... : PS8, Line 545: case OSMO_SS7_AS_TMOD_LOADSHARE: #ifdef WITH_TCAP_LOADSHARING we definetly not want to hit this path for 99% of users every time a packet is being routed...
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/930ce3fc_6a579a... : PS8, Line 634: extra line not needed.
File src/ss7_as_loadshare_tcap.h:
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/e0972887_f89394... : PS8, Line 3: #include <complex.h> I wonder why do we need this here...
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/3b36ee96_ee2e93... : PS8, Line 14: #ifdef WITH_TCAP_LOADSHARING YOu can remove this, simply don't even include it if not requested by configure...
File src/ss7_as_loadshare_tcap.c:
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/327be0b0_8aec90... : PS8, Line 61: return osmo_ntohl(*(uint32_t *)src->buf); is this always properly aligned? maybe use osmo_load_32be or however it's called?
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/8e582474_9cfaba... : PS8, Line 72: tcapmsg = talloc_zero(as, struct TCAP_TCMessage); Can we avoid talloc-allocating a new chunk for every message? Simply "struct TCAP_TCMessage tcapmsg = {0};" should probably work?
If a talloc context is needed, then probably keeping it as a static variable and allocating it once is enough.
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/3d5ac30f_dcb8c6... : PS8, Line 138: ssn ^= ((pc >> 24) & 0xff); A point code is 13 bits in ITU iirc, you seem to be dropping 5 bits here. Is this expected?
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/b27aa519_90a514... : PS8, Line 159: static struct osmo_ss7_asp *tcap_hlist_get(struct osmo_ss7_as *as, uint32_t pc, uint8_t ssn, uint32_t tid) const as?
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/d1b111de_c44288... : PS8, Line 273: /** Traffic from the TCAP ASP -> AS -> osmo-stp, only used to update transaction tracking I wonder whether this step is mandatory or should be enabled depending on setup?
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/1ec3f764_9c8a0a... : PS8, Line 316: /* TCAP transaction tracking requires point codes */ In theory at some point in the future we can do this also translating from GT to PC afaiu.
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/d8c3a975_7fc4b3... : PS8, Line 379: /* FIXME: use UTDS */ This needs to be fixed?
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/cb9b2fd5_78615e... : PS8, Line 380: static int sent_back_utds(struct osmo_ss7_as *as, Please add a description for this function, I'm not getting what's its purpose.
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/3125ceba_07bdf3... : PS8, Line 473: * \return return is not described.
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/428f89d8_eab42c... : PS8, Line 475: static int asp_loadshare_tcap_sccp(struct osmo_ss7_asp **rasp, struct osmo_ss7_as *as, uint32_t opc, uint32_t dpc, imho the EPROTONOSUPPORT should be dropped, and return rasp instead.
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/fec7012f_f0a9a7... : PS8, Line 487: /* decode SCCP and convert to a SUA/xUA representation */ I think you have similar code logic in one of the above functions, you may want to see if it makes sense to add a helper function to do all checks and fill data.
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/60f7493c_f5914c... : PS8, Line 560: sent_back_utds(as, xua, sua, SCCP_RETURN_CAUSE_SUBSYSTEM_FAILURE); You may want to look at whether it makes more sense to send a routing failure: "sccp_rout_fail_enqueue(inst, xua, SCCP_RETURN_CAUSE_MTP_FAILURE, xua->hdr.msg_class == SUA_MSGC_CO);"
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/3b20b16a_02afe4... : PS8, Line 607: * @return 0 on success EPROTONOSUPPORT is not documented here. In case, I'd drop EPROTONOSUPPORT and return asp instead.
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/2043cd0b_7b95e4... : PS8, Line 752: LOGPASP(asp, DLSS7, LOGL_ERROR, "TCAP routing message is too small\n"); I think we really want to add a new log category "TCAP" here.
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/283d18cb_a977ae... : PS8, Line 921: /** Called when the ASP is going down or free'd Then you are currently calling it fromthe wrong place apparently.
File src/ss7_as_vty.c:
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/218178ec_7c44d7... : PS8, Line 39: #include "ss7_as_loadshare_tcap.h" #ifdef WITH_TCAP_LOADSHARING
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/10e26bb1_36d730... : PS8, Line 158: tcap_disable(as); #ifdef WITH_TCAP_LOADSHARING
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/3d196d11_0484e1... : PS8, Line 165: "traffic-mode (loadshare|loadshare-tcap) [bindings] [sls] [opc-sls] [opc-shift] [<0-2>]", IMHO loadshare-tcap feature should have a separate command to enable/disable it. Leave alone the traffic-mode loadshare one, is already complex enough. Otherwise you are mixing concepts in different protocol layers. Eg: "[no] tcap-loadshare"
Then, drop all the tcap_disable/enable() from all "traffic-mode" commands.
File src/ss7_asp.h:
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/e722613c_eacce9... : PS8, Line 96: bool enabled; is this really needed? Does it make sense to have tcap loadsharing enabled in only some ASPs of the AS? I doubt so.
File src/ss7_asp.c:
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/0f7958cc_99510f... : PS8, Line 53: #include "ss7_as_loadshare_tcap.h" #ifdef WITH_TCAP_LOADSHARING
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/4f900a11_d2a366... : PS8, Line 1216: tcap_asp_down(asp); nooo way you are calling this here. Put it where it belongs in the FSM.
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/64863fbc_c5846a... : PS8, Line 1344: tcap_asp_down(asp); nooo way you are calling this here. Put it where it belongs in the FSM
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/fe6a2bb9_8dddf5... : PS8, Line 1435: /*! Received an unknown packet on asp Unrelated to this commit, please submit another one.
File src/ss7_internal.h:
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/23fe8af9_75c77a... : PS8, Line 9: #include <osmocom/sigtran/mtp_sap.h> why is this now needed here? if needed, please submit another commit.
File src/tcap_transaction_tracking.h:
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/2e560cc7_c465a3... : PS8, Line 36: struct tcap_trxn_track_entry *tcap_trxn_track_create_entry( We usually have the struct name and then the action, ie. tcap_trxn_track_entry_create()
Same applies for all the other APIs.
File src/tcap_transaction_tracking.c:
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/6fcfc093_e8bd2c... : PS8, Line 122: LOGPASP(entry->asp, DLSS7, LOGL_DEBUG, "Creating tcap cache, entry (own) pc/ssn/tid %u/%u/%u -> %u/%u/%u\n", we always want to log PC in both numeric and configured format. See osmo_ss7_pointcode_print_buf().
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/d88f79ec_58c025... : PS8, Line 141: if (entry->own_tid.tid_valid) You can probably simplify this strcut removing these "valid" fields by initializing entry->own_tid.list and then checking if it is empty.
File src/xua_as_fsm.c:
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/d78c708a_8df17e... : PS8, Line 204: asp = ss7_as_select_asp(as, xua); This could actually go in a preparatory patch to simplify this one a bit.
File stp/stp_main.c:
https://gerrit.osmocom.org/c/libosmo-sigtran/+/41309/comment/13b34653_b21f41... : PS8, Line 52: void *talloc_asn1_ctx; Why is this needed?