I'm not deeply familiar with the topic, but reading along, the below would be my bikeshed feedback =)
On Tue, Feb 27, 2024 at 02:38:11PM -0000, Mauro Levra wrote:
As you know, the current definition of the GSMTAP header has a version field that could enable new versions to be safely introduced. Unfortunately, Wireshark decoder does not check the value of the version field, it just presents the value in the decoded output. It seems that the only way to define a new GSMTAP version that does not break Wireshark decoder is to preserve the initial 16 header octets.
Just to understand the aim: we can of course fix the Wireshark dissector to handle the version properly. Is the idea of keeping v3 backwards compatible only about making older Wireshark installations "magically" compatible with the new GSMTAP version? I find that this is just a nice-to-have, and I have often upgraded Wireshark in order to support a specific protocol. What makes v3 to v2 compatibility such an important requirement? To me it is "just" a short transition phase until all users have gotten around to upgrading their Wireshark installation. (If someone back in 2012 had fixed Wireshark to handle the version number properly, then no-one would even mention it today. It seems silly to design a protocol around such a short-lived peculiarity of one specific program.)
The length field allows decoders to skip unknown tags. To improve storage performance, a subset of tags for values with a length less or equal to 2 octets can be defined (something like this was proposed in 2012 during the GSMTAPv3 session). Like this:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0| tag | value (padded to 16 bits) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1| tag | length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| value |
~ ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
We already have at least two TLV parsing APIs in osmocom, and having written one of them, I'd like to give this feedback: things become a lot easier when there is no variance in the size of the length field. If the aim of this is to save one byte of payload, I find this quite anachronistic. In 2024, we don't need to optimize for network bandwidth, instead we want to optimize for CPU load. Decoders become far simpler (and easier to optimize) when T and L each have a single fixed size. I guess there is a long history behind this proposal, but the voice in my head says: in 2024, why not pick 16 bits for both T and L from the start and be done with it? Allowing 64k tags of 64k octets without special implications seems more appropriate for the future than still dabbling with designs that have a limit of 256 on a very low protocol level.
For example, PFCP is one of the newer 3GPP protocols, and it features a T16L16V.
Looking at libosmo-gtlv as an example for a TLV coder, it is of course possible to have an osmo_gtlv_cfg.load_tl() function that knows which tag is longer and which is shorter -- it then will decide on a size for every single tag value that is encountered. Just, it seems to me more desirable that iterating tags should be fast, just copying fixed sizes without code branches.
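To illustrate what I mean by fixed sizes, here is a rough sketch of iterating T16L16V entries. It is Python only for brevity and is not the libosmo-gtlv API. The function names, the tag values and the big-endian byte order are my assumptions for the example, nothing here is defined for gsmtapv3.

```python
import struct

def encode_tlv(tag, value):
    """Encode one entry: 16 bit tag, 16 bit length, then the value.
    Big-endian (network order) is an assumption of this sketch."""
    return struct.pack("!HH", tag, len(value)) + value

def iter_tlv(buf):
    """Yield (tag, value) for each T16L16V entry in buf.
    The TL header is always 4 octets, so no per-tag size decision is needed."""
    pos = 0
    while pos < len(buf):
        if len(buf) - pos < 4:
            raise ValueError("truncated TL header")
        tag, length = struct.unpack_from("!HH", buf, pos)
        pos += 4
        if len(buf) - pos < length:
            raise ValueError("truncated value")
        yield tag, buf[pos:pos + length]
        pos += length

# Hypothetical tag values, for illustration only.
msg = (encode_tlv(0x0001, b"\x12\x34")
       + encode_tlv(0x9000, b"vendor blob")
       + encode_tlv(0x0002, b""))
entries = list(iter_tlv(msg))
```

The loop body is the same for every entry, regardless of the tag value: read four octets, then step over `length` octets.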
It is then also easier to skip unknown tags.
In this last variant, we have 32k tags for "small" values and 32k tags for full TLV entries.
Finally, an additional bit can be used to distinguish between official Osmocom tag definitions, and application specific tags, that are valid only within the boundaries of a specific implementation and could be used to embed metadata that has not yet been assigned an official tag.
Slight nuance of definition:
I would recommend not assigning a specific single bit to a specific vendor, because it is too specific. Instead I'd reserve a tag range for gsmtapv3, and leave the rest "application specific" for any vendors to do what they please.
For example, in case of T16, we can reserve 0-0x7fff for future "native" protocol extensions. By defining it this way around, instead of having 0x8000 to 0xffff reserved specifically for Osmocom, we have 0x8000-0xffff available to any vendor. The vendors can, within that non-reserved tag range, agree among themselves on vendor specific sub-ranges (or not).
It's essentially the same as you suggest, but it unloads the protocol definition from having to coordinate specific vendors' tag ranges.
For the record, TLV is excellent for allowing vendor specific extensions, because any parser can trivially skip parts of the message that it doesn't understand, without any prior knowledge.
There is still the possibility that two vendors do insufficient research and happen to use the same tag values for distinct meanings; if both vendors gain traction, then at some point wireshark will be confused between them. If we want to guard against this beyond doubt, there could be an officially defined tag indicating the nature of tag extensions. Like a simple string:
T: TAG_EXTENSION_VENDOR L: 9 V: "Osmocom-1"
For example, an Osmocom vendor could be required to include this TLV, indicating that extension tags should be interpreted as Osmocom version 1. In the absence of such a TLV, any parser can skip unknown tags or treat them as errors.
At some point, "Osmocom-2" could reshuffle the used tag values, etc.
(This tag could itself be an osmocom specific tag, but it would be much more elegant protocol design if there is one common mechanism across all vendors to define usage of the vendor-specific range.)
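A rough sketch of how a decoder could use such a vendor indication, again in Python and under stated assumptions: the numeric value of TAG_EXTENSION_VENDOR, the extension tag 0x8001 and its name are made up, T16L16V in big-endian is assumed, and the sketch expects the vendor TLV to appear before any extension tags.

```python
import struct

TAG_EXTENSION_VENDOR = 0x0001  # hypothetical value, not an assigned tag
EXT_RANGE_START = 0x8000       # 0x0000-0x7fff reserved for native tags

# Hypothetical table of "Osmocom-1" extension tags.
OSMOCOM_1_TAGS = {0x8001: "osmo_debug_note"}

def tlv(tag, value):
    return struct.pack("!HH", tag, len(value)) + value

def decode(buf):
    """Return (vendor, known, skipped). Extension tags are interpreted only
    after a supported vendor TLV was seen; everything else is skipped."""
    vendor, known, skipped = None, {}, []
    pos = 0
    while pos < len(buf):
        if len(buf) - pos < 4:
            raise ValueError("truncated TL header")
        tag, length = struct.unpack_from("!HH", buf, pos)
        pos += 4
        if len(buf) - pos < length:
            raise ValueError("truncated value")
        value = buf[pos:pos + length]
        pos += length
        if tag == TAG_EXTENSION_VENDOR:
            vendor = value.decode("ascii")
        elif (tag >= EXT_RANGE_START and vendor == "Osmocom-1"
              and tag in OSMOCOM_1_TAGS):
            known[OSMOCOM_1_TAGS[tag]] = value
        else:
            skipped.append(tag)  # unknown tag: skip it, no prior knowledge needed
    return vendor, known, skipped

with_vendor = decode(tlv(TAG_EXTENSION_VENDOR, b"Osmocom-1") + tlv(0x8001, b"hi"))
without_vendor = decode(tlv(0x8001, b"hi"))
```

Without the vendor TLV, the same 0x8001 entry ends up in the skipped list instead of being misinterpreted.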
Finally, I have yet another idea for vendor specific extensions: in PFCP, there is a concept of "grouped IEs". In essence, a nested TLV structure inside an outer V.
So we could define a *single* vendor specific extension tag, that can contain within it any number of vendor specific TLVs, like this:
T: TAG_EXTENSION
L: 456
V: {
     octet 0:        vendor_id_len=9
     octet 1..n:     "Osmocom-1"
     octet n+1..456: nested TLV structure
       T: ... L: ... V: { ... }
       T: ... L: ... V: { ... }
   }
In this variant, the "TAG_EXTENSION" is a fixed official tag, and it defines that the start of V defines a vendor id string.
Any parsers that don't know extensions skip the entire vendor-specific subsection in one jump (in above example, skip 456 octets once instead of each vendor tag one by one).
A parser that knows Osmocom extensions will look at the TAG_EXTENSION V, verify that it starts with a supported "Osmocom-1" string, and only then evaluate the TLV within.
(For libosmo-gtlv, this is already supported, because we already implemented this for PFCP parsing.)
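To make the grouped variant concrete, a minimal sketch follows. It is Python for brevity and is not how libosmo-gtlv does it. The TAG_EXTENSION value, the other tag values and the helper names are hypothetical, and T16L16V in big-endian is assumed; only the layout of V (one length octet, the vendor id string, then nested TLVs) follows the description above.

```python
import struct

TAG_EXTENSION = 0x0002  # hypothetical value, not an assigned tag

def tlv(tag, value):
    return struct.pack("!HH", tag, len(value)) + value

def iter_tlv(buf):
    pos = 0
    while pos < len(buf):
        if len(buf) - pos < 4:
            raise ValueError("truncated TL header")
        tag, length = struct.unpack_from("!HH", buf, pos)
        pos += 4
        if len(buf) - pos < length:
            raise ValueError("truncated value")
        yield tag, buf[pos:pos + length]
        pos += length

def build_extension(vendor_id, nested):
    """V = vendor_id_len (1 octet) + vendor id string + nested TLVs."""
    vid = vendor_id.encode("ascii")
    return tlv(TAG_EXTENSION, bytes([len(vid)]) + vid + nested)

def parse_extension(value, supported=("Osmocom-1",)):
    """Return the nested (tag, value) list, or None for an unsupported vendor."""
    if len(value) < 1 or len(value) < 1 + value[0]:
        raise ValueError("truncated vendor id")
    n = value[0]
    if value[1:1 + n].decode("ascii") not in supported:
        return None
    return list(iter_tlv(value[1 + n:]))

nested = tlv(0x0001, b"a") + tlv(0x0002, b"bc")
msg = tlv(0x0010, b"\x01") + build_extension("Osmocom-1", nested) + tlv(0x0011, b"\x02")

top = list(iter_tlv(msg))           # an unaware parser sees three entries
top_tags = [t for t, _ in top]
inner = parse_extension(top[1][1])  # an aware parser descends into the group
```

A parser that does not know TAG_EXTENSION steps over the whole group as one entry, and only a parser that recognizes the vendor id looks at the nested tags.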
Those are my two cents! Just some ideas from the off.
~N