Hello Mauro,
Thanks for this message. I have been preparing my own suggestion at
https://gist.github.com/peremen/f8f199bf89e8f4c3e0b5e3f9e141b275#file-gsmtapv3-md
which I had intended to migrate to Osmocom's Gitea some time ago.

> It seems that the only way to
> define a new GSMTAP version that does not break Wireshark decoder is to
> preserve the initial 16 header octets.

That was still an open question in my suggestion as well, but if older
Wireshark versions continue to parse the full 16 octets of the GSMTAP header
even when the type is unknown, then I am also leaning towards this idea. I was
thinking of allocating a "protective type" in GSMTAPv2 so that GSMTAPv2-only
implementations do not try to parse GSMTAPv3 packets.

> This new header section could contain
> a sequence of TLV entries with a 2 octet tag, a 2 octet length field, and a
> value field padded to the nearest 32 bit boundary.

What I had in mind is defining a new data structure for each payload type and
optionally extending it in the future using TLV entries. See my proposal for
the type-specific data structures.

The compact storage format covering both TV and TLV entries also sounds good.
However, given that 1) existing GSM applications can still use GSMTAPv2
(correct me if I am wrong about GSMTAPv2 and v3 coexisting) and 2) 4G and 5G
RRC/NAS payloads can be longer than 1000 octets, where 2 additional length
octets hardly matter, using TLV exclusively might not be much of a problem
for performance.

> Finally, an additional bit can be used to distinguish between official
> Osmocom tag definitions, and application specific tags, that are valid only
> within the boundaries of a specific implementation and could be used to
> embed metadata that has not yet been assigned an official tag.

I was thinking of something similar: reserving a range of types for this
purpose. Tags should be covered as well.

In addition, I am going to present a talk at this year's OsmoDevCon [1], and I
am currently planning to finalize the proposal before the conference.

[1] https://osmocom.org/projects/osmo-dev-con/wiki/OsmoDevCon2024

Shinjo

On Tuesday, 27 February 2024 at 3:38:11 PM CET, Mauro Levra wrote:

Hi,
Since no one has replied to my previous very generic message, I come forward
with a proposal hoping to trigger a discussion.
As you know, the current definition of the GSMTAP header has a version
field that could enable new versions to be safely introduced. Unfortunately,
the Wireshark decoder does not check the value of the version field; it just
presents the value in the decoded output. It seems that the only way to
define a new GSMTAP version that does not break the Wireshark decoder is to
preserve the initial 16 header octets.
This is the current gsmtap_hdr definition. Some fields are only relevant
in the context of a GSM capture, while the meaning of others (notably type,
sub_type) has changed over time to describe a payload of a newer technology.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    version    |    hdr_len    |     type      |   timeslot    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|             arfcn             |  signal_dbm   |    snr_db     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         frame_number                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   sub_type    |  antenna_nr   |   sub_slot    |   reserved    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~                                                               ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            Payload                            |
~                                                               ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

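
To make the layout concrete, here is a minimal Python sketch that unpacks the
16 fixed octets and uses hdr_len to find where the payload starts. It assumes
network byte order and signed 8-bit signal_dbm and snr_db fields; the function
name parse_gsmtap_header and the returned dictionary are only illustrative,
not part of any existing library.

```python
import struct

# Fixed 16-octet part of the header, network byte order (assumed).
# B = unsigned 8 bit, H = unsigned 16 bit, b = signed 8 bit, I = unsigned 32 bit.
GSMTAP_FIXED = struct.Struct("!BBBBHbbIBBBB")
FIELD_NAMES = (
    "version", "hdr_len", "type", "timeslot",
    "arfcn", "signal_dbm", "snr_db",
    "frame_number",
    "sub_type", "antenna_nr", "sub_slot", "reserved",
)


def parse_gsmtap_header(packet: bytes) -> dict:
    """Split a packet into fixed header fields, extension bytes and payload."""
    if len(packet) < GSMTAP_FIXED.size:
        raise ValueError("packet shorter than the 16-octet fixed header")
    hdr = dict(zip(FIELD_NAMES, GSMTAP_FIXED.unpack_from(packet)))
    # hdr_len counts 32-bit words, so the header spans hdr_len * 4 octets.
    hdr_octets = hdr["hdr_len"] * 4
    if hdr_octets < GSMTAP_FIXED.size or hdr_octets > len(packet):
        raise ValueError("hdr_len is inconsistent with the packet size")
    # Anything between octet 16 and the end of the header is extension data.
    hdr["extension"] = packet[GSMTAP_FIXED.size:hdr_octets]
    hdr["payload"] = packet[hdr_octets:]
    return hdr
```

A decoder written this way already skips any octets beyond the first 16 as
long as it honours hdr_len, which is the property the extension relies on.
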
When hdr_len is greater than 4 (hdr_len counts 32-bit words, so 4 means
4 * 4 = 16 octets), it is
possible to store additional information in the header, without breaking
any compliant decoder implementation. This new header section could contain
a sequence of TLV entries with a 2 octet tag, a 2 octet length field, and a
value field padded to the nearest 32 bit boundary.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|              tag              |            length             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             value                             |
~                                                               ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

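
A rough Python sketch of this TLV encoding follows. Two details are
assumptions, since the proposal does not pin them down: the length field
counts the value octets without padding, and padding octets are zero. The
names encode_tlv and decode_tlvs are hypothetical.

```python
import struct


def encode_tlv(tag: int, value: bytes) -> bytes:
    """Encode one entry: 2-octet tag, 2-octet length, value padded to 32 bits."""
    if not 0 <= tag <= 0xFFFF:
        raise ValueError("tag does not fit in 2 octets")
    if len(value) > 0xFFFF:
        raise ValueError("value too long for a 2-octet length field")
    padding = -len(value) % 4  # octets needed to reach a 32-bit boundary
    # Assumption: length holds the unpadded size, padding octets are zero.
    return struct.pack("!HH", tag, len(value)) + value + b"\x00" * padding


def decode_tlvs(data: bytes) -> list:
    """Decode a sequence of entries into a list of (tag, value) tuples."""
    entries = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < 4:
            raise ValueError("truncated tag/length header")
        tag, length = struct.unpack_from("!HH", data, offset)
        offset += 4
        padded = length + (-length % 4)
        if offset + padded > len(data):
            raise ValueError("value runs past the end of the header")
        entries.append((tag, data[offset:offset + length]))
        offset += padded  # skip the value and its padding
    return entries
```

Because the decoder advances by the padded length, it never needs to know
what a tag means in order to step over it.
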
The length field allows decoders to skip unknown tags. To improve storage
efficiency, a subset of tags can be defined for values with a length of at
most 2 octets (something like this was proposed in 2012 during the GSMTAPv3
session). Like this:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0|             tag             |   value (padded to 16 bits)   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1|             tag             |            length             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             value                             |
~                                                               ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

In this last variant, we have 32k tags for "small" values and 32k tags for
full TLV entries.
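
A possible decoder for this mixed variant is sketched below in Python. It
assumes the most significant bit of the first 16 bits is the TV/TLV flag, that
the remaining 15 bits are the tag, and (as a further assumption) that the
length of a full TLV entry counts unpadded value octets. All function names
are illustrative only.

```python
import struct

TLV_FLAG = 0x8000  # top bit set: full TLV entry; clear: 16-bit "small" value
TAG_MASK = 0x7FFF  # 15-bit tag space, i.e. 32768 tags per kind


def encode_small(tag: int, value: int) -> bytes:
    """Encode a "small" entry: flag 0, 15-bit tag, 16-bit value."""
    if not 0 <= tag <= TAG_MASK or not 0 <= value <= 0xFFFF:
        raise ValueError("tag or value out of range")
    return struct.pack("!HH", tag, value)


def encode_full(tag: int, value: bytes) -> bytes:
    """Encode a full entry: flag 1, 15-bit tag, length, padded value."""
    if not 0 <= tag <= TAG_MASK or len(value) > 0xFFFF:
        raise ValueError("tag or value out of range")
    padding = -len(value) % 4
    return struct.pack("!HH", TLV_FLAG | tag, len(value)) + value + b"\x00" * padding


def decode_entries(data: bytes) -> list:
    """Return (kind, tag, value): value is an int for "tv", bytes for "tlv"."""
    entries = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < 4:
            raise ValueError("truncated entry")
        first, second = struct.unpack_from("!HH", data, offset)
        offset += 4
        tag = first & TAG_MASK
        if first & TLV_FLAG:
            padded = second + (-second % 4)
            if offset + padded > len(data):
                raise ValueError("value runs past the end of the header")
            entries.append(("tlv", tag, data[offset:offset + second]))
            offset += padded
        else:
            # The second 16 bits carry the value itself, no length needed.
            entries.append(("tv", tag, second))
    return entries
```

A small entry always costs exactly 4 octets, while a full entry costs 4 octets
plus the padded value, so the flag bit saves one 32-bit word for every value
that fits in 16 bits.
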
Finally, an additional bit can be used to distinguish between official
Osmocom tag definitions, and application specific tags, that are valid only
within the boundaries of a specific implementation and could be used to
embed metadata that has not yet been assigned an official tag.
What are your thoughts about this?
Mauro