[PATCH] rewritten gsm 7 bit encoding

List overview All Threads
Download

newer

older

[PATCH] * rewrite GSM 7bit default...

[PATCH] * rewrite...

Nico Golde

8 Jul 2010 8 Jul '10

5:56 p.m.

Hi, I've rewritten the GSM 7 bit default encoding as the current implementation results in wrong encoding for certain characters, e.g. '@'. The septet packing lacked the lookup and this only worked for characters that had a 1-1 mapping between their ascii value and the default alphabet table. I introduced a lookup table and now do the character translation before packing the data. Patch is attached.

I guess the decoding needs to be rewritten as well and I may provide a patch for that in the future as well. For now the encoding produced enough headaches for me :)

I adjusted the copyright stanza in these files but I'm not completely sure if this is appropriate or not, so feel free to remove, I won't be offended :)

Cheers Nico

Show replies by date

Holger Freyther

8 Jul 8 Jul

6:25 p.m.

On 07/08/2010 11:56 PM, Nico Golde wrote:

...

I adjusted the copyright stanza in these files but I'm not completely sure if this is appropriate or not, so feel free to remove, I won't be offended :)

The rule of thumb I heard is that anything above four lines of code is copyrightable, so your addition is totally fine!

I do have two nitpicks though. We do not indent the case labels inside a switch statement and the other is more important to me, and might ask you to write a decoder, could you please update the encoding/decoding and add the string that could not be decoded by the current code?

thanks

Nico Golde

8:45 p.m.

Hi, * Holger Freyther zecke@selfish.org [2010-07-08 20:04]:

...

On 07/08/2010 11:56 PM, Nico Golde wrote:

...
I adjusted the copyright stanza in these files but I'm not completely sure if this is appropriate or not, so feel free to remove, I won't be offended :)

The rule of thumb I heard is that anything above four lines of code is copyrightable, so your addition is totally fine!

Ok.

...

I do have two nitpicks though. We do not indent the case labels inside a switch statement

Fixed, attached.

...

and the other is more important to me, and might ask you to write a decoder, could you please update the encoding/decoding and add the string that could not be decoded by the current code?

I felt this was coming :) I didn't write the decoder yet as it was quite time consuming already to go through the encoding process and I personally didn't need the decoding yet. But I agree, of course a decoder that fits the encoder is necessary (even though what could be properly decoded before also can be properly decoded now, no change for this). And I'm also willing to write it as soon as I have more time.

As a string my example was '@' which should be encoded as 00 (1 byte hex) but is encoded as 40 instead by the current code.

Cheers Nico

Sylvain Munaut

10:58 p.m.

...

(even though what could be properly decoded before also can be properly decoded now, no change for this).

Well, I think that before

encode(decode(x)) == x

because the 'errors' matched in encode and decode. Therefore message sent from GSM to GSM actually worked fine and only if you involved the VTY interface then it failed.

But if they don't match anymore, then VTY -> GSM will work but GSM -> GSM will be broken.

Sylvain

Holger Hans Peter Freyther

9 Jul 9 Jul

11:19 a.m.

On 07/09/2010 04:58 AM, Sylvain Munaut wrote:

...

...
(even though what could be properly decoded before also can be properly decoded now, no change for this).

Well, I think that before

encode(decode(x)) == x

Hi all,

if the decoder is too much work right now, you can change the test to only test encoding. So make it encode(x) == expected_result and add one test case for the string that was failing, convert the old decoding to decode(x) == expected_result as well. and you might feel like adding an expected failure (because the code is missing).

The benefit of changing the test is that you can more easily convince me that the new code can do everything the old promised and is fixing something that didn't work with the old one.

Nico Golde

6:56 p.m.

Hi, * Holger Hans Peter Freyther holger@freyther.de [2010-07-09 13:00]:

...

On 07/09/2010 04:58 AM, Sylvain Munaut wrote:

...
...
(even though what could be properly decoded before also can be properly decoded now, no change for this).

Well, I think that before

encode(decode(x)) == x

if the decoder is too much work right now, you can change the test to only test encoding. So make it encode(x) == expected_result and add one test case for the string that was failing, convert the old decoding to decode(x) == expected_result as well. and you might feel like adding an expected failure (because the code is missing).

The benefit of changing the test is that you can more easily convince me that the new code can do everything the old promised and is fixing something that didn't work with the old one.

Alright, cleaned up the previous code a bit, implemented the decoder and added a test case which the previous code encoded incorrect. Please note that just using the new test case string in the old code will also pass the test even though the encoding is wrong. You need to compare the encoded values to e.g. output of pduspy[0]. Patch attached.

Cheers Nico [0] http://www.nobbi.com/download/pduspy.zip

Holger Hans Peter Freyther

19 Jul 19 Jul

9:06 p.m.

On 07/10/2010 12:56 AM, Nico Golde wrote:

...

Alright, cleaned up the previous code a bit, implemented the decoder and added a test case which the previous code encoded incorrect. Please note that just using the new test case string in the old code will also pass the test even though the encoding is wrong. You need to compare the encoded values to e.g. output of pduspy[0]. Patch attached.

Hi Nico,

I have finally applied your patch, thanks for poking me. I have also made two modifications. The first is to separate testing of encoding/decoding and comparing it to a byte array/string, the second is to fix a possible off by one.

I used the current result of the encoding and there seems to be at least one more bug with our encoding routines. We append zero byte(s) at the end that are not touched by the encoding routine but covered by the length.

A nice way to test this is to change the memset(something, 0 to memset(something, 23 and see how it is going to change.

Would you be interested in changing that?

Nico Golde

11:05 p.m.

Hi, * Holger Hans Peter Freyther holger@freyther.de [2010-07-19 22:13]:

...

On 07/10/2010 12:56 AM, Nico Golde wrote:

...
Alright, cleaned up the previous code a bit, implemented the decoder and added a test case which the previous code encoded incorrect. Please note that just using the new test case string in the old code will also pass the test even though the encoding is wrong. You need to compare the encoded values to e.g. output of pduspy[0]. Patch attached.

I have finally applied your patch, thanks for poking me. I have also made two modifications. The first is to separate testing of encoding/decoding and comparing it to a byte array/string, the second is to fix a possible off by one.

Good catch, I already had this fixed locally but was waiting with another patch until you have looked at this one. Will repost an update in case something like this occurs again to prevent this doubled work.

...

I used the current result of the encoding and there seems to be at least one more bug with our encoding routines. We append zero byte(s) at the end that are not touched by the encoding routine but covered by the length.

A nice way to test this is to change the memset(something, 0 to memset(something, 23 and see how it is going to change.

Would you be interested in changing that?

I just looked into this and if I'm not mistaken this is no bug in the encoding but just a bug when comparing the encoded result. The current code memcmp's n bytes where n is the result of gsm_7bit_encode() with the input string bytewise. The problem with this is that the function returns the number of septets not octets. Attached is a patch for the test code that should fix this.

Cheers Nico

Nico Golde

20 Jul 20 Jul

12:32 a.m.

Hi, * Nico Golde openbsc@ngolde.de [2010-07-20 00:29]:

...

Holger Hans Peter Freyther holger@freyther.de [2010-07-19 22:13]:

...
On 07/10/2010 12:56 AM, Nico Golde wrote:

[...]

...

...
I used the current result of the encoding and there seems to be at least one more bug with our encoding routines. We append zero byte(s) at the end that are not touched by the encoding routine but covered by the length.

A nice way to test this is to change the memset(something, 0 to memset(something, 23 and see how it is going to change.

Would you be interested in changing that?

I just looked into this and if I'm not mistaken this is no bug in the encoding but just a bug when comparing the encoded result. The current code memcmp's n bytes where n is the result of gsm_7bit_encode() with the input string bytewise. The problem with this is that the function returns the number of septets not octets. Attached is a patch for the test code that should fix this.

Sorry, forget what I wrote here, it's just wrong :)

Cheers Nico

Holger Hans Peter Freyther

10:02 a.m.

On 07/20/2010 05:05 AM, Nico Golde wrote:

...

I just looked into this and if I'm not mistaken this is no bug in the encoding but just a bug when comparing the encoded result. The current code memcmp's n bytes where n is the result of gsm_7bit_encode() with the input string bytewise. The problem with this is that the function returns the number of septets not octets. Attached is a patch for the test code that should fix this.

Hi Nico, I don't think using strlen is correct. The gsm_7bit_encode method makes no promise that the result will have a null byte at the end, in fact nothing in this method will add a null byte, and the null byte in the output is only present because of memset in the calling method.

I think gsm_7bit_encode should return the length of bytes it has written to, not more. :)

PS: The gsm_7bit_encode usage in gsm_04_08.c is wrong. :)

Nico Golde

1:04 p.m.

Hi, * Holger Hans Peter Freyther holger@freyther.de [2010-07-20 13:03]:

...

On 07/20/2010 05:05 AM, Nico Golde wrote:

...
I just looked into this and if I'm not mistaken this is no bug in the encoding but just a bug when comparing the encoded result. The current code memcmp's n bytes where n is the result of gsm_7bit_encode() with the input string bytewise. The problem with this is that the function returns the number of septets not octets. Attached is a patch for the test code that should fix this.

I don't think using strlen is correct. The gsm_7bit_encode method makes no promise that the result will have a null byte at the end, in fact nothing in this method will add a null byte, and the null byte in the output is only present because of memset in the calling method.

Yeah, realized this after writing, hence the second reply.

...

I think gsm_7bit_encode should return the length of bytes it has written to, not more. :)

Yes and I think this is part of the problem. Currently the function returns i instead of z while z is the counter for the bytes actually written and i the number of encoded septets. I will probably look into this at the weekend.

Cheers Nico

Nico Golde

3:49 p.m.

Hi, * Nico Golde openbsc@ngolde.de [2010-07-20 14:10]:

...

Holger Hans Peter Freyther holger@freyther.de [2010-07-20 13:03]:

[...]

...

...
On 07/20/2010 05:05 AM, Nico Golde wrote: I think gsm_7bit_encode should return the length of bytes it has written to, not more. :)

Yes and I think this is part of the problem. Currently the function returns i instead of z while z is the counter for the bytes actually written and i the number of encoded septets. I will probably look into this at the weekend.

Ok had some time now. Please check the attached patch.

Cheers Nico

Holger Hans Peter Freyther

3:53 p.m.

On 07/20/2010 09:49 PM, Nico Golde wrote:

...

Ok had some time now. Please check the attached patch.

looks fine, I will apply it by tonight.

Holger Hans Peter Freyther

9:15 p.m.

On 07/20/2010 09:53 PM, Holger Hans Peter Freyther wrote:

...

On 07/20/2010 09:49 PM, Nico Golde wrote:

...
Ok had some time now. Please check the attached patch.

looks fine, I will apply it by tonight.

I have pushed the patch, two small issues were left, please verify that I have done this correctly:

1.) y was unused in decode, I removed it 2.) rtext should be as big as septext_l (i made it one bigger for a null byte but that is not needed).

Nico Golde

10:32 p.m.

Hi, * Holger Hans Peter Freyther holger@freyther.de [2010-07-20 22:08]:

...

On 07/20/2010 09:53 PM, Holger Hans Peter Freyther wrote:

...
On 07/20/2010 09:49 PM, Nico Golde wrote:

...
Ok had some time now. Please check the attached patch.

looks fine, I will apply it by tonight.

I have pushed the patch, two small issues were left, please verify that I have done this correctly:

1.) y was unused in decode, I removed it

Meh I thought I already removed this :)

...

2.) rtext should be as big as septext_l (i made it one bigger for a null byte but that is not needed).

Correct.

Cheers Nico

Harald Welte

9 Jul 9 Jul

1:20 p.m.

On Thu, Jul 08, 2010 at 10:58:42PM +0200, Sylvain Munaut wrote:

...

...
(even though what could be properly decoded before also can be properly decoded now, no change for this).

Well, I think that before

encode(decode(x)) == x

because the 'errors' matched in encode and decode. Therefore message sent from GSM to GSM actually worked fine and only if you involved the VTY interface then it failed.

ACK.

...

But if they don't match anymore, then VTY -> GSM will work but GSM -> GSM will be broken.

The idea of storing a plaintext version and the binary-encoded TPDU inside the SQL database was exactly to solve this. the plaintext/UTF8 version is intended for human readers, but if we have the BLOB from a MO SMS (or an external application writing to the database), then that very same unmodified BLOB should be sent to the other MS as part of the MT SMS.

So whatever modifications we make, we should keep in mind that the transparent MO-SMS to MT-SMS path remains unharmed.

-- - Harald Welte laforge@gnumonks.org http://laforge.gnumonks.org/ ============================================================================ "Privacy in residential applications is a desirable marketing option." (ETSI EN 300 175-7 Ch. A6)

5621

Age (days ago)

5622

Last active (days ago)

openbsc@lists.osmocom.org

15 comments

6 participants

tags (0)

participants (6)

Harald Welte
Holger Freyther
Holger Hans Peter Freyther
Nico Golde
Nico Golde
Sylvain Munaut