Hi, I've rewritten the GSM 7 bit default encoding as the current implementation results in wrong encoding for certain characters, e.g. '@'. The septet packing lacked the lookup and this only worked for characters that had a 1-1 mapping between their ascii value and the default alphabet table. I introduced a lookup table and now do the character translation before packing the data. Patch is attached.
I guess the decoding needs to be rewritten as well and I may provide a patch for that in the future as well. For now the encoding produced enough headaches for me :)
I adjusted the copyright stanza in these files but I'm not completely sure if this is appropriate or not, so feel free to remove, I won't be offended :)
Cheers Nico
On 07/08/2010 11:56 PM, Nico Golde wrote:
I adjusted the copyright stanza in these files but I'm not completely sure if this is appropriate or not, so feel free to remove, I won't be offended :)
The rule of thumb I heard is that anything above four lines of code is copyrightable, so your addition is totally fine!
I do have two nitpicks though. We do not indent the case labels inside a switch statement and the other is more important to me, and might ask you to write a decoder, could you please update the encoding/decoding and add the string that could not be decoded by the current code?
thanks
Hi, * Holger Freyther zecke@selfish.org [2010-07-08 20:04]:
On 07/08/2010 11:56 PM, Nico Golde wrote:
I adjusted the copyright stanza in these files but I'm not completely sure if this is appropriate or not, so feel free to remove, I won't be offended :)
The rule of thumb I heard is that anything above four lines of code is copyrightable, so your addition is totally fine!
Ok.
I do have two nitpicks though. We do not indent the case labels inside a switch statement
Fixed, attached.
and the other is more important to me, and might ask you to write a decoder, could you please update the encoding/decoding and add the string that could not be decoded by the current code?
I felt this was coming :) I didn't write the decoder yet as it was quite time consuming already to go through the encoding process and I personally didn't need the decoding yet. But I agree, of course a decoder that fits the encoder is necessary (even though what could be properly decoded before also can be properly decoded now, no change for this). And I'm also willing to write it as soon as I have more time.
As a string my example was '@' which should be encoded as 00 (1 byte hex) but is encoded as 40 instead by the current code.
Cheers Nico
(even though what could be properly decoded before also can be properly decoded now, no change for this).
Well, I think that before
encode(decode(x)) == x
because the 'errors' matched in encode and decode. Therefore message sent from GSM to GSM actually worked fine and only if you involved the VTY interface then it failed.
But if they don't match anymore, then VTY -> GSM will work but GSM -> GSM will be broken.
Sylvain
On 07/09/2010 04:58 AM, Sylvain Munaut wrote:
(even though what could be properly decoded before also can be properly decoded now, no change for this).
Well, I think that before
encode(decode(x)) == x
Hi all,
if the decoder is too much work right now, you can change the test to only test encoding. So make it encode(x) == expected_result and add one test case for the string that was failing, convert the old decoding to decode(x) == expected_result as well. and you might feel like adding an expected failure (because the code is missing).
The benefit of changing the test is that you can more easily convince me that the new code can do everything the old promised and is fixing something that didn't work with the old one.
Hi, * Holger Hans Peter Freyther holger@freyther.de [2010-07-09 13:00]:
On 07/09/2010 04:58 AM, Sylvain Munaut wrote:
(even though what could be properly decoded before also can be properly decoded now, no change for this).
Well, I think that before
encode(decode(x)) == x
if the decoder is too much work right now, you can change the test to only test encoding. So make it encode(x) == expected_result and add one test case for the string that was failing, convert the old decoding to decode(x) == expected_result as well. and you might feel like adding an expected failure (because the code is missing).
The benefit of changing the test is that you can more easily convince me that the new code can do everything the old promised and is fixing something that didn't work with the old one.
Alright, cleaned up the previous code a bit, implemented the decoder and added a test case which the previous code encoded incorrect. Please note that just using the new test case string in the old code will also pass the test even though the encoding is wrong. You need to compare the encoded values to e.g. output of pduspy[0]. Patch attached.
Cheers Nico [0] http://www.nobbi.com/download/pduspy.zip
On 07/10/2010 12:56 AM, Nico Golde wrote:
Alright, cleaned up the previous code a bit, implemented the decoder and added a test case which the previous code encoded incorrect. Please note that just using the new test case string in the old code will also pass the test even though the encoding is wrong. You need to compare the encoded values to e.g. output of pduspy[0]. Patch attached.
Hi Nico,
I have finally applied your patch, thanks for poking me. I have also made two modifications. The first is to separate testing of encoding/decoding and comparing it to a byte array/string, the second is to fix a possible off by one.
I used the current result of the encoding and there seems to be at least one more bug with our encoding routines. We append zero byte(s) at the end that are not touched by the encoding routine but covered by the length.
A nice way to test this is to change the memset(something, 0 to memset(something, 23 and see how it is going to change.
Would you be interested in changing that?
Hi, * Holger Hans Peter Freyther holger@freyther.de [2010-07-19 22:13]:
On 07/10/2010 12:56 AM, Nico Golde wrote:
Alright, cleaned up the previous code a bit, implemented the decoder and added a test case which the previous code encoded incorrect. Please note that just using the new test case string in the old code will also pass the test even though the encoding is wrong. You need to compare the encoded values to e.g. output of pduspy[0]. Patch attached.
I have finally applied your patch, thanks for poking me. I have also made two modifications. The first is to separate testing of encoding/decoding and comparing it to a byte array/string, the second is to fix a possible off by one.
Good catch, I already had this fixed locally but was waiting with another patch until you have looked at this one. Will repost an update in case something like this occurs again to prevent this doubled work.
I used the current result of the encoding and there seems to be at least one more bug with our encoding routines. We append zero byte(s) at the end that are not touched by the encoding routine but covered by the length.
A nice way to test this is to change the memset(something, 0 to memset(something, 23 and see how it is going to change.
Would you be interested in changing that?
I just looked into this and if I'm not mistaken this is no bug in the encoding but just a bug when comparing the encoded result. The current code memcmp's n bytes where n is the result of gsm_7bit_encode() with the input string bytewise. The problem with this is that the function returns the number of septets not octets. Attached is a patch for the test code that should fix this.
Cheers Nico
Hi, * Nico Golde openbsc@ngolde.de [2010-07-20 00:29]:
- Holger Hans Peter Freyther holger@freyther.de [2010-07-19 22:13]:
On 07/10/2010 12:56 AM, Nico Golde wrote:
[...]
I used the current result of the encoding and there seems to be at least one more bug with our encoding routines. We append zero byte(s) at the end that are not touched by the encoding routine but covered by the length.
A nice way to test this is to change the memset(something, 0 to memset(something, 23 and see how it is going to change.
Would you be interested in changing that?
I just looked into this and if I'm not mistaken this is no bug in the encoding but just a bug when comparing the encoded result. The current code memcmp's n bytes where n is the result of gsm_7bit_encode() with the input string bytewise. The problem with this is that the function returns the number of septets not octets. Attached is a patch for the test code that should fix this.
Sorry, forget what I wrote here, it's just wrong :)
Cheers Nico
On 07/20/2010 05:05 AM, Nico Golde wrote:
I just looked into this and if I'm not mistaken this is no bug in the encoding but just a bug when comparing the encoded result. The current code memcmp's n bytes where n is the result of gsm_7bit_encode() with the input string bytewise. The problem with this is that the function returns the number of septets not octets. Attached is a patch for the test code that should fix this.
Hi Nico, I don't think using strlen is correct. The gsm_7bit_encode method makes no promise that the result will have a null byte at the end, in fact nothing in this method will add a null byte, and the null byte in the output is only present because of memset in the calling method.
I think gsm_7bit_encode should return the length of bytes it has written to, not more. :)
PS: The gsm_7bit_encode usage in gsm_04_08.c is wrong. :)
Hi, * Holger Hans Peter Freyther holger@freyther.de [2010-07-20 13:03]:
On 07/20/2010 05:05 AM, Nico Golde wrote:
I just looked into this and if I'm not mistaken this is no bug in the encoding but just a bug when comparing the encoded result. The current code memcmp's n bytes where n is the result of gsm_7bit_encode() with the input string bytewise. The problem with this is that the function returns the number of septets not octets. Attached is a patch for the test code that should fix this.
I don't think using strlen is correct. The gsm_7bit_encode method makes no promise that the result will have a null byte at the end, in fact nothing in this method will add a null byte, and the null byte in the output is only present because of memset in the calling method.
Yeah, realized this after writing, hence the second reply.
I think gsm_7bit_encode should return the length of bytes it has written to, not more. :)
Yes and I think this is part of the problem. Currently the function returns i instead of z while z is the counter for the bytes actually written and i the number of encoded septets. I will probably look into this at the weekend.
Cheers Nico
Hi, * Nico Golde openbsc@ngolde.de [2010-07-20 14:10]:
- Holger Hans Peter Freyther holger@freyther.de [2010-07-20 13:03]:
[...]
On 07/20/2010 05:05 AM, Nico Golde wrote: I think gsm_7bit_encode should return the length of bytes it has written to, not more. :)
Yes and I think this is part of the problem. Currently the function returns i instead of z while z is the counter for the bytes actually written and i the number of encoded septets. I will probably look into this at the weekend.
Ok had some time now. Please check the attached patch.
Cheers Nico
On 07/20/2010 09:53 PM, Holger Hans Peter Freyther wrote:
On 07/20/2010 09:49 PM, Nico Golde wrote:
Ok had some time now. Please check the attached patch.
looks fine, I will apply it by tonight.
I have pushed the patch, two small issues were left, please verify that I have done this correctly:
1.) y was unused in decode, I removed it 2.) rtext should be as big as septext_l (i made it one bigger for a null byte but that is not needed).
z.
Hi, * Holger Hans Peter Freyther holger@freyther.de [2010-07-20 22:08]:
On 07/20/2010 09:53 PM, Holger Hans Peter Freyther wrote:
On 07/20/2010 09:49 PM, Nico Golde wrote:
Ok had some time now. Please check the attached patch.
looks fine, I will apply it by tonight.
I have pushed the patch, two small issues were left, please verify that I have done this correctly:
1.) y was unused in decode, I removed it
Meh I thought I already removed this :)
2.) rtext should be as big as septext_l (i made it one bigger for a null byte but that is not needed).
Correct.
Cheers Nico
On Thu, Jul 08, 2010 at 10:58:42PM +0200, Sylvain Munaut wrote:
(even though what could be properly decoded before also can be properly decoded now, no change for this).
Well, I think that before
encode(decode(x)) == x
because the 'errors' matched in encode and decode. Therefore message sent from GSM to GSM actually worked fine and only if you involved the VTY interface then it failed.
ACK.
But if they don't match anymore, then VTY -> GSM will work but GSM -> GSM will be broken.
The idea of storing a plaintext version and the binary-encoded TPDU inside the SQL database was exactly to solve this. the plaintext/UTF8 version is intended for human readers, but if we have the BLOB from a MO SMS (or an external application writing to the database), then that very same unmodified BLOB should be sent to the other MS as part of the MT SMS.
So whatever modifications we make, we should keep in mind that the transparent MO-SMS to MT-SMS path remains unharmed.