Dear Pablo,
getaddrinfo does not work for the combination of AF_INET, SOCK_RAW and IPPROTO_GRE. I have attached an example application that can be compiled with:
$ gcc -o fr fr.c `pkg-config --cflags --libs libosmocore libosmogb`
this prints: getaddrinfo returned NULL: Success FAILED
getaddrinfo returns -8, which should be this: # define EAI_SERVICE -8 /* SERVICE not supported for `ai_socktype'. */
I am not sure what the cleverest way to resolve this is. Should we make SOCK_RAW branch out early and do the socket/bind(/listen) manually, call getaddrinfo twice with some more unspecific options, or just deal with SOCK_RAW differently for now? The attached code has the benefit of at least handling INET and INET6 inside the getaddrinfo result.
any ideas? holger
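Editor's sketch: the "Success" in the output above is most likely errno being printed, while getaddrinfo reports failures through its return value, which gai_strerror() turns into text (errno is only meaningful for EAI_SYSTEM). A minimal reproduction along those lines could look like the following. It is not the attached fr.c, which is not part of this archive; the helper names resolve_rc and report, the loopback address and the service string "0" are assumptions made for illustration.

```c
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

#ifndef IPPROTO_GRE
#define IPPROTO_GRE 47 /* IANA protocol number for GRE */
#endif

/* Hypothetical helper: run getaddrinfo() with the given hints and hand back
 * its own return code (0 or an EAI_* value), not errno. */
int resolve_rc(int family, int type, int proto,
	       const char *host, const char *service)
{
	struct addrinfo hints, *res = NULL;
	int rc;

	/* getaddrinfo() needs at least one of host and service */
	assert(host != NULL || service != NULL);

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = family;
	hints.ai_socktype = type;
	hints.ai_protocol = proto;
	hints.ai_flags = AI_NUMERICHOST; /* assumption: no DNS lookup wanted */

	rc = getaddrinfo(host, service, &hints, &res);
	if (rc == 0)
		freeaddrinfo(res);
	return rc;
}

/* Hypothetical helper: decode the return code with gai_strerror()
 * instead of strerror(errno). */
void report(const char *label, int rc)
{
	if (rc == 0)
		printf("%s: ok\n", label);
	else
		printf("%s: %s (%d)\n", label, gai_strerror(rc), rc);
}
```

Calling report("raw/GRE", resolve_rc(AF_INET, SOCK_RAW, IPPROTO_GRE, "127.0.0.1", "0")) on an affected glibc should then show the EAI_SERVICE text rather than "Success"; the same call with SOCK_DGRAM and IPPROTO_UDP is expected to succeed.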
On Wed, Nov 07, 2012 at 03:21:14PM +0100, Holger Hans Peter Freyther wrote:
Hi,
I added the frame for a unit test to zecke/fr-gre-test. The reasoning is quite simple: the frame-relay code is seldom used during development and I really don't want to look stupid by hitting this error again.. :)
cheers holger
Hi Holger!
On Wed, Nov 07, 2012 at 03:21:14PM +0100, Holger Hans Peter Freyther wrote:
Dear Pablo,
getaddrinfo does not work for the combination of AF_INET, SOCK_RAW and IPPROTO_GRE. I have attached an example application that can be compiled with:
$ gcc -o fr fr.c `pkg-config --cflags --libs libosmocore libosmogb`
this prints: getaddrinfo returned NULL: Success FAILED
$ strace ./fr
[...]
socket(PF_NETLINK, SOCK_RAW, 0) = 3
bind(3, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 0
getsockname(3, {sa_family=AF_NETLINK, pid=5936, groups=00000000}, [12]) = 0
sendto(3, "\24\0\0\0\26\0\1\3+\230\256P\0\0\0\0\0\0\0\0", 20, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 20
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"0\0\0\0\24\0\2\0+\230\256P0\27\0\0\2\10\200\376\1\0\0\0\10\0\1\0\177\0\0\1"..., 4096}], msg_controllen=0, msg_flags=0}, 0) = 108
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"@\0\0\0\24\0\2\0+\230\256P0\27\0\0\n\200\200\376\1\0\0\0\24\0\1\0\0\0\0\0"..., 4096}], msg_controllen=0, msg_flags=0}, 0) = 128
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"\24\0\0\0\3\0\2\0+\230\256P0\27\0\0\0\0\0\0\1\0\0\0\24\0\1\0\0\0\0\0"..., 4096}], msg_controllen=0, msg_flags=0}, 0) = 20
The interesting thing is that the function calls rtnetlink to obtain address information from the kernel, and it seems it gets it right.
Let me decipher that netlink trace:
1) Bind this socket to rtnetlink:
bind(3, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 0
2) Send the request message to ask for the address information.
sendto(3, "\24\0\0\0\26\0\1\3+\230\256P\0\0\0\0\0\0\0\0", 20, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 20
3) We get the multipart message with the information that we requested:
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"0\0\0\0\24\0\2\0+\230\256P0\27\0\0\2\10\200\376\1\0\0\0\10\0\1\0\177\0\0\1"..., 4096}], msg_controllen=0, msg_flags=0}, 0) = 108
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"@\0\0\0\24\0\2\0+\230\256P0\27\0\0\n\200\200\376\1\0\0\0\24\0\1\0\0\0\0\0"..., 4096}], msg_controllen=0, msg_flags=0}, 0) = 128
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000},
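Editor's sketch: the three steps above can be checked by decoding the netlink headers by hand. The byte strings below are copied from the strace output (the first 16 or 20 bytes of each buffer) and are read as little-endian, which is an assumption about the traced machine. The names nl_hdr_view, decode_nl_hdr and the trace_* arrays are made up for this illustration; the constants come from the Linux UAPI headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <linux/netlink.h>   /* NLM_F_*, NLMSG_DONE */
#include <linux/rtnetlink.h> /* RTM_GETADDR, RTM_NEWADDR */

/* Hand-decoded copy of the struct nlmsghdr fields. */
struct nl_hdr_view {
	uint32_t len;   /* total message length, header included */
	uint16_t type;  /* RTM_* or NLMSG_* */
	uint16_t flags; /* NLM_F_* */
	uint32_t seq;
	uint32_t pid;
};

static uint16_t le16(const unsigned char *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the 16-byte netlink header at the start of buf. */
struct nl_hdr_view decode_nl_hdr(const unsigned char *buf)
{
	struct nl_hdr_view h;

	assert(buf != NULL);
	h.len = le32(buf);
	h.type = le16(buf + 4);
	h.flags = le16(buf + 6);
	h.seq = le32(buf + 8);
	h.pid = le32(buf + 12);
	return h;
}

/* Step 2: the 20-byte request passed to sendto() */
const unsigned char trace_request[] =
	"\24\0\0\0\26\0\1\3+\230\256P\0\0\0\0\0\0\0\0";
/* Step 3: header of the first message in the first recvmsg() */
const unsigned char trace_reply[] =
	"0\0\0\0\24\0\2\0+\230\256P0\27\0\0";
/* Header of the message in the last recvmsg() of the full trace */
const unsigned char trace_done[] =
	"\24\0\0\0\3\0\2\0+\230\256P0\27\0\0";
```

Decoded this way, the request is an RTM_GETADDR dump (NLM_F_REQUEST|NLM_F_DUMP), the first reply is a multipart RTM_NEWADDR addressed to pid 5936 (the value getsockname reported), and the last buffer starts with NLMSG_DONE.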
The problem seems to be in glibc, at sysdeps/posix/getaddrinfo.c, line 125:
{ SOCK_RAW, 0, GAI_PROTO_PROTOANY|GAI_PROTO_NOSERVICE, true, "raw" }
The GAI_PROTO_NOSERVICE flag is set. While iterating over the list of addresses that it has obtained from the kernel to build the addrinfo object, it seems to skip the raw protocol and return EAI_SERVICE.
I'd need to investigate further why they are doing it like that.
getaddrinfo returns -8, which should be this: # define EAI_SERVICE -8 /* SERVICE not supported for `ai_socktype'. */
I am not sure what the cleverest way to resolve this is. Should we make SOCK_RAW branch out early and do the socket/bind(/listen) manually, call getaddrinfo twice with some more unspecific options, or just deal with SOCK_RAW differently for now? The attached code has the benefit of at least handling INET and INET6 inside the getaddrinfo result.
any ideas?
The quick thing would be to work around it in libosmocore. I'll try to investigate this issue a bit more but it may take me a while.
Hope it helps.
On Thu, Nov 22, 2012 at 10:36:34PM +0100, Pablo Neira Ayuso wrote:
The GAI_PROTO_NOSERVICE flag is set. While iterating over the list of addresses that it has obtained from the kernel to build the addrinfo object, it seems to skip the raw protocol and return EAI_SERVICE.
To be more precise, it returns EAI_SERVICE before even iterating over the list of addresses.
This behaviour seems a bit inconsistent to me, i.e. request data from the kernel, get it all right and then discard it :-)
On Thu, Nov 22, 2012 at 10:36:34PM +0100, Pablo Neira Ayuso wrote:
Hi Holger!
Hey!
The GAI_PROTO_NOSERVICE flag is set. While iterating over the list of addresses that it has obtained from the kernel to build the addrinfo object, it seems to skip the raw protocol and return EAI_SERVICE.
I'd need to investigate further why they are doing it like that.
one way or another filing a bug report in the glibc bugzilla might be a good idea.
The quick thing would be to work around it in libosmocore. I'll try to investigate this issue a bit more but it may take me a while.
we will need a workaround as the LTS releases of CentOS/RHEL/Debian/Ubuntu are unlikely to receive the patch. I am currently using this[1] patch but I think we can make it less ugly (e.g. try to call getaddrinfo twice, and if ai_socktype is unspecified use the caller-provided one).
holger
[1] https://build.opensuse.org/package/view_file?expand=1&file=raw-socket.pa...
On Fri, Nov 23, 2012 at 11:06:56AM +0100, Holger Hans Peter Freyther wrote:
Hi Pablo,
this is still an issue for running the SGSN/GBproxy. Do you have an update regarding the proper fix? I am going to land the workaround soon but I am interested in getting the real fix into glibc. Can you help?
holger
On Tue, Jan 08, 2013 at 07:40:46PM +0100, Holger Hans Peter Freyther wrote:
Hi Pablo,
this is still an issue for running the SGSN/GBproxy. Do you have an update regarding the proper fix? I am going to land the workaround soon but I am interested in getting the real fix into glibc. Can you help?
Just filed the bug to glibc, you can track it here:
http://sourceware.org/bugzilla/show_bug.cgi?id=15015
Sorry for lagging with this.
On Sun, Jan 13, 2013 at 12:30:33AM +0100, Pablo Neira Ayuso wrote:
Just filed the bug to glibc, you can track it here:
http://sourceware.org/bugzilla/show_bug.cgi?id=15015
Sorry for lagging with this.
thanks! Any idea for a non-ugly workaround? Even if it is a glibc bug and it will be fixed, RHEL and Debian 6.0 are unlikely to patch it.
holger
Holger Hans Peter Freyther wrote:
thanks! Any idea for a non-ugly workaround? Even if it is a glibc bug and it will be fixed, RHEL and Debian 6.0 are unlikely to patch it.
I would fall back to the same solution as for the other things that RHEL, Debian & co don't do the way I prefer: Build a replacement.
But at least for Ubuntu I would expect them to be happy to apply a patch on top of their glibc package to fix this, as long as there is a Launchpad bug demonstrating some concrete need - ideally backed by some paying Canonical customers. :)
I believe you can get any change you want into Red Hat as long as you have support and sponsorship from one of their employees. For minimum effort make friends with their glibc package maintainer.
I have been trying unsuccessfully to communicate with the Debian glibc package maintainer for some 8 months now and he hasn't replied, so I think you are right that Debian will "never" add a patch to fix this, and even if they do I guess it will take some five years before it reaches users.
//Peter
On Sun, Jan 13, 2013 at 01:24:13AM +0100, Holger Hans Peter Freyther wrote:
On Sun, Jan 13, 2013 at 12:30:33AM +0100, Pablo Neira Ayuso wrote:
Just filed the bug to glibc, you can track it here:
http://sourceware.org/bugzilla/show_bug.cgi?id=15015
Sorry for lagging with this.
thanks! Any idea for a non-ugly workaround? Even if it is a glibc bug and it will be fixed, RHEL and Debian 6.0 are unlikely to patch it.
Attached a workaround patch. It's a bit of cheating, but it allows us to obtain the address information from the kernel, which is what you need. As Peter mentioned, let's find a fast path to resolve this until some fix lands in glibc.
Let me know!
Regards.
On Mon, Jan 14, 2013 at 12:25:37AM +0100, Pablo Neira Ayuso wrote:
Patch was incomplete, sorry. New version attached.
On Mon, Jan 14, 2013 at 12:25:37AM +0100, Pablo Neira Ayuso wrote:
Hi,
this appears to be only half of the fix.
-	hints.ai_socktype = type;
-	hints.ai_flags = 0;
-	hints.ai_protocol = proto;
+	if (type == SOCK_RAW) {
+		/* Workaround for glibc, that returns EAI_SERVICE (-8) if
+		 * SOCK_RAW and IPPROTO_GRE is used.
+		 */
+		hints.ai_socktype = SOCK_DGRAM;
+		hints.ai_protocol = IPPROTO_UDP;
+	} else {
+		hints.ai_socktype = type;
+		hints.ai_protocol = proto;
+	}
now rp->ai_socktype will be SOCK_DGRAM and rp->ai_protocol UDP. So the 'raw' socket for GRE will be a datagram socket for UDP.
I.e. you need the second hunk from my workaround[1]. I just wondered if you could think of a better way (one that can be easily dropped or ifdefed without putting the special case in the middle).
holger
[1] https://build.opensuse.org/package/view_file?expand=1&file=raw-socket.pa...
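Editor's sketch: one way to keep both halves of the workaround in a single place, so the special case does not sit in the middle of the socket setup code. This is not the patch from this thread nor the one behind [1]; the function name resolve_with_raw_workaround and its signature are hypothetical. It resolves with UDP hints when a raw socket is requested and then puts the caller's socket type and protocol back into every result entry.

```c
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

#ifndef IPPROTO_GRE
#define IPPROTO_GRE 47 /* IANA protocol number for GRE */
#endif

/* Hypothetical helper: like getaddrinfo(), but usable with SOCK_RAW.
 * On success the caller owns *result and must call freeaddrinfo(). */
int resolve_with_raw_workaround(int family, int type, int proto,
				const char *host, unsigned int port,
				struct addrinfo **result)
{
	struct addrinfo hints, *rp;
	char portbuf[16];
	int rc;

	assert(result != NULL);
	snprintf(portbuf, sizeof(portbuf), "%u", port);

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = family;
	if (type == SOCK_RAW) {
		/* First half: avoid EAI_SERVICE by asking for UDP. */
		hints.ai_socktype = SOCK_DGRAM;
		hints.ai_protocol = IPPROTO_UDP;
	} else {
		hints.ai_socktype = type;
		hints.ai_protocol = proto;
	}

	rc = getaddrinfo(host, portbuf, &hints, result);
	if (rc != 0)
		return rc;

	/* Second half: restore what the caller asked for, so a later
	 * socket(rp->ai_family, rp->ai_socktype, rp->ai_protocol)
	 * creates a raw socket and not a UDP one. */
	if (type == SOCK_RAW) {
		for (rp = *result; rp != NULL; rp = rp->ai_next) {
			rp->ai_socktype = SOCK_RAW;
			rp->ai_protocol = proto;
		}
	}
	return 0;
}
```

With a helper of this shape, the rest of the code can loop over the result and call socket() and bind() without knowing about the glibc issue, and the two SOCK_RAW branches can be removed or ifdefed together once a fixed glibc is available.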
On Mon, Jan 14, 2013 at 08:37:17AM +0100, Holger Hans Peter Freyther wrote:
now rp->ai_socktype will be SOCK_DGRAM and rp->ai_protocol UDP. So the 'raw' socket for GRE will be a datagram socket for UDP.
I.e. you need the second hunk from my workaround[1]. I just wondered if you could think of a better way (one that can be easily dropped or ifdefed without putting the special case in the middle).
Indeed. I noticed just after waking up in the morning while having breakfast. Please, check the new patch I sent you.
Thanks.