Dear list members, dear Mr. Tsou,
we had a couple of problems with the SSE support in Transceiver52M of osmo-trx. The x86 code in Transceiver52M/x86/ is compiled with -march=native, which creates a platform-dependent binary that cannot easily be moved to another machine (e.g. a binary compiled on a machine that supports SSE4.1, but run on a machine that only supports SSE3).
In order to solve the problem, we have added logic to detect the CPU type at runtime and to switch between the base implementation (Transceiver52M/common) and the platform-specific implementation.
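To illustrate the idea of runtime switching, here is a minimal sketch in C. It is not the code from the patches: the names dot_generic, dot_sse3, dot_select and dot are hypothetical, the "SSE3" variant is only a stand-in that reuses the generic loop, and the detection relies on the GCC/Clang builtins __builtin_cpu_init() and __builtin_cpu_supports(), which are assumed to be available on x86 builds.

```c
#include <stddef.h>

/* Hypothetical kernel signature, chosen for illustration only. */
typedef float (*dot_fn)(const float *a, const float *b, size_t n);

/* Portable base implementation that runs on any CPU. */
static float dot_generic(const float *a, const float *b, size_t n)
{
	float acc = 0.0f;
	for (size_t i = 0; i < n; i++)
		acc += a[i] * b[i];
	return acc;
}

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_CPU_DETECT 1

/* Stand-in for an SSE3-optimised variant. A real one would live in a
 * separate file built with -msse3; here it just calls the base code. */
static float dot_sse3(const float *a, const float *b, size_t n)
{
	return dot_generic(a, b, n);
}
#endif

/* Currently selected implementation; defaults to the portable one. */
static dot_fn dot_impl = dot_generic;

/* Probe the CPU once at startup and pick an implementation.
 * Returns the name of the variant that was selected. */
const char *dot_select(void)
{
#ifdef HAVE_X86_CPU_DETECT
	__builtin_cpu_init();
	if (__builtin_cpu_supports("sse3")) {
		dot_impl = dot_sse3;
		return "sse3";
	}
#endif
	dot_impl = dot_generic;
	return "generic";
}

/* Public entry point: always goes through the selected pointer. */
float dot(const float *a, const float *b, size_t n)
{
	return dot_impl(a, b, n);
}
```

A caller would invoke dot_select() once during initialisation and then use dot() everywhere; the binary itself stays free of instructions that the host CPU might not support, because only the selected function is ever executed.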
Since there may be compilers that do not support the -msse3 and -msse4.1 options, we also modified the build system to detect whether the compiler supports them. If no support is detected, there is a hard fallback to the generic implementation.
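On the C side, such a hard fallback could look roughly like the sketch below. This is an assumption-laden illustration, not the patch itself: HAVE_SSE3 is a hypothetical macro that the build system would define only when its compiler check for -msse3 succeeds, and sum_floats / sum_floats_uses_sse3 are made-up names. Without the macro, only the generic loop is compiled.

```c
#include <stddef.h>

/* HAVE_SSE3 is a hypothetical macro set by the build system when the
 * compiler accepts -msse3; __SSE3__ is defined by GCC/Clang when SSE3
 * code generation is actually enabled for this file. */
#if defined(HAVE_SSE3) && defined(__SSE3__)
#include <pmmintrin.h>
#define USE_SSE3 1
#endif

/* Sum n floats. Uses SSE3 horizontal adds when built with support,
 * otherwise falls back to the plain loop only. */
float sum_floats(const float *x, size_t n)
{
	float acc = 0.0f;
	size_t i = 0;

#ifdef USE_SSE3
	__m128 v = _mm_setzero_ps();
	/* Accumulate four lanes at a time (unaligned loads). */
	for (; i + 4 <= n; i += 4)
		v = _mm_add_ps(v, _mm_loadu_ps(&x[i]));
	/* Two horizontal adds fold the four lanes into one value. */
	v = _mm_hadd_ps(v, v);
	v = _mm_hadd_ps(v, v);
	acc = _mm_cvtss_f32(v);
#endif

	/* Generic path: handles the tail, or everything without SSE3. */
	for (; i < n; i++)
		acc += x[i];
	return acc;
}

/* Reports which path was compiled in (1 = SSE3, 0 = generic). */
int sum_floats_uses_sse3(void)
{
#ifdef USE_SSE3
	return 1;
#else
	return 0;
#endif
}
```

Note that a file built this way with -msse3 must only be called after a runtime check has confirmed that the CPU supports SSE3; the compile-time guard only decides whether the optimised code exists in the binary at all.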
The patches are currently in the review process; you can find them below:

https://gerrit.osmocom.org/2098 buildenv: Turn off native architecture builds
https://gerrit.osmocom.org/2099 cosmetic: Make parameter lists uniform
https://gerrit.osmocom.org/2100 ssedetect: Add runtime CPU detection
https://gerrit.osmocom.org/2101 cosmetic: Add info about SSE support
https://gerrit.osmocom.org/2102 buildenv: Make build CPU invariant
https://gerrit.osmocom.org/2103 cosmetic: remove code duplication
https://gerrit.osmocom.org/2104 Add test program to verify convolution implementation
https://gerrit.osmocom.org/2134 buildenv: Split up SSE3 and SSE4.1 code
There is also a Redmine ticket concerning this change: https://osmocom.org/issues/1869
It would be very kind if you could have a look at the changes and, if possible, approve them.
Regards,
Philipp Maier
Hi Philipp,
On Thu, Mar 23, 2017 at 7:27 AM, Philipp Maier <pmaier@sysmocom.de> wrote:
> we had a couple of problems with the SSE support in Transceiver52 of osmo-trx. The problem was that the -march=native, that is used when compiling the x86 code in Transceiver52M/x86/. This would create a platform dependent binary that can not be easily moved to another platform. (e.g. binary is compiled on a machine that does support SSE4.1, but used on a machine that only supports SSE3)
> In order to solve the problem, we have added logic to detect the CPU type at runtime and to switch between the base implementation (Transceiver52M/common) and between the platform specific implementation.
Thank you for the contributions. I have been travelling recently and am catching up with a number of recent osmo-trx patch submissions. I am aware of the SSE concerns and will try to review and merge the patches in a timely manner.
-TT