osmith has submitted this change. ( https://gerrit.osmocom.org/c/osmo-ttcn3-hacks/+/40140?usp=email )
Change subject: testenv: support fetching coredumps in jenkins
testenv: support fetching coredumps in jenkins
The Osmocom jenkins nodes run inside LXCs. When we get a coredump, it appears on the host. Fetch it from there via testenv-coredump-helper, which gets added to the hosts in the related patch.
Related: osmo-ci I7e66c98106b7028a393e3b873e96ae2dcb412c48
Related: OS#6769
Change-Id: I3784b4cbcef08b26f77b6f6f7a70a830d9c81a18
---
M _testenv/README.md
M _testenv/testenv/coredump.py
2 files changed, 62 insertions(+), 1 deletion(-)
Approvals:
  Jenkins Builder: Verified
  fixeria: Looks good to me, approved
  pespin: Looks good to me, but someone else must approve
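The fetch described in the commit message can be tried locally with a stand-in server. The sketch below is only an illustration: it assumes nothing about the real testenv-coredump-helper beyond what the client code in this change relies on (a GET on `/core`, an `X-Executable-Path` response header, and HTTP 404 when no coredump exists). The names `FakeCoredumpHandler`, `fetch_core`, `demo`, and the fake payload and path are hypothetical.

```python
import http.server
import threading
import urllib.error
import urllib.request

# Hypothetical stand-in for testenv-coredump-helper, for local experiments only.
FAKE_CORE = b"not a real coredump"
FAKE_EXE = "/usr/bin/osmo-example"  # made-up executable path


class FakeCoredumpHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # Anything other than /core is treated as "no coredump found".
        if self.path != "/core":
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("X-Executable-Path", FAKE_EXE)
        self.send_header("Content-Length", str(len(FAKE_CORE)))
        self.end_headers()
        self.wfile.write(FAKE_CORE)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_core(ip, port):
    """Return (executable_path, core_bytes), or (None, None) on HTTP 404."""
    # An empty ProxyHandler makes the request go directly to the given IP,
    # even if proxy environment variables are set.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(f"http://{ip}:{port}/core") as response:
            return response.headers["X-Executable-Path"], response.read()
    except urllib.error.HTTPError as e:
        if e.code == 404:
            return None, None
        raise


def demo():
    # Port 0 lets the OS pick a free port for the temporary server.
    server = http.server.HTTPServer(("127.0.0.1", 0), FakeCoredumpHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return fetch_core("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` starts the fake server on a free loopback port, fetches `/core` once, and returns the header value together with the body. Against a real helper, setting `TESTENV_COREDUMP_FROM_LXC_HOST_IP=127.0.0.1` would play the role of the hard-coded address here.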
diff --git a/_testenv/README.md b/_testenv/README.md
index 11cd8bd..66158c4 100644
--- a/_testenv/README.md
+++ b/_testenv/README.md
@@ -215,6 +215,15 @@
 available but doesn't work in podman. QEMU runs a bit slower when this is set.
+* `TESTENV_COREDUMP_FROM_LXC_HOST`:
+  Instead of using coredumpctl to retrieve the coredump, assume testenv is
+  running inside an LXC and try to retrieve the coredump from the LXC host
+  (OS#6769). This is used in jenkins.
+
+* `TESTENV_COREDUMP_FROM_LXC_HOST_IP`:
+  Instead of attempting to automatically detect the LXC host IP, use this IP.
+  This can be set to 127.0.0.1 for testing.
+
 ## Troubleshooting
 ### Timeout waiting for RESET-ACK after sending RESET
diff --git a/_testenv/testenv/coredump.py b/_testenv/testenv/coredump.py
index 0d56328..0343c67 100644
--- a/_testenv/testenv/coredump.py
+++ b/_testenv/testenv/coredump.py
@@ -11,9 +11,54 @@
 import testenv
 import testenv.daemons
 import testenv.testdir
+import urllib
+import urllib.request
executable_path = None
+lxc_netdev = "eth0"
+lxc_ip_pattern = "10.0.*"
+lxc_port = 8042
+
+
+def find_lxc_host_ip():
+    cmd = ["ip", "-j", "-o", "-4", "addr", "show", "dev", lxc_netdev]
+    p = testenv.cmd.run(cmd, check=False, no_podman=True, capture_output=True, text=True)
+    ret = json.loads(p.stdout)[0]["addr_info"][0]["local"]
+    if fnmatch.fnmatch(ret, lxc_ip_pattern):
+        ret = ret.split(".")
+        ret = f"{ret[0]}.{ret[1]}.{ret[2]}.1"
+        return ret
+    return None
+
+
+def get_from_coredumpctl_lxc_host():
+    # Server implementation: osmo-ci, ansible/roles/testenv-coredump-helper
+    global executable_path
+
+    logging.info("Looking for a coredump on lxc host")
+
+    ip = os.environ.get("TESTENV_COREDUMP_FROM_LXC_HOST_IP") or find_lxc_host_ip()
+    if not ip:
+        logging.warning("Failed to get lxc host ip, can't look for coredump")
+        return
+
+    try:
+        with urllib.request.urlopen(f"http://{ip}:{lxc_port}/core") as response:
+            executable_path = dict(response.getheaders())["X-Executable-Path"]
+            with open(f"{testenv.testdir.testdir}/core", "wb") as h:
+                shutil.copyfileobj(response, h)
+            logging.debug("Coredump found and copied to log dir")
+    except urllib.error.HTTPError as e:
+        executable_path = None
+        if e.code == 404:
+            logging.debug("No coredump found")
+        else:
+            logging.error(f"Unexpected error while attempting to fetch the coredump: {e}")
+    except urllib.error.URLError as e:
+        executable_path = None
+        logging.error(f"Unexpected error while attempting to fetch the coredump: {e}")
+
 def executable_is_relevant(exe):
     if testenv.args.binary_repo:
@@ -32,7 +77,7 @@
     return False
-def get_from_coredumpctl():
+def get_from_coredumpctl_local():
     global executable_path
     logging.info("Looking for a coredump")
@@ -68,6 +113,13 @@
     executable_path = coredump["exe"]
+def get_from_coredumpctl():
+    if os.environ.get("TESTENV_COREDUMP_FROM_LXC_HOST"):
+        get_from_coredumpctl_lxc_host()
+    else:
+        get_from_coredumpctl_local()
+
+
 def get_backtrace():
     global executable_path
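The host-IP guess in `find_lxc_host_ip()` can be looked at in isolation. The following is a minimal sketch of that one step, assuming (as the patch does) that the LXC host sits at `.1` of the container's subnet; it leaves out the `ip -j` call and takes the container address as a plain string. The function names `guess_lxc_host_ip` and `resolve_host_ip` are hypothetical and not part of testenv.

```python
import fnmatch

LXC_IP_PATTERN = "10.0.*"  # same value as lxc_ip_pattern in the patch


def guess_lxc_host_ip(container_ip, pattern=LXC_IP_PATTERN):
    """Return the assumed host address for container_ip, or None if it does not match."""
    if not fnmatch.fnmatch(container_ip, pattern):
        return None
    first, second, third, _ = container_ip.split(".")
    # Assumption carried over from the patch: the host is at .1 of the same subnet.
    return f"{first}.{second}.{third}.1"


def resolve_host_ip(environ, container_ip):
    """Prefer the explicit override, then fall back to the guess."""
    return environ.get("TESTENV_COREDUMP_FROM_LXC_HOST_IP") or guess_lxc_host_ip(container_ip)
```

For example, `resolve_host_ip({}, "10.0.3.57")` falls back to the guess, while passing `{"TESTENV_COREDUMP_FROM_LXC_HOST_IP": "127.0.0.1"}` short-circuits it, which mirrors the `or` expression used in `get_from_coredumpctl_lxc_host()`.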