osmith has submitted this change. ( https://gerrit.osmocom.org/c/osmo-ci/+/40138?usp=email )
Change subject: ansible: build-hosts: add testenv-coredump-helper
......................................................................
ansible: build-hosts: add testenv-coredump-helper
The Osmocom Jenkins nodes run inside LXCs. When we get a coredump, it
appears on the host rather than inside the container. Add a helper
script to the hosts so the Jenkins jobs can fetch the coredump if an
Osmocom program crashes while running a ttcn3 testsuite.
The helper script has the following safety features to ensure Jenkins
can't just fetch any coredump:
* Only fetch coredumps from the last 3 seconds and only if the
  executable matches osmo-* or open5gs-*
* Only listen on the LXC IP
Related: OS#6769
Change-Id: I7e66c98106b7028a393e3b873e96ae2dcb412c48
---
A ansible/roles/testenv-coredump-helper/README.md
A ansible/roles/testenv-coredump-helper/files/testenv-coredump-helper.py
A ansible/roles/testenv-coredump-helper/files/testenv-coredump-helper.service
A ansible/roles/testenv-coredump-helper/handlers/main.yml
A ansible/roles/testenv-coredump-helper/tasks/main.yml
M ansible/setup-build-host.yml
6 files changed, 210 insertions(+), 0 deletions(-)
Approvals:
pespin: Looks good to me, but someone else must approve
Jenkins Builder: Verified
fixeria: Looks good to me, approved
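For readers who want to see the protocol from the container side: the README in this change describes a single `GET /core` endpoint that answers `404` when there is no matching coredump, and `200` with an `X-Executable-Path` header plus the coredump as body otherwise. Below is a minimal client sketch under those assumptions. It is not the actual client from `osmo-ttcn3-hacks.git`; the function name `fetch_coredump` and its parameters are hypothetical and only serve as illustration.

```python
import urllib.error
import urllib.request


def fetch_coredump(host_ip, port=8042, timeout=10.0):
    """Ask the helper for a recent coredump (hypothetical helper function).

    Returns (executable_path, core_bytes) on HTTP 200, or None when the
    helper answers 404 (no matching coredump). Other errors propagate.
    """
    url = f"http://{host_ip}:{port}/core"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            # The helper sends the crashed program's path in this header
            exe = resp.headers.get("X-Executable-Path")
            # The response body is the coredump itself
            core = resp.read()
    except urllib.error.HTTPError as e:
        if e.code == 404:
            return None
        raise
    return exe, core
```

Inside the LXC this could be called as `fetch_coredump("10.0.3.1")`, using the example `lxcbr0` address from the README. Since the helper only considers coredumps from the last three seconds, a caller would presumably issue the request right after detecting the crash.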
diff --git a/ansible/roles/testenv-coredump-helper/README.md b/ansible/roles/testenv-coredump-helper/README.md
new file mode 100644
index 0000000..d27f457
--- /dev/null
+++ b/ansible/roles/testenv-coredump-helper/README.md
@@ -0,0 +1,49 @@
+# testenv-coredump-helper
+
+A simple webserver to make Osmocom-related coredumps available in LXCs.
+
+## Architecture
+
+```
+.-----------------------------------------------------------------------------.
+| build host (build4)                             .--------------------------.|
+|                                                 | LXC (deb12build-ansible) ||
+|                                                 |                          ||
+|               shell                             | HTTP                     ||
+| coredumpctl --------- testenv-coredump-helper ------------- testenv        ||
+|                                                 |__________________________||
+|_____________________________________________________________________________|
+```
+
+## What this script does
+
+This role installs a systemd service running the script in
+`files/testenv-coredump-helper.py`, which runs an HTTP server on port `8042` of
+the `lxcbr0` IP (e.g. `10.0.3.1`) on the build host. The IP is detected
+dynamically, as it is random on each build host.
+
+The HTTP server provides one GET endpoint, `/core`. When it is requested (by
+testenv running inside the LXC), the script runs `coredumpctl` to look up the
+most recent coredump from the last three seconds and checks whether its
+executable name matches an Osmocom-related program (`osmo-*` or `open5gs-*`).
+
+* If no matching coredump was found, it returns HTTP status code `404`.
+
+* If a matching coredump was found, it returns HTTP status code `200`, sends
+  the path to the executable in an `X-Executable-Path` header and sends the
+  coredump itself as body.
+
+The coredump and path to the executable are retrieved from `coredumpctl`. The
+coredump is stored in a temporary file for the duration of the transfer.
+
+## Client implementation
+
+The client-side implementation is in `osmo-ttcn3-hacks.git`, in the
+`get_from_coredumpctl_lxc_host()` function of
+`_testenv/testenv/coredump.py`.
+
+## Maximum coredump size
+
+The `testenv-coredump-helper` script does not limit the size of the coredump.
+However, the maximum size that `systemd-coredump` accepts can be configured in
+`/etc/systemd/coredump.conf`.
diff --git a/ansible/roles/testenv-coredump-helper/files/testenv-coredump-helper.py b/ansible/roles/testenv-coredump-helper/files/testenv-coredump-helper.py
new file mode 100644
index 0000000..a56bd0f
--- /dev/null
+++ b/ansible/roles/testenv-coredump-helper/files/testenv-coredump-helper.py
@@ -0,0 +1,112 @@
+#!/usr/bin/env python3
+# Copyright 2025 sysmocom - s.f.m.c. GmbH
+# SPDX-License-Identifier: GPL-3.0-or-later
+# Simple webserver to make Osmocom related coredumps available in LXCs. See
+# ../README.md and OS#6769 for details.
+import datetime
+import fnmatch
+import http.server
+import json
+import os
+import shutil
+import signal
+import socket
+import socketserver
+import subprocess
+import sys
+import tempfile
+
+
+NETDEV = "lxcbr0"
+IP_PATTERN = "10.0.*"
+PORT = 8042
+
+
+def find_lxc_ip():
+    cmd = ["ip", "-j", "-o", "-4", "addr", "show", "dev", NETDEV]
+    p = subprocess.run(cmd, capture_output=True, text=True, check=True)
+    ret = json.loads(p.stdout)[0]["addr_info"][0]["local"]
+    if not fnmatch.fnmatch(ret, IP_PATTERN):
+        print(f"ERROR: IP doesn't match pattern {IP_PATTERN}: {ret}")
+        sys.exit(1)
+    return ret
+
+
+def executable_is_relevant(exe):
+    basename = os.path.basename(exe)
+    patterns = [
+        "open5gs-*",
+        "osmo-*",
+    ]
+
+    for pattern in patterns:
+        if fnmatch.fnmatch(basename, pattern):
+            return True
+
+    return False
+
+
+class CustomRequestHandler(http.server.SimpleHTTPRequestHandler):
+    def do_GET(self):
+        if self.path == "/core":
+            # Check for any coredump within last 3 seconds
+            since = (datetime.datetime.now() - datetime.timedelta(seconds=3)).strftime("%Y-%m-%d %H:%M:%S")
+            cmd = ["coredumpctl", "-q", "-S", since, "--json=short", "-n1"]
+
+            p = subprocess.run(cmd, capture_output=True, text=True)
+            if p.returncode != 0:
+                self.send_error(404, "No coredump found")
+                return None
+
+            # Check if the coredump executable is from osmo-*, open5gs-*, etc.
+            coredump = json.loads(p.stdout)[0]
+            if not executable_is_relevant(coredump["exe"]):
+                self.send_error(404, "No coredump found")
+                return None
+
+            # Put coredump into a temporary file and return it
+            with tempfile.TemporaryDirectory() as tmpdirname:
+                core_path = os.path.join(tmpdirname, "core")
+                cmd = [
+                    "coredumpctl",
+                    "dump",
+                    "-q",
+                    "-S",
+                    since,
+                    "-o",
+                    core_path,
+                    str(coredump["pid"]),
+                    coredump["exe"],
+                ]
+                subprocess.run(cmd, stdout=subprocess.DEVNULL, check=True)
+
+                with open(core_path, "rb") as f:
+                    self.send_response(200)
+                    self.send_header("X-Executable-Path", coredump["exe"])
+                    self.end_headers()
+                    self.wfile.write(f.read())
+        else:
+            self.send_error(404, "File Not Found")
+
+
+def signal_handler(sig, frame):
+    sys.exit(0)
+
+
+def main():
+    if not shutil.which("coredumpctl"):
+        print("ERROR: coredumpctl not found!")
+        sys.exit(1)
+
+    ip = os.environ.get("LXC_HOST_IP") or find_lxc_ip()
+    print(f"Listening on {ip}:{PORT}")
+    signal.signal(signal.SIGINT, signal_handler)
+    with socketserver.TCPServer((ip, PORT), CustomRequestHandler, False) as httpd:
+        httpd.socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
+        httpd.server_bind()
+        httpd.server_activate()
+        httpd.serve_forever()
+
+
+if __name__ == "__main__":
+    main()
diff --git a/ansible/roles/testenv-coredump-helper/files/testenv-coredump-helper.service b/ansible/roles/testenv-coredump-helper/files/testenv-coredump-helper.service
new file mode 100644
index 0000000..ef5a851
--- /dev/null
+++ b/ansible/roles/testenv-coredump-helper/files/testenv-coredump-helper.service
@@ -0,0 +1,12 @@
+[Unit]
+Description=testenv coredump helper
+After=lxc.service
+
+[Service]
+Environment="PYTHONUNBUFFERED=1"
+Type=simple
+Restart=always
+ExecStart=/opt/testenv-coredump-helper/testenv-coredump-helper
+
+[Install]
+WantedBy=multi-user.target
diff --git a/ansible/roles/testenv-coredump-helper/handlers/main.yml b/ansible/roles/testenv-coredump-helper/handlers/main.yml
new file mode 100644
index 0000000..0f7ef57
--- /dev/null
+++ b/ansible/roles/testenv-coredump-helper/handlers/main.yml
@@ -0,0 +1,5 @@
+---
+- name: restart testenv-coredump-helper
+  service:
+    name: testenv-coredump-helper
+    state: restarted
diff --git a/ansible/roles/testenv-coredump-helper/tasks/main.yml b/ansible/roles/testenv-coredump-helper/tasks/main.yml
new file mode 100644
index 0000000..9ff2769
--- /dev/null
+++ b/ansible/roles/testenv-coredump-helper/tasks/main.yml
@@ -0,0 +1,31 @@
+---
+- name: install coredumpctl
+  apt:
+    name:
+      - systemd-coredump
+    cache_valid_time: 3600
+    update_cache: yes
+
+- name: mkdir /opt/testenv-coredump-helper
+  ansible.builtin.file:
+    path: /opt/testenv-coredump-helper
+    state: directory
+
+- name: install testenv-coredump-helper
+  ansible.builtin.copy:
+    src: testenv-coredump-helper.py
+    dest: /opt/testenv-coredump-helper/testenv-coredump-helper
+    mode: '0755'
+  notify: restart testenv-coredump-helper
+
+- name: install testenv-coredump-helper service
+  ansible.builtin.copy:
+    src: testenv-coredump-helper.service
+    dest: /etc/systemd/system/testenv-coredump-helper.service
+    mode: '0644'
+  notify: restart testenv-coredump-helper
+
+- name: enable testenv-coredump-helper service
+  ansible.builtin.systemd_service:
+    name: testenv-coredump-helper
+    enabled: true
diff --git a/ansible/setup-build-host.yml b/ansible/setup-build-host.yml
index ed8def5..d1d9874 100644
--- a/ansible/setup-build-host.yml
+++ b/ansible/setup-build-host.yml
@@ -18,3 +18,4 @@
update_cache: yes
roles:
- name: apt-allow-relinfo-change
+ - name: testenv-coredump-helper
--
Gerrit-MessageType: merged
Gerrit-Project: osmo-ci
Gerrit-Branch: master
Gerrit-Change-Id: I7e66c98106b7028a393e3b873e96ae2dcb412c48
Gerrit-Change-Number: 40138
Gerrit-PatchSet: 2
Gerrit-Owner: osmith <osmith(a)sysmocom.de>
Gerrit-Reviewer: Jenkins Builder
Gerrit-Reviewer: fixeria <vyanitskiy(a)sysmocom.de>
Gerrit-Reviewer: osmith <osmith(a)sysmocom.de>
Gerrit-Reviewer: pespin <pespin(a)sysmocom.de>