osmith submitted this change.

Approvals:
  fixeria: Looks good to me, but someone else must approve
  pespin: Looks good to me, approved
  Jenkins Builder: Verified

testenv: run podman command with logfile

In very rare cases, podman seems to crash in jenkins for no apparent
reason. Add logging to the main script we run inside podman, and run
podman with a logfile attached to help figure out why.

Related: OS#6607
Change-Id: Ife3c0ae559c94f7df8b5912bb0e338ae6283cb7f
---
M _testenv/data/scripts/testenv-podman-main.sh
M _testenv/testenv/podman.py
2 files changed, 31 insertions(+), 6 deletions(-)
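
Note for anyone reading the resulting logfile: as far as I know, podman treats "json-file" as an alias for its "k8s-file" log driver, so the file is expected to hold one "<timestamp> <stream> <P|F> <message>" record per line rather than JSON objects. A minimal sketch for picking the new ERROR messages out of such a file, assuming that layout; parse_log_line and find_errors are hypothetical helpers and not part of testenv:

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class LogRecord(NamedTuple):
    timestamp: str
    stream: str  # "stdout" or "stderr"
    partial: bool  # True when the logger tagged the line as partial ("P")
    message: str


def parse_log_line(line: str) -> Optional[LogRecord]:
    """Parse one '<timestamp> <stream> <P|F> <message>' line (assumed layout)."""
    parts = line.rstrip("\n").split(" ", 3)
    if len(parts) < 3:
        return None
    timestamp, stream, tag = parts[0], parts[1], parts[2]
    if stream not in ("stdout", "stderr") or tag not in ("P", "F"):
        return None
    message = parts[3] if len(parts) == 4 else ""
    return LogRecord(timestamp, stream, tag == "P", message)


def find_errors(lines: Iterable[str]) -> Iterator[LogRecord]:
    """Yield the records whose message starts with 'ERROR:'."""
    for line in lines:
        record = parse_log_line(line)
        if record is not None and record.message.startswith("ERROR:"):
            yield record
```

With an open file object for the container log, something like `for rec in find_errors(f): print(rec.timestamp, rec.message)` would list the watchdog and timeout errors that testenv-podman-main.sh now prints.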

diff --git a/_testenv/data/scripts/testenv-podman-main.sh b/_testenv/data/scripts/testenv-podman-main.sh
index d0bc586..4b080a6 100755
--- a/_testenv/data/scripts/testenv-podman-main.sh
+++ b/_testenv/data/scripts/testenv-podman-main.sh
@@ -5,16 +5,20 @@
 # This ensures the podman container stops a few seconds after a jenkins job was
 # aborted, or if a test is stuck in a loop for hours.

+echo "Running testenv-podman-main.sh"
+
 stop_time=$(($(date +%s) + 3600 * 4))

 while [ $(date +%s) -lt $stop_time ]; do
 	sleep 10

 	if ! [ -e /tmp/watchdog ]; then
-		break
+		echo "ERROR: /tmp/watchdog was not created, exiting"
+		exit 1
 	fi

 	rm /tmp/watchdog
 done

+echo "ERROR: timeout reached!"
 exit 1
diff --git a/_testenv/testenv/podman.py b/_testenv/testenv/podman.py
index faa1837..f849408 100644
--- a/_testenv/testenv/podman.py
+++ b/_testenv/testenv/podman.py
@@ -185,6 +185,27 @@
pass


+def wait_until_started():
+    for i in range(100):
+        time.sleep(0.1)
+        if is_running():
+            return
+    raise RuntimeError("Podman failed to start")
+
+
+def start_in_background(cmd):
+    log_dir = os.path.join(testenv.testdir.testdir_topdir, "podman")
+    os.makedirs(log_dir, exist_ok=True)
+
+    logging.debug(f"+ {cmd}")
+    subprocess.Popen(cmd, env=testenv.cmd.generate_env())
+
+    wait_until_started()
+
+    feed_watchdog_process = multiprocessing.Process(target=feed_watchdog_loop)
+    feed_watchdog_process.start()
+
+
 def start():
     global container_name
     global feed_watchdog_process
@@ -202,7 +223,10 @@
         "--rm",
         "--name",
         container_name,
-        "--detach",
+        "--log-driver",
+        "json-file",
+        "--log-opt",
+        f"path={testdir_topdir}/podman/{container_name}.log",
         f"--security-opt=seccomp={seccomp}",
         "--cap-add=NET_ADMIN", # for dumpcap, tun devices, osmo-pcap-client
         "--cap-add=NET_RAW", # for dumpcap, osmo-pcap-client
@@ -243,10 +267,7 @@
         os.path.join(testenv.data_dir, "scripts/testenv-podman-main.sh"),
     ]

-    testenv.cmd.run(cmd, no_podman=True)
-
-    feed_watchdog_process = multiprocessing.Process(target=feed_watchdog_loop)
-    feed_watchdog_process.start()
+    start_in_background(cmd)

     exec_cmd(["rm", "/etc/apt/apt.conf.d/docker-clean"])


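The wait_until_started() helper in the diff polls is_running() up to 100 times with a 0.1 s sleep in between, so its effective timeout is roughly 10 s plus whatever time is_running() itself takes. The same polling pattern with an explicit deadline could be sketched as follows; wait_until and its parameters are hypothetical names for illustration and are not part of testenv:

```python
import time
from typing import Callable


def wait_until(
    predicate: Callable[[], bool],
    timeout: float = 10.0,
    interval: float = 0.1,
) -> None:
    """Poll predicate() until it returns True or `timeout` seconds have passed."""
    # time.monotonic() is unaffected by system clock adjustments.
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return
        if time.monotonic() >= deadline:
            raise RuntimeError(f"condition not met within {timeout} s")
        time.sleep(interval)
```

In the context of this change the call would look like `wait_until(is_running)`. The deadline keeps the total wait bounded even when each check is slow, at the cost of a slightly longer function than the fixed-count loop.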
Gerrit-MessageType: merged
Gerrit-Project: osmo-ttcn3-hacks
Gerrit-Branch: master
Gerrit-Change-Id: Ife3c0ae559c94f7df8b5912bb0e338ae6283cb7f
Gerrit-Change-Number: 38580
Gerrit-PatchSet: 2
Gerrit-Owner: osmith <osmith@sysmocom.de>
Gerrit-Reviewer: Jenkins Builder
Gerrit-Reviewer: fixeria <vyanitskiy@sysmocom.de>
Gerrit-Reviewer: osmith <osmith@sysmocom.de>
Gerrit-Reviewer: pespin <pespin@sysmocom.de>