hnbgw counter threads: yet another complication

13 May 2024

more hnbgw counter implementation details:

I just realized that there is a race condition when we have two separate
threads for adding/removing nft rules and for retrieving counters:

When adding a new rule, the counters will read back starting from zero. The
rate_ctrs in hnbgw should persistently grow across reconnects, so we need to
cache the last seen counter value and accumulate the difference.

- This cache needs to be reset to zero at the time a new rule is set up.
  (hnbgw cache needs to tightly correllate to the nftables state.)

- The counters being retrieved in a separate thread are then ambiguous: they
  could reflect either a count of the new rule, or of a previously existing
  rule.

Example of rapid disconnect and re-attach,
which would add 12345 ghost counts that need to be avoided:

  main()         nft-maintenance-thread
   |*              |
   |--add-hnb----->|
   |<-done---------|
 [prev_ctr=0]
   |
   | ...lots of traffic...
   |
 [prev_ctr=12345]
   |                                          counters-thread
   |                                            |
   |-----get-counters-------------------------->|
   |                                          [work]
   |                                            |
   |X                                         [reading: ctr=12345]
   |--del-hnb----->|                            |
   |<-done---------|                            |  correct:
   |                                            |  ctr - prev_ctr = 0
   |*                                           |  (no additional counts)
   |--add-hnb----->|                            |
   |<-done---------|                            |
 [prev_ctr=0]                                   |
   |                                            |
   |<----ctr=12345------------------------------|
 [update counter state]
 [add ctr - prev_ctr = 12345] incorrect!
 [prev_ctr=12345]
   |
   | ...
   |
   |-----get-counters-------------------------->|
   |<----ctr=0----------------------------------|
 [update counter state]
 [add ctr - prev_ctr = -12345] rate_ctr must not go backwards!

The thread reading the counters has no way of telling which reference state nft
is using for its report. AFAICT this can only be fixed by doing hnb rule
maintenance strictly sequentially to retrieving counters.

So I am going to reduce the implementation to a single nft thread that does
both hnb maintenance as well as counter retrieval.

Agree?

~N

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

hnbgw counter threads: yet another complication