Problem:
The counter increment task in the network monitor currently re-syncs the wallet account from the RPC after a number of consecutive failures. However, if the accounts themselves are fundamentally outdated or corrupted (e.g. after a network reset or protocol upgrade), re-syncing doesn't help: the monitor keeps failing indefinitely and reports the NTX services as unhealthy.
This is a common issue in our current environment.
Proposed Solution
After N consecutive failures where re-sync also fails, the monitor should automatically:
- Redeploy the counter account via `deploy_counter_account()`
- Recreate and save the wallet and counter account files
- Re-initialize the increment task state (`setup_increment_task`)
- Reset failure counters and resume normal operation
This should be bounded (e.g. max 1 regeneration attempt per hour) to avoid infinite loops if the network itself is down.
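
A minimal sketch of how the bounding could work, assuming a small helper that counts failed re-syncs and enforces a cooldown between regeneration attempts. The name `RegenerationGuard`, its fields, and its methods are hypothetical and not part of the existing monitor. The sketch only decides *when* to regenerate; it does not call `deploy_counter_account()` or `setup_increment_task`, since their signatures are not shown here.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: decides when a full account regeneration is allowed.
pub struct RegenerationGuard {
    /// N: consecutive failures (with failed re-sync) before regenerating.
    threshold: u32,
    /// Minimum time between two regeneration attempts (e.g. one hour).
    cooldown: Duration,
    /// Consecutive failures where the re-sync also failed.
    failed_resyncs: u32,
    /// When the last regeneration attempt was started, if any.
    last_regeneration: Option<Instant>,
}

impl RegenerationGuard {
    pub fn new(threshold: u32, cooldown: Duration) -> Self {
        Self {
            threshold,
            cooldown,
            failed_resyncs: 0,
            last_regeneration: None,
        }
    }

    /// A successful increment ends the failure streak.
    pub fn record_success(&mut self) {
        self.failed_resyncs = 0;
    }

    /// Record a failure where the re-sync also failed.
    /// Returns `true` if a regeneration should be attempted now.
    pub fn record_failed_resync(&mut self, now: Instant) -> bool {
        self.failed_resyncs = self.failed_resyncs.saturating_add(1);
        if self.failed_resyncs < self.threshold {
            return false;
        }

        // Allow regeneration only if none happened within the cooldown window.
        let cooled_down = match self.last_regeneration {
            Some(last) => now.saturating_duration_since(last) >= self.cooldown,
            None => true,
        };

        if cooled_down {
            // Start a new window and reset the failure counter.
            self.last_regeneration = Some(now);
            self.failed_resyncs = 0;
        }
        cooled_down
    }
}
```

When `record_failed_resync` returns `true`, the increment task would run the regeneration steps listed above. While the cooldown is active it keeps returning `false`, so a network outage cannot trigger repeated redeployments.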