
Alerting Ops Server

Alexandre Lamarre edited this page Feb 17, 2023 · 8 revisions


Summary:

The alerting ops server is used to:

  • propagate the user's configuration of the Alerting backend to the running Alerting backend
  • periodically sync user configurations to concrete routing models


Architecture:

[Figure: Alerting Ops Server breakdown]

Description

The Alerting Ops Server interacts with a cluster driver, which handles the Alerting backend deployment according to the available infrastructure (Kubernetes versus non-Kubernetes).

Responsibilities

  • Propagate configuration and deployment updates to the Alerting Backend via a cluster driver
  • Query status of Alerting Backend
  • Reconcile Opni user configurations to concrete Alerting backend configurations (e.g. Alertmanager configurations)

Corresponding UI element(s)

  • Alerting Tab / Main Page

Description

Allows the user to configure and scale their Alerting backend.

Screenshots

[Screenshot: Alerting backend]

Restrictions & Limitations

  • The cluster drivers that the ops server implements are infrastructure-specific, so testing is split across drivers (e.g. integration tests use the local driver, e2e tests use the Kubernetes driver)

Scale and performance:

To ensure that updates to the persisted routing data perform at scale, we:

  • periodically run a user config sync, which checks whether the user configurations have changed in a way that affects the loaded routing model
  • if they have, batch the changes into a single request that is broadcast to all connected clients
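
The sync-and-broadcast steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual implementation: `UserConfig`, `Syncer`, `syncOnce`, and `Run` are hypothetical names, change detection is approximated by hashing the serialized configuration, and the "batch" is simplified to sending the whole current configuration once per detected change.

```go
package main

import (
	"context"
	"crypto/sha256"
	"encoding/json"
	"fmt"
	"time"
)

// UserConfig is a hypothetical stand-in for the stored user configurations.
type UserConfig struct {
	Endpoints []string `json:"endpoints"`
}

// Syncer periodically compares the stored configuration with the last one it
// broadcast and notifies every connected client only when something changed.
type Syncer struct {
	load     func() UserConfig  // reads the current user configuration
	clients  []func(UserConfig) // hypothetical connected clients
	lastHash [32]byte           // hash of the last broadcast configuration
}

// syncOnce reports whether a broadcast was sent. The first call always
// broadcasts, because no configuration has been recorded yet.
func (s *Syncer) syncOnce() (bool, error) {
	cfg := s.load()
	raw, err := json.Marshal(cfg)
	if err != nil {
		return false, err
	}
	h := sha256.Sum256(raw)
	if h == s.lastHash {
		return false, nil // nothing changed, so no request is sent
	}
	s.lastHash = h
	for _, notify := range s.clients {
		notify(cfg) // one request per client carrying all accumulated changes
	}
	return true, nil
}

// Run calls syncOnce on every tick until the context is cancelled.
func (s *Syncer) Run(ctx context.Context, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			if _, err := s.syncOnce(); err != nil {
				fmt.Println("sync failed:", err)
			}
		}
	}
}

func main() {
	current := UserConfig{Endpoints: []string{"slack"}}
	s := &Syncer{
		load: func() UserConfig { return current },
		clients: []func(UserConfig){
			func(c UserConfig) { fmt.Println("client received:", c.Endpoints) },
		},
	}
	s.syncOnce()                                       // initial load: broadcasts
	s.syncOnce()                                       // unchanged: skipped
	current.Endpoints = append(current.Endpoints, "email")
	s.syncOnce()                                       // changed: broadcasts again
}
```

Skipping the broadcast when nothing changed keeps the cost of each tick proportional to reading the configuration rather than to the number of connected clients.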

High availability:

High availability of the ops server is tied to Opni Gateway high availability.

Testing:

Testplan

Unit tests

  • N/A

Integration tests

  • Covers install / uninstall, see Alerting Backend tests

e2e tests

  • N/A

Manual testing

  • Verify install / uninstall Alerting Backend, see Alerting Backend tests