By default, the Testground infrastructure playbooks auto-provision a set of Grafana dashboards that provide visibility into the Testground infrastructure.
Currently provisioned dashboards in Grafana
As Testground matures, these dashboards are likely to change.
Cluster-wide resource utilisation
You can view aggregated resource usage across the whole cluster on the USE Method / Cluster dashboard.
Worker node resource utilisation
You can view CPU, memory, network, and disk utilisation per node on the USE Method / Node dashboard.
Application / Test run monitoring
To understand what your test run is doing, you can use Grafana to view some of the metrics it emits to InfluxDB while it is running, such as:
Life-cycle events
Diagnostics (i.e. Go runtime metrics)
Redis monitoring
Redis is an integral part of the sync service, which provides synchronisation and coordination between test plan instances. You can check its utilisation on the Redis dashboard.
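If you want to sanity-check a figure outside Grafana, one option is to read the text reply of Redis's `INFO` command directly. The following is a minimal sketch, not a description of how the Redis dashboard is built: the helper names `parse_redis_info` and `memory_utilisation` are hypothetical, and the sample reply is made up for illustration. It relies only on the `INFO` reply format (`key:value` lines grouped under `# Section` headers).

```python
def parse_redis_info(raw: str) -> dict:
    """Parse the text reply of Redis's INFO command into a flat dict.

    Hypothetical helper for illustration; values are kept as strings.
    """
    info = {}
    for line in raw.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Memory".
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            info[key] = value
    return info


def memory_utilisation(info: dict):
    """Return used_memory / maxmemory, or None when no memory limit is set."""
    used = int(info["used_memory"])
    limit = int(info.get("maxmemory", "0"))
    return used / limit if limit > 0 else None


# Made-up sample reply; real INFO output contains many more fields.
SAMPLE = (
    "# Memory\r\n"
    "used_memory:1048576\r\n"
    "maxmemory:4194304\r\n"
    "# Clients\r\n"
    "connected_clients:12\r\n"
)

if __name__ == "__main__":
    parsed = parse_redis_info(SAMPLE)
    print(parsed["connected_clients"], memory_utilisation(parsed))
```

In practice you would pass in the output of `redis-cli INFO` from the Redis instance backing the sync service instead of `SAMPLE`. A `maxmemory` of `0` means no limit is configured, which is why the ratio is reported as `None` in that case.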
WeaveNet monitoring
WeaveNet is used for the data plane in Testground: all test plan instances communicate with each other over WeaveNet. You can check network usage statistics on the WeaveNet dashboards.