# Monitoring and Observability

## Infrastructure monitoring

To understand what your Kubernetes cluster is doing, open Grafana and check the dashboards.

### Access Grafana

Port-forward the Grafana service to a local port:

```
$ kubectl port-forward service/prometheus-operator-grafana 3000:80
```

Then open Grafana at <http://localhost:3000>.

The default admin credentials are username `admin` and password `admin`.
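Once the port-forward is running, you may want to confirm that Grafana is reachable before opening the browser. The following is a minimal sketch, assuming `curl` is installed locally; it relies on Grafana's `/api/health` endpoint, and the helper names `grafana_health_url` and `grafana_is_up` are hypothetical, not part of Testground.

```shell
# Build the health-check URL for a locally forwarded Grafana.
# The default port (3000) matches the local port used in the port-forward command.
grafana_health_url() {
  local port="${1:-3000}"
  echo "http://localhost:${port}/api/health"
}

# Return 0 if Grafana answers on the forwarded port, non-zero otherwise.
# Hypothetical helper: requires curl and a running port-forward.
grafana_is_up() {
  curl --silent --fail --max-time 5 "$(grafana_health_url "$1")" > /dev/null
}

# Usage (with the port-forward running in another terminal):
#   grafana_is_up 3000 && echo "Grafana is reachable"
```

If the check fails, the port-forward has most likely exited or is bound to a different local port.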

### Dashboards

By default, the Testground infrastructure playbooks auto-provision a set of dashboards that provide visibility into the Testground infrastructure:

![Currently provisioned dashboards in Grafana](https://2269362904-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M3Ca3OQSAlIMUyorSwi%2F-M6_RYdHHk4yOR4uQV8z%2F-M6_T4mtoPMmaN6rbVyd%2FScreenshot%202020-05-05%20at%2017.31.21.png?alt=media\&token=35f53b18-7ab4-4b02-87bb-9006d9ddf440)

As Testground matures, these dashboards are likely to change.

### Cluster-wide resources utilisation

You can view aggregated resource usage across the whole cluster on the `USE Method / Cluster` dashboard.

![](https://2269362904-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M3Ca3OQSAlIMUyorSwi%2F-M6_XASS1g1vxiTm74oo%2F-M6___Vih6uaRGBL_pCX%2FScreenshot%202020-05-05%20at%2018.05.39.png?alt=media\&token=a154f525-0fad-4964-9e64-be3f35290061)

### Worker node resources utilisation

You can view CPU, memory, network, and disk utilisation per node on the `USE Method / Node` dashboard.

![](https://2269362904-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M3Ca3OQSAlIMUyorSwi%2F-M6_RYdHHk4yOR4uQV8z%2F-M6_Te_3DmZQhlDm50xz%2FScreenshot%202020-05-05%20at%2017.34.53.png?alt=media\&token=4139ca10-5cf8-4c54-820e-818d5b4f066c)

## Application / Test run monitoring

To understand what your `test run` is doing, you can use Grafana to view some of the metrics it emits to InfluxDB while it is running, such as:

* Life-cycle events
* Diagnostics (e.g. Go runtime metrics)

![](https://2269362904-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M3Ca3OQSAlIMUyorSwi%2F-M6_RYdHHk4yOR4uQV8z%2F-M6_UjYDnQ4uUcffupK1%2FScreenshot%202020-05-05%20at%2017.39.53.png?alt=media\&token=2e12781a-dffe-45e3-aac8-23ab0eb7aa34)
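Because these metrics are stored in InfluxDB, they can also be inspected outside Grafana. The following is a rough sketch, assuming an InfluxDB 1.x HTTP API reachable on `localhost:8086` (for example through a port-forward of your own) and a database called `testground`; both the address and the database name are assumptions, so adjust them to your deployment. The helper name `influx_query` is hypothetical.

```shell
# Assumptions: InfluxDB 1.x is reachable on localhost:8086 and the
# database name "testground" is a placeholder for your own setup.
INFLUX_URL="${INFLUX_URL:-http://localhost:8086}"
INFLUX_DB="${INFLUX_DB:-testground}"

# Run an InfluxQL statement through the 1.x /query endpoint
# and print the JSON reply. Hypothetical helper: requires curl.
influx_query() {
  curl --silent --get "${INFLUX_URL}/query" \
    --data-urlencode "db=${INFLUX_DB}" \
    --data-urlencode "q=$1"
}

# Usage: list the measurements a test run has written so far.
#   influx_query "SHOW MEASUREMENTS"
```

Listing the measurements first is a convenient way to discover which life-cycle and diagnostics series exist before building a query or a Grafana panel.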

## Redis monitoring

Redis is an integral part of the `sync service`, which provides synchronisation and coordination between test plan instances. You can check its utilisation on the `Redis` dashboard.

![](https://2269362904-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M3Ca3OQSAlIMUyorSwi%2F-M6_XASS1g1vxiTm74oo%2F-M6__xAzxteRkE7yrAb3%2FScreenshot%202020-05-05%20at%2018.03.09.png?alt=media\&token=48712a5b-f8e0-4133-a836-f20ce3d2e47f)
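For a quick look at Redis without opening the dashboard, you can read its `INFO` output directly. This is a sketch under assumptions: the pod name must be looked up in your own cluster (for example with `kubectl get pods`), and `redis-cli` is assumed to be available inside the Redis container. The helper names `redis_info` and `redis_info_field` are hypothetical.

```shell
# Print one section of the Redis INFO output from inside the Redis pod.
# $1: pod name (an assumption, look it up in your cluster)
# $2: INFO section, defaults to "memory"
redis_info() {
  local pod="$1" section="${2:-memory}"
  kubectl exec "$pod" -- redis-cli INFO "$section"
}

# Extract a single field (e.g. used_memory_human) from INFO output on stdin.
# INFO lines have the form "key:value" and end in CRLF, so strip "\r" first.
redis_info_field() {
  tr -d '\r' | awk -F: -v key="$1" '$1 == key { print $2 }'
}

# Usage:
#   redis_info <redis-pod-name> memory | redis_info_field used_memory_human
```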

## WeaveNet monitoring

WeaveNet is used for the `data` plane in Testground: all test plan instances communicate with each other over WeaveNet. You can check network usage statistics on the `WeaveNet` dashboards.

![](https://2269362904-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M3Ca3OQSAlIMUyorSwi%2F-M6_XASS1g1vxiTm74oo%2F-M6_aAMei_u22P6dGpEC%2FScreenshot%202020-05-05%20at%2018.02.42.png?alt=media\&token=90e163d4-5690-47f2-8361-920d9a22085c)

