Visualizing data with Grafana
1 - Managing Grafana
Grafana displays graphs based on data from Prometheus. A default deployment of Grafana is running in a container alongside ESB3024 Router.
Grafana’s configuration and runtime files are stored under /opt/edgeware/acd/grafana. It comes with default dashboards, which are documented in the Grafana Dashboards section.
Accessing Grafana
Grafana’s web interface listens for HTTP connections on port 3000. It has two default accounts, edgeware and admin. The edgeware account can only view graphs, while the admin account can also edit graphs. The accounts and their default passwords are shown in the table below.
| Account | Default password |
|---|---|
| edgeware | edgeware |
| admin | edgeware |
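As a quick reachability check, the sketch below queries Grafana's health endpoint and then lists the dashboards visible to the view-only account. It assumes that Grafana is reachable as localhost, that the default password is still in place, that HTTP basic authentication is enabled in Grafana, and that curl is installed. The GRAFANA_HOST variable and the helper function names are illustrative only and are not part of the product.

```shell
# Assumption: Grafana runs on this host; set GRAFANA_HOST to override.
GRAFANA_HOST="${GRAFANA_HOST:-localhost}"
GRAFANA_PORT=3000   # HTTP port of the Grafana web interface

# Hypothetical helper: build a URL for a path on the Grafana web interface.
grafana_url() {
    printf 'http://%s:%s%s\n' "$GRAFANA_HOST" "$GRAFANA_PORT" "$1"
}

# Hypothetical helper: succeed if Grafana answers on its health endpoint.
grafana_is_up() {
    curl --silent --fail "$(grafana_url /api/health)" >/dev/null
}

# Hypothetical helper: list dashboards as JSON using the view-only account.
grafana_list_dashboards() {
    curl --silent --fail --user edgeware:edgeware \
        "$(grafana_url '/api/search?type=dash-db')"
}
```

Running `grafana_is_up && grafana_list_dashboards` prints the dashboard list only when the web interface responds.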
Starting / Stopping Grafana
Grafana can be managed via systemd, under the service unit acd-grafana. To start it:
systemctl start acd-grafana
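The unit accepts the other standard systemctl actions as well, such as stop, restart and status. The sketch below wraps them in a small helper; grafana_service is a hypothetical name used only for illustration, and the underlying commands typically need root privileges.

```shell
# Hypothetical helper: run a systemctl action against the Grafana unit.
# Typically needs root privileges (run as root or through sudo).
grafana_service() {
    case "$1" in
        start|stop|restart|status|is-active)
            systemctl "$1" acd-grafana
            ;;
        *)
            echo "usage: grafana_service start|stop|restart|status|is-active" >&2
            return 2
            ;;
    esac
}
```

For example, `grafana_service stop` runs `systemctl stop acd-grafana`, and `grafana_service status` shows whether the unit is currently running.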
Logging
The container logs are automatically published to the system journal under the same unit name, and can be viewed using journalctl:
journalctl -u acd-grafana
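When troubleshooting, it is often handier to limit the output to a recent time window or to follow the log live. The helpers below are a minimal sketch that uses the standard journalctl options for this; the function names are hypothetical.

```shell
# Hypothetical helper: print Grafana log entries newer than the given time.
# The argument uses journalctl's --since syntax, e.g. "today" or "1 hour ago".
grafana_logs_since() {
    journalctl --unit acd-grafana --since "${1:-1 hour ago}" --no-pager
}

# Hypothetical helper: follow the Grafana log as new entries arrive.
grafana_logs_follow() {
    journalctl --unit acd-grafana --follow
}
```

For example, `grafana_logs_since today` prints everything logged since midnight, while `grafana_logs_follow` keeps running until interrupted with Ctrl-C.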
2 - Grafana Dashboards
Grafana is populated with pre-configured dashboards whose graphs present metrics over time. The dashboards and their graphs are listed below, along with short descriptions.
Router Monitoring dashboard
This dashboard is set as the home dashboard by default; it is what a user sees after logging in.
Number Of Initial Routing Decisions
HTTP Status Codes
Total number of responses sent back to incoming requests, shown by their status codes. Metric: client-response-status
Incoming HTTP and HTTPS Requests
Total number of incoming requests that were deemed valid, divided into SSL and Unencrypted categories. Metric: num_valid_http_requests
Debugging Information dashboard
Number of Lua Exceptions
Number of exceptions encountered so far while evaluating Lua rules. Metric: lua_num_errors
Number of Lua Contexts
Number of active Lua interpreters, both running and idle. Metric: lua_num_evaluators
Time Spent In Lua
Number of microseconds the Lua interpreters were running. Metric: lua_time_spent
Router Latencies
Histogram-like graph showing how many responses were sent within the given latency interval. Metric: orc_latency_bucket
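The same latency data can be inspected outside Grafana by querying Prometheus directly. The sketch below builds a quantile query and sends it to the Prometheus HTTP API. It assumes that orc_latency_bucket is a standard Prometheus histogram with an le label, and that Prometheus is reachable at http://localhost:9090 (its usual default, which this document does not state). The PROMETHEUS_URL variable and the function names are hypothetical.

```shell
# Assumption: Prometheus listens on its default port on this host.
PROMETHEUS_URL="${PROMETHEUS_URL:-http://localhost:9090}"

# Build a PromQL expression for a latency quantile.
# $1: quantile between 0 and 1, $2: rate window (default 5m).
latency_quantile_query() {
    printf 'histogram_quantile(%s, sum by (le) (rate(orc_latency_bucket[%s])))\n' \
        "$1" "${2:-5m}"
}

# Run an instant query against the Prometheus HTTP API and print the JSON reply.
prom_query() {
    curl --silent --fail --get \
        --data-urlencode "query=$1" \
        "$PROMETHEUS_URL/api/v1/query"
}
```

For example, `prom_query "$(latency_quantile_query 0.95)"` asks for an estimate of the 95th percentile latency over the last five minutes.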
Internal debugging
A folder that contains dashboards intended for internal use.
ACD: Incoming Internet Connections dashboard
SSL Warnings
Rate of warnings logged during TLS connections. Metric: num_ssl_warnings_total
SSL Errors
Rate of errors logged during TLS connections. Metric: num_ssl_errors_total
Valid Internet HTTPS Requests
Rate of incoming requests that were deemed valid, HTTPS only. Metric: num_valid_http_requests
Invalid Internet HTTPS Requests
Rate of incoming requests that were deemed invalid, HTTPS only. Metric: num_invalid_http_requests
Valid Internet HTTP Requests
Rate of incoming requests that were deemed valid, HTTP only. Metric: num_valid_http_requests
Invalid Internet HTTP Requests
Rate of incoming requests that were deemed invalid, HTTP only. Metric: num_invalid_http_requests
Prometheus: ACD dashboard
Logged Warnings
Rate of logged warnings since the router has started, divided into CDN-related and CDN-unrelated. Metric: num_log_warnings_total
Logged Errors
Rate of logged errors since the router has started. Metric: num_log_errors_total
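Graphs described as a rate are typically derived from a counter with the PromQL rate() function. The helper below is a minimal sketch that builds such an expression for any counter; whether the shipped panels use exactly this expression and window is an assumption, and counter_rate_query is a hypothetical name.

```shell
# Build a PromQL expression for the per-second rate of a counter.
# $1: counter metric name, $2: rate window (default 5m).
counter_rate_query() {
    printf 'rate(%s[%s])\n' "$1" "${2:-5m}"
}

counter_rate_query num_log_errors_total        # prints rate(num_log_errors_total[5m])
counter_rate_query num_log_warnings_total 1m   # prints rate(num_log_warnings_total[1m])
```

The printed expression can be pasted into Grafana's Explore view or the Prometheus expression browser.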
HTTP Requests
Rate of responses sent to incoming connections. Metric: orc_latency_count
Number Of Active Sessions
Number of sessions opened on the router that are still active. Metric: num_sessions
Total Number Of Sessions
Total number of sessions opened on the router. Metric: num_sessions
Session Type Counts (Non-Stacked)
Number of active sessions divided by type; see the metric documentation linked below for an up-to-date list of types. Metric: num_sessions
Prometheus/ACD: Subrunners
Client Connections
Number of currently open client connections per subrunner. Metric: subrunner_client_conns
Asynchronous Queues (Current)
Number of queued events per subrunner, roughly corresponding to load. Metric: subrunner_async_queue
Used <Send/receive> Data Blocks
Number of send or receive data blocks currently in use per subrunner, as selected in the “Send/receive” drop-down box. Metrics: subrunner_used_send_data_blocks and subrunner_used_receive_data_blocks
Asynchronous Queues (Max)
Maximum number of events waiting in queue. Metric: subrunner_max_async_queue
Total <Send/receive> Data Blocks
Number of send or receive data blocks allocated per subrunner, as selected in the “Send/receive” drop-down box. Metrics: subrunner_total_send_data_blocks and subrunner_total_receive_data_blocks
Low Queue (Current)
Number of low priority events queued per subrunner. Metric: subrunner_low_queue
Medium Queue (Current)
Number of medium priority events queued per subrunner. Metric: subrunner_medium_queue
High Queue (Current)
Number of high priority events queued per subrunner. Metric: subrunner_high_queue
Low Queue (Max)
Maximum number of events waiting in the low priority queue. Metric: subrunner_max_low_queue
Medium Queue (Max)
Maximum number of events waiting in the medium priority queue. Metric: subrunner_max_medium_queue
High Queue (Max)
Maximum number of events waiting in the high priority queue. Metric: subrunner_max_high_queue
Wakeups
The number of times a subrunner has been woken up from sleep. Metric: subrunner_io_wakeups
Overloaded
The number of times the number of queued events for a subrunner exceeded its maximum. Metric: subrunner_times_worker_overloaded
Autopause
Number of sockets that have been automatically paused. This happens when the work manager is under heavy load. Metric: subrunner_io_autopause_sockets