Monitoring

1: Access logging
2: System troubleshooting
3: Scraping data with Prometheus
4: Visualizing data with Grafana

4.1: Managing Grafana
4.2: Grafana Dashboards

5: Alarms and Alerting
6: Monitoring multiple routers
7: Routing Rule Evaluation Metrics
8: Metrics

8.1: Internal Metrics

1 - Access logging

Where to find access logs and how to configure access log rotation

Access logging is enabled by default. It can be turned on or off by running:

$ confcli services.routing.tuning.general.accessLog true
$ confcli services.routing.tuning.general.accessLog false

Requests are logged in the combined log format and can be found at /var/log/acd-router/access.log. Additionally, the symbolic link /opt/edgeware/acd/router/log points to /var/log/acd-router, allowing the access logs to also be found at /opt/edgeware/acd/router/log/access.log.

Example Output

$ cat /var/log/acd-router/access.log
May 29 07:20:00 router[52236]: ::1 - - [29/May/2023:07:20:00 +0000] "GET /vod/batman.m3u8 HTTP/1.1" 302 0 "-" "curl/7.61.1"
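For scripted analysis, a line like the one above can be split into its fields. The following is a minimal sketch, assuming every line has the combined log format with an optional syslog-style prefix as in the example output; the function name `parse_access_line` and the field names are illustrative and not part of ESB3024.

```python
import re

# Combined log format, optionally preceded by the syslog-style prefix seen in
# the example output ("May 29 07:20:00 router[52236]: ").
ACCESS_LINE = re.compile(
    r'^(?:(?P<prefix>\w{3}\s+\d+ [\d:]{8} \S+): )?'
    r'(?P<client>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def parse_access_line(line):
    """Return a dict of fields for one access log line, or None if it does not match."""
    match = ACCESS_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields


example = ('May 29 07:20:00 router[52236]: ::1 - - [29/May/2023:07:20:00 +0000] '
           '"GET /vod/batman.m3u8 HTTP/1.1" 302 0 "-" "curl/7.61.1"')
entry = parse_access_line(example)
print(entry["client"], entry["request"], entry["status"])
```

Lines that do not match the pattern yield `None`, so a caller can count or skip them instead of failing.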

Access Log Rotation

Access logs are rotated and compressed once the access log file reaches a size of 100 MB. By default, the 10 most recent rotated logs are kept and older ones are deleted. These rotation parameters can be reconfigured by editing the lines

size 100M
rotate 10

in /etc/logrotate.d/acd-router-access-log. For more log rotation configuration possibilities, refer to the Logrotate documentation.

2 - System troubleshooting

Using ew-sysinfo to monitor and troubleshoot ESB3024

ESB3024 contains the tool ew-sysinfo, which gives an overview of how the system is doing. Run the command without arguments to print information about the system and the installed ESB3024 services.

The output format can be changed with the --format flag. Possible values are human (default) and json, for example:

$ ew-sysinfo
system:
   os: ['5.4.17-2136.321.4.el8uek.x86_64', 'Oracle Linux Server 8.8']
   cpu_cores: 2
   cpu_load_average: [0.03, 0.03, 0.0]
   memory_usage: 478 MB
   memory_load_average: [0.03, 0.03, 0.0]
   boot_time: 2023-09-08T08:30:57Z
   uptime: 6 days, 3:43:44.640665
   processes: 122
   open_sockets:
      ipv4: 12
      ipv6: 18
      ip_total: 30
      tcp_over_ipv4: 9
      tcp_over_ipv6: 16
      tcp_total: 25
      udp_over_ipv4: 3
      udp_over_ipv6: 2
      udp_total: 5
      total: 145
system_disk (/):
   total: 33271 MB
   used: 7978 MB (24.00%)
   free: 25293 MB
journal_disk (/run/log/journal):
   total: 1954 MB
   used: 217 MB (11.10%)
   free: 1736 MB
vulnerabilities:
   meltdown: Mitigation: PTI
   spectre_v1: Mitigation: usercopy/swapgs barriers and __user pointer sanitization
   spectre_v2: Mitigation: Retpolines, STIBP: disabled, RSB filling, PBRSB-eIBRS: Not affected
processes:
   orc-re:
      pid: 177199
      status: sleeping
      cpu_usage_percent: 1.0%
      cpu_load_average: 131.11%
      memory_usage: 14 MB (0.38%)
      num_threads: 10
hints:
   get_raw_router_config: cat /opt/edgeware/acd/router/cache/config.json
   get_confd_config: cat /opt/edgeware/acd/confd/store/__active
   get_router_logs: journalctl -u acd-router
   get_edns_proxy_logs: journalctl -u acd-edns-proxy
   check_firewall_status: systemctl status firewalld
   check_firewall_config: iptables -nvL

# For --format=json, it's recommended to pipe the output to a JSON interpreter
# such as jq

$ ew-sysinfo --format=json | jq
{
  "system": {
    "os": [
      "5.4.17-2136.321.4.el8uek.x86_64",
      "Oracle Linux Server 8.8"
    ],
    "cpu_cores": 2,
    "cpu_load_average": [
      0.01,
      0.0,
      0.0
    ],
    "memory_usage": "479 MB",
    "memory_load_average": [
      0.01,
      0.0,
      0.0
    ],
    "boot_time": "2023-09-08 08:30:57",
    "uptime": "6 days, 5:12:24.617114",
    "processes": 123,
    "open_sockets": {
      "ipv4": 13,
      "ipv6": 18,
      "ip_total": 31,
      "tcp_over_ipv4": 10,
      "tcp_over_ipv6": 16,
      "tcp_total": 26,
      "udp_over_ipv4": 3,
      "udp_over_ipv6": 2,
      "udp_total": 5,
      "total": 146
    }
  },
  "system_disk (/)": {
    "total": "33271 MB",
    "used": "7977 MB (24.00%)",
    "free": "25293 MB"
  },
  "journal_disk (/run/log/journal)": {
    "total": "1954 MB",
    "used": "225 MB (11.50%)",
    "free": "1728 MB"
  },
  "vulnerabilities": {
    "meltdown": "Mitigation: PTI",
    "spectre_v1": "Mitigation: usercopy/swapgs barriers and __user pointer sanitization",
    "spectre_v2": "Mitigation: Retpolines, STIBP: disabled, RSB filling, PBRSB-eIBRS: Not affected"
  },
  "processes": {
    "orc-re": {
      "pid": 177199,
      "status": "sleeping",
      "cpu_usage_percent": "0.0%",
      "cpu_load_average": "137.63%",
      "memory_usage": "14 MB (0.38%)",
      "num_threads": 10
    }
  }
}

Note that your system might have different monitored processes and field names.

The hints field differs from the rest: it lists common commands that can be used to monitor the system further, which is useful when quickly troubleshooting a faulty system.
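The JSON output lends itself to simple automated checks. The sketch below assumes the field layout shown in the example above, where disk sections carry a `used` string such as `7977 MB (24.00%)`, and flags sections at or above a usage threshold. The function name and the threshold values are illustrative choices, and field names may differ on your system.

```python
import json
import re

# Matches the percentage inside strings such as "7977 MB (24.00%)".
USAGE = re.compile(r"\((?P<percent>[\d.]+)%\)")


def disk_usage_percent(sysinfo, threshold=80.0):
    """Return {section: percent} for sections whose `used` value is at or above `threshold`."""
    flagged = {}
    for section, values in sysinfo.items():
        if not isinstance(values, dict) or "used" not in values:
            continue
        match = USAGE.search(str(values["used"]))
        if match and float(match.group("percent")) >= threshold:
            flagged[section] = float(match.group("percent"))
    return flagged


# Excerpt of the example output above; in practice this would be the text
# produced by `ew-sysinfo --format=json`.
sample = json.loads("""
{
  "system": {"cpu_cores": 2},
  "system_disk (/)": {"total": "33271 MB", "used": "7977 MB (24.00%)", "free": "25293 MB"},
  "journal_disk (/run/log/journal)": {"total": "1954 MB", "used": "225 MB (11.50%)", "free": "1728 MB"}
}
""")
print(disk_usage_percent(sample, threshold=20.0))
```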

3 - Scraping data with Prometheus

Prometheus is a third-party data scraper which is installed as a containerized service in the default installation of ESB3024 Router. It periodically reads metrics data from different services, such as acd-router, aggregates it and makes it available to other services that visualize the data. Those services include Grafana and Alertmanager.

The Prometheus configuration file can be found on the host at /opt/edgeware/acd/prometheus/prometheus.yaml.

Accessing Prometheus

Prometheus has a web interface that is listening for HTTP connections on port 9090. There is no authentication, so anyone who has access to the host that is running Prometheus can access the interface.
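Besides the web interface, Prometheus serves an HTTP API on the same port, with instant queries under `/api/v1/query`. The following sketch builds such a query URL and reads the result. The host name `localhost` is a placeholder, and which metric names are available depends on what Prometheus scrapes in your deployment.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def instant_query_url(host, expression, port=9090):
    """Build the URL for a Prometheus instant query."""
    return f"http://{host}:{port}/api/v1/query?" + urlencode({"query": expression})


def instant_query(host, expression, port=9090, timeout=5):
    """Run an instant query and return the list of result series."""
    with urlopen(instant_query_url(host, expression, port), timeout=timeout) as response:
        payload = json.load(response)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload}")
    return payload["data"]["result"]


# Only prints the URL; call instant_query() on a host that runs Prometheus.
print(instant_query_url("localhost", 'num_sessions{state="active"}'))
```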

Starting / Stopping Prometheus

After the service is configured, it can be managed via systemd, under the service unit acd-prometheus.

systemctl start acd-prometheus

Logging

The container logs are automatically published to the system journal, under the same unit descriptor, and can be viewed using journalctl:

journalctl -u acd-prometheus

4 - Visualizing data with Grafana

4.1 - Managing Grafana

Grafana displays graphs based on data from Prometheus. A default deployment of Grafana is running in a container alongside ESB3024 Router.

Grafana’s configuration and runtime files are stored under /opt/edgeware/acd/grafana. It comes with default dashboards that are documented at Grafana dashboards.

Accessing Grafana

Grafana’s web interface is listening for HTTP connections on port 3000. It has two default accounts, edgeware and admin.

The edgeware account can only view graphs, while the admin account can also edit graphs. The accounts with default passwords are shown in the table below.

Account	Default password
`edgeware`	`edgeware`
`admin`	`edgeware`

Starting / Stopping Grafana

Grafana can be managed via systemd, under the service unit acd-grafana.

systemctl start acd-grafana

Logging

The container logs are automatically published to the system journal, under the same unit descriptor, and can be viewed using journalctl:

journalctl -u acd-grafana

4.2 - Grafana Dashboards

Dashboards in default Grafana installation

Grafana will be populated with pre-configured graphs which present some metrics on a time scale. Below is a comprehensive list of those dashboards, along with short descriptions.

Router Monitoring dashboard

This dashboard is set as the home dashboard by default, so it is the first thing a user sees after logging in.

Number Of Initial Routing Decisions

HTTP Status Codes

Total number of responses sent back to incoming requests, shown by their status codes. Metric: client_response_status

Incoming HTTP and HTTPS Requests

Total number of incoming requests that were deemed valid, divided into SSL and Unencrypted categories. Metric: num_valid_http_requests

Debugging Information dashboard

Number of Lua Exceptions

Number of exceptions encountered so far while evaluating Lua rules. Metric: lua_num_errors

Number of Lua Contexts

Number of active Lua interpreters, both running and idle. Metric: lua_num_evaluators

Time Spent In Lua

Number of microseconds the Lua interpreters were running. Metric: lua_time_spent

Router Latencies

Histogram-like graph showing how many responses were sent within the given latency interval. Metric: orc_latency_bucket

Internal debugging

A folder that contains dashboards intended for internal use.

ACD: Incoming Internet Connections dashboard

SSL Warnings

Rate of warnings logged during TLS connections. Metric: num_ssl_warnings_total

SSL Errors

Rate of errors logged during TLS connections. Metric: num_ssl_errors_total

Valid Internet HTTPS Requests

Rate of incoming requests that were deemed valid, HTTPS only. Metric: num_valid_http_requests

Invalid Internet HTTPS Requests

Rate of incoming requests that were deemed invalid, HTTPS only. Metric: num_invalid_http_requests

Valid Internet HTTP Requests

Rate of incoming requests that were deemed valid, HTTP only. Metric: num_valid_http_requests

Invalid Internet HTTP Requests

Rate of incoming requests that were deemed invalid, HTTP only. Metric: num_invalid_http_requests

Prometheus: ACD dashboard

Logged Warnings

Rate of logged warnings since the router has started, divided into CDN-related and CDN-unrelated. Metric: num_log_warnings_total

Logged Errors

Rate of logged errors since the router has started. Metric: num_log_errors_total

HTTP Requests

Rate of responses sent to incoming connections. Metric: orc_latency_count

Number Of Active Sessions

Number of sessions opened on the router that are still active. Metric: num_sessions

Total Number Of Sessions

Total number of sessions opened on the router. Metric: num_sessions

Session Type Counts (Non-Stacked)

Number of active sessions divided by type; see metric documentation linked below for up-to-date list of types. Metric: num_sessions

Prometheus/ACD: Subrunners

Client Connections

Number of currently open client connections per subrunner. Metric: subrunner_client_conns

Asynchronous Queues (Current)

Number of queued events per subrunner, roughly corresponding to load. Metric: subrunner_async_queue

Used <Send/receive> Data Blocks

Number of send or receive data blocks currently in use per subrunner, as decided by the “Send/receive” drop down box. Metric: subrunner_used_send_data_blocks and subrunner_used_receive_data_blocks

Asynchronous Queues (Max)

Maximum number of events waiting in queue. Metric: subrunner_max_async_queue

Total <Send/receive> Data Blocks

Number of send or receive data blocks allocated per subrunner, as decided by the “Send/receive” drop down box. Metric: subrunner_total_send_data_blocks and subrunner_total_receive_data_blocks

Low Queue (Current)

Number of low priority events queued per subrunner. Metric: subrunner_low_queue

Medium Queue (Current)

Number of medium priority events queued per subrunner. Metric: subrunner_medium_queue

High Queue (Current)

Number of high priority events queued per subrunner. Metric: subrunner_high_queue

Low Queue (Max)

Maximum number of events waiting in low priority queue. Metric: subrunner_max_low_queue

Medium Queue (Max)

Maximum number of events waiting in medium priority queue. Metric: subrunner_max_medium_queue

High Queue (Max)

Maximum number of events waiting in high priority queue. Metric: subrunner_max_high_queue

Wakeups

The number of times a subrunner has been woken up from sleep. Metric: subrunner_io_wakeups

Overloaded

The number of times the number of queued events for a subrunner exceeded its maximum. Metric: subrunner_times_worker_overloaded

Autopause

Number of sockets that have been automatically paused. This happens when the work manager is under heavy load. Metric: subrunner_io_autopause_sockets

5 - Alarms and Alerting

Configuring alarms and alerting

Alerts are generated by the third-party service Prometheus, which sends them to the Alertmanager service. A default containerized instance of Alertmanager is deployed alongside ESB3024 Router. Out of the box, Alertmanager ships with only a sample configuration file and must be configured manually before alerting can be enabled. Because there are many possible ways to detect alerts and to decide where they are pushed, follow the official Alertmanager documentation when configuring the service.

The router ships with Alertmanager 0.25, the documentation for which can be found at prometheus.io. The Alertmanager configuration file can be found on the host at /opt/edgeware/acd/alertmanager/alertmanager.yml.

Accessing Alertmanager

Alertmanager has a web interface that is listening for HTTP connections on port 9093. There is no authentication, so anyone who has access to the host that is running Alertmanager can access the interface.

Starting / Stopping Alertmanager

After the service is configured, it can be managed via systemd, under the service unit acd-alertmanager.

systemctl start acd-alertmanager

Logging

The container logs are automatically published to the system journal, under the same unit descriptor, and can be viewed using journalctl:

journalctl -u acd-alertmanager

6 - Monitoring multiple routers

By default, an instance of Prometheus only monitors the ESB3024 Router installed on the same host as Prometheus itself. It can also be configured to monitor other router instances, so that all instances are visualized in a single Grafana instance.

Configuring Prometheus

This is configured in the scraping configuration of Prometheus, found in the file /opt/edgeware/acd/prometheus/prometheus.yaml. The file typically looks like this:

global:
  scrape_interval:     15s

rule_files:
  - recording-rules.yaml

# A scrape configuration for router metrics
scrape_configs:
  - job_name: 'router-scraper'
    scheme: https
    tls_config:
      insecure_skip_verify: true
    static_configs:
    - targets:
      - acd-router-1:5001
    metrics_path: /m1/v1/metrics
    honor_timestamps: true
  - job_name: 'edns-proxy-scraper'
    scheme: http
    static_configs:
    - targets:
      - acd-router-1:8888
    metrics_path: /metrics
    honor_timestamps: true

More routers can be added to the scrape configuration by simply adding more routers under targets in the scraper jobs.

For instance, to monitor acd-router-2 and acd-router-3 alongside acd-router-1, the configuration file needs to be modified like this:

global:
  scrape_interval:     15s

rule_files:
  - recording-rules.yaml

# A scrape configuration for router metrics
scrape_configs:
  - job_name: 'router-scraper'
    scheme: https
    tls_config:
      insecure_skip_verify: true
    static_configs:
    - targets:
      - acd-router-1:5001
      - acd-router-2:5001
      - acd-router-3:5001
    metrics_path: /m1/v1/metrics
    honor_timestamps: true
  - job_name: 'edns-proxy-scraper'
    scheme: http
    static_configs:
    - targets:
      - acd-router-1:8888
      - acd-router-2:8888
      - acd-router-3:8888
    metrics_path: /metrics
    honor_timestamps: true

After the file has been modified, Prometheus needs to be restarted by running:

systemctl restart acd-prometheus

It is possible to use the same configuration on multiple routers, so that all routers in a deployment can monitor each other.
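When the list of routers is long, the `targets` entries for both scrape jobs can be generated rather than typed by hand. This is a small sketch that reuses the job names and ports from the configuration above; the host names are the ones from the example, and the output is meant to be pasted into prometheus.yaml manually.

```python
# Ports taken from the scrape configuration shown above.
ROUTER_METRICS_PORT = 5001
EDNS_PROXY_METRICS_PORT = 8888


def scrape_targets(hosts):
    """Return the `targets` lists for both scrape jobs, one entry per router host."""
    return {
        "router-scraper": [f"{host}:{ROUTER_METRICS_PORT}" for host in hosts],
        "edns-proxy-scraper": [f"{host}:{EDNS_PROXY_METRICS_PORT}" for host in hosts],
    }


targets = scrape_targets(["acd-router-1", "acd-router-2", "acd-router-3"])
for job, entries in targets.items():
    print(f"{job}:")
    for entry in entries:
        # Indentation matches the `targets` items in the YAML above.
        print(f"      - {entry}")
```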

Selecting Router in Grafana

In the top left corner, the Grafana dashboards have a drop-down menu labeled “ACD Router”, which lets you choose which router to monitor.

7 - Routing Rule Evaluation Metrics

Node Visit counters

ESB3024 Router counts the number of times a node or any of its children is selected in the routing table.

The visit counters can be retrieved with the following endpoints:

`/v1/node_visits`

Returns visit counters for each node as a flat list of host:counter pairs in JSON.

Example output:

{
  "node1": "1",
  "node2": "1",
  "node3": "1",
  "top": "3"
}
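Note that the counters in this response are JSON strings, not numbers. A minimal sketch of converting them to integers and ordering the nodes by visit count follows; the function name is illustrative, and the input is the example output above.

```python
import json


def parse_node_visits(body):
    """Convert a /v1/node_visits response into {node_id: int}, highest count first."""
    counters = {node: int(count) for node, count in json.loads(body).items()}
    # Sort by descending count, then by node id for a stable order.
    return dict(sorted(counters.items(), key=lambda item: (-item[1], item[0])))


body = '{"node1": "1", "node2": "1", "node3": "1", "top": "3"}'
for node, visits in parse_node_visits(body).items():
    print(f"{node}: {visits}")
```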

`/v1/node_visits_graph`

Returns a full graph of nodes with their respective visit counters in GraphML.

Example output:

<?xml version="1.0"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns
http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
  <key id="visits" for="node" attr.name="visits" attr.type="string" />
  <graph id="G" edgedefault="directed">
    <node id="routing_table">
      <data key="visits">5</data>
    </node>
    <node id="cdn1">
      <data key="visits">1</data>
    </node>
    <node id="node1">
      <data key="visits">1</data>
    </node>
    <node id="cdn2">
      <data key="visits">2</data>
    </node>
    <node id="node2">
      <data key="visits">2</data>
    </node>
    <node id="cdn3">
      <data key="visits">2</data>
    </node>
    <node id="node3">
      <data key="visits">2</data>
    </node>
    <edge id="e0" source="cdn1" target="node1" />
    <edge id="e1" source="routing_table" target="cdn1" />
    <edge id="e2" source="cdn2" target="node2" />
    <edge id="e3" source="routing_table" target="cdn2" />
    <edge id="e4" source="cdn3" target="node3" />
    <edge id="e5" source="routing_table" target="cdn3" />
  </graph>
</graphml>

To receive the graph as JSON, specify Accept:application/json in the request headers.

Example output:

{
  "edges": [
    {
      "source": "cdn1",
      "target": "node1"
    },
    {
      "source": "routing_table",
      "target": "cdn1"
    },
    {
      "source": "cdn2",
      "target": "node2"
    },
    {
      "source": "routing_table",
      "target": "cdn2"
    },
    {
      "source": "cdn3",
      "target": "node3"
    },
    {
      "source": "routing_table",
      "target": "cdn3"
    }
  ],
  "nodes": [
    {
      "id": "routing_table",
      "visits": "5"
    },
    {
      "id": "cdn1",
      "visits": "1"
    },
    {
      "id": "node1",
      "visits": "1"
    },
    {
      "id": "cdn2",
      "visits": "2"
    },
    {
      "id": "node2",
      "visits": "2"
    },
    {
      "id": "cdn3",
      "visits": "2"
    },
    {
      "id": "node3",
      "visits": "2"
    }
  ]
}
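The JSON form of the graph is easy to turn into an indented tree for quick inspection. The sketch below assumes the `edges` and `nodes` layout of the example above and uses `routing_table` as the root; the sample data is a shortened excerpt of that example, and the function name is illustrative.

```python
def visit_tree(graph, root="routing_table"):
    """Return indented lines describing the graph depth-first from `root`."""
    visits = {node["id"]: int(node["visits"]) for node in graph["nodes"]}
    children = {}
    for edge in graph["edges"]:
        children.setdefault(edge["source"], []).append(edge["target"])

    lines = []

    def walk(node_id, depth, seen):
        lines.append(f"{'  ' * depth}{node_id}: {visits.get(node_id, 0)}")
        for child in children.get(node_id, []):
            if child not in seen:  # guard against cycles
                walk(child, depth + 1, seen | {child})

    walk(root, 0, {root})
    return lines


# Shortened excerpt of the example output above.
graph = {
    "edges": [
        {"source": "cdn1", "target": "node1"},
        {"source": "routing_table", "target": "cdn1"},
        {"source": "cdn2", "target": "node2"},
        {"source": "routing_table", "target": "cdn2"},
    ],
    "nodes": [
        {"id": "routing_table", "visits": "5"},
        {"id": "cdn1", "visits": "1"},
        {"id": "node1", "visits": "1"},
        {"id": "cdn2", "visits": "2"},
        {"id": "node2", "visits": "2"},
    ],
}
print("\n".join(visit_tree(graph)))
```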

Resetting Visit Counters

A node visit counter with an id not matching any node id of a newly applied routing table is destroyed.

Reset all counters to zero by momentarily applying a configuration with a placeholder routing root node that has a unique id and an empty members list, e.g.:

"routing": {
  "id": "empty_routing_table",
  "members": []
}

… and immediately reapply the desired configuration.
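If the reset is scripted, the placeholder can be generated with a fresh id each time so that it never matches an existing node. This is only a sketch of building the fragment: the id prefix and the random suffix are illustrative choices, and how the configuration is applied is deliberately left out.

```python
import json
import uuid


def placeholder_routing(prefix="empty_routing_table"):
    """Build a routing fragment with a unique root id and no members."""
    # The random suffix is an illustrative way to keep the id unique.
    unique_id = f"{prefix}_{uuid.uuid4().hex[:8]}"
    return {"routing": {"id": unique_id, "members": []}}


print(json.dumps(placeholder_routing(), indent=2))
```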

8 - Metrics

Metrics endpoint

ESB3024 Router collects a large number of metrics that can give insight into its condition at runtime. Those metrics are available in Prometheus’ text-based exposition format at endpoint :5001/m1/v1/metrics.
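The text-based format has one sample per line, in the form `name{label="value",...} value`, with comment lines starting with `#`. The following is a simplified parser sketch for ad hoc inspection; it does not handle label values that contain `}` and ignores optional timestamps. The sample text is hypothetical and only reuses metric names from this page.

```python
import re

SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?'
    r'\s+(?P<value>\S+)'
)
LABEL = re.compile(r'(?P<key>[a-zA-Z_][a-zA-Z0-9_]*)="(?P<value>(?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition-format text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = SAMPLE_LINE.match(line)
        if match is None:
            continue
        labels = {m.group("key"): m.group("value")
                  for m in LABEL.finditer(match.group("labels") or "")}
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


# Hypothetical scrape output, for illustration only.
sample_text = """
# TYPE num_sessions gauge
num_sessions{state="active",type="initial"} 4
lua_num_errors 0
"""
for name, labels, value in parse_metrics(sample_text):
    print(name, labels, value)
```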

Below is the description of these metrics along with their labels.

`client_response_status`

Number of responses sent back to incoming requests.

Type: counter

`lua_num_errors`

Number of errors encountered when evaluating Lua rules.

Type: counter

`lua_num_evaluators`

Number of Lua rules evaluators (active interpreters).

Type: gauge

`lua_time_spent`

Time spent by running Lua evaluators, in microseconds.

Type: counter

`num_configuration_changes`

Number of times configuration has been changed since the router has started.

Type: counter

`num_endpoint_requests`

Number of requests redirected per CDN endpoint.

Type: counter
Labels:
- endpoint - CDN endpoint address.
- selector - whether the request was counted during initial or instream selection.

`num_invalid_http_requests`

Number of client requests that use either the wrong method or the wrong URL path, plus all requests that cannot be parsed as HTTP.

Type: counter
Labels:
- source - name of internal filter function that classified request as invalid. Probably not of much use outside debugging.
- type - whether the request was HTTP (Unencrypted) or HTTPS (SSL).

`num_log_errors_total`

Number of logged errors since the router has started.

Type: counter

`num_log_warnings_total`

Number of logged warnings since the router has started.

Type: counter
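Counters such as num_log_errors_total and num_log_warnings_total are usually read as a rate rather than as a raw total. The sketch below shows the idea on (timestamp, value) samples and treats any drop in value as a counter reset, for example after a router restart. It is a simplification for illustration and not how Prometheus computes `rate()` in detail; the sample numbers are made up.

```python
def counter_rate(samples):
    """Average per-second increase over (timestamp_seconds, value) counter samples."""
    if len(samples) < 2:
        return 0.0
    increase = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        # A lower value than before means the counter restarted from zero.
        increase += current - previous if current >= previous else current
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else 0.0


# Three scrapes, 15 seconds apart (made-up values).
print(counter_rate([(0, 10), (15, 13), (30, 19)]))
```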

`num_managed_redirects`

Number of redirects to the router itself, which allows session management.

Type: counter

`num_manifests`

Number of cached manifests.

Type: gauge
Labels:
- count - state of manifest in cache, can be either lru, evicted or total.

`num_qoe_losses`

Number of “lost” QoE decisions per CDN.

Type: counter
Labels:
- cdn_id - ID of CDN that lost the QoE battle.
- cdn_name - name of CDN that lost the QoE battle.
- selector - whether the decision was taken during initial or instream selection.

`num_qoe_wins`

Number of “won” QoE decisions per CDN.

Type: counter
Labels:
- cdn_id - ID of CDN that won QoE battle.
- cdn_name - name of CDN that won QoE battle.
- selector - whether the decision was taken during initial or instream selection.

`num_rejected_requests`

Deprecated, should always be at 0.

Type: counter
Labels:
- selector - whether the request was counted during initial or instream selection.

`num_requests`

Total number of requests received by the router.

Type: counter
Labels:
- selector - whether the request was counted during initial or instream selection.

`num_sessions`

Number of sessions opened on the router.

Type: gauge
Labels:
- state - either active or inactive.
- type - one of: initial, instream, qoe_on, qoe_off, qoe_agent or sp_agent.

`num_ssl_errors_total`

Number of all errors logged during TLS connections, both incoming and outgoing.

Type: counter

`num_ssl_warnings_total`

Number of all warnings logged during TLS connections, both incoming and outgoing.

Type: counter
Labels:
- category - which kind of TLS connection triggered the warning. Can be one of: cdn, content, generic, repeated_session or empty.

`num_unhandled_requests`

Number of requests for which no CDN could be found.

Type: counter
Labels:
- selector - whether the request was counted during initial or instream selection.

`num_unmanaged_redirects`

Number of redirects to “outside” the router - usually to CDN.

Type: counter
Labels:
- cdn_id - ID of CDN picked for redirection.
- cdn_name - name of CDN picked for redirection.
- selector - whether the redirect was result of initial or instream selection.

`num_valid_http_requests`

Number of received requests that were not deemed invalid, see num_invalid_http_requests.

Type: counter
Labels:
- source - name of internal filter function that classified request as invalid. Probably not of much use outside debugging.
- type - whether the request was HTTP (Unencrypted) or HTTPS (SSL).

`orc_latency_bucket`

Total number of responses sorted into “latency buckets” - labels denoting latency interval.

Type: counter
Labels:
- le - latency bucket that given response falls into.
- orc_status_code - HTTP status code of given response.
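In Prometheus histograms the `le` buckets are cumulative: each bucket counts all responses at or below its upper bound. Assuming orc_latency_bucket follows that convention, the per-interval counts used in a histogram-like graph can be recovered by subtracting neighbouring buckets. The bucket bounds and counts below are made up for illustration.

```python
def per_interval_counts(cumulative):
    """Turn cumulative {le: count} buckets into (lower, upper, count) intervals."""
    ordered = sorted(cumulative.items(), key=lambda item: float(item[0]))
    result = []
    lower, previous = 0.0, 0
    for le, count in ordered:
        upper = float(le)  # float("+Inf") gives infinity
        result.append((lower, upper, count - previous))
        lower, previous = upper, count
    return result


# Hypothetical bucket values for one status code.
for lower, upper, count in per_interval_counts({"0.001": 5, "0.01": 8, "+Inf": 9}):
    print(f"{lower} < latency <= {upper}: {count}")
```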

`orc_latency_count`

Total number of responses.

Type: counter
Labels:
- tls - whether the response was sent via SSL/TLS connection or not.
- orc_status_code - HTTP status code of given response.

`ssl_certificate_days_remaining`

Number of days until an SSL certificate expires.

Type: gauge
Labels:
- domain - the common name of the domain that the certificate authenticates.
- not_valid_after - the expiry time of the certificate.
- not_valid_before - when the certificate starts being valid.
- usable - if the certificate is usable to the router, see the ssl_certificate_usable_count metric for an explanation.
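This metric is a natural basis for an expiry check. The sketch below filters already parsed samples, given as (name, labels, value) tuples, for certificates with fewer than a chosen number of days left. The 30-day default, the function name and the sample domains are illustrative assumptions.

```python
def expiring_certificates(samples, min_days=30):
    """Return sorted (domain, days_remaining) pairs for certificates below `min_days`."""
    return sorted(
        (labels.get("domain", ""), value)
        for name, labels, value in samples
        if name == "ssl_certificate_days_remaining" and value < min_days
    )


# Hypothetical samples; real ones come from the metrics endpoint.
samples = [
    ("ssl_certificate_days_remaining", {"domain": "router.example.com"}, 12.0),
    ("ssl_certificate_days_remaining", {"domain": "cdn.example.com"}, 200.0),
    ("ssl_certificate_usable_count", {}, 2.0),
]
print(expiring_certificates(samples))
```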

`ssl_certificate_usable_count`

Number of usable SSL certificates. A certificate is usable if it is valid and authenticates a domain name that points to the router.

Type: gauge

8.1 - Internal Metrics

Internal Metrics

A subrunner is an internal module of ESB3024 Router which handles routing requests. The subrunner metrics are technical and mainly of interest to AgileTV. They are described briefly here.

`subrunner_async_queue`

Number of queued events per subrunner, roughly corresponding to load.

Type: gauge
Labels:
- subrunner_id - ID of given subrunner.

`subrunner_client_conns`

Number of currently open client connections per subrunner.

Type: gauge
Labels:
- subrunner_id - ID of given subrunner.

`subrunner_high_queue`

Number of high priority events queued per subrunner.

Type: gauge
Labels:
- subrunner_id - ID of given subrunner.

`subrunner_io_autopause_sockets`

Number of sockets that have been automatically paused. This happens when the work manager is under heavy load.

Type: counter
Labels:
- subrunner_id - ID of given subrunner.

`subrunner_io_send_data_fast_attempts`

A fast data path was added that in many cases increases the performance of the router. This metric was added to verify that the fast data path is taken.

Type: counter
Labels:
- subrunner_id - ID of given subrunner.

`subrunner_io_wakeups`

The number of times a subrunner has been woken up from sleep.

Type: counter
Labels:
- subrunner_id - ID of given subrunner.

`subrunner_low_queue`

Number of low priority events queued per subrunner.

Type: gauge
Labels:
- subrunner_id - ID of given subrunner.

`subrunner_max_async_queue`

Maximum number of events waiting in queue.

Type: gauge
Labels:
- subrunner_id - ID of given subrunner.

`subrunner_max_high_queue`

Maximum number of events waiting in high priority queue.

Type: gauge
Labels:
- subrunner_id - ID of given subrunner.

`subrunner_max_low_queue`

Maximum number of events waiting in low priority queue.

Type: gauge
Labels:
- subrunner_id - ID of given subrunner.

`subrunner_max_medium_queue`

Maximum number of events waiting in medium priority queue.

Type: gauge
Labels:
- subrunner_id - ID of given subrunner.

`subrunner_medium_queue`

Number of medium priority events queued per subrunner.

Type: gauge
Labels:
- subrunner_id - ID of given subrunner.

`subrunner_times_worker_overloaded`

Number of times when queued events for given subrunner exceeded the tuning.overload_threshold value (defaults to 32).

Type: counter
Labels:
- subrunner_id - ID of given subrunner.

`subrunner_total_receive_data_blocks`

Number of receive data blocks allocated per subrunner.

Type: gauge
Labels:
- subrunner_id - ID of given subrunner.

`subrunner_total_send_data_blocks`

Number of send data blocks allocated per subrunner.

Type: gauge
Labels:
- subrunner_id - ID of given subrunner.

`subrunner_used_receive_data_blocks`

Number of receive data blocks currently in use per subrunner. Same as subrunner_total_receive_data_blocks.

Type: gauge
Labels:
- subrunner_id - ID of given subrunner.

`subrunner_used_send_data_blocks`

Number of send data blocks currently in use per subrunner. Same as subrunner_total_send_data_blocks.

Type: gauge
Labels:
- subrunner_id - ID of given subrunner.