Session Groups and Classification

How to classify clients into session groups and use them in routing

ESB3024 Router provides a flexible classification engine that assigns clients to session groups, which can then be used as the basis for routing decisions.

Session Groups

A session group is a subset of clients satisfying certain conditions. Define session groups that reflect your environment and objectives.

A client can be part of multiple session groups.

Classification

The session_groups field of the configuration contains a list of session groups, each with its classification criteria specified in its classifiers field.

Each session group consists of the following fields:

  • id (number) - ID of the session group, must be unique among all session groups
  • name (string) - name of the session group
  • classifiers (array of arrays of objects) - list of the session group’s evaluation groups, each of which is a list of classifiers

Classification Logic

Session groups support combining classifiers using AND and OR semantics. This is achieved by nesting classifiers within evaluation groups, as seen in the pseudo-JSON below:

{
  "session_groups": [
    {
      "id": 1,
      "name": "example_AND_vs_OR",
      "classifiers": [
        [ // All classifiers within evaluation group are AND evaluated
          {
            // classifier 1
          },
          {
            // classifier 2
          }
        ],
        // Separate evaluation groups are OR evaluated
        [
          {
            // classifier 3
          }
        ]
      ]
    }
  ]
}

This particular session group defines an evaluation group containing classifier 1 and classifier 2. Both classifiers have to evaluate to true for the whole evaluation group to evaluate to true.

This session group also defines another evaluation group containing classifier 3. Separate evaluation groups are OR evaluated, meaning that either the first or the second evaluation group needs to evaluate to true for a client to be assigned to the session group.

Logically, this session group’s classification criterion can be visualized as
( classifier 1 AND classifier 2 ) OR classifier 3.
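
The AND/OR semantics described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the router's implementation: the function matches_session_group, the predicate-style classifiers and the session dictionary keys are all hypothetical names chosen for the example.

```python
from typing import Callable, Dict, List

# A classifier is modelled as a predicate over a dict of session attributes.
Classifier = Callable[[Dict[str, str]], bool]


def matches_session_group(
    evaluation_groups: List[List[Classifier]], session: Dict[str, str]
) -> bool:
    """True if all classifiers in at least one evaluation group are true."""
    return any(
        all(classifier(session) for classifier in group)  # AND within a group
        for group in evaluation_groups  # OR between groups
    )


# Hypothetical stand-ins for classifier 1, classifier 2 and classifier 3.
classifier_1 = lambda s: "live" in s["path"]
classifier_2 = lambda s: "subtitle=eng" in s["query"]
classifier_3 = lambda s: s["hostname"] == "mycdn.com"

groups = [[classifier_1, classifier_2], [classifier_3]]

both = {"path": "/live/ch1", "query": "subtitle=eng", "hostname": "other.example"}
only_first = {"path": "/live/ch1", "query": "", "hostname": "other.example"}
only_third = {"path": "/vod/a", "query": "", "hostname": "mycdn.com"}

results = [matches_session_group(groups, s) for s in (both, only_first, only_third)]
print(results)  # [True, False, True]
```

The second session fails because only one of the two AND-combined classifiers is true and classifier 3 does not match either.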

Classifier Structure

Each classifier consists of the following fields:

  • id (number): ID of the classifier, must be unique within the same session group
  • name (string): name of the classifier
  • inverted (boolean): whether the classifier’s result is inverted (negated) before the evaluation group is evaluated
  • rule (object): the classifier logic
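
As a rough illustration of the inverted field, the sketch below negates a rule's raw result when inverted is true. The helper name apply_inverted is hypothetical and the code only mirrors the behaviour described in the list above.

```python
def apply_inverted(rule_result: bool, inverted: bool) -> bool:
    """Return the classifier result, negated when `inverted` is true."""
    # Comparing two booleans for inequality is a logical XOR.
    return rule_result != inverted


# A client that does NOT match the rule is classified when inverted is true.
print(apply_inverted(rule_result=False, inverted=True))   # True
# A client that matches the rule is rejected when inverted is true.
print(apply_inverted(rule_result=True, inverted=True))    # False
# With inverted set to false the rule result is used unchanged.
print(apply_inverted(rule_result=True, inverted=False))   # True
```

An inverted classifier is therefore a way to express "every client except those matched by this rule".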

The rule object contains the fields that define the classifier’s behaviour. Two fields are mandatory. The first, rule_type, defines the classification method and must be one of string_match_rule, geoip_rule, asn_ids_rule, regex_rule, subnet_rule and ip_ranges_rule. The second, source, defines the data on which the classification is evaluated and must be one of session/content_url_path, session/content_url_query_params, session/user_agent, session/client_ip or session/hostname. The remaining fields depend on the rule type.

Examples of all rule types and sources can be seen in the JSON below.

{
  "session_groups": [
    {
      "id": 1,
      "name": "example_string_match_rule_live_and_subtitle",
      "classifiers": [
        [
          {
            "id": 1,
            "inverted": false,
            "name": "live_wildcard_match",
            "rule": {
              "rule_type": "string_match_rule",
              "source": "session/content_url_path",
              "pattern": "*live*"
            }
          },
          {
            "id": 2,
            "inverted": false,
            "name": "subtitle_wildcard_match",
            "rule": {
              "rule_type": "string_match_rule",
              "source": "session/content_url_query_params",
              "pattern": "*subtitle=eng*"
            }
          }
        ]
      ]
    },
    {
      "id": 2,
      "name": "example_string_match_rule_vod_or_hostname",
      "classifiers": [
        [
          {
            "id": 1,
            "inverted": false,
            "name": "vod_wildcard_match",
            "rule": {
              "rule_type": "string_match_rule",
              "source": "session/content_url_path",
              "pattern": "*vod*"
            }
          }
        ],
        [
          {
            "id": 3,
            "inverted": false,
            "name": "hostname_match",
            "rule": {
              "rule_type": "string_match_rule",
              "source": "session/hostname",
              "pattern": "mycdn.com"
            }
          }
        ]
      ]
    },
    {
      "id": 3,
      "name": "example_string_match_rule_ipv4",
      "classifiers": [
        [
          {
            "id": 1,
            "inverted": false,
            "name": "ipv4_wildcard_match",
            "rule": {
              "rule_type": "string_match_rule",
              "source": "session/client_ip",
              "pattern": "*.*"
            }
          }
        ]
      ]
    },
    {
      "id": 4,
      "name": "example_string_match_rule_ipv6",
      "classifiers": [
        [
          {
            "id": 1,
            "inverted": false,
            "name": "ipv6_wildcard_match",
            "rule": {
              "rule_type": "string_match_rule",
              "source": "session/client_ip",
              "pattern": "*:*"
            }
          }
        ]
      ]
    },
    {
      "id": 5,
      "name": "example_geo_location",
      "classifiers": [
        [
          {
            "id": 1,
            "inverted": false,
            "name": "geo_location_specific",
            "rule": {
              "rule_type": "geoip_rule",
              "source": "session/client_ip",
              "continent": "Europe",
              "country": "Norway",
              "region": "Innlandet",
              "cities": ["Elverum"],
              "geoname_id": 3144096,
              "asn": "Telenor Norge AS"
            }
          }
        ]
      ]
    },
    {
      "id": 6,
      "name": "example_geo_location_asn_wildcard",
      "classifiers": [
        [
          {
            "id": 1,
            "inverted": false,
            "name": "asn_wildcard_match",
            "rule": {
              "rule_type": "geoip_rule",
              "source": "session/client_ip",
              "asn": "Telia*"
            }
          }
        ]
      ]
    },
    {
      "id": 7,
      "name": "example_regex",
      "classifiers": [
        [
          {
            "id": 1,
            "inverted": false,
            "name": "ios_version_5.1.1",
            "rule": {
              "rule_type": "regex_rule",
              "source": "session/user_agent",
              "pattern": "/OS ((\\d+_?){2,3})\\s/"
            }
          }
        ]
      ]
    },
    {
      "id": 8,
      "name": "example_ip_ranges",
      "classifiers": [
        [
          {
            "id": 1,
            "inverted": true,
            "name": "offload_no_peering_subnet",
            "rule": {
              "rule_type": "ip_ranges_rule",
              "source": "session/client_ip",
              "ip_ranges": ["158.174.0.0/16", "95.192.0.0/12"]
            }
          }
        ]
      ]
    },
    {
      "id": 9,
      "name": "example_subnets",
      "classifiers": [
        [
          {
            "id": 1,
            "inverted": false,
            "name": "subnets_match",
            "rule": {
              "rule_type": "subnet_rule",
              "source": "session/client_ip",
              "pattern": "stock*"
            }
          }
        ]
      ]
    },
    {
      "id": 10,
      "name": "example_asn_list",
      "classifiers": [
        [
          {
            "id": 1,
            "inverted": false,
            "name": "asn_list_match",
            "rule": {
              "rule_type": "asn_ids_rule",
              "source": "session/client_ip",
              "asn_ids": [1, 2, 3]
            }
          }
        ]
      ]
    }
  ]
}

As seen in "pattern": "/OS ((\\d+_?){2,3})\\s/", backslash characters must always be escaped in JSON. If a tool is used to generate the JSON, the escaping is normally done automatically and must not be added manually as well.
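
The following sketch shows the automatic escaping using Python's standard json module as an example of such a tool. The regular expression is written with single backslashes, and the encoder doubles them when producing the JSON text. The rule dictionary simply reuses the field names from the example above.

```python
import json

# The regular expression as the regex engine should see it (single backslashes).
regex = r"/OS ((\d+_?){2,3})\s/"

rule = {
    "rule_type": "regex_rule",
    "source": "session/user_agent",
    "pattern": regex,
}

# json.dumps escapes each backslash, so the output contains \\d and \\s.
encoded = json.dumps(rule, indent=2)
print(encoded)

# Decoding the JSON text gives back the original, unescaped pattern.
decoded_pattern = json.loads(encoded)["pattern"]
print(decoded_pattern == regex)  # True
```

Escaping the backslashes by hand before passing the pattern to such a tool would double them twice and change the meaning of the regular expression.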

The rule types work as follows:

  • string_match_rule - flexible string matching on any source
    Tip: Supports wildcard matching
  • regex_rule - regex matching on the source, based on the C++ regex implementation
    Tip: For basic IPv4 vs. IPv6 differentiation of client IP addresses, a simple string_match_rule on session/client_ip is more efficient and maintainable than a more expressive regex_rule (which may come with the additional benefit of validating the format of the matched field).
  • ip_ranges_rule - a list of IP ranges in CIDR notation, of the form ‘0.0.0.0/24’. Any client with an IP address within any of the specified ranges is matched by the classifier
  • subnet_rule - string matching of the pattern against all subnets that the client’s IP address belongs to. Only supports the source session/client_ip. If the client’s IP address belongs to multiple subnets, only one of them needs to match the pattern.
    Tip: Supports wildcard matching
  • geoip_rule - geolocation-based matching. The client’s IP address is looked up in the MaxMind GeoIP2 database to find its geolocation. Only supports the source session/client_ip. The possible fields within this rule are
    • continent (string) - the desired continent
    • country (string) - the desired country
    • region (string) - the desired region
    • cities (array of strings) - the desired cities
      • at least one city must match for successful classification
    • asn (string) - the desired ASN
      Tip: Supports wildcard matching
    • geoname_id (number) - the desired MaxMind geoname ID
      Note: Using geoname_id requires insight into a MaxMind GeoIP2 database to find correct IDs

All fields within the GeoIP rule are optional. For successful classification, all defined fields must match.
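
The matching behaviour of the wildcard, ip_ranges_rule and geoip_rule descriptions above can be approximated as follows. This is a sketch under stated assumptions rather than the router's implementation: the wildcard is assumed to be a case-sensitive '*' that matches any run of characters against the whole value, the location dictionary is a hypothetical stand-in for the result of a GeoIP lookup, and all function names are invented for the example.

```python
import ipaddress
import re
from typing import Any, Dict, List


def wildcard_match(pattern: str, value: str) -> bool:
    """Assumed semantics: '*' matches any run of characters, the rest is literal."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, value) is not None


def ip_ranges_match(ip_ranges: List[str], client_ip: str) -> bool:
    """True if the client IP lies within any of the CIDR ranges."""
    ip = ipaddress.ip_address(client_ip)
    return any(ip in ipaddress.ip_network(cidr) for cidr in ip_ranges)


def geoip_match(rule: Dict[str, Any], location: Dict[str, Any]) -> bool:
    """Every field defined in the rule must match; undefined fields are ignored."""
    for field in ("continent", "country", "region", "geoname_id"):
        if field in rule and rule[field] != location.get(field):
            return False
    # At least one of the listed cities must match.
    if "cities" in rule and location.get("city") not in rule["cities"]:
        return False
    # The asn field supports wildcard matching.
    if "asn" in rule and not wildcard_match(rule["asn"], location.get("asn", "")):
        return False
    return True


# Hypothetical lookup result for a client IP address.
location = {
    "continent": "Europe",
    "country": "Norway",
    "region": "Innlandet",
    "city": "Elverum",
    "geoname_id": 3144096,
    "asn": "Telenor Norge AS",
}

print(geoip_match({"country": "Norway", "cities": ["Elverum", "Hamar"]}, location))  # True
print(geoip_match({"country": "Norway", "asn": "Telia*"}, location))  # False
print(ip_ranges_match(["158.174.0.0/16", "95.192.0.0/12"], "95.200.1.1"))  # True
print(wildcard_match("*:*", "2001:db8::1"))  # True
```

The second geoip_match call fails even though the country matches, because every field defined in the rule has to match.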

If a client is successfully classified, it will be labeled as part of that session group. The session group can then be accessed in Lua contexts, e.g.

{
  "hosts": [
    {
      "id": "geo-location-host",
      "cdn_id": "basic-cdn",
      "host": "geo-location-host.example"
    },
    {
      "id": "ip-ranges-host",
      "cdn_id": "basic-cdn",
      "host": "ip-ranges-host.example"
    }
  ],
  "routing": {
    "id": "session_group_routing",
    "member_order": "sequential",
    "members": [
      {
        "id": "geo_location_session_group",
        "host_id": "geo-location-host",
        "weight_function": "return session_groups.example_geo_location and 1 or 0"
      },
      {
        "id": "ip_ranges_session_group",
        "host_id": "ip-ranges-host",
        "weight_function": "return session_groups.example_ip_ranges and 1 or 0"
      }
    ]
  }
}
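
The Lua expression session_groups.<name> and 1 or 0 used in the weight functions evaluates to 1 when the client belongs to the named session group and to 0 otherwise. A Python equivalent is sketched below. The session_groups dictionary and the member_weight helper are hypothetical, and the sketch says nothing about how the router interprets the resulting weights.

```python
from typing import Dict


def member_weight(session_groups: Dict[str, bool], group_name: str) -> int:
    """Python equivalent of Lua's `session_groups.<name> and 1 or 0`."""
    # A missing key plays the role of a missing (nil) entry in the Lua table.
    return 1 if session_groups.get(group_name) else 0


# A client that was classified into example_geo_location only.
client_groups = {"example_geo_location": True}

print(member_weight(client_groups, "example_geo_location"))  # 1
print(member_weight(client_groups, "example_ip_ranges"))     # 0
```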