
Tips for Deployments

Andre Nathan edited this page Aug 11, 2021 · 25 revisions

Introduction

These are notes on a Gatekeeper deployment consisting of one Gatekeeper server and two Grantor servers. They assume Ubuntu 20.04 servers with Gatekeeper installed via packages.

This small deployment is meant to help new users get started with Gatekeeper, so they can evaluate it, write their policy, and incrementally grow their deployment from this first step.

The network topology is shown below, where the Gatekeeper server has its front port connected to a data center uplink and its back port connected to a router. The router works as a gateway for a number of servers which provide services to the Internet via the external network, while the internal network is used for administrative purposes. The Gatekeeper server uses a patched version of the BIRD routing daemon to establish a full-routing BGP session with the uplink provider and an iBGP session with the router. The Grantor servers have their front port connected to the external network. Grantor servers do not have a back port configuration in Gatekeeper, and the internal network link is used solely for administrator access.

                                external network
                    +-------------------+-----------+------------+
                    |                   |           | front      | front
              +-----+------+       +----+---+  +----+----+  +----+----+
              |            |       |        |  |         |  |         |
uplink -------+ gatekeeper +-------+ router |  | grantor |  | grantor |
        front |            | back  |        |  |         |  |         |
              +-----+------+       +----+---+  +----+----+  +----+----+
                    |                   |           |            |
                    +-------------------+-----------+------------+
                                internal network

Gatekeeper front IPv4: 10.1.0.1/30
Gatekeeper front IPv6: 2001:db8:1::1/126

Gatekeeper back IPv4: 10.2.0.1/30
Gatekeeper back IPv6: fd00:2::1/126

Router IPv4 on Gatekeeper link: 10.2.0.2/30
Router IPv6 on Gatekeeper link: fd00:2::2/126

External network IPv4 CIDR: 1.2.3.0/20
External network IPv6 CIDR: 2001:db8:123::/48

Grantor front IPv4: 1.2.3.4 and 1.2.3.5
Grantor front IPv6: 2001:db8:123::4 and 2001:db8:123::5

Basic configuration

These steps can be performed for both Gatekeeper and Grantor servers, with the caveat that Grantors only have a front port, so any references to the back port can be ignored.

  1. Set up huge pages

The Gatekeeper server in this deployment has 256 GB of RAM. We reserve 16 GB for the kernel and allocate the remaining 240 GB in 1 GB huge pages. To pass the appropriate command line parameters to the kernel, edit /etc/default/grub and set GRUB_CMDLINE_LINUX_DEFAULT, running update-grub afterwards.

GRUB_CMDLINE_LINUX_DEFAULT="default_hugepagesz=1G hugepagesz=1G hugepages=240"
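As a quick sanity check of the huge page budget (a sketch; the 256 GB and 16 GB figures are specific to this deployment, adjust them to your hardware):

```shell
# 256 GB of RAM minus 16 GB reserved for the kernel leaves 240 one-gigabyte
# huge pages, matching the GRUB_CMDLINE_LINUX_DEFAULT line above.
total_gb=256
kernel_reserved_gb=16
hugepages=$((total_gb - kernel_reserved_gb))
echo "default_hugepagesz=1G hugepagesz=1G hugepages=${hugepages}"
```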

  2. Rename front and back ports

It's useful to have friendly interface names on machines with many NICs. We're going to name the Gatekeeper front and back ports, appropriately, "front" and "back", using systemd link files. In the link file, it's important to specify a Match section option that doesn't cause the kernel to rename the interface back once Gatekeeper has taken control of it.

For this deployment, we have used the PCI addresses of the interfaces, which can be obtained via udevadm:

# udevadm info /sys/class/net/<front port name> | grep ID_PATH=
E: ID_PATH=pci-0000:01:00.0

# udevadm info /sys/class/net/<back port name> | grep ID_PATH=
E: ID_PATH=pci-0000:02:00.0

Create systemd link files for the front and back interfaces (the latter only in the Gatekeeper server) and run update-initramfs -u afterwards. An example using the output from the above udevadm commands is given below:

# /etc/systemd/network/10-front.link
[Match]
Property=ID_PATH=pci-0000:01:00.0
[Link]
Name=front

# /etc/systemd/network/10-back.link
[Match]
Property=ID_PATH=pci-0000:02:00.0
[Link]
Name=back

Once these two changes are in place, reboot the machine for them to take effect. It's also important to remember that DPDK won't take over an interface that is in the UP state, so it's advised to remove the front and back interfaces from the operating system's network configuration (e.g. /etc/network/interfaces in Ubuntu).

Gatekeeper server configuration

Environment variables

The first step is to edit the /etc/gatekeeper/envvars and set the GATEKEEPER_INTERFACES variable with the PCI addresses of the front and back interfaces:

GATEKEEPER_INTERFACES="01:00.0 02:00.0"

Main configuration

For the Gatekeeper server, set gatekeeper_server to true in /etc/gatekeeper/main_config.lua:

local gatekeeper_server = true

Gatekeeper is composed of multiple functional blocks, each one with its own Lua configuration script located in /etc/gatekeeper.

GK block: /etc/gatekeeper/gk.lua

In this file, we have set the following variables:

local log_level = staticlib.c.RTE_LOG_NOTICE
local flow_ht_size = 250000000
local max_num_ipv4_rules = 1600000
local num_ipv4_tbl8s = 1000
local max_num_ipv6_rules = 200000
local num_ipv6_tbl8s = 50000

To calculate these values, we first generated IPv4 and IPv6 routing table dumps from full-routing BGP sessions, creating, respectively, the ipv4-ranges and ipv6-ranges text files, each containing one CIDR per line. The max_num_ipv[46]_rules and num_ipv[46]_tbl8s variables are then set to round numbers that are approximately twice the values reported by the gtctl estimate command, as described in the gtctl project's README file:

$ gtctl estimate -4 ipv4-ranges
ipv4: rules=849892, tbl8s=451

$ gtctl estimate -6 ipv6-ranges
ipv6: rules=119871, tbl8s=30881

The flow_ht_size variable is set close to the largest value that still allows Gatekeeper to boot. The larger the flow table, the better Gatekeeper can deal with complex attacks, since it can keep state for more flows. To estimate how much memory a given value will consume, multiply flow_ht_size by the number of NUMA nodes, by two (the default number of GK block instances per NUMA node), and by 256 bytes. The Gatekeeper server in this deployment has two Intel Xeon processors, that is, two NUMA nodes, so our setting consumes at most 250000000 * 2 * 2 * 256 bytes ≈ 238 GiB. Since this value is an upper bound, actual consumption is lower. Finally, it is worth pointing out that this setup tracks up to 250000000 * 2 * 2 = 1 billion flows.
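The memory estimate above can be reproduced with simple shell arithmetic:

```shell
# Upper bound on flow table memory: flow_ht_size entries, times 2 NUMA
# nodes, times 2 GK block instances per node, times 256 bytes per entry.
flow_ht_size=250000000
bytes=$((flow_ht_size * 2 * 2 * 256))
echo "$((bytes / 1024 / 1024 / 1024)) GiB"
```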

Solicitor block: /etc/gatekeeper/sol.lua

By default, Gatekeeper limits the request bandwidth to 5% of the link capacity. In our deployment, the Gatekeeper server and router use 10 Gbps interfaces, but the external network runs on 1 Gbps Ethernet. With that default, 5% of the Gatekeeper link capacity would amount to 50% of the external network bandwidth, so we reduce the request bandwidth rate to 0.5% of the Gatekeeper link capacity:

local req_bw_rate = 0.005
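A quick check of the arithmetic behind this choice, using the link speeds described above:

```shell
# 0.5% of a 10 Gbps Gatekeeper link, in Mbps. The result, 50 Mbps, is 5% of
# the 1 Gbps external network, restoring the intended default proportion.
link_mbps=10000
req_bw_mbps=$((link_mbps * 5 / 1000))   # req_bw_rate = 0.005
echo "${req_bw_mbps} Mbps"
```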

Network block configuration: /etc/gatekeeper/net.lua

In this file, set the variables below according to your network setup. Examples have been given below for a front port named front and a back port named back. In this deployment, the front port belongs to a VLAN and uses LACP, so we set the appropriate VLAN tags for IPv4 and IPv6, and the bonding mode to staticlib.c.BONDING_MODE_8023AD. In our environment, the back port is not in a VLAN, nor does it use link aggregation. The back_mtu variable is set to a high value to account for IP-IP encapsulation in packets sent to the Grantor servers. Note that the router interface connected to the external network should have a matching MTU for packets sent from the Gatekeeper to the Grantor servers.

local user = "gatekeeper"

local front_ports = {"front"}
local front_ips = {"10.1.0.1/30", "2001:db8:1::1/126"}
local front_bonding_mode = staticlib.c.BONDING_MODE_8023AD
local front_ipv4_vlan_tag = 1234
local front_ipv6_vlan_tag = 1234
local front_vlan_insert = true
local front_mtu = 1500

local back_ports = {"back"}
local back_ips = {"10.2.0.1/30", "fd00:2::1/126"}
local back_bonding_mode = staticlib.c.BONDING_MODE_ROUND_ROBIN
local back_ipv4_vlan_tag = 0
local back_ipv6_vlan_tag = 0
local back_vlan_insert = false
local back_mtu = 2048

Other functional blocks

In the remaining Lua configuration files, we simply set the log_level variable. For production use, we specify the WARNING level:

local log_level = staticlib.c.RTE_LOG_WARNING

Configuring grantors in Gatekeeper

The Grantor servers must be configured using Gatekeeper's dynamic configuration mechanism.

As illustrated in the network topology description, the two Grantor servers have external IPv4 addresses 1.2.3.4 and 1.2.3.5 and external IPv6 addresses 2001:db8:123::4 and 2001:db8:123::5. The router's addresses in the interface connected to the Gatekeeper server's back port are 10.2.0.2 and fd00:2::2, and the external network IPv4 and IPv6 CIDR blocks are 1.2.3.0/20 and 2001:db8:123::/48, respectively.

Create the /etc/gatekeeper/grantors.lua file with the following script:

require "gatekeeper/staticlib"

local dyc = staticlib.c.get_dy_conf()

local addrs = {
  { gt_ip = '1.2.3.4', gw_ip = '10.2.0.2' },
  { gt_ip = '1.2.3.5', gw_ip = '10.2.0.2' },
}
dylib.add_grantor_entry_lb('1.2.3.0/20', addrs, dyc.gk)

local addrs = {
  { gt_ip = '2001:db8:123::4', gw_ip = 'fd00:2::2' },
  { gt_ip = '2001:db8:123::5', gw_ip = 'fd00:2::2' },
}
dylib.add_grantor_entry_lb('2001:db8:123::/48', addrs, dyc.gk)

return "gk: successfully configured grantors for 2 prefixes"

In other words, gt_ip corresponds to the public IP address associated to the Grantor server's front port, and gw_ip is the IP address in the router interface that is connected to the Gatekeeper server.

This script must be sent to Gatekeeper via the gkctl tool after Gatekeeper is started. The best way to do this is to configure a systemd override with an ExecStartPost command that runs gkctl, with a long enough timeout to account for the Gatekeeper startup delay. Run systemctl edit gatekeeper and insert the following content:

[Service]
ExecStartPost=/usr/sbin/gkctl -t 300 /etc/gatekeeper/grantors.lua
TimeoutStartSec=300

Start Gatekeeper

Simply start and enable Gatekeeper via systemd:

# systemctl start gatekeeper
# systemctl enable gatekeeper

Grantor server configuration

Main configuration

For the Grantor server, set gatekeeper_server to false in /etc/gatekeeper/main_config.lua:

local gatekeeper_server = false

GT block: /etc/gatekeeper/gt.lua

In this file, we have set the following variables:

local n_lcores = 2
local lua_policy_file = "policy.lua"
local lua_base_directory = "/etc/gatekeeper"

Network block configuration: /etc/gatekeeper/net.lua

For Grantor servers, the network configuration is analogous to the one for the Gatekeeper servers, with the exception that there's no back port when running Gatekeeper in Grantor mode.

Here we assume no link aggregation and no VLAN configuration.

local user = "gatekeeper"

local front_ports = {"front"}
local front_ips = {"1.2.3.4/20", "2001:db8:123::4/48"}
local front_bonding_mode = staticlib.c.BONDING_MODE_ROUND_ROBIN
local front_ipv4_vlan_tag = 0
local front_ipv6_vlan_tag = 0
local front_vlan_insert = false
local front_mtu = 1500

Other functional blocks

In the remaining Lua configuration files, we simply set the log_level variable. For production use, we specify the WARNING level:

local log_level = staticlib.c.RTE_LOG_WARNING

The policy script

The Grantor configuration in gt.lua points to a Lua policy script, a fundamental element of the Gatekeeper architecture. It is run by the Grantor server, and is responsible for deciding whether to grant or decline access to packet flows, as well as the maximum bandwidth for the granted flows and the duration of each decision.

In its simplest form, the policy script defines a single function called lookup_policy, which receives as arguments a packet information object, which allows policy decisions to be made based on layer 2, 3 and 4 header fields, and a policy object, which can be used to set bandwidth and duration limits to the policy decision. This function must return a boolean value to indicate whether the policy decision is to grant or decline the flow. In practice, we can use the decision_granted and decision_declined functions and their variations from the policylib Lua package to set the policy parameters and return the appropriate value. These functions have the following signatures:

function policylib.decision_granted(
  policy,          -- the policy object
  tx_rate_kib_sec, -- maximum bandwidth, in KiB/s
  cap_expire_sec,  -- policy decision (capability) duration, in seconds
  next_renewal_ms, -- how long until a renewal request is sent for this flow, in milliseconds
  renewal_step_ms  -- minimum interval between renewal requests, in milliseconds
)

function policylib.decision_declined(
  policy,    -- the policy object
  expire_sec -- policy decision (capability) duration, in seconds
)

As a practical example, we show below a policy script that is able to perform the following decisions:

  • Grant or decline flows depending on their source IPv4 addresses, based on labeled prefixes loaded from an external file;
  • Grant or decline TCP segments depending on their destination port;
  • Decline malformed packets;
  • Grant packets not matching the rules above, with limited bandwidth.

We start by requiring the libraries policylib from Gatekeeper and ffi from LuaJIT. Requiring policylib also gives us access to the lpmlib package, which contains functions to manipulate LPM (Longest Prefix Match) tables.

local policylib = require("gatekeeper/policylib")
local ffi = require("ffi")

Next, we define helper functions that represent our policy decisions. These functions take a policy argument, which has type struct ggu_policy, but which can be considered as an opaque object for our purposes, as it's simply forwarded to the functions policylib.decision_granted or policylib.decision_declined, described above.

-- Decline flows with malformed packets.
local function decline_malformed_packet(policy)
  return policylib.decision_declined(policy, 10)
end

-- Decline flows by policy decision.
local function decline(policy)
  return policylib.decision_declined(policy, 60)
end

-- Grant flow by policy decision.
local function grant(policy)
  return policylib.decision_granted(policy,
    2048,   -- tx_rate_kib_sec
    600,    -- cap_expire_sec
    540000, -- next_renewal_ms
    3000    -- renewal_step_ms
  )
end

-- Grant flow not matching any policy, with reduced bandwidth.
local function grant_unmatched(policy)
  return policylib.decision_granted(policy,
    1024,   -- tx_rate_kib_sec
    300,    -- cap_expire_sec
    240000, -- next_renewal_ms
    3000    -- renewal_step_ms
  )
end

We then define a Lua table that maps its indices to policy decisions. The indices in this table correspond to the label that is associated to a network prefix when inserted in an LPM table to be created below. Therefore, when inspecting a packet, we can perform a lookup for its source and/or destination IP addresses in this LPM table, using the returned label to obtain the function that will grant or decline this flow.

In the table below, flows from sources labeled 1 in the LPM table will be granted, whereas those labeled 2 will be declined.

local policy_decision_by_label = {
  [1] = grant,
  [2] = decline,
}

The policy script continues with the definition of the aforementioned LPM table, with the use of the helper function new_lpm_from_file. The fact that the src_lpm_ipv4 variable is global (i.e. its definition does not use the local keyword) is relevant, because it allows the LPM table to be accessed by other scripts. This is useful, for example, to update the LPM table, or to print it for inspection.

The new_lpm_from_file function, given below, assumes the input file is in a two-column format, where the first column is a network prefix in CIDR notation, and the second column is its label. The function uses the lpmlib package to create and populate the LPM table. Given the policy_decision_by_label table above, the input file should use label 1 for the prefixes we want to grant, and label 2 for the prefixes we want to decline.

function new_lpm_from_file(path)
  -- Find minimum values for num_rules and num_tbl8s.
  local num_rules = 0
  local num_tbl8s = 0

  local prefixes = {}
  for line in io.lines(path) do
    local prefix, label = string.match(line, "^(%S+)%s+(%d+)$")
    if not prefix or not label then
      error(path .. ": invalid line: " .. line)
    end
    -- Convert string in CIDR notation to IP address and prefix length.
    local ip_addr, prefix_len = lpmlib.str_to_prefix(prefix)
    num_rules = num_rules + 1
    num_tbl8s = num_tbl8s + lpmlib.lpm_add_tbl8s(ip_addr, prefix_len, prefixes)
  end

  -- Adjust parameters.
  local scaling_factor_rules = 2
  local scaling_factor_tbl8s = 2
  num_rules = math.max(1, scaling_factor_rules * num_rules)
  num_tbl8s = math.max(1, scaling_factor_tbl8s * num_tbl8s)

  -- Create and populate LPM table.
  local lpm = lpmlib.new_lpm(num_rules, num_tbl8s)
  for line in io.lines(path) do
    local prefix, label = string.match(line, "^(%S+)%s+(%d+)$")
    if not prefix or not label then
      error(path .. ": invalid line: " .. line)
    end
    -- Convert string in CIDR notation to IP address and prefix length.
    local ip_addr, prefix_len = lpmlib.str_to_prefix(prefix)
    lpmlib.lpm_add(lpm, ip_addr, prefix_len, tonumber(label))
  end

  return lpm
end

-- The function must be defined before this call, since the policy script
-- runs as a single Lua chunk.
src_lpm_ipv4 = new_lpm_from_file("/path/to/lpm/input/file")

We also define a Lua table that maps port numbers to policy decision functions. The purpose of this table is to provide an example of policy decisions which are not based on LPM table lookups.

The table below maps ports 80 and 443 to the grant function and port 23 to the decline function. The calls to policylib.c.gt_cpu_to_be_16 are necessary as the destination port is stored in network byte order (big endian) in the packet headers.

local policy_decision_by_destination_port = {
  [policylib.c.gt_cpu_to_be_16(23)]  = decline,
  [policylib.c.gt_cpu_to_be_16(80)]  = grant,
  [policylib.c.gt_cpu_to_be_16(443)] = grant,
}
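To illustrate the byte-order reasoning, the following standalone snippet (a sketch, independent of Gatekeeper) byte-swaps port 80 the way gt_cpu_to_be_16 would on a little-endian host:

```shell
# Port 80 is 0x0050 in host byte order on a little-endian machine; swapping
# its two bytes gives 0x5000 = 20480, the raw value stored in the dst_port
# field of the TCP header.
port=80
be=$(( ((port & 0xff) << 8) | ((port >> 8) & 0xff) ))
echo "$be"
```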

Finally, we implement the lookup_policy function. As described above, this is the entry point of the policy script, i.e., the function called by the Grantor server to obtain a policy decision for a given packet.

The function receives two arguments. The first is pkt_info, which is a gt_packet_headers struct, accessible from the policy script via the ffi module. These are the headers of the IP-in-IP encapsulated packet sent from Gatekeeper to Grantor. The second argument is policy, which we will simply pass along to the policy decision functions.

The lookup_policy function starts by checking if the inner packet is an IPv4 packet. In production we have IPv6-specific LPM tables and other policies, but for simplicity, in this example we will just apply the default policy for non-IPv4 traffic. The function then proceeds with an LPM policy lookup, which, if successful, will return a policy decision function that is then applied. Otherwise, the policy script attempts to obtain a policy by inspecting a TCP packet's destination port. These two steps are performed by the helper functions lookup_src_lpm_ipv4_policy and lookup_tcp_destination_port_policy, respectively, which are given below. Finally, if no policy is found, we apply the default policy decision function.

function lookup_policy(pkt_info, policy)
  if pkt_info.inner_ip_ver ~= policylib.c.IPV4 then
    return grant_unmatched(policy)
  end

  local fn = lookup_src_lpm_ipv4_policy(pkt_info)
  if fn then
    return fn(policy)
  end

  local fn = lookup_tcp_destination_port_policy(pkt_info)
  if fn then
    return fn(policy)
  end

  return grant_unmatched(policy)
end

The lookup_src_lpm_ipv4_policy function performs a lookup on the src_lpm_ipv4 table populated with network prefixes loaded from a file, as described above. We use the ffi.cast function to obtain an IPv4 header, so that we can access the packet's source IP address and look it up in the LPM table, with lpmlib.lpm_lookup. This function returns the matching label for the network prefix to which the flow's source address belongs, which will be used to obtain its associated policy decision function via the mapping in the policy_decision_by_label Lua table. Note that lpmlib.lpm_lookup returns a negative number if no match is found, and since the policy_decision_by_label table has no negative indices, the table lookup will return nil, and the lookup_policy function will proceed with the TCP destination port lookup.

function lookup_src_lpm_ipv4_policy(pkt_info)
  local ipv4_header = ffi.cast("struct rte_ipv4_hdr *", pkt_info.inner_l3_hdr)
  local label = lpmlib.lpm_lookup(src_lpm_ipv4, ipv4_header.src_addr)
  return policy_decision_by_label[label]
end

The TCP destination port policy lookup starts by checking if this is indeed a TCP segment, returning the default policy otherwise. We then check for malformed packets by ensuring the packet is large enough to accommodate the TCP headers. Finally, we use ffi.cast to obtain the TCP headers and use the destination port as a key to the policy decision lookup against the policy_decision_by_destination_port table.

function lookup_tcp_destination_port_policy(pkt_info)
  if pkt_info.l4_proto ~= policylib.c.TCP then
    return grant_unmatched
  end

  if pkt_info.upper_len < ffi.sizeof("struct rte_tcp_hdr") then
    return decline_malformed_packet
  end

  local tcp_header = ffi.cast("struct rte_tcp_hdr *", pkt_info.l4_hdr)
  return policy_decision_by_destination_port[tcp_header.dst_port]
end

Finally, we add two helper functions to the policy script. These functions are not used by the policy itself, but by the dynamic configuration scripts that keep the LPM table up to date. The add_src_v4_prefix function takes a prefix string in CIDR format and an integer label and inserts the prefix into the LPM table. The del_src_v4_prefix function takes a prefix string in the same format and removes it from the LPM table.

More details about dynamically updating the LPM table are given below.

function add_src_v4_prefix(prefix, label)
  local ip_addr, prefix_len = lpmlib.str_to_prefix(prefix)
  lpmlib.lpm_add(src_lpm_ipv4, ip_addr, prefix_len, label)
end

function del_src_v4_prefix(prefix)
  local ip_addr, prefix_len = lpmlib.str_to_prefix(prefix)
  lpmlib.lpm_del(src_lpm_ipv4, ip_addr, prefix_len)
end
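As an illustration of how these helpers might be exercised, a hypothetical one-off script (modeled on the grantors.lua script shown earlier; the prefix and return message are made up) could be sent through the dynamic configuration mechanism, e.g. with gkctl:

```lua
-- Hypothetical one-off dynamic configuration script. Label 2 maps to the
-- decline decision in this example policy.
require "gatekeeper/staticlib"

local dyc = staticlib.c.get_dy_conf()

local function add_one_prefix()
  add_src_v4_prefix("192.0.2.0/24", 2)
end

dylib.update_gt_lua_states_incrementally(dyc.gt, add_one_prefix, false)

return "gt: added 1 prefix"
```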

Updating LPM tables with Drib and gtctl

Fetching IP prefixes

The example policy script given above loads network prefixes and labels from a file. In practice, these prefixes are usually assembled from multiple online lists of unwanted source networks, such as Spamhaus' EDROP or Team Cymru's fullbogons, so that flows originating from these prefixes can be declined.

These online unwanted prefix lists are continuously updated and may contain intersecting network blocks, so it makes sense to use a tool designed to fetch, merge and label them automatically, generating a file that can be consumed by the policy script. The Drib tool was developed for this purpose.

This tool aggregates IP prefixes from configurable online and offline sources and allows each source to be labeled with its own "class", which is just an arbitrary string. Once the prefixes are aggregated, Drib can render a template, feeding it with the prefixes and their respective class. We use the source class configuration in Drib as the label to be associated with a prefix when inserted in the policy's LPM table.

Going back to the policy script, recall the definition of the policy_decision_by_label variable:

local policy_decision_by_label = {
  [1] = grant,
  [2] = decline,
}

This means prefixes labeled with 1 will be granted and those labeled with 2 will be declined. Below we show a Drib configuration file, /etc/drib/drib.yaml, that labels network blocks fetched from the EDROP and Bogons lists with a class value of 2. To make the example more complete, we also add a static network block labeled with a class value of 1 as an "office" network from which we always want to accept traffic.

Note that Drib supports specifying a group-scoped kind setting, which is a tag shared by all prefixes in a given group. We define the grant and decline groups, both with kind src, and use that in templates that will generate Lua scripts that manipulate the src_lpm_ipv4 LPM table.

log_level: "warn"

ipv4: {
  grant: {
    priority: 20,
    kind: "src",

    office: {
      range: "100.90.80.0/24",
      class: "1",
    },
  },

  decline: {
    priority: 10,
    kind: "src",

    edrop: {
      remote: {
        url: "https://www.spamhaus.org/drop/edrop.txt",
        check_interval: "12h",
        parser: {ranges: {one_per_line: {comment: ";"}}},
      },
      class: "2",
    },

    fullbogons: {
      remote: {
        url: "https://www.team-cymru.org/Services/Bogons/fullbogons-ipv4.txt",
        check_interval: "1d",
        parser: {ranges: {one_per_line: {comment: "#"}}},
      },
      class: "2",
    },
  },
}

Given this configuration, the following bootstrap template file, /etc/drib/bootstrap.tpl, is used to generate an input file in the format expected by the policy script, that is, a two-column file with a network prefix in CIDR format in the first column, and an integer label in the second one:

{% for entry in ranges -%}
{{entry.range}} {{entry.class}}
{% endfor -%}
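Given the drib.yaml above, the rendered file would contain one prefix and label per line, for example (the declined prefix shown here is hypothetical):

```
100.90.80.0/24 1
192.0.2.0/24 2
```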

A cron job is set up to run the drib aggregate command, which will download the EDROP and Bogon prefixes, merge them, exclude the office network range from the resulting set, and save a serialization of the result in what is called an aggregate file.
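For illustration, such a cron job could look like the entry below (the schedule and path are hypothetical; adjust them to your installation):

```
# /etc/cron.d/drib: refresh the Drib aggregate file every 6 hours
0 */6 * * * root /usr/sbin/drib aggregate
```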

We tie everything together by calling the drib bootstrap --no-download command in a systemd override ExecStartPre command. This makes Drib read an existing aggregate file (generated by the aforementioned cron job) and render the above template. When Gatekeeper runs in Grantor mode, it will run the policy script, which will then read the freshly rendered file with the set of prefixes obtained from Drib.

The systemd override can be created with the systemctl edit gatekeeper command on the Grantor servers. Add the following content to the override file:

[Service]
ExecStartPre=/usr/sbin/drib bootstrap --no-download

This ensures the policy script loads up-to-date data when Gatekeeper starts in Grantor mode.

Updating LPM tables incrementally

The setup described above works well for the generation of an initial (bootstrap) list of prefixes on Gatekeeper startup. However, the EDROP and Bogons lists, as well as similar online unwanted prefix lists, are continually updated, and Gatekeeper's in-memory LPM tables should be kept up to date.

To do this, we use the gtctl tool. It parses a Drib aggregate file (generated by the cron job mentioned in the previous section), compares it to the aggregate file saved from a previous run, and generates the sets of newly inserted and removed IP prefixes. These sets are used as inputs to render policy update scripts, which gtctl then feeds into Gatekeeper via its dynamic configuration mechanism.

The policy update template, /etc/gtctl/policy_update.lua.tpl, simply generates calls to the add_src_v4_prefix and del_src_v4_prefix functions defined in the policy script. Note that even though we only have a source address policy, and therefore all prefixes from Drib are tagged with the src kind, we still assemble the function names from the template variables as an example of what is possible with a gtctl template.

local function update_lpm_tables()
{%- for entry in ipv4.remove %}
  del_{{entry.kind}}_v4_prefix("{{entry.range}}")
{%- endfor %}

{%- for entry in ipv4.insert %}
  add_{{entry.kind}}_v4_prefix("{{entry.range}}", {{entry.class}})
{%- endfor %}
end

local dyc = staticlib.c.get_dy_conf()
dylib.update_gt_lua_states_incrementally(dyc.gt, update_lpm_tables, false)
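For concreteness, rendering this template for a hypothetical diff that removes 192.0.2.0/24 and inserts 198.51.100.0/24 with class 2 would produce:

```lua
local function update_lpm_tables()
  del_src_v4_prefix("192.0.2.0/24")

  add_src_v4_prefix("198.51.100.0/24", 2)
end

local dyc = staticlib.c.get_dy_conf()
dylib.update_gt_lua_states_incrementally(dyc.gt, update_lpm_tables, false)
```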

Depending on the number of updates, it might be necessary to create a new LPM table that can accommodate the new set of prefixes. For this case, gtctl uses a policy replacement template, /etc/gtctl/policy_replace.lua.tpl, to generate the script:

{{lpm_table}} = nil
collectgarbage()

{{lpm_table}} = {{lpm_table_constructor}}({{params.num_rules}}, {{params.num_tbl8s}})

local function update_lpm_tables()
{%- for entry in ipv4.insert %}
  add_{{entry.kind}}_v4_prefix("{{entry.range}}", {{entry.class}})
{%- endfor %}
end

local dyc = staticlib.c.get_dy_conf()
dylib.update_gt_lua_states_incrementally(dyc.gt, update_lpm_tables, false)

The template above mentions the params variable. This variable is created by gtctl after running a parameters estimation script, /etc/gtctl/lpm_params.lua.tpl, which is also rendered from a template:

require "gatekeeper/staticlib"
require "gatekeeper/policylib"

local dyc = staticlib.c.get_dy_conf()

if dyc.gt == nil then
  return "Gatekeeper: failed to run as Grantor server\n"
end

local function get_lpm_params()
  local lcore = policylib.c.gt_lcore_id()
  local num_rules, num_tbl8s = {{lpm_params_function}}({{lpm_table}})
  return lcore .. ":" .. num_rules .. "," .. num_tbl8s .. "\n"
end

dylib.update_gt_lua_states_incrementally(dyc.gt, get_lpm_params, false)

Given these templates, the gtctl configuration file, /etc/gtctl/gtctl.yaml, which references them, is shown below.

log_level: "warn"
remove_rendered_scripts: true
socket: "/var/run/gatekeeper/dyn_cfg.socket"
state_dir: "/var/lib/gtctl"

replace: {
  input: "/etc/gtctl/policy_replace.lua.tpl",
  output: "/var/lib/gtctl/policy_replace_{proto}_{kind}.{2i}.lua",
  max_ranges_per_file: 1500,
}

update: {
  input: "/etc/gtctl/policy_update.lua.tpl",
  output: "/var/lib/gtctl/policy_update_{proto}_{kind}.{2i}.lua",
  max_ranges_per_file: 1500,
}

lpm: {
  table_format: "{kind}_lpm_{proto}", # for this example's drib.yaml, yields "src_lpm_ipv4"

  parameters_script: {
    input: "/etc/gtctl/lpm_params.lua.tpl",
    output: "/var/lib/gtctl/lpm_params_{proto}_{kind}.lua",
  },

  ipv4: {
    lpm_table_constructor: "lpmlib.new_lpm",
    lpm_get_params_function: "lpmlib.lpm_get_paras",
  },

  ipv6: {
    lpm_table_constructor: "lpmlib.new_lpm6",
    lpm_get_params_function: "lpmlib.lpm6_get_paras",
  },
}

The only missing piece is a way to run gtctl once a new aggregate file has been generated by Drib. Our current solution is to rely on our configuration management tool, Puppet, to detect this and trigger the gtctl execution:

file { '/var/lib/gtctl/aggregate.new':
  ensure => 'present',
  source => 'puppet:///drib/aggregate',
  owner  => 'root',
  group  => 'root',
  mode   => '0644',
  notify => Exec['gtctl'],
}

exec { 'gtctl':
  command     => 'gtctl dyncfg -a /var/lib/gtctl/aggregate.new',
  onlyif      => 'systemctl is-active gatekeeper',
  refreshonly => true,
}