-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CORE-10975] Add QoS packet rate limit #9947
base: master
Are you sure you want to change the base?
Conversation
Add packet rate limiting QoS control implementation, using iptables (using '-m limit' and a new mark) and nftables (using 'limit rate over') rules to limit ingress and/or egress packet rate based on workload QoSControls (from pod annotations in k8s).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. Just a few minor points to address.
In terms of dataplane support - something that I should have checked before! -
- Is it correct that we're supporting iptables and nftables, but not eBPF?
- Is that the case for the bandwidth control as well?
felix/dataplane/driver.go
Outdated
@@ -123,6 +123,8 @@ func StartDataplaneDriver( | |||
markScratch0, _ = markBitsManager.NextSingleBitMark() | |||
markScratch1, _ = markBitsManager.NextSingleBitMark() | |||
|
|||
markLimitPacketRate, _ = markBitsManager.NextSingleBitMark() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be absolutely fine, but can you check with the relevant people - probably @fasaxc and @tomastigera - in case we need to be more careful with using up new mark bits? It might be possible to reuse an existing one, if the places where they are used, for different features, do not overlap with each other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering we invariably use the mark in 2 consecutive iptables rules, I suppose we could reuse an existing 'scratch' mark instead of allocating a new one, but then I think we'd need a 3rd rule to clear the mark (considering it could/would be used for other purposes as well, as we're only using the mark to "bypass" the iptables -m limit
module limitation of not being able to drop packets above the limit, we could clear the mark in a rule immediately after the one dropping the packets without the mark). @nelljerram do you think that's a better approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer @fasaxc or @tomastigera to give an opinion about this, as I don't know how careful we need to be with mark bit usage. One possible situation is that there isn't any real shortage in OSS, but there might be when we get around to merging this into calico-private...
felix/fv/qos_controls_test.go
Outdated
} | ||
|
||
It("should limit packet rate correctly", func() { | ||
format.MaxLength = 1000000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is going to affect all following tests. Personally, I'm a fan of putting format.MaxLength = 0
somewhere, like in libcalico-go/lib/clientv3/tier_e2e_test.go:
func init() {
// Stop Gomega from chopping off diffs in logs.
format.MaxLength = 0
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(IOW, my only concern is that it looks like this might be a local-only setting, but in fact it won't be.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point! I blindly copied what happened in another test case I saw, but I agree this should be improved. Will do
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would personally favour setting it globally for the whole FV test. I don't think there's any significant downside - because tests aren't supposed to fail! - and it's just annoying when you do get a failure and then the output is truncated.
felix/fv/qos_controls_test.go
Outdated
By("Running iperf3 client on workload 1") | ||
out, err := w[1].ExecOutput("iperf3", "-c", w[0].IP, "-O5", "-M1000", "-J") | ||
Expect(err).NotTo(HaveOccurred()) | ||
baselineRate, err := getRateFromJsonOutput(out) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note, this is in bits per sec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is noted in the Info
log a few lines below it:
calico/felix/fv/qos_controls_test.go
Line 189 in 483cc0e
log.Infof("iperf client rate with no packet rate limit (bps): %v", baselineRate) |
should I add a comment in addition to that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you add a comment to explain the 800000 just below, I don't think we need a comment for this point as well. What was confusing was the combination of not explaining the 800000 and not indicating the units of getRateFromJsonOutput
.
felix/fv/qos_controls_test.go
Outdated
|
||
By("Waiting for the config to appear in 'iptables-save' on workload 0") | ||
// ingress config should be present | ||
Eventually(getRules(0), "10s", "1s").Should(And(MatchRegexp(`-A cali-tw-`+regexp.QuoteMeta(w[0].InterfaceName)+` .* -m comment --comment "Mark packets within ingress packet rate limit" -m limit --limit `+regexp.QuoteMeta("100/sec")+` -j MARK --set-xmark 0x\d+/0x\d+`), MatchRegexp(`-A cali-tw-`+regexp.QuoteMeta(w[0].InterfaceName)+` .* -m comment --comment "Drop packets over ingress packet rate limit" -m mark ! --mark 0x\d+/0x\d+ -j DROP`))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this check, iptables-save -c
will be called only once, returning a string, and then that same string will be analysed up to 10 times. If it fails the first time, it won't make any difference to keep trying for the next 10s.
What you really want is:
getRules := func(felixId int) func() string {
return func() {
out, err := tc.Felixes[felixId].ExecOutput("iptables-save", "-c")
log.Infof("iptables-save -c output:\n%v", out)
Expect(err).NotTo(HaveOccurred())
return out
}
}
felix/fv/qos_controls_test.go
Outdated
ingressLimitedRate, err := getRateFromJsonOutput(out) | ||
Expect(err).NotTo(HaveOccurred()) | ||
log.Infof("iperf client rate with ingress packet rate limit on server (bps): %v", ingressLimitedRate) | ||
// Expect the limited rate to be below an estimated desired rate (1000 byte packets * 8 bits/byte * 100 packets/s = 800000 bps) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah nice. Please explain that relation above as well.
felix/fv/qos_controls_test.go
Outdated
Expect(err).NotTo(HaveOccurred()) | ||
log.Infof("iperf client rate with ingress packet rate limit on server (bps): %v", ingressLimitedRate) | ||
// Expect the limited rate to be below an estimated desired rate (1000 byte packets * 8 bits/byte * 100 packets/s = 800000 bps) | ||
Expect(ingressLimitedRate).To(BeNumerically("<=", 800000.0)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be a bit of leeway here, e.g. * 1.2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my experience, this upperbound was never even close to hitting when I ran the test (the max was typically 640kbps), but I suppose it doesn't hurt to have some margin, will add!
felix/rules/endpoints_test.go
Outdated
@@ -1359,6 +1369,82 @@ var _ = Describe("Endpoints", func() { | |||
}, | |||
})) | |||
}) | |||
|
|||
It("should render a workload endpoint with packet rate limiting QoSControls", func() { | |||
format.MaxLength = 1000000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment about this as in FV test
manifests/ocp/custom-resources.yaml
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't be in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, I think this showed up after running make generate
, will remove
Thanks so much! Yes, currently all 3 planned controls will only support iptables and nftables, but not eBPF. This control (and the future connection limit control) relies on iptables/nftables itself, and will need to be implemented differently on eBPF. The bandwidth control has an incompatibility between the |
… client) Also address review comments: - fix usage of format.MaxLength - remove manifests/ocp/custom-resources.yaml
args := []string{ | ||
"-v", os.Getenv("CERTS_PATH") + ":/home/user/certs", // Mount in location of certificates. | ||
"-v", pwd + "/../../libcalico-go/config/crd:/crds", // Mount in location of CRDs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not entirely sure but I was having failures almost every time when trying to copy the CRDs into the container (maybe related to low disk space on my machine), mounting as volume fixed it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK with me if it doesn't cause other problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this looks great. One place where a little more commenting would be useful.
args := []string{ | ||
"-v", os.Getenv("CERTS_PATH") + ":/home/user/certs", // Mount in location of certificates. | ||
"-v", pwd + "/../../libcalico-go/config/crd:/crds", // Mount in location of CRDs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK with me if it doesn't cause other problems.
felix/fv/qos_controls_test.go
Outdated
if err != nil { | ||
continue | ||
} | ||
time.Sleep(retryInterval) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add some commenting here to explain why we retry even when an interation succeeds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
huh, to be honest, we don't need to, I'll move the sleep so it only happens when we're continue
ing (i.e. an actual retry)
thanks for catching this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I wondered that too 😄
Description
Add packet rate limiting QoS control implementation, using iptables (using '-m limit' and a new mark) and nftables (using 'limit rate over') rules to limit ingress and/or egress packet rate based on workload QoSControls (from pod annotations in k8s).
Related issues/PRs
Todos
Release Note
Reminder for the reviewer
Make sure that this PR has the correct labels and milestone set.
Every PR needs one
docs-*
label.docs-pr-required
: This change requires a change to the documentation that has not been completed yet.docs-completed
: This change has all necessary documentation completed.docs-not-required
: This change has no user-facing impact and requires no docs.Every PR needs one
release-note-*
label.release-note-required
: This PR has user-facing changes. Most PRs should have this label.release-note-not-required
: This PR has no user-facing changes.Other optional labels:
cherry-pick-candidate
: This PR should be cherry-picked to an earlier release. For bug fixes only.needs-operator-pr
: This PR is related to install and requires a corresponding change to the operator.