Allow snapshot tap changes #4731
7991d9f
to
8d1a0a9
Compare
Hi @andrewla thank you for your contribution! We would like to understand the use case better in case it can be resolved through other means first. We recommend using a network namespace where you can create TAP devices with the same name, but that probably requires Could you elaborate on your use case? Is there a way you could create the namespace in a privileged setting and then use something like |
That assessment is correct -- basically to run the jailer in a network namespace you need the setns syscall which requires CAP_SYS_ADMIN. So nsenter is not an option. Our particular case is running in a containerized environment where our privileges are limited by the nature of the general environment. Once we're in our particular container we have lost all relevant privileges.
3415816
to
03c3be9
Compare
03c3be9
to
265ea94
Compare
Hi again @andrewla, we have been talking internally about this PR and we may need to spend some time to decide on the API aspects of it to make sure it doesn't conflict with other efforts. In the meantime, we thought of another workaround. The For example we imagine the tool would work like this: snapshot-editor edit-vmstate rename-network eth0 tap1 Would this work within your environment? |
This was our initial approach as it required minimal changes. But we found that the performance cost of making the copy (as opposed to hardlinking) during the operation, plus the serde costs, was higher than we were willing to tolerate in our environment.
Hi @pb8o -- is there anything we can do to help move this forward?
Hi @andrewla I haven't had time to look at this, but this is next on my list now. Thanks for your patience!
On a related note, another reason why renaming the tap device is a better approach than namespaced NAT from the "Network for Clones" guide is that the namespaced NAT imposes measurable overhead onto the host kernel due to the addition of about 5 more Even though I made an effort to support namespaced NAT in fcnet, it increased complexity by a factor of 4-5x in comparison to regular NAT only to support one usecase: two simultaneous microVM clones. So I'd be in favor of this change, or a |
Hello @andrewla ! I apologize for the long time between updates, but some other stuff came up. So we have decided to go ahead with this. I gave a first initial review and I only have some minor comments, but mostly looks good to me. I just have a question if the |
Re: config -- currently there is no config support for snapshots (https://github.com/firecracker-microvm/firecracker/blob/main/src/vmm/src/resources.rs) -- the snapshot configuration and restore has to be done with a running firecracker instance.
d8c5a44
to
ea62e9a
Compare
It generally looks good, and thanks for the contribution, Andrew.
A few comments/questions from me.
Also, I've commented this for the documentation changes, but could you please also squash the commits for the test changes into a single commit?
This may require reconfiguration of the networking inside the VM so that it is
still routable externally.
Could you please add an example here of how this can be used?
Added some sample code and more detail.
It would be great if the example included the steps to start a microVM with a certain network config (it could be the one from "Getting started"), then take a snapshot and, finally, load this snapshot on a different host with a different TAP device (show the snapshot load command with the override).
The problem with this in principle is that it does require that the VM reconfigure its networking to match the expectations of the external routing. This complexity is easily managed in the real world where you will have some channel of communication from host to guest (vsocks or swapping block devices or mmds or vmgenid) but hard to capture in a small example.
That is fine. I think we can still add something useful here. The guest configuration you have already is fine. I would add something like this:
For example, if we have a network interface named `eth0` in the snapshotted microVM, we can override it during snapshot resume like this:
curl --unix-socket /tmp/firecracker.socket -i \
    -X PUT 'http://localhost/snapshot/load' \
    -H 'Accept: application/json' \
    -H 'Content-Type: application/json' \
    -d '{
        "snapshot_path": "./snapshot_file",
        "mem_backend": {
            "backend_path": "./mem_file",
            "backend_type": "File"
        },
        "enable_diff_snapshots": true,
        "resume_vm": false,
        "network_overrides": [
            {
                "iface_id": "eth0",
                "host_dev_name": "vmtap01"
            }
        ]
    }'
after this bit:
In this case you can use the `network_overrides` parameter to snapshot restore to specify which network device (based on the name inside the VM, such as "eth0") maps to which host tap device (e.g. "vmtap01").
0be3ff7
to
bf47436
Compare
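The request body from the curl example in the review thread above can also be generated programmatically, which avoids hand-quoting mistakes in the JSON. The following is a minimal sketch, not part of the PR: `build_snapshot_load_body` is a hypothetical helper, and the file paths and device names are the illustrative values from the thread.

```python
import json


def build_snapshot_load_body(snapshot_path, mem_path, overrides):
    """Return the JSON body for PUT /snapshot/load as a string.

    `overrides` maps a guest interface ID (e.g. "eth0") to the host tap
    device it should use after restore (e.g. "vmtap01").
    This helper is hypothetical and only mirrors the curl example.
    """
    body = {
        "snapshot_path": snapshot_path,
        "mem_backend": {"backend_path": mem_path, "backend_type": "File"},
        "enable_diff_snapshots": True,
        "resume_vm": False,
        "network_overrides": [
            {"iface_id": iface_id, "host_dev_name": tap_name}
            for iface_id, tap_name in overrides.items()
        ],
    }
    # json.dumps guarantees every key and string value is quoted correctly.
    return json.dumps(body, indent=2)


payload = build_snapshot_load_body(
    "./snapshot_file", "./mem_file", {"eth0": "vmtap01"}
)
print(payload)
```

The printed string can be passed to curl via `-d`, or sent with any HTTP client that supports Unix sockets.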
/// The network devices to override on load.
pub network_overrides: Vec<NetworkOverride>,
According to the integration test failures, `LoadSnapshotConfig` just below also needs to be updated with this.
Any ideas on what the problem is here? The network_overrides field is present in both structs, and for the life of me I can't see where this error message is originating -- the field is in the yaml, it's in both the config and params struct. I don't have an easy callstack for the firecracker side of the failure, so that's where I'll start looking next. Any insights here would be appreciated.
Hi @andrewla I think what is happening is that the test that fails, `test_check_vulnerability_files_ab`, is an A/B test, and is running the new tests with the Firecracker from current `main`. A workaround would be to not pass `network_overrides` if it's empty. I will write a comment where I think it should be done.
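The workaround can be illustrated with a small sketch. It assumes the binary built from `main` rejects any field it does not know, which is what the CI error quoted in this thread suggests; the known-field set below is copied from that error message. `strict_parse` and `snapshot_load_body` are hypothetical names, not the actual code in framework/microvm.py.

```python
# Fields accepted by the "A" binary (current main), per the CI error message.
KNOWN_FIELDS_ON_MAIN = {
    "snapshot_path",
    "mem_file_path",
    "mem_backend",
    "enable_diff_snapshots",
    "resume_vm",
}


def strict_parse(body, known_fields):
    """Stand-in for a parser that rejects unknown fields."""
    unknown = sorted(set(body) - known_fields)
    if unknown:
        raise ValueError(f"unknown field `{unknown[0]}`")
    return body


def snapshot_load_body(snapshot_path, mem_path, network_overrides=None):
    """Build the request body, adding network_overrides only when non-empty."""
    body = {
        "snapshot_path": snapshot_path,
        "mem_backend": {"backend_path": mem_path, "backend_type": "File"},
        "resume_vm": False,
    }
    if network_overrides:  # None or [] -> field omitted entirely
        body["network_overrides"] = network_overrides
    return body


# Without overrides, the body is accepted by the older field set.
strict_parse(snapshot_load_body("snap", "mem"), KNOWN_FIELDS_ON_MAIN)

# With overrides, the older field set rejects it, as seen in CI.
try:
    strict_parse(
        snapshot_load_body(
            "snap", "mem", [{"iface_id": "eth0", "host_dev_name": "vmtap01"}]
        ),
        KNOWN_FIELDS_ON_MAIN,
    )
except ValueError as err:
    print(err)
```

Omitting the field rather than sending an empty list keeps the same test code usable against both the "A" and "B" binaries.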
This does not match what I'm seeing in CI, I see a number of tests failing with the messages
E RuntimeError: ('An error occurred when deserializing the json body of a request: unknown field
`network_overrides`, expected one of `snapshot_path`, `mem_file_path`, `mem_backend`,
`enable_diff_snapshots`, `resume_vm` at line 1 column 173.', {'fault_message': 'An error occurred when
deserializing the json body of a request: unknown field `network_overrides`, expected one of
`snapshot_path`, `mem_file_path`, `mem_backend`, `enable_diff_snapshots`, `resume_vm` at line 1 column
173.'}, <Response [400]>)
Mostly these are in `integration_tests/security/test_vulnerabilities.py`, but not all tests in that file fail consistently. The `_ab` tests fail but not consistently.
I am unable to reproduce any of this locally -- the tests pass when I run them specifically.
I still think it's that. Try it with the microvm.py patch I mentioned in #4731 (comment)
To reproduce locally try this:
BUILDKITE_PULL_REQUEST=true BUILDKITE_PULL_REQUEST_BASE_BRANCH=main ./tools/devtool test -- -n8 --dist worksteal integration_tests/security/test_vulnerabilities.py::test_check_vulnerability_files_ab
I'll try applying the patch to verify. Trying the command you give has the tests fail with a different error:
framework/microvm.py:206: in __init__
assert fc_binary_path.exists()
E AssertionError
I don't know exactly where I've gone wrong that the firecracker binary is not in the right place in the test framework. I'll try from scratch, and in the meantime I'll submit the suggested patch to see if it works in CI
A clean build does not fix this; I get this error when running from scratch. All of the old firecracker binaries are present under build/img/x86_64/firecracker; I have no idea why this doesn't work, or why even in CI some of the ab tests work.
The other failing test is the spectre/meltdown test which also appears to have an `ab` mode based on whether it is a PR or not, so likely failing for the same reason.
Is there anything I have to do to get buildkite to run the CI? After pushing changes it goes into a "blocked" mode; a 24 hr turnaround on CI is useless when I can't repro the failure locally.
Hi, this is because `devtool test` apparently only tried to build the "B" binary, but never the "A" binary. In CI, it works because there's a separate step that builds the binary so that they can be shared across all runners. Can you try running `./tools/devtool build --rev main` and then rerun the test command Pablo posted? That should create the firecracker binaries compiled from main in the locations where the test script looks for them.
Now it can find the binaries (under build/main/..., I see now) but it fails elsewhere still. Error looks like
framework/microvm.py:678: in _wait_create
os.stat(self.jailer.api_socket_path())
E FileNotFoundError: [Errno 2] No such file or directory: '/srv/jailer/firecracker/594eec23-7064-4306-9acd-6a6e137c3d30/root/run/firecracker.socket'
Ah, it needs to be `./tools/devtool build --rev main --release` (although it should also work without `--release`, wonder why that breaks 🤔), sorry!
I've also opened a PR to make devtool do that automatically + add some documentation: #4998
eb815a2
to
97838cb
Compare
97838cb
to
07a8237
Compare
In some scenarios it is not possible to use the jailer, especially in limited privilege environments where the security is external to firecracker itself. But in these cases a snapshot may have to use a different tap device than the one that it was using when it was snapshotted. Signed-off-by: Andrew Laucius <[email protected]>
Test that we can correctly parse configuration and API calls in a backwards compatible way. Signed-off-by: Andrew Laucius <[email protected]>
Documenting the ability to rename network interfaces on snapshot restore. Signed-off-by: Andrew Laucius <[email protected]>
07a8237
to
837e744
Compare
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #4731 +/- ##
==========================================
- Coverage 83.06% 83.03% -0.04%
==========================================
Files 244 244
Lines 26658 26671 +13
==========================================
+ Hits 22144 22146 +2
- Misses 4514 4525 +11
7e2c75f
to
5bcdf11
Compare
let net_devices = &mut microvm_state.device_states.net_devices;
if let Some(device) = net_devices
    .iter_mut()
    .find(|x| x.device_state.id == entry.iface_id)
{
    device
        .device_state
        .tap_if_name
        .clone_from(&entry.host_dev_name);
} else {
    return Err(SnapshotStateFromFileError::UnknownNetworkDevice.into());
These lines are not covered by unit tests. Can we add one (or extend an existing one) that tries to override a non-existing interface?
I'm investigating this (once I get the tests to pass) because this is really the core of the change, and if the tests are actually running then this code should be exercised. I'm concerned that my test is accidentally a noop but I need to validate this.
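To make the requested test cases concrete, here is a minimal Python model of the override logic in the hunk above, with one check for an existing interface and one for a non-existing interface. This is a sketch only: the real code and its tests are Rust, the class and function names here are hypothetical stand-ins, and the surrounding loop over overrides is assumed since the hunk does not show it.

```python
from dataclasses import dataclass


class UnknownNetworkDevice(Exception):
    """Models SnapshotStateFromFileError::UnknownNetworkDevice."""


@dataclass
class NetState:
    id: str           # interface ID as recorded in the snapshot
    tap_if_name: str  # host tap device name


@dataclass
class NetworkOverride:
    iface_id: str
    host_dev_name: str


def apply_network_overrides(net_devices, overrides):
    """Rename the tap device of each overridden interface, in place."""
    for entry in overrides:
        device = next((d for d in net_devices if d.id == entry.iface_id), None)
        if device is None:
            raise UnknownNetworkDevice(entry.iface_id)
        device.tap_if_name = entry.host_dev_name


def test_override_existing_interface():
    devices = [NetState("eth0", "tap0")]
    apply_network_overrides(devices, [NetworkOverride("eth0", "vmtap01")])
    assert devices[0].tap_if_name == "vmtap01"


def test_override_unknown_interface():
    devices = [NetState("eth0", "tap0")]
    try:
        apply_network_overrides(devices, [NetworkOverride("eth1", "vmtap01")])
    except UnknownNetworkDevice:
        # The existing device must be left untouched.
        assert devices[0].tap_if_name == "tap0"
    else:
        raise AssertionError("expected UnknownNetworkDevice")


test_override_existing_interface()
test_override_unknown_interface()
```

The second test exercises the `else` branch that the coverage report flags as uncovered; an equivalent Rust unit test would assert on the returned error variant instead of catching an exception.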
Passing in the new flag breaks tests that compare behavior to main. Signed-off-by: Andrew Laucius <[email protected]>
5bcdf11
to
ff7fabd
Compare
Changes
Allow renaming of tap devices on snapshot restore
Reason
In some scenarios it is not possible to use the jailer, especially in limited privilege environments where the security is external to firecracker itself. But in these cases a snapshot may have to use a different tap device than the one that it was using when it was snapshotted.
License Acceptance
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check CONTRIBUTING.md.
PR Checklist
PR.
CHANGELOG.md
.TODO
s link to an issue.contribution quality standards.
rust-vmm
.