Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow snapshot tap changes #4731

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

andrewla
Copy link

@andrewla andrewla commented Aug 15, 2024

Changes

Allow renaming of tap devices on snapshot restore

Reason

In some scenarios it is not possible to use the jailer, especially in limited privilege environments where the security is external to firecracker itself. But in these cases a snapshot may have to use a different tap device than the one that it was using when it was snapshotted.

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • If a specific issue led to this PR, this PR closes the issue.
  • The description of changes is clear and encompassing.
  • Any required documentation changes (code and docs) are included in this
    PR.
  • API changes follow the Runbook for Firecracker API changes.
  • User-facing changes are mentioned in CHANGELOG.md.
  • All added/changed functionality is tested.
  • New TODOs link to an issue.
  • Commits meet
    contribution quality standards.

  • This functionality cannot be added in rust-vmm.

@andrewla andrewla force-pushed the allow-snapshot-tap-changes branch from 7991d9f to 8d1a0a9 Compare August 15, 2024 21:24
@pb8o
Copy link
Contributor

pb8o commented Aug 19, 2024

Hi @andrewla thank you for your contribution! We would like to understand the use case better in case it can be resolved through other means first. We recommend using a network namespace where you can create TAP devices with the same name, but that probably requires CAP_SYS_ADMIN, which I understand is what you mean with "limited privilege environments".

Could you elaborate on your use case? Is there a way you could create the namespace in a privileged setting and then use something like nsenter firecracker ...?

@andrewla
Copy link
Author

That assessment is correct -- basically to run the jailer in a network namespace you need the setns syscall which requires CAP_SYS_ADMIN. So nsenter is not an option.

Our particular case is running in a containerized environment where our privileges are limited by the nature of the general environment. Once we're in our particular container we have lost all relevant privileges.

@andrewla andrewla force-pushed the allow-snapshot-tap-changes branch 5 times, most recently from 3415816 to 03c3be9 Compare August 26, 2024 20:43
@andrewla andrewla force-pushed the allow-snapshot-tap-changes branch from 03c3be9 to 265ea94 Compare August 27, 2024 13:24
@andrewla andrewla marked this pull request as ready for review August 27, 2024 13:25
@pb8o
Copy link
Contributor

pb8o commented Aug 29, 2024

Hi again @andrewla, we have been talking internally about this PR and we may need to spend some time to decide on the API aspects of it to make sure it doesn't conflict with other efforts.

In the meantime, we thought of another workaround. The snapshot-editor could be enhanced to rename the tap devices in an snapshot file. That would be an easier decision for us, but we want to make sure it would handle your use case.

For example we imagine the tool would work like this:

snapshot-editor edit-vmstate rename-network eth0 tap1

Would this work within your environment?

@andrewla
Copy link
Author

andrewla commented Sep 3, 2024

This was our initial approach as it required minimal changes. But we found that the performance cost of making the copy (as opposed to hardlinking) during the operation (plus serde costs) were more expensive than we were willing to tolerate in our environment.

@andrewla
Copy link
Author

Hi @pb8o -- is there anything we can do to help move this forward?

@pb8o
Copy link
Contributor

pb8o commented Oct 16, 2024

Hi @andrewla I haven't had time to look at this, but this is next on my list now. Thanks for your patience!

@kanpov
Copy link
Contributor

kanpov commented Nov 1, 2024

On a related note, another reason why renaming the tap device is a better approach than namespaced NAT from the "Network for Clones" guide is that the namespaced NAT imposes measurable overhead onto the host kernel due to the addition of about 5 more iptables/nft rules, plus an RTNETLINK route for forwarding the guest IP out of the netns.

Even though I made an effort to support namespaced NAT in fcnet, it increased complexity by a factor of 4-5x in comparison to regular NAT only to support one usecase: two simultaneous microVM clones. So I'd be in favor of this change, or a snapshot-editor equivalent.

@pb8o
Copy link
Contributor

pb8o commented Jan 8, 2025

Hello @andrewla ! I apologize for the long time between updates, but some other stuff came up. So we have decided to go ahead with this. I gave a first initial review and I only have some minor comments, but mostly looks good to me. I just have a question if the network_overrides field also works when starting from a JSON config file.

tests/framework/microvm.py Outdated Show resolved Hide resolved
tests/framework/microvm.py Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
docs/snapshotting/network-for-clones.md Outdated Show resolved Hide resolved
@andrewla andrewla requested a review from Manciukic as a code owner January 8, 2025 19:28
@andrewla
Copy link
Author

andrewla commented Jan 9, 2025

Re: config -- currently there is no config support for snapshots (https://github.com/firecracker-microvm/firecracker/blob/main/src/vmm/src/resources.rs) -- the snapshot configuration and restore has to be done with a running firecracker instance

@andrewla andrewla force-pushed the allow-snapshot-tap-changes branch 2 times, most recently from d8c5a44 to ea62e9a Compare January 9, 2025 19:00
Copy link
Contributor

@bchalios bchalios left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It generally looks good and thanks for the contribution Andrew.

A few comments/questions from me.

Also, I've commented this for the documentation changes, but could you please squash as well the commits for the test changes into a single commit?

src/firecracker/swagger/firecracker.yaml Outdated Show resolved Hide resolved
src/vmm/src/builder.rs Outdated Show resolved Hide resolved
tests/framework/microvm.py Outdated Show resolved Hide resolved
tests/framework/microvm.py Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved

This may require reconfiguration of the networking inside the VM so that it is
still routable externally.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add an example here of how this can be used.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some sample code and more detail.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great if the example include the steps to start a microVM with a certain network config (it could be the one from "Getting started") then take a snapshot and, finally, load this snapshot in a different host with a different TAP device (show the snapshot load command with the override).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with this in principle is that it does require that the VM reconfigure its networking to match the expectations of the external routing. This complexity is easily managed in the real world where you will have some channel of communication from host to guest (vsocks or swapping block devices or mmds or vmgenid) but hard to capture in a small example.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is fine. I think we can still add something useful here. The guest configuration you have already is fine. I would add something like that:

For example, if we have a network interface named `eth0` in the snapshotted microVM. We can override it during snapshot resume, like this:


curl --unix-socket /tmp/firecracker.socket -i \
    -X PUT 'http://localhost/snapshot/load' \
    -H  'Accept: application/json' \
    -H  'Content-Type: application/json' \
    -d '{
            "snapshot_path": "./snapshot_file",
            "mem_backend": {
                "backend_path": "./mem_file",
                "backend_type": "File"
            },
            "enable_diff_snapshots": true,
            "resume_vm": false,
            "network_overrides": [
                 {
                     iface_id: "eth0",
                     host_dev_name": "vmtap01"
                 }
            ]
    }'

after this bit:

In this case you can use the network_overrides parameter to snapshot restore
to specify which network device (based on the name inside the VM, such as
"eth0") maps to which host tap device (e.g. "vmtap01").

CHANGELOG.md Show resolved Hide resolved
@andrewla andrewla force-pushed the allow-snapshot-tap-changes branch 4 times, most recently from 0be3ff7 to bf47436 Compare January 10, 2025 15:02
Comment on lines +73 to +74
/// The network devices to override on load.
pub network_overrides: Vec<NetworkOverride>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the integration tests failures, LoadSnapshotConfig just below also needs to be updated with this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any ideas on what the problem is here? The network_overrides field is present in both structs, and for the life of me I can't see where this error message is originating -- the field is in the yaml, it's in both the config and params struct. I don't have an easy callstack for the firecracker side of the failure, but that's where I'll start looking next, but any insights here would be appreciated

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @andrewla I think what is happening is that the test that fails: test_check_vulnerability_files_ab is an A/B test, and is running the new tests with the Firecracker from current main. A workaround would be to not pass network_overrides if it's empty. I will write a comment where I think it should be done.

Copy link
Author

@andrewla andrewla Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not match what I'm seeing in CI, I see a number of tests failing with the messages

E   RuntimeError: ('An error occurred when deserializing the json body of a request: unknown field 
`network_overrides`, expected one of `snapshot_path`, `mem_file_path`, `mem_backend`, 
`enable_diff_snapshots`, `resume_vm` at line 1 column 173.', {'fault_message': 'An error occurred when 
deserializing the json body of a request: unknown field `network_overrides`, expected one of 
`snapshot_path`, `mem_file_path`, `mem_backend`, `enable_diff_snapshots`, `resume_vm` at line 1 column 
173.'}, <Response [400]>)

Mostly these are in integration_tests/security/test_vulnerabilities.py, but not all tests in that file fail consistently. The _ab tests fail but not consistently.

I am unable to reproduce any of this locally -- the tests pass when I run them specifically.

Copy link
Contributor

@pb8o pb8o Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think it's that. Try it with the microvm.py patch I mentioned in #4731 (comment)

To reproduce locally try this:

BUILDKITE_PULL_REQUEST=true BUILDKITE_PULL_REQUEST_BASE_BRANCH=main ./tools/devtool test -- -n8 --dist worksteal integration_tests/security/test_vulnerabilities.py::test_check_vulnerability_files_ab

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try applying the patch to verify. Trying the command you give has the tests fail with a different error:

framework/microvm.py:206: in __init__
    assert fc_binary_path.exists()
E   AssertionError

I don't know exactly where I've gone wrong that the firecracker binary is not in the right place in the test framework. I'll try from scratch, and in the meantime I'll submit the suggested patch to see if it works in CI

Copy link
Author

@andrewla andrewla Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A clean build does not fix this; I get this error when running from scratch. All of the old firecracker binaries are present under build/img/x86_64/firecracker; I have no idea why this doesn't work, or why even in CI some of the ab tests work.

The other failing test is the spectre/meltdown test which also appears to have an ab mode based on whether it is a PR or not, so likely failing for the same reason.

Is there anything I have to do to get buildkite to run the CI? After pushing changes it goes into a "blocked" mode; a 24 hr turnaround on CI is useless when I can't repro the failure locally.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, this is because devtool test apparently only tried to build the "B" binary, but never the "A" binary. In CI, it works because there's a separate step that builds the binary so that they can be shared across all runners. Can you try running ./tools/devtool build --rev main and then rerun the test command pablo posted? That should create the firecracker binaries compiled from main in the locations where the test script looks for them.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now it can find the binaries (under build/main/..., I see now) but it fails elsewhere still. Error looks like

framework/microvm.py:678: in _wait_create
  os.stat(self.jailer.api_socket_path())
E   FileNotFoundError: [Errno 2] No such file or directory: '/srv/jailer/firecracker/594eec23-7064-4306-9acd-6a6e137c3d30/root/run/firecracker.socket'

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, it needs to be ./tools/devtool build --rev main --release (although it should also work without --release, wonder why that breaks 🤔), sorry!

I've also opened a PR to make devtool do that automatically + add some documentation: #4998

pb8o
pb8o previously requested changes Jan 13, 2025
docs/snapshotting/network-for-clones.md Show resolved Hide resolved
@andrewla andrewla force-pushed the allow-snapshot-tap-changes branch from eb815a2 to 97838cb Compare January 13, 2025 21:37
tests/framework/microvm.py Outdated Show resolved Hide resolved
@andrewla andrewla force-pushed the allow-snapshot-tap-changes branch from 97838cb to 07a8237 Compare January 14, 2025 18:13
In some scenarios it is not possible to use the jailer, especially in
limited privilege environments where the security is external to
firecracker itself. But in these cases a snapshot may have to use a
different tap device than the one that it was using when it was
snapshotted.

Signed-off-by: Andrew Laucius <[email protected]>
Test that we can correctly parse configuration and API calls in a
backwards compatible way.

Signed-off-by: Andrew Laucius <[email protected]>
Documenting the ability to rename network interfaces on snapshot
restore.

Signed-off-by: Andrew Laucius <[email protected]>
@andrewla andrewla force-pushed the allow-snapshot-tap-changes branch from 07a8237 to 837e744 Compare January 14, 2025 21:18
Copy link

codecov bot commented Jan 15, 2025

Codecov Report

Attention: Patch coverage is 21.42857% with 11 lines in your changes missing coverage. Please review.

Project coverage is 83.03%. Comparing base (3fb06e9) to head (c979583).

Files with missing lines Patch % Lines
src/vmm/src/persist.rs 15.38% 11 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4731      +/-   ##
==========================================
- Coverage   83.06%   83.03%   -0.04%     
==========================================
  Files         244      244              
  Lines       26658    26671      +13     
==========================================
+ Hits        22144    22146       +2     
- Misses       4514     4525      +11     
Flag Coverage Δ
5.10-c5n.metal 83.54% <21.42%> (-0.04%) ⬇️
5.10-m5n.metal 83.53% <21.42%> (-0.03%) ⬇️
5.10-m6a.metal 82.73% <21.42%> (-0.05%) ⬇️
5.10-m6g.metal 79.40% <21.42%> (-0.04%) ⬇️
5.10-m6i.metal 83.51% <21.42%> (-0.04%) ⬇️
5.10-m7g.metal 79.40% <21.42%> (-0.04%) ⬇️
6.1-c5n.metal 83.54% <21.42%> (-0.03%) ⬇️
6.1-m5n.metal 83.52% <21.42%> (-0.05%) ⬇️
6.1-m6a.metal 82.73% <21.42%> (-0.04%) ⬇️
6.1-m6g.metal 79.39% <21.42%> (-0.05%) ⬇️
6.1-m6i.metal 83.51% <21.42%> (-0.05%) ⬇️
6.1-m7g.metal 79.40% <21.42%> (-0.04%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@andrewla andrewla force-pushed the allow-snapshot-tap-changes branch from 7e2c75f to 5bcdf11 Compare January 15, 2025 17:58
@pb8o pb8o dismissed their stale review January 15, 2025 20:19

changes done

Comment on lines +420 to +430
let net_devices = &mut microvm_state.device_states.net_devices;
if let Some(device) = net_devices
.iter_mut()
.find(|x| x.device_state.id == entry.iface_id)
{
device
.device_state
.tap_if_name
.clone_from(&entry.host_dev_name);
} else {
return Err(SnapshotStateFromFileError::UnknownNetworkDevice.into());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines are uncovered from unit tests? Can we add one (or extend an existing) to try to override a non-existing interface?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm investigating this (once I get the tests to pass) because this is really the core of the change, and if the tests are actually running then this code should be exercised. I'm concerned that my test is accidentally a noop but I need to validate this.

Passing in the new flag breaks tests that compare behavior to main.

Signed-off-by: Andrew Laucius <[email protected]>
@andrewla andrewla force-pushed the allow-snapshot-tap-changes branch from 5bcdf11 to ff7fabd Compare January 16, 2025 21:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants