Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#4547: Refactor vmm builder code to simplify logic that creates the microVM to boot #4910

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

tommady
Copy link
Contributor

@tommady tommady commented Nov 13, 2024

fix(4547): Refactor vmm builder code to simplify logic

Changes

Refactoring Vmm builder code

Reason

to address issue #4547

...

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • I have read and understand CONTRIBUTING.md.
  • I have run tools/devtool checkstyle to verify that the PR passes the
    automated style checks.
  • I have described what is done in these changes, why they are needed, and
    how they are solving the problem in a clear and encompassing way.
  • I have updated any relevant documentation (both in code and in the docs)
    in the PR.
  • I have mentioned all user-facing changes in CHANGELOG.md.
  • If a specific issue led to this PR, this PR closes the issue.
  • When making API changes, I have followed the
    Runbook for Firecracker API changes.
  • I have tested all new and changed functionalities in unit tests and/or
    integration tests.
  • I have linked an issue to every new TODO.

  • This functionality cannot be added in rust-vmm.

pass vm_config to eliminate two extra arguments derived from it

Signed-off-by: tommady <tommady@users.noreply.github.com>
remove cfg_attr and extract create_vcpus from create_vmm_and_vcpus

Signed-off-by: tommady <tommady@users.noreply.github.com>
@tommady tommady force-pushed the fix-4547 branch 2 times, most recently from fce4afd to 38e9537 Compare November 15, 2024 21:38
extract codes into two architecture specific modes

Signed-off-by: tommady <tommady@users.noreply.github.com>
eliminate the unnecessary usage of the event_manager argument

Signed-off-by: tommady <tommady@users.noreply.github.com>
@tommady tommady marked this pull request as ready for review November 15, 2024 21:56
Copy link
Contributor

@ShadowCurse ShadowCurse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @tommady, thx for doing the PR. Overall changes seems solid, but I left some small notes here and there. Also can you change names of the commits to refactor(builder): .... This is usually how we treat refactor commits.

Comment on lines 722 to 723
// Make stdout non blocking.
set_stdout_nonblocking();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to belong to the serial device init, so can be moved back to pio device manager init code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The aarch64::attach_legacy_devices function also relies on this functionality. Moving it into the PIO device manager would make implementation significantly more challenging.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about move this into the setup_serial_device? I see that we always call set_stdout_nonblocking before attaching the serial.

src/vmm/src/builder.rs Outdated Show resolved Hide resolved
@@ -308,11 +644,23 @@ pub fn build_microvm_for_boot(
}

#[cfg(target_arch = "aarch64")]
attach_legacy_devices_aarch64(event_manager, &mut vmm, &mut boot_cmdline).map_err(Internal)?;
aarch::attach_legacy_devices_aarch64(event_manager, &mut vmm, &mut boot_cmdline)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/aarch/aarch64 in the module name

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant aarch::attach_legacy_devices -> aarch64::attach_legacy_devices since aarch64 is the module name.

src/vmm/src/builder.rs Outdated Show resolved Hide resolved
@@ -168,7 +168,9 @@ pub mod aarch64 {
if cmdline_contains_console {
// Make stdout non-blocking.
set_stdout_nonblocking();
let serial = setup_serial_device(event_manager, std::io::stdin(), std::io::stdout())?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last "fix: Refactor vmm builder code to simplify logic" should be squashed with previous one.

pb8o and others added 6 commits November 22, 2024 03:14
Add the SVE CPU template as a valid template in 5.10 since it works.

Signed-off-by: Pablo Barbáchano <pablob@amazon.com>
Update release policy to v1.10.1 patch

Signed-off-by: Jack Thomson <jackabt@amazon.com>
`TcpIPv4Handler` for MMDS network stack preallocates several buffers
whose sizes are saved into a snapshot as `max_connections` and
`max_pending_resets` in `MmdsNetworkStackState`. But they are always the
same constant hardcoded values (`DEFAULT_MAX_CONNECTIONS` and
`DEFAULT_MAX_PENDING_RESETS`) as of today, which means there is no need
to save them into a snapshot. Even if we change the hardcoded sizes
across Firecracker versions, that should not be a problem. This is
because the snapshot feature does not support migration of network
connections and those buffers are initialized with empty on snapshot
restoration. When migrating from a Firecracker version with larger
buffers to another version with smaller ones, guest workloads that
worked previously might start to fail due to the less buffer spaces.
However, the issue is not a problem of the snapshot feature and it
should also occur even on a purely booted microVM (not restored from a
snapshot). Thus, it is fine to remove those fields from a snapshot.

Since this is a breaking change of the snapshot format, bumps the major
version.

Signed-off-by: Takahiro Itazuri <itazur@amazon.com>
There is no need to use MmdsNetworkStack::new() instead of
MmdsNetworkStack::new_with_defaults() in tests that pass the same
default values.

Signed-off-by: Takahiro Itazuri <itazur@amazon.com>
We bumped the snapshot version up twice recently, requiring users to
regenerate their snapshot, but the user action isn't clearly stated.

Signed-off-by: Takahiro Itazuri <itazur@amazon.com>
eliminate the unnecessary usage of the event_manager argument and
fix up aarch64 attach_legacy_devices_aarch64 fn

Signed-off-by: tommady <tommady@users.noreply.github.com>
remove the aarch64 suffix from the attach_legacy_devices_aarch64
function and ensure that aarch64 smt is always set to false int the
configure_system_for_boot function

Signed-off-by: tommady <tommady@users.noreply.github.com>
@tommady
Copy link
Contributor Author

tommady commented Nov 22, 2024

Hi @ShadowCurse

I’ve addressed your comments, with one note regarding the set_stdout_nonblocking function. Please review and provide your guidance when you have a moment.

Thank you!

Copy link
Contributor

@ShadowCurse ShadowCurse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now you need to make sure the changes you made pass our CI. You can look at the buildkite/firecracker-pr line at the bottom of the PR and click details to look at the CI status. Start with making sure changes do compile. I can see that on x86_86 only errors are absence of doc comments for public functions. For aarch64, well, it does not even compile. If you don't have access to arm system, you can try to use cross to cross compile.
After everything compiles, please move all changes to corresponding commits.
Also please rebase the PR so it only includes your commits.

zulinx86 and others added 3 commits November 23, 2024 06:34
`TcpIPv4Handler` for MMDS network stack preallocates several buffers
whose sizes are saved into a snapshot as `max_connections` and
`max_pending_resets` in `MmdsNetworkStackState`. But they are always the
same constant hardcoded values (`DEFAULT_MAX_CONNECTIONS` and
`DEFAULT_MAX_PENDING_RESETS`) as of today, which means there is no need
to save them into a snapshot. Even if we change the hardcoded sizes
across Firecracker versions, that should not be a problem. This is
because the snapshot feature does not support migration of network
connections and those buffers are initialized with empty on snapshot
restoration. When migrating from a Firecracker version with larger
buffers to another version with smaller ones, guest workloads that
worked previously might start to fail due to the less buffer spaces.
However, the issue is not a problem of the snapshot feature and it
should also occur even on a purely booted microVM (not restored from a
snapshot). Thus, it is fine to remove those fields from a snapshot.

Since this is a breaking change of the snapshot format, bumps the major
version.

Signed-off-by: Takahiro Itazuri <itazur@amazon.com>
We bumped the snapshot version up twice recently, requiring users to
regenerate their snapshot, but the user action isn't clearly stated.

Signed-off-by: Takahiro Itazuri <itazur@amazon.com>
let the x86_64 and aarch64 architectures code can compile and without
warnning

Signed-off-by: tommady <tommady@users.noreply.github.com>
@tommady
Copy link
Contributor Author

tommady commented Nov 23, 2024

Hi @ShadowCurse

Thank you for your patience and thorough review—it has been an incredible learning experience for me!
I’ve revisited your feedback and ensured that the entire architecture compiles cleanly, with no documentation warnings.
Please take a look whenever you have the time. 😊

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants