[Windows Please] #909

Open
jhay06 opened this issue Jun 18, 2022 · 65 comments
Labels: enhancement, expert, help wanted, platform/Windows, roadmap

Comments

@jhay06

jhay06 commented Jun 18, 2022

Description

Hi,
This is such great work, hoping for a Windows version.

WSL is good but very slow in most cases , i prefer to have this in windows , :) hope that this will be available soon.

Thanks

@AkihiroSuda added the enhancement, help wanted, and expert labels Jun 18, 2022
@jandubois
Member

WSL is good but very slow in most cases

I don't see how we could improve on the performance of WSL and Hyper-V on Windows. So unless somebody else has ideas, and the ability to implement them, this is unlikely to happen.

@afbjorklund
Member

I think you can use the QEMU binaries with the WHPX acceleration, along with any available ISO, to see the "baseline" performance. If that is not enough, I guess this is more a request for running Windows containers (like Docker) ?

https://docs.microsoft.com/en-us/virtualization/windowscontainers/about/

I would be OK with being able to run VMs the same way on Windows, that is available on Mac and Linux today (i.e. QEMU). We had this with docker-machine and podman-machine, so it should be "possible" also for containerd-machine (lima)

But I don't run Windows myself, and when I do it is with something like MSYS

@afbjorklund
Member

afbjorklund commented Jun 19, 2022

There are some GOOS=windows compilation issues on master, but those should be easy to fix:

# github.com/lima-vm/lima/pkg/lockutil
pkg/lockutil/lockutil.go:34:27: undefined: unix.LOCK_EX
pkg/lockutil/lockutil.go:38:28: undefined: unix.LOCK_UN
pkg/lockutil/lockutil.go:48:10: undefined: unix.Flock
pkg/lockutil/lockutil.go:49:27: undefined: unix.EINTR
note: module requires Go 1.18
# github.com/lima-vm/lima/pkg/networks
pkg/networks/validate.go:76:25: undefined: syscall.Stat_t
note: module requires Go 1.18

The "lockutil" errors are just missing code that is already available in nerdctl. The syscall.Stat_t usage needs wrapping...

https://github.com/containerd/nerdctl/tree/master/pkg/lockutil


EDIT: added:

make GOOS=windows
_output/bin/limactl: PE32+ executable (console) x86-64 (stripped to external PDB), for MS Windows

@afbjorklund
Member

afbjorklund commented Jun 19, 2022

Something like: (see MSYS2, and https://www.alpinelinux.org/downloads/)

$ /c/Program\ Files/qemu/qemu-system-x86_64 -m 512 -smp 1 \
  -accel whpx,kernel-irqchip=off -cdrom alpine-virt-3.16.0-x86_64.iso
Windows Hypervisor Platform accelerator is operational

The display looks broken (no input), but -serial stdio almost works.

ISOLINUX 6.04 6.04-pre1  Copyright (C) 1994-2015 H. Peter Anvin et al
boot:


   OpenRC 0.44.10 is starting up Linux 5.15.41-0-virt (x86_64)

 * /proc is already mounted
 * Mounting /run ... * /run/openrc: creating directory
 * /run/lock: creating directory
 * /run/lock: correcting owner
 * Caching service dependencies ... [ ok ]
 * Remounting devtmpfs on /dev ... [ ok ]
 * Mounting /dev/mqueue ... [ ok ]
 * Mounting modloop  ... * Verifying modloop
 [ ok ]
 * Mounting security filesystem ... [ ok ]
 * Mounting debug filesystem ... [ ok ]
 * Mounting persistent storage (pstore) filesystem ... [ ok ]
 * Starting busybox mdev ... [ ok ]
 * Loading hardware drivers ... [ ok ]
 * Loading modules ... [ ok ]
 * Setting system clock using the hardware clock [UTC] ... [ ok ]
 * Checking local filesystems  ... [ ok ]
 * Remounting filesystems ... [ ok ]
 * Mounting local filesystems ... [ ok ]
 * Configuring kernel parameters ... [ ok ]
 * Migrating /var/lock to /run/lock ... [ ok ]
 * Creating user login records ... [ ok ]
 * Cleaning /tmp directory ... [ ok ]
 * Setting hostname ... [ ok ]
 * Starting busybox syslog ... [ ok ]
 * Starting firstboot ... [ ok ]

Welcome to Alpine Linux 3.16
Kernel 5.15.41-0-virt on an x86_64 (/dev/ttyS0)

localhost login: root
root
Welcome to Alpine!

The Alpine Wiki contains a large amount of how-to guides and general
information about administrating Alpine systems.
See <http://wiki.alpinelinux.org/>.

You can setup the system with the command: setup-alpine

You may change this message by editing /etc/motd.

localhost:~#

EDIT: The kernel-irqchip thing was a workaround for a startup error:

whpx: injection failed, MSI (0, 0) delivery: 0, dest_mode: 0, trigger mode: 0, vector: 0, lost (c0350005)

And with "almost works", I mean this console has some weird issues:

localhost:~# apk add containerd
pk add containerd
-ash: pk: not found

EDIT: -display sdl works (better than "gtk")

[screenshot: qemu-whpx-efi-sdl]

Here the console interaction works better.

@afbjorklund
Member

afbjorklund commented Jun 19, 2022

So lima "works", and qemu "works". Left to do is making them work together, and add some documentation. 😃

Accelerator: https://docs.microsoft.com/en-us/virtualization/api/hypervisor-platform/hypervisor-platform (WHPX)

WSL is good but very slow in most cases , i prefer to have this in windows , :)

The main difference between WSL2 and Lima, is that lima uses a new virtual machine for each instance...
With the Windows Subsystem for Linux, all the system containers share the same VM kernel (like in LXC)

It is possible to start one Linux distribution (like Alpine), and then start system containers for Ubuntu or whatever.
Then the experience should be similar, same goes with sharing files - if opting in to use 9p (same as WSL uses)

@afbjorklund
Member

afbjorklund commented Jun 26, 2022

Almost got it to run, final hurdle is converting paths for qemu (dos, argh) and scripts for ssh (don't ask)

"[hostagent] qemu[stderr]: C:\\Program Files\\qemu\\qemu-system-x86_64.exe: cannot create PID file: Failed to create PID file"
"[hostagent] stdout=\"\", stderr=\"command-line line 0: invalid quotes\\r\\n\", err=failed to execute script \"ssh\": stdout=\"\", stderr=\"command-line line 0: invalid quotes\\r\\n\": exit status 255"

Fixes (PRs):

  • lima compiles for GOOS=windows, cross-compiled on linux
  • unit tests run for GOOS=windows, using wine64 on linux

Verified:

  • regular limactl.exe operations (download, etc) work OK on Windows 10
  • starting virtual machine with hardware acceleration works on Windows 10

Fallbacks:

  • fallback to user "lima" using existing code, due to DOMAIN\user
  • use id -u and id -g where available, otherwise fallback uid gid
  • add home directory to the LimaUser, instead of using it "raw"
  • use cygpath $HOME where available, otherwise just use "filepath"
  • use windows paths (filepath) for host home and unix paths (path) for guest home
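The first fallback above can be sketched as follows. This is a minimal illustration, not Lima's actual code: the function name `guestUsername` and the exact validation pattern are assumptions made for the example. The idea is only that a Windows account name such as `DOMAIN\user` is not a valid Linux user name, so the guest user falls back to "lima".

```go
package main

import (
	"fmt"
	"regexp"
)

// validUnixName is an assumed pattern for names that a Linux guest accepts;
// the real check in Lima may differ.
var validUnixName = regexp.MustCompile(`^[a-z_][a-z0-9_-]*$`)

// guestUsername (hypothetical helper) returns the host account name when it is
// usable inside the guest, and the fallback name "lima" otherwise.
func guestUsername(hostName string) string {
	if validUnixName.MatchString(hostName) {
		return hostName
	}
	// Names such as DOMAIN\user, or names with spaces or non-ASCII
	// characters, end up here.
	return "lima"
}

func main() {
	for _, name := range []string{`DESKTOP-1\anders`, "Anders Björklund", "alice"} {
		fmt.Printf("%q -> %q\n", name, guestUsername(name))
	}
}
```

The same shape would apply to the uid/gid fallback: try `id -u` and `id -g`, and use fixed values when those tools are not available.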

Workarounds:

User needs to add qemu, and regular tools - either MSYS2 or Git for Windows (MinGW) would work...

It's all normal programs, so it would be possible to install qemu-system-x86_64.exe and ssh.exe etc.
It does not require a Unix environment (like Cygwin) or other emulator, besides the regular QEMU (and Lima).

Using the "whpx" accelerator requires Windows with Hyper-V (Pro?), falling back to "haxm" would be possible.


Will make a PR for the fallbacks, but the rest needs a design decision - or to wait for AF_UNIX support ?

At this point it is just a proof-of-concept or technical demo, users are still recommended to use WSL2.

Note: this does not improve the performance (with Hyper-V), but it should be on par with the Mac version ?

I assume that all developers will be using Unix, and will not set up anything for PowerShell or DOS etc.

@arixmkii
Contributor

AF_UNIX support ?

@afbjorklund are you aware of any activity on enabling AF_UNIX for Windows builds on the QEMU side? This could benefit other projects as well, like Podman providing podman machine with macOS-like behavior as an alternative to the WSL2 option.

@afbjorklund
Member

afbjorklund commented Jun 27, 2022

Sorry, I don't know anything about it. The information I stumbled upon so far looked more like "gross hacks" than anything else.

https://cygwin.com/pipermail/cygwin/2020-June/245088.html

https://stackoverflow.com/questions/23086038/what-mechanism-is-used-by-msys-cygwin-to-emulate-unix-domain-sockets

I will assume that Unix sockets are unavailable on Windows

@afbjorklund
Member

afbjorklund commented Jun 27, 2022

Afaik, Podman only uses Unix sockets for legacy (pre 18.09) Docker clients ? The other clients use SSH directly

@arixmkii
Contributor

Podman machine uses unix socket for qmp at least

-qmp unix://var/folders/<redacted>/T/podman/qmp_podman-machine-default.sock,server=on,wait=off 

And looks like for virtio-serial device

-device virtio-serial -chardev socket,path=/var/folders/<redacted>/T/podman/podman-machine-default_ready.sock,server=on,wait=off,id=podman-machine-default_ready 

These are extracts from podman machine start command line running on MacOS.

@afbjorklund
Member

afbjorklund commented Jun 27, 2022

Oh, I thought you meant for the podman connection...
(Formerly known as CONTAINER_HOST or PODMAN_USER/PODMAN_HOST/PODMAN_PORT)

@arixmkii
Contributor

arixmkii commented Jun 27, 2022

No. I was talking about podman machine command and framework specifically. Having AF_UNIX in QEMU windows build could reduce the amount of platform specific branches to implement QEMU backed podman machine command for modern Windows versions.

@afbjorklund
Member

afbjorklund commented Jun 27, 2022

For the PoC, I just used -chardev pipe (and mkfifo) for the qemu control.

       -chardev pipe,id=id,path=path
           Create a two-way connection to the guest. The behaviour differs slightly between Windows hosts and other hosts:

           On Windows, a single duplex pipe will be created at \\.pipe\path.

           On other hosts, 2 pipes will be created called path.in and path.out. Data written to path.in will be received by the guest. Data written by the guest can be read
           from path.out. QEMU will not create these fifos, and requires them to be present.

           path forms part of the pipe path as described above. path is required.

Didn't bother creating a qmp Monitor for Unix though, "left as an exercise"
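A rough sketch of how the QEMU control channel arguments could be chosen per host OS. The function name `chardevArgs` is hypothetical; the two option strings follow the `-chardev pipe` and `-chardev socket` forms quoted in this thread. Passing the OS name in as a parameter is only a convenience so that both branches can be tried on any machine.

```go
package main

import (
	"fmt"
	"runtime"
)

// chardevArgs (hypothetical helper) returns the QEMU arguments for a control
// channel: a named pipe on Windows, a Unix socket everywhere else.
func chardevArgs(goos, id, path string) []string {
	if goos == "windows" {
		return []string{"-chardev", fmt.Sprintf("pipe,id=%s,path=%s", id, path)}
	}
	return []string{"-chardev", fmt.Sprintf("socket,id=%s,path=%s,server=on,wait=off", id, path)}
}

func main() {
	// "qmp" and "qmp.sock" are example values only.
	fmt.Println(chardevArgs(runtime.GOOS, "qmp", "qmp.sock"))
}
```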

@afbjorklund
Member

afbjorklund commented Jun 27, 2022

But otherwise, I would be happy enough if exec.Command actually worked (with filepath)

Note that the examples in this package assume a Unix system. They may not run on Windows, and they do not run in the Go Playground used by golang.org and godoc.org.

https://pkg.go.dev/os/exec#Command

On Windows, processes receive the whole command line as a single string and do their own parsing.

@afbjorklund
Member

afbjorklund commented Jun 29, 2022

Turns out that qemu doesn't start up correctly with pipe chardev. Switch them to null, and it works.

The WHPX accelerator is not compatible with -cpu max, so that needs a special case (like -cpu host)

Using Wine is too unstable to do anything but run unit tests, even with -accel tcg there are random failures.

The path issues were related to that os.UserHomeDir value is not compatible with exec.Command...

In case your home directory is C:\Users\AndersBjörklund or something, it fails to encode it properly.

This affects the default $LIMA_HOME and ~/.ssh, so probably needs some $HOME workaround/fallback <sigh>.

@afbjorklund
Member

afbjorklund commented Jun 29, 2022

More random whpx failures:

{"level":"debug","msg":"qemu[stdout]: Windows Hypervisor Platform accelerator is operational","time":"2022-06-29T17:02:37+02:00"}
{"level":"debug","msg":"qemu[stderr]: C:\\Program Files\\qemu\\qemu-system-x86_64.exe: WHPX: Failed to emulate MMIO access with EmulatorReturnStatus: 2","time":"2022-06-29T17:02:37+02:00"}
{"level":"debug","msg":"qemu[stderr]: C:\\Program Files\\qemu\\qemu-system-x86_64.exe: WHPX: Failed to exec a virtual processor","time":"2022-06-29T17:02:37+02:00"}
{"error":"exit status 3","level":"info","msg":"QEMU has exited","time":"2022-06-29T17:02:37+02:00"}

Come and go, mysteriously...

It seems to mostly affect the default (ubuntu) image, even though only the ISO / URL changes ?.

Not the alpine (alpine-lima) image, but unfortunately it does not include any nerdctl/containerd.

@afbjorklund
Member

afbjorklund commented Jun 29, 2022

Unfortunately, the terminal detection and signal handling is all messed up.

time="2022-06-29T17:25:44+02:00" level=info msg="Terminal is not available, proceeding without opening an editor"

If you terminate the limactl shell, then the limactl start kills the qemu.

{"level":"info","msg":"Received SIGINT, shutting down the host agent","time":"2022-06-29T17:23:14+02:00"}

@afbjorklund
Member

afbjorklund commented Jul 1, 2022

In theory, this would be the way to fix the home:

\\?\C:\Users\AndersBjörklund (UNC)

In practice, this is the only workaround that works:

C:\Users\ANDERS~1 (DOS)


Probably want to flip these internal paths back to regular again, before displaying them to the user ?

Ironically, this seems to be done using filepath.EvalSymlinks (which currently breaks LimaDir)

EDIT: Added PR, instead of hardcoded string:

@afbjorklund
Member

afbjorklund commented Jul 2, 2022

Basic operation on Windows 10, when using Alpine with a custom containerd + nerdctl installation.

limactl start template://alpine

$ limactl ls
NAME      STATUS     SSH                ARCH      CPUS    MEMORY    DISK      DIR
alpine    Running    127.0.0.1:51129    x86_64    4       4GiB      100GiB    C:\Users\ANDERS~1\.lima\alpine

$ uname
MINGW64_NT-10.0-19044

$ lima uname
Linux

$ lima sudo nerdctl version
Client:
 Version:       v0.21.0
 OS/Arch:       linux/amd64
 Git commit:    9ddf5226eabcbb7b4b43987f3b0f8d53d86d3bca

Server:
 containerd:
  Version:      v1.6.6
  GitCommit:    10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1

Note: no mounts, until the host/guest path situation is sorted out

DEBU[0002] the host home does not seem mounted, so the guest shell will have a different cwd

Note: no virtfs on windows, which means no 9p only sshfs mounts

ERROR: Feature virtfs cannot be enabled: virtio-9p (virtfs) requires Linux or macOS


Ubuntu template is still broken, MSYS2 terminal is still broken ("invalid quotes")

hostagent/useragent uses insecure ports, and qmp/serial sockets are disabled...

@afbjorklund
Member

afbjorklund commented Jul 2, 2022

This is the EFI bug, turns out alpine still uses BIOS:

https://gitlab.com/qemu-project/qemu/-/issues/513

Seems like a workaround is to use -bios instead ?

EDIT: Indeed, that was it (with a custom OVMF.fd)

[screenshot: lima-ubuntu-qemu-whpx]

So now both images are working OK, with WHPX.


@AkihiroSuda added the roadmap label Jul 19, 2022
@afbjorklund
Member

afbjorklund commented Jul 22, 2022

Fixed the quoting issues for MSYS, so now all three consoles should work (with lima)

  1. MSYS2 (msys64 subsystem)
  2. MinGW64 (Git for Windows)
  3. Command Prompt (cmd.exe)

Will push "port" and "pipe" up as drafts, and rebase and clear up the home directory...

  • port: use tcp sockets instead of unix sockets, for hostagent/guestagent
  • pipe: use named pipes instead of unix sockets, for qemu communication
  • add better handling of the external OVMF_CODE.fd, unlike the internal BIOS
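The "port" change above can be pictured with a small sketch. The name `listenSpec` is hypothetical and this is not the draft PR's code: it only shows the decision between a Unix socket and a loopback TCP port, with the result meant to be passed to `net.Listen(network, address)`.

```go
package main

import (
	"fmt"
	"runtime"
)

// listenSpec (hypothetical helper) picks the network and address for an
// agent socket. On Windows it uses a loopback TCP port instead of a Unix socket.
func listenSpec(goos, sockPath string) (network, address string) {
	if goos == "windows" {
		// Port 0 asks the OS for any free port; binding to 127.0.0.1
		// keeps the listener off external interfaces.
		return "tcp", "127.0.0.1:0"
	}
	return "unix", sockPath
}

func main() {
	// "ha.sock" is an example path only.
	network, address := listenSpec(runtime.GOOS, "ha.sock")
	fmt.Println(network, address)
}
```

Binding to loopback alone does not authenticate the peer, so a plain TCP listener like this is only suitable for testing.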

@afbjorklund
Member

Typical output:

MSYS2

[screenshot: lima-windows-msys2]

MinGW64

[screenshot: lima-windows-mingw64]

cmd.exe

[screenshot: lima-windows-cmd]

@AkihiroSuda
Member

Thanks a lot @afbjorklund

port: use tcp sockets instead of unix sockets, for hostagent/guestagent

This is fine until ssh.exe supports UNIX sockets, but this TCP socket has to be protected with mTLS to avoid potential attacks from malicious web sites via WebSockets.

@afbjorklund
Member

afbjorklund commented Jul 22, 2022

This is fine until ssh.exe supports UNIX sockets, but this TCP socket has to be protected with mTLS to avoid potential attacks from malicious web sites via WebSockets.

I know, that is why I left it in draft. It's the same status as Docker's port 2375 - ok for testing development, but needs port 2376 for deployment production. Same thing with the named pipes unfortunately, currently it is using "null" instead of "pipe" in qemu.

       -chardev pipe,id=%s,path=%s
       -chardev socket,id=%s,path=%s,server=on,wait=off

Anyway, I will put the code up there for reading - hopefully there is some reasonable implementation to add tls to it (?), and hopefully there is some easy fix / patch to qemu for windows to allow it to still boot even when given the pipe option.


Investigating some weird panic with the dns server as well, commented it out - but need to find out why it won't start...

                logrus.Debugf("Start %v server listening on: %v", network, addr)
                if e := s.ListenAndServe(); e != nil {
                        panic(e)
                }

So it remains in the "proof of concept" status, reason for pushing it is so that any Windows developer can help out.

@arixmkii
Contributor

arixmkii commented Jan 7, 2023

Another take on "bad" paths issue

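// toCygpath converts a Windows path to a Cygwin/MSYS2 path and, where possible,
// routes it through a symlink under /tmp named after the SHA-256 of the directory,
// so that the result contains no spaces or non-ASCII characters.
// Needs: crypto/sha256, encoding/hex, path, runtime.
// call is a helper (not shown) that runs a command with extra environment variables.
func toCygpath(p string) string {
	if runtime.GOOS != "windows" {
		return p
	}
	cp, err := call([]string{"cygpath", "-u", p}, nil)
	if err != nil {
		return p // cygpath is not available: keep the original path
	}
	cd := path.Dir(cp)
	cf := path.Base(cp)
	sum := sha256.Sum256([]byte(cd))
	td := path.Join("/tmp", hex.EncodeToString(sum[:]))
	// Reuse the link if it already exists.
	if _, err := call([]string{"test", "-d", td}, nil); err == nil {
		return path.Join(td, cf)
	}
	// Otherwise try to create it as a native Windows symlink.
	if _, err := call([]string{"ln", "-s", cd, td}, []string{"MSYS=winsymlinks:nativestrict"}); err == nil {
		return path.Join(td, cf)
	}
	return cp
}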

This utilizes symlinks: we create a symlink with a "good" path that points to the "bad" one. The good path is constructed as "/tmp/SHA256VALUEOFBADONE". The problem is that creating a symlink on a Windows host requires either elevation or the developer mode setting in the OS, and obviously running limactl and all its children elevated is not a great idea. So we are left with developer mode, which is kinda okay-ish as I consider Lima a development tool, but not everyone would agree. A solution would be to provide a script that the user could run elevated, which creates all symlinks for the specific user in advance; the code would then detect an existing link and just use it via path substitution. It would be much better if ssh ControlPath just accepted all valid paths (I think it is the only place where paths with whitespace are rejected; I'm not so sure about full Unicode support).

Update 1: the good part is that we only need this magic around SSH commands, all other tools should work with Windows paths.

@arixmkii
Contributor

arixmkii commented Jan 8, 2023

Status updates:

  1. SSHFS works in RW mode
  2. 9pfs works in RO mode
    2.1. contacted developer of the patchset on some hints about RW mode
    2.2 reported 2 bugs to the developer - current implementation can't handle directories with symlinks or Unix domain sockets.
  3. The sample above, I consider a reasonable implementation for fixing path issues. At least for the moment.

Regarding 9pfs. It did work only RO, when I tried it with Podman, so, it is consistent at least, but let's see what response I will get about that issue, because annotation to the patchset mentions Write as supported.

I will create separate issues for the parts of the code that need improvements to support Windows.

@arixmkii
Contributor

Tested this workaround #909 (comment) with cygwin. This works with cygwin symlinks, which doesn't require any sort of elevation or development mode. So, in this aspect cygwin has its edge over msys2 option.

@arixmkii
Contributor

Published QEMU build with 9pfs and pflash/UEFI patches (functionally equal to what I used for my experiments, but now built with CI) https://github.com/arixmkii/qcw/releases/tag/v0.0.8

@subfuzion

Hey folks! I'm working on a book, and as someone completely enamored with the Lima experience on the Mac for demonstrating Linux system programming, it would be wonderful to see parity of experience for Windows users. Respect for all the open source contributors here -- the effort I see in this thread alone to make this happen is nothing short of amazing, so how unrealistic am I to hope that this is something that might be achieved by early 2024?

@afbjorklund
Member

Another possibility for Windows users, besides using the portable QEMU, would be a Hyper-V driver...
(it would be doable, the VirtualBox driver #1277 was not too much work to set up - for the minimum)

But I have not paid attention to what the current status is, the PoC was "working" but had security issues.

@Anutrix

Anutrix commented Jan 21, 2024

Since #1721 got merged, what's the status now? Does the docs mention how to install limactl on Windows or WSL2?

@AkihiroSuda
Member

Since #1721 got merged, what's the status now? Does the docs mention how to install limactl on Windows or WSL2?

No docs yet (contribution is wanted), but it is installed and tested in the CI like this

windows:
  name: "Windows tests"
  runs-on: windows-2022-8-cores
  timeout-minutes: 30
  steps:
  - name: Enable WSL2
    run: |
      wsl --set-default-version 2
      # Manually install the latest kernel from MSI
      Invoke-WebRequest -Uri "https://wslstorestorage.blob.core.windows.net/wslblob/wsl_update_x64.msi" -OutFile "wsl_update_x64.msi"
      $pwd = (pwd).Path
      Start-Process msiexec.exe -Wait -ArgumentList "/I $pwd\wsl_update_x64.msi /quiet"
      wsl --update
      wsl --status
      wsl --list --online
  - name: Install WSL2 distro
    timeout-minutes: 3
    run: |
      # FIXME: At least one distro has to be installed here,
      # otherwise `wsl --list --verbose` (called from Lima) fails:
      # https://github.com/lima-vm/lima/pull/1826#issuecomment-1729993334
      # The distro image itself is not consumed by Lima.
      # ------------------------------------------------------------------
      # Ubuntu-22.04: gets stuck in some infinite loop during adduser
      # OracleLinux_9_1: almostly silently fails, and just prints "Usage: adduser [options] LOGIN"
      wsl --install -d openSUSE-Leap-15.5
      wsl --list --verbose
  - name: Set gitconfig
    run: |
      git config --global core.autocrlf false
      git config --global core.eol lf
  - uses: actions/checkout@v4
    with:
      fetch-depth: 1
  - uses: actions/setup-go@v5
    with:
      go-version: 1.21.x
  - name: Unit tests
    run: go test -v ./...
  - name: Make
    run: make
  - name: Smoke test
    # Make sure the path is set properly and then run limactl
    run: |
      $env:Path = 'C:\Program Files\Git\usr\bin;' + $env:Path
      Set-ItemProperty -Path 'Registry::HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\Session Manager\Environment' -Name PATH -Value $env:Path
      .\_output\bin\limactl.exe start template://experimental/wsl2
    # TODO: run the full integration tests
  - name: Debug
    if: always()
    run: type C:\Users\runneradmin\.lima\wsl2\ha.stdout.log
  - name: Debug
    if: always()
    run: type C:\Users\runneradmin\.lima\wsl2\ha.stderr.log

@arixmkii
Contributor

I returned to experimenting with Lima + QEMU on Windows. The previous big no-gos were the lack of mTLS (because of the intermediate TCP between cygwin-ish runtimes and native Windows) and mux behaving differently (breaking some commands and making it ugly). There were hopes that AF_UNIX would come to OpenSSH Windows builds, but there was barely any progress and mux support was not even in the first batch to be added.

Since then I have changed the concept of how this could be achieved: instead of a cygwin/msys2/Git shell, go full Linux with a minimalist Alpine-based VM in WSL2 mirrored networking mode (to have the full localhost magic). This small distro covers all the networking stuff and utilities (like id, wslpath, ssh, ssh-keygen), while QEMU runs the actual workloads.

Why WSL2 VM instead of cygwin:

  • it is simpler to create a distribution based on Alpine than to bundle everything needed from msys2 (somehow achievable via chroot-like options for the msys2 package manager, but definitely more involved, and Alpine feels more controllable);
  • OpenSSH just works as on other platforms, with no need to work around discrepancies;
  • reverse-sshfs was an issue with some Windows paths (when the username had spaces or non-ASCII characters);
  • smaller code changes compared to my prior experiments.

It is sort of stupid to have a lightweight VM next to a full-sized VM, but if that actually works, maybe it is not that stupid.

I got some successes - running web server with port forwarding and lima shell operational, now will move to checking if reverse-sshfs will be a troublemaker.

@afbjorklund
Member

I think that containers work quite well in the "new" WSL of Windows 11, now that it has both cgroups v2 and systemd*.

But it would still be nice to have QEMU support on all platforms, and some kind of simple Hyper-V VM driver (without WSL)

* it even supports KVM and GUI, which was a bit surprising

Probably need the driver framework to be in place, though?

@arixmkii
Contributor

This arixmkii@16467b4 got me usable QEMU setup with port forwarding, 9p, reverse-sshfs. To hide the complexity of WSL hosted tools I wrote this tool https://github.com/arixmkii/go-wsllinks So, under extras directory in bin there are:

  • id.exe
  • realpath.exe
  • sftp-server.exe
  • ssh.exe
  • ssh-keygen.exe
  • sync_lima_file.exe
  • wslpath.exe

Lima can use them almost the same way as native tools (path translation is added where required).
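To give an idea of what such a wrapper has to do, here is a sketch of the two pieces involved: translating a Windows path into the path the same file has inside WSL, and building the command line that runs a tool inside a distro. This is not the go-wsllinks implementation. The helper names are hypothetical, the `/mnt/<drive>` prefix assumes the default WSL automount root, and `lima-infra` is used only as an example distro name.

```go
package main

import (
	"fmt"
	"strings"
)

// toWSLPath (hypothetical helper) converts an absolute Windows path such as
// C:\Users\me into /mnt/c/Users/me, assuming the default automount root /mnt.
func toWSLPath(winPath string) (string, error) {
	if len(winPath) < 3 || winPath[1] != ':' || (winPath[2] != '\\' && winPath[2] != '/') {
		return "", fmt.Errorf("not an absolute drive path: %q", winPath)
	}
	c := winPath[0]
	if !((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) {
		return "", fmt.Errorf("invalid drive letter in %q", winPath)
	}
	drive := strings.ToLower(winPath[:1])
	rest := strings.ReplaceAll(winPath[3:], `\`, "/")
	return "/mnt/" + drive + "/" + rest, nil
}

// wslCommand (hypothetical helper) builds the argument vector that runs a
// tool inside the given WSL distro.
func wslCommand(distro, tool string, args ...string) []string {
	return append([]string{"wsl.exe", "-d", distro, "--", tool}, args...)
}

func main() {
	p, err := toWSLPath(`C:\Users\me\.lima\_config\user`)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(wslCommand("lima-infra", "ssh-keygen", "-f", p))
}
```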

I still need to finalize the build scripts for the WSL distro (for tests I manually imported Alpine and installed all tools).

What still requires work - AF_UNIX socket forwarding. I checked that it is still possible to implement through intermediate TCP transport, but this will not be good w/o mTLS, I hope to figure out another way.

I also need to test WSL driver support, I definitely broke this, so, this will need fixing.

I plan to finalize WSL distro stuff and then setup CI to create at least one test build for sharing. Then I will do another round evaluating options for AF_UNIX forwarding support.

@arixmkii
Contributor

I think that containers work quite well in the "new" WSL of Windows 11,

No doubts here. From my point of view the powerful VM provisioning provided by Lima is as great as containers experience it gives. Having this option available would be beneficial.

@arixmkii
Contributor

I built very first artifact version from my experimental code with CI and it is available here: https://github.com/arixmkii/qcw/releases/tag/v0.0.28 They are highly experimental and for evaluation purposes only, don't try it on your production/important systems. People interested should consult the README file for instructions. The list of code changes applied on top of Lima sources is arixmkii@f97d2c5

@arixmkii
Contributor

arixmkii commented Jan 31, 2025

Resolved blocking issues with WSL2 machine type in my rebuilds (containerd is experiencing issues after setup, this is yet to be investigated). Updated versions will be published here https://github.com/arixmkii/qcw/releases/tag/v0.0.29 It is important to delete previous version of lima-infra WSL instance and install the new one. The build is still for evaluation purposes only and not production ready.

Submitted some quick win fixes:

And created a backlog for other required changes:

Some other planned activities - test/evaluate with msys2 userland instead of WSL2, add at least basic testing to the current CI builds (CI part basics are done for the WSL flavor, I will add some QEMU variants when the thing runs against upstream QEMU - needs #3176).

@arixmkii
Contributor

arixmkii commented Feb 12, 2025

I published the updated build https://github.com/arixmkii/qcw/releases/tag/v0.0.30 This passes integration test for default.yaml locally running against unmodified QEMU (it is impossible to run it in GH actions, because no server Windows flavor supports mirrored networking mode for WSL2). Changes could be checked in this branch https://github.com/arixmkii/lima/tree/qemu-tools-windows-preview4

My next TODO steps:

  • evaluate msys2/cygwin userland
  • fully restore WSL2 yaml support
  • investigate possibilities for NAT networking mode support (at least that it could run in GH action)
  • start tidying up patches to create PRs for review

2 more issues added to backlog:

@arixmkii
Contributor

I tried to use msys2 userland apps (I expect no significant differences with cygwin) and I'm back to errors in ssh like

mux_master_process_new_session: failed to receive fd 0 from client
mm_send_fd: sendmsg(2): Connection reset by peer
mux_client_request_session: send fds failed
mux_client_request_session: read from master failed: Connection reset by peer
Failed to connect to new control master

Some old links saying that mux is not working on cygwin, probably nothing changed in that regard

In the past I tried to work around this, but I see no point: it would mean either redoing everything for all platforms or having significantly different internals in Lima for the msys2/cygwin scenario, while support via WSL requires minimal alterations.

I will continue with WSL userland and focus on getting it working to some extent on NAT and mirrored modes.

@arixmkii
Contributor

arixmkii commented Feb 15, 2025

🎉 First green CI build with int tests on Windows https://github.com/arixmkii/qcw/actions/runs/13348707154/job/37282476154

It tests only default.yaml It uses QEMU distributed with msys2.

Current issue with NAT networking mode is that QEMU is hardcoded to listen for SSH only on localhost and in NAT mode one can't access Host loopback (in mirrored mode it works out of the box)

Current CI is using socat bridge to connect socat TCP-LISTEN:60022,bind=127.0.0.1,reuseaddr,fork EXEC:"/mnt/d/a/_temp/msys64/usr/bin/socat.exe - TCP:127.0.0.1:60022" It is Linux and msys2 socats connected via STDIO and accessing own loopback interfaces. This works, but feels rather complex and janky for such an important part as SSH. I consider adding logic to detect NAT networking mode and then listen for SSH connections on internal WSL IP address instead.
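The address selection being considered could be as small as the sketch below. Everything here is an assumption for illustration: the function name `sshListenHost`, the mode strings "nat" and "mirrored", and the idea that the caller has already determined the networking mode and the host's address on the WSL virtual network by some other means.

```go
package main

import (
	"fmt"
	"strings"
)

// sshListenHost (hypothetical helper) decides where the SSH forward should
// listen. In mirrored mode the WSL side can reach the host loopback, so
// 127.0.0.1 works. In NAT mode it cannot, so the host's address on the WSL
// virtual network is used when it is known.
func sshListenHost(networkingMode, wslHostIP string) string {
	if strings.EqualFold(networkingMode, "nat") && wslHostIP != "" {
		return wslHostIP
	}
	return "127.0.0.1"
}

func main() {
	// 172.20.0.1 is an example address only.
	fmt.Println(sshListenHost("nat", "172.20.0.1"))
	fmt.Println(sshListenHost("mirrored", "172.20.0.1"))
}
```

Falling back to loopback when the address is unknown keeps the existing behavior for every case the detection does not cover.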

Things still to figure out:

  • Unix port forwarding for QEMU driver
  • WSL2 driver template - the best I could manage so far - make it run, but I need to test networking also

Added backlog tasks:

Bigger TODO:

  • do a write up to document everything - overall design, important implementation details, specifics of every supported mode, etc.

@AkihiroSuda
Member

Thanks!

I consider adding logic to detect NAT networking mode and then listen for SSH connections on internal WSL IP address instead.

👍

@arixmkii
Contributor

arixmkii commented Feb 22, 2025

🎉 The first green build with integration test with QEMU and with WSL2 https://github.com/arixmkii/qcw/actions/runs/13474629971/job/37652601743

List of disabled tests for WSL2:

  • proxy settings (needs fix)
  • systemd (looks like an issue with the template base OS)
  • port forwarding - port forwarding with WSL2 is tricky as WSL2 has its own port forwarding for loopback (and not only with mirrored mode)

List of newly reported issues:

List of major TODOs:

  • figure out the way to forward Unix sockets

@arixmkii
Contributor

arixmkii commented Feb 24, 2025

I published a fresh build of Lima + WSL2 companion. I do somewhat regular releases of a number of tools for Windows and now this will also include Lima rebuilds. All builds will be going through integration tests with QEMU and WSL2.

| WSL2 networking / Lima machine / Forwarder | Status |
| --- | --- |
| NAT / WSL2 machine / SSH Forwarder | Works, integration tests in CI (some to be fixed), port forwarding to 127.0.0.1 and WSL2 machine external IP address |
| NAT / WSL2 machine / gRPC Forwarder | Not tested |
| NAT / QEMU machine / SSH Forwarder | Works, integration tests in CI, port forwarding to 127.0.0.1 and WSL2 machine external IP address |
| NAT / QEMU machine / gRPC Forwarder | Works, integration tests checked by dev, port forwarding to 127.0.0.1 and local machine external IP address |
| mirrored / WSL2 machine / Any Forwarder | Not tested |
| mirrored+hostAddressLoopback / WSL2 machine / Any Forwarder | Not tested |
| mirrored / QEMU machine / SSH Forwarder | Works, integration tests checked by dev, port forwarding to 127.0.0.1 |
| mirrored / QEMU machine / gRPC Forwarder | Works, integration tests checked by dev, port forwarding to 127.0.0.1 and local machine external IP address |
| mirrored+hostAddressLoopback / QEMU machine / SSH Forwarder | Works, integration tests checked by dev, port forwarding to 127.0.0.1 and local machine external IP address |
| mirrored+hostAddressLoopback / QEMU machine / gRPC Forwarder | Works, integration tests checked by dev, port forwarding to 127.0.0.1 and local machine external IP address |

  • the gRPC forwarder is not tested in CI to save time, only the default configuration is covered; when it becomes the default it will take the SSH forwarder's place;
  • mirrored configurations can't be tested on Windows Servers (even latest 2025 doesn't allow mirrored mode);
  • only default and experimental/wsl2 templates are tested;
  • ipv6 is available only with gRPC forwarder.

I will update the link to the up to date release in this post and will only post new ones, when there are reasonable advancements in how this progresses to prevent further notification spamming.

@arixmkii
Contributor

I published the first PR, where QEMU integration tests are run on Windows: #3255

The feature toggle has been requested to be reworked to make it work with other VM types, so there is some more work to do. I also have tests passing (still with high failure rates) with sshfs running against a Git installation, but they are extremely unstable and I have yet to see them succeed in GitHub Actions. I have yet to see what could be done to make them more stable, if anything (9pfs works better, but unfortunately it is not known whether this support will ever mature into stable QEMU on Windows).

@arixmkii
Contributor

arixmkii commented Mar 30, 2025

I prepared build for Windows, which is capable of running QEMU VMs w/ Git shell tools https://github.com/arixmkii/qcw/releases/tag/v0.0.34

It includes these PRs on top of original revision:

And additionally a prototype implementation of

AF_UNIX port forwarding is still missing.


8 participants