V3 Filesystem Monitoring #7458

Open
1 of 6 tasks
Byron opened this issue Feb 28, 2025 · 0 comments
Byron commented Feb 28, 2025

With such a system, keeping the app in sync with what's on disk will perform better than running a whole git status each time something changes on disk.

Requirements

  • portable
    • Support for Linux, Windows and macOS
  • Multi-Worktree
    • One 'system' can handle many worktrees. So if it's a separate binary, it should be able to deal with worktrees of many Git repositories at once.
  • Poll-fallback
    • Not all filesystems support efficient watching, but there should still be a way to learn about changes, possibly by emulating the essence of watching through polling.
  • Support for .git changes
    • Allow getting notified when a ref changes, when the index changes, or when anything else changes that is relevant to what's shown in the UI.
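
To make the poll-fallback and multi-worktree requirements more tangible, here is a minimal sketch of snapshot-based polling using only the Rust standard library. All names (`Stamp`, `Change`, `Snapshot`, `Poller`) are hypothetical and not part of any existing crate; it assumes that comparing file size and modification time is good enough to detect a change, which a real implementation would likely need to refine.

```rust
use std::collections::BTreeMap;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

/// What polling can observe about a file without reading its content.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Stamp {
    pub len: u64,
    pub modified: Option<SystemTime>,
}

/// Hypothetical event type; a real API would likely carry more detail.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum Change {
    Added(PathBuf),
    Modified(PathBuf),
    Removed(PathBuf),
}

pub type Snapshot = BTreeMap<PathBuf, Stamp>;

/// Record a stamp for every non-directory entry below `root`.
/// Symlinks are not followed, so a link to a directory is stamped like a file.
pub fn snapshot(root: &Path) -> io::Result<Snapshot> {
    let mut out = Snapshot::new();
    let mut stack = vec![root.to_path_buf()];
    while let Some(dir) = stack.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            let meta = entry.metadata()?;
            if meta.is_dir() {
                stack.push(entry.path());
            } else {
                let stamp = Stamp { len: meta.len(), modified: meta.modified().ok() };
                out.insert(entry.path(), stamp);
            }
        }
    }
    Ok(out)
}

/// Compare two snapshots of the same worktree and list what changed.
pub fn diff(old: &Snapshot, new: &Snapshot) -> Vec<Change> {
    let mut changes = Vec::new();
    for (path, stamp) in new {
        match old.get(path) {
            None => changes.push(Change::Added(path.clone())),
            Some(prev) if prev != stamp => changes.push(Change::Modified(path.clone())),
            Some(_) => {}
        }
    }
    for path in old.keys() {
        if !new.contains_key(path) {
            changes.push(Change::Removed(path.clone()));
        }
    }
    changes
}

/// One poller that handles many worktrees, keyed by worktree root.
#[derive(Default)]
pub struct Poller {
    worktrees: BTreeMap<PathBuf, Snapshot>,
}

impl Poller {
    /// Start tracking `root` by taking its initial snapshot.
    pub fn watch(&mut self, root: PathBuf) -> io::Result<()> {
        let snap = snapshot(&root)?;
        self.worktrees.insert(root, snap);
        Ok(())
    }

    /// Re-scan every worktree and return changes tagged with their worktree root.
    pub fn poll(&mut self) -> io::Result<Vec<(PathBuf, Change)>> {
        let mut out = Vec::new();
        for (root, old) in self.worktrees.iter_mut() {
            let new = snapshot(root)?;
            out.extend(diff(old, &new).into_iter().map(|c| (root.clone(), c)));
            *old = new;
        }
        Ok(out)
    }
}
```

A caller would invoke `poll()` on a timer. Since the scan walks the whole tree each time, this only serves as the fallback for filesystems without efficient watching.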

Verdict of Research

Even though one could spend time on making the builtin Git daemon usable so that we can…

  • …start it on demand
  • …find its socket and communicate with it

…it won't actually do anything that our own file system watcher isn't doing already. Further, it won't give any paths that are inside the .git directory.
I'd rather spend the time making our own filesystem monitor better (possibly even by contributing fixes to the upstream project) than dealing with Git's monitoring daemon, which we can neither control nor patch easily.
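
As an illustration of the kind of .git-internal notifications our own watcher can provide, the following sketch classifies a path relative to the .git directory. The `GitDirChange` enum and `classify` function are hypothetical names. The paths it recognizes (`index`, `HEAD`, `packed-refs`, `refs/…`) are the standard on-disk locations in a regular repository, and the sketch assumes refs are stored as files rather than in another ref backend.

```rust
use std::path::{Component, Path};

/// Hypothetical classification of a change inside the `.git` directory.
#[derive(Debug, PartialEq, Eq)]
pub enum GitDirChange {
    Index,
    Head,
    Ref,
    Other,
}

/// Classify `rel`, a path relative to the `.git` directory.
pub fn classify(rel: &Path) -> GitDirChange {
    // Keep only normal, UTF-8 path components.
    let parts: Vec<&str> = rel
        .components()
        .filter_map(|c| match c {
            Component::Normal(s) => s.to_str(),
            _ => None,
        })
        .collect();
    match parts.as_slice() {
        ["index"] => GitDirChange::Index,
        ["HEAD"] => GitDirChange::Head,
        // Loose refs live below `refs/`, packed ones in `packed-refs`.
        ["packed-refs"] => GitDirChange::Ref,
        ["refs", _, ..] => GitDirChange::Ref,
        // Lock files, objects, logs and everything else.
        _ => GitDirChange::Other,
    }
}
```

Which of the `Other` paths are relevant to what the UI shows is left open here.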

Something we can certainly learn from is its event handling, which is well thought out and built for robustness.

Also there is always a variant of this 'subscription' system that needs to work without the monitor for network filesystems, so I don't think using Git fsmonitor saves any time (quite the opposite). Let's trust in Rust.

Tasks

  • evaluate git fsmonitor
  • sketch crate API
  • sketch fallback
  • implementation
  • cli status -w to watch in the CLI
  • add a setting to disable the watcher mechanism entirely
    • This is a setting that might be good to have in Git so it can be layered.

Research

git fsmonitor--daemon

Here are the docs.

  • supports external 'hooks' watcher processes or built-in implementation
  • there are three protocols, two for the hooks (V1 and V2) with marginal differences (timestamp vs opaque token), and a more efficient one (IPC) used for the internal implementation.
  • The Git daemon's memory footprint is minimal (2x 5MB)
  • on WebKit, git status takes 600ms to ~900ms with the Git daemon and 2.8s without it. That's faster, but still far from instant.
  • The hook-based monitor is the Watchman variant, which is very portable but also a memory hog. It's also definitely not installed, and installation isn't always easy as it contains C extensions.
  • The built-in monitor daemon is available since Git 2.36 or so
  • By default, the fsmonitor seems to just speed up the non-untracked-files part of the git status operation, and to speed up untracked files one needs to enable the untracked cache as well, at least for Git to make use of it fully.
  • The daemon can be started on demand, which happens with git status or whatever client we know.
  • the IPC protocol uses packetlines, and the idea is to connect to the named pipe or socket, send a request and receive response packet lines with a flush packet signalling the response end.
  • Shortcomings of the Git fsmonitor daemon
    • no notion of a submodule, so reports these changes as happening in the superrepo. Client has to handle that.
    • doesn't usually work correctly on network mounted filesystems, but can be forced to try anyway with fsmonitor.allowRemote=true
    • the Unix Domain Socket is placed in the .git/ directory where the daemon is available, and that isn't supported on every filesystem the repository could be stored on. If that's the case, the socket is created in $HOME/git-fsmonitor-* or else the location of fsmonitor.socketDir is used. Finding the right socket isn't entirely trivial but there are sources for that.
  • builtin Git fsmonitor communication
    • Code for finding the domain socket path: fsmonitor_ipc__get_path
    • refresh_fsmonitor
    • It's notable that the backend would need a way to store the token to avoid asking for old files each time. Also, we'd either leave it up and running, or try to shut it down ourselves if we know we started it. Determining this might not be trivial.
    • all the code in compat/fsmonitor/
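
For reference, the packet-line framing mentioned above can be sketched as follows. This shows only Git's generic pkt-line format: four hex digits giving the total length including the four-byte header, and `0000` as the flush packet. What the daemon expects as the request payload and what it returns are not covered, and the function names are hypothetical.

```rust
/// The flush packet that ends a response.
pub const FLUSH: &[u8] = b"0000";

/// Largest total pkt-line size (header plus payload) in bytes.
const MAX_PKT_LEN: usize = 65520;

/// Encode `payload` as a single pkt-line, or return `None` if it is too long.
pub fn encode_pkt_line(payload: &[u8]) -> Option<Vec<u8>> {
    let total = payload.len() + 4;
    if total > MAX_PKT_LEN {
        return None;
    }
    let mut out = format!("{:04x}", total).into_bytes();
    out.extend_from_slice(payload);
    Some(out)
}

/// Decode pkt-lines from `input` up to the first flush packet.
/// Returns `None` if the data is malformed or ends before the flush.
pub fn decode_until_flush(mut input: &[u8]) -> Option<Vec<Vec<u8>>> {
    let mut lines = Vec::new();
    loop {
        let header = input.get(..4)?;
        if !header.iter().all(|b| b.is_ascii_hexdigit()) {
            return None;
        }
        let len = usize::from_str_radix(std::str::from_utf8(header).ok()?, 16).ok()?;
        if len == 0 {
            // Flush packet: the response is complete.
            return Some(lines);
        }
        if len < 4 {
            // Lengths 1 to 3 are not data lines; this sketch rejects them.
            return None;
        }
        let line = input.get(4..len)?;
        lines.push(line.to_vec());
        input = &input[len..];
    }
}
```

A client would write one encoded request to the named pipe or socket, then read and decode until the flush packet arrives.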

For later

  • provide patch for a change
  • locking information
@Byron Byron added the bug Something isn't working label Feb 28, 2025
@Byron Byron self-assigned this Feb 28, 2025