Where `git status` Time Actually Goes: a Microbenchmark
We profiled `git status` on a 250k-file monorepo across macOS, Linux, and Windows. Here's where the milliseconds end up, and why fsmonitor wins by a factor of 230 to 330.
A while ago I wanted to know exactly where git status spends its time in a large repo. “Walking the tree” is the cliché answer, but I wanted numbers. Here’s what I found.
Test setup
- Monorepo with 248,000 tracked files, 32,000 directories.
- Three machines: M2 MacBook Pro (macOS 14, APFS), Ryzen 7950X (Ubuntu 24.04, ext4), Surface Studio (Windows 11, NTFS).
- Git 2.46. `feature.manyFiles=true`. No fsmonitor (baseline).
- Each measurement is the median of 10 runs, taken after a `git update-index --refresh` to ensure the index is clean.
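For anyone reproducing the "median of 10 runs" part, here is a minimal sketch of a timing harness. The function name `median_runtime_ms` and its parameters are my own for illustration, not part of Git or gity, and the sketch does not drop the OS page cache between runs, so on its own it measures warm runs only.

```python
import statistics
import subprocess
import time


def median_runtime_ms(cmd, runs=10, cwd=None):
    """Run cmd `runs` times and return the median wall-clock time in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard output so terminal rendering does not pollute the timing.
        subprocess.run(cmd, cwd=cwd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=True)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)
```

A call such as `median_runtime_ms(["git", "status"], cwd=repo)` after a manual `git update-index --refresh` would approximate the procedure described above. The median is used rather than the mean so a single slow outlier does not skew the result.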
Wall-clock baseline (cold cache)
| Operation | macOS | Linux | Windows |
|---|---|---|---|
| `git status` | 7,950 ms | 4,180 ms | 11,200 ms |
| `git status --porcelain` | 7,820 ms | 4,090 ms | 11,050 ms |
Linux ext4 is the fastest absolute baseline; Windows NTFS is the slowest, taking about 2.7× as long as Linux. This roughly tracks `lstat()` throughput on each platform.
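The ratios between platforms, and the per-file throughput they imply, fall straight out of the table. The sketch below only does that arithmetic; the helper names are made up for this post, and the throughput figure is a lower bound because it attributes the entire run to per-file work.

```python
TRACKED_FILES = 248_000
# Cold-cache `git status` medians from the baseline table, in milliseconds.
BASELINE_MS = {"macOS": 7950, "Linux": 4180, "Windows": 11200}


def slowdown_vs(reference, baseline=BASELINE_MS):
    """Ratio of each platform's time to the reference platform's time."""
    return {name: ms / baseline[reference] for name, ms in baseline.items()}


def implied_files_per_second(baseline=BASELINE_MS, files=TRACKED_FILES):
    """Lower bound on per-file throughput: whole run counted as file work."""
    return {name: files / (ms / 1000.0) for name, ms in baseline.items()}


ratios = slowdown_vs("Linux")
rates = implied_files_per_second()
for name in BASELINE_MS:
    print(f"{name}: {ratios[name]:.2f}x Linux, {rates[name]:,.0f} files/s")
```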
Where does the time go?
Using perf on Linux and dtruss on macOS, I broke down the warm baseline (cache populated, fewer disk reads):
| Cost | Linux | macOS |
|---|---|---|
| `lstat()` on tracked files | 78% | 84% |
| `readdir()` for untracked-file scan | 11% | 6% |
| Index parse + comparisons | 6% | 4% |
| Hashing (SHA-1 on modified files) | 3% | 4% |
| Output formatting | <1% | <1% |
| Process startup, dynamic linker | 1% | 2% |
The walk dominates. Everything else is rounding error. This is why fsmonitor wins so completely — it eliminates the dominant cost almost entirely.
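To get a feel for the walk cost on your own machine without a profiler, a rough sketch is to `lstat()` every file under a directory and time it. This is an approximation I am assuming for illustration: it visits every file on disk rather than only the tracked files Git reads from its index, and `lstat_walk` is a hypothetical helper name.

```python
import os
import time


def lstat_walk(root):
    """lstat() every file under root; return (files_seen, elapsed_seconds)."""
    count = 0
    start = time.perf_counter()
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Git's own metadata directory, which is not part of the tree.
        if ".git" in dirnames:
            dirnames.remove(".git")
        for name in filenames:
            os.lstat(os.path.join(dirpath, name))
            count += 1
    return count, time.perf_counter() - start
```

Dividing the file count by the elapsed time gives a stats-per-second figure to set against the `git status` wall-clock time on the same tree.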
With fsmonitor
Same machines, same repo, with gity registered:
| Operation | macOS | Linux | Windows |
|---|---|---|---|
| `git status` (warm) | 26 ms | 18 ms | 34 ms |
| `git status` (after touching 1 file) | 31 ms | 24 ms | 42 ms |
| `git status` (after touching 100 files) | 48 ms | 38 ms | 71 ms |
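Speedup vs the cold-cache baseline: ~300× on macOS, ~230× on Linux, ~330× on Windows. Against a warm baseline the ratio would be smaller, since warm runs already skip most disk reads.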
The fixed cost is ~20ms — that’s process startup, IPC round-trip, and the few necessary lstat()s on actually-changed files. Above that, the cost scales linearly with the number of changed files, not with the size of the repo.
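The linear-scaling claim can be turned into a small model using the 1-file and 100-file rows of the table. This is a back-of-the-envelope sketch: treating the "warm" row as zero changed files is my assumption, two points are not a proper fit, and the function names are hypothetical.

```python
# fsmonitor timings from the table: {platform: {changed_files: ms}}.
# Assumption: the "warm" row corresponds to 0 changed files.
FSMONITOR_MS = {
    "macOS": {0: 26, 1: 31, 100: 48},
    "Linux": {0: 18, 1: 24, 100: 38},
    "Windows": {0: 34, 1: 42, 100: 71},
}


def per_file_cost_ms(timings):
    """Marginal cost per changed file, from the 1-file and 100-file rows."""
    return (timings[100] - timings[1]) / (100 - 1)


def predict_ms(timings, changed_files):
    """Linear extrapolation anchored at the 1-file measurement."""
    return timings[1] + per_file_cost_ms(timings) * (changed_files - 1)


for name, timings in FSMONITOR_MS.items():
    print(f"{name}: {per_file_cost_ms(timings):.2f} ms per changed file")
```

On these numbers the marginal cost is a fraction of a millisecond per changed file on every platform. The step from zero to one changed file is larger than that slope, so the first change carries some extra overhead that a pure line does not capture.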
What this means in practice
For developers:
- The size of your repo barely matters once fsmonitor is on. A 1-million-file repo with 3 changed files runs as fast as a 50k-file repo with 3 changed files.
- The platform you're on barely matters either. Windows no longer trails Linux by seconds; the warm gap is about 16 ms (34 ms vs 18 ms).
- IDE polling becomes essentially free. Each one-second poll costs ~30 ms of work rather than the ~8 seconds of an unmonitored cold run on macOS.
For CI:
- A 250k-file monorepo with three `git status` calls per job (common in incremental-build pipelines) saves about 15–25 seconds per run. Over thousands of runs per day, this is real money.
- Cold-start latency (the first call after a reboot or fresh container start) is still ~50ms because the daemon's cache has to prime. `gity daemon oneshot` includes a quick prime step.
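The CI estimate is simple arithmetic on the two tables. The sketch below plugs in the cold-cache baseline and the warm fsmonitor numbers; the helper names and the 2,000 runs per day in the usage note are illustrative assumptions, not measurements.

```python
# Medians from the two tables above, in milliseconds.
BASELINE_MS = {"macOS": 7950, "Linux": 4180, "Windows": 11200}
FSMONITOR_WARM_MS = {"macOS": 26, "Linux": 18, "Windows": 34}


def saved_seconds_per_job(platform, status_calls=3):
    """Seconds saved per CI job if every status call hits a warm daemon."""
    per_call_ms = BASELINE_MS[platform] - FSMONITOR_WARM_MS[platform]
    return status_calls * per_call_ms / 1000.0


def saved_hours_per_day(platform, runs_per_day, status_calls=3):
    """Aggregate daily saving across all jobs, in hours."""
    return saved_seconds_per_job(platform, status_calls) * runs_per_day / 3600.0


for name in BASELINE_MS:
    print(f"{name}: {saved_seconds_per_job(name):.1f} s saved per job")
```

With three calls per job this comes to roughly 12 seconds on Linux and 33 on Windows, which brackets the 15–25 second figure. At a hypothetical 2,000 Linux runs per day, `saved_hours_per_day("Linux", 2000)` is just under 7 hours of runner time.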
The bottleneck that remains
Once git status is fast, the next bottleneck depends on your workflow:
- `git fetch`: dominated by network and object-walk overhead on the remote. Mitigation: partial clone (`--filter=blob:none`), background prefetch.
- `git diff` with hashing: SHA-1 (or SHA-256 if you're on a modern repo) on each changed file. Negligible for small changes; visible when you compare a megabyte-scale binary file.
- `git log` over deep history: without a commit-graph, every commit visited is parsed from the object store; with one, commit metadata and generation numbers come from a precomputed file, which makes history walks much cheaper. The commit-graph file is the single best optimization for log-heavy workflows.
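For reference, the mitigations named in the list map onto a handful of stock Git commands. The sketch below only builds the argument lists so they can be passed to `subprocess.run`; the function names are hypothetical, and the URL and destination are whatever your setup uses.

```python
def partial_clone_cmd(url, dest):
    """Blobless partial clone: commits and trees now, file contents on demand."""
    return ["git", "clone", "--filter=blob:none", url, dest]


def commit_graph_cmd():
    """Write a commit-graph covering all commits reachable from refs."""
    return ["git", "commit-graph", "write", "--reachable"]


def background_prefetch_cmd():
    """Register the repo for scheduled maintenance, which includes prefetch."""
    return ["git", "maintenance", "start"]
```

Something like `subprocess.run(commit_graph_cmd(), cwd=repo, check=True)` would run the commit-graph step inside an existing clone.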
git status, post-fsmonitor, is no longer in the top-five list of bottlenecks. That’s the goal.
Methodology notes
A few caveats so you can reproduce this:
- I excluded `git status` runs immediately after `git checkout` from the warm-baseline median. Checkout changes a large fraction of mtimes, which gives fsmonitor work and inflates apparent latency.
- I used `gity` v0.1.2 for the fsmonitor numbers. Watchman is within 10% (slightly slower due to Perl helper overhead). Git's built-in daemon is within 5% (slightly slower than gity at high call frequency due to per-call allocation).
- The repo I tested on is a real monorepo (anonymized for this post) at one of my client engagements. Numbers may differ on your repo, especially if your tree shape is unusually deep or wide. The `gity demo` command will give you your-machine, your-repo numbers in a minute.
Try it yourself:
cargo install gity
cd ~/work/your-largest-repo
gity demo
The included demo races vanilla Git against gity in a TUI and prints both wall-clock and speedup numbers when it finishes.
Frequently asked questions
What's the largest cost inside `git status`?
The working-tree walk. On the 250k-file repo, `lstat()` calls on every tracked file account for 78% (Linux) to 84% (macOS) of a warm baseline `git status`, and the untracked-file `readdir()` scan adds another 6–11%. Index parsing, hashing, and output formatting together come to about 10%. fsmonitor cuts the walk to a few files, eliminating the dominant cost.
Does platform matter?
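Yes, but less than you'd think. macOS APFS, Linux ext4, and Windows NTFS all run at roughly 30,000–80,000 `lstat()` calls per second. Without fsmonitor, wall-clock differences track that stat throughput; with fsmonitor on, they come more from the efficiency of the change-notification layer (inotify, FSEvents, ReadDirectoryChangesW) than from raw stat throughput.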
Why doesn't `git status` use multi-threading?
It does, in part — the working-tree walk is multi-threaded with `core.preloadIndex`. But you still pay the syscall cost per file, and threading mostly helps you saturate the kernel rather than reduce total work.
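The point about threading can be seen in a toy version: splitting the same list of paths across threads changes how the `lstat()` calls are scheduled, not how many are made. This is a sketch, not Git's implementation; `lstat_all` is a hypothetical helper, and the contiguous-slice split is an assumption chosen for illustration.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def lstat_all(paths, workers=1):
    """lstat() every path using `workers` threads; return the call count."""
    def stat_chunk(chunk):
        for path in chunk:
            os.lstat(path)
        return len(chunk)

    # One contiguous slice per worker (ceiling division for the slice size).
    size = max(1, -(-len(paths) // workers))
    chunks = [paths[i:i + size] for i in range(0, len(paths), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(stat_chunk, chunks))
```

Whatever `workers` is set to, the return value equals `len(paths)`: the total syscall count is fixed, and extra threads only help to the extent the kernel and filesystem can service stats concurrently.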