Background `git maintenance`, Explained

Git ships a scheduler for prefetch, commit-graph, and incremental repack. Here's what each task does, when it pays off, and how gity makes it CPU- and battery-aware.

  • #git maintenance
  • #prefetch
  • #commit-graph
  • #performance

git maintenance is one of the most quietly useful features Git has shipped in the past five years. It does several things at once, all aimed at keeping a repo fast as it grows. This article walks through each task, what problem it solves, and the tradeoffs.

The maintenance tasks

Run git maintenance run (or just let it run on its scheduled cadence) and Git executes some subset of:

prefetch

Downloads new objects from configured remotes into a hidden ref namespace (refs/prefetch/...). When you later run git fetch or git pull, the objects are already local — only the refs change.

Win: foreground fetches drop from “download 50MB of pack” to “rotate a few refs,” which is instant.

Default schedule: hourly.

When it pays off: any repo where you git fetch regularly, especially monorepos with active CI pushing commits constantly.
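To see the hidden namespace in action without touching a real remote, here is a minimal sketch that uses a second local repository as the "remote". The scratch paths, the demo_commit helper, and the placeholder identity are assumptions made for illustration; the exact layout under refs/prefetch/ can differ between Git versions.

```shell
#!/bin/sh
set -eu

# Scratch area: a local repository stands in for the remote, so no network is needed.
work=$(mktemp -d)
demo_commit() {
    # $1 = repository path, $2 = commit message (the identity is a placeholder)
    git -C "$1" -c user.name=demo -c user.email=demo@example.invalid \
        commit --quiet --allow-empty -m "$2"
}

git init --quiet -b main "$work/upstream"
demo_commit "$work/upstream" "first"
git clone --quiet "$work/upstream" "$work/clone"

# New work lands upstream after the clone was made.
demo_commit "$work/upstream" "second"
upstream_tip=$(git -C "$work/upstream" rev-parse main)

tracking_before=$(git -C "$work/clone" rev-parse refs/remotes/origin/main)
git -C "$work/clone" maintenance run --task=prefetch
tracking_after=$(git -C "$work/clone" rev-parse refs/remotes/origin/main)

# The prefetched tip is parked under refs/prefetch/; origin/main has not moved.
prefetched=$(git -C "$work/clone" for-each-ref --format='%(refname)' refs/prefetch/)
printf 'prefetch refs:\n%s\n' "$prefetched"
```

The remote-tracking ref stays where it was, yet the new commit's objects are already in the clone, so the next foreground git fetch only has to update refs.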

commit-graph

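Maintains binary commit-graph data under .git/objects/info/ (a single commit-graph file, or a chain of incremental files in commit-graphs/) that accelerates commit walks and reachability queries. Git uses it for git log, git fetch, git push, and many other operations.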

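Win: history walks stop decompressing and parsing every commit object. Parent and date lookups become cheap reads from the graph file, and stored generation numbers let Git cut reachability walks short instead of visiting tens of thousands of commits.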

Default schedule: hourly.

When it pays off: every repo over ~10,000 commits. Below that, the savings are negligible; above that, they compound.

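The --changed-paths variant (written with git commit-graph write --reachable --changed-paths; once the filters exist, later commit-graph writes, including the maintenance task, keep updating them) additionally stores Bloom filters of the paths each commit changed. That accelerates git log -- path/, which is often the bottleneck for incremental-build tools.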
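As a rough illustration, the following sketch builds a tiny scratch repository, writes a commit-graph with changed-path filters, and checks it. The repository, file name, and placeholder identity are assumptions; on a repo this small there is nothing to measure, so treat it as a demonstration of the commands only.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
git init --quiet "$repo"

# Three commits that all touch the same file.
for n in 1 2 3; do
    echo "$n" > "$repo/file.txt"
    git -C "$repo" add file.txt
    git -C "$repo" -c user.name=demo -c user.email=demo@example.invalid \
        commit --quiet -m "commit $n"
done

# Write a commit-graph covering every reachable commit, with changed-path Bloom filters.
git -C "$repo" commit-graph write --reachable --changed-paths
# Check the written file for corruption.
git -C "$repo" commit-graph verify

graph_file="$repo/.git/objects/info/commit-graph"
# Path-limited history is the query the filters are meant to speed up.
touching=$(git -C "$repo" rev-list --count HEAD -- file.txt)
echo "commits touching file.txt: $touching"
```

A one-off write like this is usually enough to opt in; after that, the scheduled commit-graph task carries the filters forward.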

incremental-repack

Periodically merges small pack files into larger ones, without touching the existing large packs. The result: fewer pack files to consult on every object lookup.

Win: object lookups stay fast even after months of activity. Without this, a long-lived repo grows a “tail” of small packs that slows everything down.

Default schedule: daily.

When it pays off: any repo that lives more than a few weeks.
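One way to watch the "tail" of small packs is git count-objects -v. The sketch below manufactures two small packs in a scratch repository, then runs the task once. The helper names and placeholder identity are assumptions, and on a repo this small the task may decide there is nothing worth merging, so the pack count is printed rather than promised.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
git init --quiet "$repo"

demo_commit() {
    git -C "$repo" -c user.name=demo -c user.email=demo@example.invalid \
        commit --quiet --allow-empty -m "$1"
}
count_packs() {
    # count-objects -v prints a "packs: N" line; keep only N.
    git -C "$repo" count-objects -v | sed -n 's/^packs: //p'
}

# Each repack -d (without -a) puts the current loose objects into a new small pack.
demo_commit "first";  git -C "$repo" repack -d -q
demo_commit "second"; git -C "$repo" repack -d -q
packs_before=$(count_packs)

git -C "$repo" maintenance run --task=incremental-repack
packs_after=$(count_packs)

echo "packs before: $packs_before, after: $packs_after"
```

On a long-lived repository, running the same count before and after a few days of scheduled maintenance gives a quick sense of whether the task is keeping up.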

gc (auto)

The classic Git garbage collector. Removes unreachable objects (e.g., commits from deleted branches) once they’ve passed the gc.pruneExpire window.

Win: reclaims disk; keeps total object count bounded.

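Default schedule: none. The incremental strategy that git maintenance start sets up leaves gc off the timer, since it is the one task here that prunes data. Set maintenance.gc.schedule (for example, to weekly) if you want it to run in the background.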

When it pays off: long-running repos that accumulate unreachable objects from automated workflows (force-pushes, rebases, deleted branches).
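To make the pruning window concrete, here is a small sketch that writes an object nothing refers to and then collects it. Overriding gc.pruneExpire to now is an assumption made purely so the effect is visible immediately; leave the default grace period in place on a real repository.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
git init --quiet "$repo"
git -C "$repo" -c user.name=demo -c user.email=demo@example.invalid \
    commit --quiet --allow-empty -m "keep me"

# Write a blob that no commit, tree, or ref points to.
orphan=$(echo "never referenced" | git -C "$repo" hash-object -w --stdin)
git -C "$repo" cat-file -e "$orphan"   # present before gc

# Collapse the grace period for this one run only.
git -C "$repo" -c gc.pruneExpire=now gc --quiet

if git -C "$repo" cat-file -e "$orphan" 2>/dev/null; then
    orphan_state=present
else
    orphan_state=pruned
fi
echo "unreachable blob is $orphan_state"
```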

pack-refs

Combines individual loose ref files into a packed-refs file. Helps repos with many branches.

Default schedule: weekly.

loose-objects

Packs loose objects into a pack file when there are too many.

Default schedule: daily.

Turning it on

The simplest path:

git maintenance start

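This registers the repo and sets up a systemd timer (Linux, with cron as the fallback), launchd job (macOS), or Task Scheduler entry (Windows). The default cadence is hourly for prefetch and commit-graph, daily for loose-objects and incremental-repack, and weekly for pack-refs.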

To customize what runs:

git config maintenance.prefetch.schedule hourly
git config maintenance.commit-graph.schedule hourly
git config maintenance.incremental-repack.schedule daily

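To see what’s scheduled (per-repo settings, then the list of registered repos):

git config --get-regexp '^maintenance\.'
git config --global --get-all maintenance.repo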

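To stop the background schedule:

git maintenance stop

This leaves the repo on the registered list. To drop it from the list as well:

git maintenance unregister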
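If you would rather look before committing to a system-wide timer, the sketch below exercises the config side only. It uses git maintenance register, which records the repo without touching the system scheduler, inside a throwaway HOME so your real ~/.gitconfig is untouched. The sandbox paths and the weekly override are assumptions chosen for the demo.

```shell
#!/bin/sh
set -eu

# Sandbox: a throwaway HOME keeps the real global config out of the picture.
work=$(mktemp -d)
mkdir "$work/home"
export HOME="$work/home"
unset XDG_CONFIG_HOME

git init --quiet "$work/repo"

# register adds the repo to the global maintenance.repo list.
git -C "$work/repo" maintenance register

# Per-task overrides are ordinary config keys.
git -C "$work/repo" config maintenance.incremental-repack.schedule weekly

registered=$(git config --global --get-all maintenance.repo)
override=$(git -C "$work/repo" config --get maintenance.incremental-repack.schedule)
printf 'registered: %s\noverride:   %s\n' "$registered" "$override"
```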

Where Git’s scheduler falls short

For a server or a build farm, cron-style scheduling is fine. For a developer laptop, it has real downsides:

  1. No load awareness. Maintenance runs at the scheduled hour regardless of whether you’re in the middle of a build, a CI compile, or a video call.
  2. No battery awareness. It runs the same on battery as plugged in — meaning maintenance can quietly drain your battery while you’re at a café.
  3. No multi-repo coordination. If you’ve enabled maintenance on five repos, they all come due in the same scheduled slot and Git works through them back to back. The system load stays elevated for a few minutes; on slower machines, you feel it.
  4. No backoff. If a task fails (e.g., network unreachable during prefetch), the scheduler doesn’t back off — the same task fails again at the next slot.

These are minor inconveniences for a server. For a developer machine, they add up.
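With stock Git you can approximate the first two points yourself by wrapping git maintenance run in a guard and pointing your own timer at the wrapper. The sketch below is one hypothetical way to do that on Linux; the function names, the 1.5 load factor, and the BAT0 sysfs path are all assumptions, and other platforms need different probes.

```shell
#!/bin/sh

# Succeeds when load <= cpus * factor. The 1.5 default factor is an arbitrary assumption.
load_ok() {
    awk -v load="$1" -v cpus="$2" -v factor="${3:-1.5}" \
        'BEGIN { exit !(load <= cpus * factor) }'
}

# Succeeds when the battery reports that it is discharging.
# The default path is an assumption; it fails (not on battery) when the file is missing.
on_battery() {
    status_file=${1:-/sys/class/power_supply/BAT0/status}
    [ -r "$status_file" ] && [ "$(cat "$status_file")" = "Discharging" ]
}

# Hypothetical wrapper: call this from your own cron job or timer.
guarded_maintenance() {
    load=$(cut -d' ' -f1 /proc/loadavg)   # 1-minute load average (Linux only)
    cpus=$(getconf _NPROCESSORS_ONLN)
    if load_ok "$load" "$cpus" && ! on_battery; then
        git maintenance run --schedule=hourly
    else
        echo "skipping maintenance: system busy or on battery" >&2
    fi
}
```

Calling guarded_maintenance from inside a repository either runs the hourly tasks or skips the slot; it does not defer the work, so a skipped slot simply waits for the next one.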

How gity does it differently

When you gity register a repo, gity adds maintenance to its own scheduler. The key differences:

  • CPU-aware. Maintenance is paused if the system’s 1-minute load average is above a configurable threshold (default: 1.5× CPU count). Resumes when the system is idle.
  • Battery-aware. On laptops, maintenance is paused on battery and resumes when plugged in. Configurable.
  • Coordinated across repos. If twenty repos are registered, gity runs their maintenance one at a time — no thundering herd.
  • Backoff and retry. Failed tasks (e.g., transient network error during prefetch) back off exponentially and are retried; persistent failures are surfaced via gity health.
  • Inspectable. gity health <path> shows what’s scheduled, when it last ran, and how it went.

For most developers, these distinctions are invisible — maintenance just works. For laptop users and people with many registered repos, the load-awareness and battery-awareness make the difference between “maintenance helps” and “maintenance is annoying.”
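The backoff and one-at-a-time behavior described above can be sketched in a few lines of shell. This is an illustration of the general technique, not gity's implementation: the function names, the 60-second base delay, and the one-hour cap are assumptions.

```shell
#!/bin/sh

# Delay before the next try: base * 2^(n-1), capped. Defaults (60s, 3600s) are assumptions.
backoff_delay() {
    n=$1
    delay=${2:-60}
    cap=${3:-3600}
    i=1
    while [ "$i" -lt "$n" ] && [ "$delay" -lt "$cap" ]; do
        delay=$((delay * 2))
        i=$((i + 1))
    done
    if [ "$delay" -gt "$cap" ]; then
        delay=$cap
    fi
    echo "$delay"
}

# Run a command up to max_attempts times, waiting longer after each failure.
# SLEEP can be overridden (e.g. SLEEP=:) to exercise the logic without real waiting.
retry_with_backoff() {
    max_attempts=$1
    shift
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            return 1
        fi
        ${SLEEP:-sleep} "$(backoff_delay "$attempt")"
        attempt=$((attempt + 1))
    done
}

# Maintain the given repos strictly one after another.
run_serially() {
    for repo in "$@"; do
        retry_with_backoff 3 git -C "$repo" maintenance run --schedule=hourly ||
            echo "giving up on $repo for this slot" >&2
    done
}
```

With the defaults, the waits after successive failures are 60, 120, 240 seconds and so on, up to the cap. Running repos in sequence trades total wall-clock time for a flatter load profile, which is usually the right trade on a laptop.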

What maintenance won’t do for you

A few things it can’t fix:

  • A 500 MB binary committed two years ago still gets cloned on every fresh checkout. git filter-repo is the tool for that.
  • Pathologically deep history (millions of commits along a single line). Commit-graph helps but doesn’t make the underlying repo small.
  • A bloated .git/lfs/ cache. LFS has its own GC (git lfs prune).

For everything else, the answer is: yes, just run maintenance.

Try it

# Native Git approach:
git maintenance start

# Or, let gity handle it:
cargo install gity
gity register .

Either way, six months from now your repo will still be fast. Without maintenance, it won’t.

Frequently asked questions

What does `git maintenance` actually do?

It runs background tasks that keep your repo fast: prefetch (download new objects so foreground fetches are smaller), commit-graph (accelerate commit walks and reachability queries), incremental-repack (consolidate small pack files), loose-objects (pack loose objects), pack-refs (consolidate loose refs), and, if you enable it, gc (remove unreachable objects).

Is `git maintenance` safe to run on production repos?

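Yes. Maintenance tasks are designed to run alongside normal Git operations. Each `git maintenance run` takes a lock on the repository's object database, so two maintenance processes never overlap; if one is already running, the next skips its turn. The tasks in the default schedule avoid pruning objects. `gc` is the heavier task, and the only one that removes unreachable objects, and then only after the `gc.pruneExpire` window.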

Why use gity for maintenance instead of Git's built-in scheduler?

Git's scheduler is timer-based: it hands fixed hourly, daily, and weekly slots to the platform scheduler (systemd, cron, launchd, or Task Scheduler) and runs regardless of CPU load or battery state. gity's scheduler watches load and battery, throttles when the system is busy, and pauses entirely on battery for laptop users. For developer machines, this matters more than people expect.