Anatomy of the gity Daemon

A walk through the gity codebase — the eight Rust crates, the IPC protocol, the storage layer, and the design choices that make a 12 MB binary serve a 250k-file monorepo.

  • #architecture
  • #rust
  • #internals
  • #gity

This article is for the engineering-curious. If you ever want to fork gity or contribute, here’s the lay of the land. The code is on GitHub; this is the map.

The crates

gity is a Cargo workspace with eight crates:

| Crate | Role | LOC |
| --- | --- | --- |
| gity | Binary. CLI parsing, command dispatch, daemon launcher. | ~1,800 |
| gity-cli | CLI argument parsing, formatting helpers. | ~1,200 |
| gity-daemon | Core daemon engine: scheduler, supervisor, lifecycle. | ~4,200 |
| gity-watch | File watcher implementation atop the notify crate. | ~2,400 |
| gity-git | Git operations: fsmonitor protocol, prefetch, maintenance. | ~3,100 |
| gity-ipc | Async-nng wire protocol definitions. | ~1,500 |
| gity-storage | Sled-backed persistent state + replication. | ~2,300 |
| gity-tray | Optional system tray UI (GTK on Linux, native elsewhere). | ~1,600 |

Each crate has a single responsibility. Inter-crate boundaries are typed (no serde_json::Value-style stringly-typed APIs anywhere). This is deliberate — small focused crates compile fast and refactor easily.

The IPC layer

gity-ipc defines the wire protocol between the CLI and the daemon. Three transports:

  • Linux/macOS: Unix domain socket at ~/.local/share/gity/socket.
  • Windows: named pipe at \\.\pipe\gity.
  • fsmonitor v2 helper: a separate listener on a path Git can discover via .git/config.

The CLI↔daemon channel uses async-nng (Rust bindings for nng, the successor to nanomsg). Each request is a Serde-serialized struct framed by nng. Round-trip latency in our measurements is ~120 microseconds on Linux.

The fsmonitor v2 channel is different: it speaks Git’s wire protocol directly (binary framing, null-terminated paths). This bypasses Serde entirely; the bytes that come off the socket are the bytes that go to Git. Average latency in our measurements: ~85 microseconds.
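To make the "null-terminated paths" framing concrete, here is a minimal sketch of encoding and decoding such a payload. It assumes the shape described in Git's fsmonitor hook documentation (a token, a NUL byte, then NUL-delimited paths). The function names are hypothetical and not taken from gity-git, and real paths are raw bytes rather than guaranteed UTF-8.

```rust
/// Encode a response as: token, NUL, then each path followed by NUL.
/// (Illustrative only; names and exact layout are assumptions.)
fn encode_fsmonitor_response(token: &str, paths: &[&str]) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(token.as_bytes());
    out.push(0);
    for path in paths {
        out.extend_from_slice(path.as_bytes());
        out.push(0);
    }
    out
}

/// Decode the payload back into (token, paths).
/// Returns None if the buffer lacks the trailing NUL or is not UTF-8.
fn decode_fsmonitor_response(buf: &[u8]) -> Option<(String, Vec<String>)> {
    // A well-formed payload always ends with a NUL terminator.
    let body = buf.strip_suffix(&[0u8])?;
    let mut fields = body.split(|b| *b == 0);
    let token = String::from_utf8(fields.next()?.to_vec()).ok()?;
    let paths = fields
        .map(|f| String::from_utf8(f.to_vec()))
        .collect::<Result<Vec<_>, _>>()
        .ok()?;
    Some((token, paths))
}
```

Because the payload is just bytes and delimiters, nothing needs to be deserialized into intermediate structs on the hot path, which is consistent with the lower latency quoted above.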

The IDE subscribe channel is a third path — also nng, but pub/sub instead of request/response. Subscribers get a stream of structured events (file-changed, prefetch-complete, etc.) without polling.

The watcher

gity-watch wraps the notify crate and adds:

  • Snapshot management: each registered repo gets a baseline snapshot at registration time. The watcher’s job is to maintain the delta against this snapshot.
  • Overflow detection: when the OS reports dropped events or a queue overflow (IN_Q_OVERFLOW on Linux, kFSEventStreamEventFlagMustScanSubDirs on macOS, ERROR_NOTIFY_ENUM_DIR on Windows), the watcher invalidates the snapshot and forces a re-prime.
  • Backpressure: per-repo event queues are bounded. When a registered repo produces 50,000 events in a second (e.g., a checkout), the watcher coalesces them rather than queuing.
  • Per-repo resource accounting: file descriptors used, watcher tokens issued, queue depth.
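The backpressure and overflow bullets can be sketched together as a bounded, coalescing buffer. This is an illustration under assumptions, not gity-watch's actual code: the type names are hypothetical, and falling back to a rescan when the bound is exceeded is one plausible policy that the article does not spell out.

```rust
use std::collections::HashSet;
use std::path::PathBuf;

/// What a consumer receives when it drains the buffer.
#[derive(Debug, PartialEq)]
enum Drained {
    /// Distinct changed paths since the last drain, sorted.
    Paths(Vec<PathBuf>),
    /// Too many distinct changes (or an OS overflow): rescan the repo.
    Rescan,
}

/// Hypothetical per-repo buffer that coalesces repeated events.
struct CoalescingQueue {
    capacity: usize,
    pending: HashSet<PathBuf>,
    overflowed: bool,
}

impl CoalescingQueue {
    fn new(capacity: usize) -> Self {
        Self { capacity, pending: HashSet::new(), overflowed: false }
    }

    /// Record one change event; repeats for the same path collapse.
    fn push(&mut self, path: PathBuf) {
        if self.overflowed {
            return; // already committed to a rescan, nothing to track
        }
        if self.pending.len() >= self.capacity && !self.pending.contains(&path) {
            self.mark_overflow();
            return;
        }
        self.pending.insert(path);
    }

    /// Called when the bound is hit or the OS reports dropped events.
    fn mark_overflow(&mut self) {
        self.overflowed = true;
        self.pending.clear();
    }

    /// Hand the accumulated state to the consumer and reset.
    fn drain(&mut self) -> Drained {
        if std::mem::take(&mut self.overflowed) {
            return Drained::Rescan;
        }
        let mut paths: Vec<PathBuf> = self.pending.drain().collect();
        paths.sort();
        Drained::Paths(paths)
    }
}
```

With this shape, memory use is bounded by `capacity` regardless of how many raw events a checkout produces, and the consumer only ever sees either a deduplicated path set or a single rescan signal.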

The notify crate handles the per-platform API differences (FSEvents / inotify / ReadDirectoryChangesW). We chose it over rolling our own because it has years of real-world use and bug fixes behind it on every platform we target.

The storage layer

gity-storage is sled-backed with three logical key spaces:

  • Registrations: repo_id -> RegistrationState (when registered, watcher token, scheduler settings).
  • Cache: (repo_id, slot_id) -> snapshot serialized bytes. The fsmonitor cache lives here.
  • Metrics: rolling counters of cache hits, misses, CPU usage, RSS, file descriptors per repo.
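As a sketch of how a composite (repo_id, slot_id) key might be laid out in an ordered key-value store, the example below uses big-endian encoding so that byte order matches numeric order. The key widths and function names are assumptions for illustration, and a BTreeMap stands in for the sled tree.

```rust
use std::collections::BTreeMap;

/// Hypothetical key layout: 8-byte big-endian repo id followed by a
/// 4-byte big-endian slot id. Big-endian keeps keys sorted numerically.
fn cache_key(repo_id: u64, slot_id: u32) -> [u8; 12] {
    let mut key = [0u8; 12];
    key[..8].copy_from_slice(&repo_id.to_be_bytes());
    key[8..].copy_from_slice(&slot_id.to_be_bytes());
    key
}

/// List the slot ids stored for one repo, in ascending order.
/// The BTreeMap is a stand-in for an ordered on-disk store.
fn slots_for_repo(store: &BTreeMap<[u8; 12], Vec<u8>>, repo_id: u64) -> Vec<u32> {
    let lo = cache_key(repo_id, 0);
    let hi = cache_key(repo_id, u32::MAX);
    store
        .range(lo..=hi)
        .map(|(k, _)| u32::from_be_bytes([k[8], k[9], k[10], k[11]]))
        .collect()
}
```

The point of the layout is that all slots for a repo are contiguous in key order, so reading or invalidating one repo's cache is a single range scan rather than a full-table walk.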

Sled was chosen over RocksDB, SQLite, and LMDB for three reasons:

  1. Pure Rust: no FFI, no separate C++ build dependency.
  2. Transactional: writes survive crashes; the cache stays consistent.
  3. Fast for our access pattern: tiny values, point queries dominate.

For multi-worktree cache sharing, gity uses the rykv crate to replicate hot keys across the storage layer for related repos (sharing an object store). This is the mechanism behind cross-worktree cache warming.

The scheduler

The scheduler in gity-daemon is one of the more interesting parts. It runs prefetch and maintenance tasks across all registered repos, with three constraints:

  1. Never run two prefetches simultaneously (the network bottleneck).
  2. Never run maintenance when system load is over a threshold.
  3. Never run anything while on battery (configurable; default on for laptops).

Implementation is a priority queue with backoff. Tasks are kept ordered by next-due-time and pulled when:

  • The 1-minute load average is below the threshold.
  • A previous task of the same kind has finished.
  • The system is on AC power (or battery-mode is disabled).

Failed tasks (e.g., transient network error during prefetch) get exponential backoff with jitter. After three failures in a row, the task surfaces via gity health as “needs attention.”
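The queue-plus-gating logic described above can be sketched as follows. This is a simplified illustration, not the gity-daemon implementation: the threshold, the backoff constants, and all names are assumptions, jitter is passed in by the caller to keep the example deterministic, and a blocked head-of-queue task simply holds everything behind it.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Kind { Prefetch, Maintenance }

/// Snapshot of the conditions checked before a task may start.
struct Conditions {
    load_avg_1m: f64,
    on_ac_power: bool,
    prefetch_running: bool,
    maintenance_running: bool,
}

const LOAD_THRESHOLD: f64 = 2.0;   // assumed value
const BASE_BACKOFF_SECS: u64 = 30; // assumed value
const MAX_BACKOFF_SECS: u64 = 3600; // assumed value

/// Exponential backoff, capped, plus up to 50% extra from `jitter` in [0, 1).
fn backoff_secs(failures: u32, jitter: f64) -> u64 {
    let exp = BASE_BACKOFF_SECS.saturating_mul(1u64 << failures.min(16));
    let capped = exp.min(MAX_BACKOFF_SECS);
    capped + (capped as f64 * jitter * 0.5) as u64
}

/// Three consecutive failures flag the task for the health report.
fn needs_attention(consecutive_failures: u32) -> bool {
    consecutive_failures >= 3
}

struct Scheduler {
    // Min-heap ordered by (due time, kind, repo id).
    queue: BinaryHeap<Reverse<(u64, Kind, u64)>>,
}

impl Scheduler {
    fn new() -> Self {
        Self { queue: BinaryHeap::new() }
    }

    fn schedule(&mut self, due: u64, kind: Kind, repo: u64) {
        self.queue.push(Reverse((due, kind, repo)));
    }

    /// Pop the earliest task only if it is due and every gate is open.
    fn next_runnable(&mut self, now: u64, c: &Conditions) -> Option<(Kind, u64)> {
        let Reverse((due, kind, repo)) = *self.queue.peek()?;
        let same_kind_busy = match kind {
            Kind::Prefetch => c.prefetch_running,
            Kind::Maintenance => c.maintenance_running,
        };
        if due > now || !c.on_ac_power || same_kind_busy || c.load_avg_1m >= LOAD_THRESHOLD {
            return None;
        }
        self.queue.pop();
        Some((kind, repo))
    }
}
```

On failure, a caller would reschedule the task at `now + backoff_secs(failures, jitter)` and consult `needs_attention` to decide whether to surface it. Passing timestamps in explicitly, rather than reading the clock inside the scheduler, also makes suspend/resume and clock-skew cases easier to test.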

This is roughly 600 lines of Rust. The complexity is in correctness — handling clock skew, system suspend/resume, and the various failure modes of git fetch — not in algorithms.

The lifecycle

A gity invocation does one of two things:

  • CLI mode: parses arguments, opens an IPC connection to the daemon (or starts the daemon if not running), sends a request, prints the response, exits.
  • Daemon mode: starts the supervisor, opens the IPC sockets, restores state from sled, attaches watchers to every registered repo, starts the scheduler, blocks on a graceful-shutdown channel.

Each daemon subsystem runs as its own task, spawned with tokio::task::spawn and watched by the supervisor. A crash in one task doesn’t take down the others; the supervisor restarts crashed tasks with exponential backoff and surfaces persistent crashes in gity health.
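The restart-with-backoff pattern can be sketched with the standard library alone. Note the substitution: std threads stand in for tokio tasks here so the example has no dependencies. In tokio, the analogous signal is a JoinHandle resolving to an error whose is_panic() is true. The names and the give-up policy are assumptions, not gity's actual supervisor.

```rust
use std::thread;
use std::time::Duration;

/// Result of supervising one task.
#[derive(Debug, PartialEq)]
enum Outcome {
    /// The task returned normally on this start number.
    Finished { starts: u32 },
    /// The task panicked on every start; report it instead of retrying.
    GaveUp { starts: u32 },
}

/// Run the task produced by `make_task` on a fresh thread, restarting it
/// after a panic with a doubling delay, up to `max_starts` starts.
fn supervise<F, T>(make_task: F, max_starts: u32, base_delay: Duration) -> Outcome
where
    F: Fn() -> T,
    T: FnOnce() + Send + 'static,
{
    let mut delay = base_delay;
    for starts in 1..=max_starts {
        // A panic inside the thread surfaces here as Err from join(),
        // so the supervisor itself keeps running.
        if thread::spawn(make_task()).join().is_ok() {
            return Outcome::Finished { starts };
        }
        if starts < max_starts {
            thread::sleep(delay);
            delay = delay.saturating_mul(2);
        }
    }
    Outcome::GaveUp { starts: max_starts }
}
```

The factory closure matters: a crashed task cannot be resumed, so the supervisor needs a way to build a fresh instance for each restart. The `GaveUp` outcome is the hook where a persistent crash would be recorded for a health report.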

What we don’t ship

A few deliberate non-features in the codebase:

  • No HTTP API. Everything is local async-nng. We don’t want to be a network service.
  • No plugin system. Plugins are a maintenance liability; users who need extensions can fork.
  • No telemetry. No phone-home, no opt-in usage counters. State stays local.
  • No config files. Configuration is exposed via the CLI; defaults are right.

These are the constraints from the founding doc, enforced in code.

How to contribute

The codebase is intentionally small and approachable. If you want to start contributing:

  1. Clone https://github.com/neul-labs/gity.
  2. cargo build --release — builds the workspace in a few minutes.
  3. Run the test suite: cargo test --workspace.
  4. Pick a “good first issue” labeled task on GitHub.

Areas where contributions are especially welcome: Windows-specific watcher edge cases (ReadDirectoryChangesW buffer tuning), the optional tray UI, additional IPC client libraries (Python, JavaScript), and benchmarks on a wider variety of monorepos.

If you have a deeper change in mind — new task type for the scheduler, new IDE subscribe channel format — open an issue first so we can sketch the design together. Most invasive changes are easier to merge if the design conversation happens before the code.

Frequently asked questions

How big is gity, source-wise?

About 18,000 lines of Rust across eight crates as of v0.1.2. Significantly less than Watchman or Microsoft's Scalar. Most of the size comes from cross-platform watcher code and the fsmonitor v2 wire protocol; the actual scheduling and caching logic is small.

What database does gity use?

sled, an embedded transactional key-value store written in Rust. Keys are repo IDs + cache slot identifiers; values are serialized snapshot states. The database lives at `~/.local/share/gity/state.db` (or the OS-equivalent).

How does the IPC layer work?

async-nng (the Rust bindings for nanomsg-next-generation). Lightweight binary protocol over Unix domain sockets on Linux/macOS and named pipes on Windows. Used for both CLI↔daemon RPC and the optional IDE pub/sub channel.