"The Guided Tour"

Author: npub1cgppglfhgq0...
Published:
Format: Markdown (kind 30023)
Identifier:
naddr1qvzqqqr4gupzpsszz37nwsqljzg5jmsnj5t0yjwhrgs2zlm597gav6vh3w72242xqq2rvdp3xykhg6r994nh26tyv4jz6ar0w4eqzzw3kw

Expert walkthroughs — a senior developer guiding a newcomer through the codebase, explaining not just what the code does but why it's structured that way, what the traps are, where the technical debt lives — are among the most effective onboarding tools. They're also the most expensive. The expert's time is valuable, the walkthrough is one-to-one, and the knowledge transferred evaporates if the newcomer doesn't retain it. When the expert leaves the organization, the walkthrough leaves with them.

Code tours are a lightweight persistence format: annotated sequences of file locations with explanations, like a museum audio guide for a codebase. Each stop in the tour points to a specific line or region of code and provides context — what this function does, why this pattern was chosen, what would break if you changed it. The tour is a document that survives the expert's departure and can be reused for every new hire.
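To make the format concrete, here is a minimal sketch of a tour as data: an ordered list of stops, each pairing a file region with an explanation. The names `TourStop` and `CodeTour`, the field layout, and the example paths are all hypothetical, chosen for illustration rather than taken from any particular tool.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TourStop:
    """One stop: a region of code plus the context a guide would give."""
    path: str        # file path relative to the repository root (hypothetical)
    start_line: int  # first line of the region, 1-indexed
    end_line: int    # last line of the region, inclusive
    note: str        # what it does, why it is built this way, what would break


@dataclass
class CodeTour:
    """An ordered sequence of stops; the order is part of the content."""
    title: str
    stops: list[TourStop] = field(default_factory=list)

    def render(self) -> str:
        """Render the tour as plain text, one numbered line per stop."""
        lines = [self.title]
        for number, stop in enumerate(self.stops, start=1):
            if stop.start_line == stop.end_line:
                span = str(stop.start_line)
            else:
                span = f"{stop.start_line}-{stop.end_line}"
            lines.append(f"{number}. {stop.path}:{span}: {stop.note}")
        return "\n".join(lines)


# Example data is invented for illustration.
tour = CodeTour(
    title="Request handling",
    stops=[
        TourStop("app/server.py", 12, 40, "Entry point; every request passes through here."),
        TourStop("app/auth.py", 88, 88, "Token check; reordering it breaks session reuse."),
    ],
)
print(tour.render())
```

Because the tour is plain data, it can be versioned alongside the code it describes and replayed for each new hire.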

The AI challenge: generating code tours automatically. An LLM can read the codebase and generate explanations, but the hard part is not explanation — it's curation. Which parts of the codebase matter for a newcomer? In what order should they be encountered? What needs explanation versus what is self-evident? The curation requires understanding not just the code but the newcomer's likely confusions, which depend on their background and the codebase's idiosyncratic decisions.

LACY simulates expert mentoring by generating code tours that mimic the structure of human-created tours: starting with high-level architecture, descending into critical paths, highlighting non-obvious design decisions, and connecting implementation details to requirements. As proxies for expert knowledge, the simulation uses the codebase's git history (files that change together reveal architectural coupling), its issue tracker (what confused previous newcomers), and its test coverage (what's fragile).
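The co-change signal can be illustrated with a small sketch. It assumes each commit has already been reduced to the list of files it touched (for example, by parsing the output of `git log --name-only`); the function name `co_change_counts` and the sample commits are hypothetical, and this is not a description of how LACY itself computes coupling.

```python
from collections import Counter
from itertools import combinations


def co_change_counts(commits):
    """Count how often each pair of files changed in the same commit."""
    pairs = Counter()
    for files in commits:
        # sorted(set(...)) gives every unordered pair one canonical key
        for a, b in combinations(sorted(set(files)), 2):
            pairs[(a, b)] += 1
    return pairs


# Invented history: each inner list is the set of files in one commit.
commits = [
    ["api/routes.py", "db/models.py"],
    ["api/routes.py", "db/models.py", "README.md"],
    ["ui/form.js"],
    ["api/routes.py", "db/models.py"],
]

coupling = co_change_counts(commits)
for (a, b), count in coupling.most_common(1):
    print(f"{a} and {b} changed together in {count} commits")
```

Pairs with high counts are candidates for adjacent tour stops, since a newcomer who edits one file will likely need to understand the other.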

The through-claim: the expert's value during onboarding is not their knowledge of the code — an LLM has that too — but their model of the newcomer's confusion. The hard part of a code tour is predicting what won't be obvious, and this prediction requires not code understanding but empathy with a specific reader's state of knowledge. LACY's proxies (git history, issues, tests) approximate this empathy with evidence of what actually confused people before.
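As a rough illustration of how such evidence might drive curation, the sketch below ranks candidate files by a combined score. Everything here is an assumption made for the example: the `Evidence` fields, the weights, the linear formula, and the sample numbers are invented, and nothing in it is taken from LACY.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """Per-file signals standing in for a model of newcomer confusion."""
    path: str
    co_changes: int       # commits where the file changed alongside others
    confused_issues: int  # issues in which someone asked about the file
    coverage: float       # fraction of lines covered by tests, 0.0 to 1.0


def confusion_score(e, w_coupling=1.0, w_issues=2.0, w_fragile=3.0):
    """Hypothetical linear score; the weights are arbitrary placeholders."""
    return (
        w_coupling * e.co_changes
        + w_issues * e.confused_issues
        + w_fragile * (1.0 - e.coverage)  # less coverage counts as more risk
    )


def pick_stops(evidence, limit):
    """Return the `limit` highest-scoring files as candidate tour stops."""
    return sorted(evidence, key=confusion_score, reverse=True)[:limit]


# Invented sample data.
candidates = [
    Evidence("utils/strings.py", co_changes=1, confused_issues=0, coverage=0.95),
    Evidence("billing/ledger.py", co_changes=8, confused_issues=5, coverage=0.40),
    Evidence("api/routes.py", co_changes=6, confused_issues=2, coverage=0.70),
]

for e in pick_stops(candidates, limit=2):
    print(e.path)
```

The score only selects what to include; deciding the order of stops and what each explanation should say remains the harder curation problem described above.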
