logs.gokuls.in

3 pull requests merged across 1 repo

bahdotsh/indxr

  • Adds wiki_contribute MCP tool that lets agents write knowledge back into the wiki during conversations
  • Adds wiki_generate and wiki_update MCP tools so agents can bootstrap and maintain the wiki autonomously via MCP (no CLI needed)
  • Closes the feedback loop: agents can now generate a wiki, read it, synthesize answers, and file insights back as new or updated pages
  • Implements the key missing piece for Karpathy's LLM Wiki pattern — "good answers can be filed back into the wiki as new pages"

New tools (6 wiki tools total)

Tool              What it does
wiki_generate     Full wiki generation from the codebase via LLM. Bootstraps the wiki autonomously.
wiki_update       Incremental update: only regenerates pages whose source files changed.
wiki_contribute   Agent writes pages directly (no LLM cost). Create or update with auto cross-referencing.
wiki_search       (existing) Search by keyword/concept
wiki_read         (existing) Read a page by ID
wiki_status       (existing) Health check
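
To illustrate the rules that the wiki_contribute tests exercise (missing params, title required for new pages, invalid page ID, default page_type), here is a minimal sketch of the validation step. It is not the indxr implementation: the parameter names, the ID rule (lowercase ASCII, digits, hyphens) and the default type "topic" are all assumptions made for the example.

```rust
/// Hypothetical error type for a `wiki_contribute` call.
#[derive(Debug, PartialEq)]
pub enum ContributeError {
    MissingParam(&'static str),
    InvalidPageId(String),
    TitleRequired,
}

/// Hypothetical validated request.
#[derive(Debug, PartialEq)]
pub struct Contribution {
    pub page: String,
    pub title: Option<String>,
    pub page_type: String,
    pub content: String,
}

/// Validates raw tool arguments. `page_exists` tells the function whether
/// the call is an update (true) or a create (false).
pub fn validate_contribution(
    page: Option<&str>,
    title: Option<&str>,
    page_type: Option<&str>,
    content: Option<&str>,
    page_exists: bool,
) -> Result<Contribution, ContributeError> {
    let page = page.ok_or(ContributeError::MissingParam("page"))?;
    let content = content.ok_or(ContributeError::MissingParam("content"))?;

    // Assumed ID rule: non-empty, lowercase ASCII letters, digits and hyphens.
    let valid_id = !page.is_empty()
        && page
            .chars()
            .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '-');
    if !valid_id {
        return Err(ContributeError::InvalidPageId(page.to_string()));
    }

    // A new page needs a title; an update can keep the existing one.
    if !page_exists && title.is_none() {
        return Err(ContributeError::TitleRequired);
    }

    Ok(Contribution {
        page: page.to_string(),
        title: title.map(str::to_string),
        // Assumed default; the real default page type may differ.
        page_type: page_type.unwrap_or("topic").to_string(),
        content: content.to_string(),
    })
}
```

Rejecting unusual IDs early also keeps page IDs safe to use as file names under .indxr/wiki/, which is one plausible reason for such a check.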

The compounding loop this enables

1. Agent generates wiki via wiki_generate

2. Agent explores code via find/summarize/read

3. Agent checks existing knowledge via wiki_search/wiki_read

4. Agent synthesizes an answer for the user

5. Agent files the insight back via wiki_contribute with [[cross-links]]

6. After code changes, agent runs wiki_update to keep wiki current

7. Next conversation, that knowledge is already compiled in the wiki
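
Step 5 depends on pulling [[cross-links]] out of contributed content. A minimal sketch of that extraction is shown here; the function name and the de-duplication behaviour are assumptions, not taken from the indxr source.

```rust
/// Collects the targets of `[[wiki-style]]` links in page content,
/// in order of first appearance and without duplicates.
/// Hypothetical helper; not taken from the indxr source.
pub fn extract_cross_links(content: &str) -> Vec<String> {
    let mut links: Vec<String> = Vec::new();
    let mut rest = content;

    while let Some(start) = rest.find("[[") {
        let after_open = &rest[start + 2..];
        let Some(end) = after_open.find("]]") else {
            break; // unterminated link: stop scanning
        };
        let target = after_open[..end].trim();
        if !target.is_empty() && !links.iter().any(|l| l == target) {
            links.push(target.to_string());
        }
        // Continue scanning after the closing brackets.
        rest = &after_open[end + 2..];
    }
    links
}
```

The extracted targets could then be stored in the page's frontmatter so that later searches can follow links without re-parsing the body.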

Test plan

  • cargo fmt --check — clean
  • cargo clippy --features wiki — clean
  • cargo clippy (no features) — clean
  • cargo test --features wiki — 401 tests pass (394 unit + 7 integration)
  • cargo test (no features) — 358 tests pass
  • 8 tests for wiki_contribute: create, update, missing params, title required for new pages, invalid page ID, default page_type, cross-link extraction, tool listed in definitions
  • 3 new MCP tools (wiki_search, wiki_read, wiki_status) expose generated wiki content to AI agents during sessions, feature-gated behind --features wiki
  • Enhanced indxr wiki status with staleness tracking (commits behind HEAD), affected pages preview, and source file coverage percentage
  • Wiki store loaded at MCP server startup (both stdio and HTTP transports), with graceful fallback when no wiki exists

Details

MCP tools

  • wiki_search(query, limit?) — keyword search across page titles, covers, content, and source files with multi-signal scoring
  • wiki_read(page) — read by exact ID, case-insensitive match, or partial title search with helpful "not found" listing
  • wiki_status() — page count, type breakdown, staleness, and coverage stats
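
The wiki_read lookup order described above (exact ID, then case-insensitive ID, then partial title, then a "not found" listing) can be sketched roughly as follows. The Page struct and resolve_page function are hypothetical names used only for this illustration.

```rust
/// Hypothetical minimal page record.
#[derive(Debug, PartialEq)]
pub struct Page {
    pub id: String,
    pub title: String,
}

/// Resolves a query to a page. On failure, returns the IDs of all
/// available pages so the caller can print a helpful listing.
pub fn resolve_page<'a>(pages: &'a [Page], query: &str) -> Result<&'a Page, Vec<&'a str>> {
    let needle = query.trim().to_lowercase();

    // 1. Exact ID match.
    if let Some(p) = pages.iter().find(|p| p.id == query) {
        return Ok(p);
    }
    // 2. Case-insensitive ID match.
    if let Some(p) = pages.iter().find(|p| p.id.to_lowercase() == needle) {
        return Ok(p);
    }
    // 3. Partial, case-insensitive title match (first hit wins).
    if !needle.is_empty() {
        if let Some(p) = pages
            .iter()
            .find(|p| p.title.to_lowercase().contains(&needle))
        {
            return Ok(p);
        }
    }
    // 4. Nothing matched: list what exists.
    Err(pages.iter().map(|p| p.id.as_str()).collect())
}
```

Trying the strictest match first means a loose title match can never shadow a page whose ID the agent spelled exactly.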

Enhanced status

  • Commits behind HEAD count via git rev-list
  • Affected pages preview (which pages would need updating)
  • Coverage percentage with uncovered file listing
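
As a rough illustration of the two numbers above, the sketch below computes a coverage percentage with an uncovered-file listing, and counts commits behind HEAD by shelling out to git rev-list --count. The function names and data shapes are assumptions; only the git command itself is standard.

```rust
use std::collections::BTreeSet;
use std::process::Command;

/// Hypothetical coverage report.
pub struct Coverage {
    pub percent: f64,
    pub uncovered: Vec<String>,
}

/// `source_files` is every indexed file; `page_sources` holds, per wiki page,
/// the source files that page documents.
pub fn coverage(source_files: &[&str], page_sources: &[Vec<&str>]) -> Coverage {
    let covered: BTreeSet<&str> = page_sources.iter().flatten().copied().collect();
    let uncovered: Vec<String> = source_files
        .iter()
        .filter(|f| !covered.contains(**f))
        .map(|f| f.to_string())
        .collect();

    let total = source_files.len();
    let percent = if total == 0 {
        100.0 // nothing to cover
    } else {
        100.0 * (total - uncovered.len()) as f64 / total as f64
    };
    Coverage { percent, uncovered }
}

/// Number of commits reachable from HEAD but not from `git_ref`,
/// or None if git is unavailable or the ref is unknown.
pub fn commits_behind(git_ref: &str) -> Option<u64> {
    let range = format!("{git_ref}..HEAD");
    let out = Command::new("git")
        .args(["rev-list", "--count", range.as_str()])
        .output()
        .ok()?;
    if !out.status.success() {
        return None;
    }
    String::from_utf8(out.stdout).ok()?.trim().parse().ok()
}
```

Returning None instead of an error from commits_behind is a design choice for the sketch: a status command can still print page counts and coverage when the directory is not a git repository.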

Wiring

  • WikiStoreOption type alias for clean conditional compilation
  • Early-dispatch pattern in handle_tools_call (matches existing regenerate_index/get_diff_summary pattern)
  • Threaded through full call chain: run_mcp_server → handle_stdin_line → process_jsonrpc_message → process_jsonrpc_request → handle_tools_call
  • HTTP transport (AppState) updated with wiki store field
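
One way a type alias can keep conditional compilation tidy is sketched below: the alias resolves to an optional store when the wiki feature is on and to the unit type otherwise, so function signatures along the call chain stay identical in both builds. The alias definitions, the WikiStore fields and the response strings are assumptions for illustration; the real WikiStoreOption may be defined differently.

```rust
/// Hypothetical stand-in for the real store type.
#[cfg(feature = "wiki")]
pub struct WikiStore {
    pub page_count: usize,
}

// With the feature: the store may or may not have been loaded at startup.
#[cfg(feature = "wiki")]
pub type WikiStoreOption = Option<WikiStore>;

// Without the feature: a zero-sized placeholder keeps signatures unchanged.
#[cfg(not(feature = "wiki"))]
pub type WikiStoreOption = ();

/// Early dispatch: wiki tools are handled before the regular tool table.
pub fn handle_tools_call(name: &str, wiki: &WikiStoreOption) -> String {
    #[cfg(feature = "wiki")]
    {
        if name.starts_with("wiki_") {
            return match wiki {
                Some(store) => format!("{name}: {} pages", store.page_count),
                // Graceful fallback when no wiki exists yet.
                None => format!("{name}: no wiki found, run wiki_generate first"),
            };
        }
    }
    #[cfg(not(feature = "wiki"))]
    let _ = wiki; // placeholder is unused in the lean build

    format!("{name}: handled by the regular tool dispatcher")
}
```

Because every function in the chain takes a WikiStoreOption parameter in both configurations, no #[cfg] attributes are needed on the signatures themselves.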

Test plan

  • cargo fmt — clean
  • cargo clippy --features wiki — clean
  • cargo clippy (no features) — clean
  • cargo test --features wiki — 390 tests pass (including 12 new wiki tool tests + 7 wiki integration tests)
  • cargo test (no features) — 358 tests pass
  • cargo check --features "wiki,http" — compiles
  • MCP tool list verification — wiki tools appear in tools/list response
  • Adds an LLM-powered wiki layer that compiles codebase understanding into persistent, interlinked markdown pages — knowledge that compounds across sessions instead of being re-derived every time
  • Feature-gated behind --features wiki (adds reqwest + tokio) so the default binary stays lean
  • Provider-agnostic LLM client supporting Claude (ANTHROPIC_API_KEY) and any OpenAI-compatible endpoint (OPENAI_API_KEY)
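
A provider-agnostic client has to decide which backend to use from the environment. The sketch below shows one possible selection rule; the Provider enum, the select_provider function and the precedence (ANTHROPIC_API_KEY first) are assumptions, and only the two variable names come from the description above.

```rust
/// Hypothetical provider choice.
#[derive(Debug, PartialEq)]
pub enum Provider {
    Claude { api_key: String },
    OpenAiCompatible { api_key: String },
}

/// Picks a provider from an environment lookup. The lookup is injected so
/// the logic can be tested without touching the process environment.
pub fn select_provider(lookup: impl Fn(&str) -> Option<String>) -> Option<Provider> {
    // Treat empty or whitespace-only values as unset.
    let non_empty = |name: &str| lookup(name).filter(|v| !v.trim().is_empty());

    // Assumed precedence: Claude wins when both keys are present.
    if let Some(api_key) = non_empty("ANTHROPIC_API_KEY") {
        return Some(Provider::Claude { api_key });
    }
    non_empty("OPENAI_API_KEY").map(|api_key| Provider::OpenAiCompatible { api_key })
}
```

In a real binary the call would look like select_provider(|name| std::env::var(name).ok()), with a None result reported as a missing-credentials error.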

What's included (Phase 1: Foundation)

New modules:

  • src/llm/: LLM client abstraction with Claude and OpenAI-compatible backends
  • src/wiki/page.rs: WikiPage, Frontmatter, PageType with YAML frontmatter parse/render
  • src/wiki/store.rs: WikiStore for loading/saving/querying wiki pages from .indxr/wiki/
  • src/wiki/generate.rs: 3-stage generation engine: plan structure → generate pages → build cross-reference index
  • src/wiki/prompts.rs: LLM prompt templates optimized for codebase documentation
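
To make the frontmatter parse/render idea concrete, here is a small std-only sketch that splits a page into its YAML header and markdown body and reads simple scalar fields. It is not the page.rs implementation (which presumably uses a proper YAML parser); the function names and the example keys are assumptions, and list-valued fields are deliberately out of scope.

```rust
/// Splits a page into (frontmatter, body). Returns None when the text does
/// not start with a `---` line or has no closing `---` line. An empty
/// frontmatter block is not handled by this sketch.
pub fn split_frontmatter(text: &str) -> Option<(&str, &str)> {
    let rest = text.strip_prefix("---\n")?;
    let end = rest.find("\n---\n")?;
    // Skip the 5 bytes of "\n---\n" to reach the body.
    Some((&rest[..end], &rest[end + 5..]))
}

/// Reads a single-line `key: value` entry. Nested or list values are ignored.
pub fn scalar_field<'a>(frontmatter: &'a str, key: &str) -> Option<&'a str> {
    frontmatter.lines().find_map(|line| {
        let (k, v) = line.split_once(':')?;
        (k.trim() == key).then(|| v.trim())
    })
}

/// Inverse of `split_frontmatter` for non-empty frontmatter.
pub fn render_page(frontmatter: &str, body: &str) -> String {
    format!("---\n{frontmatter}\n---\n{body}")
}
```

Keeping parse and render as exact inverses means an update that touches only the body leaves the header byte-for-byte unchanged, which keeps diffs small when the wiki is committed to version control.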

CLI commands:

indxr wiki generate              # generate wiki from scratch
indxr wiki generate --dry-run    # plan only, no LLM calls
indxr wiki update                # update affected pages (Phase 2 - currently falls back to full regen)
indxr wiki status                # show page count, coverage, staleness
indxr wiki --model <model> ...   # override LLM model

Key design decisions:

  • The generation engine sends compact structural summaries (~2K tokens) to the LLM instead of raw source (~20K), leveraging indxr's existing index
  • Wiki pages use YAML frontmatter tracking source files, git ref, cross-references, and covered declarations — enabling surgical incremental updates via the structural diff system
  • The wiki is stored as plain markdown in .indxr/wiki/ — any agent can read it, and it can be committed to version control

What's next (Phase 2 & 3)

  • Phase 2: Incremental updates — use StructuralDiff to identify which wiki pages are affected by code changes and update only those
  • Phase 3: MCP integration — wiki_search, wiki_read, wiki_status tools so agents can query the wiki during sessions
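
The Phase 2 idea of updating only affected pages can be sketched as a simple intersection between changed files and each page's tracked source files. This stand-in works on plain file paths rather than the real StructuralDiff type, and PageMeta and affected_pages are hypothetical names.

```rust
use std::collections::HashSet;

/// Hypothetical slice of a page's frontmatter.
pub struct PageMeta {
    pub id: String,
    pub source_files: Vec<String>,
}

/// Returns the IDs of pages that track at least one changed file,
/// in the same order as `pages`.
pub fn affected_pages<'a>(pages: &'a [PageMeta], changed_files: &[&str]) -> Vec<&'a str> {
    let changed: HashSet<&str> = changed_files.iter().copied().collect();
    pages
        .iter()
        .filter(|p| p.source_files.iter().any(|f| changed.contains(f.as_str())))
        .map(|p| p.id.as_str())
        .collect()
}
```

A declaration-level diff could narrow this further by skipping pages whose covered declarations did not change even though their file did.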

Test plan

  • cargo build --features wiki compiles cleanly
  • cargo build (without wiki) still compiles — no leakage
  • cargo test --features wiki — all 362 tests pass (4 new wiki tests)
  • cargo test — all 358 tests pass (wiki tests excluded by cfg)
  • indxr wiki --help shows correct subcommands
  • End-to-end: ANTHROPIC_API_KEY=... indxr wiki generate on a real codebase