AhaDiff User Guide · v1.3.8


AhaDiff turns a git diff into a code-grounded lesson, explicit claim states, a quiz, review cards, a concept graph and a quality ratchet. It is local-first by default: each repository keeps its own AhaDiff state under .ahadiff/.

Video tutorials:

English walkthrough (YouTube)

Chinese walkthrough (Bilibili)

Each video uses the matching UI language with burned-in subtitles; intermediate subtitle source files are not shipped in the public docs tree. Install with pip install ahadiff; the isolated pipx install ahadiff is now recommended (see Install below). The English walkthrough shows this; the Chinese Bilibili cut is being refreshed to match.

Getting started

1. Quick Start

Install the published CLI from PyPI. The isolated pipx install ahadiff (or uv tool install ahadiff) is recommended and needs no git clone. Contributors can still run from a source checkout when developing AhaDiff itself.

Install

Install the CLI and bundled WebUI in an isolated environment with pipx (recommended). uv tool install ahadiff works the same way; neither needs a git clone. pip install ahadiff is fine inside a venv or conda env. On a Homebrew or system Python a bare pip install trips externally-managed-environment, so use pipx or uv tool instead.

pipx install ahadiff
ahadiff --version

Develop from source (contributors)

Sync the locked dev environment, then confirm the module runs.

uv sync --locked --dev
uv run python -m ahadiff --version
uv run python -m ahadiff doctor

First-time use in a repository

When you first use AhaDiff inside a repo, initialize it and confirm the environment.

cd /path/to/your/repo
ahadiff init
ahadiff doctor
ahadiff config show --resolved

Run artifacts and review state live in this repo. Provider config can stay repo-local or global.

your-repo/
└─ .ahadiff/            repo-local state · per-repo, not global
   ├─ config.toml       provider, capture, privacy
   ├─ review.sqlite     SRS, result history, learning signals
   ├─ runs/             one folder per learn run
   └─ concepts.jsonl    accumulated concept graph

The storage gate accepts these SQLite builds

3.51.3+
3.50.4+ (backport)
3.44.6+ (backport)

uv tool ships SQLite 3.50.4 on Python 3.11.14. Already accepted.

Run ahadiff doctor before your first learn session to confirm your SQLite build clears the gate; it reports the detected version and the required threshold, and fails closed when the build is too old. A higher version number is not automatically accepted. The gate requires SQLite 3.51.3 or newer, or a backported 3.50.4+ / 3.44.6+ build, so 3.51.0-3.51.2 fall below the bar and are not accepted. If ahadiff doctor flags your build, switch to a Python whose bundled SQLite meets the bar, for example uv tool, the python.org installer, Homebrew, or conda.
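The acceptance rule above can be sketched in Python. This is a minimal sketch, not AhaDiff's actual gate: the function name sqlite_build_accepted is hypothetical, and it assumes each backport threshold applies only inside its own minor series (3.44.x and 3.50.x), which the guide implies but does not spell out. ahadiff doctor remains the authoritative check.

```python
import sqlite3


def sqlite_build_accepted(version: tuple[int, int, int]) -> bool:
    """Hypothetical sketch of the storage gate described in this guide."""
    major, minor, patch = version
    if (major, minor) == (3, 44):
        return patch >= 6  # backported 3.44.6+
    if (major, minor) == (3, 50):
        return patch >= 4  # backported 3.50.4+
    # Everything else needs 3.51.3 or newer, so 3.51.0-3.51.2 are rejected.
    return version >= (3, 51, 3)


# sqlite_version_info is the version of the SQLite library linked into this Python.
print(sqlite3.sqlite_version, sqlite_build_accepted(sqlite3.sqlite_version_info))
```

Running it with the same interpreter that runs AhaDiff shows which SQLite build that interpreter bundles.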

Welcome: the first screen after you open the WebUI, with the hero tagline, intro paragraph, and first-run quick start.

Setup

2. Configure a Provider

Configuring one LLM provider is the only setup step before ahadiff learn. There are two paths. From the CLI, the key goes in an environment variable and you hand AhaDiff only the variable name. From the WebUI Settings, you paste the plaintext key and AhaDiff stores it for you (see "How the WebUI stores your key" below). Either way, no key ever lands in Git, the README, manifests, or checked-in scripts.

What `provider test` does

Probe the model with a small request.

On success, save the provider to .ahadiff/config.toml.

Fail closed if the key env-var is unset.

Accepted key env-var names in repo config:

AHADIFF_PROVIDER_API_KEY
OPENAI_API_KEY
ANTHROPIC_API_KEY
GEMINI_API_KEY
AZURE_OPENAI_API_KEY

Name

--api-key-env is the variable NAME. An identifier-shaped value fails closed when the variable is unset, so a name is never sent as a bearer token by accident.

Secret

On this CLI path the real key stays in your shell environment and never goes in a flag. The WebUI path is different: it writes the plaintext key to .ahadiff/.env (see below). Both ways keep the key out of Git.

Provider setup

Run ahadiff serve and add a provider in Settings (the card previews model limits before you save, with no remote call and no key read), or run one provider test from the matching tab below.

Export your key and provider base URL, then run one probe.

export OPENAI_API_KEY="<your-provider-api-key>"
export AHADIFF_PROVIDER_BASE_URL="<provider-base-url>"
ahadiff provider test \
  --name default \
  --provider-class openai \
  --base-url "$AHADIFF_PROVIDER_BASE_URL" \
  --api-key-env OPENAI_API_KEY

Read the key without echoing it, then probe a Responses/API provider example. On macOS / Linux:

export AHADIFF_PROVIDER_BASE_URL="<provider-base-url>"
read -rsp "Provider API key: " AHADIFF_PROVIDER_API_KEY; export AHADIFF_PROVIDER_API_KEY; printf '\n'

ahadiff provider test \
  --name gpt55 \
  --provider-class openai_responses \
  --base-url "$AHADIFF_PROVIDER_BASE_URL" \
  --model gpt-5.5 \
  --api-key-env AHADIFF_PROVIDER_API_KEY \
  --privacy-mode explicit_remote

On Windows PowerShell, read it with Read-Host -AsSecureString:

$env:AHADIFF_PROVIDER_BASE_URL="<provider-base-url>"
$secure = Read-Host "Provider API key" -AsSecureString
$env:AHADIFF_PROVIDER_API_KEY = [System.Net.NetworkCredential]::new("", $secure).Password

ahadiff provider test `
  --name gpt55 `
  --provider-class openai_responses `
  --base-url $env:AHADIFF_PROVIDER_BASE_URL `
  --model gpt-5.5 `
  --api-key-env AHADIFF_PROVIDER_API_KEY `
  --privacy-mode explicit_remote

Use openai_compat for the official DeepSeek API so structured output uses JSON object mode. DeepSeek v4 flash/pro also expose the Thinking Level control after Settings preview confirms support; none disables thinking, while low, medium, and high all map to the API's high effort. A global-scope DeepSeek provider is valid in Settings Save, learn, and improve preflight without copying it into each repo.

export DEEPSEEK_API_KEY="<your-deepseek-api-key>"

ahadiff provider test \
  --name deepseek \
  --provider-class openai_compat \
  --base-url https://api.deepseek.com \
  --model deepseek-v4-flash \
  --api-key-env DEEPSEEK_API_KEY \
  --privacy-mode explicit_remote

Point at a local OpenAI-compatible server. Ollama and LM Studio expose their own base URLs.

ahadiff provider test \
  --name local \
  --provider-class lmstudio \
  --base-url "$LOCAL_PROVIDER_BASE_URL" \
  --api-key-env AHADIFF_PROVIDER_API_KEY

Settings: configuration and diagnostics, covering providers, privacy, audit, preferences, and project-level AI tool guidance.

How the WebUI stores your key

In Settings you paste the plaintext key directly into the API Key field; you do not type a variable name there. AhaDiff then:

Generates a unique reference name shaped like AHADIFF_<ALIAS>_KEY. Uniqueness avoids names already in your system environment and names already in .ahadiff/.env, adding a numeric suffix on a clash (for example AHADIFF_DEMO_2_KEY).
Writes the plaintext key only into the repo-local .ahadiff/.env file. On POSIX the file is chmod 0600; on Windows chmod is not equivalent to a POSIX owner-only ACL, so it is best-effort and not strong isolation.
Makes sure the secret patterns (.env, .env.*, audit.private.jsonl, *.lock, and *.log) stay git-ignored: it creates .ahadiff/.gitignore if it is missing, and if you already have one it appends only the missing secret lines (your existing lines are preserved). If that .gitignore is a symlink, hardlink, reparse point, or cannot be opened safely, saving is rejected instead of writing the key. Either way the key file is ignored by a normal git add (a forced git add -f could still override it).
Stores only the reference name in config.toml (api_key_env equals AHADIFF_<ALIAS>_KEY); the plaintext is never written to config.toml.
When you save with an API key in the field, probes the provider to validate connectivity. This is best-effort: a failed probe does not block the save, and the UI shows the validation result. Updating an existing provider with the key field left blank keeps the current key and skips the probe (verification is empty).
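The reference-name step in the list above can be sketched as follows. This is an illustration only: reference_name is a hypothetical helper, and the way the alias is normalized (upper-cased, non-alphanumeric runs turned into underscores) and the suffix starting at 2 are assumptions drawn from the AHADIFF_DEMO_2_KEY example.

```python
import re


def reference_name(alias: str, taken: set[str]) -> str:
    """Return AHADIFF_<ALIAS>_KEY, adding a numeric suffix on a clash."""
    # Assumption: the alias is upper-cased and non-alphanumerics become "_".
    slug = re.sub(r"[^A-Za-z0-9]+", "_", alias).strip("_").upper()
    candidate = f"AHADIFF_{slug}_KEY"
    suffix = 2
    while candidate in taken:
        candidate = f"AHADIFF_{slug}_{suffix}_KEY"
        suffix += 1
    return candidate


# "taken" stands for names in the system environment plus .ahadiff/.env.
print(reference_name("demo", {"AHADIFF_DEMO_KEY"}))  # AHADIFF_DEMO_2_KEY
```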

On startup, both serve and the CLI load .ahadiff/.env into the process environment. If a variable with the same name already exists in the system environment, the system value takes precedence and is not overwritten.
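A minimal sketch of that precedence rule, assuming .ahadiff/.env holds plain NAME=value lines (the guide does not document the exact file syntax) and using a hypothetical load_env_file helper:

```python
def load_env_file(dotenv_text: str, environ: dict[str, str]) -> dict[str, str]:
    """Merge NAME=value lines into environ without overwriting existing names."""
    for raw in dotenv_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        # setdefault keeps a value that is already set in the system environment.
        environ.setdefault(name.strip(), value.strip())
    return environ


env = {"OPENAI_API_KEY": "from-system"}
load_env_file("OPENAI_API_KEY=from-file\nAHADIFF_DEMO_KEY=demo-secret\n", env)
print(env["OPENAI_API_KEY"])  # from-system
```

Passing os.environ instead of a plain dict would apply the same rule to the real process environment.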

Deleting a provider cleans up its entry in .ahadiff/.env, but only unreferenced entries that use the AhaDiff-reserved AHADIFF_ prefix. System or shared env names (such as OPENAI_API_KEY), and names still referenced by another provider, are left untouched. Cleanup keys off the AHADIFF_ prefix and the reference count, not a record of which line AhaDiff authored, so keep that prefix reserved for AhaDiff and avoid hand-authoring your own AHADIFF_* variables in .ahadiff/.env.

The CLI path is unchanged: ahadiff provider test --api-key-env NAME still uses an environment-variable name, and it can now also resolve a reference name from .ahadiff/.env. An identifier-shaped value that is not a set environment variable fails closed (it is not sent as a literal key); a value containing dashes is used as a literal key.
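The fail-closed rule for --api-key-env can be pictured like this. It is a sketch under assumptions, not the CLI's code: resolve_api_key is hypothetical, "identifier-shaped" is approximated with a simple regular expression, and any value that is not identifier-shaped is treated as a literal key (the guide only confirms this for values containing dashes).

```python
import re

# Assumption: "identifier-shaped" means letters, digits, and underscores only.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def resolve_api_key(value: str, environ: dict[str, str]) -> str:
    """Resolve an --api-key-env value to the key that would be sent."""
    if value in environ:
        return environ[value]  # a set variable name resolves to its value
    if IDENTIFIER.fullmatch(value):
        # Looks like a name but is unset: fail closed, never send the name.
        raise LookupError(f"environment variable {value} is not set")
    return value  # not identifier-shaped (for example, contains dashes)


print(resolve_api_key("OPENAI_API_KEY", {"OPENAI_API_KEY": "example-key"}))
```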

Windows note: the local .ahadiff/.env is protected by NTFS folder permissions rather than POSIX 0600. For stricter handling, point api_key_env at a real OS environment variable (it takes precedence and is never written to .ahadiff/.env). For at-rest protection, use full-disk encryption such as BitLocker (or FileVault on macOS).

Registry context example

Provider class	Context budget	Applies to
`openai`	≈400k	ordinary openai access
`openai_responses` / API	≈1.05M	Responses / API access

For gpt-5.5, the bundled registry keeps these two context profiles. A trustworthy live probe can override either profile when the endpoint reports a real total context.

Once configured, your single repo or global provider (or the generate provider/model you pick in Settings) is used by ahadiff learn automatically. A repo provider with the same alias overrides the global alias. Pass --provider / --model only to override one run.

ahadiff learn --last

How a saved `max_output_tokens` is treated

Empty → Auto

No value means AhaDiff sizes the output limit for you.

Trusted hard max

A known, trusted ceiling is clamped on save and returns a warning.

Unknown / low-confidence / route-specific / local-runtime

These stay warnings only. They are not treated as a hard guarantee.
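The three cases above could be modeled roughly as follows. The function resolve_max_output and its parameters are hypothetical, and the warning strings are invented for illustration; only the Auto, clamp-with-warning, and warning-only outcomes come from this guide.

```python
from typing import Optional


def resolve_max_output(
    saved: Optional[int], ceiling: Optional[int], ceiling_trusted: bool
) -> tuple[Optional[int], list[str]]:
    """Return the effective max_output_tokens and any warnings."""
    warnings: list[str] = []
    if saved is None:
        return None, warnings  # empty means Auto: AhaDiff sizes the limit
    if ceiling is not None and saved > ceiling:
        if ceiling_trusted:
            # Trusted hard max: clamp on save and report a warning.
            warnings.append(f"clamped {saved} to trusted max {ceiling}")
            return ceiling, warnings
        # Unknown or low-confidence limit: warn only, keep the saved value.
        warnings.append(f"{saved} exceeds untrusted limit {ceiling}; kept as saved")
    return saved, warnings


print(resolve_max_output(5000, 4096, ceiling_trusted=True))
```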

Local server reports the wrong JSON capability?

Set only known boolean overrides in .ahadiff/config.toml. NewAPI defaults to supports_native_json_schema=false; if your gateway supports native JSON schema, add the override.

[providers.local.capability_overrides]
supports_native_json_schema = false

Workflow

3. Run a Learn Session

Pick a diff source

AhaDiff has 10 capture modes. Pick the tile that matches what you changed. Release validation records capture coverage and live LLM lesson runs separately; the v1.3.8 audit records the global-provider save regression and a live DeepSeek run.

Generate the lesson and claim states

AhaDiff captures the diff, scans for safety issues, and produces claims, lesson, quiz, score and other artifacts.

Open the WebUI to learn

Read the Lesson for the explanation, jump to Diff for the evidence, take the Quiz to check understanding, and use Review for spaced repetition.

Run the source that matches what you changed. The Dashboard's New Run dialog mirrors these same groups.

Group	Mode	What it captures	Command
Quick (working tree)	Last commit	The most recent commit.	`ahadiff learn --last`
Quick (working tree)	Staged	What is staged for the next commit. Stage with git add first.	`ahadiff learn --staged`
Quick (working tree)	Unstaged	Edits not staged yet; --include-untracked adds untracked files.	`ahadiff learn --unstaged --include-untracked`
Git Advanced	Time window	Commits since a point in time.	`ahadiff learn --since "2 hours ago"`
Git Advanced	Revision range	An explicit commit range.	`ahadiff learn HEAD~1..HEAD`
Patch	Patch file	A unified diff on disk.	`ahadiff learn --patch change.diff`
Patch	Patch stdin	A diff piped from another tool.	`ahadiff learn --patch -`
Patch	Patch URL	A remote unified diff.	`ahadiff learn --patch-url URL`
File Compare	Compare files	Two file versions, no git.	`ahadiff learn --compare old.py new.py`
File Compare	Compare dirs	Two directory trees, no git.	`ahadiff learn --compare-dir old/ new/`

10 modes → capture coverage
v1.3.8 live DeepSeek global-provider run recorded
Path scope: working-tree modes only · repo-relative paths

Add --against-spec SPEC.md to any of these to check the diff against a spec; add --spec-semantic-review for the semantic pass.

`--since` alone vs `--since --author`

--since

A multi-commit time window. Every commit in the window is captured.

--since --author

Exactly one matching commit. It skips other authors' commits in the window.

Patch and URL captures carry caveats:

No symbol index → claims may be weak (lesson still made)
Remote URL: same safety checks · loopback / private rejected
compare-dir: POSIX-only · fails closed elsewhere

The WebUI patch field accepts pasted unified diff text up to 65536 bytes; use the CLI for patch files, stdin, or larger patches. CLI patch files resolve inside the repo root; pass --patch - for externally generated patches.

Long-running CLI commands show Rich status while they run. Treat that as terminal feedback, not a machine-readable output contract.

Interface

4. Use the WebUI

Run one of these to open the local React SPA. serve opens a browser by default; add --no-browser to keep it headless, or --watch to auto-learn when working-tree files change.

ahadiff serve
ahadiff serve --port 8765 --no-browser
ahadiff serve --watch   # file watcher, included by default

Workflow at a glance

Diff: Capture the code change

Claims: Bind file:line evidence

Lesson: Generate the explanation

Quiz/Review: Test recall with SRS

Ratchet: Score history, improve preview, and export

Page map

Page	What it's for
`/`	Dashboard: latest runs, KPIs, ratchet trends, and review activity.
`/run/:runId`	Run detail: Overview, Score, Judge, Artifacts. If the optional LLM judge fails, the Judge tab shows a redacted failure panel instead of raw provider output.
`/run/:runId/lesson`	Lesson body, claim summary, evidence, and knowledge notes.
`/run/:runId/diff`	Unified/Split diff, claim dots, and the ClaimInspector.
`/run/:runId/quiz`	Guided / Recall / Transfer questions. The mode badge says whether the current question is a one-time Socratic quiz or an SRS review card.
`/review`	FSRS review queue with Again / Hard / Good / Easy grading.
`/concepts`	Concept ledger, concept graph, and Graphify source.
`/ratchet`	Result history, benchmark summary, improve preview, and export entry points.
`/settings`	Providers, capture, privacy, audit, preferences, and project-level AI tool guidance. AI Tool Guidance groups targets as CLI / IDE / CI, shows usage hints, and includes a provider-free built-in demo.
`/guide`	Daily commands and the 15 project AI tool guidance targets. It shows category filters, usage hints, and what each target would write before you apply changes; writing and removal stay in Settings → AI Tool Guidance.

About a dozen AI-tool targets write tool-native files. Settings → AI Tool Guidance lists them all and previews each file before writing; they stay local unless you commit them. For example:

.claude/skills/ahadiff/SKILL.md
.agents/skills/ahadiff/SKILL.md
.gemini/skills/ahadiff/SKILL.md
.agents/skills/ahadiff-antigravity/SKILL.md
.agents/skills/ahadiff-antigravity-cli/SKILL.md
.agents/rules/ahadiff.md
.github/instructions/ahadiff.instructions.md
.opencode/agents/ahadiff.md
.clinerules/ahadiff.md
.continue/rules/ahadiff.md
.cursor/rules/ahadiff.mdc
.roo/rules/ahadiff.md
.windsurf/rules/ahadiff.md

Repo guidance sections still live in user-managed files such as CLAUDE.md, AGENTS.md, GEMINI.md, and .github/copilot-instructions.md. This repository ignores generated .agents/ installs so local Codex / Antigravity skill output is not committed by default. Common Commands covers the hooks target and MCP registration.

Dashboard: runs, ratchet, and learning signals at a glance, including recent runs, scores, pass rate, ratchet trends, and review activity.

Claims and the diff

Every claim links to file:line evidence and carries one of five badges. Glyphs and semantics are the product's real ones; the hues are mapped to this page's palette.

✓ verified: evidence at file:line backs the claim

◆ weak: partial or indirect evidence

○ not_proven: no evidence found either way

✕ contradicted: evidence conflicts with the claim

⊘ rejected: claim discarded

Lesson: the claim-state learning note generated from a diff, with claim status, claim summary, knowledge notes, and the evidence sidebar.

Diff: code changes with inline claim evidence linking, showing the diff content, ClaimInspector, and the matching evidence lines.

Quiz and review

Review grades each card; the grade drives the next due date.

Again: forgot it; see it soon

Hard: recalled with effort

Good: recalled cleanly

Easy: trivial; push it out

Quiz: active-recall questions drawn from claim states, with recall and transfer questions, evidence reveal, and common misconceptions.

Review: spaced-repetition queue sorted by forgetting curve, using FSRS with Again / Hard / Good / Easy grading.

Concepts

Concepts graph: the imported knowledge graph view, with the Graphify source, concept nodes, and graph controls.

Reference

5. Common Commands

Daily work runs through learn, quiz, and review. The clusters below cover everything else: serving the UI and managing local data.

Daily loop

Learn and verify

Generate a run, then check, grade, and improve it.

ahadiff learn --last
ahadiff quiz RUN_ID
ahadiff review
ahadiff claims RUN_ID
ahadiff verify RUN_ID
ahadiff score RUN_ID
ahadiff improve-run RUN_ID

Serve

Open the local WebUI. --watch adds auto-learn on working-tree changes.

ahadiff serve
ahadiff serve --port 8765 --no-browser
ahadiff serve --watch   # file watcher, included by default

Watch

Auto-learn on working-tree changes. ahadiff serve --watch runs the same watcher with the WebUI attached.

ahadiff watch
ahadiff watch --debounce 2.0 --cooldown 30.0
ahadiff watch --dry-run

improve-run RUN_ID regenerates a lesson and keeps the new copy only when the deterministic score strictly improves, saving it as a separate run and leaving the original untouched. Use --candidates N (default 3, range 1-10) to control how many regeneration attempts are tried; a higher value improves the chance of beating the score but costs more tokens and time. It works in any install, including pip. The separate improve command tunes AhaDiff's own generation prompts and only runs inside an AhaDiff source checkout.
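The keep-only-if-better rule can be sketched as a small selection function. This is not the command's implementation: pick_improvement is a hypothetical name, and choosing the highest-scoring candidate among several is an assumption, since the guide only says the new copy is kept when the deterministic score strictly improves.

```python
from typing import Optional


def pick_improvement(original: float, candidates: list[float]) -> Optional[int]:
    """Return the index of the candidate to keep, or None to keep nothing."""
    if not 1 <= len(candidates) <= 10:
        raise ValueError("--candidates accepts 1-10 attempts")
    # Assumption: among several attempts, the best-scoring one is compared.
    best = max(range(len(candidates)), key=lambda i: candidates[i])
    # Strict improvement only: a tie with the original is discarded.
    return best if candidates[best] > original else None


print(pick_improvement(80.0, [79.0, 83.5, 81.0]))  # 1
print(pick_improvement(80.0, [79.0, 80.0, 78.5]))  # None
```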

Maintenance clusters

Database

Run the SQLite integrity gate over review.sqlite.

ahadiff db check

Graph

Inspect and re-import the Graphify artifact.

ahadiff graph status
ahadiff graph import
ahadiff graph refresh

Concepts

List, verify, and lint the concept ledger.

ahadiff concepts list
ahadiff concepts verify
ahadiff concepts lint

Export

Write results to disk for sharing (see Export & Share).

ahadiff export-results
ahadiff export preview RUN_ID

Challenge

Opt-in. Build a challenge from a run, then check state.

ahadiff challenge build RUN_ID
ahadiff challenge status

MCP

AhaDiff ships a read-only MCP server with 7 tools: runs, run summaries, due cards, search, concepts, stats, and lesson Q&A. Register it once per repo:

# Claude Code
claude mcp add ahadiff -- ahadiff mcp-server --repo-root <path>
# Codex CLI
codex mcp add ahadiff -- ahadiff mcp-server --repo-root <path>

Git hooks

ahadiff install hooks writes reminder-only git hooks. After each commit the post-commit hook prints one line: run ahadiff learn HEAD~1..HEAD to learn back this commit. Before each push the pre-push hook prints a verify reminder. In this default mode the hooks never run learn, never block a commit, and swallow their own failures.

Add --auto-learn for hands-free mode: the post-commit hook runs ahadiff learn --last in the background and appends output to .ahadiff/hooks.log. Commits from GUI clients work too: the hook falls back to the ahadiff path recorded at install time, and if it still cannot find ahadiff it appends a skip line to the log instead of blocking the commit. The learnability gate still applies, so trivial commits skip before any LLM call. When two commits land in quick succession, the per-repo write lock stops the second learn; check the log or re-run ahadiff learn --last yourself.

Re-run ahadiff install hooks with or without the flag to switch modes in place. macOS/Linux only.

ahadiff install hooks
ahadiff install hooks --auto-learn

Refresh vs learn: what touches Graphify

graph refresh / WebUI refresh

Import an existing artifact only. It does not run graphify update.

learn run

Runs graphify update first when the Graphify CLI is present, then imports.

Refresh timeout 600s

Output

6. Export & Share

The Ratchet / Export entry points in the WebUI offer four formats. The CLI covers the same ground.

ahadiff export-results
ahadiff export preview RUN_ID --out .ahadiff/export-preview

Static preview

A strict-local static bundle with a deterministic zip. Nothing leaves the machine.

TSV

Results as tab-separated values for spreadsheets.

JSON

Results as JSON for scripts and tooling.

Anki `.apkg`

Active review cards as an Anki deck.

default

No extra install needed.

APKG export works by default with a standard AhaDiff install because genanki is bundled.

Capabilities

7. Capabilities: Defaults, Opt-ins, and Dependencies

Capability pills distinguish default behavior from opt-in flows and dependency-backed exports. Open Details on any card for the exact behavior.

Default Works after install

Lesson / Claims / Quiz / Score

The default pipeline, produced after ahadiff learn.

Details

Which artifacts get created depends on the diff and its learnability. Patch-only captures can use weak diff-anchored claims when symbol-level proof is missing. S1 semantic entailment remains private shadow-only measurement, not semantic proof. In Quiz, source evidence stays locked until you answer.

Optional LLM judge

Advisory only; the deterministic score stays primary.

Details

Successful judge runs write judge.json. Failures write a bounded, redacted judge_failure.json with provider, model, error type, and a safe message. Missing judge artifacts return 404 rather than crashing the API.

Quiz question count

Fixed at 3 by default; adaptive is opt-in.

Details

Fixed mode accepts 1-30 questions. In Settings, or with --quiz-mode auto, AhaDiff uses diff stats to choose a bounded count; the default adaptive range is 3-12.

Structured JSON output

JSON object mode with one validation retry.

Details

Public artifacts stay the same. Native schema is used only when the provider capability reports support. Unsupported modes downgrade; truncated or malformed fallback JSON is retried, not accepted.

Adaptive capture limits

Auto for fresh configs; manual once you customize a limit.

Details

Auto mode sizes capture limits from five inputs:

provider probe
model registry
output reserve
safety reserve
CJK density

Editing any of these migrates the repo to manual mode:

capture.max_files
capture.hard_limit
capture.max_patch_bytes

50 MiB patch cap

Provider smart config

Draft model-limit preview in Settings before you save.

Details

The provider card previews limits from the draft provider class, model, base URL, and optional limits profile, with no remote probe on every edit. It shows thinking support, low-confidence warnings, a recommended max-output, and any clamp the save API returns. Official DeepSeek v4 flash/pro expose Thinking Level only when the official API route is detected; none disables thinking, while low, medium, and high map to the API's high effort. Generic OpenAI-compatible routes stay gated by backend metadata. Generate and judge limits are shown separately.

WebUI

Welcome, Dashboard, Diff, and Review open with ahadiff serve.

Details

The Welcome Before/After demo collapses long raw diffs to the lesson height and shows visible / total line counts. Short or empty diffs have no collapse control.

AI Tool Guidance

Settings writes the files; Guide is read-only.

Details

Settings shows for the 15 install targets:

CLI / IDE / CI grouping
Localized quick-start steps
Example prompts and expected behavior
Platform notes
A provider-free local demo

Settings

Writes files with guarded atomic replacement.

Guide

Read-only preview of the same hints and file writes; no write / remove buttons.

Rollback restores content and mode where the platform supports it

Opt-in Off until you turn it on

Challenge

Off by default; enable via config.

Details

The CLI exposes build / status; full progression happens through the WebUI / API.

Spec alignment

Runs only with --against-spec.

Details

Add --spec-semantic-review for the semantic pass. A run with no spec renders this dimension as N/A.

Dependency Needs an extra or external artifact

LLM provider (BYOK)

AhaDiff runs on the LLM provider you configure: any of the 9 supported formats, your choice of model. No specific model is required. Set it up in Configure a Provider.

Details

Set the provider, base URL, model, and the API-key environment variable before use. For DeepSeek BYOK, use openai_compat with https://api.deepseek.com; AhaDiff uses JSON object mode on that route rather than native JSON schema.

APKG export

Works by default; genanki is bundled.

Details

APKG export works by default with a standard AhaDiff install because genanki is bundled. Active review cards export without any extra install.

Serve --watch (serve + auto-learn)

Works by default; the file watcher is bundled.

Details

The file watcher is included by default, so ahadiff serve --watch works out of the box: it serves the WebUI and auto-learns on working-tree changes, using the same watcher as ahadiff watch. You can also run ahadiff learn manually.

Watch (auto-learn)

Filesystem-driven auto-learn for working-tree changes.

Details

ahadiff watch watches the working tree (unstaged and untracked files) and runs learn on change. Defaults: 2.0s debounce, 30.0s cooldown. --dry-run captures the diff without generating a lesson; --force-learn bypasses the learnability gate. Run it from any directory. Keep its log file outside the repo: a log inside the watched tree re-triggers the watcher.
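One way to picture how the two timers interact is the sketch below. The semantics are assumed, not documented here: debounce is read as "wait until the tree has been quiet this long", cooldown as "minimum gap between two learn runs", and should_learn is a hypothetical helper that takes timestamps in seconds.

```python
from typing import Optional


def should_learn(
    now: float,
    last_change: float,
    last_run: Optional[float],
    debounce: float = 2.0,
    cooldown: float = 30.0,
) -> bool:
    """Decide whether a watcher tick should start a learn run."""
    # Assumed meaning of debounce: no file change for at least this long.
    settled = now - last_change >= debounce
    # Assumed meaning of cooldown: enough time since the previous run.
    cooled = last_run is None or now - last_run >= cooldown
    return settled and cooled


print(should_learn(now=12.0, last_change=9.0, last_run=None))  # True
print(should_learn(now=20.0, last_change=9.0, last_run=5.0))   # False
```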

Graphify refresh

Imports an existing artifact; capped at 50000 nodes.

Details

It re-imports graphify-out/graph.json into AhaDiff's cache and does not run graphify update. The cap is graph.max_nodes_import (default 50000); over-limit refreshes return GRAPH_NODE_LIMIT.

How a run is scored

Eight dimensions, 100 points. A run can fail on score gates, contradicted claims, evidence coverage, or safety gates.

Dimension	Points	Score gate
accuracy	20	≥14
evidence	18	≥12
diff_coverage	14	
learnability	14	
quiz_transfer	10	
spec_alignment	10	
conciseness	8	
safety_privacy	6	

A run can score high yet still FAIL when a hard gate trips or when critical_safety_findings fire.
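To make the "high score, still FAIL" case concrete, here is a partial sketch using only the numbers in this section. The summarize helper and its return shape are hypothetical, and it models just the two base score gates plus critical safety findings; contradicted claims, evidence coverage, and adaptive thresholds are left out.

```python
MAX_POINTS = {
    "accuracy": 20, "evidence": 18, "diff_coverage": 14, "learnability": 14,
    "quiz_transfer": 10, "spec_alignment": 10, "conciseness": 8, "safety_privacy": 6,
}
SCORE_GATES = {"accuracy": 14, "evidence": 12}  # base thresholds from this guide


def summarize(scores: dict[str, float], critical_safety_findings: int = 0):
    """Return (total, failures); any failure means the run fails."""
    total = sum(scores.values())
    failures = [
        f"{dim} below {gate}" for dim, gate in SCORE_GATES.items() if scores[dim] < gate
    ]
    if critical_safety_findings:
        failures.append("critical safety finding")
    return total, failures


# Full marks everywhere except accuracy, which misses its gate by one point.
scores = dict(MAX_POINTS, accuracy=13)
print(summarize(scores))  # (93, ['accuracy below 14'])
```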

Run Detail overview: a single learn run's outcome and metadata.

Run Detail score: the 8-dimension evaluation breakdown.

Help

8. FAQ

Why is there no Lesson / Diff / Quiz?

These pages need a specific run_id. Run ahadiff learn at least once, then open the corresponding run from the Dashboard or from the command's output.

What if the provider fails?

Run the diagnostics and re-test the provider. On Windows PowerShell, set the variables with the $env: form and pass --base-url $env:AHADIFF_PROVIDER_BASE_URL, as shown in Configure a Provider.

ahadiff doctor
ahadiff config show --resolved

Then rerun the matching provider command from Configure a Provider.

What if the lesson gets skipped?

The diff might be too small, learnability might be too low, or there aren't enough usable claim states. If you're sure this is the change you want to study, re-run with --force-learn.

How do I change quiz question count?

Use Settings → Preferences. Fixed mode always asks the configured number of questions, from 1 to 30. Adaptive mode uses the captured diff stats; its default range is 3-12, and old runs without those stats fall back to the fixed count.
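The fixed and adaptive ranges can be sketched as a clamp. The guide does not say how diff stats become a question count, so the suggested parameter below is a hypothetical stand-in for that heuristic, and question_count itself is an illustrative helper rather than AhaDiff's code.

```python
from typing import Optional


def question_count(
    mode: str, fixed: int = 3, suggested: Optional[int] = None,
    low: int = 3, high: int = 12,
) -> int:
    """Pick the number of quiz questions for a run."""
    if not 1 <= fixed <= 30:
        raise ValueError("fixed count must be between 1 and 30")
    if mode == "auto" and suggested is not None:
        # Adaptive mode: keep the stats-based suggestion inside the range.
        return max(low, min(high, suggested))
    # Fixed mode, or an old run with no diff stats: use the fixed count.
    return fixed


print(question_count("auto", fixed=5, suggested=40))  # 12
```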

How are capture limits chosen?

New repos use auto capture sizing on a priority ladder:

Live provider probe data.

The bundled model registry.

Conservative defaults.

Generate and judge limits resolve separately. The output reserve is subtracted from the input budget once. Only explicit split-envelope rows, such as Gemini, store input plus output as total context.

Manual mode keeps the numbers you set in config or Settings. The Settings provider form previews these limits from the unsaved draft, with no remote probe.
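The priority ladder and the single subtraction of the output reserve can be sketched as below. Function names and all numbers are hypothetical, and the sketch ignores split-envelope rows and the other auto-sizing inputs.

```python
from typing import Optional


def resolve_total_context(
    probe: Optional[int], registry: Optional[int], default: int
) -> tuple[int, str]:
    """Walk the ladder: live probe, then bundled registry, then defaults."""
    if probe is not None:
        return probe, "probe"
    if registry is not None:
        return registry, "registry"
    return default, "default"


def input_budget(total_context: int, output_reserve: int) -> int:
    """Subtract the output reserve from the total exactly once."""
    return max(0, total_context - output_reserve)


total, source = resolve_total_context(probe=None, registry=400_000, default=32_000)
print(source, input_budget(total, output_reserve=16_000))  # registry 384000
```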

Does the LLM judge decide the final verdict?

No. The final verdict comes from deterministic score.json and hard gates. judge.json is advisory; it helps explain model feedback, but it does not override the deterministic verdict. If a dimension is not applicable, such as spec_alignment on a no-spec run, Score and Judge render it as N/A / 0/0.

Why can Accuracy / Evidence gates change?

The base thresholds hold:

Accuracy 14/20
Evidence 12/18

Large diffs can use an adaptive threshold from visible files, hunks, and changed lines, with ratio, regime, and basis written to hard_gates.*.policy. This never forgives bad evidence. The run still fails when:

high rejected-claim ratio
contradicted claims
safety issues
genuinely missing evidence

How is Diff Coverage judged?

Diff Coverage uses only the visible files and hunks in the persisted line_map.json:

visible-files-only denominator
changed-line weighting
hunk-count floor
adaptive anchor threshold

Omitted files do not enter the denominator. The hunk-count floor keeps tiny hunks counting. The hard gate writes its ratio, regime, and visible basis into the gate detail.

What safety findings fail a run?

An unmitigated Critical safety finding fails the run. A Critical secret finding is treated as mitigated only when the local capture record shows a complete redaction shape, including the rule, hash, source, line, and column.
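A sketch of the "complete redaction shape" check follows. The five field names come from the sentence above; the dict layout, the sample values, and the secret_is_mitigated helper are assumptions made for illustration.

```python
from typing import Optional

REDACTION_FIELDS = ("rule", "hash", "source", "line", "column")


def secret_is_mitigated(redaction: Optional[dict]) -> bool:
    """A redaction record counts only when every listed field is present."""
    if redaction is None:
        return False
    return all(redaction.get(field) is not None for field in REDACTION_FIELDS)


# Hypothetical capture records: one complete, one missing three fields.
complete = {"rule": "generic-api-key", "hash": "ab12", "source": "app.py", "line": 7, "column": 12}
partial = {"rule": "generic-api-key", "hash": "ab12"}
print(secret_is_mitigated(complete), secret_is_mitigated(partial))  # True False
```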

Can you guarantee there are no bugs?

No. AhaDiff reduces risk by keeping data local, linking lesson claims to diff evidence, and exposing deterministic score and hard-gate details. You should still review generated lessons and run your normal test suite before relying on a change. For release status, use the current GitHub Actions and release notes instead of this static guide.

Notes

9. About

Screenshots in this guide are examples only. They carry no real provider credentials, repository contents, or user data.

Platform support

macOS is the local validation platform for v1.3.8. Windows path and global-config behavior are covered by targeted unit tests; real Windows runtime status belongs to the release workflow. Antigravity stays Unknown/blocked until a real Antigravity check is run. Two features stay POSIX-only.

Feature	macOS · Linux	Windows
Core CLI	✓	✓
`serve` + WebUI	✓	✓
`--compare-dir`	✓	✗
`hooks` install target	✓	✗
Install rollback	✓	partial

--compare-dir needs the secure directory file descriptor available on macOS / Linux only, and fails closed elsewhere. The hooks target uses POSIX shell hooks. Hooks are reminder-only by default; see the Git hooks card for the --auto-learn variant. Rollback restores mode before atomic replace on POSIX, and uses a best-effort mode restore after replace where fchmod is missing.

Validation snapshot

Local checks recorded for the current docs snapshot. For v1.3.8, the recorded local gate covers backend unit/integration/eval, ruff/format/pyright, viewer typecheck/Vitest/build, i18n parity, real serve save reproduction, and a live DeepSeek learn run. Release and CI status belong in GitHub Actions.

Validated

backend unit
ruff / format / pyright
viewer Vitest + build
i18n parity
macOS local
v1.3.8 gate recorded
live DeepSeek global-provider run
Antigravity Unknown/blocked

The 10-mode learn claim in this guide is capture coverage unless the validation audit lists each command, run_id, exit code, and lesson/claims/quiz/card artifacts. The v1.3.8 audit records the DeepSeek global-provider regression separately from the older 10-mode matrix; GitHub Release and PyPI publish are tracked by the release workflow and release notes.
