Docstring Coverage & Staleness
A cross-language tool for measuring documentation coverage and tracking docstring freshness across the codebase.
Quick start
# Coverage report (all languages)
python3 scripts/development/lint/docstring_coverage.py
# Verbose — show each undocumented item with file:line
python3 scripts/development/lint/docstring_coverage.py -v
# Staleness analysis — compare @verified tags against git history
python3 scripts/development/lint/docstring_coverage.py --staleness
# CI gate — fail if coverage drops below threshold
python3 scripts/development/lint/docstring_coverage.py --fail-under 80
What it scans
| Language | Directories | Scanner |
|---|---|---|
| Python | api/app/ | interrogate (AST), or built-in ast fallback |
| TypeScript | web/src/, cli/src/ | Built-in regex |
| Rust | graph-accel/core/src/, graph-accel/ext/src/ | Built-in regex |
What counts as "documentable"
| Language | Items | Doc style |
|---|---|---|
| Python | def, async def, class (public + module-level) | """...""" |
| TypeScript | export function, export const, export class, export interface, export type | /** ... */ |
| Rust | pub fn, pub struct, pub enum, pub trait, pub type, pub const | /// |
Exclusions
- Python: __init__.py, test files, private helpers (_name)
- TypeScript: *.test.ts, *.spec.ts, *.d.ts, re-exports
- Rust: #[cfg(test)] modules, benchmark crate
Staleness tracking
The --staleness flag enables docstring freshness analysis using @verified tags and git history.
The problem
Docstrings drift from reality. Code changes, docstrings don't. Without a mechanism to detect this, documentation silently becomes misleading.
The approach: three dates, no computation
For each documented item, the tool knows three things:
- Current date: when you're running the check (now)
- Docstring date: the commit stamped in the @verified tag
- File last commit: from git log
This produces a tristate:
| Status | Meaning |
|---|---|
| current | @verified commit matches or postdates the file's last commit |
| stale | File was modified after the @verified commit (with drift in days) |
| unverified | No @verified tag — docstring exists but freshness is unknown |
No hashing, no sidecar files, no derived state. Git is the database.
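The tristate can be sketched as a single comparison between two commit timestamps. This is a minimal illustration, not the tool's code: Freshness and classify are hypothetical names, and timestamps are assumed to be Unix epoch seconds.

```python
from dataclasses import dataclass
from typing import Optional

SECONDS_PER_DAY = 86_400


@dataclass
class Freshness:
    status: str          # "current", "stale", or "unverified"
    drift_days: int = 0  # only meaningful when status == "stale"


def classify(verified_ts: Optional[int], file_ts: int) -> Freshness:
    """Compare the @verified commit time with the file's last commit time."""
    if verified_ts is None:
        return Freshness("unverified")  # no @verified tag on the docstring
    if verified_ts >= file_ts:
        return Freshness("current")     # verified at or after the last change
    # The file changed after verification: report the gap in whole days.
    return Freshness("stale", (file_ts - verified_ts) // SECONDS_PER_DAY)
```

Both inputs come straight from git, so nothing needs to be stored between runs.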
Adding @verified tags
When you write or review a docstring, stamp it with the current short commit hash. The tag format is the same across all three languages:
Python:
def calculate_grounding(concept_id: str) -> float:
    """Calculate grounding strength for a concept.

    Uses the two-tier cache (vocabulary + per-concept) to avoid
    redundant graph queries.

    @verified a1b2c3f
    """
TypeScript (JSDoc):
/**
 * Execute a saved query by replaying its Cypher statements.
 *
 * @verified a1b2c3f
 */
export function executeSavedQuery(query: QueryDefinition): void {
Rust:
/// Perform BFS neighborhood traversal from a concept node.
///
/// @verified a1b2c3f
pub fn neighborhood(start: &str, max_depth: usize) -> Vec<Edge> {
Get the current short hash with:
git rev-parse --short HEAD
Reading the staleness report
=== Staleness Report ===
api/app/lib/graph_facade.py (last commit: today)
GraphFacade ✓ current @verified a1b2c3f
neighborhood ⚠ stale @verified 77e2876 — 12d drift
find_path · unverified
- current: The docstring was verified at or after the file's last change. It reflects the current code.
- stale (Nd drift): The file changed N days after the @verified commit. The docstring may no longer be accurate.
- unverified: No @verified tag. The docstring might be fine, but nobody has explicitly confirmed it.
Workflow
The staleness report is a self-maintaining todo list:
- Run the coverage tool with --staleness
- Stale items surface automatically: code changed, docstring didn't
- Review the docstring against the current code
- If accurate, update the @verified hash to the current commit
- If inaccurate, fix the docstring and stamp it
For incremental docstring passes with coding agents, this gives a natural priority queue: missing docstrings first, then stale ones, then unverified ones for spot-checking.
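One way to turn that ordering into a work queue is sketched below. The tuple layout, the "missing" status label, and the review_queue name are assumptions for illustration. Sorting stale items by largest drift first is also an assumption, not documented behavior.

```python
# Lower number = review sooner. "current" items need no action.
PRIORITY = {"missing": 0, "stale": 1, "unverified": 2}


def review_queue(items):
    """Order (name, status, drift_days) tuples for a docstring pass."""
    pending = [item for item in items if item[1] in PRIORITY]
    # Within a status, larger drift sorts first (assumed tie-breaker).
    return sorted(pending, key=lambda item: (PRIORITY[item[1]], -item[2]))


items = [
    ("GraphFacade", "current", 0),
    ("neighborhood", "stale", 12),
    ("find_path", "unverified", 0),
    ("helper", "missing", 0),  # hypothetical undocumented item
]
queue = review_queue(items)
```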
CLI reference
usage: docstring_coverage.py [-h] [-v] [--fail-under FAIL_UNDER]
[--python-only] [--ts-only] [--rust-only]
[--staleness] [--no-color]
options:
-v, --verbose Show each undocumented item with file:line
--fail-under N Exit non-zero if overall coverage < N%
--python-only Run only Python scan
--ts-only Run only TypeScript scan
--rust-only Run only Rust scan
--staleness Analyze docstring freshness via @verified tags
--no-color Disable ANSI color output
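The --fail-under gate boils down to a percentage check. The sketch below shows one plausible shape for it. gate_exit_code is a hypothetical name, exit code 1 stands in for "non-zero", and treating an empty scan as 100% coverage is an assumption.

```python
from typing import Optional


def gate_exit_code(documented: int, total: int,
                   fail_under: Optional[float]) -> int:
    """Return a process exit code for the coverage gate."""
    # Assumption: nothing to document counts as full coverage.
    coverage = 100.0 * documented / total if total else 100.0
    if fail_under is not None and coverage < fail_under:
        return 1  # any non-zero value fails the CI step
    return 0
```

A caller would pass the result to sys.exit() after printing the report, so CI sees both the output and the failure.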
Notes
- --staleness forces the Python scanner to use AST mode (bypassing interrogate) because staleness analysis requires reading docstring content, which interrogate doesn't expose.
- The tool finds the project root by looking for CLAUDE.md or .git. Run from anywhere within the repo.
- Install interrogate (pip install interrogate) for more accurate Python coverage. The tool falls back to ast.parse() gracefully.
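Project-root discovery of the kind described in the notes above is typically a walk up the directory tree. The sketch below assumes that approach. find_project_root is a hypothetical name, and returning None when no marker is found is an assumption.

```python
from pathlib import Path
from typing import Optional

# Marker names taken from the notes above.
MARKERS = ("CLAUDE.md", ".git")


def find_project_root(start: Path) -> Optional[Path]:
    """Return the nearest ancestor of start that contains a marker."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if any((candidate / marker).exists() for marker in MARKERS):
            return candidate
    return None  # not inside a recognizable project
```

Calling find_project_root(Path.cwd()) is what would make "run from anywhere within the repo" work.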
Architecture
docstring_coverage.py
├── Shared
│ └── _extract_verified(doc_text) ← single @verified regex for all languages
├── Python scanner
│ ├── interrogate (subprocess) ← preferred, AST-based
│ └── ast.parse fallback ← extracts docstring text for @verified
├── TypeScript scanner
│ └── _extract_jsdoc_above(lines, i) ← returns JSDoc text or None
├── Rust scanner
│ └── _extract_rust_doc_above(lines, i) ← returns /// text or None
├── Git helpers
│ ├── _git_file_last_commit(path) ← one call per file
│ └── _git_commit_timestamp(hash) ← one call per unique @verified hash
└── Staleness analysis
├── compute_staleness(results) ← batch resolve, build entries
└── print_staleness_report(entries) ← tristate report
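The git helpers in the tree above could look roughly like the sketch below. The function names are hypothetical, and the real tool may use different git invocations. Using functools.lru_cache is one assumed way to get "one call per file" and "one call per unique hash". %ct is git's committer timestamp in epoch seconds.

```python
import subprocess
from functools import lru_cache
from typing import Optional


def _git(*args: str) -> Optional[str]:
    """Run git and return stripped stdout, or None on any failure."""
    try:
        proc = subprocess.run(["git", *args], capture_output=True,
                              text=True, check=False)
    except OSError:  # git is not installed or not executable
        return None
    out = proc.stdout.strip()
    return out if proc.returncode == 0 and out else None


@lru_cache(maxsize=None)
def file_last_commit_ts(path: str) -> Optional[int]:
    """Committer timestamp of the last commit that touched path."""
    out = _git("log", "-1", "--format=%ct", "--", path)
    return int(out) if out else None


@lru_cache(maxsize=None)
def commit_ts(commit: str) -> Optional[int]:
    """Committer timestamp of a commit, or None if it cannot be resolved."""
    out = _git("show", "-s", "--format=%ct", commit)
    return int(out) if out else None
```

Returning None instead of raising lets an unresolvable @verified hash be reported rather than aborting the scan. How the real tool handles that case is not stated here.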
Each language scanner follows the same pattern:
1. Extract the full doc comment text (language-specific parsing)
2. Pass it to _extract_verified() (shared regex)
3. Store the result on DocItem.verified_commit
This keeps the @verified tag format consistent across languages
while letting each scanner handle its own doc comment syntax.
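To make the pattern concrete, here is a minimal sketch of a shared tag regex plus one language-specific extractor. The regex and function bodies are assumptions rather than the tool's actual code. The sketch accepts 7 to 40 lowercase hex characters and ignores attribute lines such as #[derive(...)] between the doc comment and the item.

```python
import re
from typing import List, Optional

# Assumed pattern: "@verified" followed by an abbreviated or full hash.
VERIFIED_RE = re.compile(r"@verified\s+([0-9a-f]{7,40})\b")


def extract_verified(doc_text: Optional[str]) -> Optional[str]:
    """Return the commit hash from an @verified tag, if present."""
    match = VERIFIED_RE.search(doc_text or "")
    return match.group(1) if match else None


def rust_doc_above(lines: List[str], i: int) -> Optional[str]:
    """Collect the contiguous /// lines directly above lines[i]."""
    doc = []
    j = i - 1
    while j >= 0 and lines[j].lstrip().startswith("///"):
        doc.append(lines[j].lstrip()[3:].strip())
        j -= 1
    return "\n".join(reversed(doc)) if doc else None


lines = [
    "/// Perform BFS neighborhood traversal from a concept node.",
    "///",
    "/// @verified a1b2c3f",
    "pub fn neighborhood(start: &str, max_depth: usize) -> Vec<Edge> {",
]
verified = extract_verified(rust_doc_above(lines, 3))
```

Because the regex only sees plain text, the same extract_verified call works on a Python docstring or a JSDoc body once each scanner has stripped its own comment markers.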