ADR-200 Phase 3a: Annealing Controls Exercise Report
Date: 2026-01-30 Epoch range: 0 → 13 Ontologies tested: distributed-systems, database-architecture, consensus-theory (created and dissolved)
Setup
Two ontologies created with intentional thematic overlap:
- distributed-systems — consensus protocols, fault tolerance, distributed storage, eventual consistency, service discovery, network partitions, plus ADRs (072, 014, 200)
- database-architecture — storage engines, query optimization, replication/transactions, indexing, vacuum/bloat, distributed databases, plus ADRs (016, 024, 040)
Bridging concepts expected: replication, consistency, WAL, LSM-trees, CockroachDB, Raft consensus.
Operations Log (Chronological)
- Created ontologies:
distributed-systems,database-architectureviakg ontology create - Ingested 3 seed documents into each ontology (consensus protocols, fault tolerance, distributed storage / storage engines, query optimization, replication) via MCP
ingesttool withaction="text"(MCP file allowlist not yet configured) - Scored both ontologies → Round 1 at epoch 0 (3 docs each, ~21 concepts each)
- Ingested 4 more documents into distributed-systems (eventual consistency, service discovery, network partitions, vector clocks) and 3 into database-architecture (indexing, vacuum/bloat, distributed databases)
- Scored both → Round 2 at epoch 7 (7/6 docs, ~44/40 concepts)
- Discovered epoch counter bug:
document_ingestion_counternever incremented by ingestion worker. Fixed by addingincrement_counter('document_ingestion_counter')call at job completion iningestion_worker.py - Ingested 3 ADR documents into each ontology (ADR-072, ADR-014, ADR-200 into distributed-systems; ADR-016, ADR-024, ADR-040 into database-architecture) — again via
action="text"workaround - Scored both → Round 3 at epoch 13 (10/9 docs, ~64/63 concepts)
- Configured MCP file allowlist:
kg mcp-config init-allowlist+kg mcp-config allow-dirfor project directory - Reassign: moved 3 ADR sources (ADR-072, ADR-014, ADR-200) from distributed-systems → database-architecture. Required raw SQL to discover source IDs (gap #245). Re-scored both.
- Split: created
consensus-theoryontology, reassigned 2 consensus-related sources from distributed-systems. Scored the new ontology. - Dissolve: dissolved
consensus-theoryback into distributed-systems. Verified scores returned to pre-split values. - Checked cross-ontology affinity at each round (3 → 4 → 5 shared concepts)
Scoring Progression
distributed-systems
| Metric | Round 1 (ep 0, 3 docs) | Round 2 (ep 7, 7 docs) | Round 3 (ep 13, 10 docs) | Post-reassign (ep 13, 7 docs) |
|---|---|---|---|---|
| Concepts | 21 | 44 | 64 | 44 |
| Mass | 0.322 | 0.501 | 0.583 | 0.501 |
| Coherence | 0.604 | 0.590 | 0.565 | 0.590 |
| Exposure | 0.000 | 0.123 | 0.206 | 0.206 |
| Protection | 0.228 | 0.231 | 0.208 | 0.179 |
database-architecture
| Metric | Round 1 (ep 0, 3 docs) | Round 2 (ep 7, 6 docs) | Round 3 (ep 13, 9 docs) | Post-reassign (ep 13, 12 docs) |
|---|---|---|---|---|
| Concepts | 21 | 40 | 63 | 63+ |
| Mass | 0.327 | 0.495 | 0.611 | 0.662 |
| Coherence | 0.602 | 0.588 | 0.577 | 0.564 |
| Exposure | 0.000 | 0.123 | 0.206 | 0.206 |
| Protection | 0.229 | 0.226 | 0.227 | 0.248 |
Observations
Mass (Michaelis-Menten saturation)
Working as designed. Mass climbs steeply in the early phase (0.32 → 0.50 on doubling content), then begins to saturate (0.50 → 0.58 on another ~50% increase). The curve flattens as ontologies grow, which is the intended behavior — raw size matters less as substance accumulates.
Coherence (mean pairwise cosine similarity)
Gradual decline with growth: 0.604 → 0.590 → 0.565. This correctly reflects that broader topic coverage dilutes semantic focus. An ontology covering "consensus protocols" alone is tighter than one also covering "network partitions" and "service discovery." The decline is gentle, not catastrophic — the topics are related enough to maintain reasonable coherence.
Notably, the small consensus-theory ontology (2 sources, split off then dissolved) scored 0.663 coherence — the highest observed. Small + focused = high coherence. This is the signal a annealing worker would use to identify "nuclei" vs "crossroads."
Exposure (epoch-based pressure)
Appeared once the epoch counter started incrementing (fixed during this session — ingestion worker wasn't calling increment_counter). At epoch 13, both ontologies show ~0.21 raw exposure. With a half-life of 50 epochs, this is still in the gentle ramp-up phase. The formula is working but needs more epoch depth to see meaningful pressure differentiation.
Protection (composite score)
Protection stayed relatively flat across all rounds (0.228 → 0.231 → 0.208 → 0.179 for distributed-systems). Mass gains are being offset by exposure pressure. This is arithmetically correct per the formula, but raises a question: should young ontologies with rapidly growing mass feel this much exposure pressure?
At epoch 13 with mass 0.50, a protection score of 0.18 feels low. The ontology is healthy and growing. A possible tuning: reduce exposure weight for ontologies below a mass threshold, or adjust the sigmoid steepness in the protection formula so that mass gains have more impact in the 0.3–0.6 range.
Not actionable yet — we need to see how the curve behaves at epoch 50+ with mature ontologies. Logging this observation for Phase 3b tuning.
Cross-Ontology Affinity
Affinity tracked shared concepts accurately:
| Round | Shared Concepts | Total | Affinity % |
|---|---|---|---|
| Round 1 | 3 | 23 | 13.0% |
| Round 2 | 4 | 44 | 9.1% |
| Round 3 | 5 | 64 | 7.8% |
Absolute overlap grew (3 → 5 shared concepts), but proportional affinity declined as ontologies accumulated more distinct content. This is the expected behavior — casual overlap becomes proportionally smaller relative to domain-specific content.
The AnnealingTest ontologies from earlier testing also appeared at 1.6% affinity after ADR-200 was ingested — the annealing ontologies ADR created a tiny bridge.
Operations Exercised
Reassign (sources between ontologies)
Moved 3 ADR sources from distributed-systems to database-architecture. Results:
- Donor (distributed-systems): mass dropped 0.583 → 0.501, coherence rose 0.565 → 0.590
- Recipient (database-architecture): mass rose 0.611 → 0.662, coherence dropped 0.577 → 0.564
Mass follows source count. Coherence inversely reflects breadth — losing broad content tightens focus, gaining it dilutes. Both responses are correct and expected.
Split (create + reassign)
Created consensus-theory, moved 2 consensus-related sources into it:
- consensus-theory: mass 0.225, coherence 0.663, protection 0.188
- Highest coherence observed — small + focused ontologies are semantically tight
This is exactly the signal Phase 3b's annealing worker would look for: high coherence suggests a natural cluster, but low mass and protection make it a demotion candidate unless it grows.
Dissolve (non-destructive absorption)
Dissolved consensus-theory back into distributed-systems. Scores returned to pre-split values exactly. The round-trip is clean — no data lost, no score artifacts.
Bugs Found and Fixed
-
Epoch counter not incrementing —
document_ingestion_counteringraph_metricswas never incremented by the ingestion worker. Addedincrement_counter('document_ingestion_counter')call at job completion. Fixed in this session. -
MCP file allowlist not configured — MCP ingest with
action="file"failed silently because~/.config/kg/mcp-allowed-paths.jsondidn't exist. Fixed by runningkg mcp-config init-allowlistandkg mcp-config allow-dir.
Gaps Identified
-
Source ID discoverability (#245) — The files endpoint, documents endpoint, and CLI don't expose source IDs needed for reassignment. Had to use raw SQL to find them. The reassign workflow needs a self-contained path from "show me what's in this ontology" to "here are the IDs to move."
-
Epoch tagging on concepts — Concepts don't carry a
created_at_epochproperty. This would provide a causal timeline of graph growth and enable questions like "what appeared after epoch N?" and "is this ontology's concept-per-epoch ratio declining?" The data for a derived ratio already exists (concept_creation_counter / document_ingestion_counter), but per-concept tagging would be more granular. -
Protection sensitivity — Protection stays flat as mass grows because exposure pressure offsets gains. Needs observation at higher epoch depth before tuning.
Conclusions
The Phase 3a annealing controls work as designed. Scoring responds to real structural changes. Reassignment and dissolution are clean operations with predictable score impacts. The tools are ready for Phase 3b automation — the manual controls prove the primitives are sound.
Key validation: the system correctly distinguishes focused clusters (high coherence, low mass) from broad collections (lower coherence, high mass). This is the foundation for automated promotion/demotion decisions.