Database Schema
Relational schema for the Kappa Graph control plane. The knowledge graph itself (concepts, sources, instances, and their typed edges) lives in the Apache AGE knowledge_graph graph; the tables below hold operational state, authorization, and observability around it.
Backed by PostgreSQL 18 with Apache AGE 1.7.0. This page is generated from schema/00_baseline.sql and schema/migrations/*.sql; do not edit it by hand.
Schemas
| Schema | Purpose | Tables |
|---|---|---|
public |
Cross-schema bookkeeping (migration tracking). | 2 |
kg_api |
API operational state: jobs, sessions, vocabulary, ontology. | 40 |
kg_auth |
Authentication and authorization (dynamic RBAC). | 14 |
kg_logs |
Observability: audit trails, metrics, health. | 4 |
public
Cross-schema bookkeeping (migration tracking).
graph_metrics
Change counters for triggering periodic epistemic status measurement
| Column | Type | Constraints | Description |
|---|---|---|---|
metric_name |
character varying(255) |
PK; NOT NULL | Unique metric identifier (e.g., vocabulary_change_counter, concept_count) |
counter |
bigint |
NOT NULL; DEFAULT 0 | Increments on every change (create/delete/consolidate) - never decrements |
last_measured_counter |
bigint |
NOT NULL; DEFAULT 0 | Counter value when epistemic status was last measured |
last_measured_at |
timestamp without time zone |
Timestamp when epistemic status was last measured | |
updated_at |
timestamp without time zone |
DEFAULT CURRENT_TIMESTAMP | Timestamp of last counter increment |
notes |
text |
schema_migrations
Tracks applied schema migrations for safe schema evolution - ADR-040
| Column | Type | Constraints | Description |
|---|---|---|---|
version |
integer |
PK; NOT NULL | Sequential migration number (001, 002, 003, ...) |
name |
text |
NOT NULL | Descriptive migration name (e.g., baseline, add_embedding_config) |
applied_at |
timestamp without time zone |
NOT NULL; DEFAULT now() | Timestamp when migration was applied |
kg_api
API operational state: jobs, sessions, vocabulary, ontology.
Relationships
erDiagram
query_definitions ||--o{ artifacts : "query_definition_id"
users ||--o{ artifacts : "owner_id"
ontology_versions ||--o{ concept_version_metadata : "created_in_version"
ontology_versions ||--o{ concept_version_metadata : "last_modified_version"
graph_epoch_kinds ||--o{ graph_epochs : "kind"
artifacts ||--o{ jobs : "artifact_id"
users ||--o{ jobs : "user_id"
users ||--o{ query_definitions : "owner_id"
relationship_vocabulary ||--o{ synonym_clusters : "representative_type"
embedding_generation_jobs ||--o{ system_initialization_status : "initialization_job_id"
aggressiveness_profiles
| Column | Type | Constraints | Description |
|---|---|---|---|
profile_name |
character varying(50) |
PK; NOT NULL | |
control_x1 |
double precision |
NOT NULL | |
control_y1 |
double precision |
NOT NULL | |
control_x2 |
double precision |
NOT NULL | |
control_y2 |
double precision |
NOT NULL | |
description |
text |
||
is_builtin |
boolean |
DEFAULT false | |
created_at |
timestamp without time zone |
DEFAULT now() | |
updated_at |
timestamp without time zone |
DEFAULT now() |
Table constraints:
CONSTRAINT aggressiveness_profiles_control_x1_check CHECK (((control_x1 >= (0.0)::double precision) AND (control_x1 <= (1.0)::double precision)))CONSTRAINT aggressiveness_profiles_control_x2_check CHECK (((control_x2 >= (0.0)::double precision) AND (control_x2 <= (1.0)::double precision)))
ai_extraction_config
AI extraction provider configuration for runtime-switchable models - ADR-041
| Column | Type | Constraints | Description |
|---|---|---|---|
id |
integer |
PK; NOT NULL; DEFAULT nextval('kg_api.ai_extraction_config_id_seq'::regclass) | |
provider |
character varying(50) |
NOT NULL; UNIQUE | AI provider: openai, anthropic, ollama, or vllm |
model_name |
character varying(200) |
NOT NULL | Model identifier (e.g., gpt-4o, claude-sonnet-4-20250514) |
supports_vision |
boolean |
DEFAULT false | Whether the model supports vision/image inputs |
supports_json_mode |
boolean |
DEFAULT true | Whether the model supports JSON mode for structured outputs |
max_tokens |
integer |
Maximum token limit for the model | |
created_at |
timestamp with time zone |
DEFAULT CURRENT_TIMESTAMP | |
updated_at |
timestamp with time zone |
DEFAULT CURRENT_TIMESTAMP | |
updated_by |
character varying(100) |
||
active |
boolean |
DEFAULT true | Only one config can be active at a time (enforced by unique index) |
base_url |
character varying(255) |
Base URL for local providers (e.g., http://localhost:11434 for Ollama) | |
temperature |
double precision |
DEFAULT 0.1 | Sampling temperature (0.0-1.0, lower = more consistent). Used by local providers. |
top_p |
double precision |
DEFAULT 0.9 | Nucleus sampling threshold (0.0-1.0). Used by local providers. |
gpu_layers |
integer |
DEFAULT '-1'::integer | GPU layers for inference: -1 = auto, 0 = CPU only, >0 = specific layer count (llama.cpp) |
num_threads |
integer |
DEFAULT 4 | CPU threads for inference (used by local CPU-based providers) |
thinking_mode |
character varying(20) |
DEFAULT 'off'::character varying | Thinking mode for reasoning models (Ollama 0.12.x+): off, low, medium, high. GPT-OSS: off=low, others pass through. Standard models: off=disabled, low/medium/high=enabled. |
max_concurrent_requests |
integer |
DEFAULT 4 | Maximum number of concurrent API requests allowed for this provider. Limits parallelism to prevent rate limit errors and resource thrashing. Recommended: OpenAI=8, Anthropic=4, Ollama=1 |
max_retries |
integer |
DEFAULT 8 | Maximum number of retry attempts for rate-limited requests (429 errors). Uses exponential backoff with jitter: 1s, 2s, 4s, 8s, 16s, 32s, 64s, ... Higher values provide more resilience with multiple workers. Recommended: 8 for cloud providers, 3 for local |
Table constraints:
CONSTRAINT ai_extraction_config_max_concurrent_requests_check CHECK (((max_concurrent_requests >= 1) AND (max_concurrent_requests <= 100)))CONSTRAINT ai_extraction_config_max_retries_check CHECK (((max_retries >= 0) AND (max_retries <= 20)))CONSTRAINT ai_extraction_config_provider_check CHECK (((provider)::text = ANY ((ARRAY['openai'::character varying, 'anthropic'::character varying, 'ollama'::character varying, 'openrouter'::character varying, 'llamacpp'::character varying])::text[])))CONSTRAINT ai_extraction_config_temperature_check CHECK (((temperature >= (0.0)::double precision) AND (temperature <= (1.0)::double precision)))CONSTRAINT ai_extraction_config_thinking_mode_check CHECK (((thinking_mode)::text = ANY ((ARRAY['off'::character varying, 'low'::character varying, 'medium'::character varying, 'high'::character varying])::text[])))CONSTRAINT ai_extraction_config_top_p_check CHECK (((top_p >= (0.0)::double precision) AND (top_p <= (1.0)::double precision)))
ai_vision_config
Active vision (image->prose) provider selection — ADR-802 / #378. Selection-only; connectivity reused from per-provider config.
| Column | Type | Constraints | Description |
|---|---|---|---|
id |
integer |
PK; NOT NULL; DEFAULT nextval('kg_api.ai_vision_config_id_seq'::regclass) | |
provider |
character varying(50) |
NOT NULL; UNIQUE | Provider performing image->prose description |
model_name |
character varying(200) |
NOT NULL; DEFAULT ''::character varying | Vision model id; '' resolves from the catalog supports_vision rows |
max_tokens |
integer |
||
temperature |
double precision |
||
created_at |
timestamp with time zone |
DEFAULT CURRENT_TIMESTAMP | |
updated_at |
timestamp with time zone |
DEFAULT CURRENT_TIMESTAMP | |
updated_by |
character varying(100) |
||
active |
boolean |
DEFAULT true | Only one vision config active at a time (enforced by partial unique index) |
Table constraints:
CONSTRAINT ai_vision_config_provider_check CHECK (((provider)::text = ANY ((ARRAY['openai'::character varying, 'anthropic'::character varying, 'ollama'::character varying, 'openrouter'::character varying, 'llamacpp'::character varying])::text[])))CONSTRAINT ai_vision_config_temperature_check CHECK (((temperature IS NULL) OR ((temperature >= (0.0)::double precision) AND (temperature <= (1.0)::double precision))))
annealing_options
Tunable parameters for ontology annealing cycles (ADR-200 Phase 3b). Code defaults apply when a key is absent; database values override.
| Column | Type | Constraints | Description |
|---|---|---|---|
key |
character varying(100) |
PK; NOT NULL | |
value |
text |
NOT NULL | |
description |
text |
||
updated_at |
timestamp with time zone |
DEFAULT now() |
annealing_pressure_history
One row per annealing cycle: ecological snapshot + Bezier pressure read-out (#249, ADR-206 §Phase 3). Drives the web admin "pressure" panel and the future trend chart.
| Column | Type | Constraints | Description |
|---|---|---|---|
id |
integer |
PK; NOT NULL; DEFAULT nextval('kg_api.annealing_pressure_history_id_seq'::regclass) | |
epoch |
integer |
NOT NULL | |
total_ontologies |
integer |
NOT NULL | |
total_concepts |
integer |
NOT NULL | |
avg_concepts_per_ontology |
double precision |
NOT NULL | |
pressure_score |
double precision |
NOT NULL | |
pressure_zone |
character varying(20) |
NOT NULL | |
pressure_recommendation |
jsonb |
NOT NULL; DEFAULT '{}'::jsonb | |
recorded_at |
timestamp with time zone |
NOT NULL; DEFAULT now() |
Table constraints:
CONSTRAINT annealing_pressure_history_pressure_score_check CHECK (((pressure_score >= (0.0)::double precision) AND (pressure_score <= (1.0)::double precision)))
annealing_proposals
| Column | Type | Constraints | Description |
|---|---|---|---|
id |
integer |
PK; NOT NULL; DEFAULT nextval('kg_api.annealing_proposals_id_seq'::regclass) | |
proposal_type |
character varying(20) |
NOT NULL | |
ontology_name |
character varying(200) |
NOT NULL | |
anchor_concept_id |
character varying(100) |
||
target_ontology |
character varying(200) |
||
reasoning |
text |
NOT NULL | |
mass_score |
numeric(10,4) |
||
coherence_score |
numeric(10,4) |
||
protection_score |
numeric(10,4) |
||
status |
character varying(20) |
NOT NULL; DEFAULT 'pending'::character varying | |
created_at |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
created_at_epoch |
integer |
NOT NULL; DEFAULT 0 | |
reviewed_at |
timestamp with time zone |
||
reviewed_by |
character varying(100) |
||
reviewer_notes |
text |
||
expires_at |
timestamp with time zone |
DEFAULT (now() + '7 days'::interval) | |
executed_at |
timestamp with time zone |
||
execution_result |
jsonb |
||
suggested_name |
character varying(200) |
||
suggested_description |
text |
||
proposal_kind |
character varying(20) |
NOT NULL; DEFAULT 'ontology'::character varying | |
params |
jsonb |
Table constraints:
CONSTRAINT annealing_proposals_proposal_kind_check CHECK (((proposal_kind)::text = ANY ((ARRAY['ontology'::character varying, 'control'::character varying])::text[])))CONSTRAINT annealing_proposals_proposal_type_check CHECK (((proposal_type)::text = ANY ((ARRAY['CLEAVE'::character varying, 'DISSOLVE'::character varying, 'MERGE'::character varying, 'RENAME'::character varying, 'NO_ACTION'::character varying, 'ESCALATE'::character varying, 'ADJUST_CONTROL'::character varying, 'promotion'::character varying, 'demotion'::character varying])::text[])))CONSTRAINT annealing_proposals_status_check CHECK (((status)::text = ANY ((ARRAY['pending'::character varying, 'approved'::character varying, 'rejected'::character varying, 'expired'::character varying, 'executing'::character varying, 'executed'::character varying, 'failed'::character varying])::text[])))
artifacts
Computed artifact metadata with Garage blob pointers (ADR-083)
| Column | Type | Constraints | Description |
|---|---|---|---|
id |
integer |
PK; NOT NULL; DEFAULT nextval('kg_api.artifacts_id_seq'::regclass) | |
artifact_type |
character varying(50) |
NOT NULL | Type of computation: polarity_analysis, projection, etc. |
representation |
character varying(50) |
NOT NULL | Source UI/tool: polarity_explorer, cli, mcp_server, etc. |
name |
character varying(200) |
||
owner_id |
integer |
FK → kg_auth.users(id) | |
graph_epoch |
bigint |
NOT NULL | graph_change_counter at creation for freshness validation |
created_at |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
expires_at |
timestamp with time zone |
||
parameters |
jsonb |
NOT NULL | |
metadata |
jsonb |
||
inline_result |
jsonb |
Small results (<10KB) stored inline | |
garage_key |
character varying(200) |
Pointer to Garage blob for large results | |
query_definition_id |
integer |
FK → kg_api.query_definitions(id) | |
ontology |
character varying(200) |
||
concept_ids |
text[] |
Concept IDs involved in this artifact |
Table constraints:
CONSTRAINT has_content CHECK (((inline_result IS NOT NULL) OR (garage_key IS NOT NULL)))CONSTRAINT valid_artifact_type CHECK (((artifact_type)::text = ANY ((ARRAY['polarity_analysis'::character varying, 'projection'::character varying, 'query_result'::character varying, 'graph_subgraph'::character varying, 'vocabulary_analysis'::character varying, 'epistemic_measurement'::character varying, 'consolidation_result'::character varying, 'search_result'::character varying, 'connection_path'::character varying, 'report'::character varying, 'stats_snapshot'::character varying])::text[])))CONSTRAINT valid_representation CHECK (((representation)::text = ANY ((ARRAY['polarity_explorer'::character varying, 'embedding_landscape'::character varying, 'block_builder'::character varying, 'edge_explorer'::character varying, 'vocabulary_chord'::character varying, 'force_graph_2d'::character varying, 'force_graph_3d'::character varying, 'report_workspace'::character varying, 'cli'::character varying, 'mcp_server'::character varying, 'api_direct'::character varying])::text[])))
catalog_edge
ADR-501: parent->child membership edges projecting canonical :SCOPED_BY (ontology<-document) and :HAS_SOURCE/:APPEARS (document<-concept). A concept may have many parent documents (DAG).
| Column | Type | Constraints | Description |
|---|---|---|---|
parent_kind |
character varying(16) |
PK; NOT NULL | |
parent_id |
text |
PK; NOT NULL | |
child_kind |
character varying(16) |
PK; NOT NULL | |
child_id |
text |
PK; NOT NULL | |
graph_epoch |
bigint |
NOT NULL |
catalog_node
ADR-501: materialized identity/metadata for catalog nodes (ontology/document/concept). Source of truth is the AGE graph; rebuilt on graph epoch advance.
| Column | Type | Constraints | Description |
|---|---|---|---|
kind |
character varying(16) |
PK; NOT NULL | |
node_id |
text |
PK; NOT NULL | |
name |
text |
NOT NULL | |
name_lower |
text |
NOT NULL | |
child_count |
integer |
NOT NULL; DEFAULT 0 | Number of direct children (documents-in-ontology, concepts-in-document); 0 for leaf concepts. |
content_type |
character varying(32) |
||
properties |
jsonb |
NOT NULL; DEFAULT '{}'::jsonb | |
graph_epoch |
bigint |
NOT NULL | graph_change_counter at build time; compared to kg_api.get_graph_epoch() for staleness. |
indexed_at |
timestamp with time zone |
NOT NULL; DEFAULT now() |
concept_access_stats
Node-level access patterns for caching - ADR-025
| Column | Type | Constraints | Description |
|---|---|---|---|
concept_id |
character varying(100) |
PK; NOT NULL | |
access_count |
integer |
DEFAULT 0 | |
last_accessed |
timestamp with time zone |
||
avg_query_time_ms |
numeric(10,2) |
||
queries_as_start |
integer |
DEFAULT 0 | |
queries_as_result |
integer |
DEFAULT 0 |
concept_version_metadata
| Column | Type | Constraints | Description |
|---|---|---|---|
concept_id |
character varying(100) |
PK; NOT NULL | |
created_in_version |
integer |
FK → kg_api.ontology_versions(version_id) | |
last_modified_version |
integer |
FK → kg_api.ontology_versions(version_id) |
edge_usage_stats
Performance tracking for graph traversals - ADR-025
| Column | Type | Constraints | Description |
|---|---|---|---|
from_concept_id |
character varying(100) |
PK; NOT NULL | |
to_concept_id |
character varying(100) |
PK; NOT NULL | |
relationship_type |
character varying(100) |
PK; NOT NULL | |
traversal_count |
integer |
DEFAULT 0 | |
last_traversed |
timestamp with time zone |
||
avg_query_time_ms |
numeric(10,2) |
embedding_config_legacy
Resource-aware embedding configuration for local and remote models - ADR-039. Includes preset for nomic-embed-text-v1.5.
| Column | Type | Constraints | Description |
|---|---|---|---|
id |
integer |
PK; NOT NULL; DEFAULT nextval('kg_api.embedding_config_id_seq'::regclass) | |
provider |
character varying(50) |
NOT NULL | Embedding provider: local (sentence-transformers) or openai |
model_name |
character varying(200) |
NOT NULL | Model identifier (HuggingFace ID for local, OpenAI model name for remote) |
embedding_dimensions |
integer |
NOT NULL | |
precision |
character varying(20) |
NOT NULL | |
max_memory_mb |
integer |
Maximum RAM allocation for local model (local provider only) | |
num_threads |
integer |
CPU threads for inference (local provider only) | |
device |
character varying(20) |
Compute device: cpu, cuda, or mps (local provider only) | |
batch_size |
integer |
DEFAULT 8 | Batch size for embedding generation |
max_seq_length |
integer |
||
normalize_embeddings |
boolean |
DEFAULT true | |
created_at |
timestamp with time zone |
DEFAULT CURRENT_TIMESTAMP | |
updated_at |
timestamp with time zone |
DEFAULT CURRENT_TIMESTAMP | |
updated_by |
character varying(100) |
||
active |
boolean |
DEFAULT true | Only one config can be active at a time (enforced by unique constraint) |
delete_protected |
boolean |
DEFAULT false | Prevents deletion without first removing protection (default configs) |
change_protected |
boolean |
DEFAULT false | Prevents changing provider/dimensions without explicit unlock (safety) |
Table constraints:
CONSTRAINT embedding_config_device_check CHECK (((device)::text = ANY ((ARRAY['cpu'::character varying, 'cuda'::character varying, 'mps'::character varying])::text[])))CONSTRAINT embedding_config_precision_check CHECK ((("precision")::text = ANY ((ARRAY['float16'::character varying, 'float32'::character varying])::text[])))CONSTRAINT embedding_config_provider_check CHECK (((provider)::text = ANY ((ARRAY['local'::character varying, 'openai'::character varying])::text[])))
embedding_generation_jobs
ADR-045: Tracks embedding generation jobs for audit trail and progress monitoring
| Column | Type | Constraints | Description |
|---|---|---|---|
job_id |
uuid |
PK; NOT NULL; DEFAULT gen_random_uuid() | |
job_type |
character varying(50) |
NOT NULL | |
target_types |
character varying(100)[] |
||
target_count |
integer |
||
status |
character varying(20) |
NOT NULL; DEFAULT 'pending'::character varying | |
processed_count |
integer |
DEFAULT 0 | |
failed_count |
integer |
DEFAULT 0 | |
embedding_model |
character varying(100) |
||
embedding_provider |
character varying(50) |
||
created_at |
timestamp with time zone |
DEFAULT CURRENT_TIMESTAMP | |
started_at |
timestamp with time zone |
||
completed_at |
timestamp with time zone |
||
duration_ms |
integer |
||
result_summary |
jsonb |
||
error_message |
text |
Table constraints:
CONSTRAINT embedding_generation_jobs_job_type_check CHECK (((job_type)::text = ANY ((ARRAY['cold_start'::character varying, 'vocabulary_update'::character varying, 'model_migration'::character varying, 'batch_regeneration'::character varying])::text[])))CONSTRAINT embedding_generation_jobs_status_check CHECK (((status)::text = ANY ((ARRAY['pending'::character varying, 'running'::character varying, 'completed'::character varying, 'failed'::character varying, 'cancelled'::character varying])::text[])))
embedding_profile
Unified embedding profile with text + image model slots. Replaces embedding_config.
| Column | Type | Constraints | Description |
|---|---|---|---|
id |
integer |
PK; NOT NULL; DEFAULT nextval('kg_api.embedding_profile_id_seq'::regclass) | |
name |
character varying(200) |
NOT NULL | |
vector_space |
character varying(100) |
NOT NULL | Compatibility key for the universal TEXT/prose space (concepts, edges, docs, image-prose). Profiles with the same text vector_space produce comparable text embeddings. Image embeddings are independent — see image_vector_space (ADR-803). |
multimodal |
boolean |
DEFAULT false | When true, the text model also handles image embeddings (e.g. SigLIP 2) |
text_provider |
character varying(50) |
NOT NULL | |
text_model_name |
character varying(200) |
NOT NULL | |
text_loader |
character varying(50) |
NOT NULL | How to load text model: sentence-transformers, transformers (AutoModel), or api |
text_revision |
character varying(200) |
||
text_dimensions |
integer |
NOT NULL | |
text_precision |
character varying(20) |
DEFAULT 'float16'::character varying | |
text_trust_remote_code |
boolean |
DEFAULT false | |
image_provider |
character varying(50) |
||
image_model_name |
character varying(200) |
||
image_loader |
character varying(50) |
How to load image model: sentence-transformers, transformers (AutoModel), or api | |
image_revision |
character varying(200) |
||
image_dimensions |
integer |
||
image_precision |
character varying(20) |
DEFAULT 'float16'::character varying | |
image_trust_remote_code |
boolean |
DEFAULT false | |
device |
character varying(20) |
DEFAULT 'cpu'::character varying | |
max_memory_mb |
integer |
||
num_threads |
integer |
||
batch_size |
integer |
DEFAULT 8 | |
max_seq_length |
integer |
||
normalize_embeddings |
boolean |
DEFAULT true | |
active |
boolean |
DEFAULT false | |
delete_protected |
boolean |
DEFAULT false | |
change_protected |
boolean |
DEFAULT false | |
created_at |
timestamp with time zone |
DEFAULT CURRENT_TIMESTAMP | |
updated_at |
timestamp with time zone |
DEFAULT CURRENT_TIMESTAMP | |
updated_by |
character varying(100) |
||
text_query_prefix |
character varying(200) |
Prefix prepended for search queries (e.g. search_query: ) | |
text_document_prefix |
character varying(200) |
Prefix prepended for stored documents (e.g. search_document: ) | |
image_vector_space |
character varying(100) |
Independent vector_space of the image (modality) embedding index (ADR-803). NULL for text-only / multimodal profiles. Never compared to text vector_space. |
Table constraints:
CONSTRAINT chk_image_loader CHECK ((((image_loader)::text = ANY ((ARRAY['sentence-transformers'::character varying, 'transformers'::character varying, 'api'::character varying])::text[])) OR (image_loader IS NULL)))CONSTRAINT chk_multimodal_no_image CHECK (((NOT multimodal) OR ((image_provider IS NULL) AND (image_model_name IS NULL) AND (image_loader IS NULL) AND (image_dimensions IS NULL))))CONSTRAINT chk_text_loader CHECK (((text_loader)::text = ANY ((ARRAY['sentence-transformers'::character varying, 'transformers'::character varying, 'api'::character varying])::text[])))
graph_epoch_kinds
ADR-203: Discriminator for graph_epochs.kind. semantic_wallclock distinguishes events whose occurred_at is semantically primary (ingestion, edit) from those where it is forensic-only (reasoning, annealing).
| Column | Type | Constraints | Description |
|---|---|---|---|
kind |
text |
PK; NOT NULL | |
semantic_wallclock |
boolean |
NOT NULL | When TRUE, occurred_at is the meaningful timestamp for downstream consumers. When FALSE, occurred_at is recorded for audit/forensics but should not drive time-based queries on the resulting graph state. |
description |
text |
graph_epochs
ADR-203: Monotonic event log of graph mutations. Distinct from graph_change_counter (ADR-079) which is a composite cache-invalidation checksum.
| Column | Type | Constraints | Description |
|---|---|---|---|
event_id |
bigint |
PK; NOT NULL; DEFAULT nextval('kg_api.graph_epochs_event_id_seq'::regclass) | Monotonic logical-time id. Foreign-keyed by Instance.created_at_event_id. |
occurred_at |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
kind |
text |
NOT NULL; FK → kg_api.graph_epoch_kinds(kind) | ingestion | reasoning | annealing | edit. Determines whether occurred_at is semantically meaningful for the rows attributable to this event. |
actor |
text |
||
counter_after |
bigint |
||
metadata |
jsonb |
NOT NULL; DEFAULT '{}'::jsonb | |
status |
text |
NOT NULL; DEFAULT 'in_progress'::text | ADR-207/#384: in_progress (set at record_graph_epoch) | completed | failed. Only in_progress blocks the committed watermark — both completed and failed count toward it (per-chunk commits mean a failed job may have mutated the graph). The completed/failed split is for analytics (drop zero-instance jobs from hot/stale signals), not for freshness. |
Table constraints:
CONSTRAINT graph_epochs_status_check CHECK ((status = ANY (ARRAY['in_progress'::text, 'completed'::text, 'failed'::text])))
jobs
Unified job queue for all background tasks (ingestion, backup, vocab, scheduled)
| Column | Type | Constraints | Description |
|---|---|---|---|
job_id |
text |
PK; NOT NULL | Unique job identifier (UUID) |
job_type |
text |
NOT NULL | Type of job: ingestion, restore, backup, vocab_refresh, vocab_consolidate |
content_hash |
text |
SHA256 hash for deduplication (used with ontology to detect duplicates) | |
ontology |
text |
Target ontology for the job | |
status |
text |
NOT NULL | Job status: pending_approval, approved, running, completed, failed, cancelled |
progress |
text |
Progress message for UI display | |
result |
text |
Final result data (JSON) | |
error |
text |
Error message if failed | |
created_at |
timestamp without time zone |
NOT NULL; DEFAULT now() | |
started_at |
timestamp without time zone |
||
completed_at |
timestamp without time zone |
||
job_data |
jsonb |
NOT NULL | Job-specific parameters (JSON) |
analysis |
text |
Pre-approval analysis (cost/time estimates) | |
approved_at |
timestamp without time zone |
When job was approved by user | |
approved_by |
text |
Who approved the job | |
expires_at |
timestamp without time zone |
When pending approval expires | |
processing_mode |
text |
DEFAULT 'serial'::text | Execution mode: serial or parallel |
job_source |
character varying(50) |
DEFAULT 'user_cli'::character varying | Source of job creation: user_cli, user_api, scheduled_task, system |
created_by |
character varying(100) |
DEFAULT 'unknown'::character varying | User or system identifier that created the job |
is_system_job |
boolean |
DEFAULT false | True for system-scheduled jobs (cannot be deleted by users) |
user_id |
integer |
NOT NULL; FK → kg_auth.users(id) | User who submitted the job (FK to kg_auth.users.id) |
source_filename |
text |
Display name for source: filename, "stdin", or MCP session ID (best-effort metadata) | |
source_type |
text |
Ingestion method: file (CLI file), stdin (pipe), mcp (Claude), api (direct) - enables source-aware queries | |
source_path |
text |
Full filesystem path for file ingestion (null for stdin/mcp/api) - helps identify exact source file | |
source_hostname |
text |
Hostname where ingestion initiated (CLI only, null for MCP/API) - useful for distributed deployments | |
artifact_id |
integer |
FK → kg_api.artifacts(id) | Artifact created by this job (ADR-083). NULL for jobs that do not produce artifacts. |
priority |
integer |
NOT NULL; DEFAULT 0 | |
claimed_by |
text |
||
claimed_at |
timestamp with time zone |
||
cancelled |
boolean |
NOT NULL; DEFAULT false | |
retries |
integer |
NOT NULL; DEFAULT 0 | |
max_retries |
integer |
NOT NULL; DEFAULT 3 |
Table constraints:
CONSTRAINT chk_source_type CHECK (((source_type IS NULL) OR (source_type = ANY (ARRAY['file'::text, 'stdin'::text, 'mcp'::text, 'api'::text]))))
ontology_tombstones
Positive operator-intent signal that an ontology was deliberately removed and must not be silently recreated by a subsequent ingest (#402 Defect B2). Operator-initiated delete writes a row; annealing dissolution does not.
| Column | Type | Constraints | Description |
|---|---|---|---|
name |
character varying(200) |
PK; NOT NULL | |
removed_at |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
removed_by |
character varying(100) |
||
reason |
text |
ontology_versions
Formal ontology versioning with immutable snapshots - ADR-026
| Column | Type | Constraints | Description |
|---|---|---|---|
version_id |
integer |
PK; NOT NULL; DEFAULT nextval('kg_api.ontology_versions_version_id_seq'::regclass) | |
version_number |
character varying(20) |
NOT NULL; UNIQUE | |
created_at |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
created_by |
character varying(100) |
||
change_summary |
text |
||
is_active |
boolean |
DEFAULT true | |
vocabulary_snapshot |
jsonb |
NOT NULL | |
types_added |
text[] |
||
types_aliased |
jsonb |
||
types_deprecated |
text[] |
||
backward_compatible |
boolean |
DEFAULT true | |
migration_required |
boolean |
DEFAULT false |
platform_config
Platform lifecycle configuration for operator control plane (ADR-061)
| Column | Type | Constraints | Description |
|---|---|---|---|
key |
character varying(100) |
PK; NOT NULL | |
value |
text |
NOT NULL | |
description |
text |
||
updated_at |
timestamp with time zone |
DEFAULT now() | |
updated_by |
character varying(100) |
DEFAULT 'system'::character varying |
provider_model_catalog
Cached model catalog per AI provider with curation and pricing (ADR-800)
| Column | Type | Constraints | Description |
|---|---|---|---|
id |
integer |
PK; NOT NULL; DEFAULT nextval('kg_api.provider_model_catalog_id_seq'::regclass) | |
provider |
character varying(50) |
NOT NULL | |
model_id |
character varying(300) |
NOT NULL | |
display_name |
character varying(300) |
||
category |
character varying(50) |
NOT NULL | |
context_length |
integer |
||
max_completion_tokens |
integer |
||
supports_vision |
boolean |
DEFAULT false | |
supports_json_mode |
boolean |
DEFAULT false | |
supports_tool_use |
boolean |
DEFAULT false | |
supports_streaming |
boolean |
DEFAULT true | |
price_prompt_per_m |
numeric |
||
price_completion_per_m |
numeric |
||
price_cache_read_per_m |
numeric |
||
enabled |
boolean |
DEFAULT false | |
is_default |
boolean |
DEFAULT false | |
sort_order |
integer |
DEFAULT 0 | |
upstream_provider |
character varying(100) |
||
raw_metadata |
jsonb |
||
fetched_at |
timestamp with time zone |
||
created_at |
timestamp with time zone |
DEFAULT now() | |
updated_at |
timestamp with time zone |
DEFAULT now() |
Table constraints:
UNIQUE (provider, model_id, category)
pruning_recommendations
Pending vocabulary management actions - ADR-032
| Column | Type | Constraints | Description |
|---|---|---|---|
id |
integer |
PK; NOT NULL; DEFAULT nextval('kg_api.pruning_recommendations_id_seq'::regclass) | |
relationship_type |
character varying(100) |
NOT NULL | |
target_type |
character varying(100) |
||
action_type |
character varying(50) |
NOT NULL | |
review_level |
character varying(20) |
NOT NULL | |
reasoning |
text |
NOT NULL | |
similarity |
numeric(4,3) |
||
value_score |
numeric(10,2) |
||
metadata |
jsonb |
||
status |
character varying(50) |
NOT NULL; DEFAULT 'pending'::character varying | |
created_at |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
reviewed_at |
timestamp with time zone |
||
reviewed_by |
character varying(100) |
||
reviewer_notes |
text |
||
executed_at |
timestamp with time zone |
||
expires_at |
timestamp with time zone |
Table constraints:
CONSTRAINT pruning_recommendations_action_type_check CHECK (((action_type)::text = ANY ((ARRAY['merge'::character varying, 'prune'::character varying, 'deprecate'::character varying, 'skip'::character varying])::text[])))CONSTRAINT pruning_recommendations_review_level_check CHECK (((review_level)::text = ANY ((ARRAY['none'::character varying, 'ai'::character varying, 'human'::character varying])::text[])))CONSTRAINT pruning_recommendations_status_check CHECK (((status)::text = ANY ((ARRAY['pending'::character varying, 'approved'::character varying, 'rejected'::character varying, 'executed'::character varying, 'expired'::character varying])::text[])))
query_definitions
Saved query recipes that can be re-executed (ADR-083)
| Column | Type | Constraints | Description |
|---|---|---|---|
id |
integer |
PK; NOT NULL; DEFAULT nextval('kg_api.query_definitions_id_seq'::regclass) | |
name |
character varying(200) |
NOT NULL | |
definition_type |
character varying(50) |
NOT NULL | Type of query: block_diagram, cypher, search, polarity, connection, exploration, program |
definition |
jsonb |
NOT NULL | Query parameters/structure as JSON |
owner_id |
integer |
FK → kg_auth.users(id) | |
created_at |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
updated_at |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
metadata |
jsonb |
DEFAULT '{}'::jsonb | Optional metadata (nodeCount, edgeCount, description, etc.) |
Table constraints:
CONSTRAINT valid_definition_type CHECK (((definition_type)::text = ANY ((ARRAY['block_diagram'::character varying, 'cypher'::character varying, 'search'::character varying, 'polarity'::character varying, 'connection'::character varying, 'exploration'::character varying, 'program'::character varying])::text[])))
rate_limits
| Column | Type | Constraints | Description |
|---|---|---|---|
client_id |
character varying(100) |
PK; NOT NULL | |
endpoint |
character varying(200) |
PK; NOT NULL | |
window_start |
timestamp with time zone |
PK; NOT NULL | |
request_count |
integer |
NOT NULL; DEFAULT 0 |
relationship_vocabulary
Canonical relationship types with embeddings - ADR-025, ADR-032
| Column | Type | Constraints | Description |
|---|---|---|---|
relationship_type |
character varying(100) |
PK; NOT NULL | |
description |
text |
||
category |
character varying(50) |
||
added_by |
character varying(100) |
||
added_at |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
usage_count |
integer |
DEFAULT 0 | |
is_active |
boolean |
DEFAULT true | |
is_builtin |
boolean |
DEFAULT false | |
synonyms |
character varying(100)[] |
||
deprecation_reason |
text |
||
embedding |
jsonb |
Cached embedding vector (JSONB array) for synonym detection (ADR-032) | |
embedding_model |
character varying(100) |
||
embedding_generated_at |
timestamp with time zone |
||
grounding_contribution |
double precision |
ADR-046: Measures impact on concept grounding strength (0.0-1.0). Higher values indicate this edge type significantly affects truth convergence. | |
last_grounding_calculated |
timestamp with time zone |
ADR-046: Timestamp when grounding metrics were last recalculated. Enables staleness detection. | |
avg_confidence |
double precision |
ADR-046: Average confidence score across all edges of this type. Helps identify low-quality edge types. | |
semantic_diversity |
double precision |
ADR-046: Semantic diversity score (0.0-1.0). High diversity may indicate overly broad type; low diversity may indicate well-defined type. | |
embedding_quality_score |
double precision |
ADR-045: Quality score for embedding (based on validation checks like magnitude, dimensionality) | |
embedding_validation_status |
character varying(20) |
ADR-045: Validation status - stale indicates model changed since generation | |
category_source |
character varying(20) |
DEFAULT 'builtin'::character varying | Source of category assignment: builtin (hand-assigned) or computed (ADR-047) |
category_confidence |
double precision |
Confidence score (0.0-1.0) for computed categories based on max similarity to seed types | |
category_scores |
jsonb |
Full category similarity breakdown as JSON: | |
category_ambiguous |
boolean |
DEFAULT false | True if runner-up category score > 0.70 (potential multi-category type) |
direction_semantics |
character varying(20) |
DEFAULT NULL::character varying | LLM-determined direction: outward (from→to), inward (from←to), bidirectional (symmetric). NULL = not yet determined by LLM. |
Table constraints:
CONSTRAINT relationship_vocabulary_avg_confidence_check CHECK (((avg_confidence >= (0.0)::double precision) AND (avg_confidence <= (1.0)::double precision)))CONSTRAINT relationship_vocabulary_embedding_quality_score_check CHECK (((embedding_quality_score >= (0.0)::double precision) AND (embedding_quality_score <= (1.0)::double precision)))CONSTRAINT relationship_vocabulary_embedding_validation_status_check CHECK (((embedding_validation_status)::text = ANY ((ARRAY['pending'::character varying, 'valid'::character varying, 'invalid'::character varying, 'stale'::character varying])::text[])))CONSTRAINT relationship_vocabulary_grounding_contribution_check CHECK (((grounding_contribution >= (0.0)::double precision) AND (grounding_contribution <= (1.0)::double precision)))CONSTRAINT relationship_vocabulary_semantic_diversity_check CHECK (((semantic_diversity >= (0.0)::double precision) AND (semantic_diversity <= (1.0)::double precision)))
scheduled_jobs
Scheduled background jobs: - category_refresh: Re-integrate LLM-generated vocabulary categories (every 6 hours) - vocab_consolidation: Auto-consolidate vocabulary based on hysteresis thresholds (every 12 hours)
| Column | Type | Constraints | Description |
|---|---|---|---|
id |
integer |
PK; NOT NULL; DEFAULT nextval('kg_api.scheduled_jobs_id_seq'::regclass) | |
name |
character varying(100) |
NOT NULL; UNIQUE | Unique identifier for the scheduled job |
launcher_class |
character varying(255) |
NOT NULL | Python class name in launcher registry (e.g., CategoryRefreshLauncher) |
schedule_cron |
character varying(100) |
NOT NULL | Cron expression for schedule (e.g., "0 /6 * * " = every 6 hours) |
enabled |
boolean |
DEFAULT true | Whether this schedule is active (can be disabled on failure) |
max_retries |
integer |
DEFAULT 5 | Max consecutive failures before auto-disabling schedule |
retry_count |
integer |
DEFAULT 0 | Current consecutive failure count (reset on success or skip) |
last_run |
timestamp without time zone |
Last time the schedule was checked (success, skip, or failure) | |
last_success |
timestamp without time zone |
Last time a job was successfully enqueued | |
last_failure |
timestamp without time zone |
Last time the launcher failed with an exception | |
next_run |
timestamp without time zone |
Calculated next run time (from cron expression or backoff) | |
created_at |
timestamp without time zone |
DEFAULT now() | |
updated_at |
timestamp without time zone |
DEFAULT now() |
schema_migrations
Tracks applied database migrations for backup/restore compatibility. Schema version is included in backups to ensure restore compatibility when database schema evolves. See ADR-015 for details.
| Column | Type | Constraints | Description |
|---|---|---|---|
version |
integer |
PK; NOT NULL | Migration number matching schema/migrations/NNN_*.sql files |
description |
text |
NOT NULL | Human-readable description of what this migration does |
applied_at |
timestamp without time zone |
NOT NULL; DEFAULT now() | When this migration was applied to the database |
sessions
| Column | Type | Constraints | Description |
|---|---|---|---|
session_id |
character varying(100) |
PK; NOT NULL | |
user_id |
integer |
||
created_at |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
expires_at |
timestamp with time zone |
NOT NULL | |
last_activity |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
metadata |
jsonb |
skipped_relationships
Capture layer for unmatched relationship types - ADR-025
| Column | Type | Constraints | Description |
|---|---|---|---|
id |
integer |
PK; NOT NULL; DEFAULT nextval('kg_api.skipped_relationships_id_seq'::regclass) | |
relationship_type |
character varying(100) |
NOT NULL | |
from_concept_label |
character varying(500) |
||
to_concept_label |
character varying(500) |
||
job_id |
character varying(50) |
||
ontology |
character varying(200) |
||
first_seen |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
last_seen |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
occurrence_count |
integer |
DEFAULT 1 | |
sample_context |
jsonb |
Table constraints:
UNIQUE (relationship_type, from_concept_label, to_concept_label)
source_embeddings
ADR-068: Embeddings for source text chunks with offset tracking and hash verification
| Column | Type | Constraints | Description |
|---|---|---|---|
embedding_id |
integer |
PK; NOT NULL; DEFAULT nextval('kg_api.source_embeddings_embedding_id_seq'::regclass) | |
source_id |
text |
NOT NULL | Reference to Source node in Apache AGE graph |
chunk_index |
integer |
NOT NULL | 0-based chunk number within source (e.g., 0, 1, 2...) |
chunk_strategy |
text |
NOT NULL | Chunking strategy used: sentence, paragraph, semantic, or count |
start_offset |
integer |
NOT NULL | Character offset in Source.full_text where chunk starts (0-based) |
end_offset |
integer |
NOT NULL | Character offset in Source.full_text where chunk ends (exclusive) |
chunk_text |
text |
NOT NULL | Actual chunk content stored for verification (should match Source.full_text[start_offset:end_offset]) |
chunk_hash |
text |
NOT NULL | SHA256 hash of chunk_text - verifies chunk integrity |
source_hash |
text |
NOT NULL | SHA256 hash of Source.full_text - detects when source text changes (stale embedding indicator) |
embedding |
bytea |
NOT NULL | Vector embedding bytes (float16 or float32 array, packed as bytea) |
embedding_model |
text |
NOT NULL | Embedding model name (e.g., "nomic-ai/nomic-embed-text-v1.5", "text-embedding-3-small") |
embedding_dimension |
integer |
NOT NULL | Embedding vector dimension (must match system embedding_config for cosine similarity) |
embedding_provider |
text |
||
created_at |
timestamp with time zone |
DEFAULT CURRENT_TIMESTAMP | |
updated_at |
timestamp with time zone |
DEFAULT CURRENT_TIMESTAMP |
Table constraints:
CONSTRAINT source_embeddings_check CHECK ((end_offset > start_offset))CONSTRAINT source_embeddings_chunk_index_check CHECK ((chunk_index >= 0))CONSTRAINT source_embeddings_chunk_strategy_check CHECK ((chunk_strategy = ANY (ARRAY['sentence'::text, 'paragraph'::text, 'semantic'::text, 'count'::text])))CONSTRAINT source_embeddings_embedding_dimension_check CHECK ((embedding_dimension > 0))CONSTRAINT source_embeddings_start_offset_check CHECK ((start_offset >= 0))UNIQUE (source_id, chunk_index, chunk_strategy)
synonym_clusters
ADR-046: Tracks groups of synonymous edge types discovered through embedding-based semantic similarity (threshold > 0.85)
| Column | Type | Constraints | Description |
|---|---|---|---|
cluster_id |
uuid |
PK; NOT NULL; DEFAULT gen_random_uuid() | |
representative_type |
character varying(100) |
FK → kg_api.relationship_vocabulary(relationship_type) | The canonical type to use when merging cluster members. Usually has highest usage_count or is builtin. |
member_types |
character varying(100)[] |
||
avg_similarity |
double precision |
Average cosine similarity between all pairs of member embeddings. Higher values indicate stronger synonym relationship. | |
cluster_size |
integer |
||
total_usage_count |
integer |
||
detected_at |
timestamp with time zone |
DEFAULT CURRENT_TIMESTAMP | |
detection_method |
character varying(50) |
DEFAULT 'embedding_similarity'::character varying | |
is_active |
boolean |
DEFAULT true | |
merge_recommended |
boolean |
DEFAULT false | |
merge_completed_at |
timestamp with time zone |
Table constraints:
CONSTRAINT synonym_clusters_avg_similarity_check CHECK (((avg_similarity >= (0.0)::double precision) AND (avg_similarity <= (1.0)::double precision)))
system_api_keys
Encrypted system API keys for LLM providers (ADR-031, ADR-041)
| Column | Type | Constraints | Description |
|---|---|---|---|
provider |
character varying(50) |
PK; NOT NULL | Provider name: openai, anthropic |
encrypted_key |
bytea |
NOT NULL | Fernet-encrypted API key (AES-128-CBC + HMAC-SHA256) |
updated_at |
timestamp with time zone |
DEFAULT now() | Last time key was updated |
validation_status |
character varying(20) |
DEFAULT 'untested'::character varying | API key validation state: valid, invalid, or untested |
last_validated_at |
timestamp with time zone |
Timestamp of last validation check (typically at API startup) | |
validation_error |
text |
Error message from last failed validation attempt |
Table constraints:
CONSTRAINT system_api_keys_validation_status_check CHECK (((validation_status)::text = ANY ((ARRAY['valid'::character varying, 'invalid'::character varying, 'untested'::character varying])::text[])))
system_initialization_status
ADR-045: Tracks completion of system initialization tasks like cold start embedding generation
| Column | Type | Constraints | Description |
|---|---|---|---|
component |
character varying(50) |
PK; NOT NULL | |
initialized |
boolean |
DEFAULT false | |
initialized_at |
timestamp with time zone |
||
initialization_job_id |
uuid |
FK → kg_api.embedding_generation_jobs(job_id) | |
version |
character varying(20) |
||
metadata |
jsonb |
||
last_processed_vocab_change_counter |
bigint |
NOT NULL; DEFAULT 0 CONSTRAINT system_initialization_statu_last_processed_vocab_chang_not_null | Snapshot of vocabulary_change_counter at the time this initialization component last completed embedding work. The cold-start and VocabEmbeddingLauncher paths compare current counter vs. this value to detect new work since the last successful run. Replaces the binary initialized flag for embedding components (the flag stays for non-counter-driven components). Default 0 means "no work has completed yet" — correct initial state. |
vocabulary_audit
| Column | Type | Constraints | Description |
|---|---|---|---|
id |
integer |
PK; NOT NULL; DEFAULT nextval('kg_api.vocabulary_audit_id_seq'::regclass) | |
relationship_type |
character varying(100) |
||
action |
character varying(50) |
NOT NULL | |
performed_by |
character varying(100) |
||
performed_at |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
details |
jsonb |
vocabulary_config
System configuration for automatic vocabulary management (ADR-032)
| Column | Type | Constraints | Description |
|---|---|---|---|
key |
character varying(100) |
PK; NOT NULL | |
value |
text |
NOT NULL | |
description |
text |
||
updated_at |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
updated_by |
character varying(100) |
vocabulary_history
Detailed vocabulary change tracking with context (ADR-032)
| Column | Type | Constraints | Description |
|---|---|---|---|
id |
integer |
PK; NOT NULL; DEFAULT nextval('kg_api.vocabulary_history_id_seq'::regclass) | |
relationship_type |
character varying(100) |
NOT NULL | |
action |
character varying(50) |
NOT NULL | |
performed_by |
character varying(100) |
NOT NULL | |
performed_at |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
target_type |
character varying(100) |
||
reason |
text |
||
metadata |
jsonb |
||
aggressiveness |
numeric(4,3) |
||
zone |
character varying(20) |
||
vocab_size_before |
integer |
||
vocab_size_after |
integer |
Table constraints:
CONSTRAINT vocabulary_history_action_check CHECK (((action)::text = ANY ((ARRAY['added'::character varying, 'merged'::character varying, 'pruned'::character varying, 'deprecated'::character varying, 'reactivated'::character varying])::text[])))
vocabulary_suggestions
LLM-assisted vocabulary curation suggestions - ADR-026
| Column | Type | Constraints | Description |
|---|---|---|---|
id |
integer |
PK; NOT NULL; DEFAULT nextval('kg_api.vocabulary_suggestions_id_seq'::regclass) | |
relationship_type |
character varying(100) |
NOT NULL | |
suggestion_type |
character varying(50) |
NOT NULL | |
confidence |
numeric(3,2) |
NOT NULL | |
suggested_canonical_type |
character varying(100) |
||
suggested_category |
character varying(50) |
||
suggested_description |
text |
||
similar_types |
jsonb |
||
reasoning |
text |
||
created_at |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
reviewed |
boolean |
DEFAULT false | |
curator_decision |
character varying(50) |
||
curator_notes |
text |
Table constraints:
CONSTRAINT vocabulary_suggestions_confidence_check CHECK (((confidence >= (0)::numeric) AND (confidence <= (1)::numeric)))CONSTRAINT vocabulary_suggestions_suggestion_type_check CHECK (((suggestion_type)::text = ANY ((ARRAY['alias'::character varying, 'new_type'::character varying])::text[])))
worker_lanes
Worker lane configuration for database-driven job dispatch (ADR-100)
| Column | Type | Constraints | Description |
|---|---|---|---|
name |
text |
PK; NOT NULL | |
job_types |
text[] |
NOT NULL | |
max_slots |
integer |
NOT NULL; DEFAULT 1 | |
poll_interval_ms |
integer |
NOT NULL; DEFAULT 5000 | |
stale_timeout_minutes |
integer |
NOT NULL; DEFAULT 30 | |
enabled |
boolean |
NOT NULL; DEFAULT true | |
updated_at |
timestamp with time zone |
NOT NULL; DEFAULT now() |
worker_status
| Column | Type | Constraints | Description |
|---|---|---|---|
worker_id |
character varying(100) |
PK; NOT NULL | |
last_heartbeat |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
current_job_id |
character varying(50) |
||
status |
character varying(50) |
NOT NULL | |
metadata |
jsonb |
Table constraints:
CONSTRAINT worker_status_status_check CHECK (((status)::text = ANY ((ARRAY['idle'::character varying, 'running'::character varying, 'error'::character varying, 'stopped'::character varying])::text[])))
kg_auth
Authentication and authorization (dynamic RBAC).
Relationships
erDiagram
users ||--o{ groups : "created_by"
oauth_clients ||--o{ oauth_access_tokens : "client_id"
users ||--o{ oauth_access_tokens : "user_id"
oauth_clients ||--o{ oauth_authorization_codes : "client_id"
users ||--o{ oauth_authorization_codes : "user_id"
users ||--o{ oauth_clients : "created_by"
oauth_clients ||--o{ oauth_device_codes : "client_id"
users ||--o{ oauth_device_codes : "user_id"
users ||--o{ oauth_external_provider_tokens : "user_id"
oauth_access_tokens ||--o{ oauth_refresh_tokens : "access_token_hash"
oauth_clients ||--o{ oauth_refresh_tokens : "client_id"
users ||--o{ oauth_refresh_tokens : "user_id"
users ||--o{ resource_grants : "granted_by"
resources ||--o{ resources : "parent_type"
resources ||--o{ role_permissions : "resource_type"
roles ||--o{ role_permissions : "inherited_from"
roles ||--o{ role_permissions : "role_name"
users ||--o{ role_permissions : "created_by"
roles ||--o{ roles : "parent_role"
users ||--o{ roles : "created_by"
groups ||--o{ user_groups : "group_id"
users ||--o{ user_groups : "added_by"
users ||--o{ user_groups : "user_id"
roles ||--o{ user_roles : "role_name"
users ||--o{ user_roles : "assigned_by"
users ||--o{ user_roles : "user_id"
roles ||--o{ users : "primary_role"
groups
Group definitions for collaborative access control (ADR-082)
| Column | Type | Constraints | Description |
|---|---|---|---|
id |
integer |
PK; NOT NULL; DEFAULT nextval('kg_auth.groups_id_seq'::regclass) | |
group_name |
character varying(100) |
NOT NULL; UNIQUE | |
display_name |
character varying(200) |
||
description |
text |
||
is_system |
boolean |
DEFAULT false | System groups (public, admins) cannot be deleted |
created_at |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
created_by |
integer |
FK → kg_auth.users(id) |
oauth_access_tokens
OAuth access tokens issued to clients
| Column | Type | Constraints | Description |
|---|---|---|---|
token_hash |
character varying(255) |
PK; NOT NULL | SHA256 hash of the actual token (tokens are not stored in plaintext) |
client_id |
character varying(255) |
NOT NULL; FK → kg_auth.oauth_clients(client_id) | |
user_id |
integer |
FK → kg_auth.users(id) | NULL for client_credentials grant (machine-to-machine), set for user-delegated grants |
scopes |
text[] |
||
expires_at |
timestamp with time zone |
NOT NULL | Access tokens expire in 1 hour |
revoked |
boolean |
DEFAULT false | |
created_at |
timestamp with time zone |
DEFAULT now() |
oauth_authorization_codes
Temporary authorization codes for OAuth Authorization Code flow (web apps)
| Column | Type | Constraints | Description |
|---|---|---|---|
code |
character varying(255) |
PK; NOT NULL | |
client_id |
character varying(255) |
NOT NULL; FK → kg_auth.oauth_clients(client_id) | |
user_id |
integer |
NOT NULL; FK → kg_auth.users(id) | |
redirect_uri |
text |
NOT NULL | |
scopes |
text[] |
||
code_challenge |
character varying(255) |
PKCE code challenge (hash of code verifier) | |
code_challenge_method |
character varying(10) |
||
expires_at |
timestamp with time zone |
NOT NULL | Authorization codes expire in 10 minutes |
used |
boolean |
DEFAULT false | |
created_at |
timestamp with time zone |
DEFAULT now() |
Table constraints:
CONSTRAINT oauth_authorization_codes_code_challenge_method_check CHECK (((code_challenge_method)::text = ANY ((ARRAY['S256'::character varying, 'plain'::character varying])::text[])))
oauth_clients
OAuth 2.0 client applications registered to use the API
| Column | Type | Constraints | Description |
|---|---|---|---|
client_id |
character varying(255) |
PK; NOT NULL | |
client_secret_hash |
character varying(255) |
||
client_name |
character varying(255) |
NOT NULL | |
client_type |
character varying(50) |
NOT NULL | public = no client secret (CLI, web apps), confidential = has client secret (MCP server) |
grant_types |
text[] |
NOT NULL | Allowed OAuth grant types: authorization_code, urn:ietf:params:oauth:grant-type:device_code, client_credentials, refresh_token |
redirect_uris |
text[] |
||
scopes |
text[] |
||
is_active |
boolean |
DEFAULT true | |
created_by |
integer |
FK → kg_auth.users(id) | |
created_at |
timestamp with time zone |
DEFAULT now() | |
metadata |
jsonb |
DEFAULT '{}'::jsonb |
Table constraints:
CONSTRAINT oauth_clients_client_type_check CHECK (((client_type)::text = ANY ((ARRAY['public'::character varying, 'confidential'::character varying])::text[])))
oauth_device_codes
Device authorization codes for OAuth Device Authorization Grant flow (CLI tools)
| Column | Type | Constraints | Description |
|---|---|---|---|
device_code |
character varying(255) |
PK; NOT NULL | Long code used by device for polling |
user_code |
character varying(50) |
NOT NULL; UNIQUE | Human-friendly code displayed to user (e.g., ABCD-1234) |
client_id |
character varying(255) |
NOT NULL; FK → kg_auth.oauth_clients(client_id) | |
user_id |
integer |
FK → kg_auth.users(id) | |
scopes |
text[] |
||
status |
character varying(50) |
DEFAULT 'pending'::character varying | |
expires_at |
timestamp with time zone |
NOT NULL | Device codes expire in 10 minutes |
created_at |
timestamp with time zone |
DEFAULT now() |
Table constraints:
CONSTRAINT oauth_device_codes_status_check CHECK (((status)::text = ANY ((ARRAY['pending'::character varying, 'authorized'::character varying, 'denied'::character varying, 'expired'::character varying])::text[])))
oauth_external_provider_tokens
OAuth tokens FROM external providers (Google, GitHub, etc.) - not tokens issued by our system
| Column | Type | Constraints | Description |
|---|---|---|---|
token_hash |
character varying(255) |
PK; NOT NULL | |
user_id |
integer |
FK → kg_auth.users(id) | |
provider |
character varying(50) |
||
scopes |
text[] |
||
expires_at |
timestamp with time zone |
NOT NULL |
oauth_refresh_tokens
OAuth refresh tokens for long-lived sessions
| Column | Type | Constraints | Description |
|---|---|---|---|
token_hash |
character varying(255) |
PK; NOT NULL | |
client_id |
character varying(255) |
NOT NULL; FK → kg_auth.oauth_clients(client_id) | |
user_id |
integer |
NOT NULL; FK → kg_auth.users(id) | |
scopes |
text[] |
||
access_token_hash |
character varying(255) |
FK → kg_auth.oauth_access_tokens(token_hash) | |
expires_at |
timestamp with time zone |
NOT NULL | Refresh tokens expire in 7 days (CLI) or 30 days (web) |
revoked |
boolean |
DEFAULT false | |
created_at |
timestamp with time zone |
DEFAULT now() | |
last_used |
timestamp with time zone |
Updated when refresh token is used to obtain new access token |
resource_grants
Instance-level access grants for owned resources (ADR-082)
| Column | Type | Constraints | Description |
|---|---|---|---|
id |
integer |
PK; NOT NULL; DEFAULT nextval('kg_auth.resource_grants_id_seq'::regclass) | |
resource_type |
character varying(50) |
NOT NULL | Type: ontology, artifact, report, etc. |
resource_id |
character varying(200) |
NOT NULL | Specific resource identifier |
principal_type |
character varying(20) |
NOT NULL | Grant to user or group |
principal_id |
integer |
NOT NULL | |
permission |
character varying(20) |
NOT NULL | read, write, or admin access |
granted_at |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
granted_by |
integer |
FK → kg_auth.users(id) |
Table constraints:
CONSTRAINT resource_grants_permission_check CHECK (((permission)::text = ANY ((ARRAY['read'::character varying, 'write'::character varying, 'admin'::character varying])::text[])))CONSTRAINT resource_grants_principal_type_check CHECK (((principal_type)::text = ANY ((ARRAY['user'::character varying, 'group'::character varying])::text[])))UNIQUE (resource_type, resource_id, principal_type, principal_id, permission)
resources
Dynamic resource type registry (ADR-028)
| Column | Type | Constraints | Description |
|---|---|---|---|
resource_type |
character varying(100) |
PK; NOT NULL | |
description |
text |
||
parent_type |
character varying(100) |
FK → kg_auth.resources(resource_type) | |
available_actions |
character varying(50)[] |
NOT NULL | |
supports_scoping |
boolean |
DEFAULT false | |
metadata |
jsonb |
DEFAULT '{}'::jsonb | |
registered_at |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
registered_by |
character varying(100) |
role_permissions
Dynamic role permissions with scoping (ADR-028)
| Column | Type | Constraints | Description |
|---|---|---|---|
id |
integer |
PK; NOT NULL; DEFAULT nextval('kg_auth.role_permissions_id_seq'::regclass) | |
role_name |
character varying(50) |
NOT NULL; FK → kg_auth.roles(role_name) | |
resource_type |
character varying(100) |
NOT NULL; FK → kg_auth.resources(resource_type) | |
action |
character varying(50) |
NOT NULL | |
scope_type |
character varying(50) |
||
scope_id |
character varying(200) |
||
scope_filter |
jsonb |
||
granted |
boolean |
NOT NULL; DEFAULT true | |
inherited_from |
character varying(50) |
FK → kg_auth.roles(role_name) | |
created_at |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
created_by |
integer |
FK → kg_auth.users(id) |
roles
Dynamic role definitions with inheritance (ADR-028)
| Column | Type | Constraints | Description |
|---|---|---|---|
role_name |
character varying(50) |
PK; NOT NULL | |
display_name |
character varying(100) |
NOT NULL | |
description |
text |
||
is_builtin |
boolean |
DEFAULT false | |
is_active |
boolean |
DEFAULT true | |
parent_role |
character varying(50) |
FK → kg_auth.roles(role_name) | |
created_at |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
created_by |
integer |
FK → kg_auth.users(id) | |
metadata |
jsonb |
DEFAULT '{}'::jsonb |
user_groups
Group membership assignments (ADR-082)
| Column | Type | Constraints | Description |
|---|---|---|---|
user_id |
integer |
PK; NOT NULL; FK → kg_auth.users(id) | |
group_id |
integer |
PK; NOT NULL; FK → kg_auth.groups(id) | |
added_at |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
added_by |
integer |
FK → kg_auth.users(id) |
user_roles
User role assignments with optional scoping (ADR-028)
| Column | Type | Constraints | Description |
|---|---|---|---|
id |
integer |
PK; NOT NULL; DEFAULT nextval('kg_auth.user_roles_id_seq'::regclass) | |
user_id |
integer |
NOT NULL; FK → kg_auth.users(id) | |
role_name |
character varying(50) |
NOT NULL; FK → kg_auth.roles(role_name) | |
scope_type |
character varying(50) |
||
scope_id |
character varying(200) |
||
assigned_at |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
assigned_by |
integer |
FK → kg_auth.users(id) | |
expires_at |
timestamp with time zone |
Table constraints:
UNIQUE (user_id, role_name, scope_type, scope_id)
users
| Column | Type | Constraints | Description |
|---|---|---|---|
id |
integer |
PK; NOT NULL; DEFAULT nextval('kg_auth.users_id_seq'::regclass) | |
username |
character varying(100) |
NOT NULL; UNIQUE | |
password_hash |
character varying(255) |
NOT NULL | |
primary_role |
character varying(50) |
NOT NULL; FK → kg_auth.roles(role_name) | Primary role (backwards compatibility) - user can have additional roles in user_roles table |
created_at |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
last_login |
timestamp with time zone |
||
disabled |
boolean |
DEFAULT false |
kg_logs
Observability: audit trails, metrics, health.
api_metrics
| Column | Type | Constraints | Description |
|---|---|---|---|
id |
integer |
PK; NOT NULL; DEFAULT nextval('kg_logs.api_metrics_id_seq'::regclass) | |
timestamp |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
endpoint |
character varying(200) |
NOT NULL | |
method |
character varying(10) |
NOT NULL | |
status_code |
integer |
NOT NULL | |
duration_ms |
numeric(10,2) |
NOT NULL | |
client_id |
character varying(100) |
||
error_message |
text |
audit_trail
| Column | Type | Constraints | Description |
|---|---|---|---|
id |
integer |
PK; NOT NULL; DEFAULT nextval('kg_logs.audit_trail_id_seq'::regclass) | |
timestamp |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
user_id |
integer |
||
action |
character varying(100) |
NOT NULL | |
resource_type |
character varying(50) |
NOT NULL | |
resource_id |
character varying(200) |
||
details |
jsonb |
||
ip_address |
inet |
||
user_agent |
text |
||
outcome |
character varying(50) |
NOT NULL |
Table constraints:
CONSTRAINT audit_trail_outcome_check CHECK (((outcome)::text = ANY ((ARRAY['success'::character varying, 'denied'::character varying, 'error'::character varying])::text[])))
health_checks
| Column | Type | Constraints | Description |
|---|---|---|---|
id |
integer |
PK; NOT NULL; DEFAULT nextval('kg_logs.health_checks_id_seq'::regclass) | |
timestamp |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
service |
character varying(50) |
NOT NULL | |
status |
character varying(50) |
NOT NULL | |
metrics |
jsonb |
Table constraints:
CONSTRAINT health_checks_status_check CHECK (((status)::text = ANY ((ARRAY['healthy'::character varying, 'degraded'::character varying, 'down'::character varying])::text[])))
job_events
| Column | Type | Constraints | Description |
|---|---|---|---|
id |
integer |
PK; NOT NULL; DEFAULT nextval('kg_logs.job_events_id_seq'::regclass) | |
job_id |
character varying(50) |
NOT NULL | |
timestamp |
timestamp with time zone |
NOT NULL; DEFAULT now() | |
event_type |
character varying(50) |
NOT NULL | |
details |
jsonb |
Migration history
Schema evolves through numbered migrations under schema/migrations/. Each is recorded in public.schema_migrations when applied. The baseline (00_baseline.sql) is a generated checkpoint consolidating migrations 001–080 (see schema/migrations/archived/); migrations after it are applied in order.
No post-checkpoint migrations yet — the baseline is the complete current schema.