Configuration Reference

This page covers all configuration files and environment variables for a Kappa Graph self-hosted deployment.


Configuration files

| File | Purpose |
|---|---|
| .env | Environment variables: secrets, database, AI provider, job scheduler |
| .operator.conf | Operator settings: container naming, compose overlays, router mode |

Both files live in the project root. .env is generated by ./operator.sh init, and its generated secrets must never be edited by hand; the operator generates them correctly. .operator.conf is also generated by init (guided or headless) and may be edited to change overlay settings.


.env — environment variables

Core secrets

Generated during ./operator.sh init. Do not edit.

| Variable | Purpose |
|---|---|
| ENCRYPTION_KEY | Fernet key that encrypts API keys stored in the database |
| OAUTH_SIGNING_KEY | Signs JWT access tokens |
| INTERNAL_KEY_SERVICE_SECRET | Service-to-service authentication token |

Database

Kappa Graph runs PostgreSQL 18 with Apache AGE 1.7.0.

| Variable | Default | Description |
|---|---|---|
| POSTGRES_HOST | localhost | Database host (postgres inside containers) |
| POSTGRES_PORT | 5432 | Database port |
| POSTGRES_DB | knowledge_graph | Database name |
| POSTGRES_USER | admin | Database user |
| POSTGRES_PASSWORD | (generated) | Database password |
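As a minimal sketch of how these variables fit together, the helper below assembles a libpq-style connection URI from them, falling back to the defaults in the table. The function name pg_uri is hypothetical and not part of the operator; the password is deliberately left out of the URI.

```shell
# Hypothetical helper: build a connection URI from the POSTGRES_* variables,
# using the documented defaults for anything that is unset or empty.
pg_uri() {
  printf 'postgresql://%s@%s:%s/%s\n' \
    "${POSTGRES_USER:-admin}" \
    "${POSTGRES_HOST:-localhost}" \
    "${POSTGRES_PORT:-5432}" \
    "${POSTGRES_DB:-knowledge_graph}"
}

# Print the URI for the current environment.
pg_uri
```

If psql is installed, something like PGPASSWORD="$POSTGRES_PASSWORD" psql "$(pg_uri)" should connect without putting the password on the command line.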

Web and OAuth

| Variable | Default | Description |
|---|---|---|
| WEB_HOSTNAME | localhost:3000 | Public hostname for web access |
| ACCESS_TOKEN_EXPIRE_MINUTES | 60 | Token validity period |

WEB_HOSTNAME is used to derive OAuth redirect URIs (https://{WEB_HOSTNAME}/callback) and the API URL referenced by the frontend.
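The derivation of the redirect URI can be sketched as follows. The helper name oauth_redirect_uri is hypothetical; the point is that WEB_HOSTNAME holds a bare host (optionally with a port), with no scheme and no trailing slash.

```shell
# Hypothetical helper: derive the OAuth redirect URI from WEB_HOSTNAME,
# falling back to the documented default when it is unset or empty.
oauth_redirect_uri() {
  printf 'https://%s/callback\n' "${WEB_HOSTNAME:-localhost:3000}"
}

WEB_HOSTNAME=kg.example.com
oauth_redirect_uri   # prints https://kg.example.com/callback
```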

AI provider

These variables apply only when DEVELOPMENT_MODE=true. In production, the API loads provider configuration from the database, which is set with configure.py inside the operator shell (./operator.sh shell).

| Variable | Default | Description |
|---|---|---|
| DEVELOPMENT_MODE | false | true: load from .env; false: load from database |
| AI_PROVIDER | openai | openai, anthropic, or mock |
| OPENAI_API_KEY | | OpenAI API key |
| ANTHROPIC_API_KEY | | Anthropic API key |
| OPENAI_EXTRACTION_MODEL | gpt-4o | Model for concept extraction |
| OPENAI_EMBEDDING_MODEL | text-embedding-3-small | Model for embeddings |
| ANTHROPIC_EXTRACTION_MODEL | claude-sonnet-4-20250514 | Anthropic extraction model |
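A quick sanity check for a development .env might look like the sketch below. The function check_ai_env is hypothetical, and it assumes that the mock provider needs no API key; it only confirms that the key matching AI_PROVIDER is present when DEVELOPMENT_MODE=true.

```shell
# Hypothetical check: when DEVELOPMENT_MODE=true, make sure the API key that
# matches AI_PROVIDER is set. Assumes the mock provider needs no key.
check_ai_env() {
  if [ "${DEVELOPMENT_MODE:-false}" != "true" ]; then
    echo "DEVELOPMENT_MODE is not true: provider settings come from the database"
    return 0
  fi
  case "${AI_PROVIDER:-openai}" in
    openai)
      [ -n "${OPENAI_API_KEY:-}" ] || { echo "OPENAI_API_KEY is not set"; return 1; } ;;
    anthropic)
      [ -n "${ANTHROPIC_API_KEY:-}" ] || { echo "ANTHROPIC_API_KEY is not set"; return 1; } ;;
    mock)
      ;;
    *)
      echo "unknown AI_PROVIDER: ${AI_PROVIDER}"; return 1 ;;
  esac
  echo "ok"
}

# Example: the mock provider passes without any key.
DEVELOPMENT_MODE=true
AI_PROVIDER=mock
check_ai_env
```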

Object storage (Garage)

| Variable | Default | Description |
|---|---|---|
| GARAGE_S3_ENDPOINT | http://garage:3900 | Garage S3 endpoint |
| GARAGE_REGION | garage | Region name |
| GARAGE_BUCKET | kg-storage | Default bucket |
| GARAGE_RPC_SECRET | (generated) | Cluster coordination secret |

Job scheduler

| Variable | Default | Description |
|---|---|---|
| JOB_CLEANUP_INTERVAL | 3600 | Cleanup interval (seconds) |
| JOB_APPROVAL_TIMEOUT | 24 | Cancel unapproved jobs after (hours) |
| JOB_COMPLETED_RETENTION | 48 | Delete completed jobs after (hours) |
| JOB_FAILED_RETENTION | 168 | Delete failed jobs after (hours) |
| MAX_CONCURRENT_JOBS | 4 | Maximum parallel ingestion jobs |
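Note the mixed units: the cleanup interval is in seconds, while the timeout and retention settings are in hours. The sketch below converts the hour-based defaults to seconds for comparison; hours_to_seconds is a hypothetical helper, not something the scheduler exposes.

```shell
# Hypothetical helper: convert an hour-based setting to seconds so it can be
# compared with JOB_CLEANUP_INTERVAL, which is already in seconds.
hours_to_seconds() {
  echo $(( $1 * 3600 ))
}

JOB_CLEANUP_INTERVAL=3600
JOB_COMPLETED_RETENTION=48
JOB_FAILED_RETENTION=168

echo "cleanup runs every ${JOB_CLEANUP_INTERVAL}s"
echo "completed jobs kept for $(hours_to_seconds "$JOB_COMPLETED_RETENTION")s"   # 172800s
echo "failed jobs kept for $(hours_to_seconds "$JOB_FAILED_RETENTION")s"         # 604800s
```

Assuming expired jobs are removed by the periodic cleanup pass, a job may linger for up to one cleanup interval beyond its retention period.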

AMD GPU (optional)

Set these only when using an AMD GPU with ROCm in host mode.

| Variable | Description |
|---|---|
| HSA_OVERRIDE_GFX_VERSION | Override GPU architecture (e.g., 10.3.0) |
| ROCR_VISIBLE_DEVICES | Limit visible GPUs (e.g., 0) |

.operator.conf — operator settings

Generated by ./operator.sh init. Variables in this file are sourced as shell variables when any operator.sh subcommand runs.

| Variable | Default | Description |
|---|---|---|
| DEV_MODE | false | true adds the dev overlay (hot reload, source mounts) |
| GPU_MODE | cpu | cpu, nvidia, amd-host, or mac |
| KG_API_IMAGE_TAG | (derived) | Image tag for the API container; derived from GPU_MODE |
| ROUTER_MODE | none | none or traefik |
| TLS_MODE | none | none, selfsigned, manual, or letsencrypt |
| EXTERNAL_URL | | Public base URL (e.g., https://kg.example.com) |
| CONTAINER_PREFIX | knowledge-graph | Container name prefix |
| CONTAINER_SUFFIX | | Container name suffix (e.g., -dev) |
| COMPOSE_FILE | docker-compose.yml | Base compose file |
| IMAGE_SOURCE | ghcr | local or ghcr |
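Because the file is sourced as shell, every line has to be a valid shell assignment: no spaces around =, and quotes around any value the shell would otherwise split or interpret. The sketch below illustrates this with a throwaway file; the values are examples only and the real operator.sh may load the file differently.

```shell
# Write a throwaway file in .operator.conf style (illustrative values).
conf="$(mktemp)"
cat > "$conf" <<'EOF'
GPU_MODE=nvidia
ROUTER_MODE=traefik
EXTERNAL_URL="https://kg.example.com"
CONTAINER_SUFFIX=-dev
EOF

# Source it the way a shell script would, then read the variables back.
. "$conf"
echo "GPU_MODE=$GPU_MODE ROUTER_MODE=$ROUTER_MODE"
echo "EXTERNAL_URL=$EXTERNAL_URL CONTAINER_SUFFIX=$CONTAINER_SUFFIX"

rm -f "$conf"
```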

Compose file selection

The operator builds the Docker Compose command by stacking overlays in a fixed order. The table below shows which overlays apply under which conditions.

| Overlay | Applies when |
|---|---|
| docker-compose.yml | Always (base) |
| docker-compose.ghcr.yml | IMAGE_SOURCE=ghcr |
| docker-compose.standalone.yml | Present in docker/ (curl-installer deployments) |
| docker-compose.ssl.yml | docker/docker-compose.ssl.yml is present |
| docker-compose.traefik.yml | ROUTER_MODE=traefik |
| docker-compose.traefik-tls.yml | ROUTER_MODE=traefik and TLS_MODE is selfsigned, manual, or letsencrypt |
| docker-compose.traefik-tls-manual.yml | ROUTER_MODE=traefik and TLS_MODE=manual |
| docker-compose.traefik-tls-letsencrypt.yml | ROUTER_MODE=traefik and TLS_MODE=letsencrypt |
| docker-compose.dev.yml | DEV_MODE=true |
| docker-compose.gpu-nvidia.yml | GPU_MODE=nvidia |
| docker-compose.gpu-amd-host.yml | GPU_MODE=amd or GPU_MODE=amd-host |
| docker-compose.override.mac.yml | GPU_MODE=mac |

Overlays are applied in the order shown. A later overlay's settings take precedence over earlier ones.
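The selection rules above can be sketched as a shell function. This is an approximation for illustration, not the operator's actual code: the name compose_overlays is hypothetical, the base file is assumed to come from COMPOSE_FILE, and the directory checked for the optional overlays is passed as the first argument (defaulting to docker).

```shell
# Hypothetical helper: print the overlay file names, one per line, in the
# order of the table above. $1 is the directory checked for optional files.
compose_overlays() {
  dir="${1:-docker}"
  echo "${COMPOSE_FILE:-docker-compose.yml}"
  if [ "${IMAGE_SOURCE:-ghcr}" = "ghcr" ]; then echo "docker-compose.ghcr.yml"; fi
  if [ -f "$dir/docker-compose.standalone.yml" ]; then echo "docker-compose.standalone.yml"; fi
  if [ -f "$dir/docker-compose.ssl.yml" ]; then echo "docker-compose.ssl.yml"; fi
  if [ "${ROUTER_MODE:-none}" = "traefik" ]; then
    echo "docker-compose.traefik.yml"
    # The shared TLS overlay applies to every TLS mode except none.
    case "${TLS_MODE:-none}" in
      selfsigned|manual|letsencrypt) echo "docker-compose.traefik-tls.yml" ;;
    esac
    # Mode-specific TLS overlays stack on top of the shared one.
    case "${TLS_MODE:-none}" in
      manual)      echo "docker-compose.traefik-tls-manual.yml" ;;
      letsencrypt) echo "docker-compose.traefik-tls-letsencrypt.yml" ;;
    esac
  fi
  if [ "${DEV_MODE:-false}" = "true" ]; then echo "docker-compose.dev.yml"; fi
  case "${GPU_MODE:-cpu}" in
    nvidia)       echo "docker-compose.gpu-nvidia.yml" ;;
    amd|amd-host) echo "docker-compose.gpu-amd-host.yml" ;;
    mac)          echo "docker-compose.override.mac.yml" ;;
  esac
  return 0
}

# Example: Traefik with Let's Encrypt on an NVIDIA host, images from ghcr.
IMAGE_SOURCE=ghcr ROUTER_MODE=traefik TLS_MODE=letsencrypt GPU_MODE=nvidia DEV_MODE=false
compose_overlays
```

Each printed name corresponds to one -f argument of docker compose, in the same order, which is why a later overlay overrides an earlier one.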


Runtime configuration

Runtime settings — AI provider, embedding profile, API keys, admin credentials — are stored encrypted in the database and managed through the operator shell.

./operator.sh shell

AI provider and embedding

# Set the extraction provider and model
configure.py ai-provider anthropic --model claude-sonnet-4

# Store an API key (encrypted in the database; takes effect without restart)
configure.py api-key anthropic --key "sk-ant-..."

# Select an embedding profile
configure.py embedding --provider local

# Show current configuration
configure.py status

API keys and extraction provider changes take effect immediately — the next request picks up the new configuration without an API restart.

Embedding profile changes take effect after a hot reload. Activate the target profile, then reload the model:

# Activate the profile (writes to database)
kg admin embedding activate <profile-id>

# Hot-reload the model (zero-downtime, no API restart)
kg admin embedding reload

Admin credentials

# Set or rotate the admin password (creates the user if absent)
configure.py admin --username admin --password <new-password>

# Prompt for password instead
configure.py admin --username admin

configure.py manages the admin user, AI providers, API keys, the model catalog (models), and OAuth clients (oauth). For broader user and role management, use the API at /users or the web admin UI.