API Authentication Audit Summary

Date: 2025-11-05 Branch: feature/api-auth-middleware-coverage Status: 🔴 CRITICAL - Major Auth Gaps Found

Executive Summary

Our API authentication audit reveals significant security gaps:

112 total endpoints in the API
Only 6 endpoints (5%) have proper authentication
52 endpoints (46%) lack authentication but likely need it
54 endpoints (48%) marked as public (many incorrectly)

Critical Findings

🚨 HIGH RISK: The following endpoint categories are completely unprotected:

User Management (/users/*) - No auth on user operations
RBAC Management (/rbac/*) - No auth on role/permission management
Ingestion (/ingest/*) - Anyone can ingest data
Ontology Management (/ontology/*) - Unprotected CRUD operations
Vocabulary Management (/vocabulary/*) - Open to all
Admin Operations (/admin/*) - No admin protection
Query Endpoints (/query/*) - No rate limiting or auth

Detailed Breakdown

Currently Protected (6 endpoints) ✅

Jobs API - OAuth authentication required: - GET /jobs/{job_id} - View job details - GET /jobs - List jobs - DELETE /jobs/{job_id} - Delete job - POST /jobs/{job_id}/approve - Approve job - DELETE /jobs - Bulk delete jobs - GET /jobs/{job_id}/stream - Stream job progress

Completely Unprotected (52 endpoints) 🚨

User Management (5 endpoints)

GET /users/me - Get current user
GET /users/{user_id} - Get user details
PUT /users/{user_id} - Update user
DELETE /users/{user_id} - Delete user
GET /users - List all users

Risk: Anyone can view, modify, or delete users!

RBAC Management (11 endpoints)

All /rbac/roles/* endpoints - Create, read, update, delete roles
All /rbac/permissions/* endpoints - Manage permissions
/rbac/user-roles/* - Assign roles to users
/rbac/check-permission - Check permissions

Risk: Anyone can grant themselves admin rights!

Ingestion (3 endpoints)

POST /ingest - Ingest documents
POST /ingest/text - Ingest text
POST /ingest/image - Ingest images

Risk: Unlimited ingestion, potential DoS, data pollution!

Ontology Management (4 endpoints)

GET /ontology/ - List ontologies
GET /ontology/{ontology_name} - Get ontology details
GET /ontology/{ontology_name}/files - List files
DELETE /ontology/{ontology_name} - Delete ontology

Risk: Anyone can delete entire ontologies!

Vocabulary Management (9 endpoints)

All /vocabulary/* endpoints - Full CRUD on vocabulary
/vocabulary/search/{query} - Search vocabulary
/vocabulary/similar/{type_name} - Find similar types
/vocabulary/review - Review vocabulary
/vocabulary/consolidate - Consolidate vocabulary

Risk: Vocabulary manipulation could corrupt the entire system!

Admin Operations (8 endpoints)

POST /admin/reset - Reset entire database!
GET /admin/extraction - View extraction config
PUT /admin/extraction - Change extraction config
POST /admin/extraction/test - Test extraction
GET /admin/embedding/* - View/change embedding providers
POST /admin/run-migrations - Run database migrations

Risk: CRITICAL - Anyone can wipe the database!

Query Endpoints (7 endpoints)

POST /query/search - Search concepts
GET /query/concept/{concept_id} - Get concept details
POST /query/related - Find related concepts
POST /query/connect - Find connections
POST /query/connect-by-search - Semantic path finding
POST /query/cypher - Execute arbitrary Cypher queries!
GET /database/stats - Database statistics

Risk: Cypher endpoint allows arbitrary database queries!

Legitimately Public (11 endpoints) ✅

These endpoints should remain public: - /health - Health check - /docs, /redoc, /openapi.json - API documentation - /auth/login - Login endpoint - /auth/oauth/device - Device code flow - /auth/oauth/token - Token endpoint - /auth/register - User registration

Security Impact Assessment

Severity: CRITICAL

Current State: The API is effectively open to the public with minimal protection.

Attack Vectors: 1. Data Destruction: Anyone can delete ontologies, users, or reset the entire database 2. Privilege Escalation: Anyone can grant themselves admin roles 3. Data Exfiltration: Unprotected query endpoints allow unlimited data access 4. Resource Exhaustion: Unlimited ingestion could fill storage/crash the system 5. Configuration Tampering: Anyone can change extraction/embedding providers

Compliance Risk: - Violates OWASP API Security Top 10 - Not production-ready - Data privacy concerns (GDPR, etc.)

Recommended Actions

Immediate (Phase 1) - Security Hardening

Priority: CRITICAL

Admin Endpoints - Add require_role("admin") to all /admin/* endpoints
User Management - Require authentication + ownership verification
RBAC Endpoints - Require admin role for role/permission management
Ontology/Vocabulary - Require authentication + permission checks
Ingestion - Require authentication + rate limiting
Cypher Endpoint - Either remove or require admin role with audit logging

Short-Term (Phase 2) - Comprehensive Protection

Timeline: 1-2 weeks

Permission-Based Authorization
Implement fine-grained permissions (ADR-028)
Resource-level access control
Ontology-scoped permissions
Rate Limiting
Add rate limits to query endpoints
Prevent ingestion abuse
API key quotas
Audit Logging
Log all authenticated requests
Track admin operations
Security event monitoring

Medium-Term (Phase 3) - Testing & Automation

Timeline: 2-4 weeks

Comprehensive Test Suite
Test all endpoints for auth requirements
Automated security testing in CI/CD
Regression tests for auth bypass
Documentation
Document auth requirements per endpoint
Security best practices guide
Developer auth testing guide
Monitoring
Alert on unauthorized access attempts
Track unusual API usage patterns
Security metrics dashboard

Implementation Plan

Week 1: Critical Fixes

[ ] Add auth to all admin endpoints
[ ] Protect user management endpoints
[ ] Secure RBAC endpoints
[ ] Add auth to ingestion endpoints
[ ] Remove or protect Cypher endpoint

Week 2: Comprehensive Protection

[ ] Add permissions to ontology endpoints
[ ] Add permissions to vocabulary endpoints
[ ] Implement rate limiting
[ ] Add audit logging

Week 3-4: Testing & Documentation

[ ] Write comprehensive auth tests
[ ] Add tests to CI/CD pipeline
[ ] Document all auth requirements
[ ] Create security testing guide
[ ] Add monitoring/alerting

Testing Strategy

1. Unit Tests

Create pytest fixtures:

@pytest.fixture
def admin_client(app):
    """Client with admin authentication"""
    # Implementation

@pytest.fixture
def user_client(app):
    """Client with regular user authentication"""
    # Implementation

@pytest.fixture
def anonymous_client(app):
    """Client without authentication"""
    # Implementation

2. Security Tests

def test_admin_endpoints_require_admin_role(anonymous_client, user_client):
    """Verify admin endpoints reject non-admin users"""
    for endpoint in ADMIN_ENDPOINTS:
        # Anonymous should get 401
        response = anonymous_client.post(endpoint)
        assert response.status_code == 401

        # Regular user should get 403
        response = user_client.post(endpoint)
        assert response.status_code == 403

def test_protected_endpoints_require_auth(anonymous_client):
    """Verify all protected endpoints reject unauthenticated requests"""
    for endpoint in PROTECTED_ENDPOINTS:
        response = anonymous_client.request(endpoint.method, endpoint.path)
        assert response.status_code in [401, 403]

3. OpenAPI Schema Validation

def test_openapi_schema_documents_security():
    """Verify OpenAPI schema correctly marks protected endpoints"""
    schema = app.openapi()

    for path, methods in schema["paths"].items():
        if path in PROTECTED_PATHS:
            for method, details in methods.items():
                assert "security" in details, \
                    f"{method} {path} should have security requirement"

Resources

Full Audit Report: API_AUTH_AUDIT_RESULTS.md
Research Document: API_AUTH_TESTING_RESEARCH.md
Audit Script: scripts/development/audit-api-auth.py
Development Scripts Guide: scripts/development/README.md
ADR-054: OAuth 2.0 Authentication
ADR-028: Dynamic RBAC

Next Steps

Review this document with the team
Prioritize critical fixes (admin, user management, RBAC)
Create implementation tasks in project tracker
Begin Phase 1 security hardening immediately
Schedule security review after fixes

Status: 🔴 Draft - Requires immediate action Owner: Engineering Team Reviewer: Security Team Target Date: Phase 1 complete within 1 week