# Platform Architecture

## System Overview

Enclava is an open-source AI platform that provides confidential inference across multiple TEE-backed providers.

```
┌─────────────────────────────────────────────────────┐
│                    Client Layer                     │
│             Web Dashboard | API Clients             │
└─────────────────────────────────────────────────────┘
                           │
┌─────────────────────────────────────────────────────┐
│                     API Gateway                     │
│   Public API (/api/v1) | Internal API (/api-int)    │
└─────────────────────────────────────────────────────┘
                           │
┌─────────────────────────────────────────────────────┐
│                  Backend Services                   │
│   FastAPI | Plugins | Modules | Background Tasks    │
└─────────────────────────────────────────────────────┘
            │                             │
┌───────────────────────┐     ┌───────────────────────┐
│      LLM Service      │     │      Data Layer       │
│   Provider Routing    │     │  PostgreSQL | Redis   │
│    Model Discovery    │     │ Qdrant | File Storage │
└───────────────────────┘     └───────────────────────┘
            │
┌─────────────────────────────────────────────────────┐
│               TEE Inference Providers               │
│          Private Mode (EU) | RedPill (US)           │
└─────────────────────────────────────────────────────┘
```
## Core Components

### Backend (Python/FastAPI)
- Framework: FastAPI with async/await support
- Architecture: Modular plugin system for extensibility
- Security: JWT authentication, API key management, threat detection
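
To make the pattern concrete, here is a minimal, hypothetical sketch of an async FastAPI endpoint guarded by an API-key dependency; the route, header name, and `verify_api_key` helper are illustrative only, not Enclava's actual code.

```python
from fastapi import Depends, FastAPI, Header, HTTPException

app = FastAPI()

async def verify_api_key(x_api_key: str = Header(...)) -> str:
    # Placeholder check; a real deployment would validate the key against the database.
    if not x_api_key:
        raise HTTPException(status_code=401, detail="Invalid API key")
    return x_api_key

@app.get("/api/v1/health")
async def health(api_key: str = Depends(verify_api_key)) -> dict:
    # Async handlers let the backend multiplex many concurrent requests.
    return {"status": "ok"}
```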
### Frontend (Next.js/React)
- Framework: Next.js 14 with TypeScript
- Features: SSR, real-time updates, responsive design
- State Management: React Context + SWR for data fetching
- UI Components: Tailwind CSS + shadcn/ui
### Data Layer
| Component | Purpose | Details |
|---|---|---|
| PostgreSQL | Primary database | Relational data, user management |
| Redis | Caching & sessions | Fast key-value storage |
| Qdrant | Vector database | RAG embeddings and semantic search |
| File Storage | Documents | Local filesystem with encryption |
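
The table above maps to ordinary client libraries on the backend side. A minimal connectivity sketch (connection URLs and service hostnames are placeholders, not Enclava's actual configuration):

```python
import asyncpg                          # PostgreSQL
import redis.asyncio as redis           # Redis
from qdrant_client import QdrantClient  # Qdrant

async def connect_data_layer():
    # Placeholder DSNs; real values come from environment configuration.
    pg = await asyncpg.connect("postgresql://enclava:secret@postgres:5432/enclava")
    cache = redis.from_url("redis://redis:6379/0")
    vectors = QdrantClient(url="http://qdrant:6333")

    # One round trip per store to confirm connectivity.
    await pg.fetchval("SELECT 1")
    await cache.ping()
    vectors.get_collections()
    return pg, cache, vectors
```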
## Inference Provider Routing
Enclava routes LLM requests through multiple TEE-protected inference providers.
### Multi-Provider Architecture
```
┌─────────────────────────────────────────┐
│               LLM Service               │
│   ┌─────────────────────────────────┐   │
│   │         Provider Router         │   │
│   │  - Model → Provider mapping     │   │
│   │  - Priority-based selection     │   │
│   │  - Chatbot preferences          │   │
│   └─────────────────────────────────┘   │
│                    │                    │
│         ┌──────────┴──────────┐         │
│         ▼                     ▼         │
│  ┌─────────────┐       ┌─────────────┐  │
│  │ PrivateMode │       │   RedPill   │  │
│  │ Priority: 1 │       │ Priority: 2 │  │
│  │ Region: EU  │       │ Region: US  │  │
│  └─────────────┘       └─────────────┘  │
└─────────────────────────────────────────┘
```
### Provider Selection
| Factor | Description |
|---|---|
| Model availability | Only providers supporting the requested model |
| Priority | Lower priority number = preferred provider |
| Chatbot preferences | Chatbots can specify preferred providers |
| Health status | Unhealthy providers are skipped |
| Attestation | RedPill requires valid attestation |
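
Put together, the routing decision can be sketched as follows; the `Provider` fields and `select_provider` helper are illustrative only, not the actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Provider:
    name: str
    priority: int           # lower number = preferred
    models: set[str]
    healthy: bool = True
    attested: bool = True   # e.g. RedPill must present a valid attestation

def select_provider(providers: list[Provider], model: str, preferred: str | None = None):
    """Pick a provider for `model`, mirroring the factors in the table above."""
    candidates = [p for p in providers if model in p.models and p.healthy and p.attested]
    # A chatbot-level preference wins when that provider is eligible.
    if preferred:
        for p in candidates:
            if p.name == preferred:
                return p
    # Otherwise fall back to the lowest priority number.
    return min(candidates, key=lambda p: p.priority, default=None)
```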
### Resilience
Each provider has independent resilience settings:
- Retry logic - Automatic retries with exponential backoff
- Circuit breaker - Disable unhealthy providers temporarily
- Timeout handling - Per-provider timeout configuration
- Rate limiting - Respect provider rate limits
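
As a rough illustration of the retry and timeout behaviour described above (names, defaults, and error handling are simplified assumptions, not the actual code):

```python
import asyncio
import random

async def call_with_retries(send, request, *, attempts=3, base_delay=0.5, timeout=30.0):
    """Call a provider with a per-attempt timeout and exponential backoff."""
    last_exc: Exception | None = None
    for attempt in range(attempts):
        try:
            return await asyncio.wait_for(send(request), timeout=timeout)
        except Exception as exc:  # in practice only retryable errors are caught
            last_exc = exc
            # Backoff schedule: 0.5s, 1s, 2s, ... plus a little jitter.
            await asyncio.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
    raise last_exc
```

A circuit breaker would wrap this loop, tracking consecutive failures per provider and temporarily removing that provider from the candidate list.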
See Inference Providers for provider-specific details.
## API Architecture

### Dual API Design
Public API (/api/v1/):
- Authentication: API Keys
- Rate Limits: 1000 req/min (standard)
- Use Case: External integrations
- OpenAI Compatible: Yes
Internal API (/api-internal/v1/):
- Authentication: JWT tokens
- Rate Limits: 300 req/min
- Use Case: Frontend only
- Features: Full platform access
### OpenAI Compatibility
Full compatibility with OpenAI API format for easy migration:
- `/chat/completions` - Chat endpoint
- `/embeddings` - Embedding generation
- `/models` - List available models
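
Because the format matches, the official OpenAI Python client works unchanged; only the base URL and key need to point at your deployment (the URL, key, and model name below are placeholders):

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://enclava.example.com/api/v1",  # your deployment's public API
    api_key="YOUR_ENCLAVA_API_KEY",
)

response = client.chat.completions.create(
    model="example-model",  # any model listed by /models
    messages=[{"role": "user", "content": "Hello from Enclava!"}],
)
print(response.choices[0].message.content)
```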
## Plugin System

### Architecture
```python
class BasePlugin:
    def __init__(self):
        self.name = "plugin_name"
        self.version = "1.0.0"

    async def execute(self, context):
        # Plugin logic here
        pass
```
### Features
- Auto-discovery from the `plugins/` directory
- Sandboxed execution environment
- Configuration via UI or API
- Custom API endpoint registration
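
A hypothetical plugin built on the `BasePlugin` interface above might look like this; the class and the context convention shown are illustrative only.

```python
class WordCountPlugin(BasePlugin):
    """Toy example: counts words in the text passed through the plugin context."""

    def __init__(self):
        super().__init__()
        self.name = "word_count"
        self.version = "0.1.0"

    async def execute(self, context):
        text = context.get("text", "")
        # Assumed convention: returned data is merged back into the pipeline context.
        return {"word_count": len(text.split())}
```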
## Deployment Architecture

### Container Stack
```yaml
services:
  backend:
    image: enclava/backend:latest
    deploy:
      replicas: 3
  frontend:
    image: enclava/frontend:latest
    deploy:
      replicas: 2
  postgres:
    image: postgres:15
  redis:
    image: redis:7-alpine
  qdrant:
    image: qdrant/qdrant:latest
```
### Infrastructure Requirements
- Minimum: 4 CPU cores, 8GB RAM
- Recommended: 8 CPU cores, 16GB RAM
- Storage: 100GB for documents and vectors
- Network: 100Mbps for optimal performance
For API specifications, see the API Reference.