# Platform Architecture

## System Overview

Enclava is an open-source AI platform that provides confidential inference across multiple TEE-backed providers.

```
┌─────────────────────────────────────────────────────┐
│                    Client Layer                     │
│             Web Dashboard | API Clients             │
└─────────────────────────────────────────────────────┘
                           │
┌─────────────────────────────────────────────────────┐
│                     API Gateway                     │
│   Public API (/api/v1) | Internal API (/api-int)    │
└─────────────────────────────────────────────────────┘
                           │
┌─────────────────────────────────────────────────────┐
│                  Backend Services                   │
│   FastAPI | Plugins | Modules | Background Tasks    │
└─────────────────────────────────────────────────────┘
            │                             │
┌───────────────────────┐     ┌───────────────────────┐
│      LLM Service      │     │      Data Layer       │
│   Provider Routing    │     │  PostgreSQL | Redis   │
│    Model Discovery    │     │ Qdrant | File Storage │
└───────────────────────┘     └───────────────────────┘
            │
┌─────────────────────────────────────────────────────┐
│               TEE Inference Providers               │
│          Private Mode (EU) | RedPill (US)           │
└─────────────────────────────────────────────────────┘
```
## Core Components

### Backend (Python/FastAPI)
- Framework: FastAPI with async/await support
- Architecture: Modular plugin system for extensibility
- Security: JWT authentication, API key management, threat detection
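
To make the pattern concrete, here is a minimal, hypothetical sketch of an async FastAPI endpoint guarded by an API-key dependency; the route, header name, and `verify_api_key` helper are illustrative only, not Enclava's actual code.

```python
from fastapi import Depends, FastAPI, Header, HTTPException

app = FastAPI()

async def verify_api_key(x_api_key: str = Header(...)) -> str:
    # Placeholder check; a real deployment would validate the key against the database.
    if not x_api_key:
        raise HTTPException(status_code=401, detail="Invalid API key")
    return x_api_key

@app.get("/api/v1/health")
async def health(api_key: str = Depends(verify_api_key)) -> dict:
    # Async handlers let the backend multiplex many concurrent requests.
    return {"status": "ok"}
```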
### Frontend (Next.js/React)
- Framework: Next.js 14 with TypeScript
- Features: SSR, real-time updates, responsive design
- State Management: React Context + SWR for data fetching
- UI Components: Tailwind CSS + shadcn/ui
### Data Layer
| Component | Purpose | Details |
|---|---|---|
| PostgreSQL | Primary database | Relational data, user management |
| Redis | Caching & sessions | Fast key-value storage |
| Qdrant | Vector database | RAG embeddings and semantic search |
| File Storage | Documents | Local filesystem with encryption |
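
The table above maps to ordinary client libraries on the backend side. A minimal connectivity sketch (connection URLs and service hostnames are placeholders, not Enclava's actual configuration):

```python
import asyncpg                          # PostgreSQL
import redis.asyncio as redis           # Redis
from qdrant_client import QdrantClient  # Qdrant

async def connect_data_layer():
    # Placeholder DSNs; real values come from environment configuration.
    pg = await asyncpg.connect("postgresql://enclava:secret@postgres:5432/enclava")
    cache = redis.from_url("redis://redis:6379/0")
    vectors = QdrantClient(url="http://qdrant:6333")

    # One round trip per store to confirm connectivity.
    await pg.fetchval("SELECT 1")
    await cache.ping()
    vectors.get_collections()
    return pg, cache, vectors
```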
## Inference Provider Routing
Enclava routes LLM requests through multiple TEE-protected inference providers.
### Multi-Provider Architecture
```
┌─────────────────────────────────────────┐
│               LLM Service               │
│   ┌─────────────────────────────────┐   │
│   │         Provider Router         │   │
│   │  - Model → Provider mapping     │   │
│   │  - Priority-based selection     │   │
│   │  - Chatbot preferences          │   │
│   └─────────────────────────────────┘   │
│                    │                    │
│         ┌──────────┴──────────┐         │
│         ▼                     ▼         │
│  ┌─────────────┐       ┌─────────────┐  │
│  │ PrivateMode │       │   RedPill   │  │
│  │ Priority: 1 │       │ Priority: 2 │  │
│  │ Region: EU  │       │ Region: US  │  │
│  └─────────────┘       └─────────────┘  │
└─────────────────────────────────────────┘
```
### Provider Selection
| Factor | Description |
|---|---|
| Model availability | Only providers supporting the requested model |
| Priority | Lower priority number = preferred provider |
| Chatbot preferences | Chatbots can specify preferred providers |
| Health status | Unhealthy providers are skipped |
| Attestation | RedPill requires valid attestation |
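
Put together, the routing decision can be sketched as follows; the `Provider` fields and `select_provider` helper are illustrative only, not the actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Provider:
    name: str
    priority: int           # lower number = preferred
    models: set[str]
    healthy: bool = True
    attested: bool = True   # e.g. RedPill must present a valid attestation

def select_provider(providers: list[Provider], model: str, preferred: str | None = None):
    """Pick a provider for `model`, mirroring the factors in the table above."""
    candidates = [p for p in providers if model in p.models and p.healthy and p.attested]
    # A chatbot-level preference wins when that provider is eligible.
    if preferred:
        for p in candidates:
            if p.name == preferred:
                return p
    # Otherwise fall back to the lowest priority number.
    return min(candidates, key=lambda p: p.priority, default=None)
```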
### Resilience
Each provider has independent resilience settings:
- Retry logic - Automatic retries with exponential backoff
- Circuit breaker - Disable unhealthy providers temporarily
- Timeout handling - Per-provider timeout configuration
- Rate limiting - Respect provider rate limits
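
As a rough illustration of the retry and timeout behaviour described above (names, defaults, and error handling are simplified assumptions, not the actual code):

```python
import asyncio
import random

async def call_with_retries(send, request, *, attempts=3, base_delay=0.5, timeout=30.0):
    """Call a provider with a per-attempt timeout and exponential backoff."""
    last_exc: Exception | None = None
    for attempt in range(attempts):
        try:
            return await asyncio.wait_for(send(request), timeout=timeout)
        except Exception as exc:  # in practice only retryable errors are caught
            last_exc = exc
            # Backoff schedule: 0.5s, 1s, 2s, ... plus a little jitter.
            await asyncio.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
    raise last_exc
```

A circuit breaker would wrap this loop, tracking consecutive failures per provider and temporarily removing that provider from the candidate list.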
See Inference Providers for provider-specific details.
## API Architecture

### Dual API Design
Public API (/api/v1/):
- Authentication: API Keys
- Rate Limits: 1000 req/min (standard)
- Use Case: External integrations
- OpenAI Compatible: Yes
Internal API (/api-internal/v1/):
- Authentication: JWT tokens
- Rate Limits: 300 req/min
- Use Case: Frontend only
- Features: Full platform access
### OpenAI Compatibility
Full compatibility with OpenAI API format for easy migration:
- `/chat/completions` - Chat endpoint
- `/embeddings` - Embedding generation
- `/models` - List available models
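
Because the format matches, the official OpenAI Python client works unchanged; only the base URL and key need to point at your deployment (the URL, key, and model name below are placeholders):

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://enclava.example.com/api/v1",  # your deployment's public API
    api_key="YOUR_ENCLAVA_API_KEY",
)

response = client.chat.completions.create(
    model="example-model",  # any model listed by /models
    messages=[{"role": "user", "content": "Hello from Enclava!"}],
)
print(response.choices[0].message.content)
```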
## Plugin System

### Architecture
```python
class BasePlugin:
    def __init__(self):
        self.name = "plugin_name"
        self.version = "1.0.0"

    async def execute(self, context):
        # Plugin logic here
        pass
```
### Features
- Auto-discovery from the `plugins/` directory
- Sandboxed execution environment
- Configuration via UI or API
- Custom API endpoint registration
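
A hypothetical plugin built on the `BasePlugin` interface above might look like this; the class and the context convention shown are illustrative only.

```python
class WordCountPlugin(BasePlugin):
    """Toy example: counts words in the text passed through the plugin context."""

    def __init__(self):
        super().__init__()
        self.name = "word_count"
        self.version = "0.1.0"

    async def execute(self, context):
        text = context.get("text", "")
        # Assumed convention: returned data is merged back into the pipeline context.
        return {"word_count": len(text.split())}
```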
## Deployment Architecture

### Container Stack
```yaml
services:
  backend:
    image: enclava/backend:latest
    deploy:
      replicas: 3
  frontend:
    image: enclava/frontend:latest
    deploy:
      replicas: 2
  postgres:
    image: postgres:15
  redis:
    image: redis:7-alpine
  qdrant:
    image: qdrant/qdrant:latest
```
### Infrastructure Requirements
- Minimum: 4 CPU cores, 8GB RAM
- Recommended: 8 CPU cores, 16GB RAM
- Storage: 100GB for documents and vectors
- Network: 100Mbps for optimal performance
For API specifications, see the API Reference.