Inference Providers

Enclava routes inference requests through providers that run models inside hardware-protected enclaves. Your prompts and responses stay encrypted during processing - the provider's infrastructure can't read them.

Multi-Provider Routing

Enclava uses a pluggable multi-provider architecture. When you make an inference request, the platform:

  1. Identifies available providers for the requested model
  2. Selects by priority - Private Mode (EU) is primary, RedPill (US) is secondary
  3. Respects preferences - Chatbots can specify preferred providers
  4. Handles failover - Automatic fallback if a provider is unhealthy

You don't need to specify providers in your API calls - routing is automatic.
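
To make the priority and failover behavior concrete, here is a minimal Python sketch of that selection logic. The Provider structure, the healthy flag, and the preferred argument are illustrative assumptions, not Enclava's actual internals.

from dataclasses import dataclass
from typing import Optional

# Illustrative sketch of priority-based routing with failover.
# Field names and health checks are assumptions, not Enclava's code.
@dataclass
class Provider:
    name: str
    priority: int        # lower number = higher priority
    models: set[str]
    healthy: bool = True

def select_provider(providers: list[Provider], model: str,
                    preferred: Optional[str] = None) -> Provider:
    # 1. Keep only providers that actually serve the requested model
    candidates = [p for p in providers if model in p.models]
    # 2. Honor a chatbot's preferred provider if it is healthy
    if preferred:
        for p in candidates:
            if p.name == preferred and p.healthy:
                return p
    # 3. Otherwise take the healthy provider with the highest priority
    for p in sorted(candidates, key=lambda p: p.priority):
        if p.healthy:
            return p
    raise RuntimeError(f"No healthy provider available for model {model!r}")

If Private Mode (priority 1) is marked unhealthy, the same call falls through to RedPill (priority 2) without any change to the client request.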

Current Providers

Provider       Region   Priority        Focus
Private Mode   EU       1 (primary)     EU data residency, Intel SGX/AMD SEV
RedPill        US       2 (secondary)   Multi-backend aggregator, confidential-only models

Private Mode

Built by Edgeless Systems, Private Mode runs inference on Intel SGX and AMD SEV hardware in European data centers.

The main selling points:

  • Data never leaves the EU
  • No logging, no storage - stateless by design
  • You can verify the enclave before sending data (remote attestation)

On the compliance side, Private Mode claims GDPR compliance (Articles 25 & 32), HIPAA eligibility, and ISO 27001. SOC 2 Type II is in progress.

Docs: docs.privatemode.ai

RedPill

RedPill aggregates multiple confidential compute backends. Through Enclava, you get access to confidential-only models that run entirely within TEEs.

Confidential model prefixes (exposed through Enclava):

  • phala/* - Phala Network TEE models
  • tinfoil/* - Tinfoil confidential models
  • nearai/* - NEAR AI confidential models

Non-confidential models are filtered out - Enclava only routes to fully TEE-protected backends.
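
A rough sketch of that prefix filter in Python (the constant and function are illustrative, not Enclava's actual code):

# Keep only models whose IDs carry a confidential-backend prefix.
CONFIDENTIAL_PREFIXES = ("phala/", "tinfoil/", "nearai/")

def filter_confidential(model_ids: list[str]) -> list[str]:
    # str.startswith accepts a tuple, so one call checks all prefixes
    return [m for m in model_ids if m.startswith(CONFIDENTIAL_PREFIXES)]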

What's interesting:

  • Open source codebase you can audit
  • Anonymity layer for third-party models - OpenAI/Anthropic don't see who you are
  • HIPAA-ready, designed for attorney-client privilege use cases
  • Continuous attestation verification

Docs: docs.redpill.ai

What the TEE Protects Against

Both providers use the same basic security model:

  • Cloud operators can't see your data - the TEE hardware isolates the enclave
  • The provider can't see your data - end-to-end encryption into the enclave
  • No logs exist to subpoena - zero-retention architecture
  • Network sniffing gets you nothing - TLS plus TEE encryption

You're trusting that Intel/AMD implemented their TEE correctly and that the cryptography holds. You should also verify attestation before sending sensitive data - that's what proves you're actually talking to a genuine enclave running the expected code.

Provider Capabilities

Capability         Private Mode   RedPill
Chat completions   Yes            Yes
Streaming          Yes            Yes
Embeddings         Yes            Yes
Vision models      Yes            Yes
Function calling   Yes            Yes
TEE protection     Yes            Yes
Attestation        On request     Continuous

Resilience Configuration

Each provider operates independently with its own resilience settings:

Setting                     Private Mode   RedPill
Rate limit                  20 req/min     60 req/min
Timeout                     60 seconds     60 seconds
Max retries                 3              3
Circuit breaker threshold   5 failures     5 failures
Circuit breaker reset       120 seconds    120 seconds

When a provider hits its circuit breaker threshold, it's temporarily disabled. Requests automatically route to the next available provider.
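
A minimal sketch of that behavior, using the values from the table above (illustrative only, not Enclava's implementation):

import time

# Illustrative circuit breaker: after 5 consecutive failures the provider
# is skipped for 120 seconds, then tried again.
class CircuitBreaker:
    def __init__(self, threshold: int = 5, reset_seconds: int = 120):
        self.threshold = threshold
        self.reset_seconds = reset_seconds
        self.failures = 0
        self.opened_at: float | None = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = time.monotonic()

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def allows_requests(self) -> bool:
        if self.opened_at is None:
            return True
        if time.monotonic() - self.opened_at >= self.reset_seconds:
            # Reset window elapsed: let traffic through again
            self.failures = 0
            self.opened_at = None
            return True
        return False

Combined with the routing sketch shown earlier, an open circuit simply makes the selection logic skip that provider until the 120-second reset window has elapsed.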

Model Discovery

Enclava dynamically discovers available models from each provider at startup and periodically refreshes the list. To see available models:

curl -X GET http://localhost/api/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"

Response includes model capabilities (chat, vision, embeddings) and which provider serves each model.
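
If you prefer to query the endpoint from Python, here is a small sketch. The response field names (data, capabilities, provider) are assumptions based on the description above, not a documented schema.

import requests  # assumes the requests package is installed

# Hypothetical example: list models and keep those that support embeddings.
resp = requests.get(
    "http://localhost/api/v1/models",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    timeout=30,
)
resp.raise_for_status()

for model in resp.json().get("data", []):
    if "embeddings" in model.get("capabilities", []):
        print(model.get("id"), "->", model.get("provider"))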